Justin Mackie

I extract value from data.

Oil Futures Trade Selection

Posted at — Oct 3, 2019

Rank Seasonal Oil Trades and Backtest Profit/Loss


This project was built for Commodity Analytics Group.

Suppose we want to predict oil prices several months ahead. Oil futures prices are daily time series data: each business day has one settlement price. Prices tend to repeat over the years in cyclical patterns called seasonality. My model captures these price patterns and uses historical prices to estimate the chance that each pattern repeats. The patterns correspond to commodity trades, which are ranked from best to worst.
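To make "the chance of the pattern repeating" concrete, here is a minimal sketch of one simple way to estimate it: count the share of past years in which a trade moved up over the same calendar window. This is an illustration, not the private model's actual method. The function name `seasonal_hit_rate` and the yearly moves are made up for the example.

```python
def seasonal_hit_rate(moves_by_year):
    """Return the fraction of years in which the move was positive.

    moves_by_year maps a year to the dollar move of one trade over the
    same calendar window in that year.
    """
    moves = list(moves_by_year.values())
    if not moves:
        raise ValueError("need at least one year of history")
    ups = sum(1 for move in moves if move > 0)
    return ups / len(moves)


# Hypothetical dollar moves for one trade, one value per year.
example_moves = {
    2011: 0.42,
    2012: 0.15,
    2013: -0.08,
    2014: 0.30,
    2015: -0.21,
    2016: 0.05,
    2017: 0.11,
}

hit_rate = seasonal_hit_rate(example_moves)
print(f"Moved up in {hit_rate:.0%} of years")
```

A hit rate near 1 would favor a long trade and a hit rate near 0 a short trade, which is one way trades could be ranked.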

The commodity illustrated in the model is Brent Crude, the key global price benchmark for Atlantic basin crude. Brent is used to price the majority of Earth’s internationally-traded crude supplies.

The Model: 💸

My statistical seasonality model selects the best long and short trades using historical data. Then, cumulative trade Profit/Loss performance is tallied for the designated time period. The model logic is built with Python and Pandas. The model reads from and writes to a SQLite3 database named data.sqlite. Data visualizations are plotted with Seaborn and Matplotlib. The model is presented in a Jupyter Notebook (the .ipynb file).

Top 5 Long Trades by Contract Month – Jan and Feb:

Top 5 Long Trades

Brent Cumulative Profit-Loss for dozen selected trades:

Cumulative PL

Model Details:

When picking a basket of financial trades (or a standalone trade), the odds that a trade will move up or down are not the full picture. Volatility and price path matter too. All other things being equal, a less volatile trade is better than a trade with wild price swings. Price path matters because a cumulative trade return of $1.00 that never dips negative is better than a (larger) cumulative return of $1.10 that first dips to ($2.00) and later to ($1.00). Only a highly risk-tolerant investor would prefer the latter pattern.
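The price path comparison above can be sketched in a few lines of plain Python. The two daily P/L series are invented to reproduce the numbers in the paragraph (a steady path to $1.00 and a path that dips to ($2.00) and ($1.00) before ending at $1.10). The helper `path_summary` is a hypothetical name, and maximum drawdown is my choice of path metric, not necessarily the one the model uses.

```python
from itertools import accumulate


def path_summary(daily_pl):
    """Summarize a trade's cumulative P/L path."""
    cumulative = list(accumulate(daily_pl))
    peak = 0.0  # the trade starts flat at $0
    max_drawdown = 0.0
    for value in cumulative:
        peak = max(peak, value)
        # Drawdown is the fall from the best level seen so far.
        max_drawdown = max(max_drawdown, peak - value)
    return {
        "final": cumulative[-1],
        "lowest": min(cumulative),
        "max_drawdown": max_drawdown,
    }


# Hypothetical daily P/L in dollars.
steady = path_summary([0.25, 0.25, 0.25, 0.25])
volatile = path_summary([-2.0, 1.5, -0.5, 2.1])

print(steady)
print(volatile)
```

The volatile trade ends higher, but its path bottoms out at a $2.00 loss, while the steady trade never goes below its entry level.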

The model visualizes trade movement, both in dollar terms and as a normalized z-score. The z-score expresses the move in standard deviations above or below the mean. See the charts below.
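As a quick illustration of z-normalization, the sketch below converts a list of dollar moves into z-scores. The input values are hypothetical, and using the population standard deviation (`pstdev`) rather than the sample version is an assumption on my part.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Express each value in standard deviations above/below the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumption)
    if sigma == 0:
        raise ValueError("z-score is undefined when all values are equal")
    return [(value - mu) / sigma for value in values]


# Hypothetical dollar moves; mean is 5 and standard deviation is 2.
dollar_moves = [2, 4, 4, 4, 5, 5, 7, 9]
normalized = z_scores(dollar_moves)
print(normalized)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

Normalizing this way lets moves of trades with very different dollar ranges be compared on one scale.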

1-3v4 Dollar Move – 2011-2017:

Dollar Move

1-3v4 Spread z-Normalized Move – 2011-2017:


We can run the model on a different commodity like West Texas Intermediate (WTI) or RBOB Gasoline Futures. We just need the data in a SQLite database!
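Here is a rough sketch of what loading another commodity from SQLite might look like, using only Python's built-in `sqlite3` module. The table name `settlements`, its columns, and the sample rows are all hypothetical, since the real schema of data.sqlite is not shown here. An in-memory database stands in for the file so the example runs on its own.

```python
import sqlite3


def load_settlements(conn, commodity):
    """Fetch (trade_date, contract, settle) rows for one commodity."""
    cursor = conn.execute(
        "SELECT trade_date, contract, settle "
        "FROM settlements WHERE commodity = ? "
        "ORDER BY trade_date, contract",
        (commodity,),
    )
    return cursor.fetchall()


# In-memory stand-in for data.sqlite, with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE settlements "
    "(commodity TEXT, trade_date TEXT, contract TEXT, settle REAL)"
)
conn.executemany(
    "INSERT INTO settlements VALUES (?, ?, ?, ?)",
    [
        ("WTI", "2017-01-03", "M3", 54.10),
        ("WTI", "2017-01-03", "M4", 54.60),
        ("Brent", "2017-01-03", "M3", 56.80),
    ],
)

wti_rows = load_settlements(conn, "WTI")
print(wti_rows)
conn.close()
```

With real data, the connection would point at the database file instead, for example `sqlite3.connect("data.sqlite")`.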

The prices modeled are actually price spreads between two contracts for the designated calendar month. For example, 1-3v4 means the month 3 settlement price minus the month 4 settlement price for calendar month 1. Terminology is bulleted below. Calculating the price spreads in a time spread matrix is outside our scope.
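To show how a label such as 1-3v4 breaks down, here is a small sketch that parses the label and computes one spread value. The names `SpreadLabel`, `parse_spread_label`, and `spread_price` are hypothetical, and the settlement prices are invented for the example.

```python
import re
from typing import NamedTuple


class SpreadLabel(NamedTuple):
    calendar_month: int
    front_contract: int
    back_contract: int


def parse_spread_label(label):
    """Parse a label like '1-3v4' into its three numbers."""
    match = re.fullmatch(r"(\d+)-(\d+)v(\d+)", label)
    if match is None:
        raise ValueError(f"not a spread label: {label!r}")
    return SpreadLabel(*(int(group) for group in match.groups()))


def spread_price(front_settle, back_settle):
    """Spread is the front settlement minus the back settlement."""
    return front_settle - back_settle


label = parse_spread_label("1-3v4")
print(label)

# Hypothetical settlement prices for the month 3 and month 4 contracts.
spread = spread_price(62.50, 62.10)
print(round(spread, 2))
```

Here `label` says: during calendar month 1, track the month 3 contract minus the month 4 contract.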

Picking valid spreads is complicated by the fact that contracts expire. We model spreads that are active for the whole calendar month (until any month-end expiration). For example, if the front contract is traded the whole month but the back contract is traded only the last seven days, the spread is not valid.
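The validity rule can be sketched as a date check: both legs must be tradable from the first to the last day of the calendar month. This is a simplified illustration that uses calendar days rather than business days, and the function name and all dates are hypothetical rather than real contract listings or expirations.

```python
import calendar
from datetime import date


def is_valid_spread(year, month, front_dates, back_dates):
    """Return True if both contracts trade for the whole calendar month.

    front_dates and back_dates are (first_trade_date, last_trade_date).
    """
    month_start = date(year, month, 1)
    month_end = date(year, month, calendar.monthrange(year, month)[1])

    def covers_month(first_trade, last_trade):
        return first_trade <= month_start and last_trade >= month_end

    return covers_month(*front_dates) and covers_month(*back_dates)


# Hypothetical trading windows for January 2017.
front = (date(2016, 6, 1), date(2017, 1, 31))
back_full = (date(2016, 7, 1), date(2017, 2, 28))
back_late = (date(2017, 1, 25), date(2017, 3, 31))  # last seven days only

valid = is_valid_spread(2017, 1, front, back_full)
invalid = is_valid_spread(2017, 1, front, back_late)
print(valid, invalid)  # True False
```

The second case mirrors the example above: the back contract only trades for the last seven days of the month, so the spread is rejected.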

Why not model the spread between “Spread1” and “Spread2”? I’ve modeled that too! Commodity traders call them fly spreads. In contrast, a model of a single spread is a spread model.

Please note: The model is private. It was built for a commodity analytics startup called CAG. Model available upon request.

Trading Terminology: