Youngho Chaa cha cha

Building a humble buy and sell decision engine for me in Python

python · 9 min read

My Python skills are fairly shallow, and I hope this mini project helps me deepen my understanding.

Resources

  • Learn Algorithmic Trading source code

Google Colab

It’s convenient that Python packages are managed automatically in the notebook.

Buy when the price is low, and sell when the price is high

import pandas as pd
import yfinance as yf
import numpy as np
start_date = pd.to_datetime('2020-01-01')
end_date = pd.to_datetime('2024-01-01')
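# auto_adjust=False keeps the separate 'Adj Close' column, which recent yfinance releases may omit by default
goog_data = yf.download('GOOG', start_date, end_date, auto_adjust=False)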
goog_data_signal = pd.DataFrame(index=goog_data.index)
goog_data_signal['price'] = goog_data['Adj Close']
goog_data_signal['daily_difference'] = goog_data_signal['price'].diff()
goog_data_signal['signal'] = 0.0
goog_data_signal['signal'] = np.where(goog_data_signal['daily_difference'] > 0, 1.0, 0.0)
goog_data_signal['positions'] = goog_data_signal['signal'].diff()
goog_data_signal.tail(10)
Date         price  daily_difference  signal  positions
2023-12-15  133.84              0.64     1.0        1.0
2023-12-18  137.19              3.35     1.0        0.0
2023-12-19  138.10              0.91     1.0        0.0
2023-12-20  139.66              1.56     1.0        0.0
2023-12-21  141.80              2.14     1.0        0.0
2023-12-22  142.72              0.92     1.0        0.0
2023-12-26  142.82              0.10     1.0        0.0
2023-12-27  141.44             -1.38     0.0       -1.0
2023-12-28  141.28             -0.16     0.0        0.0
2023-12-29  140.93             -0.35     0.0        0.0

The columns returned by yf.download() are:

  • High: the highest price of the stock on that trading day.
  • Low: the lowest price of the stock on that trading day.
  • Close: the price of the stock at closing time.
  • Open: the price of the stock at the beginning of the trading day (not necessarily the previous day's closing price).
  • Volume: how many shares were traded.
  • Adj Close: the closing price adjusted for corporate actions such as stock splits and dividends.

I used yf.download() because pandas_datareader's DataReader has compatibility issues with the Yahoo Finance API.
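
To see how the signal and positions columns in the table above are derived, here is a minimal sketch on a handful of made-up prices. The numbers are an assumption chosen for readability, not real GOOG data, and no download is needed to run it.

```python
import numpy as np
import pandas as pd

# Made-up closing prices, purely for illustration (not real GOOG data).
toy = pd.DataFrame({"price": [100.0, 101.0, 103.0, 102.0, 101.5, 104.0]})

# Change versus the previous day; the first row has no predecessor, so it is NaN.
toy["daily_difference"] = toy["price"].diff()

# 1.0 on days the price rose, 0.0 otherwise (NaN > 0 evaluates to False).
toy["signal"] = np.where(toy["daily_difference"] > 0, 1.0, 0.0)

# +1.0 marks a switch from 0 to 1 (buy), -1.0 a switch from 1 to 0 (sell).
toy["positions"] = toy["signal"].diff()

print(toy)
```

With this rule, a buy is recorded on the first up day after a down day, and a sell on the first down day after an up day, so the signal always reacts to the move that has already happened.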

Visualise it

import pandas as pd
import yfinance as yf
import numpy as np
import matplotlib.pyplot as plt
start_date = pd.to_datetime('2020-01-01')
end_date = pd.to_datetime('2024-01-01')
goog_data = yf.download('GOOG', start_date, end_date, auto_adjust=False)  # keep the 'Adj Close' column
goog_data_signal = pd.DataFrame(index=goog_data.index)
goog_data_signal['price'] = goog_data['Adj Close']
goog_data_signal['daily_difference'] = goog_data_signal['price'].diff()
goog_data_signal['signal'] = 0.0
goog_data_signal['signal'] = np.where(goog_data_signal['daily_difference'] > 0, 1.0, 0.0)
goog_data_signal['positions'] = goog_data_signal['signal'].diff()
fig = plt.figure()
ax1 = fig.add_subplot(111, ylabel='Google price in $')
goog_data_signal['price'].plot(ax=ax1, color='r', lw=2.)
ax1.plot(goog_data_signal.loc[goog_data_signal.positions == 1.0].index,
         goog_data_signal.price[goog_data_signal.positions == 1.0],
         '^', markersize=5, color='m')
ax1.plot(goog_data_signal.loc[goog_data_signal.positions == -1.0].index,
         goog_data_signal.price[goog_data_signal.positions == -1.0],
         'v', markersize=5, color='k')
plt.show()
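[Figure: GOOG adjusted close price (red line) with buy markers (magenta up-triangles) and sell markers (black down-triangles)]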

Backtesting

Backtesting is the phase where the strategy is run against historical data to produce statistics showing how effective it is. The usual ones are:

  • Profit and loss (P&L): the money made by the strategy before transaction fees.
  • Net profit and loss (net P&L): the money made by the strategy after transaction fees.
  • Exposure: the capital invested.
  • Number of trades: the number of trades placed during a trading session.
  • Annualised return: the return scaled to one year of trading.
  • Sharpe ratio: the risk-adjusted return, that is, the strategy's return in excess of the risk-free rate divided by the volatility of its returns.
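
Several of the statistics listed above can be computed from a series of daily portfolio values. The following is a rough sketch rather than a definitive implementation: the function name `summarise_backtest`, the 252 trading days per year, the zero risk-free rate, and the sample numbers are all assumptions made for illustration, and fees are ignored.

```python
import numpy as np
import pandas as pd


def summarise_backtest(total, position_changes, periods_per_year=252, risk_free_rate=0.0):
    """Summarise a backtest from daily portfolio values (hypothetical helper).

    total: Series of total portfolio value, one row per trading day.
    position_changes: Series that is non-zero on days a trade was placed.
    """
    returns = total.pct_change().dropna()
    years = len(returns) / periods_per_year

    # P&L before fees: final value minus starting value.
    pnl = total.iloc[-1] - total.iloc[0]

    # Compound growth rate scaled to one year.
    annualised_return = (total.iloc[-1] / total.iloc[0]) ** (1 / years) - 1

    # Sharpe ratio: mean excess return per unit of volatility, scaled to a year.
    excess = returns - risk_free_rate / periods_per_year
    sharpe = np.sqrt(periods_per_year) * excess.mean() / excess.std()

    # Count the days on which the position changed.
    n_trades = int((position_changes.fillna(0) != 0).sum())

    return {"pnl": pnl, "annualised_return": annualised_return,
            "sharpe": sharpe, "n_trades": n_trades}


# Made-up numbers, purely for illustration.
total = pd.Series([1000.0, 1010.0, 1005.0, 1020.0])
position_changes = pd.Series([np.nan, 1.0, 0.0, -1.0])
stats = summarise_backtest(total, position_changes)
print(stats)
```

Annualising a few days of returns exaggerates the figures, so statistics like these only become meaningful over a much longer backtest.
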
import pandas as pd
import yfinance as yf
import numpy as np
import matplotlib.pyplot as plt
start_date = pd.to_datetime('2021-01-01')
end_date = pd.to_datetime('2024-01-01')
goog_data = yf.download('GOOG', start_date, end_date, auto_adjust=False)  # keep the 'Adj Close' column
goog_data_signal = pd.DataFrame(index=goog_data.index)
goog_data_signal['price'] = goog_data['Adj Close']
goog_data_signal['daily_difference'] = goog_data_signal['price'].diff()
goog_data_signal['signal'] = 0.0
goog_data_signal['signal'] = np.where(goog_data_signal['daily_difference'] > 0, 1.0, 0.0)
goog_data_signal['positions'] = goog_data_signal['signal'].diff()
initial_capital = float(1000.0)
positions = pd.DataFrame(index=goog_data_signal.index).fillna(0.0)
portfolio = pd.DataFrame(index=goog_data_signal.index).fillna(0.0)
positions['GOOG'] = goog_data_signal['signal']
portfolio['positions'] = (positions.multiply(goog_data_signal['price'], axis=0))
portfolio['cash'] = initial_capital - (positions.diff().multiply(goog_data_signal['price'], axis=0)).cumsum()
portfolio['total'] = portfolio['positions'] + portfolio['cash']
# portfolio.tail(100)
portfolio.plot()
plt.show()
[Figure: portfolio positions, cash, and total value over time]

As you can see, this strategy is not very profitable.
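
The cash bookkeeping in the backtest above can be traced by hand on a few made-up prices. This is a sketch under stated assumptions: the prices, the signal, and the starting capital of 100 are invented, and the `fillna(0.0)` on the first difference is my addition so that the first row is not NaN.

```python
import pandas as pd

# Made-up prices and holdings (number of shares held each day).
prices = pd.Series([10.0, 12.0, 11.0, 13.0])
signal = pd.Series([0.0, 1.0, 1.0, 0.0])
initial_capital = 100.0

# Market value of the shares currently held.
holdings = signal * prices

# +1 means one share bought that day, -1 means one share sold.
trades = signal.diff().fillna(0.0)

# Buying spends cash at that day's price; selling returns it.
cash = initial_capital - (trades * prices).cumsum()

total = holdings + cash
print(pd.DataFrame({"price": prices, "holdings": holdings, "cash": cash, "total": total}))
```

Here the share is bought at 12 and sold at 13, so the total ends one unit above the starting capital. While the share is held, the total moves with the price; while in cash, it stays flat.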

Exponential Moving Average

The exponential moving average (EMA) is similar to the simple moving average, but instead of weighting all prices in the history equally, it places more weight on the most recent price observation and less weight on older ones.

There are two types of EMA, distinguished by the number of periods they use:

  • Fast EMA: uses fewer periods, so it converges to new price observations faster and forgets older observations faster.
  • Slow EMA: uses more periods, so it converges to new price observations more slowly and forgets older observations more slowly.
EMA = ( P - EMAp ) * K + EMAp
Where:
P = Price for the current period
EMAp = the Exponential moving Average for the previous period
K = the smoothing constant, equal to 2 / (n + 1)
n = the number of periods in a simple moving average roughly approximated by the EMA
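
As a quick check of the formula, here is a small sketch that applies the recurrence to a few made-up prices and compares the result with pandas. The helper name `ema` and the sample prices are assumptions for illustration. With n = 3 the smoothing constant is K = 2 / (3 + 1) = 0.5, which keeps the arithmetic easy to follow.

```python
import pandas as pd


def ema(prices, n):
    """Apply EMA = (P - EMAp) * K + EMAp, seeding with the first price."""
    k = 2 / (n + 1)
    values = []
    ema_prev = None
    for price in prices:
        if ema_prev is None:
            ema_prev = price  # the first observation seeds the average
        else:
            ema_prev = (price - ema_prev) * k + ema_prev
        values.append(ema_prev)
    return values


prices = [10.0, 11.0, 12.0, 11.0, 13.0]  # made-up prices
manual = ema(prices, n=3)
print(manual)  # [10.0, 10.5, 11.25, 11.125, 12.0625]

# pandas implements the same recurrence when adjust=False.
with_pandas = pd.Series(prices).ewm(span=3, adjust=False).mean().tolist()
print(with_pandas)
```

The second step, for example, is (11 - 10) * 0.5 + 10 = 10.5. The pandas call `ewm(span=n, adjust=False).mean()` is the same one the MACD code later in this post relies on.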

Implementation of the EMA

import pandas as pd
import yfinance as yf
start_date = pd.to_datetime('2021-01-01')
end_date = pd.to_datetime('2024-01-01')
goog_data = yf.download('GOOG', start_date, end_date)
close = goog_data['Close']
'''
EMA = ( P - EMAp ) * K + EMAp
Where:
P = Price for the current period
EMAp = the Exponential moving Average for the previous period
K = the smoothing constant, equal to 2 / (n + 1)
n = the number of periods in a simple moving average roughly approximated by the EMA
'''
num_periods = 20 # number of days over which to average
K = 2 / (num_periods + 1) # smoothing constant
ema_p = 0
ema_values = [] # to hold computed EMA values
for close_price in close:
    if ema_p == 0:
        ema_p = close_price
    else:
        ema_p = (close_price - ema_p) * K + ema_p
    ema_values.append(ema_p)
goog_data = goog_data.assign(ClosePrice=pd.Series(close, index=goog_data.index))
goog_data = goog_data.assign(Exponential20DayMovingAverage=pd.Series(ema_values, index=goog_data.index))
close_price = goog_data['ClosePrice']
ema = goog_data['Exponential20DayMovingAverage']
import matplotlib.pyplot as plt
fig = plt.figure()
ax1 = fig.add_subplot(111, ylabel='Google price in $')
close_price.plot(ax=ax1, color='g', lw=2., legend=True)
ema.plot(ax=ax1, color='b', lw=2., legend=True)
plt.show()
[Figure: GOOG close price (green) and its 20-day exponential moving average (blue)]

Backtesting EMA

import pandas as pd
import yfinance as yf
import numpy as np
start_date = pd.to_datetime('2023-07-01')
end_date = pd.to_datetime('2024-01-01')
goog_data = yf.download('GOOG', start_date, end_date)
close = goog_data['Close']
'''
EMA = ( P - EMAp ) * K + EMAp
Where:
P = Price for the current period
EMAp = the Exponential moving Average for the previous period
K = the smoothing constant, equal to 2 / (n + 1)
n = the number of periods in a simple moving average roughly approximated by the EMA
'''
num_periods = 20 # number of days over which to average
K = 2 / (num_periods + 1) # smoothing constant
ema_p = 0
ema_values = [] # to hold computed EMA values
for close_price in close:
    if ema_p == 0:
        ema_p = close_price
    else:
        ema_p = (close_price - ema_p) * K + ema_p
    ema_values.append(ema_p)
goog_data = goog_data.assign(ClosePrice=pd.Series(close, index=goog_data.index))
goog_data = goog_data.assign(Ema20=pd.Series(ema_values, index=goog_data.index))
goog_data['Signal'] = np.where(goog_data['ClosePrice'] > goog_data['Ema20'], 1.0, 0.0)
goog_data['Position'] = goog_data['Signal'].diff()
initial_capital = float(1000.0)
positions = pd.DataFrame(index=goog_data.index).fillna(0.0)
portfolio = pd.DataFrame(index=goog_data.index).fillna(0.0)
positions['GOOG'] = goog_data['Signal']
portfolio['Signal'] = goog_data['Signal']
portfolio['Price'] = goog_data['ClosePrice']
portfolio['Positions'] = (positions.multiply(goog_data['ClosePrice'], axis=0))
portfolio['Cash'] = initial_capital - (positions.diff().multiply(goog_data['ClosePrice'], axis=0)).cumsum()
portfolio['Total'] = portfolio['Positions'] + portfolio['Cash']
portfolio.tail(100)
# goog_data.tail(100)

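The result over the last six months is poor: the portfolio started with 1,000 in capital and ended at about 990.61. Note that the strategy only ever holds a single share, so most of the capital sits idle in cash.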

Date        Signal   Price  Positions    Cash   Total
2023-12-05     0.0  132.39       0.00  990.62  990.62
2023-12-06     0.0  131.43       0.00  990.62  990.62
2023-12-07     1.0  138.45     138.45  852.17  990.62
2023-12-08     1.0  136.64     136.64  852.17  988.81
2023-12-11     0.0  134.70       0.00  986.87  986.87
2023-12-12     0.0  133.64       0.00  986.87  986.87
2023-12-13     0.0  133.97       0.00  986.87  986.87
2023-12-14     0.0  133.20       0.00  986.87  986.87
2023-12-15     0.0  133.84       0.00  986.87  986.87
2023-12-18     1.0  137.19     137.19  849.68  986.87
2023-12-19     1.0  138.10     138.10  849.68  987.78
2023-12-20     1.0  139.66     139.66  849.68  989.34
2023-12-21     1.0  141.80     141.80  849.68  991.48
2023-12-22     1.0  142.72     142.72  849.68  992.40
2023-12-26     1.0  142.82     142.82  849.68  992.50
2023-12-27     1.0  141.44     141.44  849.68  991.12
2023-12-28     1.0  141.28     141.28  849.68  990.96
2023-12-29     1.0  140.93     140.93  849.68  990.61

Strategy that uses MACD

Python Implementation

# MACD Strategy
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
import yfinance as yf

# Fetch stock data
def fetch_stock_data(ticker, start, end):
    return yf.download(ticker, start, end)

# Calculate MACD
def calculate_macd(data, short_window=12, long_window=26, signal_window=9):
    data['EMA_short'] = data['Close'].ewm(span=short_window, adjust=False).mean()
    data['EMA_long'] = data['Close'].ewm(span=long_window, adjust=False).mean()
    # MACD line: fast EMA minus slow EMA
    data['MACD'] = data['EMA_short'] - data['EMA_long']
    # Signal line: EMA of the MACD line itself
    data['Signal_Line'] = data['MACD'].ewm(span=signal_window, adjust=False).mean()

# Generate trading signals (buy=1, sell=-1, do nothing=0)
def generate_signals(data):
    data['Signal'] = 0
    # Use .loc instead of chained indexing so the assignment reliably reaches the DataFrame
    data.loc[data['MACD'] > data['Signal_Line'], 'Signal'] = 1
    data.loc[data['MACD'] < data['Signal_Line'], 'Signal'] = -1
    # Non-zero only on days the signal changes: positive = bullish crossover, negative = bearish
    data['Position'] = data['Signal'].diff()

# Backtesting the strategy
def backtest_strategy(data):
    initial_capital = float(100000.0)
    positions = pd.DataFrame(index=data.index).fillna(0.0)
    portfolio = pd.DataFrame(index=data.index).fillna(0.0)
    # Hold +100 shares when Signal is 1 and -100 shares (a short position) when Signal is -1
    positions['Stock'] = 100 * data['Signal']
    portfolio['positions'] = positions.multiply(data['Close'], axis=0)
    portfolio['cash'] = initial_capital - (positions.diff().multiply(data['Close'], axis=0)).cumsum()
    portfolio['total'] = portfolio['positions'] + portfolio['cash']
    # Calculate daily returns
    portfolio['returns'] = portfolio['total'].pct_change()
    return portfolio

# Calculate the Sharpe Ratio
def calculate_sharpe_ratio(portfolio):
    # Assuming risk-free rate = 0 for simplicity
    risk_free_rate = 0
    sharpe_ratio = (portfolio['returns'].mean() - risk_free_rate) / portfolio['returns'].std()
    # Annualize the Sharpe ratio (252 trading days per year)
    sharpe_ratio_annualized = (252**0.5) * sharpe_ratio
    return sharpe_ratio_annualized

# Main execution function
def run_strategy(ticker):
    start_date = '2023-06-09'
    end_date = datetime.now().strftime('%Y-%m-%d')
    data = fetch_stock_data(ticker, start_date, end_date)
    calculate_macd(data)
    generate_signals(data)
    portfolio = backtest_strategy(data)
    sharpe_ratio = calculate_sharpe_ratio(portfolio)
    # Select only the crossover days so buy and sell markers land on different points
    buys = data[data['Position'] > 0]
    sells = data[data['Position'] < 0]
    # Plot the results
    plt.figure(figsize=(10,6))
    plt.plot(data['Close'], label='Close Price', alpha=0.5)
    plt.plot(data['EMA_short'], label='12-day EMA', alpha=0.5)
    plt.plot(data['EMA_long'], label='26-day EMA', alpha=0.5)
    plt.plot(data['MACD'], label='MACD', alpha=0.5)
    plt.plot(data['Signal_Line'], label='Signal Line', alpha=0.5)
    plt.scatter(buys.index, buys['Close'], label='Buy Signal', marker='^', color='green', alpha=1)
    plt.scatter(sells.index, sells['Close'], label='Sell Signal', marker='v', color='red', alpha=1)
    plt.title(f'MACD Strategy: {ticker}')
    plt.legend()
    plt.show()
    plt.figure(figsize=(10,6))
    plt.plot(portfolio['total'], label='Portfolio Value')
    plt.title('Portfolio Performance')
    plt.legend()
    plt.show()
    print(f"Sharpe Ratio: {sharpe_ratio}")

# Run the strategy for a given stock
run_strategy('AAPL')

Output and Backtesting

[Figures: MACD strategy chart for AAPL; portfolio performance over the backtest period]

The result is not very impressive: the Sharpe ratio is 0.336, which is modest. For comparison, the following are the typical Sharpe ratios quoted for other strategies.

Strategy                  Average Sharpe Ratio
Buy and Hold              0.2-0.4
Moving Average Crossover  0.3-0.5
MACD (this strategy)      0.3363
RSI                       0.4-0.6
Momentum                  0.5-0.7

© 2024 by Youngho Chaa cha cha. All rights reserved.