In this Challenge, you’ll assume the role of a financial advisor at one of the top five financial advisory firms in the world. Your firm constantly competes with the other major firms to manage and automatically trade assets in a highly dynamic environment. In recent years, your firm has heavily profited by using computer algorithms that can buy and sell faster than human traders.
The speed of these transactions gave your firm a competitive advantage early on. However, people still need to program these systems explicitly, which limits their ability to adapt to new data. You’re thus planning to improve the existing algorithmic trading systems and maintain the firm’s competitive advantage in the market. To do so, you’ll enhance the existing trading signals with machine learning algorithms that can adapt to new data.
Use the starter code file to complete the steps that the instructions outline. The steps for this Challenge are divided into the following sections:
- Establish a Baseline Performance
- Tune the Baseline Trading Algorithm
- Evaluate a New Machine Learning Classifier
- Create an Evaluation Report
In this section, you’ll run the provided starter code to establish a baseline performance for the trading algorithm. To do so, complete the following steps.
Open the Jupyter notebook. Restart the kernel, run the provided cells that correspond with the first three steps, and then proceed to step four.
- Import the OHLCV dataset into a Pandas DataFrame.
- Generate trading signals using short- and long-window SMA values.
- Split the data into training and testing datasets.
- Use the `SVC` classifier model from SKLearn's support vector machine (SVM) learning method to fit the training data and make predictions based on the testing data. Review the predictions.
- Review the classification report associated with the `SVC` model predictions.
- Create a predictions DataFrame that contains columns for “Predicted” values, “Actual Returns”, and “Strategy Returns”.
- Create a cumulative return plot that shows the actual returns vs. the strategy returns. Save a PNG image of this plot. This will serve as a baseline against which to compare the effects of tuning the trading algorithm.
- Write your conclusions about the performance of the baseline trading algorithm in the `Evaluation_report.md` file that’s associated with your GitHub repository. Support your findings by using the PNG image that you saved in the previous step.
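The baseline steps above can be sketched roughly as follows. This is a minimal illustration and not the starter code itself: the synthetic price series, the `close` column name, the 4- and 50-day SMA windows, the three-month training period, and the output file name `baseline_cumulative_returns.png` are all assumptions chosen so that the example runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the figure can be saved without a display
import numpy as np
import pandas as pd
from pandas.tseries.offsets import DateOffset
from sklearn.metrics import classification_report
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the OHLCV data (assumption: only a "close" column is needed).
rng = np.random.default_rng(seed=1)
dates = pd.date_range("2020-01-01", periods=500, freq="D")
close = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=len(dates))))
signals_df = pd.DataFrame({"close": close}, index=dates)

# Daily returns plus short- and long-window SMA features (window sizes are illustrative).
signals_df["Actual Returns"] = signals_df["close"].pct_change()
signals_df["SMA_Fast"] = signals_df["close"].rolling(window=4).mean()
signals_df["SMA_Slow"] = signals_df["close"].rolling(window=50).mean()
signals_df = signals_df.dropna().copy()

# Target signal: 1 for a non-negative daily return, -1 otherwise.
signals_df["Signal"] = np.where(signals_df["Actual Returns"] >= 0, 1, -1)

# Shift the features one day so each row only sees the previous day's SMAs.
X = signals_df[["SMA_Fast", "SMA_Slow"]].shift().dropna()
y = signals_df["Signal"].loc[X.index]

# Chronological split: the first three months train the model, the rest tests it.
training_end = X.index.min() + DateOffset(months=3)
train_mask = X.index <= training_end
X_train, X_test = X[train_mask], X[~train_mask]
y_train, y_test = y[train_mask], y[~train_mask]

# Scale with statistics from the training period only, then fit and predict.
scaler = StandardScaler().fit(X_train)
svm_model = SVC().fit(scaler.transform(X_train), y_train)
svm_pred = svm_model.predict(scaler.transform(X_test))
print(classification_report(y_test, svm_pred, zero_division=0))

# Predictions DataFrame with the three columns named in the steps.
predictions_df = pd.DataFrame(index=X_test.index)
predictions_df["Predicted"] = svm_pred
predictions_df["Actual Returns"] = signals_df["Actual Returns"]
predictions_df["Strategy Returns"] = (
    predictions_df["Actual Returns"] * predictions_df["Predicted"]
)

# Cumulative product of (1 + return) for both series, saved as a PNG.
cumulative = (1 + predictions_df[["Actual Returns", "Strategy Returns"]]).cumprod()
ax = cumulative.plot(title="Baseline: actual vs. strategy cumulative returns")
ax.figure.savefig("baseline_cumulative_returns.png")
```

Because the prices here are random, the resulting numbers carry no meaning; the sketch only shows how the pieces fit together. With the real dataset, the saved PNG is the image to reference in the evaluation report.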
In this section, you’ll tune, or adjust, the model’s input features to find the parameters that result in the best trading outcomes. (You’ll choose the best by comparing the cumulative products of the strategy returns.) To do so, complete the following steps:
- Tune the training algorithm by adjusting the size of the training dataset. To do so, slice your data into different periods. Rerun the notebook with the updated parameters, and record the results in your `Evaluation_report.md` file. Answer the following question: What impact resulted from increasing or decreasing the training window?

  Hint: To adjust the size of the training dataset, you can use a different `DateOffset` value, for example, six months. Be aware that changing the size of the training dataset also affects the size of the testing dataset.
- Tune the trading algorithm by adjusting the SMA input features. Adjust one or both of the windows for the algorithm. Rerun the notebook with the updated parameters, and record the results in your `Evaluation_report.md` file. Answer the following question: What impact resulted from increasing or decreasing either or both of the SMA windows?
- Choose the set of parameters that best improved the trading algorithm returns. Save a PNG image of the cumulative product of the actual returns vs. the strategy returns, and document your conclusion in your `Evaluation_report.md` file.
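One way to organize the tuning runs is to wrap the pipeline in a function and loop over candidate parameters instead of editing and rerunning cells by hand. The sketch below is an assumption-laden illustration: `run_backtest` is a hypothetical helper, the price series is synthetic, and the candidate window sizes and training periods are arbitrary examples rather than recommended values.

```python
import numpy as np
import pandas as pd
from pandas.tseries.offsets import DateOffset
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def run_backtest(close, short_window, long_window, training_months):
    """Return the final cumulative product of (1 + strategy returns) on the test period."""
    df = pd.DataFrame({"close": close})
    df["Actual Returns"] = df["close"].pct_change()
    df["SMA_Fast"] = df["close"].rolling(window=short_window).mean()
    df["SMA_Slow"] = df["close"].rolling(window=long_window).mean()
    df = df.dropna().copy()
    df["Signal"] = np.where(df["Actual Returns"] >= 0, 1, -1)

    # Features are shifted one day so predictions use only prior information.
    X = df[["SMA_Fast", "SMA_Slow"]].shift().dropna()
    y = df["Signal"].loc[X.index]

    # A longer training period leaves a shorter testing period.
    training_end = X.index.min() + DateOffset(months=training_months)
    train_mask = X.index <= training_end

    scaler = StandardScaler().fit(X[train_mask])
    model = SVC().fit(scaler.transform(X[train_mask]), y[train_mask])
    predicted = model.predict(scaler.transform(X[~train_mask]))

    strategy_returns = df.loc[X.index[~train_mask], "Actual Returns"] * predicted
    return float((1 + strategy_returns).cumprod().iloc[-1])


# Synthetic prices stand in for the real closing prices (assumption).
rng = np.random.default_rng(seed=7)
dates = pd.date_range("2020-01-01", periods=500, freq="D")
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=500))), index=dates)

# Try each combination of training period and SMA windows (illustrative values).
rows = []
for training_months in (3, 6):
    for short_window, long_window in ((4, 50), (10, 100)):
        rows.append({
            "training_months": training_months,
            "short_window": short_window,
            "long_window": long_window,
            "final_cumulative_return": run_backtest(
                close, short_window, long_window, training_months
            ),
        })

results = pd.DataFrame(rows).sort_values("final_cumulative_return", ascending=False)
print(results.to_string(index=False))
```

Note that runs with different training periods are evaluated on different testing periods, so their final cumulative values are not strictly like-for-like; it is worth stating this in the report when comparing them.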
In this section, you’ll use the original parameters that the starter code provided, but you’ll apply them to a second machine learning model and evaluate its performance. To do so, complete the following steps:
- Import a new classifier, such as `AdaBoost`, `DecisionTreeClassifier`, or `LogisticRegression`. (For the full list of classifiers, refer to the Supervised learning page in the scikit-learn documentation.)
- Using the original training data as the baseline model, fit another model with the new classifier.
- Backtest the new model to evaluate its performance. Save a PNG image of the cumulative product of the actual returns vs. the strategy returns for this updated trading algorithm, and write your conclusions in your `Evaluation_report.md` file. Answer the following questions: Did this new model perform better or worse than the provided baseline model? Did this new model perform better or worse than your tuned trading algorithm?
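Swapping in a different classifier leaves the features and the train/test split unchanged; only the estimator differs. The following sketch, which again assumes synthetic prices, illustrative 4- and 50-day SMA windows, and a three-month training period, fits the baseline `SVC` alongside `LogisticRegression` and `DecisionTreeClassifier` on identical data and compares their backtests. The helper `final_cumulative_return` is hypothetical.

```python
import numpy as np
import pandas as pd
from pandas.tseries.offsets import DateOffset
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic prices stand in for the real closing prices (assumption).
rng = np.random.default_rng(seed=3)
dates = pd.date_range("2020-01-01", periods=500, freq="D")
df = pd.DataFrame(
    {"close": 100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=500)))}, index=dates
)

# Same feature engineering for every model (window sizes are illustrative).
df["Actual Returns"] = df["close"].pct_change()
df["SMA_Fast"] = df["close"].rolling(window=4).mean()
df["SMA_Slow"] = df["close"].rolling(window=50).mean()
df = df.dropna().copy()
df["Signal"] = np.where(df["Actual Returns"] >= 0, 1, -1)

X = df[["SMA_Fast", "SMA_Slow"]].shift().dropna()
y = df["Signal"].loc[X.index]

# Same chronological split for every model.
training_end = X.index.min() + DateOffset(months=3)
train_mask = X.index <= training_end
scaler = StandardScaler().fit(X[train_mask])
X_train_scaled = scaler.transform(X[train_mask])
X_test_scaled = scaler.transform(X[~train_mask])
y_train = y[train_mask]
actual_test_returns = df.loc[X.index[~train_mask], "Actual Returns"]


def final_cumulative_return(model):
    """Fit the model, trade on its test-period predictions, and return the end value."""
    predicted = model.fit(X_train_scaled, y_train).predict(X_test_scaled)
    strategy_returns = actual_test_returns * predicted
    return float((1 + strategy_returns).cumprod().iloc[-1])


# Baseline plus two alternative classifiers, all with near-default settings.
models = {
    "SVC (baseline)": SVC(),
    "LogisticRegression": LogisticRegression(),
    "DecisionTreeClassifier": DecisionTreeClassifier(max_depth=3, random_state=0),
}
comparison = pd.Series(
    {name: final_cumulative_return(model) for name, model in models.items()},
    name="final_cumulative_return",
)
print(comparison)
```

Keeping the data preparation outside the helper makes it explicit that any difference in the results comes from the classifier alone, which is the comparison the questions above ask for.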
In the previous sections, you updated your README.md file with your conclusions. To complete this section, you need to add a summary evaluation report at the end of the README.md file. For this report, express your final conclusions and analysis. Support your findings by using the PNG images that you created.