This repository contains various machine learning and deep learning models applicable to the financial domain.
- 1. Models Included
- 2. Dependencies
- 3. Installation
- 4. Data Fetching
- 5. Data Preprocessing
- 6. Usage
- 7. Models Explained
- 8. Beyond The Models: Real-World Applications in Finance
- 9. Disclaimer
The repository consists of the following categories:
- Supervised Learning Models
  - Linear Regression
  - Logistic Regression
  - Naive Bayes
  - Random Forest
- Unsupervised Learning Models
  - Clustering (K-means)
  - Dimensionality Reduction (PCA)
- Deep Learning Models
  - Supervised Deep Learning Models
    - Recurrent Neural Networks (LSTM)
    - Convolutional Neural Networks (CNN)
  - Unsupervised Deep Learning Models
    - Autoencoders
    - Generative Adversarial Networks (GANs)
- Reinforcement Learning Models
  - Q-Learning
- Python 3.x
- yfinance
- NumPy
- TensorFlow
- Scikit-learn
To install all dependencies, run the following (optionally inside a conda or Python virtual environment):
pip install -r requirements.txt
To install only the essentials, run:
pip install yfinance numpy tensorflow scikit-learn
Data is fetched using the yfinance library for real-world financial data.
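import yfinance as yf

def fetch_data(ticker, start_date, end_date):
    # Return closing prices as a 1-D array; ravel() also covers the case
    # where 'Close' comes back as a one-column frame instead of a Series.
    return yf.download(ticker, start=start_date, end=end_date)['Close'].to_numpy().ravel()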
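The price series is converted into supervised-learning samples: each input is a window of `look_back` consecutive values, and the target is the value that follows it. These samples form the training and testing datasets that are fed into the machine learning models.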
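import numpy as np

def create_dataset(data, look_back=1):
    # Build (window, next value) pairs from a 1-D series.
    X, Y = [], []
    # Stop at len(data) - look_back so the final window is paired with the last value.
    for i in range(len(data) - look_back):
        X.append(data[i:i + look_back])  # look_back consecutive observations
        Y.append(data[i + look_back])    # the observation that follows the window
    return np.array(X), np.array(Y)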
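To make the windowing and the train/test split concrete, here is a minimal sketch on a toy series. The names `make_windows` and `chronological_split`, the 80/20 split, and the toy prices are assumptions made for this example and are not taken from the repository. The split is chronological (no shuffling), which is a common choice for time series so that the test set lies strictly after the training set.

```python
import numpy as np

def make_windows(series, look_back=1):
    """Hypothetical helper: pair each window of `look_back` values with the value that follows it."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + look_back] for i in range(len(series) - look_back)])
    y = series[look_back:]
    return X, y

def chronological_split(X, y, train_fraction=0.8):
    """Hypothetical helper: keep the earliest samples for training, the latest for testing."""
    split = int(len(X) * train_fraction)
    return X[:split], X[split:], y[:split], y[split:]

# Toy price series (made up for illustration): 10, 11, ..., 19
prices = np.arange(10, 20, dtype=float)

X, y = make_windows(prices, look_back=3)
X_train, X_test, y_train, y_test = chronological_split(X, y)

print("first window:", X[0], "-> target:", y[0])
print("train samples:", len(X_train), "test samples:", len(X_test))
```

With ten observations and `look_back=3`, this yields seven samples, of which the first five go to training and the last two to testing.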
Navigate to the respective folder and run the Python script for the model you're interested in.
python script_name.py
Linear Regression tries to fit a linear equation to the data, providing a straightforward and effective method for simple predictive tasks.
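As an illustration of fitting a linear equation, the sketch below estimates a slope and intercept by ordinary least squares using only NumPy. The synthetic price series and the variable names are assumptions for this example; the repository's own script may use scikit-learn or a different feature set.

```python
import numpy as np

# Synthetic prices (an assumption for illustration): linear trend plus small noise.
rng = np.random.default_rng(0)
t = np.arange(50, dtype=float)
prices = 100.0 + 0.5 * t + rng.normal(scale=0.1, size=t.size)

# Design matrix with a time column and a constant column for the intercept.
A = np.column_stack([t, np.ones_like(t)])

# Ordinary least squares: minimise ||A @ coef - prices||^2.
coef, *_ = np.linalg.lstsq(A, prices, rcond=None)
slope, intercept = coef

# One-step-ahead prediction for the next time index.
next_price = slope * t.size + intercept
print(f"slope={slope:.3f}, intercept={intercept:.2f}, next={next_price:.2f}")
```

The recovered slope and intercept should land close to the values used to generate the data (0.5 and 100), since the noise is small.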
Logistic Regression is traditionally used for classification problems but has been adapted here for regression tasks.
Naive Bayes is a probabilistic classifier based on Bayes' theorem that assumes the features are conditionally independent given the class. It is particularly useful when you have a small dataset.
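The following is a minimal Gaussian Naive Bayes sketch in plain NumPy, showing how Bayes' theorem turns per-class feature statistics into a prediction. The function names, the two toy features (previous return and volume change), and the labels are hypothetical and chosen only for illustration; the repository's implementation may differ.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate class priors and per-class feature means and variances."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # A small constant keeps the variances strictly positive.
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, variances

def predict_gaussian_nb(model, X):
    """Pick the class with the highest log posterior for each row of X."""
    classes, priors, means, variances = model
    # Sum of per-feature Gaussian log densities, shape (n_samples, n_classes).
    log_likelihood = -0.5 * (
        np.log(2 * np.pi * variances)[None, :, :]
        + (X[:, None, :] - means[None, :, :]) ** 2 / variances[None, :, :]
    ).sum(axis=2)
    log_posterior = np.log(priors)[None, :] + log_likelihood
    return classes[log_posterior.argmax(axis=1)]

# Toy data (made up): columns are [previous return, volume change]; label 1 = up day, 0 = down day.
X_train = np.array([[0.02, 0.5], [0.03, 0.4], [0.025, 0.6],
                    [-0.02, -0.3], [-0.03, -0.5], [-0.015, -0.4]])
y_train = np.array([1, 1, 1, 0, 0, 0])

model = fit_gaussian_nb(X_train, y_train)
predictions = predict_gaussian_nb(model, np.array([[0.022, 0.45], [-0.025, -0.35]]))
print(predictions)
```

Working in log space avoids numerical underflow when many feature likelihoods are multiplied together.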
Random Forest combines multiple decision trees to make a more robust and accurate prediction model.
K-means clustering is used to partition data into groups based on feature similarity.
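A compact sketch of the standard K-means iteration (Lloyd's algorithm) in NumPy is shown below. The `kmeans` function, the toy (return, volatility) pairs, and the choice of two clusters are assumptions for this example rather than the repository's code.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Alternate between assigning points to the nearest centroid and recomputing centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Euclidean distance from every point to every centroid, shape (n_points, k).
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # New centroid = mean of assigned points; keep the old one if a cluster is empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # assignments are stable
        centroids = new_centroids
    return labels, centroids

# Toy features (made up): [average return, volatility] for six hypothetical assets.
assets = np.array([[0.02, 0.10], [0.03, 0.12], [0.01, 0.11],
                   [0.10, 0.40], [0.12, 0.45], [0.09, 0.42]])

labels, centroids = kmeans(assets, k=2)
print(labels)
```

The first three assets should end up in one cluster and the last three in the other; which cluster receives label 0 depends on the random initialisation.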
PCA reduces the number of features in a dataset by projecting it onto the directions of greatest variance, retaining most of the information in fewer dimensions.
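One way to sketch PCA is through the singular value decomposition of the centred data matrix. In the example below, the `pca` function and the synthetic one-factor return series are assumptions for illustration; the repository may rely on scikit-learn's implementation instead.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components using SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = singular_values ** 2 / (len(X) - 1)
    explained_ratio = explained_variance[:n_components] / explained_variance.sum()
    scores = X_centered @ components.T
    return scores, components, explained_ratio

# Synthetic returns (made up): three series driven by one common factor plus small noise.
rng = np.random.default_rng(0)
factor = rng.normal(size=200)
returns = np.column_stack([factor, 0.8 * factor, -0.5 * factor])
returns = returns + 0.05 * rng.normal(size=returns.shape)

scores, components, explained_ratio = pca(returns, n_components=1)
print(scores.shape, explained_ratio)
```

Because the three series share a single driver, the first component should capture nearly all of the variance, so one feature can stand in for three.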
Recurrent Neural Networks, particularly using Long Short-Term Memory (LSTM) units, are highly effective for sequence prediction problems. In finance, they can be used for time-series forecasting like stock price predictions.
Convolutional Neural Networks are primarily used in image recognition but can also be applied in finance for pattern recognition in price charts or for processing alternative data types, such as satellite images for agricultural commodity predictions.
Autoencoders are used for anomaly detection in financial data, identifying unusual patterns that do not conform to expected behavior.
GANs are used for simulating different market conditions, helping in risk assessment for various investment strategies.
Q-Learning is a model-free reinforcement learning algorithm, used here for stock trading.
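The core of tabular Q-Learning is the update Q(s, a) ← Q(s, a) + α [r + γ max Q(s', ·) − Q(s, a)]. The sketch below applies it to a deliberately simple toy market. Everything about the setup is an assumption for illustration: the state is only whether the last move was up or down, the actions are "flat" or "long", the reward is the next return when long, and the return series is an artificial trending pattern. The repository's trading agent is likely richer than this.

```python
import numpy as np

def train_q_table(returns, alpha=0.02, gamma=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy on a toy two-state market."""
    rng = np.random.default_rng(seed)
    # Rows: state (0 = last move down, 1 = last move up). Columns: action (0 = flat, 1 = long).
    q = np.zeros((2, 2))
    state = int(returns[0] > 0)
    for r in returns[1:]:
        # Explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            action = int(rng.integers(2))
        else:
            action = int(np.argmax(q[state]))
        reward = action * r           # earn the return only when long
        next_state = int(r > 0)
        # Temporal-difference target uses the best action value in the next state.
        td_target = reward + gamma * q[next_state].max()
        q[state, action] += alpha * (td_target - q[state, action])
        state = next_state
    return q

# Artificial trending series (made up): nine up moves followed by nine down moves, repeated.
returns = np.tile([1.0] * 9 + [-1.0] * 9, 1000)

q_table = train_q_table(returns)
policy = q_table.argmax(axis=1)  # greedy action for each state
print(q_table)
print(policy)
```

On this momentum-driven series, the learned greedy policy should be to stay flat after a down move and go long after an up move. Real price data has far weaker structure, so results on this toy series say nothing about live performance.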
In addition to the core machine learning models that form the backbone of this repository, we'll explore practical applications that span various dimensions of the financial sector. Below is a snapshot of the project's tree structure that gives you an idea of what these applications are:
5. ml_applications_in_finance
│   ├── risk_management
│   ├── decentralized_finance_(DEFI)
│   ├── environmental_social_and_governance_investing_(ESG)
│   ├── behavioural_economics
│   ├── blockchain_and_cryptocurrency
│   ├── explainable_AI_for_finance
│   ├── robotic_process_automation_(RPA)
│   ├── textual_and_alternative_data_for_finance
│   ├── fundamental_analysis
│   ├── satellite_image_analysis_for_finance
│   ├── venture_capital
│   ├── asset_management
│   ├── private_equity
│   ├── investment_banking
│   ├── trading
│   ├── portfolio_management
│   ├── wealth_management
│   ├── multi_asset_risk_model
│   ├── personal_financial_management_app
│   ├── market_analysis_and_prediction
│   ├── customer_service
│   ├── compliance_and_regulatory
│   ├── real_estate
│   ├── supply_chain_finance
│   ├── invoice_management
│   ├── cash_management
From risk management to blockchain and cryptocurrency, from venture capital to investment banking, and from asset management to personal financial management, we aim to cover a wide array of use-cases. Each of these applications is backed by one or more of the machine learning models described earlier in the repository.
Note: The list of applications is not exhaustive, and the project is a work in progress. While I aim to keep updating it with new techniques and applications, modules may be added or removed based on their relevance and effectiveness.
The code provided in this repository is for educational and informational purposes only. It is not intended for live trading or as financial advice. Please exercise caution and conduct your own research before making any investment decisions.