Ark Invest holdings tracking and analytics app built on GCP (Google Cloud Platform).
- Daily trade price prediction based on open/high/low/close price, volume, action (Sell/Buy) and past Ark real execution prices
- More analytical graphs, tables, figures etc.
A GCP Cloud Function.
- Trigger: a new csv file (including overwrites of an existing file) uploaded to a specific Cloud Storage location (`nw-msds498-ark-etf-analytics`)
- This function does one task: load the csv file (Ark Invest daily holdings, in a specific format) into the BigQuery table `ark.holdings`
- `env.yaml` - environment variables for the cloud function
- `main.py` - main logic
- `requirement.txt` - Python libraries required to run this function
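As an illustration of the trigger-and-load flow described above, here is a minimal sketch and not the actual `main.py`. It assumes a background function triggered by a Cloud Storage upload, whose event carries `bucket` and `name` fields, and it assumes the csv has one header row and that schema autodetection is acceptable. The function names are hypothetical.

```python
def gcs_uri_from_event(event):
    """Build a gs:// URI from a Cloud Storage trigger event, or None for non-csv files."""
    name = event["name"]
    if not name.lower().endswith(".csv"):
        return None
    return f"gs://{event['bucket']}/{name}"


def load_csv_to_bigquery(event, context=None, table_id="ark.holdings"):
    """Entry point: append the uploaded csv file to the BigQuery table."""
    # Imported lazily so the URI helper above works without the GCP client installed.
    from google.cloud import bigquery

    uri = gcs_uri_from_event(event)
    if uri is None:
        return None  # ignore uploads that are not csv files

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # assumption: the file has one header row
        autodetect=True,      # assumption: the schema is inferred, not declared
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    load_job.result()  # block until the load job finishes or raises
    return uri
```

Appending on every upload means that re-uploading the same file would add its rows again, so some form of duplicate handling is needed downstream.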
One-off function to convert Ark Invest's PDF files to csv format, then load it to cloud storage/BigQuery.
- `docs` - temporarily stores the PDF files to be converted
- `csv` - temporarily stores the converted csv files
- `upload.sh` - uploads the converted csvs to Cloud Storage
- `main.py` - converts holdings PDFs to csv using the `tabula` and `pandas` libraries
- `trade_price.py` - converts the Ark trade log PDF (with real execution prices) to csv format
- `merge_price.py` - merges the converted csvs and checks them for duplicates and inconsistencies
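The merge-and-validate step can be sketched as follows. This is only an illustration: it uses the standard library `csv` module instead of `pandas`, and the key columns (`date`, `ticker`, `action`) are assumptions, since the real column names are not listed here. Rows that repeat exactly are dropped, and rows that share a key but differ elsewhere are reported as inconsistencies.

```python
import csv
import io


def merge_and_validate(csv_texts, key_fields=("date", "ticker", "action")):
    """Merge csv texts; drop exact duplicates and report rows whose key matches but values differ."""
    merged = {}     # key -> first row seen for that key
    conflicts = []  # (first_row, conflicting_row) pairs
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            key = tuple(row[field] for field in key_fields)
            if key not in merged:
                merged[key] = row
            elif merged[key] != row:
                conflicts.append((merged[key], row))
            # otherwise the row is an exact duplicate and is skipped
    return list(merged.values()), conflicts
```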
- `iex.py` - core functions to get minute and day price information using the iexcloud API via the `iexfinance` Python library
- `config.py` - configuration used by multiple py files, including reading/preparing API keys and dictionaries of commonly used fields such as:
  - foreign symbols that can't be handled yet
  - standard API errors
  - standard API request types
  - queries for all available trading days, symbols, etc.
- `get_trade_days.py` - uses the `pandas_market_calendars` library to get the trading days within a date range
- `token.py` - retrieves the IEX API token; will be deprecated soon
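A trading-day lookup with `pandas_market_calendars` might look like the sketch below. It is not the actual `get_trade_days.py`: the function signature is hypothetical and the choice of the NYSE calendar is an assumption. The calendar can be passed in, which makes the function easy to exercise without the library.

```python
def get_trade_days(start, end, calendar=None):
    """Return the trading days between start and end (inclusive) as 'YYYY-MM-DD' strings."""
    if calendar is None:
        # Imported lazily; assumption: the NYSE calendar is the relevant one.
        import pandas_market_calendars as mcal

        calendar = mcal.get_calendar("NYSE")
    days = calendar.valid_days(start_date=start, end_date=end)
    return [day.strftime("%Y-%m-%d") for day in days]
```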
temporary folders for data and logging
Shell script to create the Python requirements file and remove the `pkg-resources==0.0.0` entry; this is specific to developing in Google Cloud Shell
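The filtering step of that script can be expressed in Python as below. This is a hedged equivalent for illustration, not the shell script itself, and the function name is hypothetical.

```python
def clean_requirements(text, unwanted="pkg-resources==0.0.0"):
    """Return the requirements text with the unwanted pin removed."""
    kept = [line for line in text.splitlines() if line.strip() != unwanted]
    return "".join(line + "\n" for line in kept)
```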
A wrapper function to call cli/iex.py
Upload files with a specific name pattern to Cloud Storage, then load them from Cloud Storage into BigQuery
- Delete table from BigQuery first if needed
One-off function to merge multiple csv files in the same format into one.
WIP Python library to get data from the IEX API (and potentially other APIs in the future), intended to solve the efficiency issue with the `iexfinance` library
One-off shell script to rename some previously produced files
python libraries required to run everything under price folder
bulk upload files to cloud storage
local --> cloud storage --> BigQuery
- takes in two arguments
- price type, min or day
- file name to be loaded (on cloud storage)
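The two-argument interface could be parsed as in this sketch. The table names in `TABLES` are hypothetical placeholders, since the real destination tables are not named here; only the `min`/`day` choice and the file name argument come from the description above.

```python
import argparse

# Hypothetical table names: the real destination tables are not given in this README.
TABLES = {"min": "ark.price_min", "day": "ark.price_day"}


def parse_args(argv=None):
    """Parse the price type (min or day) and the Cloud Storage file name."""
    parser = argparse.ArgumentParser(
        description="Load a price file from Cloud Storage into BigQuery"
    )
    parser.add_argument("price_type", choices=sorted(TABLES), help="price type: min or day")
    parser.add_argument("file_name", help="name of the file on Cloud Storage")
    return parser.parse_args(argv)


def destination_table(price_type):
    """Map a price type to its (hypothetical) BigQuery table."""
    return TABLES[price_type]
```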
This folder contains all components of a full-fledged, working cloud function.
It pulls holdings data (csv format) from the Ark Invest website and uploads the csv files to Cloud Storage, which then triggers the `csv_loader` function automatically.
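The pull-and-upload step might be sketched as follows. This is an assumption-laden illustration and not the deployed code: the source URL, bucket, and blob name are passed in by the caller, and `raw_blob_name` is a hypothetical naming scheme.

```python
import urllib.request


def raw_blob_name(fund, day):
    """Hypothetical naming scheme for raw files, e.g. 'ARKK_2021-01-04.csv'."""
    return f"{fund}_{day}.csv"


def pull_and_upload(url, bucket_name, blob_name):
    """Download one csv file and store it in Cloud Storage; return the byte count."""
    # Imported lazily so raw_blob_name works without the GCP client installed.
    from google.cloud import storage

    with urllib.request.urlopen(url) as response:
        data = response.read()
    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    blob.upload_from_string(data, content_type="text/csv")
    return len(data)
```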
- `env.yaml` - environment variables for the cloud function
- `main.py` - code of the cloud function
- `requirements.txt` - list of Python libraries needed
- `sh` folder - scripts to support deployment of the function etc.
  - `deploy.sh` - script to deploy this cloud function
  - `log_today.sh` - gets today's logs of this function (`ark-holdings-daily-pull`) running on GCP
    - Takes a specific date and start time as well
  - `list_raw.sh` - simple script to list the Cloud Storage files matching a specific pattern where raw files are stored
  - `less_than_7_etfs.sh` - old script to check whether the load failed on a certain day, leaving fewer than 7 ETFs loaded
  - `count_dup.sh` - old command-line version of the BQ query to verify duplicate record counts in the `ark.holdings` table
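The idea behind the "fewer than 7 ETFs" check can be sketched in Python as below. The original is a shell script, so this is only an illustration of the logic, and the input shape (pairs of date and fund) is an assumption.

```python
from collections import defaultdict

EXPECTED_ETF_COUNT = 7  # from the description above: all 7 ETFs should load each day


def days_missing_etfs(rows, expected=EXPECTED_ETF_COUNT):
    """rows: iterable of (date, fund) pairs. Return {date: distinct fund count} for short days."""
    funds_by_date = defaultdict(set)
    for date, fund in rows:
        funds_by_date[date].add(fund)
    return {
        date: len(funds)
        for date, funds in sorted(funds_by_date.items())
        if len(funds) < expected
    }
```

Counting distinct funds, not rows, keeps duplicate records from hiding a missing ETF.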
Web interface built with the Dash framework, running on Google App Engine.
- `app.py` - Python code to create the `Dash` app and server
- `apps` folder
  - `holdings.py` - page for Ark holdings
  - `changes.py` - page for daily changes in Ark holdings
- `main.py` - main page of the web interface
- `core.py` - core backend logic to support the data presented on the web interface
  - fills in the dropdown data on all pages
  - once a selection is made (including the default selection when the first page opens), retrieves data from BigQuery tables and returns it to the frontend
- `app.yaml` - App Engine config file
- `app_flex.yaml` - to deploy this app to the flexible Google App Engine environment, override `app.yaml` with this file
- `app_stand.yaml` - to deploy this app to the standard Google App Engine environment, override `app.yaml` with this file
  - By default, `app.yaml` uses this standard config
- `assets` - folder for the css file and Ark Invest logo
- `requirements.txt` - Python libraries required to run this web interface
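The dropdown-then-query pattern described for `core.py` can be illustrated with a minimal Dash sketch. It assumes Dash 2.0 or later for the import style. `funds` and `load_holdings` are hypothetical stand-ins for the real dropdown data and BigQuery lookup, and the component ids are made up for the example.

```python
def dropdown_options(values):
    """Convert plain values into the label/value dicts that dcc.Dropdown accepts."""
    return [{"label": str(value), "value": value} for value in values]


def create_app(funds, load_holdings):
    """Build a Dash app; `funds` and `load_holdings` are hypothetical stand-ins for core.py."""
    # Imported lazily so dropdown_options works without Dash installed.
    from dash import Dash, Input, Output, dcc, html

    app = Dash(__name__)
    app.layout = html.Div([
        dcc.Dropdown(id="fund", options=dropdown_options(funds), value=funds[0]),
        html.Pre(id="holdings"),
    ])

    @app.callback(Output("holdings", "children"), Input("fund", "value"))
    def show_holdings(fund):
        # Runs once on page load with the default selection, then on every change.
        return str(load_holdings(fund))

    return app
```

The underlying Flask server is available as `app.server` on the returned object.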
This folder contains some deprecated functions, temporary fixes, and functions that are not yet running in production
Some temporary scripts and data mostly for data fixes
- `fixes` - data that has been fixed manually and some scripts related to it
- `tmp` - description, schema, and other info related to the trades table
A full-fledged cloud function that removes duplicate records from the `ark.holdings` table
- Still running on cloud function
- No longer really needed as the problem that would cause duplicate records has been fixed
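One way such a deduplication could be done is sketched below. It is not necessarily how this function works: it assumes a plain, non-partitioned table whose column types all support `SELECT DISTINCT`, and it rewrites the whole table in place. The function names are hypothetical.

```python
def dedup_sql(table_id="ark.holdings"):
    """Build a statement that rewrites the table keeping one copy of each identical row."""
    return (
        f"CREATE OR REPLACE TABLE `{table_id}` AS "
        f"SELECT DISTINCT * FROM `{table_id}`"
    )


def remove_duplicates(table_id="ark.holdings"):
    """Run the deduplication statement and wait for it to finish."""
    # Imported lazily so dedup_sql works without the GCP client installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    client.query(dedup_sql(table_id)).result()
```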
For Gmail authentication, used to retrieve the daily trade notification emails from ARK; not working yet.