For more details about the project, please refer to the documentation.
This project scrapes stock financial data from Yahoo Finance for a wide range of companies (mostly US- and EU-based) and stores it in a cloud-based database. The database is private, so it cannot be accessed publicly; scheduled tasks extract financial data through the Yahoo API and load it into the database.
- Yahoo Finance only provides the last 4 quarters or years of financial data for a company. This project solves this problem by scraping the data from Yahoo Finance every quarter and keeping the old records in the database alongside the new ones. The database therefore contains all the financial data for a company since scraping started, which is more than the last 4 quarters or years.
- Yahoo Finance does not provide a way to download all the financial data for a wide range of companies at once. This project solves this problem by scraping the data from Yahoo Finance and storing it in a PostgreSQL database, where it can be accessed quickly through SQL queries.
- Yahoo Finance does not provide a way to filter companies based on their financial data. This project solves this problem by making it possible to filter companies with SQL queries over their financial data.
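
The points above can be illustrated with a minimal sketch. It is not the project's actual code: the `financials` table, its columns, the helper names, and the sample figures are all hypothetical, and the standard-library `sqlite3` module stands in for the private PostgreSQL database so the example runs anywhere (it needs SQLite 3.24 or newer for `ON CONFLICT ... DO NOTHING`, which PostgreSQL also supports). It shows how overlapping 4-quarter snapshots accumulate into a longer history, and how that history can then be filtered with SQL.

```python
import sqlite3

# Hypothetical schema: one row per (ticker, period_end, metric).
# The composite primary key is what makes repeated scrapes idempotent.
SCHEMA = """
CREATE TABLE IF NOT EXISTS financials (
    ticker      TEXT NOT NULL,
    period_end  TEXT NOT NULL,
    metric      TEXT NOT NULL,
    value       REAL,
    PRIMARY KEY (ticker, period_end, metric)
)
"""


def store_snapshot(conn, rows):
    """Insert scraped rows, skipping periods that are already stored."""
    conn.executemany(
        "INSERT INTO financials (ticker, period_end, metric, value) "
        "VALUES (?, ?, ?, ?) "
        "ON CONFLICT (ticker, period_end, metric) DO NOTHING",
        rows,
    )
    conn.commit()


def tickers_with_latest_above(conn, metric, threshold):
    """Return tickers whose most recent stored `metric` exceeds `threshold`."""
    query = """
        SELECT f.ticker
        FROM financials AS f
        WHERE f.metric = ?
          AND f.value > ?
          AND f.period_end = (
              SELECT MAX(period_end)
              FROM financials
              WHERE ticker = f.ticker AND metric = f.metric
          )
        ORDER BY f.ticker
    """
    return [row[0] for row in conn.execute(query, (metric, threshold))]


def make_snapshot(ticker, metric, quarters):
    """Build rows for one scrape from a {period_end: value} mapping."""
    return [(ticker, period, metric, value) for period, value in quarters.items()]


# Made-up figures: two scheduled runs, each seeing only the last 4 quarters.
first_run = make_snapshot("AAA", "TotalRevenue", {
    "2023-03-31": 10.0, "2023-06-30": 11.0, "2023-09-30": 12.0, "2023-12-31": 13.0,
})
second_run = make_snapshot("AAA", "TotalRevenue", {
    "2023-06-30": 11.0, "2023-09-30": 12.0, "2023-12-31": 13.0, "2024-03-31": 14.0,
})

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
store_snapshot(conn, first_run)
store_snapshot(conn, second_run)

stored_quarters = conn.execute(
    "SELECT COUNT(*) FROM financials WHERE ticker = 'AAA'"
).fetchone()[0]
```

After the two runs, `stored_quarters` counts five distinct quarters even though each scrape only saw four, and `tickers_with_latest_above(conn, "TotalRevenue", 13.5)` returns the tickers whose latest revenue figure is above the threshold.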
Once a month, the database is backed up and stored as Parquet files in an S3 bucket. The backup job is scheduled using GitHub Actions.
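
A rough sketch of what one step of such a backup could look like is given below. It is an illustration under assumptions, not the project's actual job: the function names, the `backups/<YYYY-MM>/<table>.parquet` key layout, and the use of pandas (with SQLAlchemy and a Parquet engine such as pyarrow) and boto3 are all hypothetical choices. The third-party imports are kept inside `backup_table` so the key-building helper can be used without them.

```python
import os
import tempfile
from datetime import date


def backup_key(table, run_date, prefix="backups"):
    """Build the S3 object key for one table's monthly backup (assumed layout)."""
    return f"{prefix}/{run_date:%Y-%m}/{table}.parquet"


def backup_table(engine, table, bucket, run_date):
    """Dump one table to a Parquet file and upload it to S3.

    `engine` is assumed to be a SQLAlchemy engine for the PostgreSQL database,
    and AWS credentials are assumed to come from the environment.
    """
    # Imported lazily: these are third-party dependencies of the sketch.
    import boto3
    import pandas as pd

    df = pd.read_sql_table(table, engine)
    key = backup_key(table, run_date)

    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, f"{table}.parquet")
        df.to_parquet(local_path, index=False)
        boto3.client("s3").upload_file(local_path, bucket, key)

    return key


# Example key for a run in March 2024 (table name is hypothetical).
example_key = backup_key("financials", date(2024, 3, 1))
```

Including the month in the key keeps each monthly backup as a separate object instead of overwriting the previous one.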