This work was published in HICS 2021.
Finding exponential growth trends and growth potential is key to success for businesses and startups. Building a product for a market that can grow exponentially increases the likelihood of success. Such growth potential exists in a variety of sectors, but finding exponential patterns and trends poses several challenges. This paper addresses the problem of finding exponential patterns in data lakes. It proposes algorithms that scale to petabytes of data arriving in different sizes and formats (tabular files). These algorithms can be key to pattern discovery in data lakes, ultimately supporting the search for growth opportunities.
The scripts in this file are used to preprocess data.
Downloads datasets from the Kaggle data lake.
Generates headerfile.txt from the dataset folder; the splitter uses this file.
Generates sample files on the local machine.
Generates a sample file from the MapReduce output.
Generates sample files from the MapReduce part files.
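The header-file step above can be sketched as follows. This is a minimal illustration, not the repository's script: the function name `write_header_file`, the restriction to `.csv` files, and the output layout (one line per file: relative path, a tab, then the comma-joined column names) are all assumptions, since the actual format of headerfile.txt is not described here.

```python
import csv
import os
import tempfile


def write_header_file(dataset_dir, out_path):
    """Write one line per CSV file: relative path, a tab, the comma-joined header.

    Hypothetical helper: the real headerfile.txt layout may differ.
    Returns the number of header lines written.
    """
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for root, dirs, files in os.walk(dataset_dir):
            dirs.sort()  # visit subfolders in a deterministic order
            for name in sorted(files):
                if not name.lower().endswith(".csv"):
                    continue  # assumption: only CSV files are tabular inputs
                path = os.path.join(root, name)
                with open(path, newline="", encoding="utf-8", errors="replace") as f:
                    header = next(csv.reader(f), None)
                if header is None:
                    continue  # skip empty files
                rel = os.path.relpath(path, dataset_dir)
                out.write(rel + "\t" + ",".join(header) + "\n")
                count += 1
    return count


# Small demonstration on a temporary folder with one made-up CSV file.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "sales.csv"), "w", encoding="utf-8") as f:
        f.write("year,revenue\n2019,10\n2020,20\n")
    out_file = os.path.join(tmp, "headerfile.txt")
    n_written = write_header_file(tmp, out_file)
    with open(out_file, encoding="utf-8") as f:
        header_lines = f.read().splitlines()
    print(n_written, header_lines)
```

Reading only the first row of each file keeps the pass cheap even when individual files are large, which matters when the dataset folder holds many tables.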
Preprocesses the dataset and fits exponential and logistic patterns.
Contains support files for the exponential and logistic fits; called by step3_potentialfinder.
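As an illustration of what an exponential fit involves, the sketch below fits y = a * exp(b * x) by ordinary least squares on ln(y). It is not the repository's implementation: the function name `fit_exponential` and the log-linear method are assumptions, the sample data is synthetic, and the logistic fit (which generally needs a nonlinear optimizer) is not covered.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) via linear regression on ln(y).

    Hypothetical helper for illustration. Returns (a, b, r2), where r2 is
    the coefficient of determination measured in log space.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    if any(y <= 0 for y in ys):
        raise ValueError("log-linear fit requires strictly positive y values")

    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n

    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))

    b = sxy / sxx               # growth rate (slope in log space)
    ln_a = mean_l - b * mean_x  # intercept in log space

    ss_res = sum((l - (ln_a + b * x)) ** 2 for x, l in zip(xs, logs))
    ss_tot = sum((l - mean_l) ** 2 for l in logs)
    r2 = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return math.exp(ln_a), b, r2


# Synthetic series generated from a = 2.0, b = 0.5; the fit should recover both.
xs = list(range(10))
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a, b, r2 = fit_exponential(xs, ys)
print(a, b, r2)
```

A column could then be flagged as a candidate exponential pattern when b is positive and r2 exceeds a chosen threshold; the threshold itself is a design choice not specified here.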
Generates graphs and plots.
Generates graphs.