- Data science is a process, not an event: it is the process of using data to derive useful, generalizable insights.
- This process is carried out by professionals such as data analysts and data scientists.
- Given a dataset, you must be able to ask questions of it and be prepared to answer them. Suppose, for example, that you have a large amount of census data.
- What questions could you ask? For example: How many people are married or have children? Which age groups live abroad? What is the literacy rate?
- Data: the facts, values, figures, text, audio, and video that we generate in day-to-day life.
- This data comes from our smartphones, photos, videos, and text messages. Initially it is raw: it has not yet been analyzed or used to solve business problems. For example, hours of video are uploaded to YouTube every second, and thousands of Facebook users post something every minute.
- Information: the meaningful insights that emerge from data after proper analysis.
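
To make the step from data to information concrete, here is a minimal sketch of how the census questions above might be answered. The records, field names, and helper functions (`count_where`, `age_group`) are invented for illustration and do not come from any real census.

```python
from collections import Counter

# Hypothetical census records, invented purely for illustration (this is the raw "data").
census = [
    {"age": 34, "married": True,  "children": 2, "literate": True,  "abroad": False},
    {"age": 27, "married": False, "children": 0, "literate": True,  "abroad": True},
    {"age": 61, "married": True,  "children": 3, "literate": False, "abroad": False},
    {"age": 19, "married": False, "children": 0, "literate": True,  "abroad": False},
    {"age": 45, "married": True,  "children": 0, "literate": True,  "abroad": True},
]


def count_where(records, predicate):
    """Count the records for which predicate(record) is true."""
    return sum(1 for record in records if predicate(record))


def age_group(age):
    """Map an age to a ten-year bucket label, e.g. 27 -> '20-29'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


# Each answer below is a piece of "information" derived from the data.
married_count = count_where(census, lambda r: r["married"])
with_children = count_where(census, lambda r: r["children"] > 0)
literacy_rate = count_where(census, lambda r: r["literate"]) / len(census)
abroad_by_age_group = Counter(age_group(r["age"]) for r in census if r["abroad"])

print("Married:", married_count)
print("With children:", with_children)
print("Literacy rate:", literacy_rate)
print("Living abroad by age group:", dict(abroad_by_age_group))
```

On a real census the same questions would be asked of millions of rows, typically with a library such as Pandas or a SQL database rather than plain Python lists.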
flowchart LR
A((Data)) --> B(Vs)
B --> BA(Volume)
B ---> BB(Velocity)
B ----> BC(Veracity)
B ---> BD(Value)
B ----> BE(Variety)
- Data scientists are skilled professionals who solve real-world problems using data. They collect, analyze, manipulate, and visualize data, and extract useful information from it. These are the skills a data scientist must have.
flowchart LR
A((Data Scientist)) --> B(Skills)
B --> BA(Statistics)
B ---> BB(Python)
B ----> BC(R)
B ---> BD(Data Cleaning)
B ----> BE(Preprocessing)
B ----> BF(Visualization)
B ----> BG(SQL)
B ----> BH(Machine Learning)
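
As a rough illustration of how a few of these skills fit together (data cleaning, preprocessing, and basic statistics), the sketch below cleans a small list of messy age values and summarizes it. The input values and the function names `clean_ages` and `summarize` are assumptions made up for this example.

```python
from statistics import mean

# Hypothetical raw survey responses: ages typed in by hand, so they are messy.
raw_ages = [" 34", "27 ", "", "unknown", "61", "19", None, "45"]


def clean_ages(values):
    """Data cleaning: drop missing or non-numeric entries and convert the rest to int."""
    cleaned = []
    for value in values:
        if value is None:
            continue  # missing value
        text = value.strip()  # remove stray whitespace
        if text.isdigit():  # keep only whole, non-negative numbers
            cleaned.append(int(text))
    return cleaned


def summarize(ages):
    """Basic statistics: describe the cleaned values."""
    return {
        "count": len(ages),
        "min": min(ages),
        "max": max(ages),
        "mean": mean(ages),
    }


ages = clean_ages(raw_ages)
summary = summarize(ages)
print(ages)
print(summary)
```

In practice these steps are usually done with libraries such as Pandas, and the summary would feed into visualization or a machine learning model.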
flowchart LR
A((Data Science)) --> B(Skills)
B --> BA(Linear Algebra)
B ---> BB(Basic Mathematics)
B ----> BC(Probability)
B ---> BD(Data Cleaning)
B ----> BE(Preprocessing)
B ----> BF(Visualization)
B ----> BG(Machine Learning)
B ----> BH(SQL)
B ----> BI(MongoDB)
B ----> BJ(NumPy)
B ----> BK(Pandas)
B ----> BL(Matplotlib)
B ----> BM(Sklearn)
B ----> BN(TensorFlow)
B ----> BO(Bs4)
B ----> BP(PowerBI)
B ----> BQ(Tableau)
B ----> BR(Big Data)
B ----> BS(Hadoop)