Data Science projects completed while studying in the "Data Scientist Plus" program by Yandex.Practicum

dstrebkov/data-scientist-plus-projects

"Data Scientist Plus" projects

Projects completed while studying in the "Data Scientist Plus" program by Yandex.Practicum (2021-2023)

State-recognized Diploma of professional retraining (PDF): English / Russian

| Project | Description | Tools / Libraries | Status |
| --- | --- | --- | --- |
| 01 Big cities music preferences | Check several statistical hypotheses about the music preferences of people living in Moscow and Saint Petersburg | pandas | Done |
| 02 Analysis of bank's borrower reliability | Analyze whether a bank client's marital status and number of children affect on-time loan repayment | pandas | Done |
| 03 Analysis of advertisements for the sale of apartments | Based on data from Yandex.Estate, determine the market value of real estate in Saint Petersburg | pandas, matplotlib | Done |
| 04 Study of data on film distribution | Analyze the film distribution market and identify current trends | pandas, numpy, matplotlib, seaborn | Done |
| 05 Determination of a prospective tariff for a telecom company | The commercial department of a telecom company needs an analysis of which tariff brings in more money | pandas, numpy, scipy, matplotlib, seaborn | Done |
| 06 Mobile tariff recommendation for a client | Build a classification model that selects the appropriate mobile tariff for a client | pandas, numpy, matplotlib, seaborn, sklearn | Done |
| 07 Bank's customer churn modelling | Based on historical data about clients' activity, predict whether a particular client will leave the bank in the near future | pandas, numpy, matplotlib, seaborn, sklearn | Done |
| 08 Choosing a location for an oil well | Using data on oil samples from three geographic regions, build a model to choose the most profitable oil well location | pandas, numpy, matplotlib, seaborn, sklearn | Done |
| 09 Predicting rejection of hotel reservation | Develop a model that predicts hotel booking rejection and determine whether the profit from the model would cover its development costs | pandas, numpy, matplotlib, seaborn, sklearn | Done |
| 10 SQL Basics | Practice writing basic SQL queries | SQL | Done |
| 11 Git and Command-Line | Practice using Git and the Linux command-line workflow | git, cmd | Done |
| 12 California housing cost prediction | Build a linear regression model on 1990s California housing data to predict the median cost of a house in a residential area | pandas, numpy, pyspark | Done |
| 13 Linear algebra to protect personal data | Protect the data of an insurance company's clients by developing a data transformation method that makes it difficult to recover personal information | pandas, numpy, sklearn | Done |
| 14 Cars cost determining | Based on historical data about the technical characteristics, trim levels, and prices of cars, build a model to determine car costs | pandas, numpy, matplotlib, seaborn, sklearn, catboost, lightgbm | Done |
| 15 Advanced SQL | Write 10 more advanced SQL queries from a Jupyter environment | SQL | Done |
| 16 Star temperature prediction | Using the characteristics of 240 already studied stars, create a neural network to determine the surface temperature of discovered stars | pandas, numpy, matplotlib, seaborn, sklearn, pytorch | Done |
| 17 Risk of road accident assessing | Create a system that assesses the risk of a road accident along a selected route; find out whether an accident can be predicted from the historical data of one of the regions | SQL, pandas, numpy, matplotlib, seaborn, sklearn, sqlalchemy, lightgbm, catboost | Done |
| 18 Forecasting taxi orders | Build a model that predicts the number of taxi orders for the next hour in order to attract more drivers during the peak period | pandas, numpy, matplotlib, seaborn, sklearn, statsmodels, lightgbm | Done |
| 19 Classification of comments whether they are toxic or not | Using a labeled dataset of English comments with toxicity markup, create a model to classify them as positive or negative | pandas, numpy, matplotlib, seaborn, nltk, spacy, lightgbm, afinn, nrclex | Done |
| 20 Determining age of buyers by their photos | Build a model that determines a person's approximate age from a photograph, using a labeled dataset of photographs of people | pandas, matplotlib, seaborn, keras | Done |
| 21 Search images by text query | Build a model that takes a textual description of a scene and returns several photos showing the same or a similar scene | pandas, numpy, matplotlib, seaborn, sklearn, pytorch, transformers, torchvision | Done |
| 22 Predict telecom contract termination | Build a model that predicts whether a subscriber will terminate their contract with a telecom company | SQL, pandas, numpy, matplotlib, seaborn, sqlalchemy, sklearn, lightgbm, catboost, pytorch | Done |
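
As an illustration of the idea behind project 13 (a data transformation that hides personal information), here is a minimal sketch of one common approach: multiplying the feature matrix by a random invertible matrix. This is an assumption about how such a transformation could work, not necessarily the method used in the project itself. The data, the helper names `random_invertible` and `fit_predict`, and the conditioning threshold are all hypothetical. The sketch shows that least-squares predictions are unchanged by the transformation, even though the stored features no longer resemble the originals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 clients, 4 numeric features, a noisy linear target.
X = rng.normal(size=(200, 4))
y = X @ np.array([1.5, -2.0, 0.5, 3.0]) + 0.7 + rng.normal(scale=0.1, size=200)


def random_invertible(n, rng, max_cond=1e3):
    """Draw random square matrices until one is well conditioned (hence invertible)."""
    while True:
        candidate = rng.normal(size=(n, n))
        if np.linalg.cond(candidate) < max_cond:
            return candidate


def fit_predict(features, target):
    """Ordinary least squares with an intercept; returns in-sample predictions."""
    design = np.column_stack([np.ones(len(features)), features])
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    return design @ weights


# The "key" matrix P; only someone holding P can invert the transformation.
P = random_invertible(X.shape[1], rng)
X_masked = X @ P

pred_plain = fit_predict(X, y)
pred_masked = fit_predict(X_masked, y)

# Largest absolute difference between the two sets of predictions.
max_diff = np.max(np.abs(pred_plain - pred_masked))
print(f"max prediction difference: {max_diff:.2e}")
```

The predictions agree up to floating-point error because the masked features span the same column space as the originals, so the least-squares fit is the same. The conditioning check is a practical safeguard in this sketch: a nearly singular key matrix would amplify numerical error.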
