e-mail: dvkazakov @ gmail.com
(remove spaces on both sides of @)
Data Scientist reskilling course
Tomsk State University
Project-based peer-to-peer learning.
- I was the first of ~160 students to complete the training
- Faster learners presented their projects to other students to help them progress through the course. I presented four of the 16 projects
- As a top student, I also checked other students' graduation projects (normally done by staff)
Tech stack: Pandas, Sklearn, GIT, REST API, SQL, Matplotlib, Seaborn and Plotly.
- 0. Data acquisition. Merging data from different sources into a single dataframe
- 1. Data cleaning. Cleaning a dataset on bank clients for further processing
- 2. Predictive analytics I. Pre-processing (scaling and one-hot encoding) and logistic regression. Predicting bank client churn.
- 3. Predictive analytics II. Predicting bank client churn. Logistic regression, classification tree, random forest and gradient boosting. Cross-validation, ROC AUC
- 4. GIT basics
- 5. Optimization. Coding to implement a gradient descent algorithm
- 6. Computational complexity
- 7. Combinatorics
- 8. REST API. Free choice of projects. I developed a linguistic service that compares phrase frequencies in Google Books and tests the statistical significance of the differences found.
- 9. Mathematical statistics, probability theory, linear algebra. Distribution parameters, confidence intervals, probability theory formulas and operations with matrices. Image convolution. Singular value decomposition.
- 10. Pandas - working with dataframes.
- 11. Intro to data analysis. SQL + Pandas.
- 12. Data visualization. Matplotlib, Seaborn and Plotly.
- 13. Intro to machine learning. Binary and multiclass classification. Regression. Overfitting. Clustering
- 14. Advanced machine learning. Regularization, GridSearch, metrics, ensembles, pipelines and OOP
- 15. Prototype of a recipe recommender system aiming to help users with healthy diets. Using machine learning to predict a dish's rating from its ingredients. Providing data on nutritional value. Proposing recipes and daily menus based on nutritional value
- *** Airbnb price prediction. An extra project, not part of the curriculum