e-mail: dvkazakov @ gmail.com
(remove the spaces on both sides of '@')

Tel./WhatsApp: +7-916-909-7864

Telegram: @denis_v_kazakov

GitHub

Skype: denis.v.kazakov


An analysis of n-gram frequencies in Google Books

A study project for the Data Science reskilling course delivered by Tomsk State University.

Skills:


Project files: Jupyter notebook and Python script.

One of a translator's tools is analyzing word and phrase frequencies, for example with Google Ngrams. The web service plots annual frequencies; however, the comparison is only qualitative, and it is not always possible to say which phrase is more frequent or whether the difference is statistically significant.

The goal of the project was to supplement the plot with a quantitative evaluation.
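
As an illustration of what a numerical comparison of two phrases might look like, here is a minimal sketch. It is not the method used in the project files, which this page does not describe. It assumes a paired sign-flip permutation test on the annual frequencies. The function name `paired_sign_flip_test` and the frequency values are made up for the example. Annual frequencies are usually correlated from year to year, so the resulting p-value should be read as a rough indication only.

```python
import random
from statistics import mean


def paired_sign_flip_test(freq_a, freq_b, n_resamples=10_000, seed=0):
    """Two-sided sign-flip permutation test on paired annual frequencies.

    Hypothetical helper for illustration only.
    Returns (mean difference a - b, approximate p-value).
    """
    if len(freq_a) != len(freq_b):
        raise ValueError("both series must cover the same years")
    diffs = [a - b for a, b in zip(freq_a, freq_b)]
    observed = mean(diffs)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each yearly difference is arbitrary.
        resampled = mean([d if rng.random() < 0.5 else -d for d in diffs])
        if abs(resampled) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Made-up annual relative frequencies for two hypothetical phrases.
phrase_a = [2.1e-6, 2.3e-6, 2.2e-6, 2.6e-6, 2.8e-6, 2.7e-6, 3.0e-6, 3.1e-6]
phrase_b = [1.9e-6, 2.0e-6, 2.1e-6, 2.2e-6, 2.3e-6, 2.5e-6, 2.6e-6, 2.6e-6]

diff, p = paired_sign_flip_test(phrase_a, phrase_b)
print(f"mean difference: {diff:.2e}, p-value: {p:.4f}")
```

A positive mean difference with a small p-value would suggest that the first phrase is consistently more frequent over the years compared. Real data would come from the Google Books n-gram datasets rather than the hard-coded lists above.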

Example of results:


