A Humorous Introduction to Natural Language Processing

Area

Data Science

Date

14 - 25 November 2022

Exactpro events

A Humorous Introduction to Natural Language Processing is a mini-course that took place online on 14-25 November 2022. The lead expert for the course was Pavel Braslavski, Associate Professor and Senior Research Fellow, Faculty of Computer Science, HSE University.

The mini-course was a brief introduction to natural language processing (NLP), with the task of funny news title generation as a working example. It covered basic NLP concepts and approaches and provided hands-on experience with various NLP tools. As part of the course, we also became acquainted with subtasks such as:

  • tokenization, 
  • POS-tagging, 
  • semantic distance, 
  • sentiment analysis,
  • edit distance,
  • evaluation of results.
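
Of the subtasks listed above, edit distance is compact enough to illustrate in a few lines. The following is a minimal sketch of the classic dynamic-programming Levenshtein distance in plain Python; it is not taken from the course materials, and the function name `levenshtein` and the example words are assumptions chosen for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn `a` into `b`.

    Illustrative sketch only; not the implementation used in the course.
    """
    # previous[j] = distance between the prefix of `a` processed so far
    # (initially empty) and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        # Turning the first i characters of `a` into "" takes i deletions.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # delete char_a
                    current[j - 1] + 1,                   # insert char_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # Words that differ by one edit are natural candidates for a pun-like swap.
    print(levenshtein("president", "resident"))
    print(levenshtein("kitten", "sitting"))
```

In a title-rewriting setting, a small distance between two words suggests that one could stand in for the other while keeping the headline recognizable; how the course actually used the measure is not stated here.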

The mini-course consisted of two parts: theory and practice. The theory part featured five lectures with hands-on assignments, while the practice part focused on individual project work and included technical support sessions guided by teaching assistants from the Exactpro team.

Prerequisites: basic Python skills and a Google account for working in Colab
Theory dates/time: 14-18 November, 18:00 GET/19:30 SLST
Practice dates/time: 21-25 November, TBA
Exactpro teaching assistants: Stanislav Glushkov, DocOps Engineer; Tornike Baramidze, QA Analyst; Julia Emelianova, Researcher.

The full mini-course agenda featured the following topics:

  • Natural Language Processing: A Very Short Introduction. Computational Humor. Task and Data Used in the Mini-Course.
  • Tokenization and Part-of-Speech (POS) Tagging. Stanza Package.
  • Semantic Resources: WordNet. Word Embeddings: fastText.
  • Similar-Sounding and Similarly Spelled Words: Soundex, Levenshtein Distance. Datamuse API. Evaluation in NLP; a Measure of Inter-Rater Agreement: Cohen's Kappa.
  • Sentiment Analysis: Sentiment Dictionaries, Sentiment Classifiers.
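
For readers unfamiliar with the inter-rater agreement measure named in the agenda, Cohen's kappa compares the observed agreement between two raters, p_o, with the agreement expected by chance, p_e, as kappa = (p_o - p_e) / (1 - p_e). The sketch below is a plain-Python illustration under stated assumptions: the function name `cohens_kappa`, the "funny"/"not" labels, and the sample ratings are all hypothetical and do not come from the course.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items.

    Illustrative sketch only; labels may be any hashable values.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")

    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal probabilities of using it, summed over labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout, so kappa is
        # undefined (0/0); returning 1.0 here is a convention assumed
        # for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical judgments of four generated titles by two annotators.
    annotator_1 = ["funny", "funny", "not", "not"]
    annotator_2 = ["funny", "not", "not", "not"]
    print(cohens_kappa(annotator_1, annotator_2))
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean agreement worse than chance, which is why kappa is more informative than raw percentage agreement when labels are unevenly distributed.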

You can also check out our Python4ML Bootcamp. The bootcamp aims to introduce you to Python programming basics and reviews various beginner-level tasks that typically need to be solved for data science and data analysis purposes.