Tom Auger (taugr)

Coding, researching, and teaching with AI

Information Retrieval and Text Analysis Lab - TUMO 2023

Preserved from the old site: https://tomauger.gitlab.io/tumo2023/.

Բարև Ձեզ (Hello)

Welcome to the lab page for the Information Retrieval and Text Analysis Lab, taught at TUMO Armenia in Yerevan and Gyumri in Winter 2023!

Here you’ll find all the lab materials, which will remain freely available both during the lab and after it ends.

Download Course Files

Download everything from the git repository. Make sure to fetch the datasets submodule by adding the --recurse-submodules flag:

git clone https://github.com/tom-auger/tumo-2023-irta.git --recurse-submodules

Contact Me

If you have any questions or comments during or after the course, please feel free to email me: [email protected].

Schedule and Materials


Lesson 1 - Introduction & Text Laws

Introduction to the course. Preprocessing text documents. Exploring text laws including Zipf’s Law, Benford’s Law, Heaps’ Law, and clumping and contagion.
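
As a rough illustration of the preprocessing step and two of the laws named above, here is a minimal sketch. It is not taken from the course materials: Python is an assumption, the sample sentence is made up, and the helper names (tokenize, rank_frequency, vocabulary_growth) are hypothetical. It lowercases and tokenizes a text, ranks words by frequency (the view used for Zipf’s Law), and tracks how the vocabulary grows as tokens are read (the view used for Heaps’ Law).

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def rank_frequency(tokens):
    """Return (rank, word, count) tuples, most frequent word first.

    Zipf's Law says count is roughly proportional to 1 / rank in large corpora.
    """
    counts = Counter(tokens)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


def vocabulary_growth(tokens):
    """Return the number of distinct words seen after each token.

    Heaps' Law says this grows sublinearly with the number of tokens read.
    """
    seen = set()
    growth = []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth


# A tiny made-up sample; a real experiment would use a much larger corpus.
sample = "the cat sat on the mat and the dog sat on the log"
tokens = tokenize(sample)

for rank, word, count in rank_frequency(tokens)[:3]:
    print(rank, word, count)

print(vocabulary_growth(tokens))
```

On a realistically sized corpus, plotting count against rank on log-log axes tends to give a roughly straight line, and the vocabulary curve flattens as more tokens are read. A sample this small only shows the mechanics.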