Machine Learning for Language Analysis

Late Summer School

// 26.9.–29.9.2018






// Program


The Late Summer School was mainly addressed to students and doctoral candidates from linguistics and digital humanities, but numerous students from other fields with an interest in machine learning participated as well. The first part of the program was a basic introduction to machine learning for the analysis of natural language. The second part dealt with more specific questions from the field (see below for course material). We recruited lecturers for a range of topics, from theoretical background to practical implementation.

Download the full program pdf here



Wednesday / Thursday // 26./27.9.2018


learning Machine Learning


(Nils Reiter, IMS Stuttgart) - The theoretical basics of machine learning methods were presented in a mixture of hackathon and tutorial, including an example implementation in Python and the concrete evaluation of text-analytical methods within the framework of a small shared task.


The full course material (including slides, data and code) is available here.




Friday / Saturday // 28./29.9.2018


Machine Learning with audio and speech data

(Parallel session)


(Fraunhofer IAIS) - In the audio mining and speech recognition part, participants were introduced to using machine learning to solve problems related to audio data, in particular audio recordings of speech. The participants worked with multilingual audio data and focused on language-independent problems such as speech detection, speaker diarization and related tasks. The course was based on an open-source setup built on Keras (and TensorFlow).

Material: Abdullah's GitHub; D. Laqua: [1] [2] [3] [whiteboard pics]





Friday / Saturday // 28./29.9.2018


Deep Learning with Text Data
(Parallel session)


(IDH Cologne) - The workshop applied deep neural networks (DNNs) to written text. After a theoretical introduction to DNNs, the participants learned to solve classification problems with them. The second part of the workshop was a hands-on session on building a text generator with a DNN. All implementations were written in Python. The full course material (including IPython notebooks and presentation slides) is available in this GitHub repository.


Registration // contact








All participants will receive their certificate of participation via e-mail in the week after the school. For further questions, please use our contact e-mail:

Additional information about the school including the detailed program can be downloaded here.

The School was organized within the framework of the University of Cologne's Competence Area III (CA3: Quantitative Modeling of Complex Systems) by Jürgen Hermes, Claes Neuefeind (Institute for Digital Humanities, UoC), and Felix Rau (Institute for Linguistics, UoC). The School took place at the University of Cologne's Philosophikum.


We thank all participants and organizers for a successful and productive school!

// organization