Background

Natural Language Processing is an interdisciplinary field that builds on insights from several areas, including:

  • Language and Linguistics
  • Computer Science in general and programming in particular
  • Probabilities and Statistics
  • Machine Learning and "Data Science"

Students who come to this class have different backgrounds. Some are familiar with some of these fields; others are familiar with different ones. We will try to cover much of the background material, but not all of it, so you might have to read some on your own. What we cover in class will be adapted to what can be assumed of first-year master's students in Informatics: Language and Computation, since this course is mandatory for these students.

Here is some more on assumed background and recommendations on what to read.

Language and Linguistics

You have to be familiar with some core concepts of linguistics, like "parts of speech" and "sentence structure". If you have not taken any courses in linguistics or NLP/Computational Linguistics, you should consult some of the following.

  • Chapter 3, "Linguistic Essentials", pp. 81-115, in Manning and Schütze: Foundations of Statistical Natural Language Processing. This is the best overview of what will be assumed in the course. Unfortunately, the book is not online, but you can find it in the library.
  • Jurafsky and Martin, Speech and Language Processing. Sections 3.1 and 12.1-12.3 introduce some of the key concepts of morphology and syntax.

Probabilities

Probabilities and probability theory are basic tools for many language technology applications, including machine translation. Since not all of you may be familiar with this, we will give a crash course on probabilities, logarithms and mathematical notation in the group session slots on Thursday 25 August and Thursday 1 September for those who need it. We will not go beyond what is covered in INF5830.

Programming in Python

We assume you have programming experience in some language(s), and that if you have no experience with Python, you are able to catch up. Python will be used, particularly in the second half of the course.

Distributional semantics

The part on distributional semantics in the second half of the semester will assume familiarity with the topic from INF4820.

    Published Aug. 17, 2016 11:14 AM - Last modified Aug. 17, 2016 11:22 AM