UiO University of Oslo

INF5830 - Autumn 2011

Syllabus - details

Book by book

Manning and Schütze: Foundations of Statistical Natural Language Processing (FSNLP):

Ch 1
Ch 2:
- Sec 2.1 except 2.1.10,
- the essentials of Sec 2.2, up to and including 2.2.3
Ch 3 is considered known background and should be studied by those who lack this background
Ch 4
Ch 5, except 5.3.4
Ch 6:
- Introduction
- Sec 6.1
- Sec 6.2 up to (but not including) Sec. 6.2.4
Ch 7, except 7.3-7.4
Ch 8:
- Sec 8.1
- Sec 8.5
Ch 14:
- Sec 14.2
  - Introduction
  - 14.2.1
Ch 15:
- Sec 15.1-15.2
Ch 16:
- Introduction (up to but not including Sec 16.1)
- Sec 16.2
- Sec 16.4

Nivre’s web course: Statistical Natural Language Processing (NW)

Lect. 1-4

Bird, Klein and Loper: Natural Language Processing with Python (NLTK)

Ch 1: Sec 1.1-1.3, 1.5
Ch 2: Sec 2.1-2.2
Ch 3: Sec 3.1-3.2
Ch 6: Everything except Sec 6.4

Manning, Raghavan and Schütze: Introduction to Information Retrieval (IIR):

Ch 13, except 13.2.1
Ch 14:
- Introduction
- Sec 14.1-14.3

Jurafsky and Martin: Speech and Language Processing (J&M)

Ch 6: Sec. 6.6-6.8 (except 6.6.4)
Ch 20: Sec 20.7

By subject

Basics: “Working with texts”

FSNLP: Ch 1, Ch 4
NLTK: Ch 1, 2.1, 3.1-3.2
Slides from lecture 22 Aug

Probability theory

FSNLP: Sec 2.1 except 2.1.10
Nivre’s web course: Lect. 1-3

Main concepts of Entropy

FSNLP 2.2-2.2.3 (We do not expect all the details here, but you should know formulas 2.26 and 2.36 from FSNLP and have some idea of why entropy is an essential concept.)

Statistics and inference

FSNLP 5.1-5.3.3
Nivre’s web course, lect. 4
Slides from lecture 12 Sept.
It could be useful to consult other sources as well

Collocations

FSNLP: Ch. 5, except 5.3.4

Methodology, evaluation, smoothing

FSNLP:
- Ch 6 up to (but not including) Sec. 6.2.4
- Sec 8.1
- Ch 16: Introduction (up to 16.1)

Naïve Bayes classification and word sense disambiguation

FSNLP Ch. 7, except 7.3-7.4
Manning, Raghavan, Schütze, IIR, Ch. 13
NLTK 6.1-6.3, 6.5

Vector space semantics and IR

FSNLP Sec 8.1, 8.5, 15.1-2
J&M, Sec. 20.7
Slides from lecture 14 Nov.

Vector space classification: Rocchio and k nearest neighbors

FSNLP 16.4
IIR Ch. 14: Introduction and Sec 14.1-14.3
Slides from lecture 14 Nov.

Vector space flat clustering: k means

FSNLP 14.2: intro+14.2.1
Slides from lecture 14 Nov.

Linear classifiers, logistic regression, maximum entropy classifiers and tagging

FSNLP, Sec. 16.2
Jurafsky & Martin, Sec. 6.6-6.8 (except 6.6.4)
Ratnaparkhi 1996
NLTK, sec. 6.6
Slides from lectures 21 & 28 Nov.

Published Dec. 5, 2011 4:14 PM - Last modified Dec. 5, 2011 4:44 PM