Syllabus

The syllabus consists of:

  • Weekly slides
  • Weekly exercises
  • Mandatory assignments
  • Readings

The detailed syllabus for each week is listed on the corresponding weekly page.

This is an overview of the mandatory readings. For recommended readings, exercises, etc., see the weekly pages.

Jurafsky and Martin, Speech and Language Processing, 3rd ed. (Jan. 2022 draft)

  • Ch. 2 Regular expressions, etc.
    • Sec. 2.0
    • Sec. 2.2 Words
    • Sec. 2.3 Corpora
    • Sec. 2.4 Normalization, except 2.4.3 and the technical details of 2.4.1
    • Sec. 2.5 Edit distance
  • Ch. 3, "N-gram Language Models"
    • Sections 3.0-3.5
  • Ch. 4, "Naïve Bayes Classification and Sentiment"
    • Except (for now) Sec. 4.9 Statistical significance testing
  • Ch. 5, "Logistic Regression", except
    • Scaling input features in Sec. 5.2.2
    • Sec. 5.7, Last paragraph, starting with "Both L1 and L2..."
  • Ch. 6, "Vector Semantics and Embeddings", except
    • Sec. 6.6 Pointwise Mutual Information (PMI)
  • Ch. 7 "Neural Networks and Neural Language Models"
  • Ch. 8 "Sequence labeling"
    • Except 8.4.5-8.4.6 "The Viterbi Algorithm"
  • Ch. 9 Deep Learning Architectures for Sequence Processing
    • Sec. 9.1-9.5
  • Ch. 10 Machine Translation and Encoder-Decoder Models
    • Sec. 10.0, 10.2-10.4
  • Ch. 17 Relation Extraction
    • Sec. 17.0-17.1
  • Ch. 18, "Word Senses and WordNet"
    • Sec. 18.0-18.3
  • Ch. 24, "Dialogue Systems and Chatbots"
    • Sections 24.1-24.6
  • Ch. 25, "Phonetics"
    • Sections 25.1-25.5 (excluding the details not discussed in class)
  • Ch. 26, "Speech Recognition and ASR"
    • Sections 26.1 and 26.5 (excluding the part on statistical significance)

NLTK Book

  • Ch. 3, sec. 6 Normalizing Text
  • Ch. 3, sec. 8 Segmentation
  • Ch. 5, sec. 1 Using a tagger
  • Ch. 5, sec. 2 Tagged corpora


Published Dec. 6, 2022 10:21 AM - Last modified Dec. 6, 2022 10:21 AM