POS Tagging - Hidden Markov Model

Textbooks

  1. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (3rd Edition draft)
    By: Daniel Jurafsky and James H. Martin
    Chapter 8: Part-of-Speech Tagging (chapter numbers and titles vary between draft releases)
    Available online: https://web.stanford.edu/~jurafsky/slp3/

  2. Foundations of Statistical Natural Language Processing
    By: Christopher D. Manning and Hinrich Schütze
    MIT Press, 1999
    Chapter 10: Part-of-Speech Tagging

  3. Natural Language Processing with Python
    By: Steven Bird, Ewan Klein, and Edward Loper
    O'Reilly Media, 2009
    Chapter 5: Categorizing and Tagging Words

  4. Introduction to Information Retrieval
    By: Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze
    Cambridge University Press, 2008

Research Papers

  1. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition
    By: Lawrence R. Rabiner
    Proceedings of the IEEE, Vol. 77, No. 2, February 1989, pp. 257-286

  2. A Maximum Entropy Approach to Natural Language Processing
    By: Adam L. Berger, Stephen A. Della Pietra, and Vincent J. Della Pietra
    Computational Linguistics, Vol. 22, No. 1, 1996, pp. 39-71

  3. Part-of-Speech Tagging with Neural Networks
    By: Helmut Schmid
    International Conference on Computational Linguistics (COLING), 1994

Online Resources

  1. Stanford NLP Course Materials
    CS224N: Natural Language Processing with Deep Learning
    https://web.stanford.edu/class/cs224n/

  2. MIT OpenCourseWare - Introduction to Algorithms
    Dynamic Programming and Viterbi Algorithm
    https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/

  3. Natural Language Toolkit (NLTK) Documentation
    POS Tagging Tutorial
    https://www.nltk.org/book/ch05.html

Video Lectures

  1. Hidden Markov Models - Stanford CS229 Machine Learning
    By: Andrew Ng
    https://www.youtube.com/watch?v=TPRoLreU9lA

  2. Part-of-Speech Tagging - NLP Course by Dan Jurafsky
    Stanford University
    https://www.youtube.com/watch?v=hX-psXx3rbA

  3. Viterbi Algorithm Explained
    By: Zach Star
    https://www.youtube.com/watch?v=6JVqutwtzmo

Software and Tools

  1. NLTK (Natural Language Toolkit)
    Python library for NLP; includes an HMM tagger in the nltk.tag.hmm module
    https://www.nltk.org/

  2. spaCy
    Industrial-strength NLP library
    https://spacy.io/

  3. Stanford CoreNLP
    Java-based NLP toolkit
    https://stanfordnlp.github.io/CoreNLP/

Datasets

  1. Penn Treebank
    Large corpus of English text with POS and syntactic annotations (Treebank-3 release, distributed by the LDC)
    https://catalog.ldc.upenn.edu/LDC99T42

  2. Universal Dependencies
    Multilingual treebanks with consistent annotation
    https://universaldependencies.org/

  3. Brown Corpus
    First million-word electronic corpus of English
    Available through NLTK as nltk.corpus.brown

Additional Reading

  1. Statistical Methods for Speech Recognition
    By: Frederick Jelinek
    MIT Press, 1997

  2. Probabilistic Models for Natural Language Processing
    By: Ciprian Chelba
    Various IEEE and ACL publications

  3. Machine Learning for Natural Language Processing
    By: Tom Mitchell
    Carnegie Mellon University Course Materials

Interactive Resources

  1. Towards Data Science - HMM and POS Tagging
    Medium articles with practical examples
    https://towardsdatascience.com/

  2. Coursera - Natural Language Processing Specialization
    By: DeepLearning.AI
    https://www.coursera.org/specializations/natural-language-processing

  3. edX - MIT Introduction to Computational Thinking and Data Science
    Including probabilistic modeling sections
    https://www.edx.org/course/introduction-computational-thinking-data-science