Building a POS Tagger
1. How is a Hidden Markov Model (HMM) trained for POS tagging?
2. How is a Conditional Random Field (CRF) trained for POS tagging?
3. How is testing (tagging new sentences) performed in HMM-based POS tagging?
4. How is testing performed in CRF-based POS tagging?
5. Which features (e.g., context, word, bigram/trigram) are most helpful for training a model for POS tagging, and why?
6. How does increasing the size of the training corpus affect the accuracy of a POS tagger?
7. What is the main difference between POS tagging and chunking in NLP?
8. Give an example of a sentence where context is crucial for correct POS tagging. Explain your answer.
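
As a starting point for questions 1 and 3, here is a minimal Python sketch of one common way to set up an HMM tagger. It is not a reference solution. Training counts tag-to-tag transitions and tag-to-word emissions in a tagged corpus, and tagging runs the Viterbi algorithm over the resulting probabilities. The toy corpus, the add-one smoothing, and the names train_hmm, log_prob, and viterbi are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict

# Toy tagged corpus (hypothetical data, for illustration only).
TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

START = "<s>"  # pseudo-tag marking the start of a sentence


def train_hmm(sentences):
    """Training: count tag transitions and word emissions."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag] -> count
    emit = defaultdict(Counter)   # emit[tag][word] -> count
    vocab = set()
    for sent in sentences:
        prev = START
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, vocab


def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (an assumed smoothing choice)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def viterbi(words, trans, emit, vocab):
    """Testing: find the most probable tag sequence for a non-empty sentence."""
    tags = sorted(emit)
    n_tags = len(tags)
    n_words = len(vocab) + 1  # +1 reserves probability mass for unknown words

    # best[i][tag] = (log score of best path ending in tag at position i,
    #                 previous tag on that path)
    best = [{}]
    for t in tags:
        score = (log_prob(trans[START], t, n_tags)
                 + log_prob(emit[t], words[0], n_words))
        best[0][t] = (score, None)

    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            e = log_prob(emit[t], words[i], n_words)
            best[i][t] = max(
                (best[i - 1][p][0] + log_prob(trans[p], t, n_tags) + e, p)
                for p in tags
            )

    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]


trans, emit, vocab = train_hmm(TRAIN)
print(viterbi(["the", "dog", "sleeps"], trans, emit, vocab))
# prints ['DET', 'NOUN', 'VERB']
```

Working in log space avoids numerical underflow on long sentences. A CRF tagger (questions 2 and 4) differs in that it learns feature weights discriminatively instead of estimating these probabilities from counts, although decoding can still use Viterbi.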