Building a POS Tagger

1. How is a Hidden Markov Model (HMM) trained for POS tagging?

2. How is a Conditional Random Field (CRF) trained for POS tagging?

3. How is testing (tagging new sentences) performed in HMM-based POS tagging?

4. How is testing performed in CRF-based POS tagging?

5. Which features (e.g., context, word, bigram/trigram) are most helpful for training a model for POS tagging, and why?

6. How does increasing the size of the training corpus affect the accuracy of a POS tagger?

7. What is the main difference between POS tagging and chunking in NLP?

8. Give an example of a sentence where context is crucial for correct POS tagging. Explain your answer.
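
As a concrete starting point for questions 1 and 3, the following is a minimal sketch of one common way to build an HMM tagger: training by counting tag transitions and word emissions, then tagging with Viterbi decoding. The toy corpus, the tag names, the function names (train_hmm, log_prob, viterbi), and the add-one smoothing are all assumptions made for illustration, not a prescribed solution.

```python
from collections import Counter, defaultdict
import math

# Toy tagged corpus (hypothetical data, for illustration only).
TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]

def train_hmm(sentences):
    """Training: count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    vocab = set()
    for sent in sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, vocab

def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (smoothing choice is an assumption)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(words, trans, emit, vocab):
    """Testing: find the most probable tag sequence for a non-empty sentence."""
    tags = list(emit)
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot for unseen words
    # Each column maps tag -> (best log score, backpointer to previous tag).
    best = [{
        t: (log_prob(trans["<s>"], t, n_tags)
            + log_prob(emit[t], words[0], n_words), None)
        for t in tags
    }]
    for w in words[1:]:
        col = {}
        for t in tags:
            score, prev = max(
                (best[-1][p][0] + log_prob(trans[p], t, n_tags), p)
                for p in tags
            )
            col[t] = (score + log_prob(emit[t], w, n_words), prev)
        best.append(col)
    # Follow backpointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for col in reversed(best[1:]):
        tag = col[tag][1]
        path.append(tag)
    return path[::-1]

if __name__ == "__main__":
    model = train_hmm(TRAIN)
    print(viterbi(["the", "dog", "sleeps"], *model))
```

A CRF differs mainly in that it scores the whole tag sequence with weighted feature functions learned by optimization rather than with counted probabilities, so the counting step above would not carry over directly.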