POS Tagging - Hidden Markov Model
Given a corpus with the sequence 'EOS noun verb EOS', what is the transition probability P(verb|noun)?
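A minimal sketch of how such a transition probability can be estimated from tag bigram counts (the maximum-likelihood recipe; the tiny corpus is just the sequence given in the question):

```python
from collections import Counter

# Tag sequence from the question
tags = ["EOS", "noun", "verb", "EOS"]

bigrams = Counter(zip(tags, tags[1:]))   # counts of consecutive tag pairs
unigrams = Counter(tags)                 # counts of individual tags

# Maximum-likelihood estimate: P(verb | noun) = C(noun -> verb) / C(noun)
p = bigrams[("noun", "verb")] / unigrams["noun"]
print(p)  # 1.0, since 'noun' occurs once and is always followed by 'verb'
```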
If the word 'run' appears 3 times as a verb and 2 times as a noun in a corpus, what is the emission probability P(run|verb)?
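A sketch of the emission estimate. Note that the question fixes C(verb, 'run') = 3 but not the total number of verb tokens, which the maximum-likelihood formula also needs; the total below is an assumed value for illustration:

```python
# Only the 'run'-as-verb count (3) comes from the question;
# the total verb count is an assumption made for illustration.
count_run_as_verb = 3
count_verb_total = 10  # assumed total number of verb tokens in the corpus

# Maximum-likelihood emission estimate: P(run | verb) = C(verb, run) / C(verb)
p_run_given_verb = count_run_as_verb / count_verb_total
print(p_run_given_verb)  # 0.3 under the assumed total
```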
What is the significance of the EOS (End of Sentence) marker in HMM-based POS tagging?
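One way to see its role: padding every sentence with EOS on both ends lets a single bigram transition table supply sentence-initial and sentence-final probabilities, with no separate start distribution. A small sketch with invented toy tag sequences:

```python
from collections import Counter

# Two toy tag sequences padded with EOS on both ends (hypothetical data).
corpus = [["EOS", "DT", "NN", "VBZ", "EOS"],
          ["EOS", "NN", "VBZ", "RB", "EOS"]]

bigrams = Counter(pair for sent in corpus for pair in zip(sent, sent[1:]))
eos_out = sum(c for (prev, _), c in bigrams.items() if prev == "EOS")

# P(DT | EOS) plays the role of a sentence-initial probability,
# P(EOS | RB) that of a sentence-final one.
print(bigrams[("EOS", "DT")] / eos_out)  # 0.5
```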
In the Viterbi algorithm, what does the backtracking step accomplish?
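A sketch of the backtracking pass on its own, assuming the forward pass has already filled a backpointer table (the table entries below are invented purely for illustration):

```python
# Hypothetical backpointer table: backpointer[t][tag] = best previous tag
# for reaching `tag` at position t (values invented for illustration).
backpointer = [
    {},                                  # position 0 has no predecessor
    {"NN": "DT", "VB": "DT"},
    {"VBZ": "NN", "NN": "NN"},
]
best_last_tag = "VBZ"  # assume this tag had the highest final Viterbi score

# Walk backwards from the best final tag to recover the full tag sequence.
path = [best_last_tag]
for t in range(len(backpointer) - 1, 0, -1):
    path.append(backpointer[t][path[-1]])
path.reverse()
print(path)  # ['DT', 'NN', 'VBZ']
```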
Why might the word 'can' be challenging for an HMM-based POS tagger?
What happens during the initialization step of the Viterbi algorithm?
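A minimal sketch of that first step, assuming transition probabilities out of an EOS/start state and emission probabilities for the first word (all numbers below are invented):

```python
# Invented model parameters for illustration only.
tags = ["DT", "NN", "VB"]
trans_from_start = {"DT": 0.6, "NN": 0.3, "VB": 0.1}        # P(tag | EOS)
emit = {"DT": {"the": 0.7}, "NN": {"the": 0.0}, "VB": {"the": 0.0}}

first_word = "the"

# Initialization: viterbi[0][tag] = P(tag | start) * P(first_word | tag)
viterbi0 = {tag: trans_from_start[tag] * emit[tag].get(first_word, 0.0)
            for tag in tags}
print(viterbi0)  # {'DT': 0.42, 'NN': 0.0, 'VB': 0.0}
```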
In a more complex scenario, if we have the sequence 'The/DT dog/NN runs/VBZ fast/RB', what would be the transition probability P(VBZ|NN)?
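Treating this single tagged sentence as the entire training corpus, the maximum-likelihood estimate would follow the same bigram-count recipe sketched earlier: P(VBZ|NN) = C(NN, VBZ) / C(NN) = 1/1 = 1.0, since NN occurs once and is followed by VBZ.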
What is the computational complexity of the Viterbi algorithm for a sentence of length T with N possible POS tags?
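As a worked hint: each of the T positions fills N trellis cells, and each cell takes a maximum over the N possible predecessor tags, so the standard implementation runs in O(N^2 * T) time, with O(N * T) space for the trellis and backpointers.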
How does the HMM handle data sparsity problems in POS tagging?
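One common remedy is smoothing the count-based estimates so unseen (tag, word) pairs do not get zero probability. A minimal sketch of add-one (Laplace) smoothing for emissions, with a hypothetical count table and an assumed vocabulary size:

```python
# Hypothetical counts for illustration.
emission_counts = {("verb", "run"): 3, ("noun", "run"): 2}
tag_counts = {"verb": 10, "noun": 8}   # assumed total tokens per tag
vocab_size = 1000                      # assumed vocabulary size

def smoothed_emission(tag, word):
    # Add-one (Laplace) smoothing: unseen (tag, word) pairs receive a small
    # non-zero probability instead of zero.
    return (emission_counts.get((tag, word), 0) + 1) / (tag_counts[tag] + vocab_size)

print(smoothed_emission("verb", "run"))        # seen pair
print(smoothed_emission("verb", "xylophone"))  # unseen pair, still > 0
```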