N-Grams

After completing this experiment, students will be able to:

  1. Build N-Gram Models: Construct bigram and trigram models from a given text corpus, understanding how N-Gram counts are collected and converted into probabilities.

  2. Calculate Sentence Probabilities: Compute the probability of a sentence using N-Gram models, applying the Markov assumption to simplify probability calculations.

  3. Analyze Model Limitations: Recognize the limitations of simple N-Gram models, such as data sparsity and context length, and discuss possible solutions (e.g., smoothing).

  4. Apply N-Gram Models in NLP: Use N-Gram models for practical tasks in natural language processing, such as language modeling and text prediction.
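Objectives 1 and 2 can be sketched in a few lines of Python. The snippet below is a minimal illustration, not part of the experiment's prescribed code: it builds a bigram model from a toy corpus (using assumed `<s>`/`</s>` boundary markers) and computes a sentence probability under the Markov assumption, where P(sentence) reduces to a product of P(wᵢ | wᵢ₋₁) terms. A trigram model follows the same pattern with the preceding two words as context.

```python
from collections import defaultdict

def build_bigram_model(sentences):
    """Count bigrams and their left-context unigrams, then convert
    counts to conditional probabilities P(w_i | w_{i-1})."""
    unigram_counts = defaultdict(int)
    bigram_counts = defaultdict(int)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for w1, w2 in zip(tokens, tokens[1:]):
            unigram_counts[w1] += 1
            bigram_counts[(w1, w2)] += 1
    return {bg: c / unigram_counts[bg[0]] for bg, c in bigram_counts.items()}

def sentence_probability(model, sentence):
    """Markov assumption: P(sentence) = product of bigram probabilities."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    prob = 1.0
    for w1, w2 in zip(tokens, tokens[1:]):
        prob *= model.get((w1, w2), 0.0)  # unseen bigram -> probability 0
    return prob

# Toy corpus (assumed for illustration)
corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = build_bigram_model(corpus)
# P(<s> the cat sat </s>) = 1 * (2/3) * (1/2) * 1 = 1/3
print(sentence_probability(model, "the cat sat"))
```

Note how any sentence containing an unseen bigram scores exactly zero; this is the data-sparsity problem that objective 3 asks students to analyze.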

Learning Focus

  • Construct and interpret bigram and trigram models
  • Calculate and compare sentence probabilities
  • Understand the Markov assumption in language modeling
  • Discuss applications and limitations of N-Gram models