Building a Chunker

Step-by-Step Procedure for Building a Chunker

Step 1: Select Language

  • Choose the target language (English or Hindi) for chunking analysis.

Step 2: Choose Training Corpus Size

  • Select the size of the training data. Larger corpora may improve accuracy but take longer to train.

Step 3: Select Algorithm

  • Pick the machine learning model for chunking: Hidden Markov Model (HMM) or Conditional Random Field (CRF).
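
To make the HMM option more concrete, here is a minimal sketch of how an HMM chunker could be trained and decoded. It is not the lab tool's actual implementation. It assumes POS tags as observations, BIO-style chunk labels (B-NP, I-NP, B-VP), add-one smoothing, and a two-sentence toy corpus invented for illustration; the names train_hmm, viterbi, and TOY_CORPUS are hypothetical.

```python
from collections import Counter, defaultdict
import math

# Toy training data (invented for illustration): (POS tag, chunk tag) pairs.
TOY_CORPUS = [
    [("DT", "B-NP"), ("NN", "I-NP"), ("VBD", "B-VP"), ("DT", "B-NP"), ("NN", "I-NP")],
    [("PRP", "B-NP"), ("VBZ", "B-VP"), ("JJ", "B-NP"), ("NNS", "I-NP")],
]

def train_hmm(tagged_sentences):
    """Count chunk-tag transitions and (chunk tag -> observation) emissions."""
    trans = defaultdict(Counter)  # trans[previous_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][observation]
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start marker
        for obs, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][obs] += 1
            prev = tag
    return trans, emit

def log_prob(counter, key, n_outcomes):
    # Add-one smoothing keeps unseen events at a small non-zero probability.
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(observations, trans, emit):
    """Return the most probable chunk-tag sequence for the observations."""
    if not observations:
        return []
    tags = sorted(emit)
    n_tags = len(tags)
    n_obs = len({o for counts in emit.values() for o in counts}) + 1  # +1 for unseen
    # Each column maps tag -> (best log score ending in tag, backpointer).
    columns = [{
        t: (log_prob(trans["<s>"], t, n_tags) + log_prob(emit[t], observations[0], n_obs), None)
        for t in tags
    }]
    for obs in observations[1:]:
        column = {}
        for t in tags:
            score, prev = max(
                (columns[-1][p][0] + log_prob(trans[p], t, n_tags), p) for p in tags
            )
            column[t] = (score + log_prob(emit[t], obs, n_obs), prev)
        columns.append(column)
    # Follow backpointers from the best final tag.
    tag = max(tags, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]

trans, emit = train_hmm(TOY_CORPUS)
print(viterbi(["DT", "NN", "VBD"], trans, emit))
```

A CRF differs mainly in that it scores the whole label sequence using arbitrary, overlapping features of the input rather than separate transition and emission probabilities, which is one reason the two models can behave differently on the same data.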

Step 4: Choose Feature Set

  • Decide which features to use for training:
    • Lexicon only
    • POS tags only
    • Lexicon + POS tags (recommended for best results)
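
The three feature options can be sketched as a single feature-extraction function. This is an illustrative assumption about how such features might be built, not the tool's own code: the function name token_features, the one-token context window, and the feature key names are all hypothetical.

```python
def token_features(words, pos_tags, i, feature_set="lexicon+pos"):
    """Build a feature dict for token i under the chosen feature set.

    feature_set is one of "lexicon", "pos", or "lexicon+pos".
    """
    if feature_set not in ("lexicon", "pos", "lexicon+pos"):
        raise ValueError(f"unknown feature set: {feature_set!r}")
    use_lexicon = feature_set in ("lexicon", "lexicon+pos")
    use_pos = feature_set in ("pos", "lexicon+pos")

    features = {"bias": 1.0}
    # Look at the previous, current, and next token (assumed window of 1).
    for offset in (-1, 0, 1):
        j = i + offset
        if 0 <= j < len(words):
            if use_lexicon:
                features[f"word[{offset}]"] = words[j].lower()
            if use_pos:
                features[f"pos[{offset}]"] = pos_tags[j]
        else:
            # Mark sentence boundaries so the model can learn from them.
            features[f"pad[{offset}]"] = True
    return features

words = ["The", "dog", "barks"]
pos_tags = ["DT", "NN", "VBZ"]
print(token_features(words, pos_tags, 1, "lexicon+pos"))
```

With "lexicon+pos" the model sees both the word forms and their POS tags, so it can fall back on the tag when a word was never seen in training. That is a plausible reason the combined set is the recommended one.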

Step 5: Train and Evaluate

  • Click "Check Accuracy" to train the chunker and view its accuracy for your chosen configuration.
  • Review example sentences with predicted chunk boundaries.
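
As a rough illustration of what this step reports, the sketch below computes token-level accuracy and turns a tag sequence into readable chunks. It assumes BIO-style chunk tags and per-token accuracy; the tool may use a different tag scheme or metric. The function names token_accuracy and bio_to_chunks are hypothetical.

```python
def token_accuracy(gold, predicted):
    """Fraction of tokens whose predicted chunk tag matches the gold tag."""
    pairs = [(g, p) for gs, ps in zip(gold, predicted) for g, p in zip(gs, ps)]
    if not pairs:
        return 0.0
    return sum(g == p for g, p in pairs) / len(pairs)

def bio_to_chunks(words, tags):
    """Group words into (label, text) chunks from BIO tags; "O" tokens are skipped."""
    chunks = []
    current, label = [], None
    for word, tag in zip(words, tags):
        # An I- tag extends the open chunk only if the labels agree.
        if tag.startswith("I-") and tag[2:] == label:
            current.append(word)
            continue
        if current:
            chunks.append((label, " ".join(current)))
        if tag == "O":
            current, label = [], None
        else:
            current, label = [word], tag[2:]
    if current:
        chunks.append((label, " ".join(current)))
    return chunks

words = ["The", "dog", "chased", "a", "cat", "."]
tags = ["B-NP", "I-NP", "B-VP", "B-NP", "I-NP", "O"]
print(bio_to_chunks(words, tags))
# [('NP', 'The dog'), ('VP', 'chased'), ('NP', 'a cat')]
```

Token-level accuracy can look high even when whole chunks are wrong, because a single wrong boundary tag breaks a chunk while changing only one token's score.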

Step 6: Experiment and Compare

  • Try different combinations of features, corpus sizes, and algorithms.
  • Use "Reset / Try Another Configuration" to start over and explore more settings.
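
To compare settings systematically, it can help to enumerate every configuration up front and record the accuracy reported for each. The sketch below assumes the languages, algorithms, and feature sets listed in the steps above. The corpus-size labels are hypothetical placeholders, since the actual sizes are whatever the tool offers, and the helper names are illustrative.

```python
from itertools import product

LANGUAGES = ("English", "Hindi")
ALGORITHMS = ("HMM", "CRF")
FEATURE_SETS = ("lexicon", "pos", "lexicon+pos")

def all_configurations(corpus_sizes):
    """List every (language, corpus size, algorithm, feature set) combination."""
    return list(product(LANGUAGES, corpus_sizes, ALGORITHMS, FEATURE_SETS))

def best_configuration(results):
    """Return the configuration with the highest recorded accuracy.

    results maps a configuration tuple to the accuracy read from the tool.
    """
    if not results:
        return None
    return max(results, key=results.get)

# Placeholder size labels (hypothetical); substitute the sizes the tool lists.
configs = all_configurations(corpus_sizes=("small", "medium", "large"))
print(len(configs), "configurations to try")

# Fill this in by hand as each run finishes, e.g. results[configs[0]] = <accuracy>.
results = {}
print(best_configuration(results))
```

Changing one setting at a time while holding the others fixed makes it easier to attribute a change in accuracy to that setting.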

Tips:

  • Larger corpora and richer feature sets (lexicon + POS tags) generally give higher accuracy, at the cost of longer training time.
  • Compare HMM and CRF results to understand model differences.
  • Analyze error patterns in the output to improve your chunker.
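
One simple way to look for error patterns is to count which gold tags are most often confused with which predicted tags. The sketch below assumes you have gold and predicted tag sequences available as Python lists, which the tool may not expose directly. The function name confusion_counts and the example tags are illustrative.

```python
from collections import Counter

def confusion_counts(gold, predicted):
    """Count (gold tag, predicted tag) pairs for every mislabeled token."""
    return Counter(
        (g, p)
        for gold_tags, pred_tags in zip(gold, predicted)
        for g, p in zip(gold_tags, pred_tags)
        if g != p
    )

# Illustrative tag sequences, not output from the tool.
gold = [["B-NP", "I-NP", "B-VP"], ["B-NP", "I-NP", "I-NP"]]
predicted = [["B-NP", "B-NP", "B-VP"], ["B-NP", "B-NP", "I-NP"]]

for (g, p), count in confusion_counts(gold, predicted).most_common():
    print(f"gold={g} predicted={p}: {count}")
```

A frequent I-NP to B-NP confusion, for example, would suggest the chunker is splitting noun phrases too early, which points toward adding context features or more training data.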

Output:

  • The chunker's accuracy for the selected configuration is displayed.
  • Example sentences are shown with their predicted chunks, so you can inspect where the chunk boundaries were placed.
  • Click "Reset / Try Another Configuration" to clear the results and test different settings.