Movie Review Sentiment Analysis using Naïve Bayes

Procedure

Step 1: Upload the Dataset

Start by uploading the movie reviews dataset with sentiment labels. This dataset will be used for training and testing the Naive Bayes classifier.

Click the Next button to proceed to the Data Preprocessing step.

Step 2: Data Preprocessing

Prepare the text data by applying preprocessing steps such as removing stopwords, converting text to lowercase, removing special characters, and tokenizing the text.
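
The preprocessing steps above can be sketched in a few lines of Python. This is only an illustration, not the simulation's actual code: the function name `preprocess` and the tiny stopword list are assumptions, and lowercasing is applied first here so that stopwords match regardless of capitalization.

```python
import re

# Small illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "of", "and", "to", "in"}

def preprocess(review):
    """Lowercase, strip special characters, tokenize, and drop stopwords."""
    text = review.lower()                    # convert to lowercase
    text = re.sub(r"[^a-z\s]", " ", text)    # replace non-letter characters with spaces
    tokens = text.split()                    # tokenize on whitespace
    return [t for t in tokens if t not in STOPWORDS]

print(preprocess("This movie was GREAT!!! A must-see of the year."))
# prints ['movie', 'great', 'must', 'see', 'year']
```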

Click the Next button to step through each data-cleaning operation in turn.

Click the Proceed button to move forward to the next step: Feature Engineering.

Step 3: Feature Engineering

In this step, the preprocessed text is converted into numerical features for model training. Key techniques include:

  • Tokenization: Splitting text into individual words or tokens.
  • Word Frequencies: Counting occurrences of each word in the dataset.
  • Bag of Words (BoW): Representing text as a vector of word counts, ignoring grammar and word order.
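
As a rough sketch of how word counts become a Bag of Words vector, the example below builds a vocabulary from two tiny tokenized reviews and counts each vocabulary word per review. The helper names `build_vocabulary` and `bag_of_words` and the sample reviews are made up for illustration.

```python
from collections import Counter

def build_vocabulary(docs):
    """Collect every distinct token across all documents, in sorted order."""
    return sorted({tok for doc in docs for tok in doc})

def bag_of_words(doc, vocab):
    """Return the count of each vocabulary word in one document."""
    counts = Counter(doc)
    return [counts[w] for w in vocab]

# Two toy reviews, already tokenized (illustrative data only).
docs = [["great", "movie", "great", "acting"], ["boring", "movie"]]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(d, vocab) for d in docs]

print(vocab)    # prints ['acting', 'boring', 'great', 'movie']
print(vectors)  # prints [[1, 0, 2, 1], [0, 1, 0, 1]]
```

Each vector has one position per vocabulary word, so word order within the review is lost, as the BoW description states.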

Click the Proceed button to move forward to the next step.

Step 4: Data Splitting

Split the dataset into training and testing sets to evaluate model performance. A good split leaves enough reviews for the model to learn from while holding back enough unseen reviews to give a reliable estimate of how well it generalizes to new data.

  • Training Set: Used to train the Naive Bayes classifier.
  • Testing Set: Used to assess model accuracy on unseen data.

Adjust the split ratio and read the accompanying description to see whether the chosen ratio counts as a good split or a bad split.
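
One possible way to perform such a split is sketched below. The function name `train_test_split`, the default ratio of 0.2, and the fixed seed are assumptions chosen for the example; the simulation may split the data differently.

```python
import random

def train_test_split(samples, test_ratio=0.2, seed=42):
    """Shuffle a copy of the samples and split it into (train, test) lists."""
    if not 0.0 < test_ratio < 1.0:
        raise ValueError("test_ratio must be between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is reproducible
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

# Ten toy (review, label) pairs, illustrative data only.
reviews = [("review %d" % i, "pos" if i % 2 == 0 else "neg") for i in range(10)]
train_set, test_set = train_test_split(reviews, test_ratio=0.2)
print(len(train_set), len(test_set))  # prints 8 2
```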

Click the Proceed button to move forward to the next step.

Step 5: Training the Model

Train the Naive Bayes classifier using the prepared training dataset. The model learns patterns from the features extracted in the previous steps.

  • Probability Calculation: The classifier computes the probability of a review being positive or negative based on word occurrences.
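  • Bayes' Theorem: It applies Bayes' Theorem to combine the prior probability of each sentiment class with the likelihood of the words in a review, giving the posterior probability of each sentiment.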
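
To make the training step concrete, the sketch below estimates class priors and per-word likelihoods from a few toy tokenized reviews. It assumes a multinomial Naive Bayes model with Laplace (add-one) smoothing, which the procedure does not specify; the function name `train_naive_bayes`, the `alpha` parameter, and the sample data are illustrative.

```python
from collections import Counter, defaultdict

def train_naive_bayes(docs, labels, alpha=1.0):
    """Estimate P(class) and P(word | class) from tokenized, labeled reviews.

    alpha is the Laplace smoothing constant (an assumption of this sketch).
    """
    vocab = {tok for doc in docs for tok in doc}
    class_docs = Counter(labels)             # number of reviews per class
    word_counts = defaultdict(Counter)       # word counts per class
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)

    priors = {c: n / len(labels) for c, n in class_docs.items()}
    likelihoods = {}
    for c in class_docs:
        total = sum(word_counts[c].values())
        denom = total + alpha * len(vocab)   # smoothed denominator
        likelihoods[c] = {w: (word_counts[c][w] + alpha) / denom for w in vocab}
    return priors, likelihoods

# Toy training data (illustrative only).
docs = [["great", "movie"], ["great", "acting"], ["boring", "movie"]]
labels = ["pos", "pos", "neg"]
priors, likelihoods = train_naive_bayes(docs, labels)

print(likelihoods["pos"]["great"])  # prints 0.375
```

Smoothing keeps a word that never appeared in one class (such as "boring" in the positive reviews here) from receiving a likelihood of zero.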

Click the Start Training button to train the model.

Observe the class probability and word likelihood calculations in real time as the Naive Bayes classifier learns from the data.

Enter a movie review and click the Detect button.

Observe the posterior probability calculations in real time, along with the prediction graph and final sentiment classification result.
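
The posterior calculation can be sketched as follows. The priors and likelihoods below are made-up numbers chosen for readability, not values from the simulation, and the function name `posterior` is an assumption. Log probabilities are summed to avoid numerical underflow on long reviews, and words outside the vocabulary are skipped.

```python
import math

def posterior(tokens, priors, likelihoods):
    """Return the normalized posterior P(class | tokens) for each class."""
    log_scores = {}
    for c in priors:
        score = math.log(priors[c])
        for t in tokens:
            if t in likelihoods[c]:          # ignore words not seen in training
                score += math.log(likelihoods[c][t])
        log_scores[c] = score
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores.values())
    unnormalized = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {c: v / total for c, v in unnormalized.items()}

# Made-up model parameters, for illustration only.
priors = {"pos": 0.5, "neg": 0.5}
likelihoods = {
    "pos": {"great": 0.4, "boring": 0.1, "movie": 0.5},
    "neg": {"great": 0.1, "boring": 0.4, "movie": 0.5},
}

probs = posterior(["great", "movie"], priors, likelihoods)
prediction = max(probs, key=probs.get)
print(prediction)  # prints pos
```

With these numbers the unnormalized scores are 0.5 × 0.4 × 0.5 = 0.1 for positive and 0.5 × 0.1 × 0.5 = 0.025 for negative, so the posterior for positive is 0.1 / 0.125 = 0.8.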

Click the Evaluation button to proceed to the Model Evaluation step.

Step 6: Model Evaluation

Evaluate the performance of the trained Naive Bayes classifier using key metrics.

  • Accuracy: Measures the percentage of correctly classified reviews.
  • Precision: Indicates how many of the predicted positive reviews are actually positive.
  • Recall: Measures the fraction of actual positive reviews that the model correctly identifies.
  • F1-Score: The harmonic mean of precision and recall, giving a single score that balances the two.
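
The four metrics above can be computed from true and predicted labels as in the sketch below. The function name `evaluate`, the choice of "pos" as the positive class, and the sample labels are assumptions made for the example.

```python
def evaluate(y_true, y_pred, positive="pos"):
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy labels (illustrative only): 2 true positives, 1 false positive,
# 1 false negative, 1 true negative.
y_true = ["pos", "pos", "pos", "neg", "neg"]
y_pred = ["pos", "pos", "neg", "pos", "neg"]
metrics = evaluate(y_true, y_pred)
print(metrics["accuracy"])  # prints 0.6
```

In this toy case precision and recall are both 2/3, so the F1-Score is also 2/3.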