Virtual Labs

Movie Review Sentiment Analysis using Naïve Bayes

Procedure

Step 1: Upload the Dataset

Start by uploading the movie reviews dataset with sentiment labels. This dataset will be used for training and testing the Naive Bayes classifier.

Click the Next button to proceed to the Data Preprocessing step.

Step 2: Data Preprocessing

Prepare the text data by applying preprocessing steps like removing stopwords, converting text to lowercase, removing special characters, and tokenization.
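The cleaning operations listed above can be sketched in a few lines of Python. This is a minimal illustration, not the lab's actual implementation: the `preprocess` helper, the tiny `STOPWORDS` set, and the order of operations (lowercasing first, stopword removal last) are assumptions made for the example.

```python
import re

# Small illustrative stopword list (an assumption; the lab's list is not specified).
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "of", "and", "to", "in"}

def preprocess(review):
    """Lowercase, strip special characters, tokenize, and drop stopwords."""
    text = review.lower()
    # Replace every character that is not a lowercase letter or whitespace.
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = text.split()  # simple whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]

cleaned = preprocess("This movie was GREAT!!! The plot is a masterpiece.")
print(cleaned)  # ['movie', 'great', 'plot', 'masterpiece']
```

Lowercasing before stopword removal lets a lowercase stopword list match capitalized words such as "This" and "The".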

Click the Next button to iterate through each data cleaning step.

Click the Proceed button to move forward to the next step: Feature Engineering.

Step 3: Feature Engineering

In this step, the preprocessed text is converted into numerical features for model training. Key techniques include:

Tokenization: Splitting text into individual words or tokens.
Word Frequencies: Counting occurrences of each word in the dataset.
Bag of Words (BoW): Representing text as a vector of word counts, ignoring grammar and word order.
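As a rough sketch of how word frequencies become a Bag of Words vector, the example below builds a vocabulary from already-tokenized reviews and counts each word per review. The helper names `build_vocabulary` and `bag_of_words` and the two sample reviews are hypothetical and chosen only for illustration.

```python
from collections import Counter

def build_vocabulary(tokenized_reviews):
    """Return a sorted list of every distinct token in the corpus."""
    return sorted({tok for review in tokenized_reviews for tok in review})

def bag_of_words(tokens, vocabulary):
    """Map a token list to a count vector aligned with the vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

reviews = [
    ["great", "movie", "great", "acting"],
    ["boring", "movie"],
]
vocab = build_vocabulary(reviews)
vectors = [bag_of_words(r, vocab) for r in reviews]
print(vocab)    # ['acting', 'boring', 'great', 'movie']
print(vectors)  # [[1, 0, 2, 1], [0, 1, 0, 1]]
```

Each vector position corresponds to one vocabulary word, so word order within a review is lost, which is exactly the simplification BoW makes.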

Click the Proceed button to move forward to the next step.

Step 4: Data Splitting

Split the dataset into training and testing sets to evaluate model performance. A good split leaves enough reviews for the model to learn from while holding back enough unseen reviews to give a reliable estimate of how well it generalizes to new data.

Training Set: Used to train the Naive Bayes classifier.
Testing Set: Used to assess model accuracy on unseen data.

Adjust the split ratio and read the accompanying description to see whether the chosen ratio is considered a good split or a bad split.
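A shuffled train/test split might look like the sketch below. The `split_dataset` helper, the 80/20 default ratio, and the fixed seed are assumptions for illustration; the lab lets you choose the ratio interactively.

```python
import random

def split_dataset(samples, test_ratio=0.2, seed=42):
    """Shuffle a copy of the samples and return (train, test)."""
    if not 0.0 < test_ratio < 1.0:
        raise ValueError("test_ratio must be strictly between 0 and 1")
    shuffled = list(samples)               # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible splits
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

# Ten placeholder reviews with alternating labels.
data = [(f"review {i}", "positive" if i % 2 == 0 else "negative") for i in range(10)]
train, test = split_dataset(data, test_ratio=0.2)
print(len(train), len(test))  # 8 2
```

Shuffling before splitting matters when the dataset is sorted by label; otherwise the test set could contain only one class.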

Click the Proceed button to move forward to the next step.

Step 5: Training the Model

Train the Naive Bayes classifier using the prepared training dataset. The model learns patterns from the features extracted in the previous steps.

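Probability Calculation: From word occurrences in the training set, the classifier estimates how likely each word is to appear in positive reviews and in negative reviews.
Bayes' Theorem: The classifier combines these word likelihoods with the prior probability of each class to obtain the posterior probability that a review is positive or negative.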
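The quantities a Naive Bayes classifier learns during training, namely class priors and per-class word likelihoods, can be sketched as follows. This is an illustrative multinomial variant with add-one (Laplace) smoothing; the smoothing parameter `alpha`, the `train_naive_bayes` helper, and the three-review training set are assumptions, since the lab does not state which variant it uses.

```python
from collections import Counter, defaultdict

def train_naive_bayes(labeled_reviews, alpha=1.0):
    """Estimate class priors and smoothed word likelihoods from (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in labeled_reviews:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)

    # Prior: fraction of training reviews belonging to each class.
    total_reviews = sum(class_counts.values())
    priors = {c: n / total_reviews for c, n in class_counts.items()}

    # Likelihood: smoothed relative frequency of each word within a class.
    likelihoods = {}
    for c in class_counts:
        total_words = sum(word_counts[c].values())
        denom = total_words + alpha * len(vocabulary)
        likelihoods[c] = {w: (word_counts[c][w] + alpha) / denom for w in vocabulary}
    return priors, likelihoods

training_data = [
    (["great", "movie"], "positive"),
    (["great", "acting"], "positive"),
    (["boring", "movie"], "negative"),
]
priors, likelihoods = train_naive_bayes(training_data)
print(priors["positive"])                # 2 of 3 reviews are positive
print(likelihoods["positive"]["great"])  # (2 + 1) / (4 + 4) = 0.375
```

Smoothing keeps a word that never appeared in one class from forcing that class's probability to zero.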

Click the Start Training button to train the model.

Observe the probability and likelihood calculations happening in real time as the Naive Bayes classifier learns from the data.

Enter a movie review and click the Detect button.

Observe the posterior probability calculations in real time, along with the prediction graph and final sentiment classification result.
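The posterior calculation for a single review can be illustrated with the sketch below. The `PRIORS` and `LIKELIHOODS` values are made-up numbers, not output from the lab, and skipping words that were never seen in training is one possible design choice among several.

```python
import math

# Hypothetical model parameters, chosen only for illustration.
PRIORS = {"positive": 0.5, "negative": 0.5}
LIKELIHOODS = {
    "positive": {"great": 0.4, "boring": 0.1, "movie": 0.5},
    "negative": {"great": 0.1, "boring": 0.4, "movie": 0.5},
}

def posterior(tokens, priors, likelihoods):
    """Return P(class | tokens) per class under the naive independence assumption."""
    log_scores = {}
    for c in priors:
        score = math.log(priors[c])
        for tok in tokens:
            if tok in likelihoods[c]:  # skip words never seen in training
                score += math.log(likelihoods[c][tok])
        log_scores[c] = score
    # Subtract the maximum before exponentiating to avoid numeric underflow,
    # then normalize so the posteriors sum to 1.
    max_log = max(log_scores.values())
    exp_scores = {c: math.exp(s - max_log) for c, s in log_scores.items()}
    total = sum(exp_scores.values())
    return {c: v / total for c, v in exp_scores.items()}

probs = posterior(["great", "movie"], PRIORS, LIKELIHOODS)
prediction = max(probs, key=probs.get)
print(probs, prediction)  # positive is about 0.8, so the prediction is "positive"
```

Working in log space avoids multiplying many small probabilities together, which would underflow for long reviews.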

Click the Evaluation button to proceed to the Model Evaluation step.

Step 6: Model Evaluation

Evaluate the performance of the trained Naive Bayes classifier using key metrics.

Accuracy: Measures the percentage of correctly classified reviews.
Precision: Indicates how many of the predicted positive reviews are actually positive.
Recall: Measures how well the model identifies all actual positive reviews.
F1-Score: The harmonic mean of precision and recall, summarizing both in a single number.
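The four metrics can be computed directly from true and predicted labels, as in the sketch below. The `classification_metrics` helper and the five sample labels are hypothetical; "positive" is treated as the class of interest, and any metric with a zero denominator is reported as 0.0 by assumption.

```python
def classification_metrics(y_true, y_pred, positive_label="positive"):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive_label and p == positive_label)
    fp = sum(1 for t, p in pairs if t != positive_label and p == positive_label)
    fn = sum(1 for t, p in pairs if t == positive_label and p != positive_label)
    tn = sum(1 for t, p in pairs if t != positive_label and p != positive_label)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

y_true = ["positive", "positive", "negative", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "positive", "positive"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.6; precision, recall, and F1 are each 2/3
```

In this sample there are 2 true positives, 1 false positive, 1 false negative, and 1 true negative, which gives the values noted in the final comment.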