
Machine Learning - Opinion and Text Mining by Naive Bayes Classifier

26 Apr 2018 · CPOL · 7 min read
Best practice for opinion and text mining based on the Naïve Bayesian classifier.
This article explains the computations in detail, step by step.

Requirement: Machine Learning

Sentiment Analysis

People tend to want to know what others think about them and their business, whatever it is: a product such as a car, a restaurant, or a service. If you know what your customers think about you, you can keep, improve, or even change your strategy to enhance customer satisfaction. One way to do this is to gather the emails they have sent you and use a method that categorizes their opinions according to the words they have used. Sentiment analysis also has other applications in industry and scientific research:

  1. It reboots marketing.
  2. Categorizing documents in an office.
  3. Making recommendation lists for customers or your firm.

In the figure below, the training set contains sample sentences, each labeled positive or negative based on its content. The right section holds the test data, whose positivity or negativity is unknown to us and should be computed by a method such as the naive Bayes classifier.

Image 1

Text Mining in Scientific Research

A project at The University of Manchester set out to analyse 15,000 mouse studies. The studies lack the gender and age of the mice, which harms the research results and makes the output ambiguous. With the aid of text mining tools that scan the documents, it is now possible to fill this gap and improve the research with the recovered information on the gender and age of the mice. [*]

Opinion Mining

Different people react differently to the same subject, and you can measure how many people agree with a specific topic and how strongly. This is possible with natural language processing solutions, but another way is to use the naive Bayes classifier.

Image 2

Naive Bayesian

Naive Bayesian is one of the most widely applicable data mining algorithms for classifying and interpreting data. It supports statistical interpretation by assigning a probability to each occurrence.

Simple Definition: Finding the probability of each of the events (C1, C2, ..., Cn), given that some conditions (x1, x2, x3, …, xn) have occurred before the specific C.

Image 3

Image 4

Image 5

A training data set is given, categorized according to some factors. In the above example, each text is labeled positive or negative according to its content. Then, as a test data set, there are two sentences whose direction, positive or negative, is unknown. Our goal is to find their direction by studying the training set. An appropriate solution is the naïve Bayesian classifier, which is widely used in text and opinion mining.

Image 6

To reach our purpose, we should review some concepts from statistics and probability. I explain some fundamental definitions as follows:


When we study an event, we first list all the possible outcomes and call this set the Sample Space; then we count how many of those outcomes correspond to the event. If we toss one coin, the sample space has two elements, S = {head, tail}. Each toss produces exactly one outcome: under normal conditions the coin lands on either head or tail. When we throw a die, there are six possibilities in the sample space, S = {1, 2, 3, 4, 5, 6}, and again each throw produces just one outcome. The probability is computed by dividing the count of the event by the size of the sample space; for instance, the probability of the die landing on 1 is 1/6.

Other events can happen more than once. In the above table, the sample space has 8 entries. The probability of the event “positive” is 4/8 = 0.5 because positive occurs 4 times out of 8. The same holds for “negative”: it occurs 4 times out of 8, so its probability is also 0.5.
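The counting rule above can be sketched in a few lines of Python. This is only an illustration: the function name `event_probability` is made up for this example, and the label list simply restates the eight positive/negative labels of the training set used later in this article.

```python
from collections import Counter
from fractions import Fraction

# Labels of the eight training sentences used later in this article.
labels = ["positive", "positive", "negative", "positive",
          "negative", "negative", "positive", "negative"]


def event_probability(outcomes, event):
    """Return count(event) / number of outcomes as an exact fraction."""
    counts = Counter(outcomes)
    return Fraction(counts[event], len(outcomes))


p_positive = event_probability(labels, "positive")
p_negative = event_probability(labels, "negative")
# A fair die: each face appears once in the sample space.
p_die_one = event_probability([1, 2, 3, 4, 5, 6], 1)

print(float(p_positive), float(p_negative), p_die_one)  # 0.5 0.5 1/6
```

Using `Fraction` keeps the result exact (1/6 rather than 0.1666...), which makes it easier to compare with the hand calculation.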

Image 7

Conditional Probability

The probability of one event happening given that another event has happened, P(A|B), is called conditional probability and is read as “the probability of A given B”. For example, a patient first has high glucose, which is “B”, and then we measure the probability of diabetes, which is “A”; so we measure P(A = diabetes | B = glucose). In the above example, we ask: what is the probability of “good” occurring under the “positive” condition?

Image 8

In the “good” column, there are three “yes” entries (marked in blue) in rows labeled positive, and positive occurs four times in the last column; therefore P(good | positive) = 3/4 = 0.75.
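The same value can be checked with a short sketch. It assumes the table in the figure corresponds to the training sentences listed later in this article, and it treats a word as "occurring" when it appears at least once in a sentence; the helper name `conditional_probability` is hypothetical.

```python
# Training sentences from this article, as (text, label) pairs.
train = [
    ("It is good", "positive"),
    ("I feel good about it", "positive"),
    ("It is bad", "negative"),
    ("It is good", "positive"),
    ("It is bad", "negative"),
    ("I feel bad about it", "negative"),
    ("I feel it is not bad about", "positive"),
    ("I feel not good about", "negative"),
]


def conditional_probability(word, label, data):
    """P(word | label): share of sentences with `label` that contain `word`."""
    in_class = [text.lower().split() for text, lab in data if lab == label]
    with_word = sum(1 for words in in_class if word in words)
    return with_word / len(in_class)


print(conditional_probability("good", "positive", train))  # 0.75
```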

Bayesian Theory

In some other problems, we have a set of events and several classes, where specific kinds of events belong to a specific class. After studying past information, we want to classify new conditions into these classes. In the above examples, there are different sentences labeled positive or negative, and we want to classify a new sentence according to this previous categorization. Comparing P(new sentence | negative) with P(new sentence | positive) tells us whether the new sentence is positive or negative. Because a sentence consists of several words, the problem is solved with the naïve Bayesian classifier.

In the ten formulas below, I show how naïve Bayesian is derived from the conditional probability formula. First, the sentence is split into its words; then its probability is computed once for the positive state and once for the negative state. Finally, the two values are compared, and the sentence belongs to whichever category has the greater value. In the example below, “It is good” scores 0.28 for positive, which is greater than 0.047 for negative, so it is a positive sentence.

Image 9

Image 10

In the above equation, the chosen hypothesis is the one with the highest probability among all hypotheses. Because P(x) does not depend on h, it is the same for every hypothesis and does not change which one is most probable, so we eventually remove P(x).
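Since the formula images are not reproduced here, the chain of reasoning can be written out as follows. This is a sketch in standard textbook notation, not a transcription of the figures: h is a hypothesis (here, a class such as positive or negative), H is the set of all hypotheses, x = (x_1, ..., x_n) are the observed words, and the subscripts MAP and NB are labels assumed for this illustration.

```latex
\begin{aligned}
P(h \mid x) &= \frac{P(x \mid h)\,P(h)}{P(x)} \\
h_{MAP} &= \arg\max_{h \in H} P(h \mid x)
         = \arg\max_{h \in H} \frac{P(x \mid h)\,P(h)}{P(x)}
         = \arg\max_{h \in H} P(x \mid h)\,P(h) \\
P(x \mid h) &= P(x_1, x_2, \ldots, x_n \mid h)
         \approx \prod_{i=1}^{n} P(x_i \mid h) \\
h_{NB} &= \arg\max_{h \in H} P(h) \prod_{i=1}^{n} P(x_i \mid h)
\end{aligned}
```

The third line is the "naive" step: the words are assumed to be independent of each other given the class, which is what turns one joint probability into a product of per-word probabilities.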

Naïve Bayesian Classifier

This method is as practical as neural networks and decision trees, and it is applicable to text categorization and medical diagnosis. Naïve Bayesian is a suitable approach when we have a large number of data samples, each described by a conjunction of features that take values from a finite set and are assumed to be independent of each other. Assume f is the target function and x takes its values from a given set; our goal is to find the value with the maximum probability.

Image 11

Calculation Description

Image 12
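The calculation for “It is good” can be reproduced with a minimal sketch, assuming that P(word | class) is the share of sentences in that class containing the word and that no smoothing is applied. The helper names `tokenize`, `score`, and `classify` are made up for this example, and this is not how textblob computes its probabilities internally.

```python
# Training sentences from this article, as (text, label) pairs.
train = [
    ("It is good", "positive"),
    ("I feel good about it", "positive"),
    ("It is bad", "negative"),
    ("It is good", "positive"),
    ("It is bad", "negative"),
    ("I feel bad about it", "negative"),
    ("I feel it is not bad about", "positive"),
    ("I feel not good about", "negative"),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def score(sentence, label, data):
    """P(label) times the product of P(word | label), without smoothing."""
    in_class = [set(tokenize(text)) for text, lab in data if lab == label]
    result = len(in_class) / len(data)  # prior P(label)
    for word in tokenize(sentence):
        result *= sum(1 for words in in_class if word in words) / len(in_class)
    return result


def classify(sentence, data):
    """Return the label with the highest score."""
    labels = sorted({lab for _, lab in data})
    return max(labels, key=lambda lab: score(sentence, lab, data))


pos = score("It is good", "positive", train)  # 0.5 * 4/4 * 3/4 * 3/4
neg = score("It is good", "negative", train)  # 0.5 * 3/4 * 2/4 * 1/4
print(round(pos, 3), round(neg, 3), classify("It is good", train))
# 0.281 0.047 positive
```

Without smoothing, a word that never appears in a class drives that class's score to zero, which is why practical implementations usually add a small count to every word.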


Download Python

If you want a comfortable IDE and professional editor without needing to install libraries separately, you can use Anaconda and Spyder.

Image 13

Then open Anaconda Navigator from the Start menu and select “Spyder”:

Naive Bayes

Image 14

Python Implementation For Naive Bayes Classifier

Step 1: Open "Anaconda Prompt"

Image 15

Step 2: Install "textblob" for Having Necessary Library

Image 16

Step 3: Download Corpora

Image 17

Step 4: Import Important Libraries such as textblob and textblob.classifiers

from textblob.classifiers import NaiveBayesClassifier
from textblob import TextBlob

Step 5: Import Train and Test Set and Use Naive Bayes

# Training set: (sentence, label) pairs
train = [
    ('It is good', 'positive'),
    ('I feel good about it', 'positive'),
    ('It is bad', 'negative'),
    ('It is good', 'positive'),
    ('It is bad', 'negative'),
    ('I feel bad about it', 'negative'),
    ('I feel it is not bad about', 'positive'),
    ('I feel not good about', 'negative')
]

# Test set used to measure accuracy
test = [
    ('It is good', 'positive'),
    ('I feel bad about it', 'negative')
]

# Train the classifier on the training set
NB = NaiveBayesClassifier(train)

Step 6: Accuracy Computation

print("Accuracy: {0}".format(NB.accuracy(test)))
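To make clear what the accuracy figure measures, here is a small stand-alone sketch: the share of test sentences whose predicted label matches the given label. The `accuracy` function and the `keyword_classifier` stand-in are hypothetical and only illustrate the idea; they are not textblob's code.

```python
def accuracy(classify, test_set):
    """Fraction of (text, label) pairs for which classify(text) == label."""
    correct = sum(1 for text, label in test_set if classify(text) == label)
    return correct / len(test_set)


def keyword_classifier(text):
    """Hypothetical stand-in classifier, for illustration only."""
    return "negative" if "bad" in text.lower().split() else "positive"


test = [
    ("It is good", "positive"),
    ("I feel bad about it", "negative"),
]

print(accuracy(keyword_classifier, test))  # 1.0
```

With only two test sentences, the accuracy can only be 0.0, 0.5, or 1.0, so a larger test set is needed for a meaningful estimate.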

Step 7: Test Algorithm With Different Sentences

print(NB.classify('I feel good about spring'))  # 'positive'
print(NB.classify('I feel bad about dark'))   # 'negative'

Step 8: Opinion Mining or Text Mining for One Document Instead of Sentence

# Put a collection of sentences in one TextBlob;
# TextBlob gives a result for the whole text and for each sentence
blob = TextBlob("I feel bad about dark. "     # 'negative'
                "I feel good about spring. "  # 'positive'
                "finally darkness is bad."    # 'negative'
                , classifier=NB)        # sum up='negative'

# Classify the whole document
print(blob.classify())

# Classify each sentence separately
for sentence in blob.sentences:
    print(sentence)
    print(sentence.classify())


Bayesian Approach

  • Gather the available data and prior knowledge about the problem for the data set
  • Assign probabilities, probability distributions, and independence hypotheses to these data
  • Perform the above steps before observing the unknown parameter, based on our own judgment
  • Collect raw data from the environment
  • Observe the data and compare it with the data set
  • Calculate the secondary (posterior) probability distribution
  • Make predictions according to the average of the secondary probability
  • Make decisions that reduce the error of the result
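The steps above can be sketched on a toy problem. Everything in this example is assumed for illustration: two made-up hypotheses about how often a review is positive, equal prior probabilities, and three invented observations. The prior is updated to a posterior (the "secondary" distribution), and the prediction averages the hypotheses weighted by that posterior.

```python
# Prior belief, chosen before observing any data (assumed values).
priors = {"mostly_positive": 0.5, "mostly_negative": 0.5}
# Assumed P(a review is positive | hypothesis).
p_positive = {"mostly_positive": 0.8, "mostly_negative": 0.3}


def posterior(priors, likelihoods, observations):
    """Update the prior with the observations using Bayes' rule."""
    unnormalized = {}
    for h, prior in priors.items():
        weight = prior
        for obs in observations:
            p = likelihoods[h]
            weight *= p if obs == "positive" else 1 - p
        unnormalized[h] = weight
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}


def predict_positive(post, likelihoods):
    """Average the hypotheses' predictions, weighted by the posterior."""
    return sum(post[h] * likelihoods[h] for h in post)


observed = ["positive", "positive", "negative"]  # invented raw data
post = posterior(priors, p_positive, observed)
print(post)
print(predict_positive(post, p_positive))
```

After two positive observations and one negative one, the posterior shifts toward the "mostly_positive" hypothesis, and the averaged prediction lies between the two hypotheses' own predictions.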


  • Each observed sample can shift the estimated accuracy of our hypothesis
  • A new hypothesis is obtained by combining previous knowledge with the new sample
  • Bayesian approaches can predict situations in a probabilistic way
  • New samples are categorized by a weighted combination of several hypotheses
  • Even in situations where Bayesian approaches do not work, they are a good yardstick for evaluating other approaches


  • They require an almost complete data set (a huge volume of data)
  • In the absence of such a training data set, we have to estimate the probabilities
  • This estimation is based on previous observation, experience …
  • Making this approximation is very expensive


Feel free to leave any feedback on this article; it is a pleasure to see your opinions and vote about this code. If you have any questions, please do not hesitate to ask me here.


  • 21st July, 2017: Initial version


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By
Doctorandin Technische Universität Berlin
Iran (Islamic Republic of)
I have been working with different technologies and data more than 10 years.
I`d like to challenge with complex problem, then make it easy for using everyone. This is the best joy.

ICT Master in Norway 2013
Doctorandin at Technische Universität Berlin in Data Scientist ( currently )
Diamond is nothing except the pieces of the coal which have continued their activities finally they have become Diamond.
