Licence CPOL
First Posted 28 May 2006

A Naive Bayesian Classifier in C#

By ErichG | 28 May 2006 | Article

Introduction

I was looking for a way to classify short texts into several categories. Naive Bayesian classification seemed a simple and probably sufficient method. Searching for existing code, I found many implementations in Perl and Java. The only CLR implementation I could find was NClassifier, but it did not support classification into multiple classes. Therefore, I decided to write my own.

Background

There is plenty of information on the Internet describing the theory of Bayesian classification. Wikipedia has a good introduction.
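
For orientation, the textbook naive Bayes decision rule can be written as below. This is the standard formulation, not necessarily the exact computation this classifier performs; the symbols c (a category), w_i (the i-th word of the text) and n (the number of words) are introduced here only for illustration.

```latex
\hat{c}
  = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(w_i \mid c)
  = \arg\max_{c} \left[ \log P(c) + \sum_{i=1}^{n} \log P(w_i \mid c) \right]
```

Both forms select the same category because the logarithm is monotonic. Since every probability is at most 1, each logarithm is at most 0, so scores computed in the logarithmic form are zero or negative.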

Using the Code

First, create an instance of BayesClassifier.Classifier.

BayesClassifier.Classifier m_Classifier = new BayesClassifier.Classifier();

Tip: You may experiment with BayesClassifier.ExcludedWords to define the words that you will consider irrelevant for your classification. That can lead to smaller dictionaries and therefore speed up the classification.
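
To illustrate why excluding words shrinks the dictionaries, here is a minimal sketch of stop-word filtering during tokenization. It does not use BayesClassifier.ExcludedWords itself, whose exact shape is not shown in this article; the class StopWordSketch, the method Tokenize and the word list are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class StopWordSketch
{
    // Hypothetical stop-word list; NOT the contents of BayesClassifier.ExcludedWords.
    private static readonly HashSet<string> Excluded =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "the", "a", "an", "and", "of", "to"
        };

    // Splits the text on runs of non-letter characters, drops empty tokens
    // and excluded words, and lowercases what remains.
    public static List<string> Tokenize(string text)
    {
        return Regex.Split(text, @"[^\p{L}]+")
            .Where(w => w.Length > 0 && !Excluded.Contains(w))
            .Select(w => w.ToLowerInvariant())
            .ToList();
    }
}
```

Calling StopWordSketch.Tokenize("The cat and the hat") keeps only "cat" and "hat", so only those two words would need dictionary entries.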

Then define the categories and teach each category:

// "file" is assumed to be the path of a plain-text training file.
m_Classifier.TeachCategory("Cat1", new System.IO.StreamReader(file));
m_Classifier.TeachPhrases("Cat2", new string[] { "Hi", "HoHo" });
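
As a rough picture of what teaching and classifying can involve, the following is a minimal, self-contained naive Bayes sketch. It is not the source of BayesClassifier.Classifier: the class TinyNaiveBayes, its methods, the uniform prior, the add-one smoothing and the use of the natural logarithm are all assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal classifier; NOT the BayesClassifier.Classifier implementation.
public class TinyNaiveBayes
{
    // category -> (word -> number of times the word was seen in that category)
    private readonly Dictionary<string, Dictionary<string, int>> _counts =
        new Dictionary<string, Dictionary<string, int>>();

    // All distinct words seen in any category.
    private readonly HashSet<string> _vocabulary = new HashSet<string>();

    private static IEnumerable<string> Words(string text)
    {
        return text.ToLowerInvariant().Split(
            new[] { ' ', '\t', '\r', '\n', '.', ',', '!', '?' },
            StringSplitOptions.RemoveEmptyEntries);
    }

    // Adds the words of the given phrases to the word counts of a category.
    public void Teach(string category, IEnumerable<string> phrases)
    {
        if (!_counts.TryGetValue(category, out var wordCounts))
        {
            wordCounts = new Dictionary<string, int>();
            _counts[category] = wordCounts;
        }
        foreach (var word in phrases.SelectMany(Words))
        {
            wordCounts.TryGetValue(word, out var seen);
            wordCounts[word] = seen + 1;
            _vocabulary.Add(word);
        }
    }

    // Returns one log-score per category (natural log, uniform prior,
    // add-one smoothing). Assumes at least one word has been taught.
    public Dictionary<string, double> Classify(string text)
    {
        var words = Words(text).ToList();
        var scores = new Dictionary<string, double>();
        foreach (var category in _counts)
        {
            double total = category.Value.Values.Sum();
            double score = 0.0;
            foreach (var word in words)
            {
                category.Value.TryGetValue(word, out var seen);
                score += Math.Log((seen + 1) / (total + _vocabulary.Count));
            }
            scores[category.Key] = score;
        }
        return scores;
    }
}
```

After teaching "greeting" with { "hi there", "hello hi" } and "farewell" with { "bye now", "goodbye bye" }, classifying "hi hi" gives "greeting" the higher (less negative) score, because "hi" was only seen in that category.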

Finally, call BayesClassifier.Classifier.Classify, which returns the classification result as a score for each category.

Dictionary<string, double> score = 
    m_Classifier.Classify(new System.IO.StreamReader(file));
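
Once you have the score dictionary, you will usually want the best-matching category. The sketch below assumes that a higher score means a better match, which this article does not state explicitly; ScoreSketch and BestCategory are hypothetical names, not part of the BayesClassifier library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScoreSketch
{
    // Returns the category with the highest score, or null if there are no scores.
    // Assumption: a higher score means a better match.
    public static string BestCategory(IDictionary<string, double> score)
    {
        if (score.Count == 0)
        {
            return null;
        }
        return score.OrderByDescending(kv => kv.Value).First().Key;
    }
}
```

With the result from above, this would be called as ScoreSketch.BestCategory(score).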

Let me know if you have any questions or suggestions, or any experience with how well the naive Bayesian approach works in practice, since its (incorrect) assumption of word independence may influence the results.

History

  • 28th May, 2006: Version 1.0

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).

About the Author

ErichG



Germany




Comments and Discussions

General: My vote of 4 | T. Abdul Rahman | 1:24 15 Nov '11
General: My vote of 4 | InterstingCodes | 17:11 7 Mar '11
General: Learn and Test | Member 31252911:49 16 Jan '11
General: Need help for mail filter | Kasunmit | 16:50 21 Jul '10
Question: Why are the results so low? | Keith Vinson | 13:04 10 Apr '08
General: Translating the scores to regular probabilities. | Jeepy | 11:44 29 Jan '08
In the forum notes below, the author stated that the Bayesian probability is really a logarithm. For those of you who don't like math, you can convert it back from a logarithm by raising 10 to the power of the value returned by this program.

So if you got a returned value like Cat1: -0.30102999566, you would take 10^-0.30102999566, which roughly equals 0.5 or 50%.

In the buttonTest_Click event handler of the BayesClassifierDemo, you can change the values back to proportions by using Math.Pow(10, score[c]), which raises 10 to the power of score[c], the value returned from the classifier.

It's a shame the guy who wrote this couldn't have added that simple step; it would have removed a lot of other people's confusion.
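
The conversion described in this comment can be sketched as a small helper. It assumes, as the comment does, that the scores are base-10 logarithms, which the article itself does not confirm; ProbabilitySketch and FromLog10 are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ProbabilitySketch
{
    // Maps each score x to 10^x.
    // Assumption: the scores are base-10 logarithms.
    public static Dictionary<string, double> FromLog10(IDictionary<string, double> score)
    {
        return score.ToDictionary(kv => kv.Key, kv => Math.Pow(10, kv.Value));
    }
}
```

The converted values are not normalized here, so they need not sum to 1 across categories.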
General: Re: Translating the scores to regular probabilities. | Keith Vinson | 12:55 10 Apr '08
Question: Implementing in VC++ ? | Vaclav_Sal | 16:11 11 Sep '07
Question: What exactly score means | NateD | 8:41 23 Aug '07
General: Please contact me | yonido | 23:24 13 May '07
Question: I ve a Question | Junaid_Arif_Mufti | 20:03 26 Apr '07
Answer: Re: I ve a Question | Junaid_Arif_Mufti | 20:05 26 Apr '07
Question: Questions/suggestions - please contact me | MalteSteckmeister | 14:23 17 Dec '06
Question: how to use your classifier | rkamalakar | 0:57 12 Dec '06
Answer: Re: how to use your classifier | ErichG | 5:19 2 Jan '07
Question: How can I use your classifier to classify 20NewsGroup? | johnny1983 | 13:49 29 Nov '06
Answer: Re: How can I use your classifier to classify 20NewsGroup? | ErichG | 6:02 2 Jan '07
Question: Re: How can I use your classifier to classify 20NewsGroup? | Tokes Erno | 13:29 25 Apr '07
General: why results tends to negative | abdo123456789 | 12:00 30 May '06
General: Re: why results tends to negative | ErichG | 22:39 1 Jun '06
General: 5-stars all the same | Grav-Vt | 5:01 28 May '06
General: Re: 5-stars all the same | ErichG | 21:06 28 May '06

Last Updated 28 May 2006
Article Copyright 2006 by ErichG