Word stemming for German on .NET Framework

22 Feb 2003
An article on a word stemming algorithm implemented for the German language on the .NET Framework.

Introduction

I found many word stemming algorithms implemented for English, some very good and others less so. I once had a project that required a stemming algorithm for German on the .NET Framework, and I could not find any existing .NET implementation for German. So this is my implementation of word stemming for the German language, written in C# for the .NET Framework.

Included are the source code for stemingLib.dll and example source code that shows how to use stemingLib.dll. The demo project comes with a sample German vocabulary in the file rjecnik.txt and the same vocabulary, correctly stemmed, in output.txt. The demo application writes a freshly stemmed vocabulary to rezultat.txt and compares it with output.txt to check for errors. This is actually my test application for this implementation. Another example is given in the next section of this article.
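
The check performed by the demo application can be sketched roughly as follows. This is not the demo's actual source: the class name StemmerRegressionCheck and its Run method are hypothetical, the files are assumed to hold one word per line in the same order, and the stemmer is passed in as a delegate so the sketch compiles without stemingLib.dll.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of stemingLib.dll.
public static class StemmerRegressionCheck
{
    // Stems every word in inputPath, writes the stems to resultPath,
    // and returns one message per line that differs from expectedPath.
    // Assumption: each file holds one word per line, in the same order.
    public static List<string> Run(Func<string, string> stem,
        string inputPath, string resultPath, string expectedPath)
    {
        string[] words = File.ReadAllLines(inputPath);
        string[] expected = File.ReadAllLines(expectedPath);
        string[] results = new string[words.Length];
        List<string> mismatches = new List<string>();

        for (int i = 0; i < words.Length; i++)
        {
            // The library expects lowercase input, so normalize first.
            results[i] = stem(words[i].Trim().ToLowerInvariant());

            string want = i < expected.Length ? expected[i].Trim() : "<missing>";
            if (results[i] != want)
            {
                mismatches.Add(string.Format("line {0}: {1} -> {2} (expected {3})",
                    i + 1, words[i], results[i], want));
            }
        }

        if (expected.Length > words.Length)
        {
            mismatches.Add(string.Format("reference file has {0} extra line(s)",
                expected.Length - words.Length));
        }

        File.WriteAllLines(resultPath, results);
        return mismatches;
    }
}
```

With the library referenced, a call might look like StemmerRegressionCheck.Run(w => new destemmer().Stem(w), "rjecnik.txt", "rezultat.txt", "output.txt"); an empty list would mean the new output matches the reference.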

Using the code

This is a simple library with only one class, destemmer. The class has one property, Word, and one method, Stem. You can use it like this:

C#
// Create the object that will perform the stemming
destemmer Stemmer = new destemmer();
Console.WriteLine("\n Input a German word: ");
string g_word = Console.ReadLine();
// First way: call Stem(string word) to get the stemmed word directly.
string stemmed_word = Stemmer.Stem( g_word.ToLower() );
// Second way: set the 'Word' property first
Stemmer.Word = g_word.ToLower();
// Then call Stem() without arguments
Stemmer.Stem();
/* and read the result back from 'Word'
(after Stem() has been called, the Word property
contains the stemmed word, not the original word) */
string stemmed_word2 = Stemmer.Word;
Console.WriteLine("\n Stemming result, first way: {0} ",
                                              stemmed_word);
Console.WriteLine("\n Stemming result, second way: {0} ",
                                             stemmed_word2);
/* And it's that simple */

Points of interest

This library works only with Unicode strings, and every word passed to the Stem() function or assigned to the Word property has to be lowercase.

Algorithm: I will not go into the details of the algorithm I used for German word stemming. Everything you might want to know can be found at http://snowball.tartarus.org/, where I found the original algorithm for the German language that I implemented on the .NET Framework.
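
Because the library expects lowercase input, free text has to be split and lowercased before it reaches the stemmer. The helper below is one possible way to do that, not something the library provides: the names StemInput and ToLowercaseWords are hypothetical, and treating every run of Unicode letters as a word is an assumption about what counts as a token.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of stemingLib.dll.
public static class StemInput
{
    // Splits text into runs of Unicode letters (\p{L}) and lowercases
    // each run, so umlauts and ß stay inside the word they belong to.
    public static List<string> ToLowercaseWords(string text)
    {
        List<string> words = new List<string>();
        foreach (Match m in Regex.Matches(text, @"\p{L}+"))
        {
            // Culture-independent lowercasing; digits and punctuation
            // were already dropped by the pattern above.
            words.Add(m.Value.ToLowerInvariant());
        }
        return words;
    }
}
```

For example, StemInput.ToLowercaseWords("Die Häuser und Straßen.") yields die, häuser, und, straßen, and each of those strings can then be passed to Stem().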

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Written By
Web Developer
Bosnia and Herzegovina