
Word stemming for German on .NET Framework

EasyWay, 22 Feb 2003
An article on a word stemming algorithm implemented for the German language on the .NET Framework.

Introduction

I have found many word stemming algorithms implemented for English, some very good and others less so. I once had a project where my task was to implement a stemming algorithm for German on the .NET Framework, and I could not find any existing .NET implementation for German. So this is my implementation of word stemming for the German language, written in C# for the .NET Framework.

The download contains the source code for stemingLib.dll and example source code showing how to use it. The demo project includes a sample German vocabulary in the file rjecnik.txt and the same vocabulary, correctly stemmed, in output.txt. The demo application writes a newly stemmed vocabulary to rezultat.txt and compares it with output.txt to check for errors. This is actually my test application for this implementation. Another example is given in the next section of this article.
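The comparison the demo application performs can be sketched roughly as follows. This is a minimal sketch, not the demo's actual source: `StemCheck.FindMismatches` is a hypothetical helper, the `stem` delegate stands in for whatever stemming function is under test, and the sketch assumes both lists hold one word per entry in the same order.

```csharp
using System;
using System.Collections.Generic;

public static class StemCheck
{
    // Hypothetical helper (not part of stemingLib.dll): stems each input word
    // with the supplied function and compares the result with the expected
    // stem at the same position. Returns one message per mismatch.
    public static List<string> FindMismatches(
        IList<string> words, IList<string> expected, Func<string, string> stem)
    {
        if (words.Count != expected.Count)
            throw new ArgumentException("Both lists must have the same number of entries.");

        var mismatches = new List<string>();
        for (int i = 0; i < words.Count; i++)
        {
            string actual = stem(words[i]);
            // Ordinal comparison: stems must match character for character.
            if (!string.Equals(actual, expected[i], StringComparison.Ordinal))
                mismatches.Add($"{words[i]}: expected {expected[i]}, got {actual}");
        }
        return mismatches;
    }
}
```

With the library referenced, the word lists could be loaded with `File.ReadAllLines("rjecnik.txt")` and `File.ReadAllLines("output.txt")`, passing `w => new destemmer().Stem(w)` as the `stem` argument. The encoding of the two files is an assumption here and may need to be specified explicitly.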

Using the code

This is a simple library with only one class, called destemmer. The class has one property, Word, and one method, Stem, which can be called with or without a parameter. You can use the class like this:

// Requires "using System;" and a reference to stemingLib.dll.
// Create the object that will perform the stemming.
destemmer Stemmer = new destemmer();
Console.WriteLine("\n Input some German word: ");
string g_word = Console.ReadLine();
// First way: call Stem(string word) to get the stemmed word directly.
string stemmed_word = Stemmer.Stem( g_word.ToLower() );
// Second way: set the 'Word' property...
Stemmer.Word = g_word.ToLower();
// ...then call Stem() without arguments.
Stemmer.Stem();
/* After Stem() has been called, the Word property
contains the stemmed word, not the original word,
so the result is retrieved like this: */
string stemmed_word2 = Stemmer.Word;
Console.WriteLine("\n Stemming result, first way: {0} ", 
                                              stemmed_word);
Console.WriteLine("\n Stemming result, second way: {0} ", 
                                             stemmed_word2);
/* And it's that simple. */
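Because the library expects lowercase input, the ToLower() calls in the example above could be folded into a small wrapper so that callers cannot forget them. The following is only a sketch under assumptions: `StemInput` and its methods are hypothetical names that are not part of stemingLib.dll, the `stem` delegate stands in for `destemmer.Stem`, and `ToLowerInvariant()` is swapped in for `ToLower()` so the result does not depend on the current culture (under a Turkish culture, for example, `ToLower()` maps "I" to a dotless "ı").

```csharp
using System;

public static class StemInput
{
    // Hypothetical helper: trims surrounding whitespace and lowercases the
    // word independently of the current thread culture.
    public static string Normalize(string word)
    {
        if (word == null) throw new ArgumentNullException(nameof(word));
        return word.Trim().ToLowerInvariant();
    }

    // Normalizes the word first, then passes it to the supplied stemming
    // function (assumed to behave like destemmer.Stem(string)).
    public static string StemNormalized(string word, Func<string, string> stem)
    {
        if (stem == null) throw new ArgumentNullException(nameof(stem));
        return stem(Normalize(word));
    }
}
```

With the library referenced, a call might look like `StemInput.StemNormalized(g_word, new destemmer().Stem)`.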

Points of interest

This library works only with Unicode strings, and every word passed to the Stem() function or assigned to the Word property has to be lowercase.

Algorithm: There is no need for me to give the details of the algorithm used for this implementation of German word stemming. All the information you might be interested in can be found at http://snowball.tartarus.org/, where I found the original algorithm for the German language, which I then implemented on the .NET Framework.
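For readers who want a feel for the Snowball approach without reading the full description, the sketch below computes the two regions, R1 and R2, to which the German stemmer restricts its suffix removal. It is an illustration based on the published Snowball definition, not code taken from stemingLib.dll: `GermanRegions` is a hypothetical name, and the sketch deliberately leaves out the preliminary steps of the German stemmer (replacing ß with ss and marking u and y between vowels as consonants), so it assumes an already lowercase word without those cases.

```csharp
using System;

public static class GermanRegions
{
    // Vowels as listed in the Snowball German stemmer description.
    private const string Vowels = "aeiouyäöü";

    private static bool IsVowel(char c) => Vowels.IndexOf(c) >= 0;

    // Index just past the first non-vowel that follows a vowel, searching
    // from 'start'; word.Length (an empty region) if there is none.
    private static int RegionStart(string word, int start)
    {
        for (int i = start + 1; i < word.Length; i++)
        {
            if (!IsVowel(word[i]) && IsVowel(word[i - 1]))
                return i + 1;
        }
        return word.Length;
    }

    // Returns the start indexes of R1 and R2 for a lowercase word.
    public static (int R1, int R2) Compute(string word)
    {
        int r1 = RegionStart(word, 0);
        // R2 is found by applying the same rule again from the unadjusted R1.
        int r2 = RegionStart(word, r1);
        // German-specific adjustment: at least 3 letters must precede R1.
        if (r1 < 3) r1 = Math.Min(3, word.Length);
        return (r1, r2);
    }
}
```

For the word "kategorien", R1 begins after "kat" and R2 after "kateg", so a suffix is only removed if it lies inside the relevant region.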

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

EasyWay
Web Developer
Bosnia and Herzegovina
No Biography provided

Comments and Discussions

 
Suggestion: Update: words ending with 'nisse' (Stormrider42, 12-Dec-11 10:46)
General: Word Stemming with the open office spell checker (Thomas Maierhofer, 15-Nov-09 4:49)
General: License for the C# stemming (John Kuntoff, 7-Feb-05 0:34)
General: Re: License for the C# stemming (Anonymous, 28-Aug-05 10:45)
General: another german stemmer (bergla, 25-Feb-03 1:27)
General: Re: another german stemmer (EasyWay, 27-Feb-03 9:55)
General: lowercase (Andreas Saurwein, 24-Feb-03 5:23)
General: Re: lowercase (EasyWay, 27-Feb-03 10:00)
General: Re: lowercase (Stoyan Damov, 8-Mar-03 18:13)
General: Re: lowercase (EasyWay, 9-Mar-03 12:34)
Yes, you and Andreas are right, and sorry for that mistake. In my case, when I used it, I did not need to convert words to lowercase inside the destemmer class. That is the main reason why there is no ToLower() where it should be.

Faruk Kasumovic.
Student of Electrical Engineering (Information Technologies) in Tuzla, Bosnia and Herzegovina.

Last Updated 23 Feb 2003
Article Copyright 2003 by EasyWay