|
I have been working on this version, and it is a little more involved than the CEdit implementation. Basically, you must disable updates, use SetSel to select each word, check it, and when you are done, reset the selection to its original position and re-enable updates. Where it gets complicated is calculating where to draw the squiggly lines. You have to get the charformat, create a compatible DC, determine the text height, and from that calculate the position of the line. This can be problematic when a single word uses different fonts for different characters.
I am glad to see that someone is using or at least looking at this project. Unfortunately, due to my workload I have not been able to invest much time in completing this spell checker. I am hopeful that an upcoming project will require a spell checker and I will be able to finish it then.
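The select-and-check loop described above needs the character range of every word before it can call SetSel. A minimal sketch of that step in portable C++ follows. The function name FindWordRanges and the rule that a word is a run of letters or apostrophes are assumptions for illustration, not taken from the project.

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Returns the [start, end) character offsets of each word in `text`.
// Assumption: a "word" is a run of letters or apostrophes.
std::vector<std::pair<long, long>> FindWordRanges(const std::string& text)
{
    std::vector<std::pair<long, long>> ranges;
    long start = -1;  // -1 means "not currently inside a word"
    const long length = static_cast<long>(text.size());

    // Loop one position past the end so a trailing word gets closed.
    for (long i = 0; i <= length; ++i)
    {
        const bool inWord = i < length &&
            (std::isalpha(static_cast<unsigned char>(text[i])) || text[i] == '\'');

        if (inWord && start < 0)
        {
            start = i;                      // a word begins here
        }
        else if (!inWord && start >= 0)
        {
            ranges.emplace_back(start, i);  // the word ended at i
            start = -1;
        }
    }
    return ranges;
}
```

Each pair could then be passed to CRichEditCtrl::SetSel(start, end), provided the offsets are computed on text that uses the same character indexing as the control (line-break handling in particular may differ).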
|
|
|
|
|
Yeah, I was kind of hoping I could use it in CRichEditView, but I guess I will just have to do some work on it. My current project is kind of boring me.
-Matt Newman
|
|
|
|
|
I have the same question. So, if you find anything, please tell me.
Best regards
George Clarence
|
|
|
|
|
Sorry, my correct email address is george_clarence@yahoo.co.uk
|
|
|
|
|
Make your engine much faster and try to make it an OCX.
|
|
|
|
|
The application crashes on close. The bug is in the following function:
void CFPSSpellingEditCtrl::Terminate()
{
    if (m_pEngine)
    {
        ASSERT(m_pEngine->GetUserDic());
        ....
The ASSERT above reports the error in debug mode, but there is no protective code to handle it, and ASSERT compiles to nothing in release builds. After the ASSERT above you should add:
        if (m_pEngine->GetUserDic())
        {
            // the code that uses the user dictionary belongs inside this block
        }
Regards,
Miroslav Rajcic
http://www.spacetide.com
|
|
|
|
|
Thanks for the note. It did not occur on my test PC because there is always a user dic. I will fix the problem ASAP.
|
|
|
|
|
LOL guess not ASAP. Luckily I debugged it and ignored the assert error the 1st time through..<cough>
|
|
|
|
|
As the developer of this project, I am interested in knowing what languages the users of this site would like to see. Also, I am in need of word lists for the various languages.
Thank You,
Matt Gullett
|
|
|
|
|
I may be able to get some word lists for you. Send me an email (link below).
I work with languages that have never been written, so I'm very interested in this project. I am having trouble finding the Common Speller API. MS seems to deny any knowledge of it.
Birch
|
|
|
|
|
Russian, please.
|
|
|
|
|
I'd like to see:
Mexican Spanish
Spain Spanish
Japanese
Simplified Chinese
Portuguese
After that, I'd be able to use this in an app we have at work. I'd give you word lists for each language, but I haven't the first clue about finding them.
Jeremy Falcon
|
|
|
|
|
There's a newer algorithm by the author of Metaphone known as Double Metaphone. You can find it at http://www.cuj.com/archive/1806/feature.html. I don't know how it will compare to your modified Metaphone, but you should check it out.
|
|
|
|
|
Thanks for the URL. Actually, I have already been researching the double-metaphone algorithm and I intend to implement it into the spelling engine. Some other things I have learned, though:
1) Some commercial spell checkers also use a word-reduction algorithm (which keeps some vowels) to augment their search results. I have been looking at how to implement such a routine as well.
2) At least MS Word (and probably others) has also developed a database of human-created word-reduction and metaphone outputs. These human-created outputs are used in their dictionaries in place of the computer-created ones to give better results. I have already gone through my USEnglish dictionary and hand-coded many words with the letter 'G' in them.
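One way to picture point 2 is a lookup table of hand-coded keys that takes priority over the computed key. The sketch below is only an illustration: PhoneticKeys, AddOverride, KeyFor and ComputedKey are hypothetical names, and ComputedKey is a trivial stand-in (keep the first letter, drop later vowels), not the Metaphone variant used by the engine.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>

// Stand-in for a computed phonetic key (NOT Metaphone): upper-cases the
// word, keeps the first character, and drops all later vowels.
std::string ComputedKey(const std::string& word)
{
    std::string key;
    for (std::size_t i = 0; i < word.size(); ++i)
    {
        const char upper = static_cast<char>(
            std::toupper(static_cast<unsigned char>(word[i])));
        const bool isVowel = std::string("AEIOU").find(upper) != std::string::npos;
        if (i == 0 || !isVowel)
            key += upper;
    }
    return key;
}

// Hand-coded keys override the computed key when present.
class PhoneticKeys
{
public:
    void AddOverride(const std::string& word, const std::string& key)
    {
        m_overrides[word] = key;
    }

    std::string KeyFor(const std::string& word) const
    {
        const auto it = m_overrides.find(word);
        return it != m_overrides.end() ? it->second : ComputedKey(word);
    }

private:
    std::unordered_map<std::string, std::string> m_overrides;
};
```

With an override such as AddOverride("gnome", "NM"), the silent 'G' no longer ends up in the key, while words without an override fall back to the computed key.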
|
|
|
|
|
Would it be enough to replace the dictionaries (e.g. english => german) to use your class in another language, or is the algorithm written around "english words"?
Uwe Keim
See me: http://www.zeta-software.de/~uwe
|
|
|
|
|
The matching engine uses numerous "english specific" algorithms to enhance the result list. I do not know much German, so I am not sure how well the engine will map to German. It would be worth a try, though. Probably the #1 function in the whole class that would need modification for other languages is the MetaphoneEx function. I think this function could be modified to work for German.
|
|
|
|
|
Finish English first, but be aware:
English is very easy compared to German and especially to (Eastern) European, Slavic, and other languages.
Generally, one big difference is that in English a word has one form for all circumstances.
In German there are 4 noun cases; we (s) have 7:
in English you say: of word, about word, with word
we say: zo slova, o slove, so slovom
It is similar for other kinds of words (I will not try to name them in English):
in English you have green; in mine: zeleny (he), zelena (she), zelene (it), ... (similar in German: something like gruner, grune, grunes)
in English more green/greener (? silly, take it as an example only); we have zelensi; most green is najzelensi
and combinations: about green word is o zelenom slove (German: um grune wort (?!))
sometimes (very) regular, sometimes not
etc. etc.
Knowing these complicated rules, you can eliminate many duplicate or similar cases to keep the actual database smaller.
Keep smiling and finish English first.
t!
|
|
|
|
|
I guess English is the only Western language (the only ones I can talk about) where a spell checker without a grammar check makes sense.
In all other languages you would probably first reduce a word to its prefix- and suffix-less root(s), spell-check the root(s), check whether the root(s) support the prefixes and suffixes, and then recombine.
Multiple roots occur in languages like German, which allows almost free combination of many words into one, a feature that is very commonly used with up to three words (the combinatorics start getting big here).
Roots would in general not be unique (suffixes like -s -es, prefixes like a- an-).
The suffixes are mostly implied by grammar and account for a good part of the spelling errors.
Suffixes of different words must match (or rather, the grammatical entities they imply must).
Only grammar can decide whether a specific word is a noun, adjective, or verb (with nouns capitalized in German).
So an 'English' spell checker could be used to check the roots, with some code added for the rest.
Wolfgang Reichl
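The root-plus-suffix check described in the post above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: RootTable and IsSpelledCorrectly are hypothetical names, only suffixes are handled (no prefixes, no compound words, no grammatical agreement between words), and the table contents come from the caller.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Maps each known root to the set of suffixes it accepts.
// An empty string in the set means the bare root is itself a valid word.
using RootTable = std::map<std::string, std::set<std::string>>;

// A word is accepted if it can be split into a known root followed by
// a suffix which that root supports.
bool IsSpelledCorrectly(const std::string& word, const RootTable& roots)
{
    // Try every split point, from "whole word is the root" downwards.
    for (std::size_t rootLen = word.size(); rootLen > 0; --rootLen)
    {
        const auto it = roots.find(word.substr(0, rootLen));
        if (it != roots.end() && it->second.count(word.substr(rootLen)) > 0)
            return true;
    }
    return false;
}
```

For example, with a root "klein" that accepts the suffixes "", "e", "er" and "es", the words "klein" and "kleines" pass while "kleinx" does not. Because the dictionary stores one root plus a suffix list instead of every inflected form, it also stays smaller.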
|
|
|
|
|
In my opinion you might do the following to improve the code:
- separate the language-dependent parts from the language-independent ones
- break the code into more than one class (CDictionary abstract class, CMainDictionary, CUserDictionary, CDictionaryIndex, ...)
- create some kind of standard dictionary format with header fields (language, creator, ...), indexing, and compression (finding common word suffixes, gzip)
- create an application for converting word lists into the dictionary format
I think your project has big potential and lots of us are willing to help you create something really big from it.
Also, I have a lot of word lists for different languages, so contact me if you are interested in publishing them.
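As a rough illustration of the class split suggested in the list above, the sketch below shows an abstract CDictionary with two implementations. Only the class names come from the list; the member functions (Contains, AddWord), the std::set storage and the helper IsKnownWord are assumptions made for the example.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Abstract interface shared by all dictionaries.
class CDictionary
{
public:
    virtual ~CDictionary() = default;
    virtual bool Contains(const std::string& word) const = 0;
};

// Read-only main dictionary, filled once at construction.
class CMainDictionary : public CDictionary
{
public:
    explicit CMainDictionary(std::set<std::string> words)
        : m_words(std::move(words)) {}

    bool Contains(const std::string& word) const override
    {
        return m_words.count(word) > 0;
    }

private:
    std::set<std::string> m_words;
};

// User dictionary that can grow while the application runs.
class CUserDictionary : public CDictionary
{
public:
    void AddWord(const std::string& word) { m_words.insert(word); }

    bool Contains(const std::string& word) const override
    {
        return m_words.count(word) > 0;
    }

private:
    std::set<std::string> m_words;
};

// Checks a word against every dictionary in turn. Null entries are
// skipped, so a missing user dictionary is not an error.
bool IsKnownWord(const std::string& word,
                 const std::vector<const CDictionary*>& dictionaries)
{
    for (const CDictionary* dictionary : dictionaries)
    {
        if (dictionary != nullptr && dictionary->Contains(word))
            return true;
    }
    return false;
}
```

Callers that only see the CDictionary interface do not need to know whether a word came from the main list or from the user, which is what would let a language-specific dictionary be swapped in later.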
Regards,
Miroslav Rajcic <miroslav.rajcic@inet.hr>
http://www.spacetide.com
|
|
|
|
|
OK, so the problem is not the spell checker, but the languages. As I come from Holland, I know a saying in English: Double Dutch, and so it is. Like its big brother German, Dutch has words which are male, female, multiple or no-gender. In German, you've got articles like:
Der, Des, Dem, Den, Die which all mean: THE
and
Das, Der, Die which all mean: IT
In Dutch it's more English-like: "De" and "Het" for IT and "De" only for THE, but if you want to use a prefix to give a word a more tiny sound, you must use "Het" in every case, even if the word has a gender.
I think, if someone wants to add a new language, the English classes are obsolete. I think the solution is to create different classes with words in them, like:
CGenderMale, CGenderFemale, CMultiple and CNoGender. Also CNoun, and CSuffix (which can be language specific), e.g.
class CSuffix
{
    if (Gender == "Male")
    {
        AddSuffix("er");
    }
    else if (Gender == "Female")
    .
    .
    .
}
And so on. Another class, for something specific to English, would be CWordPartCount. In this class you can set something like:
if (Count < 2)
{
    DoSuffix();
}
else if (Count == 2)
{
    DoSuffixAndMoreOrLess();
}
else
{
    DoMoreOrLess();
}
Get the idea? Now, if a word has only one part, only suffixes are displayed, e.g. Cool (Cooler and Coolest). For a word with two parts, both suffixes and prefixes are shown, e.g. Crazy (Crazier but also More Crazy), and lastly, when a word has 3 parts or more, only prefixes are shown, e.g. Pathetic (More pathetic, Most pathetic).
Get the drill?
And there are numerous classes needed for feeding information about whether the word is Irregular or not, if it's a verb or not and so on. So I think a wide discussion is needed to get the "Perfect" Spell Checker.
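The word-part rule from this post can be tried out with a small, self-contained sketch. The function names (CountWordParts, AddEr, ComparativeForms) are hypothetical, and the part count is a crude assumption: it counts runs of vowels, treating 'y' as a vowel, so it will be wrong for words with a silent 'e' and for many irregular words.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Rough estimate of the number of word parts (syllables): counts runs
// of consecutive vowels, with 'y' treated as a vowel.
int CountWordParts(const std::string& word)
{
    int count = 0;
    bool previousWasVowel = false;
    for (char c : word)
    {
        const char lower = static_cast<char>(
            std::tolower(static_cast<unsigned char>(c)));
        const bool isVowel = std::string("aeiouy").find(lower) != std::string::npos;
        if (isVowel && !previousWasVowel)
            ++count;
        previousWasVowel = isVowel;
    }
    return count;
}

// Appends "er", turning a final 'y' into "ier" (crazy -> crazier).
std::string AddEr(const std::string& word)
{
    if (!word.empty() && word.back() == 'y')
        return word.substr(0, word.size() - 1) + "ier";
    return word + "er";
}

// One part: suffix only. Two parts: suffix and "more". Three or more:
// "more" only.
std::vector<std::string> ComparativeForms(const std::string& word)
{
    const int parts = CountWordParts(word);
    std::vector<std::string> forms;
    if (parts <= 2)
        forms.push_back(AddEr(word));
    if (parts >= 2)
        forms.push_back("more " + word);
    return forms;
}
```

With these rules "cool" yields only "cooler", "crazy" yields "crazier" and "more crazy", and "pathetic" yields only "more pathetic", matching the three cases in the post. A real implementation would still need the per-word information about irregular forms mentioned above.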
CString Dutch = "Double Dutch";
|
|
|
|
|
Alright, this is great so far. I'd love to use this in our commercial project, and I'd love to help develop the software at no cost, provided we can use it in our commercial software. BUT, it's lacking language support, as mentioned. Now, previous writers want some European languages, even Eastern European ones, but we'd require worldwide support including Thai, Chinese etc. and preferably also some functionality for a translation dictionary (?). Meaning there is an English text and you want to get suggestions for translated words in a second language.
How does this sound? Currently I think we would not want to dig into this because it's too far off at the moment.
http://www.cavena.com
|
|
|
|
|
Any support on this project is appreciated. I have been working on it now for about two months in my spare time and there is still a great deal to do. I have been doing some research on non-English language support, and I have learned a great deal about it. I am comfortable stating that when finished it will support multiple European languages. I have not researched languages like Chinese, but I know that there would be significant requirements to make it work.
That said, I am a professional developer doing this on the side and I would not feel good recommending this project for a commercial product at this time. The amount of time required to complete it and the probable availability of comparable existing products would probably lead me to look for an off-the-shelf solution.
The concept of suggesting words in another language has come up before. From what I have learned, it is doable, but requires very good and exhaustive dictionaries with information on word usage, sentence patterning, etc.
|
|
|
|
|