See more: C# VB.NET
Hello guys, good evening.
I want to write a simple parser, but I have no idea how to start.
First I get the text and split it into an array of words; what should I do next?
 
For example, I have the following phrase:
 
"Open the CD_Drive"
 
The result should be:
Open --> Verb.
the --> Determiner.
CD_Drive --> Noun.
 

Thanks for the help.
Posted 6-Feb-13 6:53am
Edited 6-Feb-13 7:05am
v2
Comments
Jameel Moideen at 6-Feb-13 12:58pm
   
What do you actually want? Please improve your question by adding your requirements.
alcitect at 6-Feb-13 13:00pm
   
Thanks for the reply; I will improve it.
Sergey Alexandrovich Kryukov at 6-Feb-13 13:10pm
   
I think the problem is just the opposite: what you need is basically clear, but the problem is big and hardly answerable in one quick question. I think real parsing of real natural languages is a hopeless task; you need to simplify things greatly. In other words, devise a very simple language that only resembles a natural one, much like a programming language. Learn some theory and practice of parsers, but do not pick an extremely universal and complex approach. You need to start with the formulation of the language, most probably in EBNF.
—SA
alcitect at 6-Feb-13 13:20pm
   
Thanks for the reply, Sergey.
Sergey Alexandrovich Kryukov at 6-Feb-13 13:32pm
   
Will you accept my answer formally (green button)? This is basically the way to go, but it's hard to give more detailed advice, as the topic is big.
It won't prevent you from getting other advice; maybe someone knows a project which is closer to what you want, but only you can make a decision.
—SA
alcitect at 6-Feb-13 14:05pm
   
Yes, Sergey, thanks, but I just want to label each word in the phrase as a noun, a verb, etc.
Sergey Alexandrovich Kryukov at 6-Feb-13 14:08pm
   
This is not "just"; it is too difficult if you are talking about natural language. I recommend that you follow my advice and start learning the topic...
—SA
alcitect at 6-Feb-13 14:13pm
   
Yes, I will follow your advice.
Thanks.
Jameel Moideen at 6-Feb-13 13:19pm
   
You should create a custom algorithm to do this.
alcitect at 6-Feb-13 13:26pm
   
Thanks for the good advice.
Ravi Bhavnani at 6-Feb-13 13:22pm
   
As Sergey has already pointed out, NLP is a non-trivial endeavour. That being said, you may want to take a look at SharpNLP (http://sharpnlp.codeplex.com/) and Jiayun Han's work at http://nlpdotnet.com/.
 
--Ravi
alcitect at 6-Feb-13 13:25pm
   
Thanks for the good advice.

Solution 2

Once you've worked your way through EBNF, LALR(1) parsers, and lexer and parser generators like Flex and Bison, you might want to look at chart parsers for NLP like Alvin; they are, or were, thought by some to have greater potential than other approaches. My info is a little old now, though, so you'll have to trust your own research. You could also look into Boost.Spirit if you want some serious C++ parsing technology. This is mind-bending stuff but also a lot of fun.
Comments
alcitect at 6-Feb-13 14:07pm
   
Thanks, Matthew.
Sergey Alexandrovich Kryukov at 6-Feb-13 14:10pm
   
It's a good idea to look at all this stuff, but as it's all C++, it will make things way more difficult. It's possible to find .NET equivalents. The main thing is the formulation of a reasonably simple language. (I voted 4.)
—SA
Matthew Faithfull at 6-Feb-13 14:13pm
   
Fair enough, you spotted my attempt to recruit this one for the dark side :<{
Sergey Alexandrovich Kryukov at 6-Feb-13 17:15pm
   
:-)

Solution 1

Please see my comments to the question. You should start with the formulation of the language you want to parse. It cannot be a "real" natural language, and should not be. Just the opposite: think about making it only slightly resemble a "real" natural language, and try to make it as simple and strict as possible; otherwise your problem will be hopeless.
 
Some references:
http://en.wikipedia.org/wiki/Backus%E2%80%93Naur_Form,
http://en.wikipedia.org/wiki/EBNF,
http://en.wikipedia.org/wiki/Parser.
 
Some ideas:
https://npmjs.org/package/ebnf-parser,
http://www.nongnu.org/bnf/,
http://social.msdn.microsoft.com/Forums/en-US/csharpgeneral/thread/87a9c718-d549-48f6-a81d-795a349d4ce7.
 
This is a .NET project: http://ebnfnet.sourceforge.net/.
 
Again, the topic is too big to give definitive advice. More importantly, it all depends on how big the scope of your application is and how complex your language is. Maybe you need something simpler, written entirely by yourself.
 
Please consider my advice mostly as directions on how to learn what's involved and to get some basic ideas.
 
—SA
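
To make "formulate the language first" a little more concrete, here is a minimal sketch of a deliberately tiny command language and a hand-written parser that follows it. The grammar, the word lists, and the names `Command` and `CommandParser` are all assumptions made up for illustration; they do not come from any of the linked projects, and a real design would need its own vocabulary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical grammar (EBNF), invented for this sketch:
//   command    = verb , [ determiner ] , noun ;
//   verb       = "open" | "close" | "eject" ;
//   determiner = "the" | "a" ;
//   noun       = letter , { letter | digit | "_" } ;

public sealed class Command
{
    public string Verb;
    public string Noun;
}

public static class CommandParser
{
    private static readonly HashSet<string> Verbs =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "open", "close", "eject" };

    private static readonly HashSet<string> Determiners =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "the", "a" };

    // Returns the parsed command, or null when the text does not match the grammar.
    public static Command Parse(string text)
    {
        string[] tokens = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        int pos = 0;

        // verb (required)
        if (pos >= tokens.Length || !Verbs.Contains(tokens[pos])) return null;
        string verb = tokens[pos++];

        // [ determiner ] (optional, skipped when present)
        if (pos < tokens.Length && Determiners.Contains(tokens[pos])) pos++;

        // noun (required)
        if (pos >= tokens.Length || !IsNoun(tokens[pos])) return null;
        string noun = tokens[pos++];

        // Nothing may follow the noun.
        if (pos != tokens.Length) return null;

        return new Command { Verb = verb, Noun = noun };
    }

    // A noun starts with a letter and continues with letters, digits or '_'.
    private static bool IsNoun(string token)
    {
        if (!char.IsLetter(token[0])) return false;
        foreach (char c in token)
        {
            if (!char.IsLetterOrDigit(c) && c != '_') return false;
        }
        return true;
    }
}
```

With this sketch, `CommandParser.Parse("Open the CD_Drive")` yields a `Command` whose `Verb` is `Open` and whose `Noun` is `CD_Drive`, while `CommandParser.Parse("the CD_Drive")` returns null because the phrase does not start with a known verb. Each rule of the grammar maps to one step in `Parse`, which is the main reason to write the grammar down before writing code.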
v2
Comments
alcitect at 6-Feb-13 14:08pm
   
OK, let's see.
alcitect at 8-Feb-13 9:16am
   
OK, I have now read many things, as you advised, and found that there are two layers of parsing:
1: lexical parsing
2: semantic parsing.
What I am looking for is lexical parsing only.
Sergey Alexandrovich Kryukov at 8-Feb-13 12:29pm
   
Yes, you see, I still say you should use as simple a language as possible. The extreme simplicity of basic English would help you. I would advise that you reduce semantic parsing to something as simple as possible, so that it is trivial, almost non-existent. For example: a fixed position or order of verb, object, etc., a fixed set of verbs or other sentence members, and so on.
—SA
alcitect at 8-Feb-13 12:43pm
   
Thanks for the help, Sergey. Could you please repeat this? My English is not so good.
Sergey Alexandrovich Kryukov at 8-Feb-13 12:55pm
   
English? I understand your English well enough, but please make some minimal effort to make it more readable: 1) capitalize properly, 2) use full punctuation, 3) don't use abbreviations; this is just a matter of politeness and is usually expected by members. Other than that, your English is almost fine; I can see only minor grammar problems, and the spelling is mostly fine (if you use a spell checker, that's a great idea; many Web browsers have one embedded that works as you type...).
 
So, what are you asking about now?
—SA
alcitect at 8-Feb-13 13:41pm
   
Thanks. I mean, can you repeat the last advice in other words? I can't understand all of it.
Your message: (Yes, you see, I still say you should use as simple a language as possible. The extreme simplicity of basic English would help you. I would advise that you reduce semantic parsing to something as simple as possible, so that it is trivial, almost non-existent. For example: a fixed position or order of verb, object, etc., a fixed set of verbs or other sentence members, and so on.)
 
Thanks again.
Sergey Alexandrovich Kryukov at 8-Feb-13 14:02pm
   
My advice was really fuzzy; it all needs design and thinking. Again, the key is to define a simplified language. For example, assume that you always have, say, 2-3 words. The first is always a verb; the second is the subject, which is variable. Find the verb and parse it into an item from the fixed set of all possible verbs. Your "semantics" will then be just the selection of a data item from a predefined set (what that will be: structures with a name and a delegate instance, for example, with the next term, the subject, passed as an argument; I don't know; you have to design all that).
—SA
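
One possible reading of the "name and delegate instance" idea, as a rough sketch only: a fixed table maps each verb to a delegate, and the last word of the phrase is passed to it as the argument. The class name `CommandRunner`, the two verbs, and the returned strings are hypothetical, and the decision to ignore an optional middle word (such as "the") without checking it is an assumption of this sketch, not something stated in the comment above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: CommandRunner and the verbs below are illustrative only.
public static class CommandRunner
{
    // The fixed set of verbs. Each verb name maps to a delegate that
    // receives the target word of the phrase as its argument.
    private static readonly Dictionary<string, Func<string, string>> Verbs =
        new Dictionary<string, Func<string, string>>(StringComparer.OrdinalIgnoreCase)
        {
            { "open",  target => "Opening " + target },
            { "close", target => "Closing " + target }
        };

    // Expects 2 or 3 words: the first is the verb, the last is the target.
    public static string Run(string phrase)
    {
        string[] words = phrase.Split(
            new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        if (words.Length < 2 || words.Length > 3)
        {
            return "Error: expected 2 or 3 words";
        }

        Func<string, string> action;
        if (!Verbs.TryGetValue(words[0], out action))
        {
            return "Error: unknown verb '" + words[0] + "'";
        }

        // An optional middle word is ignored in this sketch.
        string target = words[words.Length - 1];
        return action(target);
    }
}
```

For instance, `CommandRunner.Run("Open the CD_Drive")` returns `Opening CD_Drive`, and `CommandRunner.Run("Jump the CD_Drive")` returns an error string because "Jump" is not in the table. Adding a verb then means adding one table entry, with no change to the parsing code.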
alcitect at 8-Feb-13 14:09pm
   
Sergey, I just want to do the lexical parsing; there is no need for semantic parsing.
Can I make a DB that stores a few words and their types, and then, when I have a word, search the DB to find its type?
Sergey Alexandrovich Kryukov at 8-Feb-13 14:14pm
   
I thought you were asking about semantic parsing. Also, do you understand that you are trying to solve in a quick attack something that needs thorough work, even if the work is not so big? Yes, you can make a DB, so what? I would suggest you have a small data structure in memory instead. It should represent all the grammar expected: all expected words for each sentence member. You can populate this data structure from a database, if you want, but it should be in memory, unless you devise a really complex language with a lot of lexemes. But a complex language will require a huge amount of work, which would make our forum discussion just ridiculous. I'm trying to discuss something relatively simple, not research or computer science work...
—SA
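
As a rough illustration of the in-memory lookup discussed above, the sketch below keeps a small word-to-type table in a dictionary and tags each word of a phrase against it. The class name `WordTagger`, the word list, and the tag strings are assumptions chosen for this example; the table could just as well be loaded from a database at startup.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names throughout: WordTagger, Lexicon and the tag strings
// are illustrative and not taken from any library.
public static class WordTagger
{
    // Small in-memory lexicon; it could be filled from a database at startup.
    // The comparer makes lookups case-insensitive ("Open" matches "open").
    private static readonly Dictionary<string, string> Lexicon =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "open", "Verb" },
            { "close", "Verb" },
            { "the", "Determiner" },
            { "a", "Determiner" },
            { "cd_drive", "Noun" },
            { "door", "Noun" }
        };

    // Splits the phrase on whitespace and pairs each word with its tag.
    // Words missing from the lexicon are tagged "Unknown".
    public static List<KeyValuePair<string, string>> Tag(string phrase)
    {
        var result = new List<KeyValuePair<string, string>>();
        string[] words = phrase.Split(
            new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string word in words)
        {
            string tag;
            if (!Lexicon.TryGetValue(word, out tag))
            {
                tag = "Unknown";
            }
            result.Add(new KeyValuePair<string, string>(word, tag));
        }
        return result;
    }
}
```

Calling `WordTagger.Tag("Open the CD_Drive")` gives the pairs Open/Verb, the/Determiner and CD_Drive/Noun, which matches the output asked for in the question. Note that this only works for a closed vocabulary: a word with more than one possible type cannot be resolved by a plain lookup, which is one reason to keep the language strict.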

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Last Updated 6 Feb 2013
Copyright © CodeProject, 1999-2014
All Rights Reserved.