
Tokenizer and analyzer package supporting precedence prioritized rules

Alexander Berthold, 1 Jan 2002
A library that lets you conveniently build a custom tokenizer and analyzer supporting precedence-prioritized rules
grammarIDE ReadMe
-----------------

This product is far from finished, so please be forgiving if you find bugs
(this applies to the IDE, not to the tokenizer/analyzer library).

To get results quickly, extract the contents of this .ZIP archive to a directory on your
hard disk, for example c:\grammarIDE. Then start the IDE, select File/Open and open
"sample-grammar.txt", located in the directory you unzipped the files to.

The left pane should now show the tree structure of the grammar, and the editor
pane its source code.

Now select Parse/Evaluate Expression, enter for example 1*2+3-4*(5/6)*8-9 and press "Evaluate".
You should now get a graphical display of the parsed expression.

Now press "Rebalance", and the parse tree is instantly reorganized according to the
precedence priorities of the grammar.
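
To illustrate what precedence priorities do to an expression like the one above, here is a
minimal precedence-climbing evaluator. It is only a sketch and not the algorithm the library
or the "Rebalance" command uses. The names priorityOf, applyOperator, ExprParser and evaluate,
as well as the priority values 10 and 20, are assumptions made for this example. It follows
the convention of the grammar files that a smaller number means a higher precedence.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical priority table: a smaller value binds tighter.
// Returns -1 for characters that are not binary operators.
static int priorityOf(char op) {
    switch (op) {
        case '*': case '/': return 10;
        case '+': case '-': return 20;
        default:            return -1;
    }
}

static double applyOperator(char op, double lhs, double rhs) {
    switch (op) {
        case '*': return lhs * rhs;
        case '/': return lhs / rhs;
        case '+': return lhs + rhs;
        case '-': return lhs - rhs;
    }
    assert(false && "applyOperator called with a non-operator");
    return 0.0;
}

struct ExprParser {
    std::string text;
    std::size_t pos;

    explicit ExprParser(const std::string& t) : text(t), pos(0) {}

    void skipSpaces() {
        while (pos < text.size() &&
               std::isspace(static_cast<unsigned char>(text[pos]))) ++pos;
    }

    // primary := number | '(' expression ')'
    double parsePrimary() {
        skipSpaces();
        if (pos < text.size() && text[pos] == '(') {
            ++pos;
            double inner = parseExpression(32767);  // any operator allowed inside
            skipSpaces();
            if (pos >= text.size() || text[pos] != ')')
                throw std::runtime_error("expected ')'");
            ++pos;
            return inner;
        }
        if (pos >= text.size() ||
            !std::isdigit(static_cast<unsigned char>(text[pos])))
            throw std::runtime_error("expected number");
        const char* start = text.c_str() + pos;
        char* end = 0;
        double value = std::strtod(start, &end);
        pos += static_cast<std::size_t>(end - start);
        return value;
    }

    // Consumes operators whose priority value is <= maxPriority.
    double parseExpression(int maxPriority) {
        double lhs = parsePrimary();
        for (;;) {
            skipSpaces();
            if (pos >= text.size()) return lhs;
            char op = text[pos];
            int prio = priorityOf(op);
            if (prio < 0 || prio > maxPriority) return lhs;
            ++pos;
            // Left-associative: the right operand may only contain
            // operators that bind strictly tighter than 'op'.
            double rhs = parseExpression(prio - 1);
            lhs = applyOperator(op, lhs, rhs);
        }
    }
};

double evaluate(const std::string& expression) {
    ExprParser parser(expression);
    double value = parser.parseExpression(32767);
    parser.skipSpaces();
    if (parser.pos != parser.text.size())
        throw std::runtime_error("unexpected trailing input");
    return value;
}
```

Calling evaluate("1*2+3-4*(5/6)*8-9") groups the multiplications and the division before the
additions and subtractions, which is the grouping a rebalanced parse tree expresses.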


C++ grammar
---------------
As an example of a fairly complex grammar, you can alternatively open the file "cpp-grammar.txt".
This is a pre-beta version of the grammar of a C++ compiler, slightly modified to run without the context
of a compiler (for example, variable names are treated simply as literals).
To test it, select Parse/Evaluate again, choose the rule ".globalscopeblock" in the "Select rule"
combo box and enter a C++ program without templates or preprocessor statements, for example:

struct x { int **(const *a[3])[4]; };
class t : public b {
private:
 x s;
};
void test()
{
  int *b;
  (*x->a[1])[0]=&b;
}


---
Small grammar reference:
[tokens]	- defines tokens
[seperators]	- defines separators
[rules]		- enables predefined tokenizer rules ("numbers" for example)
[grammar]	- defines the analyzer grammar

----
A typical line in the sections [tokens], [seperators] or [rules] looks like this:
xxx:zzz

Here 'xxx' is the ID assigned to the item, and 'zzz' describes the item itself.
For both [tokens] and [seperators], 'zzz' is the plain text of the token in question;
it can include escape characters from a subset of the C escape notation.
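
The 'xxx:zzz' format can be read with a few lines of C++. The following is a minimal sketch
and not the parser the IDE uses. It assumes that the ID ends at the first ':' and that the
supported escapes are \n, \t, \r and \\; the README does not say which subset of the C escapes
is actually accepted. The names Definition, decodeEscapes and parseDefinition are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct Definition {
    std::string id;    // the 'xxx' part
    std::string text;  // the 'zzz' part with escapes decoded
};

// Decodes an assumed subset of C escapes. For any other escape the
// backslash is dropped and the following character is kept as-is.
std::string decodeEscapes(const std::string& raw) {
    std::string out;
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] != '\\' || i + 1 == raw.size()) {
            out += raw[i];
            continue;
        }
        char c = raw[++i];
        switch (c) {
            case 'n':  out += '\n'; break;
            case 't':  out += '\t'; break;
            case 'r':  out += '\r'; break;
            case '\\': out += '\\'; break;
            default:   out += c;    break;
        }
    }
    return out;
}

// Splits "xxx:zzz" at the first ':' (assumption). Returns false if the
// line has no ':' or if either part is empty.
bool parseDefinition(const std::string& line, Definition& result) {
    std::size_t colon = line.find(':');
    if (colon == std::string::npos || colon == 0 || colon + 1 == line.size())
        return false;
    result.id = line.substr(0, colon);
    result.text = decodeEscapes(line.substr(colon + 1));
    assert(!result.id.empty());
    return true;
}
```

With this sketch, a hypothetical line such as plus:+ yields the ID "plus" and the text "+".
Because the split happens at the first ':', a line like colon:: still yields the text ":".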

In the section [rules] things are a bit different:
if 'zzz' is 'numbers', the number recognition rule is included in the tokenizer.
For more information on this topic, see http://www.subground.cc/devel.
CAUTION: The token produced by the rule 'numbers' is named 'number' - see below.

----
A typical line in the section [grammar] looks like this:

xxx:{.rulename}=yyy:{item}[{item},...]

Here 'xxx' is again the ID, and 'rulename' - enclosed in '{}' and prefixed with a '.' - is
the name of the rule being declared. 'yyy' is the precedence priority, which must be in the range
0 (maximum precedence) to 32767 (minimum precedence).
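
The fixed part of such a line, up to the item list, can be checked mechanically. The sketch
below is an illustration only and not the library's parser. The names RuleHeader and
parseRuleHeader are hypothetical, and the item list is returned unparsed because the exact
item separator syntax is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct RuleHeader {
    std::string id;        // 'xxx'
    std::string ruleName;  // 'rulename' without the braces and the '.'
    int priority;          // 'yyy': 0 = highest, 32767 = lowest precedence
    std::string items;     // everything after the second ':', unparsed
};

// Parses "xxx:{.rulename}=yyy:items". Returns false on any syntax error
// or if the priority is outside the range 0..32767.
bool parseRuleHeader(const std::string& line, RuleHeader& out) {
    std::size_t colon = line.find(':');
    if (colon == std::string::npos || colon == 0) return false;
    if (line.compare(colon + 1, 2, "{.") != 0) return false;

    std::size_t nameStart = colon + 3;
    std::size_t close = line.find('}', nameStart);
    if (close == std::string::npos || close == nameStart) return false;
    if (close + 1 >= line.size() || line[close + 1] != '=') return false;

    std::size_t prioStart = close + 2;
    std::size_t prioEnd = line.find(':', prioStart);
    if (prioEnd == std::string::npos || prioEnd == prioStart) return false;

    long value = 0;
    for (std::size_t i = prioStart; i < prioEnd; ++i) {
        if (line[i] < '0' || line[i] > '9') return false;
        value = value * 10 + (line[i] - '0');
        if (value > 32767) return false;  // also prevents overflow
    }
    assert(value >= 0 && value <= 32767);

    out.id = line.substr(0, colon);
    out.ruleName = line.substr(nameStart, close - nameStart);
    out.priority = static_cast<int>(value);
    out.items = line.substr(prioEnd + 1);
    return true;
}
```

For a made-up line such as mul:{.expr}=10:{.expr}{$mul}{.expr} this yields the ID "mul", the
rule name "expr", the priority 10 and the remaining item text.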

There are several kinds of 'item':
{$hello} -> refers to the token 'hello'. CAUTION: it must be defined in the [tokens] or [seperators] section!
{!number} -> refers to the token produced by the rule 'numbers'. CAUTION: the rule is named 'numbers', but the item is written 'number'!
{.rule} -> refers to the grammar rule 'rule'
{#literal#} -> refers to an undefined literal
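
The four item forms differ only in the marker character after the opening brace, so they can
be told apart with a small function. This is a sketch for illustration; ItemKind, Item and
classifyItem are hypothetical names and do not come from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class ItemKind { Token, TokenizerRule, GrammarRule, Literal, Invalid };

struct Item {
    ItemKind kind;
    std::string name;  // text between the markers, e.g. "hello"
};

// Classifies a single item such as "{$hello}" or "{#literal#}".
Item classifyItem(const std::string& text) {
    const Item invalid{ItemKind::Invalid, ""};
    // Shortest valid item is '{', marker, one character, '}'.
    if (text.size() < 4 || text.front() != '{' || text.back() != '}')
        return invalid;

    const char marker = text[1];
    std::string body = text.substr(2, text.size() - 3);
    assert(!body.empty());

    switch (marker) {
        case '$': return Item{ItemKind::Token, body};
        case '!': return Item{ItemKind::TokenizerRule, body};
        case '.': return Item{ItemKind::GrammarRule, body};
        case '#':
            // Literals are closed by a second '#': {#literal#}
            if (body.size() < 2 || body.back() != '#') return invalid;
            body.erase(body.size() - 1);
            return Item{ItemKind::Literal, body};
        default:
            return invalid;
    }
}
```

For example, classifyItem("{!number}") reports a tokenizer rule item named "number", and
classifyItem("{#literal#}") reports a literal with the text "literal".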


License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Alexander Berthold
Web Developer
Germany

Last Updated 2 Jan 2002
Article Copyright 2001 by Alexander Berthold