Click here to Skip to main content
15,612,987 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
Fast version :
All is in the title.

Long version :

I am building an interpreted language in C++ and wanted to know what you think of a parser running using hashed strings.

* Look-up in a table of hashed values (32 or 64bits) is way faster than in a string table ;
* Sweet switch statement with compile time string hashing ;
* The amount of characters in a programming language word (keyword, variable, function name) is quite low (depending on how you code) so collision should be rare (but this is only a guess).

* A lot of hashing computation at lexer time ;
* Dealing with collision is not easy in multithreaded execution of the lexer.

These arguments gave me three choices if wish to implement a hashed parser:
1) Ignore collision and get crazy performances ;
2) Deal with collision with a high impact of performances ;
3) Detect collision (maybe only in debug build) and ask programmer to change the collided name.

Thank you for your time.

PS: If you have experience in parsing and want to share about that, I would be pleased to discuss with you !

What I have tried:

I already implemented a hashing parser but without collision check (and no multithreading).
I never encountered a hash collision using it but I didn't tested a lot of input text data.
Updated 17-Jan-21 1:44am
Stefan_Lang 17-Jan-21 12:13pm    
I'm not an expert on parsing, but I'd use a std::map<string_token,function_object> rather than implementing my own hash function and managing the results with a switch statement.

Or I just don't understand this topic enough and should leave this Q&A to the experts ;-)

Take a look at How to Make an LL(1) Parser: Lesson 1[^]. The author is probably the best person to answer your question.
Share this answer
Matthieu Mv 17-Jan-21 8:53am    
Thank you that's a good advice !
You might be interested in: Fastest Hash Function for Table Lookups in C?![^]
Share this answer
Matthieu Mv 17-Jan-21 8:53am    
Thank you that's very interesting !

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900