Click here to Skip to main content
15,890,882 members

Welcome to the Lounge

   

For discussing anything related to a software developer's life but is not for programming questions. Got a programming question?

The Lounge is rated Safe For Work. If you're about to post something inappropriate for a shared office environment, then don't post it. No ads, no abuse, and no programming questions. Trolling, (political, climate, religious or whatever) will result in your account being removed.

 
GeneralRe: There's an app for that! Pin
Jörgen Andersson21-Jan-20 23:29
professionalJörgen Andersson21-Jan-20 23:29 
GeneralRe: I got this new breakfast cereal called "Lorem ipsum" Pin
Rick York22-Jan-20 4:55
mveRick York22-Jan-20 4:55 
GeneralCCC 22/1/2019 Pin
super21-Jan-20 21:53
professionalsuper21-Jan-20 21:53 
GeneralRe: CCC 22/1/2019 Pin
OriginalGriff21-Jan-20 21:54
mveOriginalGriff21-Jan-20 21:54 
GeneralRe: CCC 22/1/2019 Pin
Duncan Edwards Jones21-Jan-20 22:03
professionalDuncan Edwards Jones21-Jan-20 22:03 
GeneralRe: CCC 22/1/2019 Pin
OriginalGriff21-Jan-20 22:17
mveOriginalGriff21-Jan-20 22:17 
GeneralRe: CCC 22/1/2019- we have a winner. Pin
super22-Jan-20 1:35
professionalsuper22-Jan-20 1:35 
GeneralA tale of two lexers Pin
honey the codewitch21-Jan-20 21:50
mvahoney the codewitch21-Jan-20 21:50 
(Reference source provided below)

So optimizing my lexer machine is slow going, but i'm feeling clever.

Introducing a new switch/case instruction worked, cutting the lexing time for the DFA version of my simple test lexer to as little as half what the NFA version costs

NFA lex: fubar bar 123 1foo bar -243 0
Lexed in 0.0217 msec
DFA lex: fubar bar 123 1foo bar -243 0
Lexed in 0.0134 msec

The trouble i'm having is
A) loading lazy quantified regexs into the NFA graph. I don't know how to represent them
B) Doing a partial, opportunistic transformation from NFA to DFA. I am running into trouble due to my use of a rangeset as my basic input alphabet element among other things.

Once I can do that though, I'll have significantly optimized this little machine. And I'll have what will be pretty close to a "real" optimizing compiler for it.

NFA version:
jmp id, int, space, error

id: ; [A-Z_a-z][0-9A-Z_a-z]*
set "A".."Z", "_", "a".."z"
id_loop: jmp id_part, id_done
id_part: set "0".."9", "A".."Z", "_", "a".."z"
jmp id_loop
id_done: save 1
match 0

int: ; (0|\-?[1-9][0-9]*)
jmp int_zero, int_nonzero
int_zero:
char "0"
jmp int_done
int_nonzero: jmp int_neg, int_pos
int_neg: char "-"
int_pos: set "1".."9"
int_loop: 
jmp int_part, int_done
int_part: set "0".."9"
jmp int_loop
int_done: save 1
match 1

space: ; (\t|\n|\v|\f|\r| )
set "\t", "\n", "\v", "\f", "\r", " "
save 1
match 2

error: ; anything not caught above returns -1
any
save 1
match -1


DFA version:
save 0
switch case "A".."Z","_","a".."z":id_part, case "0":int_done,case "-":int_neg,case "1".."9":int_digits,case "\t", "\n", "\v", "\f", "\r", " ":space,default:error

id_part:
switch case "0".."9","A".."Z","_","a".."z":id_part, default:id_done
id_done:
save 1
match 0

int_neg:
set "1".."9"
int_digits:
switch case "0".."9":int_digits, default:int_done
int_done:
save 1
match 1

space:
save 1
match 2

error:
any
save 1
match -1

Real programmers use butterflies

GeneralRe: A tale of two lexers Pin
Sander Rossel21-Jan-20 22:26
professionalSander Rossel21-Jan-20 22:26 
GeneralRe: A tale of two lexers Pin
OriginalGriff21-Jan-20 22:34
mveOriginalGriff21-Jan-20 22:34 
GeneralRe: A tale of two lexers Pin
honey the codewitch21-Jan-20 22:49
mvahoney the codewitch21-Jan-20 22:49 
GeneralRe: A tale of two lexers Pin
Sander Rossel21-Jan-20 22:51
professionalSander Rossel21-Jan-20 22:51 
GeneralRe: A tale of two lexers Pin
honey the codewitch21-Jan-20 22:46
mvahoney the codewitch21-Jan-20 22:46 
GeneralRe: A tale of two lexers Pin
Sander Rossel21-Jan-20 22:52
professionalSander Rossel21-Jan-20 22:52 
GeneralRe: A tale of two lexers Pin
honey the codewitch21-Jan-20 22:54
mvahoney the codewitch21-Jan-20 22:54 
GeneralRe: A tale of two lexers Pin
Richard Deeming21-Jan-20 23:57
mveRichard Deeming21-Jan-20 23:57 
GeneralRe: A tale of two lexers Pin
Rick York22-Jan-20 4:57
mveRick York22-Jan-20 4:57 
GeneralRe: A tale of two lexers Pin
Mark_Wallace21-Jan-20 23:23
Mark_Wallace21-Jan-20 23:23 
GeneralRe: A tale of two lexers Pin
honey the codewitch21-Jan-20 23:27
mvahoney the codewitch21-Jan-20 23:27 
GeneralFinally! A truly useful update! Pin
Mark_Wallace21-Jan-20 20:42
Mark_Wallace21-Jan-20 20:42 
GeneralRe: Finally! A truly useful update! Pin
Sander Rossel21-Jan-20 21:45
professionalSander Rossel21-Jan-20 21:45 
GeneralRe: Finally! A truly useful update! Pin
Mark_Wallace21-Jan-20 23:17
Mark_Wallace21-Jan-20 23:17 
GeneralRe: Finally! A truly useful update! Pin
Kornfeld Eliyahu Peter21-Jan-20 21:50
professionalKornfeld Eliyahu Peter21-Jan-20 21:50 
GeneralWondering why all those funny names for new programming languages? Pin
Kornfeld Eliyahu Peter21-Jan-20 20:31
professionalKornfeld Eliyahu Peter21-Jan-20 20:31 
GeneralRe: Wondering why all those funny names for new programming languages? Pin
Sander Rossel21-Jan-20 21:48
professionalSander Rossel21-Jan-20 21:48 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.