Yes I do, and the performance is god-awful. This is for a compound regular expression rather than a web browser, so this is more than a little excessive. Normally the machine will spawn 2 or 3 fibers while it's doing normal character scans, but when it has to split, the count grows quickly.
The reason it spawns more than one is disjunctions in the regex, like foo|bar - it spawns a fiber to scan each one. In truth it spawns slightly more than 1 fiber on average because save points spawn a fiber. Plus each fiber only lives for the duration of one character.
They're already allocated, since they're simple structs sitting inside an array. The only things that get set are two simple 32-bit fields on the struct =) Since they're allocated this way, at least unless .NET sucks in this arena (I haven't checked the IL), they don't need to be recycled - they're permanent instances.
Furthermore, the fibers are always fully used - they are never idle, so a threadpool won't benefit me.
Ah yes, the rarely appropriate Thread Per Request pattern. Almost always better is a work queue served by a single thread, or a pool if blocking is an issue. Threads eat up memory, add context switching overhead, and introduce critical regions. I recently discovered how often Windows schedules a new thread, and I'm still flabbergasted.
Fibers, being lighter weight, shouldn't be as bad, but evidently it's still plenty bad.
honey the codewitch wrote:
each fiber only lives for the duration of one character
The fibers are already allocated, since they're simple structs sitting inside an array. The only things that get set are two simple 32-bit fields on the struct =) Since they're allocated this way, at least unless .NET sucks in this arena (I haven't checked the IL), they don't need to be recycled - they're permanent instances.
So a threadpool doesn't buy me anything. These aren't traditional threads.
No, the issue is that most fibers resolve to an examination of a single character in the input, so if you have 10 of them, the same character gets examined as many as 10 times.
This is a byproduct of the design of a Pike VM, itself an artifact of the way NFA expressions work, so there's very little to be done about it except convert to a DFA (the optimization process).
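To make the repeated examinations concrete, here is a minimal sketch of a Pike-style VM, not my actual engine. The instruction names (`char`, `jmp`, `match`), the tuple layout, and the `run` helper are assumptions made up for the illustration. It runs foo|bar and counts how many times a fiber looks at a character.

```python
# Hypothetical program for foo|bar. A jmp with several operands forks fibers.
PROG = [
    ("jmp", 1, 5),   # 0: try both alternatives
    ("char", "f"),   # 1
    ("char", "o"),   # 2
    ("char", "o"),   # 3
    ("jmp", 8),      # 4: single-operand jmp, no extra fiber
    ("char", "b"),   # 5
    ("char", "a"),   # 6
    ("char", "r"),   # 7
    ("match",),      # 8
]


def run(prog, text):
    """Anchored full match. Returns (matched, examinations, max_fibers)."""

    def add(fibers, pc, seen):
        # Follow jmps eagerly; every operand becomes its own fiber.
        if pc in seen:
            return
        seen.add(pc)
        op = prog[pc]
        if op[0] == "jmp":
            for target in op[1:]:
                add(fibers, target, seen)
        else:
            fibers.append(pc)

    current = []
    add(current, 0, set())
    examinations = 0
    max_fibers = len(current)
    for ch in text:
        nxt, seen = [], set()
        for pc in current:
            op = prog[pc]
            if op[0] == "char":
                examinations += 1          # this fiber looks at ch
                if op[1] == ch:
                    add(nxt, pc + 1, seen)
            # a fiber sitting on "match" with input left over simply dies
        current = nxt
        max_fibers = max(max_fibers, len(current))
        if not current:
            break
    matched = any(prog[pc][0] == "match" for pc in current)
    return matched, examinations, max_fibers


matched, examinations, max_fibers = run(PROG, "bar")
print(matched, examinations, max_fibers)
```

On "bar" the first character is examined by both fibers, so three characters cost four examinations. More alternatives alive at once means more passes over the same character.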
Reduce the fibers and it speeds right up:
NFA ran with 10 max fibers and 3.5 average char passes
NFA+DFA (optimized) ran with 6 max fibers and 2.5 average char passes
DFA ran with 2.5 max fibers and 1 average char passes
NFA: Lexed in 1.575287 msec
NFA+DFA (optimized): Lexed in 1.054843 msec
DFA: Lexed in 0.901254 msec
NFA: Lexed in 1.529819 msec
NFA+DFA (optimized): Lexed in 1.100836 msec
DFA: Lexed in 0.830835 msec
NFA: Lexed in 1.523334 msec
NFA+DFA (optimized): Lexed in 1.049213 msec
DFA: Lexed in 0.851737 msec
NFA: Lexed in 1.400265 msec
NFA+DFA (optimized): Lexed in 1.03485 msec
DFA: Lexed in 0.829009 msec
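For anyone who wants the four runs above boiled down, this small sketch averages the posted timings and compares each mode against the plain NFA. The numbers are copied from the output above; the variable names are just for the illustration.

```python
from statistics import mean

# Timings in msec, copied from the four runs posted above.
runs_ms = {
    "NFA": [1.575287, 1.529819, 1.523334, 1.400265],
    "NFA+DFA (optimized)": [1.054843, 1.100836, 1.049213, 1.03485],
    "DFA": [0.901254, 0.830835, 0.851737, 0.829009],
}

averages = {name: mean(times) for name, times in runs_ms.items()}

for name, avg in averages.items():
    speedup = averages["NFA"] / avg
    print(f"{name}: {avg:.3f} msec average, {speedup:.2f}x vs NFA")
```

That works out to roughly 1.51, 1.06 and 0.85 msec on average, so the pure DFA comes in at about 1.77 times the speed of the NFA on this input.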
Well, it's not on purpose per se. I mean yes, I'm spawning a lot of them, but the idea is to keep as few active or "alive" at one time as possible.
When the VM sees a jmp with 3 operands, it spawns 2 fibers in addition to the primary fiber.
That's what I don't want, since every fiber has to examine the character under the cursor, which leads to many examinations of the same character. There's no way to optimize this out because it's rather the point of the fiber running in the first place. Multiple examinations are a byproduct of the NFA algorithm.
My goal is simply to reduce or eliminate the number of jmps, and especially the number of operands they have.
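One rough way to measure progress toward that goal is to count the jmps in a compiled program and how wide they are. This is only a sketch: the tuple-based instruction format, the `NAIVE` program, and the `jmp_stats` helper are hypothetical and not taken from my engine.

```python
# Hypothetical unoptimized program for foo|bar|baz.
NAIVE = [
    ("jmp", 1, 5, 9),                              # 0: three-way split
    ("char", "f"), ("char", "o"), ("char", "o"),   # 1-3
    ("jmp", 13),                                   # 4
    ("char", "b"), ("char", "a"), ("char", "r"),   # 5-7
    ("jmp", 13),                                   # 8
    ("char", "b"), ("char", "a"), ("char", "z"),   # 9-11
    ("jmp", 13),                                   # 12
    ("match",),                                    # 13
]


def jmp_stats(prog):
    """Return (jmp_count, widest_jmp, extra_fibers).

    Assumes every jmp has at least one operand. A jmp with n operands
    keeps the primary fiber and spawns n - 1 additional ones.
    """
    jmps = [op for op in prog if op[0] == "jmp"]
    widest = max((len(op) - 1 for op in jmps), default=0)
    extra = sum(len(op) - 2 for op in jmps)
    return len(jmps), widest, extra


print(jmp_stats(NAIVE))
```

Here the single 3-operand jmp accounts for both extra fibers; the single-operand jmps add none. A program with no multi-operand jmps at all would report zero extra fibers.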
A pure DFA can run by examining each character only once.
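As a minimal sketch of why that holds, here is a hand-built transition table for foo|bar. The state numbers, the `DFA` table, and the `run_dfa` helper are assumptions for the example, not output from my optimizer. There is exactly one current state, so each character is looked at once.

```python
# Hand-built DFA for foo|bar: state -> {character -> next state}.
DFA = {
    0: {"f": 1, "b": 4},
    1: {"o": 2},
    2: {"o": 3},
    4: {"a": 5},
    5: {"r": 3},
}
ACCEPTING = {3}


def run_dfa(text):
    """Anchored full match. Returns (matched, examinations)."""
    state = 0
    examinations = 0
    for ch in text:
        examinations += 1                     # one look per character
        state = DFA.get(state, {}).get(ch)    # no fibers, no forking
        if state is None:
            return False, examinations        # dead end, stop early
    return state in ACCEPTING, examinations


print(run_dfa("bar"))
print(run_dfa("baz"))
```

The disjunction is folded into state 0's table, so there is nothing to fork: the examination count never exceeds the length of the input.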