Comments and Discussions
Brace yourself for the hit of 16 railguns in a second.
Recently I wrote a superfast fuzzy (GALADRIEL) console text searcher. Putting exact matching under one roof with wildcards and Levenshtein Distance (Wagner-Fischer) was a spontaneous idea, which in turn resulted in the fastest 'on the fly' text searcher known to me.
Since exact matching was never meant to stand among the fuzzy guns, it was invoked just like the other two hitters (wildcards & fuzzy), that is, for EACH LINE. That is very slow and not recommended when high speed matters; I did it that way purely out of curiosity, to see how it would behave.
But my laptop's 2 threads can do better, so revision 1-++ showcases the full might of my top-gun text hitter, Railgun_Quadruplet_7Gulliver: the fastest text search function known to me, and multi-threaded on top of that.
I wrote it as a mix of Boyer-Moore-Horspool-Sunday order 2 and some other nifty micro-etudes.
The result is, as far as I know, the first text searcher on the Internet that utilizes the I/O read bandwidth to its fullest!
To be more precise, the theoretical upper speed limit is 16 threads * 3 GB/s = 48 GB/s, where 3 GB/s is the nominal speed of 'Railgun_Quadruplet_7Gulliver' on my laptop. Of course, in reality, my estimate is 8-20 GB/s.
Because of this monstrous bandwidth Kazahana proves to be a typhoon-class tool.
For those who don't believe it, I suggest trying it on their superfast I/O systems. Nowadays mainstream drives offer a "miserable" 512MB/s; it would be very interesting if those with 1++GB/s shared their results.
For example, my humble laptop is equipped with a Samsung 470 64GB, which gives an average linear read of 241MB/s at 1MB blocks, as measured with 'Everest'.
From the Wikipedia torture below you can see how full my fullest is: (241-232)/232*100% = 3.9% deviation.
To run the test below you need to download the enwiki-20121201-pages-articles.xml file (42,153,646,707 bytes) from http://dumps.wikimedia.org/enwiki/.
Yes, that's right, you can extract the lines of the latest Wikipedia that contain the strings you need.
E:\_Kaze_Kazahana>timer "Kazahana_r1-++_HEXADECAD-Threads_IntelV12.exe" "metal fatigue" ..\enwiki-20121201-pages-articles.xml
Timer 9.01 : Igor Pavlov : Public domain : 2009-05-31
Kazahana, a superfast exact & wildcards & Levenshtein Distance (Wagner-Fischer) searcher, revision 1-++, copyleft Kaze 2013-Feb-09.
omp_get_num_procs( ) = 2
omp_get_max_threads( ) = 2
Enforcing HEXADECAD i.e. hexadecuple-threads ...
Allocating Master-Buffer 7MB ... OK
-; 00,000,244,543 bytes/clock
Kazahana: Dumped xgrams: 329
Kazahana: Performance: 238 KB/clock
Kazahana: Done.
Kernel Time = 58.391 = 33%
User Time = 148.902 = 86%
Process Time = 207.294 = 119%
Global Time = 172.780 = 100%
E:\_Kaze_Kazahana>timer grep\grep -c "metal fatigue" ..\enwiki-20121201-pages-articles.xml
Timer 9.01 : Igor Pavlov : Public domain : 2009-05-31
329
Kernel Time = 24.148 = 13%
User Time = 78.718 = 44%
Process Time = 102.867 = 58%
Global Time = 175.565 = 100%
E:\_Kaze_Kazahana>timer "Kazahana_r1-++_HEXADECAD-Threads_IntelV12.exe" "metal fatigue" ..\enwiki-20121201-pages-articles.xml
Timer 9.01 : Igor Pavlov : Public domain : 2009-05-31
Kazahana, a superfast exact & wildcards & Levenshtein Distance (Wagner-Fischer) searcher, revision 1-++, copyleft Kaze 2013-Feb-09.
omp_get_num_procs( ) = 2
omp_get_max_threads( ) = 2
Enforcing HEXADECAD i.e. hexadecuple-threads ...
Allocating Master-Buffer 7MB ... OK
-; 00,000,244,102 bytes/clock
Kazahana: Dumped xgrams: 329
Kazahana: Performance: 238 KB/clock
Kazahana: Done.
Kernel Time = 59.108 = 34%
User Time = 143.224 = 82%
Process Time = 202.333 = 116%
Global Time = 173.508 = 100%
E:\_Kaze_Kazahana>"Kazahana_r1-++_HEXADECAD-Threads_IntelV12.exe"
Kazahana, a superfast exact & wildcards & Levenshtein Distance (Wagner-Fischer) searcher, revision 1-++, copyleft Kaze 2013-Feb-09.
Usage: Kazahana [AtMostLevenshteinDistance] string textualfile
Note1: There are three regimes: exact, wildcards and fuzzy searches. First two kick in when 2 parameters are given, fuzzy when 3.
Note2: What decides whether exact or wildcards? Of course presence of at least one wildcard. To see exact search see Example #4.
Note3: Exact search hits with 'Railgun_Quadruplet_7Gulliver'.
Note4: Incoming string is automatically lowercased for wildcards searches i.e. they are case insensitive.
Note5: Incoming string could be up to 21168/126 chars for exact&wildcards/Levenshtein respectively.
Note6: Incoming textualfile could be bigger than 4GB.
Note7: Each line should end with [CR]LF, that is Windows or/and UNIX style.
Note8: The dump goes to Kazahana.txt file.
Note9: Seven wildcards are available:
wildcard '*' any character(s) or empty,
wildcard '@'/'#' any character {or empty}/{and not empty},
wildcard '^'/'$' any ALPHA character {or empty}/{and not empty},
wildcard '|'/'~' any NON-ALPHA character {or empty}/{and not empty}.
Example1: E:\>Kazahana 0 ramjet MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd
Example2: E:\>Kazahana 3 psychedlicize MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd
Example3: E:\>Kazahana "psyched^^^^^^ize^" MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd
Example4: E:\>Kazahana "metal fatigue" enwiki-20121201-pages-articles.xml
Example5: E:\>Kazahana "out^^^^^^^^^^^^^ize*" MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd
E:\>type Kazahana.txt
[out^^^^^^^^^^^^^ize*] outhyperbolize /MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd/
[out^^^^^^^^^^^^^ize*] outsize /MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd/
[out^^^^^^^^^^^^^ize*] outsized /MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd/
[out^^^^^^^^^^^^^ize*] outstrategize /MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd/
[out^^^^^^^^^^^^^ize*] outtyrannize /MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd/
E:\_Kaze_Kazahana>
And to behold the real hit of only 2 railguns:
E:\_Kaze_Kazahana>timer Kazahana_r1-+_MONAD-Thread_IntelV12 ramjet 4andabove_Gamera.tar.2.sorted
Timer 9.01 : Igor Pavlov : Public domain : 2009-05-31
Kazahana, a superfast exact & wildcards & Levenshtein Distance (Wagner-Fischer) searcher, revision 1-++, copyleft Kaze 2013-Feb-09.
Enforcing MONAD i.e. single-thread ...
Allocating Master-Buffer 7MB ... OK
|; 00,000,639,411 bytes/clock
Kazahana: Dumped xgrams: 49
Kazahana: Performance: 625 KB/clock
Kazahana: Done.
Kernel Time = 0.795 = 33%
User Time = 1.513 = 63%
Process Time = 2.308 = 96%
Global Time = 2.381 = 100%
E:\_Kaze_Kazahana>timer Kazahana_r1-+_HEXADECAD-Threads_IntelV12 ramjet 4andabove_Gamera.tar.2.sorted
Timer 9.01 : Igor Pavlov : Public domain : 2009-05-31
Kazahana, a superfast exact & wildcards & Levenshtein Distance (Wagner-Fischer) searcher, revision 1-++, copyleft Kaze 2013-Feb-09.
omp_get_num_procs( ) = 2
omp_get_max_threads( ) = 2
Enforcing HEXADECAD i.e. hexadecuple-threads ...
Allocating Master-Buffer 7MB ... OK
|; 00,000,729,181 bytes/clock
Kazahana: Dumped xgrams: 49
Kazahana: Performance: 703 KB/clock
Kazahana: Done.
Kernel Time = 0.904 = 51%
User Time = 1.778 = 100%
Process Time = 2.683 = 151%
Global Time = 1.771 = 100%
E:\_Kaze_Kazahana>timer grep\grep ramjet 4andabove_Gamera.tar.2.sorted
Timer 9.01 : Igor Pavlov : Public domain : 2009-05-31
0,000,083 bussard_ramjet
0,000,051 the_ramjet
0,000,048 the_ramjets
0,000,046 a_ramjet
0,000,031 a_scramjet
0,000,027 the_scramjet
0,000,026 bussard_ramjets
0,000,018 interstellar_ramjet
0,000,014 ramjet_engine
0,000,012 scramjet_powered
0,000,012 ramjet_is
0,000,011 scramjet_engines
0,000,011 scramjet_engine
0,000,011 ramjet_engines
0,000,010 ramjets_were
0,000,010 combustion_ramjet
0,000,009 ramjet_and
0,000,008 ramjet_controls
0,000,007 combustion_ramjets
0,000,006 water_ramjet
0,000,006 scramjet_technology
0,000,006 ramjets_on
0,000,006 ramjet_will
0,000,006 ramjet_speeds
0,000,006 ramjet_ship
0,000,006 ramjet_rocket
0,000,006 ramjet_in
0,000,006 mode_scramjet
0,000,005 scramjets_can
0,000,005 ramjet_to
0,000,005 ramjet_scramjet
0,000,005 ramjet_operation
0,000,005 of_scramjets
0,000,005 of_scramjet
0,000,005 of_ramjets
0,000,005 by_ramjets
0,000,005 and_ramjets
0,000,005 and_ramjet
0,000,004 scramjet_to
0,000,004 scramjet_s
0,000,004 scramjet_is
0,000,004 scramjet_intake
0,000,004 ramjet_was
0,000,004 ramjet_a
0,000,004 raking_ramjets
0,000,004 or_scramjet
0,000,004 expander_ramjets
0,000,004 ejector_ramjet
0,000,004 a_turboramjet
Kernel Time = 0.546 = 10%
User Time = 4.258 = 82%
Process Time = 4.804 = 93%
Global Time = 5.138 = 100%
E:\_Kaze_Kazahana>
I would be glad to receive some feedback on how Kazahana performs on 1++GB/s I/O systems.
Forgot to give the link: http://www.sanmayce.com/Downloads/Kazahana_r1-++.zip
Get down get down get down get it on show love and give it up
What are you waiting on?
modified 11 Feb '13 - 13:40.
|
|
|
|

|
Just want to share my latest hash package; I was lucky enough to write the fastest hash function:
FNV1A_Yorikke
http://www.sanmayce.com/Fastest_Hash/index.html#FNV1A_Yorikke
FNV1A_Yorikke monstrously outspeeds CRC-32 (see the lines below) while maintaining a similar number of collisions:
In the OSHO.TXT 'Building-Blocks' test by (85868050-70657880)/70657880*100% = 21%
In the '5,000,000 Knight Tours' test by (9774609-5986428)/5986428*100% = 63%
In the '100MB as one line' test by (764957-176506)/176506*100% = 333%
Of course the T7500 limits her; on newer CPUs like i5/i7 FNV1A_Yorikke simply 'dances on the water'.
Enjoy!
Get down get down get down get it on show love and give it up
What are you waiting on?
|
|
|
|

|
At last I made an easy-to-run test package (an NSIS installation):
http://www.sanmayce.com/Downloads/index.html#Jesters
I started a thread on a cool (COLD, yes) overclocking maniacs' forum at:
http://www.overclockaholics.com/forums/showthread.php?t=5132
A short overview of the 'Monstrous Jesters' benchmark package:
This is my latest 32bit/64bit CPU/RAM benchmark package (strstr-showdown included), distributed as an NSIS installation.
File: Monstrous_Jesters.exe
Size: 153 MB (161,009,933 bytes)
Size unpacked: 500 MB
Size needed: 1200 MB
After installation, 5 shortcuts (tests) are placed on Desktop/Programs.
All tests are written in C (sources included) and compiled with the latest Intel 12.1 and Microsoft 16 optimizers.
The MEMMEM (strstr-showdown) test takes some 21 minutes to complete on a Core2Duo_E7500_2.93GHz.
Of course, to obtain decent results, stop all concurrent processes before running the test.
Also enable 100% computing power.
There are also some additional tests (Intel 12.1 and Microsoft 16 executables included):
- lzpre, an LZ77 32bit/64bit [de]compressor, written by Matt Mahoney;
- Yappy, an LZ 32bit/64bit [de]compressor, written by IronPeter;
- Knight tour benchmark, which finds the first 9,000,000 tours (at a rate of some 1 billion jumps per minute); in fact it tests/stresses only the CPU clock;
- Quicksort 32bit/64bit, used to sort 200,000,000+ pointers (pointing to 7-byte chunks).
I would also be glad to get some feedback and results from your machines.
Enjoy!
Get down get down get down get it on show love and give it up
What are you waiting on?
|
|
|
|

|
Just noticed your comment in an older post:
"Quick note3: 'Trident' being 200MB/s faster than BNDM_64 for text and 700MB/s slower than BNDM_64 for DNA, ouch!"
Yes, pretty much that exact statement is in "Flexible Pattern Matching...", the Navarro text.
The statistical analysis there says that BNDM on average beats Horspool for small alphabets like DNA,
Horspool beats BNDM for larger alphabets like English -- i.e. where the probability of any given character occurring is less and less.
BNDM immediately benefits from larger word sizes, Horspool doesn't.
For longer patterns, there's another algorithm (Backward Oracle Matching) that beats Horspool, but has a higher setup cost.
The 64bit tests I'm running mostly confirm this. Linux 2.6.32, GCC 4.3.4, Intel Core2 12-CPU 3.33GHz, L1 cache 12MB.
"Dreams come true, not free" -- S.Sondheim
|
|
|
|

|
Mischa, thanks for keeping the pace.
My last test (the one you are referring to) shows that for 16/32-byte patterns 'Trident' is significantly faster than BNDM_64, though it slows down as patterns get longer. However, starting from patterns 48 bytes long (due to the increased number of repetitive chars/pairs) 'Trident' is inferior, so it should be silenced. For 64-byte patterns BNDM_64 thrashes 'Trident'. That is why my eyes are on mixing BNDM_64 and 'Trident'.
>... Backward Oracle Matching ...
That is another caldera of mine I should fill; the cost is the last thing I look at.
Just got emotional (again) and went to 'The Lounge'. Did you know that UNIX stands for "UNIversal eXecutive"? I like knowing such stuff.
http://www.codeproject.com/Lounge.aspx?msg=4179708#xx4179708xx
Get down get down get down get it on show love and give it up
What are you waiting on?
|
|
|
|

|
Yep. But then I've been using Unix since 1977
"Dreams come true, not free" -- S.Sondheim
|
|
|
|

|
Just found an inspiring PDF article at:
http://www.stringology.org/event/2008/p14.html
An excerpt:
We presented two efficient variants of the Backward Oracle Matching algorithm which
is considered one of the most effective algorithm for exact string matching. The first
variation, called Extended-BOM, introduces an efficient fast-loop over transitions of
the oracle by reading two consecutive characters for each iteration. The second variation,
called Forward-BOM, extends the previous one by using a look-ahead character
at the beginning of transitions in order to obtain larger shift advancements.
Thanks for pointing it out. The tests they conducted are very good; it is quite obvious there is a lot of yummy food in there.
Get down get down get down get it on show love and give it up
What are you waiting on?
|
|
|
|

|
In the following tests 'Tridentx64' ('Gulliver' reinforced by a BNDM_64 order 4 presence check) failed miserably to stand against BNDM_64 for patterns 32+ bytes long.
Five stable facts:
- BNDM_64 is much slower for DNA-type needles and DNA-type haystacks
- BNDM_64 is much slower for English-type needles (32-- long) and English-type haystacks
- BNDM_64 is much faster for English-type needles (32+ long) and English-type haystacks
- BNDM_64 is much faster for English-type needles and DNA-type haystacks
- BNDM_64 is much faster for DNA-type needles and English-type haystacks
In the next two (DNA-type haystack) tests 'Hasherezade' is faster than 'Tridentx64', and both are faster than BNDM_64.
Note: I am still badly stunned by the awful 64bit 'Hasherezade' performance compared to the 32bit 'Hasherezade'.
Haystack: hs_alt_HuRef_chr1.fa
Executable: strstr_SHORT-SHOWDOWN_Microsoft_v16_Ox_32bit.exe
Skip-Performance(bigger-the-better): 1266%, 17578894 skips/iterations
Found ('AGATTTTAAAGATTTT') 3 time(s), Railgun_Quadruplet_7Tridentx64 performance: 1309KB/clock
Skip-Performance(bigger-the-better): 1261%, 17648956 skips/iterations
Found ('TTGACATAGAATCTTA') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 1294KB/clock
Skip-Performance(bigger-the-better): 1268%, 17542938 skips/iterations
Found ('TGGAGGCTGAGAAATA') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 1221KB/clock
Skip-Performance(bigger-the-better): 2551%, 8724089 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATA') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2195KB/clock
Skip-Performance(bigger-the-better): 2438%, 9128017 skips/iterations
Found ('GAATCTTATGGAGGCTGAGAAATAATTTTTTT') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2031KB/clock
Skip-Performance(bigger-the-better): 3624%, 6142214 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCT') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2823KB/clock
Skip-Performance(bigger-the-better): 3264%, 6820145 skips/iterations
Found ('TAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAA') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2498KB/clock
Skip-Performance(bigger-the-better): 4386%, 5074980 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTT') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3019KB/clock
Skip-Performance(bigger-the-better): 4590%, 4849475 skips/iterations
Found ('TTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTTTCTA') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3150KB/clock
Railgun_Quadruplet_7Tridentx64 49 i.e. average performance: 3269KB/clock
Railgun_Quadruplet_7Tridentx64 49 total Skip-Performance/Iterations: 2634368/7091550000
Skip-Performance(bigger-the-better): 1449%, 15356686 skips/iterations
Found ('AGATTTTAAAGATTTT') 3 time(s), BNDM_64 performance: 1045KB/clock
Skip-Performance(bigger-the-better): 1396%, 15944910 skips/iterations
Found ('TTGACATAGAATCTTA') 1 time(s), BNDM_64 performance: 1045KB/clock
Skip-Performance(bigger-the-better): 1408%, 15804660 skips/iterations
Found ('TGGAGGCTGAGAAATA') 1 time(s), BNDM_64 performance: 997KB/clock
Skip-Performance(bigger-the-better): 2972%, 7488740 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATA') 1 time(s), BNDM_64 performance: 1811KB/clock
Skip-Performance(bigger-the-better): 2963%, 7512750 skips/iterations
Found ('GAATCTTATGGAGGCTGAGAAATAATTTTTTT') 1 time(s), BNDM_64 performance: 1739KB/clock
Skip-Performance(bigger-the-better): 4533%, 4909974 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCT') 1 time(s), BNDM_64 performance: 2470KB/clock
Skip-Performance(bigger-the-better): 4530%, 4913981 skips/iterations
Found ('TAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAA') 1 time(s), BNDM_64 performance: 2442KB/clock
Skip-Performance(bigger-the-better): 6121%, 3636339 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTT') 1 time(s), BNDM_64 performance: 2937KB/clock
Skip-Performance(bigger-the-better): 6120%, 3637209 skips/iterations
Found ('TTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTTTCTA') 1 time(s), BNDM_64 performance: 2977KB/clock
BNDM_64 49 i.e. average performance: 3502KB/clock
BNDM_64 49 total Skip-Performance/Iterations: 2806144/6595760528
Skip-Performance(bigger-the-better): 1242%, 17921894 skips/iterations
Found ('AGATTTTAAAGATTTT') 3 time(s), Railgun_Quadruplet_7Hasherezade performance: 1411KB/clock
Skip-Performance(bigger-the-better): 1083%, 20540215 skips/iterations
Found ('TTGACATAGAATCTTA') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 1187KB/clock
Skip-Performance(bigger-the-better): 1151%, 19328506 skips/iterations
Found ('TGGAGGCTGAGAAATA') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 1228KB/clock
Skip-Performance(bigger-the-better): 2619%, 8497158 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATA') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 2860KB/clock
Skip-Performance(bigger-the-better): 2654%, 8385171 skips/iterations
Found ('GAATCTTATGGAGGCTGAGAAATAATTTTTTT') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 2619KB/clock
Skip-Performance(bigger-the-better): 4119%, 5403967 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCT') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 3882KB/clock
Skip-Performance(bigger-the-better): 4124%, 5397642 skips/iterations
Found ('TAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAA') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 3684KB/clock
Skip-Performance(bigger-the-better): 5609%, 3968150 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTT') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 4025KB/clock
Skip-Performance(bigger-the-better): 5643%, 3944537 skips/iterations
Found ('TTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTTTCTA') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 4180KB/clock
Railgun_Quadruplet_7Hasherezade 49 i.e. average performance: 3376KB/clock
Railgun_Quadruplet_7Hasherezade 49 total Skip-Performance/Iterations: 2691888/7089590528
Haystack: hs_alt_HuRef_chr1.fa
Executable: strstr_SHORT-SHOWDOWN_Microsoft_v16_Ox_64bit.exe
Skip-Performance(bigger-the-better): 1266%, 17578894 skips/iterations
Found ('AGATTTTAAAGATTTT') 3 time(s), Railgun_Quadruplet_7Tridentx64 performance: 1341KB/clock
Skip-Performance(bigger-the-better): 1261%, 17648956 skips/iterations
Found ('TTGACATAGAATCTTA') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 1509KB/clock
Skip-Performance(bigger-the-better): 1268%, 17542938 skips/iterations
Found ('TGGAGGCTGAGAAATA') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 1325KB/clock
Skip-Performance(bigger-the-better): 2551%, 8724089 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATA') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2557KB/clock
Skip-Performance(bigger-the-better): 2438%, 9128017 skips/iterations
Found ('GAATCTTATGGAGGCTGAGAAATAATTTTTTT') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2264KB/clock
Skip-Performance(bigger-the-better): 3624%, 6142214 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCT') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3196KB/clock
Skip-Performance(bigger-the-better): 3264%, 6820145 skips/iterations
Found ('TAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAA') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2787KB/clock
Skip-Performance(bigger-the-better): 4386%, 5074980 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTT') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3344KB/clock
Skip-Performance(bigger-the-better): 4590%, 4849475 skips/iterations
Found ('TTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTTTCTA') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3450KB/clock
Railgun_Quadruplet_7Tridentx64 49 i.e. average performance: 3263KB/clock
Railgun_Quadruplet_7Tridentx64 49 total Skip-Performance/Iterations: 2634368/7091550000
Skip-Performance(bigger-the-better): 1449%, 15356686 skips/iterations
Found ('AGATTTTAAAGATTTT') 3 time(s), BNDM_64 performance: 1175KB/clock
Skip-Performance(bigger-the-better): 1396%, 15944910 skips/iterations
Found ('TTGACATAGAATCTTA') 1 time(s), BNDM_64 performance: 1228KB/clock
Skip-Performance(bigger-the-better): 1408%, 15804660 skips/iterations
Found ('TGGAGGCTGAGAAATA') 1 time(s), BNDM_64 performance: 1156KB/clock
Skip-Performance(bigger-the-better): 2972%, 7488740 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATA') 1 time(s), BNDM_64 performance: 2110KB/clock
Skip-Performance(bigger-the-better): 2963%, 7512750 skips/iterations
Found ('GAATCTTATGGAGGCTGAGAAATAATTTTTTT') 1 time(s), BNDM_64 performance: 2031KB/clock
Skip-Performance(bigger-the-better): 4533%, 4909974 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCT') 1 time(s), BNDM_64 performance: 2860KB/clock
Skip-Performance(bigger-the-better): 4530%, 4913981 skips/iterations
Found ('TAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAA') 1 time(s), BNDM_64 performance: 2860KB/clock
Skip-Performance(bigger-the-better): 6121%, 3636339 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTT') 1 time(s), BNDM_64 performance: 3450KB/clock
Skip-Performance(bigger-the-better): 6120%, 3637209 skips/iterations
Found ('TTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTTTCTA') 1 time(s), BNDM_64 performance: 3450KB/clock
BNDM_64 49 i.e. average performance: 3736KB/clock
BNDM_64 49 total Skip-Performance/Iterations: 2806144/6595760528
Skip-Performance(bigger-the-better): 1242%, 17921894 skips/iterations
Found ('AGATTTTAAAGATTTT') 3 time(s), Railgun_Quadruplet_7Hasherezade performance: 1103KB/clock
Skip-Performance(bigger-the-better): 1083%, 20540215 skips/iterations
Found ('TTGACATAGAATCTTA') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 869KB/clock
Skip-Performance(bigger-the-better): 1151%, 19328506 skips/iterations
Found ('TGGAGGCTGAGAAATA') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 945KB/clock
Skip-Performance(bigger-the-better): 2619%, 8497158 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATA') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 2050KB/clock
Skip-Performance(bigger-the-better): 2654%, 8385171 skips/iterations
Found ('GAATCTTATGGAGGCTGAGAAATAATTTTTTT') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 1976KB/clock
Skip-Performance(bigger-the-better): 4119%, 5403967 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCT') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 2977KB/clock
Skip-Performance(bigger-the-better): 4124%, 5397642 skips/iterations
Found ('TAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAA') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 2937KB/clock
Skip-Performance(bigger-the-better): 5609%, 3968150 skips/iterations
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTT') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 3506KB/clock
Skip-Performance(bigger-the-better): 5643%, 3944537 skips/iterations
Found ('TTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTTTCTA') 1 time(s), Railgun_Quadruplet_7Hasherezade performance: 3563KB/clock
Railgun_Quadruplet_7Hasherezade 49 i.e. average performance: 2998KB/clock
Railgun_Quadruplet_7Hasherezade 49 total Skip-Performance/Iterations: 2691888/7089590528
In the next (English-type haystack) test BNDM_64 dominates almost everywhere.
Haystack: OSHO.TXT
Executable: SHORT-SHOWDOWN_Intel_O3_64bit.exe
Found ('If you have read') 9 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2525KB/clock
Found ('you should have ') 181 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2624KB/clock
Found ('pretty good idea') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2845KB/clock
Found ('Osho Books on CD') 4 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2845KB/clock
Found ('use you will put') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2658KB/clock
Found ('to is up to you.') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2624KB/clock
Found ('the largest ever') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2494KB/clock
Found ('of understanding') 852 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2806KB/clock
Found ('and knowledge on') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2658KB/clock
Found ('more, a complete') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2806KB/clock
Found ('and a new way of') 5 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2694KB/clock
Found ('access to Osho's') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2767KB/clock
Found ('words, ideas and') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2658KB/clock
Found ('ideas and vision') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2624KB/clock
Found ('and to make them') 6 time(s), Railgun_Quadruplet_7Tridentx64 performance: 2377KB/clock
Found ('AGATTTTAAAGATTTT') 0 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3207KB/clock
Found ('TTGACATAGAATCTTA') 0 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3207KB/clock
Found ('TGGAGGCTGAGAAATA') 0 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3259KB/clock
Found ('fastest fox with biggest strides') 0 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3483KB/clock
Found ('you will put it to is up to you.') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3367KB/clock
Found ('on meditation and its techniques') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3367KB/clock
Found ('way of life. The purpose of this') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3367KB/clock
Found ('ROM is to provide access to Osho') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3424KB/clock
Found ('Osho's words, ideas and vision, ') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3424KB/clock
Found ('and to make them available to as') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3424KB/clock
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATA') 0 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3812KB/clock
Found ('GAATCTTATGGAGGCTGAGAAATAATTTTTTT') 0 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3812KB/clock
Found ('you have read through the preceding chapters you') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3544KB/clock
Found ('through the preceding chapters you should have a') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3424KB/clock
Found ('preceding chapters you should have a pretty good') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3544KB/clock
Found ('CD-ROM. What use you will put it to is up to you') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3544KB/clock
Found ('of understanding and knowledge on meditation and') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3483KB/clock
Found ('much more, a complete, world view of the New Man') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3544KB/clock
Found ('world view of the New Man and a new way of life.') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3544KB/clock
Found ('is to provide access to Osho's words, ideas and ') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3483KB/clock
Found ('provide access to Osho's words, ideas and vision') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3544KB/clock
Found ('access to Osho's words, ideas and vision, and to') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3544KB/clock
Found ('Osho's words, ideas and vision, and to make them') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3483KB/clock
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCT') 0 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3885KB/clock
Found ('TAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAA') 0 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3885KB/clock
Found ('the preceding chapters you should have a pretty good idea on how') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3544KB/clock
Found ('Books on CD-ROM. What use you will put it to is up to you. It is') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3608KB/clock
Found ('What use you will put it to is up to you. It is the largest ever') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3544KB/clock
Found ('repository of understanding and knowledge on meditation and its ') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3544KB/clock
Found ('of understanding and knowledge on meditation and its techniques.') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3608KB/clock
Found ('to provide access to Osho's words, ideas and vision, and to make') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3544KB/clock
Found ('to Osho's words, ideas and vision, and to make them available to') 1 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3608KB/clock
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTT') 0 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3961KB/clock
Found ('TTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTTTCTA') 0 time(s), Railgun_Quadruplet_7Tridentx64 performance: 3961KB/clock
Railgun_Quadruplet_7Tridentx64 49 i.e. average performance: 3241KB/clock
Railgun_Quadruplet_7Tridentx64 49 total Skip-Performance/Iterations: 2708288/6416464496
Found ('If you have read') 9 time(s), BNDM_64 performance: 1924KB/clock
Found ('you should have ') 181 time(s), BNDM_64 performance: 1980KB/clock
Found ('pretty good idea') 1 time(s), BNDM_64 performance: 2245KB/clock
Found ('Osho Books on CD') 4 time(s), BNDM_64 performance: 2149KB/clock
Found ('use you will put') 1 time(s), BNDM_64 performance: 1980KB/clock
Found ('to is up to you.') 1 time(s), BNDM_64 performance: 1924KB/clock
Found ('the largest ever') 1 time(s), BNDM_64 performance: 1924KB/clock
Found ('of understanding') 852 time(s), BNDM_64 performance: 2349KB/clock
Found ('and knowledge on') 1 time(s), BNDM_64 performance: 2020KB/clock
Found ('more, a complete') 1 time(s), BNDM_64 performance: 2126KB/clock
Found ('and a new way of') 5 time(s), BNDM_64 performance: 2000KB/clock
Found ('access to Osho's') 1 time(s), BNDM_64 performance: 2083KB/clock
Found ('words, ideas and') 1 time(s), BNDM_64 performance: 2104KB/clock
Found ('ideas and vision') 1 time(s), BNDM_64 performance: 2000KB/clock
Found ('and to make them') 6 time(s), BNDM_64 performance: 1804KB/clock
Found ('AGATTTTAAAGATTTT') 0 time(s), BNDM_64 performance: 4592KB/clock
Found ('TTGACATAGAATCTTA') 0 time(s), BNDM_64 performance: 4592KB/clock
Found ('TGGAGGCTGAGAAATA') 0 time(s), BNDM_64 performance: 4699KB/clock
Found ('fastest fox with biggest strides') 0 time(s), BNDM_64 performance: 3741KB/clock
Found ('you will put it to is up to you.') 1 time(s), BNDM_64 performance: 3108KB/clock
Found ('on meditation and its techniques') 1 time(s), BNDM_64 performance: 3424KB/clock
Found ('way of life. The purpose of this') 1 time(s), BNDM_64 performance: 3483KB/clock
Found ('ROM is to provide access to Osho') 1 time(s), BNDM_64 performance: 3424KB/clock
Found ('Osho's words, ideas and vision, ') 1 time(s), BNDM_64 performance: 3483KB/clock
Found ('and to make them available to as') 1 time(s), BNDM_64 performance: 3367KB/clock
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATA') 0 time(s), BNDM_64 performance: 5051KB/clock
Found ('GAATCTTATGGAGGCTGAGAAATAATTTTTTT') 0 time(s), BNDM_64 performance: 5051KB/clock
Found ('you have read through the preceding chapters you') 1 time(s), BNDM_64 performance: 3885KB/clock
Found ('through the preceding chapters you should have a') 1 time(s), BNDM_64 performance: 3885KB/clock
Found ('preceding chapters you should have a pretty good') 1 time(s), BNDM_64 performance: 4123KB/clock
Found ('CD-ROM. What use you will put it to is up to you') 1 time(s), BNDM_64 performance: 3885KB/clock
Found ('of understanding and knowledge on meditation and') 1 time(s), BNDM_64 performance: 4123KB/clock
Found ('much more, a complete, world view of the New Man') 1 time(s), BNDM_64 performance: 4041KB/clock
Found ('world view of the New Man and a new way of life.') 1 time(s), BNDM_64 performance: 3961KB/clock
Found ('is to provide access to Osho's words, ideas and ') 1 time(s), BNDM_64 performance: 4123KB/clock
Found ('provide access to Osho's words, ideas and vision') 1 time(s), BNDM_64 performance: 4041KB/clock
Found ('access to Osho's words, ideas and vision, and to') 1 time(s), BNDM_64 performance: 4123KB/clock
Found ('Osho's words, ideas and vision, and to make them') 1 time(s), BNDM_64 performance: 3961KB/clock
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCT') 0 time(s), BNDM_64 performance: 4699KB/clock
Found ('TAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAA') 0 time(s), BNDM_64 performance: 4699KB/clock
Found ('the preceding chapters you should have a pretty good idea on how') 1 time(s), BNDM_64 performance: 4209KB/clock
Found ('Books on CD-ROM. What use you will put it to is up to you. It is') 1 time(s), BNDM_64 performance: 4209KB/clock
Found ('What use you will put it to is up to you. It is the largest ever') 1 time(s), BNDM_64 performance: 3961KB/clock
Found ('repository of understanding and knowledge on meditation and its ') 1 time(s), BNDM_64 performance: 4299KB/clock
Found ('of understanding and knowledge on meditation and its techniques.') 1 time(s), BNDM_64 performance: 4299KB/clock
Found ('to provide access to Osho's words, ideas and vision, and to make') 1 time(s), BNDM_64 performance: 4299KB/clock
Found ('to Osho's words, ideas and vision, and to make them available to') 1 time(s), BNDM_64 performance: 4299KB/clock
Found ('AGATTTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTT') 0 time(s), BNDM_64 performance: 4810KB/clock
Found ('TTTAAAGATTTTCTTTTTTTTTGACATAGAATCTTATGGAGGCTGAGAAATAATTTTTTTTCTA') 0 time(s), BNDM_64 performance: 4810KB/clock
BNDM_64 49 i.e. average performance: 3141KB/clock
BNDM_64 49 total Skip-Performance/Iterations: 2779920/6213485968
Despite the weak segments in 'Tridentx64' I do not intend to abandon it; it is still my favorite gun thanks to its relatively stable skip-performance. Standing alone, however, it is easy prey, i.e. it ought to be put in some mix.
Bottom-line:
BNDM_64: a piece of beauty. It really shines when the needle differs A LOT from the haystack's straws.
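Why BNDM likes needles that look nothing like the haystack can be seen from a minimal sketch of the textbook algorithm. This is NOT the BNDM_64 code benchmarked above; the name `bndm64_count` and its signature are made up for illustration, and the sketch assumes a needle of 1 to 64 bytes so that the whole search state fits in one 64-bit word. The window is read backwards, and as soon as the bytes read so far occur nowhere in the needle the state word becomes zero and the whole window is skipped.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the author's BNDM_64: counts occurrences
 * (overlapping ones included) of needle[0..m-1] in hay[0..n-1]
 * using textbook BNDM. Assumes 1 <= m <= 64. */
size_t bndm64_count(const unsigned char *hay, size_t n,
                    const unsigned char *needle, size_t m)
{
    uint64_t B[256] = {0};
    if (m == 0 || m > 64 || n < m)
        return 0;

    /* Bit (m-1-i) of B[c] is set when needle[i] == c. */
    for (size_t i = 0; i < m; i++)
        B[needle[i]] |= (uint64_t)1 << (m - 1 - i);

    const uint64_t top = (uint64_t)1 << (m - 1);
    size_t count = 0;
    size_t j = 0;                     /* start of the current window */

    while (j <= n - m) {
        size_t i = m;                 /* window bytes not yet read */
        size_t last = m;              /* shift to apply after this window */
        uint64_t d = ~(uint64_t)0;    /* every needle position is still alive */

        /* Read the window right to left while some factor still matches. */
        while (i > 0 && d != 0) {
            d &= B[hay[j + i - 1]];
            i--;
            if (d & top) {            /* the bytes read form a needle prefix */
                if (i > 0)
                    last = i;         /* remember where that prefix starts */
                else
                    count++;          /* whole window equals the needle */
            }
            d <<= 1;
        }
        j += last;
    }
    return count;
}
```

When the last byte or two of a window do not occur in the needle at all, `d` drops to zero almost immediately and `last` stays at `m`, so the search jumps a full needle length per step. That is consistent with the DNA patterns, which are never found in this haystack, posting the highest KB/clock figures in the dump above.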
Get down get down get down get it on show love and give it up
What are you waiting on?
Last night (2012 Feb 25) I achieved a 13%-20% boost with my strstr-like function (i.e. for short strings/patterns) over GNU Berg's strstr function, both compiled as 64bit code with the Microsoft 16 and Intel 12.1 compilers. For the full benchmark see the commented lines further below. After some ups and downs on Windows 7 64bit with the Intel 12.1 64bit and Microsoft 16 64bit C compilers, I revised Railgun to fit the new environment better, which resulted in Railgun_Doublet. The function is tiny: its main loop takes 00cc5-00c71+1+1 = 86 bytes.
GNU Berg's performance (compiled with Microsoft 2010 64bit compiler): 376KB/clock+400KB/clock+564KB/clock+346KB/clock+540KB/clock+395KB/clock+535KB/clock+401KB/clock+542KB/clock+455KB/clock+510KB/clock+468KB/clock+541KB/clock+469KB/clock+557KB/clock+482KB/clock=7581
GNU Berg's performance (compiled with Intel C++ 64 Compiler XE 12.1): 296KB/clock+320KB/clock+421KB/clock+276KB/clock+417KB/clock+323KB/clock+415KB/clock+326KB/clock+415KB/clock+367KB/clock+405KB/clock+378KB/clock+415KB/clock+378KB/clock+418KB/clock+388KB/clock=5958
Railgun_Doublet performance (compiled with Microsoft 2010 64bit compiler): 421KB/clock+476KB/clock+551KB/clock+393KB/clock+550KB/clock+534KB/clock+553KB/clock+550KB/clock+559KB/clock+557KB/clock+553KB/clock+562KB/clock+568KB/clock+573KB/clock+587KB/clock+594KB/clock=8581
Railgun_Doublet performance (compiled with Intel C++ 64 Compiler XE 12.1): 348KB/clock+397KB/clock+464KB/clock+328KB/clock+463KB/clock+452KB/clock+465KB/clock+463KB/clock+469KB/clock+468KB/clock+464KB/clock+470KB/clock+477KB/clock+479KB/clock+490KB/clock+494KB/clock=7191
BNDM_32 performance (compiled with Microsoft 2010 64bit compiler): 267KB/clock+290KB/clock+462KB/clock+255KB/clock+390KB/clock+379KB/clock+451KB/clock+395KB/clock+400KB/clock+433KB/clock+419KB/clock+420KB/clock+436KB/clock+447KB/clock+435KB/clock+441KB/clock=6320
BNDM_32 performance (compiled with Intel C++ 64 Compiler XE 12.1): 241KB/clock+262KB/clock+409KB/clock+234KB/clock+352KB/clock+343KB/clock+402KB/clock+359KB/clock+367KB/clock+392KB/clock+383KB/clock+384KB/clock+398KB/clock+407KB/clock+400KB/clock+405KB/clock=5738
Summary for all 16 patterns:
GNU Berg's performance: 7581 Microsoft / 5958 Intel
Railgun_Doublet performance: 8581 Microsoft / 7191 Intel
BNDM_32 performance: 6320 Microsoft / 5738 Intel
Using Microsoft:
Railgun_Doublet is (8581-7581)/7581*100% = 13% faster than GNU Berg's
Railgun_Doublet is (8581-6320)/6320*100% = 35% faster than BNDM_32
Using Intel:
Railgun_Doublet is (7191-5958)/5958*100% = 20% faster than GNU Berg's
Railgun_Doublet is (7191-5738)/5738*100% = 25% faster than BNDM_32
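The percentages above all follow one formula, (fast - base) / base * 100%, applied to the summed KB/clock totals. A small sketch for re-checking them; the function name `speedup_percent` is made up for illustration and is not part of the benchmark code:

```c
/* Hypothetical helper: relative gain of `fast` over `base`, in percent.
 * Both arguments are summed KB/clock totals such as 8581 and 7581. */
double speedup_percent(unsigned fast, unsigned base)
{
    return ((double)fast - (double)base) * 100.0 / (double)base;
}
```

For instance, `speedup_percent(8581, 7581)` gives roughly 13.19 and `speedup_percent(7191, 5958)` roughly 20.69, so the whole-number figures quoted above appear to be truncated rather than rounded.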
For full benchmark dumps: Railgun homepage.
// Notes on 80x86 and x64, 2012-Feb-25:
// Three compilers were used (first two on Windows 7 64bit, the third on Windows XP 32bit):
// Intel(R) C++ 64 Compiler XE for applications running on Intel(R) 64, Version 12.1.1.258 Build 20111011
// Microsoft (R) C/C++ Optimizing Compiler Version 16.00.30319.01 for x64
// Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.30319.01 for 80x86
// I have been using x64 for more than 12 hours and the picture quickly became clear: code written for 32bit must be replaced with a dedicated 64bit counterpart; relying on the former alone is a gamble.
// For example Railgun_Doublet dominates when compiled as 64bit (both with Intel XE 2011 12.1 and Microsoft 2010 16.00.30319.01 for x64), but for 32bit it is inferior compared to Railgun_Quadruplet_8Triplet.
// Summary for strstr-like (i.e. memmem for short strings/patterns) usage:
// - for 32bit use Railgun_Quadruplet_8Triplet
// - for 64bit use Railgun_Doublet
// My interest has shifted from strstr toward memmem: Railgun_Doublet will fill the gap (short strings/patterns cases) in Railgun_r8_Mimino_x64, whereas 'Trident2'+'Hasherezade' will deal with the memmem part; 'Hasherezade' should surely be tuned for 64bit.
Tuned function for searching a needle in a haystack
| Type | Article |
| Licence | CPOL |
| First Posted | 9 Sep 2011 |
| Views | 40,244 |
| Downloads | 650 |
| Bookmarked | 16 times |