|
Any possibility of writing an article about it and posting to CP? I would be interested!
|
|
|
|
|
Yes, I plan to write at least one article on the subject; I still need to finish some research first, so it will be a few more weeks before anything shows though.
|
|
|
|
|
|
No! It'll be misused by students.
|
|
|
|
|
Potential misuse should not stop us now; it did not stop mankind from coming up with good and bad things such as fire, steam engines, cars, roads, arms, computers, Internet, and Windows.
I'll make the article long enough so none of the lazy students will ever read it.
|
|
|
|
|
That might work. It would be better in French, though.
|
|
|
|
|
PIEBALDconsult wrote: It would be better in French
The CP police won't allow it; they are very reluctant to accept articles and messages in which they don't understand most of the words.
modified on Wednesday, December 3, 2008 11:52 PM
|
|
|
|
|
Certain optimizations can be made to a classic or implied-two-spoke sieve that won't work for more spokes.
If it's hard-coded to only count the primes, it'll be quicker too.
For mine, I pass in a delegate of what to do with each prime, so I have a lot more method calls.
C:\>Eratosthenes 4294967296 Three
Target=4294967296; Spokes=(30,8); Bytes=(17895704,143165632); Elapsed=00:04:53.1354700; Primes=203280221
In that output, Spokes = (total spokes, allocated spokes) and Bytes = (bytes per array, total bytes allocated for arrays).
Luc Pattyn wrote: I do expect spokes to improve performance a lot
Try it, you'll see.
How would you cross off the 25?
1 7 13 19 25 31
5 11 17 23 29 35
This is an area that still needs work, and I think I may have an idea...
|
|
|
|
|
Hi,
thanks for the clarification.
PIEBALDconsult wrote: I pass in a delegate of what to do with each prime
I don't. I might do that when I am done; right now I do not want to wait for a machine to execute millions of delegates I do not really need. I call a non-virtual method for each prime; it displays a few of them and keeps track of the count.
PIEBALDconsult wrote: Try it, you'll see.
I will.
PIEBALDconsult wrote: How would you cross off the 25?
I am considering only one array containing the primes of all relevant spokes; I currently store 1, 3, 5, 7, 9, 11, 13, 15 etc because the odd one is the only relevant spoke in a two-spoke setup. In the six-spoke scenario that would be 1, 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35, etc.
Therefore 25, just like any other non-prime, will be flagged as composite, every time it appears as a multiple of one of its factors; in this case that would be only once, and it would also be the first relevant multiple of prime 5.
|
|
|
|
|
Update:
I now have a Core 2 Quad and 4GB RAM (still running XP Pro), and after a few more tweaks I have:
C:\>eratosthenes 4294967296 Zero
Target=4294967296; Spokes=(1,1); Bytes=(536870920,536870920); Elapsed=00:11:24.2121472; Primes=203280221
C:\>eratosthenes 4294967296 One
Target=4294967296; Spokes=(2,1); Bytes=(268435464,268435464); Elapsed=00:05:14.2174818; Primes=203280221
C:\>eratosthenes 4294967296 Two
Target=4294967296; Spokes=(6,2); Bytes=(89478488,178956976); Elapsed=00:03:21.4457024; Primes=203280221
C:\>eratosthenes 4294967296 Three
Target=4294967296; Spokes=(30,8); Bytes=(17895704,143165632); Elapsed=00:02:59.1627191; Primes=203280221
C:\>eratosthenes 4294967296 Four
Target=4294967296; Spokes=(210,48); Bytes=(2556536,122713728); Elapsed=00:05:38.4098073; Primes=203280221
C:\>eratosthenes 4294967296 Five
Target=4294967296; Spokes=(2310,480); Bytes=(232416,111559680); Elapsed=00:36:54.0737311; Primes=203280221
Now I have to add threading... I'll be back.
Update to the update:
I tweaked a bit more (inlined some methods) so now I get:
C:\>eratosthenes 4294967296 Zero
Target=4294967296; Spokes=(1,1); Bytes=(536870920,536870920); Elapsed=00:12:00.3882952; Primes=203280221
C:\>eratosthenes 4294967296 One
Target=4294967296; Spokes=(2,1); Bytes=(268435464,268435464); Elapsed=00:05:24.4898381; Primes=203280221
C:\>eratosthenes 4294967296 Two
Target=4294967296; Spokes=(6,2); Bytes=(89478488,178956976); Elapsed=00:03:24.4878007; Primes=203280221
C:\>eratosthenes 4294967296 Three
Target=4294967296; Spokes=(30,8); Bytes=(17895704,143165632); Elapsed=00:02:46.4519220; Primes=203280221
C:\>eratosthenes 4294967296 Four
Target=4294967296; Spokes=(210,48); Bytes=(2556536,122713728); Elapsed=00:04:22.4536602; Primes=203280221
C:\>eratosthenes 4294967296 Five
Target=4294967296; Spokes=(2310,480); Bytes=(232416,111559680); Elapsed=00:24:59.2846800; Primes=203280221
C:\>eratosthenes 4294967296 Six
Target=4294967296; Spokes=(30030,5760); Bytes=(17880,102988800); Elapsed=06:30:34.0253234; Primes=203280221
Notice the first successful run of 2^32 with a six-prime wheel!
Further update:
C:\>eratosthenes 4294967296 Six
Target=4294967296; Spokes=(30030,5760); Bytes=(17880,102988800); Elapsed=05:53:56.1556709; Primes=203280221
I did try threading, but the results were awful; I'll try again later.
modified on Monday, January 12, 2009 10:05 AM
|
|
|
|
|
My requirement is: I will have billions of data records coming in per day. From those, I have to filter out some 1000-2000 records based on record names, which are stored in a filter.
My question is: what is the suitable data structure to use, and which search algorithm will meet this high-volume requirement?
Sanchit
|
|
|
|
|
Sanchit,
You need to be more specific. In what format are the input records or is this something which can be defined for this implementation? In what form are the record names in the filter and can this format be defined for this implementation?
Dave.
|
|
|
|
|
The most efficient data structure for lookups is the hash table.
Create a hash table whose size is a prime number roughly three times the number of names you're looking for, then insert the 1000-2000 names.
Then for each data record that comes in, look up the name in the hash table. Use a language like C++, which executes faster than C#.
|
|
|
|
|
You may also want to look at a radix tree:
http://en.wikipedia.org/wiki/Radix_tree[^]
It will save the time required to calculate a hash for each of the billions of records at the cost of a longer search per record.
Hash table:
- calc hash of record (slow)
- check table for hash value (quick)
- handle searching collisions (quick)
- if a matching hash is found in the table, do a full string compare between the record and the stored entry, since the hash algorithm can produce the same hash for different strings (slow)
Radix tree:
- search tree for record string (slow)
The radix should be faster. The hash function will likely have a similar cost to the radix tree search. It may even be worse as the hash looks at each character in the string whereas the radix search can fail out early.
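For illustration, here is a deliberately simplified character trie in C (an uncompressed cousin of the radix tree, and my own sketch rather than a real radix implementation, which would compress single-child chains). It shows the "fail out early" property: a lookup stops at the first character with no matching edge, without ever hashing the whole string:

```c
#include <stdlib.h>

/* One node per character; child[c] is the edge labelled c.
   (Nodes are never freed -- this is only a sketch.) */
struct node {
    struct node *child[256];
    int terminal;                 /* 1 if a stored name ends here */
};

static struct node *trie_new(void)
{
    return calloc(1, sizeof(struct node));
}

static void trie_insert(struct node *root, const char *s)
{
    for (; *s; s++) {
        unsigned char c = (unsigned char)*s;
        if (!root->child[c])
            root->child[c] = trie_new();
        root = root->child[c];
    }
    root->terminal = 1;
}

static int trie_contains(const struct node *root, const char *s)
{
    for (; *s; s++) {
        root = root->child[(unsigned char)*s];
        if (!root)
            return 0;             /* early exit: no name has this prefix */
    }
    return root->terminal;
}
```

Each lookup is at worst O(length of the name) and often exits well before the end.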
...cmk
The idea that I can be presented with a problem, set out to logically solve it with the tools at hand, and wind up with a program that could not be legally used because someone else followed the same logical steps some years ago and filed for a patent on it is horrifying.
- John Carmack
|
|
|
|
|
The initial post mentioned that the search would be for record names that were stored in a filter. If only a field of the record (the name) needed to be checked, wouldn't that reduce the effort to create a hash - in fact wouldn't a Radix search be faster? That is why I asked for more information about the input and filter format.
Dave.
|
|
|
|
|
Member 4194593 wrote: If only a field of the record (the name) needed to be checked,
I understood that each record would contain a (single) filter field; that's why I suggested a radix tree. A hash would likely be more appropriate if you wanted to match several fields.
Member 4194593 wrote: wouldn't that reduce the effort to create a hash
Yes. Sounds like you're in favour of using a hash table.
Member 4194593 wrote: in fact wouldn't a Radix search be faster?
Yes, based on my op. Sounds like you're in favour of using a radix tree.
Original post: 'based on record names'
I assume this to mean strings, in which case my op stands.
...cmk
The idea that I can be presented with a problem, set out to logically solve it with the tools at hand, and wind up with a program that could not be legally used because someone else followed the same logical steps some years ago and filed for a patent on it is horrifying.
- John Carmack
|
|
|
|
|
Good point. Some other pros and cons: A radix tree would require more memory accesses. But on the other hand, a radix tree would take less memory than a hash table, so would be more likely to fit entirely in a cache.
The hash-value calculation wouldn't necessarily be slow; it would be all integer operations on a key that could be in a single cache line. Also the number of collisions can be reduced in two ways:
1. Randomly-hashed values follow a Poisson distribution, so the expected number of collisions can be calculated for a given hash-table size. The table size can be increased to achieve an acceptable collision rate.
2. Assuming the list of names is fixed, a perfect hash function http://en.wikipedia.org/wiki/Perfect_hash_function[^] can be constructed, which will have no collisions.
|
|
|
|
|
If it's necessary to check billions of records per day, that sounds like enough computer work--regardless of algorithm--that profiling will be necessary to achieve optimal performance. Radix trees can be good if the records to be searched for differ from other records in the characters that are examined first. They can be bad if records often differ only in the characters that are examined much later.
With hash functions, there are often trade-offs involving the time required to compute a hash, the hash table size, and the frequency of hash collisions and false matches. Increasing the complexity of a hash function such that its cost grows by more than half the cost of a direct record comparison won't make much sense unless doing so saves an average of one direct record comparison for every two records. Having a hash table so large that every access entails a cache miss may slow things down more than the occasional hash collision or false match would.
One approach that may be helpful would be to compute a somewhat long hash of a record, hash that to produce an index into a hash table, and have the table itself store the 'long' hash in addition to the actual record. One could thus at very little expense test the long hash of the record to be checked against the long hash in the table. This would eliminate a large fraction of the direct record comparisons that would otherwise be needed.
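A minimal C sketch of that idea (the names here are my own, purely illustrative): each table slot stores the full "long" hash beside the record, so most mismatches are rejected by a single integer compare:

```c
#include <stdint.h>
#include <string.h>

/* The expensive full record comparison only runs when the stored
   long hash agrees with the probe's long hash. */
struct slot {
    uint64_t long_hash;       /* 'long' hash of the stored record */
    const char *record;       /* the stored record itself */
};

static int slot_matches(const struct slot *s, uint64_t h, const char *rec)
{
    if (s->long_hash != h)
        return 0;                          /* cheap reject */
    return strcmp(s->record, rec) == 0;    /* rare full comparison */
}
```

With a 64-bit long hash, false agreements are rare enough that direct record comparisons become a negligible fraction of the work.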
BTW, how fast are modern machines at 32x32 integer multiplication? If the strings are zero-padded as well as zero-terminated and multiplies are cheap, it might be good to do something like:
unsigned long *ptr;          /* points at the zero-padded, zero-terminated string */
unsigned long hash = 0, temp;
do
{
    temp = *ptr++;                   /* fetch 32 bits of string data */
    hash += hash >> 23;              /* fold the high bits back in */
    hash ^= temp;
    hash *= some_big_prime_number;   /* any large odd prime constant */
} while(temp);                       /* the terminating zero word ends the loop */
That would gobble up 32 bits of string data per loop iteration; I would expect the multiply, even if it takes multiple cycles, could be done concurrently with the next word fetch. Not the world's most sophisticated hash function by any stretch, but the execution time should be pretty minimal. If you want to use the hash function to yield a table index, use the upper bits. Taking the modulus of a prime number (distinct from the one used for the multiply) may be better, but modulus operations are more expensive than shifts.
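For the "use the upper bits" suggestion, one possible shape (assuming a 32-bit hash and a power-of-two table; TABLE_BITS is my own illustrative parameter, not from the post above):

```c
#include <stdint.h>

#define TABLE_BITS 16   /* table of 2^16 slots (assumed size) */

/* Index with the upper bits of the 32-bit hash, which the final
   multiply mixes most thoroughly. */
static uint32_t table_index(uint32_t hash)
{
    return hash >> (32 - TABLE_BITS);
}
```

A shift like this costs one cycle, versus the handful a prime modulus would take.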
|
|
|
|
|
Interesting points. Your comment about better performance if the records differ in the characters examined first reminded me of the ID3 algorithm http://en.wikipedia.org/wiki/ID3_algorithm[^]. Basically it builds a decision tree that first examines the attributes of a record that differ the most. This avoids wasting time examining attributes that don't differ much among the records.
Your point about cache misses is good; for this volume of data the best data structure is one that fits in the cache.
|
|
|
|
|
|
Vij@y wrote: anyone can convert this code to quick sort.
So convert it, then.
If you can't convert it, which part are you having problems with?
|
|
|
|
|
I need help from those of you who have experience with the wumpus problem.
I would like to ask: at each step, how should we choose which adjacent square to move to?
I know one method: choose the adjacent OK square (one without mummies or pits) that has been visited the fewest times. My teacher said that it is not optimal...
Would you mind helping me? Thanks in advance.
|
|
|
|
|
|
I just think that if you cannot help other people, you shouldn't treat them as a joke...
|
|
|
|
|
I think it was a way of indicating that you should look at other currently available sources before asking others to spend their time trying to help you. Did you Google "Wumpus"? There are many available references. If you have checked those and still do not have an answer, then ask for individual help, but maybe on a gaming forum.
|
|
|
|
|