Click here to Skip to main content
15,895,779 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
I am trying to do something similar to this:
I have a list of integers, anywhere from 0-99.
I need to store them in a data structure, and then be able to hash from a lower boundary to an upper boundary to access the ones I need in the fastest time complexity possible.
For example, if I had 0 1 2 3 6 8 9 and I wanted to access those in the range of 3 to 7 inclusive, it would simply just return 3 and 6.
I know I could start at the lower boundary and just use the equal_range(int) function on every value is the range, but is that the most efficient way?
Posted
Comments
barneyman 23-Mar-15 20:01pm    
store them sorted, binary search them to find your lowest, continue ascending until you exceed your highest - your worst case search is O(log n)
PIEBALDconsult 23-Mar-15 20:09pm    
Just use an array. Particularly if you have so few values.
[no name] 24-Mar-15 12:48pm    
I don't know how many items you want to store, but you should try boost::flat_map. It is really fast. Of course, as barneyman said, the data should be stored sorted into container.

1 solution

The trouble with maintaining an ordered list is the insertion cost - O(log N) - each time your add something to your list. A bit vector would be easier.

std::bitset almost does what you want. It lacks an efficient way to build a mask of ranges.

boost::flat_map looks like it does more than what you want - unless there's a specialization specifically for bool values.

Here's some straight C++ code that should work for you.

C++
#include <memory.h>

#define _countof(arg) ( (sizeof arg) / (sizeof arg[0]) )

class Set
{
    typedef unsigned long long word_t;
    enum {word = 8 * sizeof(word_t), limit = 100};
    word_t mask[ (limit + word - 1) / word];
public:
    Set()
    {
        clear();
    }

    Set(const Set & copy)
    {
        memcpy(mask, copy.mask, sizeof mask);
    }

    void clear()
    {
        memset(mask, 0, sizeof mask);
    }

    bool addKey(unsigned char key)
    {
        if (key < limit)
        {
            word_t bit = 1;
            bit <<= (key % word);
            mask[key / word] |= bit;
            return true;
        }
        return false;
    }

    // range is from low (inclusive) to high (exclusive)
    bool addRange(unsigned char low, unsigned char high)
    {
        if (low < limit && high <= limit && low < high)
        {
            unsigned char low_index = low / word;
            unsigned char high_index = high / word;
            word_t work = (word_t)1 << (low % word);
            work--;
            word_t low_mask = ~work;
            work = (word_t)1 << (high % word);
            work--;
            word_t high_mask = work;
            for (unsigned char i = low_index; i <= high_index; i++)
            {
	        word_t bits = -1;
	        if (i == low_index) bits &= low_mask;
                if (i == high_index) bits &= high_mask;
                    mask[i] = bits;
            }
            return true;
        }
        return false;
    }

    Set& intersect(const Set &right)
    {
         for (unsigned char i = 0; i < _countof(mask); i++)
             mask[i] &= right.mask[i];
         return *this;
    }

    void print() const
    {
        for (unsigned char i = 0; i < _countof(mask); i++)
        {
            if (!mask[i]) continue;
            for (unsigned bit = 0; bit < word; bit++)
            {
                word_t test = 1;
                test <<= bit;
                if (mask[i] & test)
                    printf(" %2d", i * word + bit);
            }
        }
        printf("\n");
    }
};

int main(int argc, _char* argv[])
{
	Set set;

	/*
	For example, if I had 0 1 2 3 6 8 9 and I wanted to access those in the range of 3 to 7 inclusive, it would simply just return 3 and 6.
	*/

    set.addKey(0);
    set.addKey(1);
    set.addKey(2);
    set.addKey(3);
    set.addKey(6);
    set.addKey(8);
    set.addKey(9);

    printf("Set:");
    set.print();

    Set range;

    printf("Range:");
    range.addRange(3, 8);
    range.print();

    printf("Intersect:");
    range.intersect(set);
    range.print();

    getchar();
    return 0;
}
</memory.h>


The above code uses long long as the word size so it can scan all 100 bits very quickly (two 64 bit accesses). You can change the word size to long, short or char to see the effect on performance. It should work for up to 255 bits.
 
Share this answer
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900