DRUM - A C++ Implementation for the URL-seen Test of a Web Crawler

19 May 2009, MIT License, 12 min read
A C++ implementation of DRUM (Disk Repository with Update Management) - Storage of key/value pairs and asynchronous check/update operations
#include "drum.hpp"
#include "bucket_identifier.hpp"
#include "db_compare.hpp"
#include "rabin_fingerprint.hpp"
#include <boost/cstdint.hpp>
#include <exception>
#include <iostream>
#include <string>

using namespace drum;

// Dispatcher that prints the outcome of each check/update request.
template <class key_t, class value_t, class aux_t>
struct URLSeenDispatcher : NullDispatcher<key_t, value_t, aux_t>
{
  // Invoked for a key reported as unique.
  void UniqueKeyUpdate(key_t const& k, value_t const& v, aux_t const& a) const
  {
    std::cout << "UniqueKeyUpdate: " << k << " Data: " << v << " Aux: " << a << "\n";
  }

  // Invoked for a key reported as a duplicate.
  void DuplicateKeyUpdate(key_t const& k, value_t const& v, aux_t const& a) const
  {
    std::cout << "DuplicateKeyUpdate: " << k << " Data: " << v << " Aux: " << a << "\n";
  }
};

int main()
{
  try
  {
    std::cout << "Example of Drum usage." << std::endl;

    typedef Drum<boost::uint64_t, std::string, std::string, 2, 4, 64, URLSeenDispatcher> DRUM;
    DRUM drum("url-seen.db"); // Creates a file with this name: the Berkeley DB database.

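    // Each URL is fingerprinted into a 64-bit key; the URL string itself is
    // passed as the third (auxiliary) argument of CheckUpdate.
    RabinFingerprint fp;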

    std::string url0 = "http://www.codeproject.com";
    boost::uint64_t key0 = fp.Compute(url0.c_str(), url0.size());
    drum.CheckUpdate(key0, "", url0);

    std::string url1 = "http://www.oracle.com/technology/products/berkeley-db/index.html";
    boost::uint64_t key1 = fp.Compute(url1.c_str(), url1.size());
    drum.CheckUpdate(key1, "", url1);

    std::string url2 = "http://www.boost.org";
    boost::uint64_t key2 = fp.Compute(url2.c_str(), url2.size());
    drum.CheckUpdate(key2, "", url2);

    // Same URL as url0, so key3 equals key0.
    std::string url3 = "http://www.codeproject.com";
    boost::uint64_t key3 = fp.Compute(url3.c_str(), url3.size());
    drum.CheckUpdate(key3, "", url3);

    //Synchronize and dispose.
    drum.Synchronize();
    drum.Dispose();

    std::cout << "Done!" << std::endl;
  }
  catch (DrumException & e)
  {
    std::cout << "Drum error: " << e.what() << " Number: " << e.get_error_code() << std::endl;
  }
  catch (std::exception const& e)
  {
    // Not a DrumException, so do not label it as a Drum error.
    std::cout << "Error: " << e.what() << std::endl;
  }
  catch (...)
  {
    std::cout << "Unknown error." << std::endl;
  }
  return 0;
}
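
The listing above queues four check/update requests, and the fourth URL repeats the first, so key3 equals key0. As a rough mental model of what such a pass decides, the sketch below queues keys and classifies each one as unique or duplicate only when a synchronize step runs. It is an in-memory illustration written for this note and assumes C++11. InMemoryCheckUpdate and Outcome are hypothetical names that are not part of drum.hpp, and the hash map merely stands in for the on-disk repository. How Drum itself reports a key repeated within a single batch is not something this sketch establishes.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical result type for this sketch; not part of drum.hpp.
enum class Outcome { Unique, Duplicate };

// Hypothetical in-memory stand-in for a check/update store.
class InMemoryCheckUpdate
{
public:
  // Queue a request. Nothing is classified until Synchronize() runs.
  void CheckUpdate(std::uint64_t key, std::string const& value)
  {
    pending_.push_back(std::make_pair(key, value));
  }

  // Classify every queued key in arrival order, then store or overwrite
  // its value. Returns one Outcome per queued request.
  std::vector<Outcome> Synchronize()
  {
    std::vector<Outcome> results;
    for (auto const& request : pending_)
    {
      bool const seen = store_.count(request.first) != 0;
      results.push_back(seen ? Outcome::Duplicate : Outcome::Unique);
      store_[request.first] = request.second;
    }
    pending_.clear();
    return results;
  }

private:
  std::vector<std::pair<std::uint64_t, std::string> > pending_;
  std::unordered_map<std::uint64_t, std::string> store_;
};
```

Queuing the keys 1, 2, 1 and then calling Synchronize() returns Unique, Unique, Duplicate in that order. A second call with nothing queued returns an empty vector.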

License

This article, along with any associated source code and files, is licensed under The MIT License


Written By
Software Developer
Brazil