
Convert String to 64bit Integer

20 Mar 2009, CPOL, 1 min read
A general-purpose C# library function that converts a string to a 64-bit integer that is unique in practice, in the style of object.GetHashCode()

Introduction

The .NET Framework provides the 'GetHashCode()' method, which returns a 32-bit integer. You can use it to convert a 'string' to a 32-bit integer, but the result is not guaranteed to be unique. The function presented here converts a string to a 64-bit integer that is unique in practice, so that strings at the domain (URL) level can be stored, compared, and traced with a very low risk of collision.

Background

If you need to store a large number of URLs in your database, you normally have to define an index on a 'character' column to look them up. If you can instead map each string to exactly one numeric value, you can use that value as a numeric 'Key' in place of a variable-length string.
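As a rough illustration of the numeric-key idea, the sketch below keeps URLs in an in-memory dictionary keyed by a 64-bit value, standing in for a database table with a 64-bit integer key column. The class name `UrlKeyIndex` and its members are hypothetical and not part of the article. The key is computed the same way as the article's GetInt64HashCode, and the sketch assumes you want a collision reported rather than silently overwritten.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper (not from the article): an in-memory stand-in for a
// table whose key is a 64-bit hash instead of the URL text itself.
public sealed class UrlKeyIndex
{
    private readonly Dictionary<long, string> _byKey = new Dictionary<long, string>();

    // Same folding as the article's GetInt64HashCode:
    // XOR of digest bytes 0-7, 8-15 and 24-31.
    public static long KeyFor(string url)
    {
        if (string.IsNullOrEmpty(url)) return 0;
        using (SHA256 sha = SHA256.Create())
        {
            byte[] h = sha.ComputeHash(Encoding.Unicode.GetBytes(url));
            return BitConverter.ToInt64(h, 0)
                 ^ BitConverter.ToInt64(h, 8)
                 ^ BitConverter.ToInt64(h, 24);
        }
    }

    // Stores the URL under its numeric key and returns the key.
    // Throws if a different URL already occupies that key, so a collision
    // is detected instead of overwriting existing data.
    public long Add(string url)
    {
        long key = KeyFor(url);
        string existing;
        if (_byKey.TryGetValue(key, out existing))
        {
            if (!string.Equals(existing, url, StringComparison.Ordinal))
                throw new InvalidOperationException("64-bit key collision: " + url);
            return key; // same URL added twice: nothing to do
        }
        _byKey.Add(key, url);
        return key;
    }

    // Looks a URL up by its numeric key.
    public bool TryGet(long key, out string url)
    {
        return _byKey.TryGetValue(key, out url);
    }
}
```

Lookups then compare fixed-width integers rather than variable-length strings. Keeping the original URL next to the key, as this sketch does, is what makes a collision detectable at all.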

Using the Code

  1. Convert the variable-length string to a fixed-length hash code. Hashing must be fast, so use the .NET-provided System.Security.Cryptography.SHA256CryptoServiceProvider.
  2. Fold the 32-byte hash code down to an 8-byte integer while keeping the chance of a collision low.
C#
/// <summary>
/// Returns a 64-bit hash code for the input string by folding its SHA-256 digest.
/// Requires: using System; using System.Text;
/// </summary>
/// <param name="strText">String to hash. Null or empty returns 0.</param>
/// <returns>64-bit hash code; collisions are unlikely but not impossible.</returns>
static Int64 GetInt64HashCode(string strText)
{
    Int64 hashCode = 0;
    if (!string.IsNullOrEmpty(strText))
    {
        // Unicode (UTF-16) encoding covers every character set
        byte[] byteContents = Encoding.Unicode.GetBytes(strText);
        byte[] hashText;
        // Dispose the hash provider as soon as the digest is computed
        using (System.Security.Cryptography.SHA256 hash =
            new System.Security.Cryptography.SHA256CryptoServiceProvider())
        {
            hashText = hash.ComputeHash(byteContents);
        }
        // Split the 32-byte digest into 8-byte words and fold them with XOR:
        //   hashCodeStart  = bytes 0~7    (8 bytes)
        //   hashCodeMedium = bytes 8~15   (8 bytes)
        //   hashCodeEnd    = bytes 24~31  (8 bytes)
        // Bytes 16~23 are not used.
        Int64 hashCodeStart = BitConverter.ToInt64(hashText, 0);
        Int64 hashCodeMedium = BitConverter.ToInt64(hashText, 8);
        Int64 hashCodeEnd = BitConverter.ToInt64(hashText, 24);
        hashCode = hashCodeStart ^ hashCodeMedium ^ hashCodeEnd;
    }
    return hashCode;
}
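The function above folds three of the four 8-byte words of the digest. For comparison, here is a sketch of two other common ways to reduce a 32-byte digest to 8 bytes: XOR-folding all four words, and simply keeping the first 8 bytes. The class `Int64HashVariants` and its method names are hypothetical and not part of the article. These variants return different values from GetInt64HashCode, so they are not interchangeable with keys already stored using the original function.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical alternatives to the article's folding step (not the author's code).
public static class Int64HashVariants
{
    // XOR together all four 8-byte words of the SHA-256 digest
    // (bytes 0-7, 8-15, 16-23 and 24-31).
    public static long FoldAll(string text)
    {
        if (string.IsNullOrEmpty(text)) return 0;
        byte[] digest = Digest(text);
        long folded = 0;
        for (int offset = 0; offset < digest.Length; offset += 8)
        {
            folded ^= BitConverter.ToInt64(digest, offset);
        }
        return folded;
    }

    // Keep only the first 8 bytes of the digest and discard the rest.
    public static long Truncate(string text)
    {
        if (string.IsNullOrEmpty(text)) return 0;
        return BitConverter.ToInt64(Digest(text), 0);
    }

    // SHA-256 over the UTF-16 bytes of the string, as in the article.
    private static byte[] Digest(string text)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return sha.ComputeHash(Encoding.Unicode.GetBytes(text));
        }
    }
}
```

Assuming the digest bits behave as uniformly random, each of these yields a 64-bit value with the same collision probability as the others; the choice mainly affects how simple the code is.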

Collision and Performance Test

Test platform: Core2Duo, Windows 2003 Server SP2, .NET Framework 3.5 SP1

GetInt64HashCode

  • 10,000,000 hash codes generated: no collision found
  • 100,000 hash codes generated: 830 milliseconds elapsed

object.GetHashCode (provided by the .NET Framework)

  • 10,000,000 hash codes generated: 4,150 collisions found
  • 100,000 hash codes generated: 35 milliseconds elapsed
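The article does not show how the test was run, so the following is only a sketch of how such a measurement could be set up. The class `CollisionTest`, its methods, and the synthetic input strings ("http://example.com/page/" plus a counter) are all assumptions; the original test data is unknown. A collision is counted whenever a hash value has already been produced by an earlier input.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Security.Cryptography;
using System.Text;

// Hypothetical test harness (not the author's): counts repeated hash values
// and measures elapsed time over synthetic, distinct input strings.
public static class CollisionTest
{
    // Same folding as the article's GetInt64HashCode.
    public static long Hash64(string text)
    {
        if (string.IsNullOrEmpty(text)) return 0;
        using (SHA256 sha = SHA256.Create())
        {
            byte[] h = sha.ComputeHash(Encoding.Unicode.GetBytes(text));
            return BitConverter.ToInt64(h, 0)
                 ^ BitConverter.ToInt64(h, 8)
                 ^ BitConverter.ToInt64(h, 24);
        }
    }

    // Hashes 'count' distinct strings and returns how many hash values
    // had already been seen for an earlier string.
    public static int CountCollisions(int count, Func<string, long> hash)
    {
        HashSet<long> seen = new HashSet<long>();
        int collisions = 0;
        for (int i = 0; i < count; i++)
        {
            string input = "http://example.com/page/" + i; // assumed test data
            if (!seen.Add(hash(input))) collisions++;
        }
        return collisions;
    }

    // Returns the wall-clock milliseconds needed to hash 'count' strings.
    public static long TimeMilliseconds(int count, Func<string, long> hash)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            hash("http://example.com/page/" + i);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

For example, `CollisionTest.CountCollisions(10000000, CollisionTest.Hash64)` repeats the 64-bit run at the article's scale, and passing `s => (long)s.GetHashCode()` gives the 32-bit comparison. Timings and collision counts depend on the machine, the runtime, and the input strings, so they will not necessarily match the figures above.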

Points of Interest

I know that SHA-256 does not guarantee collision-free hash values, and that compressing 32 bytes to 8 bytes increases the collision probability, but I think the function above performs well enough and avoids collisions in practice.

Your feedback will help make the function more efficient and reliable.

This function is now in use for collecting a large number of URLs. No collision has occurred across 40,000,000 unique URLs.
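To put the collision risk in numbers, the sketch below applies the standard birthday approximation, p ≈ 1 − exp(−n(n−1) / (2·2^bits)), which assumes hash values are uniformly distributed. The class `BirthdayBound` and its method name are hypothetical and not part of the article.

```csharp
using System;

// Hypothetical helper (not from the article): birthday-bound estimate of the
// chance that at least two of n inputs share a hash value.
public static class BirthdayBound
{
    // n    = number of distinct inputs hashed
    // bits = width of the hash in bits (for example 32 or 64)
    // Assumes hash values are uniformly distributed over 2^bits values.
    public static double CollisionProbability(double n, int bits)
    {
        double pairs = n * (n - 1) / 2.0;       // number of input pairs
        double space = Math.Pow(2.0, bits);     // number of possible hash values
        double x = pairs / space;               // expected number of colliding pairs
        // For very small x, 1 - exp(-x) loses precision; x itself is the
        // first-order approximation.
        if (x < 1e-8) return x;
        return 1.0 - Math.Exp(-x);
    }
}
```

Under that uniformity assumption, `BirthdayBound.CollisionProbability(40000000, 64)` comes out at roughly 4.3e-5, so seeing no collision among 40,000,000 URLs is the expected outcome rather than proof of uniqueness. The same call with `bits` set to 32 returns a value very close to 1 at that scale.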

History

  • 20th March, 2009: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer, Seoul, Korea (Republic of)
I have developed management software and websites for the last six years in the .NET Framework environment.

Interested in architecture and frameworks.

You can visit my blog: http://vandbt.tistory.com/

It has articles on C#, frameworks, and architecture, plus book recommendations, but only in Korean for now.
I plan to post English versions in the near future.

Comments and Discussions

 
Question: Perhaps, for this purpose, you might also to use simply GetHashCode.
stefan.babos21-Jun-18 21:36
Question: Improvement
oleum17-Jul-17 4:47
Question: Javascript/Typescript/Angular
jadeboy12-Feb-17 14:07
General: My vote of 5
Rob 1234523-Oct-12 13:35
General: CRC
AndreyMir28-Mar-09 18:55
Question: Re: CRC
Composition412-Apr-09 17:42
General: Alternative approach
djlove26-Mar-09 11:39
Answer: Re: Alternative approach
Composition426-Mar-09 16:41
General: more tested 100,000,000 unique string for hashe value , collision not found yet
Composition422-Mar-09 13:56
General: "unique 64bit inteager"
harold aptroot21-Mar-09 13:08
General: Re: "unique 64bit inteager"
assmax21-Mar-09 23:04
Answer: Re: "unique 64bit inteager"
Composition422-Mar-09 4:43
Answer: Re: "unique 64bit inteager"
Composition422-Mar-09 4:56
General: Re: "unique 64bit inteager"
harold aptroot22-Mar-09 5:56
