Random Thoughts..

John Underhill

Rate me:

4.93/5 (45 votes)

26 Jul 2014CPOL41 min read

72.6K

1.3K

CSRG random generator in C# and the NIST STS Library (Huge update!)

Download Drbg_Test.zip - 469 KB

Sometimes..

you have to chuck it all, and start all over again, (sigh). So I did, and if you've visited this article before, you will notice that all the methods I tested previously to create random data have vanished.. but trust me, that's a good thing. So, I spent last night racking my brain, trying to figure out how to make this stronger, that is, fortify it against any attempts at reversal, and at the same time, perhaps produce even better random from an algorithm. What follows, is the best ideas I've had to date, combined in a way that I believe creates a great random generator with a strong focus on security. So read on, a lot has changed, and I think this might be the one.. I have left all of the original algorithms in, and added new ones that I created while experimenting, (more about that later), anyways, I think this really warrants a re-read.. Oh, and I renamed it to: Chaotic State Random Generator, CSRG (nick. Chaos). Better name I think..

July 24: Some more big updates!

I was integrating this into my main project today, and made some significant changes to this generator, so I thought I'd let you know.. I added an entropy collector to the main loop of the Generate() methods. It collects bytes and periodically, extracts the entropy and uses that to rotate the expansion table, (explained under the 'Table Rotation' section of the Generate() method breakdown). I increased the seed size for both engines to 96, now in the DUAL engine, AES keys and counters are initialized purely with seed material, in SINGLE mode, the remaining 32 bytes are thrown to the entropy pool. I also changed the recommended block size from 10Kib to 100Kib, which would bring it up to about a 1000 to 1 ratio of psuedo random output to required seed material in the test implementation.

July 25: Parallel example

Added a test method to frmNist that demonstrates using CSRG in a Parallel For loop, ParallelTest(Size). Throughput on my AMD A6-3600 quadcore runs at almost 900Mib per minute using this technique.

It always starts..

with something you need.. and I needed a really good random generator; one that could be seeded manually, was self-contained, and performed remarkably, and so my search began.. Through the wild tundra of indecipherable C code written by introverts addicted to shorthand, the mathematical voodoo penned by guys with foreheads the size of skateboards, the Blum Blum Schub's, the Yarrow's and Fortuna's, Twister's and ISSAC's, and.. then I started looking into block ciphers running in counter mode. I already had an Aes class written based largely on the mono implementation, so this should be a snap! My first counter mode implementation gave me great results, but tested with Ent as a guide, it just couldn't beat the windows built-in api RNGCryptoServiceProvider, which is a managed wrapper for the CryptGenRandom api. So I started experimenting, and expanded my random testing tool-kit with a C# version of the Ent tests, and later the Nist batch of random tests.

Criteria

These are the expectations I intend for the implementation:

It must generate the same random output with the same seed each time
It must be fast
It must be extremely difficult to unwind
It must produce excellent random output

The Good Stuff

Let's start small, like generating 16 bytes of random for integers, and creating seeds for the generator. The first method is used to create random integers and taken from the CSPRNG class, it gets a random buffer, (in this case from RNGCryptoServiceProvider), gets its hash value from SHA256, and returns the first half of that hash value. The integers are drawn from the bytes via one of the BitConverter overloads. These methods are all contained in the static class CSPRNG.cs.

/// <summary>
/// Get a 16 byte/128 bit seed
/// </summary>
/// <returns>Random seed [byte[]]</returns>
internal static byte[] GetSeed16()
{
    byte[] data = new byte[64];
    byte[] result = new byte[16];

    _rngRandom.GetBytes(data);
    // copy first half of hash in
    Buffer.BlockCopy(_shaHash.ComputeHash(data), 0, result, 0, 16);

    return result;
}

The class contains methods that return random for all of the integer and floating types, they look like this..

/// <summary>
/// Get a random 32bit integer
/// </summary>
/// <returns>Random Int32</returns>
internal static Int32 NextInt32()
{
    return BitConverter.ToInt32(GetSeed16(), 0);
}

A random generator requires a random seed for it's initial state, you could get creative and do something like this:

/// <summary>
/// Get a 64 byte/512 bit seed
/// </summary>
/// <returns>Random seed [byte[]]</returns>
internal static byte[] GetSeed64()
{
    byte[] data = new byte[128];
    byte[] seed = new byte[64];

    _rngRandom.GetBytes(data);

    // entropy extractor
    using (SHA512 shaHash = SHA512Managed.Create())
        Buffer.BlockCopy(shaHash.ComputeHash(data), 0, seed, 0, 64);

    return seed;
}

In this routine, we are generating 128 bytes of random with RNGCrypto, which aligns to the SHA512 blocksize, then returning the hash for that random array. This is called 'entropy extraction', it involves taking a secret random value and deriving its hash, then using that as the random source, and is a good way to build encryption keys, IVs or seeds.. (now, for the purposes of seeding a generator, I think this may be a bit excessive, RngCrypto output should be good enough.. and you can even lose entropy by going too far.. ;o)

CSRG Random Generator

Before we even start looking at this, well, somewhat intimidating looking algorithm, I'll start right at the beginning, when the generator is initialized..

/// <summary>
/// Initialize the generator
/// </summary>
public CSRG()
{
    // use the default table seed
    _expansionTable = ExpandTable(SEED_TABLE);
}

/// <summary>
/// Initialize the generator with a unique internal expansion table
/// </summary>
/// <param name="TableSeed">Generator key: must be 48 bytes of random data</param>
public CSRG(byte[] TableSeed)
{
    // wrong seed size!
    if (TableSeed.Length != 48)
        throw new ArgumentOutOfRangeException("Expansion Seed size must be 48 bytes long!");

    _expansionTable = ExpandTable(TableSeed);
}

As you can see, there are two constructors, they both create something called an 'expansion table'. The first one uses a default local byte array, in the second, the user inputs a 48 byte array of random, this is, for lack of a better description, the key to the generator. Flip just one bit in that array, and CSRG will produce a completely different random output, even with an identical seed. So, it *is* kind of like a 384 bit key, that when combined with the seed used as the initial state in each Generate(Buffer) call, adds a layer of fortification to the generator. Well, how is that really useful.. I have made an encryption software, it is a very complex software, with debugger foilers, secure erasure, erasure triggers, anti-coersion tactics, encrypted memory, double encrypted login, encrypted state/logs/database, imports raw random binary for a true OTP, and on and on.. all housed in the coolest wpf interface that I could dream up.. So, I really went kinda berzerk on this thing for the last 8 months, but now it's winding down, it's almost done, I am writing the help files.

The one thing I kept revisiting in my mind though, was that at the heart of it, it was an AES CTR, written to the 80090 spec, but still, if you know how that machine really works, and you see past the specification details, it is just a block cipher running in counter mode, and nothing much more sophisticted then that. So how do we make that better, I mean, anything coming out of AES is going to be highly randomized, and CTR especially, so it's doing a great job creating random, but..

First of all, some details about how my little xor cipher works:

For one thing, I reveal nothing in the message header. It looks like this:

Flag - just an id that tells us this is a valid message, always the same
Position - Not the actual position, but offset with a random ulong in the key header
Creator - A unique 16 byte field identifying the key used to encrypt this file
Extension - the file extension, xored with random in a key header field

The Key header looks like this:

EngineType - AES or OTP
Position - Current virtual position within the key file
KeyOffset - A random ulong, that when subtracted from the message position, reveals the 'actual' position, if the key were completely expanded
KeyID - Unique 16 byte key identifier
MessageHash - 16 bytes of random, xored with the file extension
TableKey - Here it is.. 48 bytes of random to unlock the generator

The rest of the key is just seed material, so a 1Mib key file would contain 10482 seeds, which would expand to about 1023 Mib of random, (so about a 1000 to 1 ratio). The messages are encrypted and decrypted on the fly by extracting the necessary seeds, generating the random, then xoring with the data.

So what useful information does an attacker really have? He can't tell what size the keyfile is without the keyfile itself, let alone the position, and whether the random used to xor the message starts on the 1st byte of the 1st seed, or the 317th byte of the 90th block.. (seeds build out to 102400 byte 'blocks' in my implementation), a 10 Mib file would use approx 100 seeds.. so it is only costing about 9.7Kib of key space to encrypt that file.

With thumb drives now at.. 256 Gb, you could store about 2,748,779,069 seeds.. so you could encrypt about 260 Tib of data with the key material from that one thumb drive.. so OTP or Pseudo OTP, has become a viable means of strong encryption, given the ever increasing capacity of portable storage devices.

Why am I telling you all this.. well, to show you how you can use a good random generator to build some very powerful encryption, and to show you how that cipher might be constructed. One of the most important tasks in building this cipher, is to hide any information in a message that might aid in its decryption, and perhaps the most important detail to hide is the positional offset within the key file, particularly if the key was generated via a deterministic algorithm. Ok, more about the cipher later..

How is the table built?

Just a straight AES CTR implementation, it generates 4096 bytes of random, which are copied to a Uint32 array with 1024 members..

/// <summary>
/// Transform a seed into a 1024 UInt32 array of random.
/// </summary>
/// <param name="Seed">Random seed, Fixed size: must be 64 bytes long</param>
/// <returns>Random data [byte[]]</returns>
private UInt32[] ExpandTable(byte[] Seed)
{
    const int SIZE = 4096;

    UInt32[] exTable = new UInt32[1024];
    UInt32[] exKey = new UInt32[AESEXPANDED_KEYSIZE];
    byte[] ctrBuffer = new byte[BLOCK_SIZE];
    byte[] keyBuffer = new byte[KEY_BYTES];
    byte[] outputBlock = new byte[BLOCK_SIZE];
    byte[] outputData = new byte[SIZE];

    // copy the seed to key, iv, and counter
    Buffer.BlockCopy(Seed, 0, keyBuffer, 0, KEY_BYTES);
    Buffer.BlockCopy(Seed, KEY_BYTES, ctrBuffer, 0, BLOCK_SIZE);

    // expand key
    exKey = ExpandKey(keyBuffer);

    for (int i = 0; i < SIZE; i += BLOCK_SIZE)
    {
        // increment counter
        Increment(ctrBuffer);
        // transform the counter
        Transform(ctrBuffer, outputBlock, exKey);
        // copy transform to output
        Buffer.BlockCopy(outputBlock, 0, outputData, i, BLOCK_SIZE);
    }

    // copy to table array
    Buffer.BlockCopy(outputData, 0, exTable, 0, SIZE);
    return exTable;
}

So the counter and AES key are copied from the table seed, the 16 byte counter is incremented by a single bit on each iteration, then encrypted with AES, and that transform is copied to the output buffer, and finally to the uint array.

How is the table used?

The table is used in array stretching and randomizing routines, called extraction methods, (ExtractArray16(Buffer), ExtractArray32(Buffer)), and what these routines do, is stretch an array to 64 bytes (to align with the SHA256 block size), while randomizing it by xoring with the expansion table.. The whole apparatus goes towards a self re-seeding scheme, we'll look at in a minute. ExtractArray32() looks like this:

/// <summary>
/// Expand an array with bit shifting
/// </summary>
/// <param name="SubBuffer">Buffer to expand</param>
/// <returns></returns>
private byte[] ExtractArray32(byte[] SubBuffer)
{
    UInt32[] tmpNum = new UInt32[8];
    UInt32[] arrNum = new UInt32[16];
    byte[] data = new byte[64];
    Int32 ct = 0;

    // copy first array in
    Buffer.BlockCopy(SubBuffer, 0, tmpNum, 0, 32);
    // get the first buffer table index
    UInt16 iter = ExtractShort(tmpNum[0], 10);

    // randomize the bits
    for (int i = 0; i < 8; i++)
    {
        arrNum[ct++] = ~tmpNum[i] ^ _expansionTable[iter];
        iter = ExtractShort(arrNum[ct - 1], 10);
        arrNum[ct++] = tmpNum[i] ^ _expansionTable[iter];
        iter = ExtractShort(arrNum[ct - 1], 10);
    }

    // copy it to byte array
    Buffer.BlockCopy(arrNum, 0, data, 0, 64);

    // get the hash
    return ComputeHash64(data);
}

So the buffer comes in, (the combined AES transforms of 32 bytes).. the byte buffer is copied to a temporary UInt32[8] array, and then extracted to a larger 16 member UInt32 array, which is then copied back into a byte array, and hashed with SHA256 which is the actual return value. The first array member is unary reversed and xored with the expansion table using a secret indice, which is the last 8 bits of the previous value (8 bits = 0-1023), the second value is simply xored with another table value, using the indice derived from the value before it.. and so on. So this is how new keys are generated for AES, and new 16 byte counters are created for the main method. We are using the whitening to scramble the bits before it enters SHA256, this guarantees we get a unique hash every time. Now you can see, how if the table seed was changed in any way, it would mean a vastly different offset table, and in turn, different seeds and counters created when the generator re-seeds itself.

The Generator:

There are two aspects to the generator, how the random is created, and how the re-seed mechanism works. The random is created by running 2 AES CTR transforms in tandem, each with its own unique key and 16 byte counter. Both transforms are xored together at the bottom of each loop and the resulting array is copied to the ouput buffer. Simple as that.

Chaotic ReSeed:

Along with the Key and Counter, 2 random iteration counters are running. (reSeedCounter1 and reSeedCounter2), one for each AES transform. On each iteration through the main loop, they are both tested independantly for an even division into the differential divisor, that is; when (ReSeedCounterX % 256== 0) it triggers a state reset for that AES transform. When that happens, the last 2 transforms are combined and sent to the ExpandArray32() routine, which stretches and whitens the data, then returns the SHA256 hash, which is expanded into a new AES key. The new counter is created by sending a single block to the ExpandArray16() routine, and the UInt64 reSeedCounter for that transform is created by xoring the halves of the 16 byte CTR counter. So, because the two reseed counters are random, and because both transforms have a unique reseed counter, debugging the algo is like watching a horse race.. transform1 will gallop ahead, resetting state 3 times in a row, then transform2 will catch up and surpass it, and back and forth they go.. the point is, these two AES transforms (that are xored together to create the output), are constantly and unpredictably swapping out their state.. it should all combine to make crypto analysis, well, a bit difficult.. Here's what it looks like:

/// <summary>
/// Dual-CTR Chaotic ReSeed Random Generator:
/// Generate a block of random bytes using dual AES CTR mode.
/// Uses a dual CTR xor with a chaotic state change machine.
/// </summary>
/// <param name="Size">Size of data return in bytes</param>
/// <param name="Seed">Random seed, Fixed size: must be 96 bytes long</param>
/// <returns>Random data [byte[]]</returns>
private byte[] GenerateT2(Int32 Size, byte[] Seed)
{
    // too large!
    if (Size > ENGINE_MAXPULL)
        throw new ArgumentOutOfRangeException("The size requested is too large! Maximum is "
            + ENGINE_MAXPULL.ToString() + " bytes.");
    // wrong seed size!
    if (Seed.Length != DUAL_SEED_BYTES)
        throw new ArgumentOutOfRangeException("Seed length must be " + DUAL_SEED_BYTES.ToString() + " bytes long!");

    UInt16 poolSize = 0;
    // align to upper divisible of block size
    Int32 alignedSize = (Size % BLOCK_SIZE == 0 ? Size : Size + BLOCK_SIZE - (Size % BLOCK_SIZE));
    Int32 lastBlock = alignedSize - BLOCK_SIZE;
    UInt64 reSeedCounter1 = 0;
    UInt64 reSeedCounter2 = 0;
    UInt32[] exKey1;
    UInt32[] exKey2;
    byte[] ctrBuffer1 = new byte[BLOCK_SIZE];
    byte[] ctrBuffer2 = new byte[BLOCK_SIZE];
    byte[] keyBuffer1 = new byte[KEY_BYTES];
    byte[] keyBuffer2 = new byte[KEY_BYTES];
    byte[] outputBlock1 = new byte[BLOCK_SIZE];
    byte[] outputBlock2 = new byte[BLOCK_SIZE];
    byte[] outputData = new byte[Size];
    byte[] seedBuffer = new byte[KEY_BYTES];

    const Int32 SUPPRESSOR = 32;

    // copy seed to key1, key2
    Buffer.BlockCopy(Seed, 0, keyBuffer1, 0, KEY_BYTES);
    Buffer.BlockCopy(Seed, KEY_BYTES, keyBuffer2, 0, KEY_BYTES);

    // copy seed to counter 1
    Buffer.BlockCopy(Seed, KEY_BYTES * 2, ctrBuffer1, 0, BLOCK_SIZE);
    // copy remainder to counter 2
    Buffer.BlockCopy(Seed, (KEY_BYTES * 2) + BLOCK_SIZE, ctrBuffer2, 0, BLOCK_SIZE);

    // expand AES key1
    exKey1 = ExpandKey(keyBuffer1);
    // expand AES key2
    exKey2 = ExpandKey(keyBuffer2);

    // extract the state reset counters
    reSeedCounter1 = ExtractULong(ctrBuffer1);
    reSeedCounter2 = ExtractULong(ctrBuffer2);

    for (int i = 0; i < alignedSize; i += BLOCK_SIZE)
    {
        // increment counter1
        Increment(ctrBuffer1);
        // increment counter2
        Increment(ctrBuffer2);

        // aes transforms //
        // encrypt counter1 (aes: data, output, key)
        Transform(ctrBuffer1, outputBlock1, exKey1);
        // encrypt counter2 (aes: data, output, key)
        Transform(ctrBuffer2, outputBlock2, exKey2);

        // independant random rekey mechanism for transforms 1 and 2
        if (reSeedCounter1 % _differential == 0)
        {
            // Reset State 1 //
            // copy the 2 transforms to the new seed
            Buffer.BlockCopy(outputBlock2, 0, seedBuffer, 0, BLOCK_SIZE);
            Buffer.BlockCopy(outputBlock1, 0, seedBuffer, BLOCK_SIZE, BLOCK_SIZE);
            // create new key via sha256
            keyBuffer1 = ExtractArray32(seedBuffer);
            // expand the aes key
            exKey1 = ExpandKey(keyBuffer1);
            // create a new counter
            ctrBuffer1 = ExtractArray16(outputBlock2);
            // get the int64 reseed counter
            reSeedCounter1 = ExtractULong(ctrBuffer1);
        }
        // re-state when remainder of modulo with counter is 0
        if (reSeedCounter2 % _differential == 0)
        {
            // Reset State 2 //
            // copy the 2 transforms to the new seed
            Buffer.BlockCopy(outputBlock1, 0, seedBuffer, 0, BLOCK_SIZE);
            Buffer.BlockCopy(outputBlock2, 0, seedBuffer, BLOCK_SIZE, BLOCK_SIZE);
            // create new key via sha256
            keyBuffer2 = ExtractArray32(seedBuffer);
            // expand the aes key
            exKey2 = ExpandKey(keyBuffer2);
            // create a new counter
            ctrBuffer2 = ExtractArray16(outputBlock1);
            // get the int64 reseed counter
            reSeedCounter2 = ExtractULong(ctrBuffer2);
        }

        // Entropy collector //
        // number of bytes to pass to entropy collector
        // determined by last 4 bits of output block 2
        if (reSeedCounter1 % SUPPRESSOR == 0)
        {
            // get the size
            poolSize = ExtractShort(outputBlock2, 4);
            if (poolSize > 0)
            {
                // copy to array
                byte[] pool = new byte[poolSize];
                Buffer.BlockCopy(outputBlock1, 0, pool, 0, poolSize);
                // send the bytes to the entropy pool
                EntropyPool(pool);
            }
        }
        // collect from second block
        if (reSeedCounter2 % SUPPRESSOR == 0)
        {
            poolSize = ExtractShort(outputBlock1, 4);
            if (poolSize > 0)
            {
                byte[] pool = new byte[poolSize];
                Buffer.BlockCopy(outputBlock2, 0, pool, 0, poolSize);
                // send the bytes to the entropy pool
                EntropyPool(pool);
            }
        }

        // xor the two transforms
        for (int j = 0; j < BLOCK_SIZE; j++)
            outputBlock1[j] ^= outputBlock2[j];

        // copy to output
        if (i != lastBlock)
        {
            // copy transform to output
            Buffer.BlockCopy(outputBlock1, 0, outputData, i, BLOCK_SIZE);
        }
        else
        {
            // copy last block
            int finalSize = (Size % BLOCK_SIZE) == 0 ? BLOCK_SIZE : (Size % BLOCK_SIZE);
            Buffer.BlockCopy(outputBlock1, 0, outputData, i, finalSize);
        }

        reSeedCounter1++;
        reSeedCounter2++;
    }
    return outputData;
}

Table Rotation

So we have the expansion table, which as you now know, is used in the expansion routines where new keys and counters are created for the internal state transitions. Well, what if we could change that table on the fly, repeatedly filling it with new offset values at irregular intervals? If you look in the code above, you'll see a few lines near the bottom of the routine beneath the comment 'Entropy collector'. What this code does, is grab a few bytes and pass them to a method called EntropyPool(Buffer). There is a clause for each of the two transforms, it uses the (reSeedCounterX % SUPPRESSOR == 0) modulus to slow it down, and it gets the number of bytes to send to the entropy pool, from the last 4 bits of the other transform. It may pass between 1 and 15 bytes to the pool, the number dependent on whatever those 4 bits might be.. So it's just stealing a few bytes here and there, and about every 4000 bytes of output created, (adjustable via the SUPPRESSOR constant, which is set to 32), it will trigger a rebuild of the expansion table, giving it a whole new set of values. Here's what the EntropyPool() method looks like:

/// <summary>
/// Collects entropy for the expansion table rotation
/// </summary>
/// <param name="Data">byte array, length 1-128</param>
private void EntropyPool(byte[] Data)
{
    int count = Data.Length;

    // to max 128 bytes
    if (count + _entropyCount > 128)
        count = 128 % _entropyCount;

    // copy it in
    Buffer.BlockCopy(Data, 0, _tableEntropy, _entropyCount, count);
    // update the counter
    _entropyCount += count;

    // triggers expansion table rotation
    if (_entropyCount == 128)
    {
        byte[] data = new byte[64];
        byte[] tableSeed = new byte[48];

        // copy the poll to 64 byte buffer
        Buffer.BlockCopy(_tableEntropy, 0, data, 0, 64);
        // get hash from sha256
        byte[] entropy = ComputeHash64(data);
        // copy to the tables seed
        Buffer.BlockCopy(entropy, 0, tableSeed, 0, 32);

        // do the same for the next 64 bytes
        Buffer.BlockCopy(_tableEntropy, 64, data, 0, 64);
        entropy = ComputeHash64(data);
        // copy to table seed
        Buffer.BlockCopy(entropy, 0, tableSeed, 32, 16);

        // create a new expansion table
        _expansionTable = ExpandTable(tableSeed);
        // reset the counter
        _entropyCount = 0;
    }
}

Nothing too complicated here.. the bytes are copied to a class variable_tableEntropy, and when the bytes number 128, the two halves of that array are passed to SHA256, and their returns build the 48 byte generator key, which is then used to rebuild or 'rotate' the expansion table.

So let's summarize an analysis of our little xor cipher thus far:

The bad guy, gets a hold of a 10Mib message, and from that he can deduce:

Nothing. He doesn't know if the random was extracted from the first seed in the key, or the last, the first byte of random derived from that seed, or the last byte. He doesn't know the when, what, or how an unpredictable internal state machine has swapped out keys and counters. He doesn't even know what type of file it was!

That 10Mib file would require 100 seeds to build, so the attacker would have to guess: AES(key + counter) 2 * 100 times to read that file, -but, you also have to add in the internal random state resets. With a modulo of 256 for the seedcounters, it seems to reset about 100 times when generating a 102400 byte block. So if the state auto-resets 100 times per 102400 bytes, and a hundred different seeds are used to encrypt that 10Mib file, that would be 10,000 internal state resets.. so, unwinding the random data would require that you break AES more then 10,000 times.. not to mention it's actually the xor from *2* different AES transforms, and he's not even examining the random itself, he'd actually be looking at the random after a binary xor with the message data.. This is also amplified by the size of the file itself, 100Mib file, 10,000 seeds, 1000Mib file, 100,000 seeds (1 million times!!!).. Keep in mind, that any computer attempting to 'solve' the random, would also have to reproduce the exact conditions of this algorithm; building the table, key and counter expansions, passes through AES and SHA256, this would involve thousands of processer cycles per iteration.

Actually I had time to test the ratios of internal resets to differential today (avg. over 1000 runs, each with a unique seed and expansion table seed):

Avg. Differential/Internal State Changes per 102400 block

1024 : 17.6
512 : 43
256 : 94
128 : 191
64 : 388
32 : 770

Now you might think, but if you could *guess* 2 keys/counters from a seeded block, you could then get the hashes and construct the next key/counter and roll through the block that way.. but remember the expansion table? Ahh.. without the 48 byte key used to build that randomizing table, there is no way you could get the correct hash value for the next key and counter, see how it all ties together?

Now I know what the next question is, 'yeah, but what about throughput?' Granted, the resolution is turned up pretty high, but AES is very fast.. I can create 100 Mib of random with CSRG using this resolution in about 24 seconds on my amd quad-core, add the file writes and one additional xor and you are still looking at about 200 Mib per minute, very reasonable considering the kind of security this could provide. But I may turn in down, add a 0 to the build size and bring up the reseed modulus; (the 'Differential' property in the class), to 512, which would probably near double the throughput, (I'll have to test it, and weigh the tradeoffs..).

If the data produced by this process is sufficiently random, (and all the statistical tests indicate that it is), the computational power required to unwind the random, (which would be the necessary step in decrypting any data xored with that random), would involve such an enormous number of calculations, as to be well beyond the capabilities of computing systems, now, and for the forseeable future. The ideas I am setting forth here are essentially a recipe for creating an encryptor that in all practical terms, can create unbreakable encryption. A One Time Pad created using deterministic machines..

So far, this is as good an xor cipher as I can build, but I know it could be improved upon, and I definately welcome any advice on making it better, (I see SHA as the 'weak link' in this implementation, with only 2 table runs on 64 bytes, block and finalizer.. and the expansion methods could do with a little more whitening). So if you know a better way, a way in which these methods, or anything else could be improved upon, do speak up, I'm listening..

So, why so much security, do I think AES is broken? Doubt it, wouldn't use it if I did.. Do I think that there may be methods by which a brute force attack against AES can be 'steered' somehow? Maybe, and if not, you can bet someone is working on it right now. As it stands though, brute forcing a 256 bit key, particularly if a mode like CBC is used, is virtually impossible.. but some people think quantum computing may change all that..

What I set out to do here, was create the engine for a One Time Pad implementation. Remove any chance that a brute force attack could unwind the random, and, if there are discreet patterns introduced into the output using an AES counter implementation, to mitigate those patterns by creating a short period between state changes. I also felt that making the state changes erratic would reduce any pattern visibility, and combining 2 transforms, with the overlap between key exchanges, and resulting loss in byte information, would make any patterns even less distinct.

This was not my first implementation, I also tried xoring AES transforms with Camellia and Twofish, but both have elaborate key expansion routines, so it really drags, this was the best in keeping with the 'Must be fast' criteria. I left those experiments in the class LokiOld.cs in case you want to play with it yourself.

CSRG API

Constructors:
public CSRG()
Loads the default table seed.

public CSRG(byte[] TableSeed)
Loads user defined table seed. Must be 48 bytes of random.

Properties:
public EngineModes Engine
Select the engine type: Dual or Single AES CTRs.

public int OutputMaximum
Maximum output from a single seed in bytes.

public int SeedSize
Required Seed buffer size.

public Differentials StateDifferential
Changes the chaotic reseed differential. Smaller values mean more aggressive reseeding.

Methods
public byte[] Generate(Int32 Size, byte[] Seed)
Generate a block of random bytes using property defined parameters.
Size: Size of data return in bytes.
Seed: Random seed, Fixed size: must be 64 bytes long.
Returns: data byte[]

public byte[] Generate(Int32 Size, byte[] Seed, EngineModes Engine)
Generate a block of random bytes.
Size: Size of data return in bytes.
Seed: Random seed, Fixed size: must be 64 bytes long.
Engine: The AES engine type, single CTR, or Dual xored CTRs.
Returns: data byte[]

public byte[] Generate(Int32 Size, byte[] Seed, EngineModes Engine, Differentials Differential)
Generate a block of random bytes.
Size: Size of data return in bytes.
Seed: Random seed, Fixed size: must be 64 bytes long.
Engine: The AES engine type, single CTR, or Dual xored CTRs.
Differential: The reseed modulus differential. Smaller values mean more reseeds.
Returns: data byte[]

SHA256

SHA256 is now completely inline with the CSRG class as an optimized method that only hashes on blocksize boundaries, (64 bytes). This was one of my design criteria, no Microsoft crypto Api in the class. It's still SHA256, a rewrite of the bouncycastle implementation, but the internal method ComputeHash64(), is now an internal method with a fixed input size. I did however, include the full class with the project, SHA256Digest.

SHA256Digest API

The SHA256Digest class is a rewrite of the bouncycastle class, making it a stand alone class, with some optimizations and refactoring. The output was tested against the criteria in the Secure Hash Standard Documentation. I added a ComputeHash method to align it with the windows implementation, and a Dispose mechanism for the class. A more compact method with a fixed input size of 64 bytes is contained in the CSRG.cs class. SHA256Digest class is in the 'Experiments' folder of this project.

public void BlockUpdate(byte[] Input, int InputOffset, int Length)
Used if doing generating a hash for a file or large output streams. Call DoFinal to retrieve the hash value once all of the data has been processed. Input blocks passed to this method must be block aligned, exactly 64 bytes in length.
Input -bytes to process.
InputOffset -starting position within Input.
Length -length of data to process.

public byte[] ComputeHash(byte[] Input)
Single call method (as used in the windows implementation). Input the bytes for the hash input.
Input -bytes to process.
Returns -byte[] value hash.

public Int32 DoFinal(byte[] Output, Int32 OutputOffset)
When using the BlockUpdate() method, this is the last method to call after the data is processed, and can contain any size as the Output. The Output is then changed to the hash value itself.
Output -last block of data to process, becomes the hash return in bytes.
OutputOffset - The starting position within the Output array.
Returns -bytes processed.

public void Reset()
Resets the internal state of the algorithm.

public void Update(byte Input)
Updates the message digest with a single byte.
Input -one byte of data to process.

public void Dispose()
Disposes of internal variables. Called when class is to be destroyed.

Internal AES Implementation

Advanced Encryption Standard (AES) methods in the CSRG.cs class are a one way encryptor, with a fixed key size of 256 bits. It has been optimized to perform only the minimum number of calculations necessary given those parameters. It is based on a well known implementation used in the mono library, which in turn I believe was created using the Novell implementation. The two main methods are Transform, which encrypts a 16 byte block of data, and ExpandKey, which expands a 32 byte key to a 60 member UInt32 array. The output was tested against the specification tests contained in the AES Standard document published by NIST, (a good read if you want to understand AES better).

private void Transform(byte[] InData, byte[] OutData, UInt32[] Key)

private UInt32[] ExpandKey(byte[] Key)

NIST Statistical Test Suite

Well, what can I say, I just got tired of working with the command console and doing one test at a time, so I made the NIST Statistical Test Suite into a library; added the exports, remmed the fprints and wrote a couple of default methods in utilities, (but I didn't change anything computational!). So now, you can run this as a C library using PInvoke. All the api is in the class NistSTS.cs, and it's all well commented.

The original NIST Statistical Test Suite console application is available here, the manual is here. Many thanks to the authors of this great tool, the code inside that application is very impressive, and well worth a look..

The NIST test suite is one of the most authoritative test suites for assesing the entropy from psuedo random generators, as far as statistical testing goes, it's very thorough, and well regarded in the cryptogrphic community, (it was sanctioned by NIST, and they define the American standards for cryptography). So, good deal, (that was a lot of work!). I built a nifty little test bed, it can compare CSGR to RngCrypto, the bouncycastle CTRSP800DRBG, (which I converted from java to C#, and made the fewest possible changes to make it work, mostly just remming TDEA stuff). It can also get true random data from random.org so you can establish a baseline, (but there is a 1MB limit of free random per day!). Apparently they get their data from radio towers picking up atmospheric noise.

So, if you run the tests, you can see CSRG passes consistantly, and P-Values are all in good margins, (most of them are good at near 0.5, except the 2 frequency tests are better at 0.6-0.8, the longest run your aiming for around 0.2, and approximate entropy should be a bit low as well, say 0.1-0.3 is best). I added exerpts from the manual to the help file, so you can get a better sense of input recommendations and result expectations, if you want more then that, read the manual. You'll notice a lot of p-value fluctuations (from all the generators, even random.org), that's normal, you have to run many thousands of tests before you can establish trends, (just remember; SUCCESS=good, FAIL, bad..). Some of the tests should only be run on large samples, (1Mib or more), or they will throw an OUT OF RANGE exception, that's why those tests are disabled until you choose a larger size.

Except for the Linear Complexity tests, (which have no pass score and are always rated N/A), you'll see a SUCCESS or FAIL for each test and the calculated p-values. Now it's normal to see the occasional FAIL, you see that even with raw random data. Also keep in mind, the larger the sample size, the more values have time to converge on their statistical averages, so bigger builds give a more accurate read on a generators output entropy levels. If you see a significant amount of failures, or groupings of failures, or a particular test fails consistantly, well, then its back to the drawing board ;o)

I've noticed all AES CTR based generators seem to have some bias in certain tests, (just as SHA/G, BBS and every other generator do in other tests), and the more tests you perform, the more the trends become apparent in the outcome, signaling the weaknesses in a particular algorithm. Overall, I'd have to say CSRG tests are coming out very good, at least as good as any CTR generator I've tested, and that it meets or exceeds all of my criteria for the project, so I'll probably end up using some variant of it for my xor cipher..

The focus of the test is the longest run of ones within M-bit blocks. The purpose of this test is to determine whether the length of the longest run of ones within the tested sequence is consistent with the length of the longest run of ones that would be expected in a random sequence. Note that an irregularity in the expected length of the longest run of ones implies that there is also an irregularity in the expected length of the longest run of zeroes. Therefore, only a test for ones is necessary.

Conclusion and Interpretation of Results

For the example in Section 2.4.8, since the P-value ≥ 0.01 (P-value = 0.180609), the conclusion is that the sequence is random. Note that large values of χ2(obs) indicate that the tested sequence has clusters of ones.

Input Size Recommendation

It is recommended that each sequence to be tested consists of a minimum of bits as specified in the table in Section 2.4.2.

Returns: P Value.

Binary Matrix Rank Test

Test Purpose

The focus of the Overlapping Template Matching test is the number of occurrences of pre-specified target strings. Both this test and the Non-overlapping Template Matching test of Section 2.7 use an m-bit window to search for a specific m-bit pattern. As with the test in Section 2.7, if the pattern is not found, the window slides one bit position. The difference between this test and the test in Section 2.7 is that when the pattern is found, the window slides only one bit before resuming the search.

Interpretation of Results

Since the P-value obtained in step 4 of Section 2.8.4 is ≥ 0.01 (P-value = 0.274932), the conclusion is that the sequence is random. Note that for the 2-bit template (B = 11), if the entire sequence had too many 2-bit runs of ones, then: 1) ν5 would have been too large, 2) the test statistic would be too large, 3) the P-value would have been small (< 0.01) and 4) a conclusion of non-randomness would have resulted.

Input Size Recommendation

The values of K, M and N have been chosen such that each sequence to be tested consists of a minimum of 10⁶bits (i.e., n ≥ 10⁶). Various values of m may be selected, but for the time being, NIST recommends m=9 or m = 10. If other values are desired, please choose these values as follows:

n ≥ MN.

N should be chosen so that N • (min πi) > 5.

If the P-value were < 0.01, this would have indicated that the observed frequency counts of Ti stored in the νI bins varied from the expected values; it is expected that the distribution of the frequency of the Ti (in the νI bins) should be proportional to the computed πi .

Input Size Recommendation

Choose n ≥ 106. The value of M must be in the range 500≤ M ≤ 5000, and N ≥ 200 for the χ2 result to be valid.

Params M: block size, default is 512.
Returns: P Value.

Serial Test

public static extern double NistSerial(int M);

The focus of this test is the frequency of all possible overlapping m-bit patterns across the entire sequence. The purpose of this test is to determine whether the number of occurrences of the 2mm-bit overlapping patterns is approximately the same as would be expected for a random sequence. Random sequences have uniformity; that is, every m-bit pattern has the same chance of appearing as every other m-bit pattern. Note that for m = 1, the Serial test is equivalent to the Frequency test of Section 2.1.

Interpretation of Results

Since the P-value obtained in step 5 of Section 2.11.4 is ≥ 0.01 (P-value1 = 0.808792 and P-value2 = 0.670320), the conclusion is that the sequence is random. Note that if ∇2ψ2m or ∇ψ2m had been large then non-uniformity of the m-bit blocks is implied.

Input Size Recommendation

Choose m and n such that m< log2 n -2.

Params M: block size, default is 16.
Returns: P Value.

The Tests

To test a DRBG, you need something to compare it with to get an idea of what good numbers really are. Well, we have RNGCrypto, but in C# anyways, not much else.. There is an outline proposed in the NIST document 80090A, from which several C and java versions emerged. Bouncycastle's java version includes the class CTRSP800DRBG, which I have re-made into a C# version for these tests, (now you have 2 Drbg's in C#!).

Both the CTRSP800 and RngCrypto produce excellent random scores in all the tests I have run in Diehard, Ent, and the Nist Statistical Test Suite, so they all appear to be good random generators, (keep in mind though, that in order to properly guage which algorithm delivers the best entropy, you have to test many thousands of samples in order to smooth out small fluctuations). I have run the CSRG output through the Nist's tests many thousands of times, using various sizes from 10Kib to 100Mib, and it consistantly passes with excellent results. Given the criteria I have for my project, the best choice is CSRG, because of the complexity of the restate mechanism, and the inherent unpredictability of internal state changes, it would be nearly impossible to unwind, while still returning excellent random output at good speed. I'm still testing it, but that's ramping down now, as the results remain constant and within metrics I established for the output tests.

Now for the inevitable caveat: though I researched this all carefully, and it is built upon an accepted method of producing cryptographically secure psuedo random, (AES CTR), I am not a cryptographer, so use at your own risk. It takes a lot of testing before any new algorithm becomes accepted into general use, keep that in mind, test it yourself, figure out how it works, and then make an informed choice.

I approached this project not as a cryptographer, but as an engineer, and as such broke it down into a series of logical problems to solve, and I think the result is a clever design, and something with potential. Anyways, send me an email if you have any ideas about improving the design, or a comment if you have a question..

cheers

Updates

What's with all the freaking updates, John? Ok, ok! I think I'm finally done now.. have fun (July 20)

July 14 - Renamed 'Loki' algorithm to CSRG, cleaned up the code.

July 19 - Added new test harness with NIST STS library included, added a new test harness, and a second generator method with expanded property and method options.

July 20 - Expanded article with api definitions, removed old content. Expanded the help file to include CSRG api definitions. Probably the last update..

July 24 - Added entropy pool and expansion table rotation methods. Increased DUAL seed size to 96 bytes. Upped recommended block size to 100Kib.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

John Underhill

Network Administrator vtdev.com

Canada

Network and programming specialist. Started in C, and have learned about 14 languages since then. Cisco programmer, and lately writing a lot of C# and WPF code, (learning Java too). If I can dream it up, I can probably put it to code. My software company, (VTDev), is on the verge of releasing a couple of very cool things.. keep you posted.

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.