Stats
44.1K views 2.5K downloads 47 bookmarked
Posted
7 Jan 2012

Comments and Discussions



It may actually prove very useful as a source of entropy in a secure random number generator that is then used for other purposes. Other sources of entropy would still be needed, but even if one of those were compromised by a malicious party (or through incompetence), using a CA and its statistical randomness would at least partially, if not completely, mitigate the damage from that compromise.
This is a very interesting thing, and may lead to many other cool things.
So have a 5.
What do you get when you cross a joke with a rhetorical question?
The metaphorical solid rearend expulsions have impacted the metaphorical motorized bladed rotating air movement mechanism.
Do questions with multiple question marks annoy you???





Thanks! I had a big long discussion with someone who missed this point entirely and just wanted to focus on the insecure parts of the (quite simplistic) implementation. Really glad that at least someone has seen it as a discussion of a potential technology, rather than a prepackaged solution or a way to cheat at school or work.
I am actually in the process of producing a set of secure RNGs for .NET: something that's far more secure than the built-in "Random" class, but which, unlike RNGCryptoServiceProvider, can be seeded to reproduce the same sequence. I currently have three methods for this: repeatedly rehashing the seed using a hash algorithm like SHA, iterating Fibonacci sequences with the BigInteger class, and running the Fredkin rule on a CA grid.
Now, forgive my waffling, but thinking about this was actually a good creative process and I just worked out a use for a seeded RNG: I doubt it's at all original but it just crystallised in my head.
Create a table with 256 entries, one column is sequential, the other random.
var rng = new RandomNumberGenerator(keyBytes);
var entries = from i in Enumerable.Range(0, 256) select new { INDEX = (byte)i, VALUE = rng.NextByte() };
You then order the table by the random column, and output the index values as a byte array.
byte[] encoder = (from e in entries orderby e.VALUE select e.INDEX).ToArray();
This produces a random sequence, encoder[] with one occurrence of every possible byte value. You can then do a transform by:
outputByte = encoder[inputByte];
You could also make it a bit more tricky by rolling over the lookup like the rotors on an enigma coding machine:
outputByte = encoder[(inputByte + offset) % 256];
assuming 'offset' is the current byte position from the start of the input stream.
The lookup could be inverted to produce a decoder:
for (int inputByte = 0; inputByte <= 255; inputByte++)
{
    decoder[encoder[inputByte]] = (byte)inputByte;
}
If you recreate the key for every 256 bytes of data, the same input will never produce the same output... and you could use almost any long-enough sequence of bytes to produce the encoder.
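For what it's worth, the whole table scheme can be sketched in a few lines. This is a hypothetical Python sketch, not the .NET implementation: `make_encoder` derives a pseudo-random rank per index by hashing the key, standing in for the seeded RNG described above.

```python
import hashlib

def make_encoder(key: bytes) -> list:
    # Rank each index 0..255 by a pseudo-random value derived from the key,
    # then sort: the result is a key-dependent permutation of all byte values.
    rank = {i: hashlib.sha256(key + bytes([i])).digest() for i in range(256)}
    return sorted(range(256), key=lambda i: rank[i])

def invert(encoder):
    # Build the decoder table so that decoder[encoder[b]] == b for every byte b.
    decoder = [0] * 256
    for b, e in enumerate(encoder):
        decoder[e] = b
    return decoder

def encode(data: bytes, encoder) -> bytes:
    # 'Rotor' variant: roll the lookup over by the byte's stream offset.
    return bytes(encoder[(b + i) % 256] for i, b in enumerate(data))

def decode(data: bytes, decoder) -> bytes:
    # Undo the rotor: look up, then subtract the offset modulo 256.
    return bytes((decoder[b] - i) % 256 for i, b in enumerate(data))
```

Round-tripping `decode(encode(msg, enc), invert(enc))` returns the original message; recreating the table every 256 bytes, as suggested, would additionally make the substitution block-dependent.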





I may try to plug this into tinhat random.
I noticed the discussion you mention below, and yes, that guy seems to be rather negative.
Again, this is a very interesting idea.





So, I was reading your explanation of how you managed to generate a purely random key to encrypt your message.
Although I did not download or use the program, at first glance I just wanted to leave a comment.
As I understand it, you used images to generate the key; I might be mistaken.
Images are fairly random: the black and white pixels can be read off to generate a binary key.
You must have used a simple 250x250 image, or required images within a specific size range to be accepted by the program.
What if a user generated keys by randomly filling an image with 0s and 1s (white and black pixels) and tried each of them? It's purely theoretical, but if an attacker used an algorithm to generate a different image of random pixels each time, then after a while one of the keys generated from those images would decrypt your message.
It would take a lot of time, yes, but it is not impossible at all.
Hands down, using images to get random keys is very impressive. I had never read about it before, but I might use it in some way after this.
But brute-force attacking with random keys from generated random images could actually crack your encryption, so I might say the encryption is not the powerful aspect; the key generation is.
Again, sorry for the speculation, but I liked it so much I had to comment.
Regards,
Ibrahim El Sayed





Hi
Thanks for your comment.
I haven't actually thought about this much in a couple of years.
In every encryption scheme, the weak part is the key (unless the encryption has been mathematically cracked, which is not often).
The standard method for generating keys in cryptography these days is to take a password (a small, easily remembered sequence of bytes that can be typed on a keyboard), then hash it and use the result as the seed and initialization vector for the algorithm.
Most people use a password of fewer than 9 characters. This actually leaves a much smaller domain for a brute-force algorithm to crack than even a small 8 bpp (bits per pixel) B&W image (50x50), which would be equivalent to a 250-character password.
A 32 bpp 50x50 image (32 bpp being a standard full-colour image format these days) would be equivalent to having a password 1000 characters long; this would take, using current supercomputers and brute force, about a million years to crack.
Most people wouldn't use/couldn't remember a 250-character password, but they might be able to remember which image they used as the key.
The real weakness of using an image key is that on the user's computer there might be only a few thousand images; if an attacker was able to access that computer, they would have a problem domain of only the images on the computer to search, which is much smaller than even a 6-character password.
I recently did a course in computer security, offered by an Australian company called SenseOfSecurity.
One thing that stood out was this: brute-force cracking of passwords won't look past about 10 characters, so
"TheQuickBrownFoxJumpedOverTheLazyDog"
is actually a much, much more secure password than, say, "a!f$G**6!F", simply because of the length, even though it uses dictionary words.
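The arithmetic behind that advice is easy to check. A rough sketch (it ignores the fact that dictionary structure lets attackers search phrases rather than characters, so it overstates the passphrase's real-world strength, but the length effect is the point):

```python
import math

def search_space_bits(alphabet_size: int, length: int) -> float:
    # Bits of brute-force work: length symbols, each drawn from alphabet_size.
    return length * math.log2(alphabet_size)

# 10 characters drawn from ~95 printable ASCII symbols:
complex_short = search_space_bits(95, 10)   # roughly 66 bits
# 36 letters, even restricted to a single case:
simple_long = search_space_bits(26, 36)     # roughly 169 bits
```

By raw character count the long phrase wins by about a hundred bits; a phrase of common dictionary words has far less effective entropy than this naive figure, though still typically more than ten random-looking characters.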
I must admit that when I wrote the original project, I was under the illusion that bespoke encryption would be an advantage over commercially available encryption, because no one would be expecting it. But it's a law of system security that you have to assume anyone targeting your encryption has complete access to the source code and algorithm used to generate it (like, e.g., if you post the code on CodeProject). That means it's always a bad idea to roll your own encryption, because it will never be as good as the stuff produced by maths geniuses who work their entire careers to come up with it.





Hi,
I want to know how this algorithm works.
Please explain the algorithm.





Hi,
The entire article is about how this algorithm works, and the attached VS project contains the actual source code for the whole thing. It is very well commented; even if you are not familiar with C#, the comments should provide enough information to understand how it works.
I can't explain it any better than what has already been posted.
I suggest you read about:
Cellular Automata, Conway's Game of Life & simple XOR encryption





Hi Simon... I have read your article and it was very interesting, but there are some things I didn't understand, like Generations, Size and that QR-code image... does it encrypt things using these? Can you give me a rough idea of how this project works? I understood the second part, about encrypting and decrypting a file; I'm only confused about Generations, Size, Pass code and that QR-code-type image part.
Please help, Simon........ :(
modified 8Mar12 11:51am.





Hi. Sorry this is months late... I guess I'm not receiving notifications any more.
I'm not sure how much I can explain... the "QR code"-like image is the representation of the cellular automaton's two-dimensional world. The black bits are living cells, the white bits are "dead" cells.
The generations value indicates the number of iterations of the cellular-automaton rules. In this case, any cell that has an even number of living neighbours will be alive in the next generation; any other cell will have 'died'.
By running these rules over and over, an unpredictable pattern emerges that is unique to the starting grid pattern and the number of generations that have been run.
The pattern is then read as a series of ones (alive) and zeros (dead) to generate a sequence of binary values that are used as the key for simple XOR encryption.
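A minimal sketch of that pipeline, in Python. Note that the grid rule here is the commonly cited parity variant (a cell lives iff it has an odd number of live orthogonal neighbours, with wrap-around edges), which may differ in detail from the article's exact rule:

```python
def step(grid):
    # One CA generation under a parity rule, on a toroidal (wrap-around) grid.
    h, w = len(grid), len(grid[0])
    return [[sum(grid[(r + dr) % h][(c + dc) % w]
                 for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))) % 2
             for c in range(w)]
            for r in range(h)]

def key_stream(grid, generations):
    # Run the rule, then read the grid row by row as bits -> key bytes.
    for _ in range(generations):
        grid = step(grid)
    bits = [cell for row in grid for cell in row]
    return bytes(int("".join(map(str, bits[i:i + 8])), 2)
                 for i in range(0, len(bits), 8))

def xor_crypt(data, key):
    # Simple XOR: applying it twice with the same key restores the data.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))
```

Both parties who know the starting grid and the generation count reproduce the same key bytes, so `xor_crypt(xor_crypt(msg, k), k)` round-trips to the original message.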
The idea of the article was to discuss the possibilities, benefits, drawbacks and limitations of using cellular-automata simulations to generate repeatable encryption keys. It does not purport to be a new form of encryption (there are dozens of commercially available CA-based encryption systems in existence).





Hi sir... thanks for the reply... I did research CA on the wiki and many other sites... came to know it's not a QR code... and I appreciate your work.... have a nice day.....





Ultimately it may not be a good encryption scheme, but the concept is interesting and worth taking the time to understand.





How did you test the encryption strength? By the way, it seems like your software has excellent concepts; I really like it!





It seems that according to the experts, the encryption isn't particularly strong.
I'm not a cryptographer, so my tests were on the statistical distribution of the values of the generated data, using the DIE HARD battery of tests.
My other test was to post an encrypted string and see if anyone could decode it.
It doesn't look like anyone is going to bother though.
I did briefly consider getting a prepaid Visa card, encrypting the card number and posting that string as an incentive.





Okay, so what is DIE HARD? Being a non-cryptographer as well, I am not familiar with that term.
Encryption is awesome!





The Diehard series of tests is supposed to evaluate the level of randomness in a ~10 MB block of data.
Basically, it performs a series of little simulations, like using the random data to assign parking-bay numbers to virtual cars and seeing how many times a 'car' is parked on top of another one. It's pretty heavy stuff; I don't really understand most of the tests it does.
Apparently it isn't really up to testing for crypto-strength randomness, but there doesn't seem to be anything else available.
http://stat.fsu.edu/pub/diehard
http://en.wikipedia.org/wiki/Diehard_tests





It's a good suite of statistical tests. It shows that there are no obvious correlations in the data: if it passes, the data looks nicely random, and would suffice for most purposes that need good-quality random-looking data.
What it does not show is that there are no correlations in the data at all (e.g. after you transform it in some way). Since this is what an attacker is looking for, simply passing statistical tests of randomness does not suffice. In cryptography you are up against active attackers.






I really like the implementation. It is very interesting and intriguing!





Just don't use it for data you really care about!





Absolutely fantastic. A program does not always have to be the final product. Everything we share here gets bettered by someone. Human beings didn't have cell phones 50,000 years back, so it's fine. Great job coding. I enjoyed it, mate!





... rating as a three for doing it in such an entertaining way!
The CA grid is built using a completely insecure random function, or a bitmap. Iterating this data over generations of a cellular automata does not add any new entropy. It's as simple for an attacker to perform the CA iterations as it is for you to generate your key material.
The security, such as it is, rests entirely on the initial state of the grid. The CA adds nothing to the security. You could replace cellular automata with, say, a hash function and get just as good security, which is to say, very little. I'm afraid you have not suddenly discovered a new way of generating secure cryptographic material that has so far eluded trained mathematicians and cryptographers!





If it's so insecure, you should have absolutely no trouble decrypting the example I gave.
Like I said, put your money where your mouth is; then you get to claim it's insecure.





Sorry, it's fun code, but don't you see why the OP said what he did? That's not to say it is trivial to brute-force: the original message is 2912 bits, which would necessitate a correspondingly large key size. But any security here doesn't come from the CA.
The OP is entirely correct.
That said, voted 5 because its a fun app.
Pete
Insert Sig. Here!





I never made any claims that this was unbreakable, or even a new idea. It's an interesting concept that has been around for at least ten years. Myself, I can't think of how you could break it, so I was hoping someone might offer some actual proof one way or the other. It's very easy to say something is insecure, but I don't see anyone with a decryption of the example yet.
The guy may be completely correct, but he offers no proof, nor even a cogent argument. I stand by my statement: if it is completely insecure, then logically it must be a trivial matter to prove it.





I thought the argument against the CA adding any real security was fairly cogent, but I'll expand a bit:
1) You take some initial set of not-very-secure pseudo-random data, or a bitmap.
2) You apply a deterministic function to it iteratively.
3) No new entropy (i.e. randomness) has been added to the initial data, even though it now looks entirely unlike the original data.
4) The security of your encryption rests on the initial data, not how you subsequently processed it.
You are essentially creating a one-time pad, but without the data in the pad being very random. This is known to be insecure: http://en.wikipedia.org/wiki/One-time_pad#Problems
The fact that you processed your initial seed using a CA iteratively does not materially increase the amount of work (i.e. by many, many orders of magnitude) which would need to be done by an attacker. It would essentially be just as difficult if you had just left out the CA part entirely.
I would not be able to crack your example without a lot of effort, and possibly not even with a lot of effort. I am not a cryptographer or mathematician, although I have studied cryptography at Master's level, so I know a bit about my limitations here. It may not be possible for anyone to crack your example at all, even if they were inclined to expend the effort.
The onus of proving security is on the inventor of an algorithm; it does not fall on everyone else to prove it is insecure. Posing cracking contests is a particular red flag (have a look at http://www.schneier.com/crypto-gram-9902.html#snakeoil ).





Excellent, some actual arguments!
I don't have a master's degree in Cryptography, so I'm probably not qualified to offer any real counter points, but I'll have a go.
I agree the solution is entirely dependent on the initial configuration of the grid. This, I believe, is the main point of the method: assuming your encrypted information is meant to be read by the intended recipient, they need to be able to reproduce the key sequence that will reveal the data.
The CA Algorithm may not significantly increase the work required to decode the data, but it does allow you to generate a much larger data set than the size of the initial grid configuration. Without it, you would have to repeat your key (bad), or be restricted to a few MB of data (limited). Also, unless you match the exact number of generations, the key is invalid, and you can run almost any number of generations.
The initialization method that uses the pseudorandom number generator isn't very secure, it was just quick and easy. Using an image as an encryption key is a valid approach that is actually used in cryptographic systems. By combining the CA with a reference image, you can encode a lot more data than you could using just the reference image, without duplicating the key data.
I was hoping someone might offer some insights into breaking the encryption without using brute force; until that happens, let's just look at how it might be solved simply by using brute force (for anyone reading who doesn't know, that means testing every possible key combination until you find the correct one):
The number of possible starting grid combinations is beyond enormous. Even with the limitations of my simple implementation, there is a possibility of having a starting encryption key that is up to 16777215 bits in length. This could easily be expanded.
So, that is 2^16777215 possible key combinations, and that is assuming a fixed width and height to the grid. Considering that the grid doesn't have to be square, that is 2^16777215 * 2^16777215 different initial grid combinations (a conservative estimate at best), and this is before we have run any rules on it. There are also numerous possible ways that the data could be read off the grid to generate the key sequence, but I won't get into that.
Now, on top of that, we can run any number of iterations of the grid rules over those initial combinations. Assuming someone was patient enough, you could run a hundred trillion iterations of the rules (I will try to prove later that running these rules produces a non-repeating stream of combinations).
So, thus far, for a brute-force crack of the data, we are looking at 2^16777215 * 2^16777215 * 100,000,000,000,000 possible combinations to test. And this assumes that the computer is able to tell when it has actually generated the correct key. We could assume what we are decoding contains English words, and that a result containing a sufficient number of English words, or even just printable characters, means success.
I might note these numbers are too big to be handled by my scientific calculator, or even the .NET Decimal type.
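Incidentally, arbitrary-precision integer types handle these magnitudes directly, even though fixed-size types like .NET's Decimal cannot. A quick check in Python, whose ints are unbounded:

```python
# 2**16777215 is a 16,777,216-bit number; computing it is just a bit shift.
n = 2 ** 16_777_215
print(n.bit_length())  # 16777216

# Powers of two cycle through last digits 2, 4, 8, 6; the exponent mod 4 is 3,
# so this one ends in 8.
print(n % 10)  # 8
```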
Compare this to the AES standard, where you have a 256 bit Key and 256 bit Initialization Vector, (so 2^256 * 2^256 possible key combinations to check for a brute force attack, which is still an awful lot.)
I also believe that the CA algorithm does add randomness (or 'entropy') to the data, and fairly good randomness at that. Just saying it doesn't add entropy doesn't make it true. You might have some special definition of the word, but I can only go with the dictionary definition: "lack of order or predictability".
I can even offer some proof. There is a set of statistical analysis tests used to determine the true randomness of any sufficiently large set of data, called the 'DieHard' set of tests.
http://stat.fsu.edu/pub/diehard/
http://en.wikipedia.org/wiki/Diehard_tests
I have looked and looked, and it seems these tests are the benchmark for establishing the validity of PRN sequences.
I took an ordered (non-entropic) set: a CA grid, 256x256, of which exactly 50% of the data is 'alive', in an ordered pattern. I repeatedly extracted the data from the grid, without running the CA rules on it, until I had enough data to run the Diehard test. The results are an epic fail: the average p-value is 1.0, which could not be worse; no randomness found in the data. As expected.
(one of the sets of results from the DIEHARD battery of tests)
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::    THE OVERLAPPING 5-PERMUTATION TEST                         ::
:: This is the OPERM5 test. It looks at a sequence of one mill-  ::
:: ion 32-bit random integers. Each set of five consecutive     ::
:: integers can be in one of 120 states, for the 5! possible or- ::
:: derings of five numbers. Thus the 5th, 6th, 7th,...numbers    ::
:: each provide a state. As many thousands of state transitions ::
:: are observed, cumulative counts are made of the number of     ::
:: occurences of each state. Then the quadratic form in the      ::
:: weak inverse of the 120x120 covariance matrix yields a test   ::
:: equivalent to the likelihood ratio test that the 120 cell     ::
:: counts came from the specified (asymptotically) normal dis-   ::
:: tribution with the specified 120x120 covariance matrix (with  ::
:: rank 99). This version uses 1,000,000 integers, twice.        ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
OPERM5 test for file withoutCA.bin
For a sample of 1,000,000 consecutive 5-tuples,
chi-square for 99 degrees of freedom=*******; p-value=1.000000
OPERM5 test for file withoutCA.bin
For a sample of 1,000,000 consecutive 5-tuples,
chi-square for 99 degrees of freedom=*******; p-value=1.000000
This is the code for the method that generated the data: it was run with generationsPerRecursion = 0
public static void GenerateRecursiveOutputFile(CAGrid initialGrid, int generationsPerRecursion, String outputFilename)
{
    long written = 0;
    using (FileStream fs = File.Create(outputFilename))
    {
        while (written < 1024 * 1024 * 12)
        {
            for (int i = 0; i < generationsPerRecursion; i++)
                initialGrid.RunFredkinRule();
            byte[] buffer = initialGrid.GetBytes();
            fs.Write(buffer, 0, buffer.Length);
            written += buffer.Length;
        }
    }
}
I then regenerated the data using exactly the same method and exactly the same ordered starting grid arrangement, only this time each iteration gets 100 generations of the Fredkin rule. It took ages to generate the data, which is the specific drawback of the system.
If the CA algorithm doesn't add any entropy, then the p-values for the results of the test should all be 1, correct?
This is the result of the OPERM5 Diehard test for the second set of data (the closer the p-value is to zero, the better the result):
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::    THE OVERLAPPING 5-PERMUTATION TEST                         ::
:: This is the OPERM5 test. It looks at a sequence of one mill-  ::
:: ion 32-bit random integers. Each set of five consecutive     ::
:: integers can be in one of 120 states, for the 5! possible or- ::
:: derings of five numbers. Thus the 5th, 6th, 7th,...numbers    ::
:: each provide a state. As many thousands of state transitions ::
:: are observed, cumulative counts are made of the number of     ::
:: occurences of each state. Then the quadratic form in the      ::
:: weak inverse of the 120x120 covariance matrix yields a test   ::
:: equivalent to the likelihood ratio test that the 120 cell     ::
:: counts came from the specified (asymptotically) normal dis-   ::
:: tribution with the specified 120x120 covariance matrix (with  ::
:: rank 99). This version uses 1,000,000 integers, twice.        ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
OPERM5 test for file withCA.bin
For a sample of 1,000,000 consecutive 5-tuples,
chi-square for 99 degrees of freedom= 81.966; p-value= .107474
OPERM5 test for file withCA.bin
For a sample of 1,000,000 consecutive 5-tuples,
chi-square for 99 degrees of freedom= 80.803; p-value= .091050
The result shows that the CA algorithm made the ordered set of data random; in fact, it passed the Diehard tests with flying colours.
Cellular Automata represent a microcosm of the universe, and by definition a closed system. The second law of thermodynamics states that entropy always increases in a closed system. Not only do the numbers back me up, so do the laws of thermodynamics.
And finally, one thing that just bugs me: you posted your comment with the subject "Completely Insecure", then followed up by saying that you have studied cryptography at Master's level, yet could not break what you define as "completely insecure" encryption without significant effort, if at all. I'll let you mull over the contradiction.
Personally, I would not claim that something I was trained for (at Master's level, no less), yet could not do, was easy.
Also, apparently the onus is on me, as the author, to come up with the method for breaking the encryption, but I might add: this isn't my idea. This is actually a growing field of cryptography; try entering "Cellular Automata Encryption" into a Google search. 245,000 results.
Oh, and thanks for comparing me to a 19th century Quack selling snake oil. Although, you must admit I'm giving the snake oil away, and trying to establish if it works.





This discussion is starting to be silly.
Quote: The CA Algorithm may not significantly increase the work required to decode the data, but it does allow you to generate a much larger data set than the size of the initial grid configuration. Without it, you would have to repeat your key (bad), or be restricted to a few MB of data (limited).
No. "Expanding" keys is a solved problem: the usual encryption algorithms use smallish keys with arbitrarily large inputs with a variety of modes of operation, among which ECB (repeating the key) is usually a weak choice.
Simon Bridge wrote: Using an image as an encryption key is a valid approach that is actually used in cryptographic systems.
Using an image provides only as much entropy as implied by choosing an image among available ones, i.e. much less than choosing the colour of each pixel from a uniform distribution.
Example: you use a certain image you keep on your computer as a key, then your computer is seized by the bad guys. Assuming a rich porn collection or the like, let's say there are one million large images on your computer: that's only one million possible keys to try, an effective key size of less than 20 bits. This does not depend on how well your CA works.
Simon Bridge wrote: I also believe that the CA algorithm does add randomness (or 'entropy') to the data. And fairly good randomness at that. Just saying it doesn't add entropy doesn't make it true. You might have some special definition of the word, but I can only go with the dictionary definition: "Lack of order or predictability;"
Randomness comes from random sources. Your CA is deterministic, so it cannot add randomness; it can only shuffle around the content of the key, which is still as random as the method that was used to choose it.
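The point can be demonstrated concretely. In this hypothetical Python sketch, SHA-256 stands in for any deterministic "scrambler" (including iterating a CA): if the seed comes from a small space, the attacker searches that same small space, no matter how random the output looks.

```python
import hashlib

def scramble(seed: int) -> bytes:
    # Deterministic post-processing: the output looks like noise,
    # but the input space is exactly as large as it was before.
    return hashlib.sha256(seed.to_bytes(4, "big")).digest()

# Suppose the key is derived from a seed chosen among only 2**16 values
# (e.g. "one image out of a modest collection").
secret_seed = 31337
key = scramble(secret_seed)

# The attacker replays the same deterministic processing over the small
# seed space; the nice-looking key bytes don't slow this down at all.
recovered = next(s for s in range(2 ** 16) if scramble(s) == key)
```

At most 65,536 hash evaluations recover the seed, regardless of how thoroughly the output would pass statistical randomness tests.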
Simon Bridge wrote: I can even offer some proof. There is a set of statistical analysis tests used to determine the true randomness of any sufficiently large set of data, called the 'DieHard' set of tests.
These tests don't verify that a data set is random in the strong cryptographic sense that, if you use it as a cryptographic key, the adversary needs to brute-force guess all of it; they only verify that it doesn't have certain properties that would make it distinguishable from a random data set and unsuitable for certain uses (mainly Monte Carlo simulations of some sort). For example, the OPERM5 test you cite looks for patterns of increasing and decreasing values. A key can pass any of these tests and still be less than perfectly random for cryptographic purposes: if you shred one of the 1000000 images discussed above with a CA like you suggest, or with any kind of good hash function, the best possible outcome is a large blob of "noise" that is a fine input for a large Monte Carlo simulation, but is still worth less than 20 bits of entropy as an encryption key.
Quote: Cellular Automata represent a microcosm of the universe, and by definition a closed system. The second law of thermodynamics states that entropy always increases in a closed system. Not only do the numbers back me up, so do the laws of thermodynamics.
Entropy is a measure of ignorance, and ignorance cannot be increased by processing information. Physical systems can "produce" ignorance by carefully failing to observe and know details (e.g. how exactly you are throwing dice) in order to be surprised by other observation (on which face the dice land); but obviously in your proposed system there's nothing hidden, and entropy cannot increase.





No discussion you can learn from is silly. Getting a good debate rolling was most of the reason I wrote the article.
I didn't expect to be shot down so completely in the opening round though.
Lorenzo Gatti wrote:
Entropy is a measure of ignorance, and ignorance cannot be increased by processing information.
You are using your own specific definition of the word.
Lorenzo Gatti wrote: Assuming a rich porn collection or the like, let's say there are one million large images on your computer:
That's a starter's collection at best.
But just to be serious for half a second: that is assuming you chose an image that is actually stored on your computer. The interweb provides an unlimited source of images. And why assume it was a static image? You could pick an individual frame from a video...
...Scratch that, you could use the whole freakin' video... maybe that's my next fun project, "YouTube encryption".





Simon Bridge wrote: Lorenzo Gatti wrote: Assuming a rich porn collection or the like, let's say there are one million large images on your computer:
That is assuming you chose an image that is actually stored on your computer. The interweb provides an unlimited source of images. And Why assume it was a static image? you could pick an individual frame from a video...
Choosing within a large image collection on your computer and encrypting data you keep there, without the challenge of exchanging keys with someone else, is only the simplest (and most secure) practical way to use an image as an encryption key.
The effective key length is too short, but you are able to memorize which image to use (with a collection of pinups, it might be "miss A in the swimming pool series, seen from the right, standing in the water, facing the house, with eyes closed and hands behind her back"; with a collection of personal photos, it could be "Christmas 2007, my living room with the bookcase & TV wall in the background, cousins A, B and C playing Bingo around the table, C shouting"), without storing the key anywhere.
Now, let's imagine the practical implications of downloading a "random" image from the web instead.
Apart from non-cryptographic concerns, for example that your chosen image can be removed or replaced (denying access to your data), apart from the sad fact that the choice of an image is unlikely to be particularly random without extravagantly cumbersome procedures, and apart from additional attack types (e.g. snooping your web browser history or cache and your live network traffic to learn which image you use as a key), you need an exact URL, so you have to write it down.
Having to write down and keep safe your URL (which is actually your key) is a big deal because, with the same inconvenience and insecurity budget, you could instead write down a perfectly random key (easily obtained by truly random means like flipping coins) that is as long as that URL but has more entropy, and use it with any ordinary, proven, high-performance encryption algorithm.
Using a frame of a movie combines a file on your computer or from the web, with the same problems, with a short but potentially well-behaved key (the frame number); there is no free lunch. Varying the number of CA iterations is another piece of well-behaved key that can be tacked on, but if you start employing truly random values, you are better off using them directly as keys.





I just realized something... by your definition, a computer program cannot generate entropy.
Any operation a computer performs is deterministic. A computer program cannot do something whose outcome cannot be predicted. At the most basic level, all a computer can do is evaluate combinations of AND, OR and NOT, and the outcomes of those operations are definitely known.
You can use simple hardware to generate actual random noise, by reading the value of a floating I/O pin (a pin that is not grounded, the value of which is indeterminate, as it is affected by unknown environmental factors and wobbles around between 0 and 1). However, that would be absolutely useless for encryption, as the values cannot be reproduced; you would just be irretrievably scrambling your data.
Therefore, any computer-generated cryptographic number sequence must be deterministic and, by your definition, contain no entropy. So why is it even being discussed?
None of the existing commercial cryptographic systems can be capable of generating entropy (by your definition), because they are deterministic computer programs, yet this seems to be the biggest stick my system is being hit with.





Correct! No computer program can "generate" entropy. However, a computer can acquire entropy from its environment. The operating system does this by combining various sources of information (e.g. mouse movements, system timing irregularities, etc.).
It's quite a slow process: you can't get megabytes of good-quality random information quickly. This is why cryptographically secure random number generators are often seeded using the highest-quality entropy source available from the operating system, then proceed to iterate that information for a while to generate a stream of pseudo-random numbers.
For example, in Java, the SecureRandom pseudo-random number generator (PRNG) class is seeded from the operating system entropy pool. It may then use the SHA1PRNG generator, which uses the SHA-1 hash function and works roughly as follows:
// initialisation
high-quality entropy -> seed PRNG
PRNG counter = 1
// get random number
random number = SHA-1(seed + counter)
increment counter
It all depends on the quality of the entropy source used to seed it. Even so, it is recommended to reseed the PRNG from a high-quality entropy source every now and again.
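As a rough illustration, here is a minimal Java sketch of that hash-and-counter construction. The class name and seed handling are my own inventions for demonstration purposes; this is not the actual internal implementation of SecureRandom:

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.SecureRandom;

public class HashCounterPrng {
    private final byte[] seed;
    private long counter = 1;

    public HashCounterPrng(byte[] seed) {
        this.seed = seed.clone();
    }

    // Each call hashes (seed || counter) with SHA-1, then increments the counter.
    public byte[] nextBlock() {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            sha1.update(seed);
            sha1.update(ByteBuffer.allocate(8).putLong(counter++).array());
            return sha1.digest(); // 20 pseudo-random bytes per call
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Seed from the platform's entropy-backed generator, as described above.
        byte[] seed = new SecureRandom().generateSeed(20);
        HashCounterPrng prng = new HashCounterPrng(seed);
        System.out.println(prng.nextBlock().length); // prints 20 (SHA-1 digest size)
    }
}
```

Note that identical seeds yield identical output streams, which is exactly why the quality of the seed entropy matters so much.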





A further comment:
"however, that would be absolutely useless for encryption, as the values cannot be reproduced.. you would just be irretrievably scrambling your data."
It's quite true to say that just encrypting information with a completely random source of information (and then throwing away the random data) would be entirely useless.
However, you do want highly random data when you are generating keys for encryption in the first place. But you don't go on to throw away the key  you do need to share the key with the intended recipient, or they can't decrypt your data (at least, in the case of symmetric encryption). You also want highly random data in many, many other cryptographic primitives and protocols.
Here's an interesting article on how Netscape rendered their implementation of SSL completely insecure by using a poor method of seeding their random number generator:
Randomness and the Netscape Browser[^]





Quote: I just realized something...by your definition, a computer program cannot generate entropy.
Any operation a computer performs is deterministic. A computer program cannot do something whose outcome cannot be predicted. At the most basic level, all a computer can do is evaluate combinations of AND, OR, and NOT, and the outcomes of those operations are definitely known.
Computers can measure random physical phenomena. Hard disk seek times seem quite popular: Brownian motion pushes the moving arm and causes it to arrive slightly earlier or later than the average. Timing of keyboard and mouse inputs, and microphones or cameras picking up noise (e.g. the bubbles of a lava lamp) are also used.
On the other hand, when we bother you with the need for random keys with sufficient entropy, we mean that the user generates the key once with an adequately random procedure, not necessarily involving a computer (for example, I have 16-sided dice labelled with hexadecimal digits), keeps it safe, distributes it to correspondents and passes it to encryption and decryption primitives as needed; not that the cryptographic system "generates entropy" by magic.
On the third hand, pseudo-random number generators are not adequate sources of randomness because they are deterministic; the entropy contained in their output cannot exceed the entropy contained in their seed, and it can be lower if there are imperfections.
Your CA scheme is a pseudorandom number generator, which can have a very large seed in theory (but not in most practical applications) but is unlikely to behave well (for instance, successive generations of the same cell are going to be severely correlated in most CA rules).





The CA adds no new entropy, as it is a deterministic algorithm which can be run by anyone without knowing any additional secrets. You do say you regard the number of initial iterations as being part of the secret initial state, so that does introduce a small amount of entropy. So the entropy available is fully determined by the initial state and the number of iterations you do initially.
The methods you chose to populate the initial state are not cryptographically secure. For example, the initialisation using the Random function is seeded using an integer, giving only 2^32 possible starting grids. Even if you used a better function than Random, it would still be insecure. But I accept you don't regard this as secure in any case.
So let's assume you manage to seed the CA with some cryptographically secure data in the first place. You are then using the CA as a means of generating a synchronous stream cipher which is XOR'd with the plaintext.
http://en.wikipedia.org/wiki/Stream_cipher#Loose_inspiration_from_the_onetime_pad
These sorts of ciphers have a number of problems, including vulnerability to active attacks (you can predictably change a bit in the plaintext by modifying the equivalent bit in the ciphertext), and the fact that you can never reuse the same starting state for two messages, or they can be trivially cracked.
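The bit-flipping attack mentioned above is easy to demonstrate in Java. The keystream bytes below are arbitrary stand-ins for CA output:

```java
public class XorMalleability {
    // XOR a message with a keystream of the same length (encrypt == decrypt).
    static byte[] xor(byte[] data, byte[] keystream) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ keystream[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] plaintext = "PAY 100".getBytes();
        byte[] keystream = {13, 42, 7, 99, 5, 81, 33}; // stand-in for CA output
        byte[] ciphertext = xor(plaintext, keystream);

        // An attacker flips bits in the ciphertext without knowing the key...
        ciphertext[4] ^= ('1' ^ '9');

        // ...and the decrypted message changes predictably.
        System.out.println(new String(xor(ciphertext, keystream))); // prints: PAY 900
    }
}
```

The attacker never learns the key, yet can still change "PAY 100" into "PAY 900" at will; this is why synchronous stream ciphers need separate integrity protection.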
So the real question is whether the CA functions adequately as a cryptographically secure synchronous stream cipher. It has a number of disadvantages compared with accepted methods of doing this:
1) It requires a very large initial seed to populate it securely, or you must find yet another way to securely generate an initial state using a smaller seed (but larger than an integer!). Given that you can't reuse a starting state, this is a big problem.
2) It is very slow, and CPU and memory intensive. Given that low memory and CPU requirements are one of the key drivers in using this form of encryption, there is no good reason to use it.
3) Whether there are exploitable biases in the CA generation mechanism is entirely unproven. Simply running the Diehard tests won't automatically reveal cryptographic weaknesses; that would require a much stronger analysis (which is beyond my competence), but it's good you ran them as a starting point.
4) It is unclear whether there are weak keys (i.e. starting states which would create insecure encryption). It seems likely that there would be starting states for a CA which would create insecure patterns or repeating states in the resulting key material.
I may have been a bit unkind in saying this system was completely insecure. I should probably have said it is only very, very likely to be insecure. It is certainly not a practical system.
In any case, I did enjoy reading about your approach. It wouldn't hurt you to read up on some of the techniques used in stream ciphers: you don't have to be a trained mathematician or cryptographer to understand how they work and why they are structured the way they are. Unfortunately, you probably do have to be one to create a new secure algorithm!





Member 8418173 wrote: 3) Whether there are exploitable biases in the CA generation mechanism is entirely unproven.
As I was reading the original article, and skimming this discussion, the CA part of the encryption began to bug me.
Each generation of the CA is, in essence, a one-way hash. A weak one-way hash, to be sure, but nonetheless so. Consider a particular iteration of the CA. Because it was generated using a deterministic algorithm, it can be reversed, but because information was lost, there may be multiple input states that lead to the same current state.
This "information was lost" is a big concern, as it means entropy is being lost. The worst-case scenario is that an iteration leads to complete cell death. Once that happens, all the rest of the iterations result in the identical state. This would be bad because when you do the XOR step, you'll get the original message unencrypted. Worse, there are multiple inputs that will lead to this. Even worse is that you can't know which initial inputs will result in this kind of failure until after you run the CA iterations (well, maybe you can, but I can't). Worst of all, regions of the generation may end up with no cells, and when you go to XOR the original data with a string containing a run of falses, you'll get the clear text of the original message. Even patches of clear text are undesirable for two reasons: A) they're not encrypted. B) they provide a known portion of the CA result to work from, allowing one to start to reverse its state from the edges (maybe that's possible, maybe not).
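The dead-grid failure mode is easy to see in code: an all-false grid yields an all-zero keystream, and XOR with zero is the identity, so the "ciphertext" is simply the plaintext. A Java sketch (the names are mine, for illustration):

```java
public class DeadGridLeak {
    // XOR "encryption" with a keystream of the same length.
    static byte[] xor(byte[] data, byte[] keystream) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ keystream[i]); // x ^ 0 == x
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] plaintext = "secret message".getBytes();
        // A dead CA grid: every cell is false, so every keystream byte is zero.
        byte[] deadKeystream = new byte[plaintext.length];
        byte[] ciphertext = xor(plaintext, deadKeystream);
        // The "encrypted" output is the plaintext, byte for byte.
        System.out.println(new String(ciphertext)); // prints: secret message
    }
}
```

The same leak happens locally wherever a region of the grid dies: those bytes of the message pass through unchanged.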
It's an interesting article, and thanks for taking the time to write it and the program. I'm dubious about whether it makes a good encryption algorithm, but it's still an interesting idea.
We can program with only 1's, but if all you've got are zeros, you've got nothing.





patbob wrote: This "information was lost" is a big concern at it means entropy is being lost.
Loss of information increases entropy, not decreases.
It would be an excellent result if information was being lost, this would mean that the process could not be reversed.
patbob wrote: The worst case scenario is that an iteration leads to complete cell death. Once that happens, all the rest of the iterations result in the identical state.
You're right, this is exactly what happens when using Conway's rules for the Game of Life. However, the Fredkin rule doesn't appear to do this. There was actually a study that looked into this for real, done at the University of São Paulo (Brazil) (it's where I got the idea for the article), and they discovered that using a specific variation of the Fredkin rule over a period of 1000 generations created 'consistently random' number sequences.
This is a little snippet about the study: http://www.technologyreview.com/blog/arxiv/27460/[^]
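For readers who want to experiment, the Fredkin (parity) rule is easy to implement: each cell's next state is the XOR, i.e. the parity, of its neighbours. The sketch below uses the four-cell von Neumann neighbourhood on a wrapping grid; the exact neighbourhood and boundary conditions used in the cited study may differ:

```java
public class FredkinStep {
    // One generation of the Fredkin parity rule on a toroidal grid:
    // a cell is alive in the next step iff an odd number of its four
    // orthogonal neighbours are alive now.
    static boolean[][] step(boolean[][] grid) {
        int h = grid.length, w = grid[0].length;
        boolean[][] next = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int n = 0;
                if (grid[(y + h - 1) % h][x]) n++; // up
                if (grid[(y + 1) % h][x]) n++;     // down
                if (grid[y][(x + w - 1) % w]) n++; // left
                if (grid[y][(x + 1) % w]) n++;     // right
                next[y][x] = (n % 2) == 1;         // parity of neighbours
            }
        }
        return next;
    }

    public static void main(String[] args) {
        boolean[][] grid = new boolean[5][5];
        grid[2][2] = true; // a single live cell
        grid = step(grid);
        int alive = 0;
        for (boolean[] row : grid) for (boolean c : row) if (c) alive++;
        // The single cell is replaced by its four neighbours.
        System.out.println(alive); // prints: 4
    }
}
```

Because the rule is pure parity, it never suffers the total cell death of Conway's rules in the same way: a single live cell spawns four, not zero.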





Simon Bridge wrote: Loss of information increases entropy, not decreases
Oops... you are correct. I was thinking of disorder. The CA is used to add disorder to the message in order to render it undecipherable. If the CA ever degrades into a state of complete order, that would be bad.
Simon Bridge wrote: the Fredkin rule doesn't appear to do this
For cryptology purposes, there's a world of difference between "doesn't appear to" and "can be proven not to". The cryptology community will probably require a rigorous proof before accepting such a scheme.
That link is an interesting read. It implies that they ran each CA rule set through 20 million iterations and found that, with their provided input(s?), this one generated results that were statistically random. That's not a proof, but I can certainly see why you thought it was an interesting idea to toy with.
Thanks again.
We can program with only 1's, but if all you've got are zeros, you've got nothing.





Interesting concept and application of seemingly unrelated things, my 5.
Its the man, not the machine  Chuck Yeager
If at first you don't succeed... get a better publicist
If the final destination is death, then we should enjoy every second of the journey.








