|
|
Comments and Discussions
|
|
 |

|
I would just like to ask what the significance of your HTML steg really is, and in which part of a system we can apply it. Another favor: can you please give the algorithm of your program? You know, beginner's dilemma! Peace out! Thanks in advance.
modified on Thursday, August 28, 2008 6:50 PM
|
|
|
|

|
On many websites, especially corporate ones, there are headers that get used on all pages which load tons of CSS and JavaScript. Furthermore, these headers often have you load code that isn't used for the page you're on. It's there because the same header is used throughout the site and it contains all of the scripts and style definitions for that site.
1) There are tons of items here to play with, reorder, etc.
2) If you know what you're doing, CSS can be written very concisely with no unnecessary repetition. The corollary to this is that by doing the opposite, you can greatly increase the amount of data you can hide. Regarding secrecy, seeing large sloppy CSS should arouse no suspicion since that's how it's often seen if WYSIWYG generated or if written by someone not very good at it.
3) Again, it's likely that for any given page there will be CSS present that won't be used for that page. So it should also arouse no suspicion if you add CSS that is used on no page at all.
I haven't given much thought to whether your methods would be compatible with JavaScript. If they are, you can apply all of the above to JS as well.
|
|
|
|

|
Hi,
I saw your beautiful programs and enjoyed them. Thank you very much for them. You are a real expert girl in the programmers' world (that's rare, and it's an honor, because I'm a girl too).
But I have one question about HTML steganography: when we save a web page with your program, it isn't saved completely. That means the folder that holds the images etc. of the web page is not created, and with this problem it can't be used in the real world on the internet. If you can solve this problem, please change it. Thank you for your attention.
|
|
|
|

|
A very interesting idea. Thank you for showing us other ways of hiding data. I hadn't considered using an html (or other text file) before. This has really got me thinking.
5 from me, great job. Love the key combination ideas.
|
|
|
|

|
{"Syntax error: Missing operand after 'events' operator."}
In the Line
rows = keyTable.Select("firstAttribute = '" + attribute.Name + "'");
|
|
|
|

|
Thanks for the hint! There's a bug in the HtmlAttribute class. The quotation marks in attribute names like "document.getelementbyid('sectionmenu')" must be escaped before they're used in SQL queries.
I've just submitted an update to CodeProject, the fixed code will arrive on this page in a few days. If you need the bugfix today, please tell me where to email it.
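The escaping described above can be illustrated with a short Python sketch. It assumes the filter expression follows the SQL-style convention in which a single quote inside a quoted string is written as two single quotes; the helper names are made up for this example and are not part of the article's C# code.

```python
def escape_filter_literal(value: str) -> str:
    """Double every single quote so the value can sit inside '...'."""
    return value.replace("'", "''")


def build_filter(column: str, value: str) -> str:
    """Build a filter such as: firstAttribute = 'some value'."""
    return f"{column} = '{escape_filter_literal(value)}'"


# The attribute name from the bug report, which contains single quotes.
name = "document.getelementbyid('sectionmenu')"
print(build_filter("firstAttribute", name))
```

In the C# program the same effect would come from a string Replace call on the attribute name before it is concatenated into the Select expression.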
_____________________________________________________
This statement is false.
|
|
|
|

|
Everything's fine. Thanks to our great CP editors!
The fixed code has just been posted. Please re-download the archive and try again.
_____________________________________________________
This statement is false.
|
|
|
|

|
Hello ma'am,
in our project we are getting the error
'System.Windows.Forms.Application' does not contain a definition for 'EnableRTLMirroring'
so please tell us how to solve this error. If we comment this line out, the HTML output file is not the same as the input file.
So please help us.
Thank you
|
|
|
|

|
You are using an old version of Visual Studio. Newer versions generate that line by default.
Comment it out and you'll be fine. If the application doesn't work then, there must be a different problem.
____________________________________
There is no proof for this sentence.
|
|
|
|

|
Your method is a data hiding method and does not concern itself with attacks that are able to reveal the existence of a message.
1- The security of such methods is based on security by obscurity. Knowing your method, a simple statistical analysis can reveal that there is a hidden message in the page.
2- If somebody considers hiding data within HTML pages in a way that leaves the browser view intact, there is lots of space that can be used for hiding data,
for example in comment fields or by capitalizing characters of attribute names
(vlink for hiding 00000 and VlInK for hiding 10101).
4- Steganography needs a stego key, but you do not describe how to use a stego key in your method. Can somebody apply a stego key with your method or not? As you know better than anyone, a stego key is used to spread the hidden data within the cover media.
5- Actually, I am searching for steganography in texts. Do you know some methods that can do it for me (with steganography considerations)? Any kind of comment is very useful.
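The capitalization idea from point 2 (vlink for 00000, VlInK for 10101) could be sketched in Python roughly as follows. The function names are invented for this illustration, and the sketch assumes purely alphabetic attribute names in an HTML document, where attribute names are not case-sensitive.

```python
def hide_bits_in_case(name: str, bits: str) -> str:
    """Write one bit per letter: upper case means 1, lower case means 0."""
    if not name.isalpha():
        raise ValueError("this sketch only handles alphabetic attribute names")
    if len(bits) > len(name) or set(bits) - {"0", "1"}:
        raise ValueError("bits must be a 0/1 string no longer than the name")
    padded = bits.ljust(len(name), "0")  # unused letters stay lower case
    return "".join(
        ch.upper() if bit == "1" else ch.lower()
        for ch, bit in zip(name, padded)
    )


def read_bits_from_case(name: str) -> str:
    """Recover the bit string from the letter case of an attribute name."""
    return "".join("1" if ch.isupper() else "0" for ch in name)


print(hide_bits_in_case("vlink", "10101"))  # VlInK
print(read_bits_from_case("vlink"))         # 00000
```

As the reply below this post points out, mixed-case attribute names are easy to spot, so this is an illustration of capacity rather than of secrecy.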
|
|
|
|

|
mahdavi110 wrote: with a simple statistical analysis can reveal that there is a hidden message
How would such a statistical analysis work? Could you please post an example?
mahdavi110 wrote: capitalizing characters of attributes name
1. Which HTML generator would write such chaos?
2. A web designer who writes in that style should be fired.
=> If there are uncommon capitals within a word, it is 100% certain that something is wrong with the document.
mahdavi110 wrote: steganography needs Stego key
There are several ways of applying a key. For example, you could use only certain tags or attributes.
mahdavi110 wrote: Do you know some methods that can do it for me
Sorry, I'm not Google.
_____________________________________________________________________________
I don't expect too much, all I want is your vote for Halbsichtigkeit.
|
|
|
|

|
I read most of your steganography articles but somehow I skipped this one. So now that I have read it, I was thinking about some improvements and, most importantly, how to store a larger message.
There is a way to use all the attributes and even to use every attribute to store one bit. We can use alphabetical order to determine which attribute is supposed to come first.
For example:
Let's pretend that the numbers 1, 2, 3 are attributes and that alphabetically they are in the same order, 1, 2, 3. In this example I'm using every attribute for storing 1 bit. Only the first one is skipped; it is used for comparing.
Here I store all the possible 2-bit values with 3 attributes.
00 - 123
01 - 132
10 - 213
11 - 321
I didn't try this with a complex example, but I'm pretty sure this strategy should work.
-- modified at 16:02 Monday 17th April, 2006
|
|
|
|

|
Good idea - somebody else already told me.
Your suggestion works for any list, I tried to implement it there:
http://www.codeproject.com/csharp/steganodotnet14.asp
_________________________________
Please inform me about my English mistakes, as I'm still trying to learn your language!
|
|
|
|

|
The first idea that may come to mind:
Generally, if you can find a set of N attributes that are permutable, you can hide a message as long as log2(N!) bits.
But finding such sets may be difficult. However, we may construct some additional attributes when needed. For example, if a text tag has just font and color attributes, we may add size and ... other attributes to it, taking care not to change the appearance.
So we may find these long sets and hide more bits.
If we can find a set of 10 attributes, we may hide 21 bits.
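The capacity figure can be checked with a few lines of Python. This is only a sanity check of the log2(N!) bound stated above; the function name is made up for the example.

```python
import math


def permutation_capacity_bits(n: int) -> int:
    """Whole bits that an ordering of n distinct attributes can carry,
    i.e. floor(log2(n!)), computed with integers to avoid rounding."""
    return math.factorial(n).bit_length() - 1


for n in (3, 5, 6, 10):
    print(n, "attributes:", permutation_capacity_bits(n), "bits")
```

For 10 attributes there are 3,628,800 orderings, which lies between 2^21 and 2^22, so 21 whole bits fit.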
How ... I am.
Let me know your Idea.
((: )
|
|
|
|
|

|
this might just be the craziest CP article i've read...
seriously... like... wow...
(really good though )
|
|
|
|

|
Does that mean.. *sniff* ... you have not read part 14, yet?
It does the same, only a lot crazier.
_________________________________
Please inform me about my English mistakes, I still try to learn your language!
|
|
|
|

|
actually i've read them all now.. i think.
...
i demand more!
|
|
|
|

|
Thanks for your interest.
I'm happy about every CPian who can read 15 articles without becoming as crazy as I am.
Dead Skin Mask wrote:
i demand more!
Hmmm... that's not easy. Before I can post more, I have to write more, and before I can write more, I have to discover more. Usually, an article needs much time, and the longest parts are "waiting for the next idea" and "finishing all other things I planned".
_________________________________
Please inform me about my English mistakes, I still try to learn your language!
|
|
|
|

|
yeh i understand.
i'm just demanding more because your articles are actually worth reading from start to finish and not just reading the intro and conclusions like i do with most of them now.
in any case, i hope there's more..
|
|
|
|

|
Hi Coco,
Do you know of any way to provide source for C# v1.1? Partial and Static aren't part of production yet...
Also, if spaces (not non-breaking spaces) are maintained when sent by your webserver (they are on mine...), you could use varying spaces to encode quite a few bits: place 1 to 16 spaces between any tags...
<input type="text" name="TheName" size= "15" LENGTH = "6" >
Most people wouldn't notice. The key would be a guide to which tags to look inside and how the spaces are interpreted. The encryptor would first clean out all unnecessary spaces, then pad the spaces to add the bits to the HTML.
Cheers,
John R. Hanson
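A rough Python sketch of the spacing idea: each gap of 1 to 16 spaces carries 4 bits (number of spaces minus one). The function names and the token list are hypothetical, and the sketch assumes the tokens themselves contain no spaces and that the server delivers the whitespace unchanged.

```python
import re


def encode_gaps(tokens, bits):
    """Join tokens with 1..16 spaces per gap; each gap stores 4 bits."""
    gaps = len(tokens) - 1
    if len(bits) != 4 * gaps:
        raise ValueError(f"need exactly {4 * gaps} bits for {gaps} gaps")
    parts = [tokens[0]]
    for g in range(gaps):
        value = int(bits[4 * g:4 * g + 4], 2)  # 0..15
        parts.append(" " * (value + 1))        # 1..16 spaces
        parts.append(tokens[g + 1])
    return "".join(parts)


def decode_gaps(text):
    """Turn every run of spaces back into its 4-bit value."""
    return "".join(format(len(run) - 1, "04b") for run in re.findall(r" +", text))


tokens = ["<input", 'type="text"', 'name="TheName"', 'size="15"', ">"]
tag = encode_gaps(tokens, "0000111100010010")
print(repr(tag))
print(decode_gaps(tag))
```

A real encoder would first collapse the existing whitespace, as the post suggests, so that the decoder does not read accidental spacing as data.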
|
|
|
|
|

|
No and yes... yesterday I thought a stegano-webserver or a pseudo-SOAP-formatter would be of nearly no interest to anyone out there. But now that Andrew pointed out a few improvements, maybe the next article will be on hiding a meta-stream in website content. All references in every HTML page can link to more carrier documents with more hidden data... Every website can contain a full meta-website. We'll need a browser plugin to view the hidden webs...
Damn it, I'll never finish this series, it's going to be my fellow for years!
_________________________________
Vote '1' if you're too lazy for a discussion
|
|
|
|

|
Wow, I inspired someone. Or gave them more work to do, I'm not sure... Though on the flip side, this article has given me a great idea for my current project in Combinatorial Optimisation, and converting combinations into permutations. So you see, this really is useful, if not specifically for steganography.
I'm also poking round with a linguistic engine, initially for semantic analysis and translation, but given the amount of redundancy in natural language, you could possibly use it for steganography - and then you're hiding data in the very way sentences are worded.
But first, I have to get it working...
Andrew
Will code for bandwidth and caffeine
|
|
|
|

|
Thinking about increasing the secret text efficiency, I've thought of a possible improvement to your attribute arrangement system. My first thought was that there's a lot of overlap - for example,
<body text="#000000" bgcolor="#FFFFFF" link="#FF0000" alink="#FF0000" vlink="#FF0000">
is the same as
<body link="#FF0000" alink="#FF0000" vlink="#FF0000" text="#000000" bgcolor="#FFFFFF">,
which means there's already one bit we've lost. But going further, there are 5! (or 120) permutations of the five attributes, which is just short of 7 bits. So we have at least 6 bits to play with in this tag alone.
The first thing to do is, rather than define individual pairs of attributes, assign each attribute an ordinal, so that we can say that attribute A is greater or less than attribute B. Thus, if alink has a lower ordinal than link:
alink="#FF0000" link="#FF0000" ==> 0
link="#FF0000" alink="#FF0000" ==> 1
A very simple means of assigning ordinals, and thus pair orders, is by using alphabetic order - a pair of attributes in alphabetic order means 0, in reverse order means 1. Thus,
<body text="#000000" bgcolor="#FFFFFF" link="#FF0000" alink="#FF0000" vlink="#FF0000"> ==> 1010
Because:
text="#000000" > bgcolor="#FFFFFF" ==> 1
bgcolor="#FFFFFF" < link="#FF0000" ==> 0
link="#FF0000" > alink="#FF0000" ==> 1
alink="#FF0000" < vlink="#FF0000" ==> 0
Whereas,
<body link="#FF0000" alink="#FF0000" vlink="#FF0000" text="#000000" bgcolor="#FFFFFF"> ==> 1011
link="#FF0000" > alink="#FF0000" ==> 1
alink="#FF0000" < vlink="#FF0000" ==> 0
vlink="#FF0000" > text="#000000" ==> 1
text="#000000" > bgcolor="#FFFFFF" ==> 1
And, because we're working with every attribute, every HTML tag with n attributes can store n-1 bits.
And a quick algorithm to implement this system:
Sort all the attributes within a tag into alphabetical order in an array called attr_strings.
n is the number of bits we can store (number of attributes minus 1)
create an array of n+1 integers called attr, and two integers, min=0 and max=0
set attr[0] to 0
for(i = 0; i < n; i++)
    if bit[i] = 1
        min--
        attr[i+1] = min
    else
        max++
        attr[i+1] = max
next i
for(i = 0; i <= n; i++)
    attr[i] -= min
use attr as an array of indexes in attr_strings, and reassemble the tag
Therefore, to encode 0101 with our <body> tag:
Sort the attributes:
attr_strings = { alink="#FF0000", bgcolor="#FFFFFF", link="#FF0000", text="#000000", vlink="#FF0000" }
min = 0
max = 0
attr = {0,0,0,0,0}
iteration 1: bit[0] = 0, so max = 1, attr = {0,1,0,0,0}
iteration 2: bit[1] = 1, so min = -1, attr = {0,1,-1,0,0}
iteration 3: bit[2] = 0, so max = 2, attr = {0,1,-1,2,0}
iteration 4: bit[3] = 1, so min = -2, attr = {0,1,-1,2,-2}
Subtract min (-2) from each attr to give attr = {2,3,1,4,0}
Then reconstruct the tag:
Tag = "<body "+attr_strings[2]+" "+attr_strings[3]+" "+attr_strings[1]+" "+attr_strings[4]+" "+attr_strings[0]+">";
Thus,
Tag = "<body link="#FF0000" text="#000000" bgcolor="#FFFFFF" vlink="#FF0000" alink="#FF0000">"
which equates to 0101.
Note that you needn't use alphabetic order, any ordering will suffice, just as long as both the encoder and decoder are using the same order. This algorithm is still far from the mathematical optimum, but it's a start, and it's pretty simple, as it doesn't really vary from your original idea. Taking into account all the attributes, across the entire file, you could fit quite a lot in a single document, especially if you double up by including/removing quotation marks on attribute values (though that would need to ignore attributes where quotation marks were necessary).
Thanks for the idea! Any comments?
Andrew
Will code for bandwidth & caffeine
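The neighbour-comparison scheme from this post can be written as a small Python sketch. The function names are invented for the illustration; it assumes the attributes are distinct strings and that plain string sorting is the ordering shared by encoder and decoder.

```python
def encode_neighbor_order(attributes, bits):
    """Arrange attributes so each adjacent pair spells one bit:
    ascending pair = 0, descending pair = 1."""
    ordered = sorted(attributes)
    if len(bits) != len(ordered) - 1 or set(bits) - {"0", "1"}:
        raise ValueError("need exactly one 0/1 bit per adjacent pair")
    ranks = [0]          # relative rank of the attribute at each position
    low = high = 0
    for bit in bits:
        if bit == "1":   # next attribute must sort below everything so far
            low -= 1
            ranks.append(low)
        else:            # next attribute must sort above everything so far
            high += 1
            ranks.append(high)
    # Shift ranks to 0..n and use them as indexes into the sorted list.
    return [ordered[rank - low] for rank in ranks]


def decode_neighbor_order(attributes):
    """Read one bit from every adjacent pair of attributes."""
    return "".join("1" if a > b else "0" for a, b in zip(attributes, attributes[1:]))


body = ['text="#000000"', 'bgcolor="#FFFFFF"', 'link="#FF0000"',
        'alink="#FF0000"', 'vlink="#FF0000"']
hidden = encode_neighbor_order(body, "0101")
print("<body " + " ".join(hidden) + ">")
print(decode_neighbor_order(hidden))
```

With the bits 0101 this produces the order link, text, bgcolor, vlink, alink, matching the worked example in the post.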
|
|
|
|

|
Thanks for the great idea and algorithm!
What do you think of another extension?
In <img> tags you can define one attribute to be not part of the ordering, e.g. src. This attribute at the beginning of the tag means "more bits in the image".
"0", and look for more bits in the image:
<img src="abc.gif" title="anything" border="0">
"0" and ignore the image:
<img title="anything" src="abc.gif" border="0">
|
|
|
|
|

|
Hmm, interesting. You could extend that further, and rather than just use the src attribute, say that the first bit stored in any <img> tag marks whether or not the image holds further bits. Then you can use the same technique in other referencing tags, such as <a>. Choosing the first bit has an additional advantage - if there is only one attribute in a tag, which is common with <a> tags, you can use any form of encoding to ensure that you store that one bit. For example, <a href="url"> is 0, while <a href="url" >, with an additional space at the end, is 1.
This allows you to spread secret text over an entire website, but prevent a decoding agent from taking inappropriate links (e.g. those that leave the site). If a generated page (say with PHP or ASP.NET) links back to itself, you could code an entire message, irrespective of size, by changing the secret text in the page each time it is generated, though you'd have to use cookies or session state to keep track of where in the message the decoder is. So, a casual observer would just find a page that has one dud link, whereas a decoder bot could find an entire message.
On a more algorithmic note, a better form of the algorithm above changes the way attributes are arranged to improve overall efficiency, by using as many different permutations as possible. The principle, again, is not notably different to Corinna's original idea, but is a tad more pedantic about optimality.
Consider a tag with five attributes. We think of it as a tag with five vacant slots, and then iterate through each attribute in order (in the case of the algorithm above, that's alphabetic order, but again, it doesn't matter, as long as there's an order).
The first attribute can be placed in one of five slots - this gives us 2 (complete) bits of storage, thus the first two bits of the tag can be stored with the placement of the first attribute, i.e. if the first two bits to encode are 01, the first attribute goes in the second slot (or the slot at index 1). The second can be placed in one of four slots, ignoring the one taken up by the first attribute, and so encodes the next two bits. The third attribute then has three available slots, storing one bit, the fourth attribute stores another bit, and the last attribute has only one place to go, so it doesn't store anything. Et voila, the five attribute <body> tag which stored 4 bits above now stores 6 bits, which is the maximum you can store intact.
And an algorithm to do this:
Sort the attributes, according to your sort order, into an array called attr_strings
n is the number of attributes
Create an array of n integers called attr, initialised to -1
Then iterate through the attributes:
for(int i = 0; i < n; i++)
    b is the number of bits to encode with this attribute = trunc(log2(n - i))
    s is the number of the free slot in which to place the current attribute = the next b bits to encode, parsed into an integer
    offset is the number of previously filled slots to 'skip' when placing this attribute = 0
    for(int j = 0; j <= s + offset; j++)
        if(attr[j] > -1) offset++; (That is, skip a slot that is not free)
    next j
    attr[s + offset] = i;
next i
Reconstruct the tag using attr & attr_strings as before
Decode by doing the reverse - if the first attribute is in the second free slot, it must represent 01. If the second attribute is in the third free slot (actually the fourth overall slot, as it gets shunted along one by the first attribute in the second overall slot) it represents 10.
I knew I'd work it out eventually. The above can be explained/proven with binary trees and the like, but suffice to say that it works.
Andrew
Will code for bandwidth and caffeine
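A Python sketch of the slot-placement scheme described in this post. The names are invented for the example, and it assumes distinct attribute strings with plain string sorting as the shared order. Attribute i (in sorted order) stores floor(log2(n - i)) bits by choosing which of the remaining free slots it occupies.

```python
def slot_widths(n):
    """Bits stored by each attribute: floor(log2(n - i)) for i = 0..n-1."""
    return [(n - i).bit_length() - 1 for i in range(n)]


def encode_slots(attributes, bits):
    ordered = sorted(attributes)
    n = len(ordered)
    widths = slot_widths(n)
    if len(bits) != sum(widths):
        raise ValueError(f"need exactly {sum(widths)} bits")
    slots = [None] * n
    pos = 0
    for attr, width in zip(ordered, widths):
        s = int(bits[pos:pos + width], 2) if width else 0
        pos += width
        free = [j for j in range(n) if slots[j] is None]
        slots[free[s]] = attr          # the s-th slot that is still free
    return slots


def decode_slots(attributes):
    """Assumes the tag was produced by encode_slots with the same order."""
    ordered = sorted(attributes)
    free = list(range(len(ordered)))
    bits = []
    for attr, width in zip(ordered, slot_widths(len(ordered))):
        s = free.index(attributes.index(attr))  # which free slot was chosen
        free.pop(s)
        if width:
            bits.append(format(s, f"0{width}b"))
    return "".join(bits)


body = ['alink="#FF0000"', 'bgcolor="#FFFFFF"', 'link="#FF0000"',
        'text="#000000"', 'vlink="#FF0000"']
hidden = encode_slots(body, "011001")
print(hidden)
print(decode_slots(hidden))
```

With bits 011001, alink lands in the second slot (01) and bgcolor in the third free slot, which is the fourth slot overall (10), as in the decoding example from the post.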
|
|
|
|

|
You can even do better than that. Consider a tag with 6 attributes. Using your method you get 2+2+2+1+1+0=8 bits, but in fact the number of permutations is 6*5*4*3*2*1=720, which means you should be able to squeeze 9 bits out of it, and occasionally, 10.
To accomplish this, first sort the attributes by name, as before. E.g. for a tag like
<MyTag N="..." F="..." A="..." C="..." Z="..." Q="...">
I'll describe how to decode the tag and leave encoding as an exercise.
In alphabetical order, these are ACFNQZ, which you should assign the numeric values 012345. The first attribute in this example is N, and its corresponding numeric value is 3. Now, change the set of numeric values by (conceptually) removing N from the list. So the new set of possible attributes is ACFQZ and the corresponding values are 01234. The second attribute is F. F corresponds to 2. As before, remove F and continue. So what's left is ACQZ with values 0123. The next attribute is A, which is the first in the remaining set, so it has the value 0.
To summarize the steps above:
ACFNQZ = 012345; N => 3
ACFQZ = 01234; F => 2
ACQZ = 0123; A => 0
The remaining steps are
CQZ = 012; C => 0
QZ = 01; Z => 1
Q = 0; Q => 0 (this last step always produces zero, so you may as well leave it out)
The (useful) numbers we got are 3 2 0 0 1. Now, let's call these numbers A B C D E (i.e. A=3, B=2, etc). You can convert those numbers to a single number between 0 and 719 by plugging them into this magic formula:
(((((A) * 5 + B) * 4 + C) * 3 + D) * 2 + E) * 1
= (((((3) * 5 + 2) * 4 + 0) * 3 + 0) * 2 + 1) * 1
= ((((15 + 2) * 4 + 0) * 3 + 0) * 2 + 1) * 1
= (((68 + 0) * 3 + 0) * 2 + 1) * 1
= ((204 + 0) * 2 + 1) * 1
= (408 + 1) * 1
= 409.
Similarly, if there were 7 attributes instead of 6, the magic formula would be
((((((A) * 6 + B) * 5 + C) * 4 + D) * 3 + E) * 2 + F) * 1
and if there were only 4 attributes, the magic formula would be
(((A) * 3 + B) * 2 + C) * 1
To get 0, just put the attributes in order:
<MyTag A="..." C="..." F="..." N="..." Q="..." Z="..."> is 0
To get the highest possible value, put them in reverse order:
<MyTag Z="..." Q="..." N="..." F="..." C="..." A="..."> is 719
Since you can encode any number between 0 and 719, you can assume there are 10 bits if the number is greater than 511, and 9 bits if the number is less than 512. Alternatively, you could assume that any number above 511 is invalid; therefore, if you find a file that apparently has a number higher than 511, you could assume that the file does not contain steganographic information.
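The decoding steps above, together with one possible answer to the encoding exercise, can be sketched in Python. The function names are invented for the illustration, and the encoder is just one way to invert the formula, not necessarily what the poster had in mind; it assumes distinct attribute names sorted as plain strings.

```python
import math


def decode_permutation(attributes):
    """Map an attribute order to a number in 0 .. n!-1 (the 'magic formula')."""
    remaining = sorted(attributes)
    number = 0
    for attr in attributes:
        index = remaining.index(attr)          # value of attr among those left
        number = number * len(remaining) + index
        remaining.pop(index)                   # conceptually remove it
    return number


def encode_permutation(attributes, number):
    """Inverse of decode_permutation: produce the order that spells `number`."""
    remaining = sorted(attributes)
    n = len(remaining)
    if not 0 <= number < math.factorial(n):
        raise ValueError("number does not fit into this many attributes")
    digits = []
    for radix in range(1, n + 1):              # peel off digits, last position first
        number, digit = divmod(number, radix)
        digits.append(digit)
    result = []
    for digit in reversed(digits):             # first position has radix n
        result.append(remaining.pop(digit))
    return result


print(decode_permutation(list("NFACZQ")))       # 409
print(encode_permutation(list("ACFNQZ"), 409))  # ['N', 'F', 'A', 'C', 'Z', 'Q']
print(decode_permutation(list("ZQNFCA")))       # 719
```

The loop in decode_permutation is the nested formula evaluated from the inside out: for six attributes it multiplies by 6, 5, 4, 3, 2 and 1 in turn, and the first multiplication acts on zero.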
|
|
|
|

|
This is the coolest thing I have seen all day
~Alexander Kent
|
|
|
|

|
I really enjoyed the article, thanks
Carl (www.assemblySoft.com)
|
|
|
|

|
Thank you, all very well and so on..
but i got nothing to hide
|
|
|
|
|
|

|
Have you thought about extending this to hiding info inside of WebService messages?
|
|
|
|

|
Not exactly WebService messages. I thought about a steganographic serializer.
You know the framework's SoapFormatter class... it should be possible to write a class that does the same serialization, and encodes short messages in the XML. Deserialization would return two things: the deserialized object and the message.
_________________________________
Vote '1' if you're too lazy for a discussion
|
|
|
|

|
Yes, I'm back... and I'm happy to read your new article...
It's a nice idea for hiding data, but I think it needs too many tags for hiding much data... I believe hidden fields can optimize this problem...
Best regards...
|
|
|
|
|
|

|
by my understanding attribute order has no meaning for serialization of a html document, and as a consequence placing attributes is not required to follow any particular order. I wonder if this could be trusted to be future compatible.
|
|
|
|

|
Nothing is guaranteed to be future compatible.
As long as static HTML pages are stored on the webserver as files, it will work. Load any file onto a web server via FTP, download it via HTTP, and you'll see the same text.
If one day database-oriented web servers become the standard, and pages are stored as document object models and streamed to the clients tag by tag, a special attribute order will be hard to keep alive.
_________________________________
Vote '1' if you're too lazy for a discussion
|
|
|
|

|
is it really useful? such a lot of work for such a small amount of hidden text?
i mean its great as a theoretical concept but in reality i can think of a bajillion ways to hide info better
"there is no spoon" biz stuff about me
|
|
|
|

|
Have you ever thought about meta-content in a whole website?
The images and linked media files are holding the main part of the hidden content - and the header data needed to extract it is hidden in the HTML.
Anyway, you don't have to use it.
_________________________________
Vote '1' if you're too lazy for a discussion
|
|
|
|

|
Well no it's not useful, but this is like building a modern pc in a NES case or running Linux on a watch - it's just plain cool. Like some wise slashdotter remarked after someone asked 'why?' about one of the thousands of useless projects that show up on slashdot: 'if you have to ask 'what's the point', then don't bother, you'll never get the point.'
|
|
|
|
|

|
It's useful if you're the type looking for some new novel way to share information without someone else knowing you're sharing information..which is exactly the point of stego! Great job. Great series!
|
|
|
|

|
Even better, this great article really opens your mind to the limitless possibilities regarding the hiding of data...
I expect Corinna's next article to be on hiding secret messages within CodeProject articles and the related responses (or maybe she's already done that)
|
|
|
|

|
Unfortunately, this document does not, apparently, hide a secret message, at least according to the keys I have.
Not that I checked or anything... No no no no...
Andrew
Will code for bandwidth and caffeine
|
|
|
|

|
Didn't it occur to you that the secret message might also be encrypted as well as hidden? Then it would look like random data :P
|
|
|
|

|
Hmm, I (blue torino) have no idea (corner of lexington and 1st) what you could be (under drivers seat) talking about. Ahhm..(Kevin Nealon.)
|
|
|
|
 |
|
|
|
Some ideas on how to hide binary data in text documents
| Type | Article |
| Licence | CPOL |
| First Posted | 14 Nov 2004 |
| Views | 99,986 |
| Downloads | 1,021 |
| Bookmarked | 50 times |
|
|