Hi guys,

I am using
C#
Convert.FromBase64String
but I have old data in the database, and I don't know whether the length of the input data (coming from the database) is divisible by 4 or not.

How can I check whether the data is a base64 string or a plain string?

A brief overview of the Convert.FromBase64String method

Regards,
Nikhil S.
Please, I need an urgent solution to my problem. If you wish, I am available to talk at <removed email>

Thanks in advance.


--EDIT: I removed your email address. It's not a good idea to post that here. When people suggest a solution or post a question or comment about your question you will be emailed and notified.

Hello, and thanks to everyone who stopped by :)

As I already mentioned, my database contains old data from before I started saving converted values (Base64 strings). The issue is that the DB now contains both Base64 strings and plain string values.

I'll try to generalize my problem.
This bit of code throws an error:
C#
String s = "Hello";
byte[] abc = Convert.FromBase64String(s);

Why doesn't this bit throw as well?
C#
String s = "Hell";
byte[] abc = Convert.FromBase64String(s);


Thanks.
Ns.
Updated 12-Oct-11 20:30pm
Comments
the headless nick 12-Oct-11 10:02am    
thanks for that bud. It would have been much nicer of you if you tried to help me on my problem instead.
NS
Sergey Alexandrovich Kryukov 12-Oct-11 10:49am    
Who do you think is not nice enough to you? Do you consider your own comment above as nice?
--SA
the headless nick 13-Oct-11 2:17am    
I don't know. That's not the issue we are here for.
NS

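Divisibility of the length by 4 is a good indication only when padding is used properly. Where padding is omitted (you can work out how many '='s to append from the remainder of length % 4), the strings may have other lengths, although an unpadded length that leaves a remainder of 1 is never valid. Note that Convert.FromBase64String itself does require the padding: it throws a FormatException when the length, ignoring white space, is not a multiple of 4.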
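As a minimal sketch of the length % 4 remark, the helper below appends the missing '=' characters before decoding. RestorePadding is a hypothetical name, and the sketch assumes the input contains no white space and is otherwise valid base-64.

```csharp
using System;

static class Base64Padding
{
    // Hypothetical helper: appends the '=' characters an unpadded base-64
    // string is missing, so that Convert.FromBase64String will accept it.
    // Assumes the input contains no white space.
    public static string RestorePadding(string s)
    {
        int remainder = s.Length % 4;
        if (remainder == 0)
            return s; // already a multiple of 4, nothing to add
        if (remainder == 1)
            throw new FormatException("No byte sequence encodes to this length.");
        // remainder 2 needs "==", remainder 3 needs "="
        return s + new string('=', 4 - remainder);
    }
}
```

For example, RestorePadding("SGVsbG8") gives "SGVsbG8=", which decodes to the bytes of "Hello". It repairs padding only; it says nothing about whether the value was base-64 to begin with.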

In general, you cannot reliably tell base-64 strings from other strings. In most cases, however, you know a string is not base-64 if it contains characters outside the 64 used by the encoding (A-Z, a-z, 0-9, +, /) plus trailing = for padding. Any string composed only of legal base-64 characters, with a valid length, can be decoded successfully, whether or not it was ever meant as base-64. That is why "Hell" decodes without an error while "Hello" (five characters) does not.
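The character test above can be sketched roughly as follows. TryDecode is a hypothetical helper name, not part of the framework; it rejects anything outside the base-64 alphabet (including white space, which Convert.FromBase64String would otherwise ignore) and then lets the decoder check length and padding.

```csharp
using System;

static class Base64Check
{
    // True if c is one of the 64 characters of the standard alphabet.
    static bool IsBase64Char(char c)
    {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
               (c >= '0' && c <= '9') || c == '+' || c == '/';
    }

    // Hypothetical helper: returns true and the decoded bytes when s is
    // syntactically valid base-64. A true result does NOT prove the value
    // was stored as base-64; a plain string can pass by coincidence.
    public static bool TryDecode(string s, out byte[] bytes)
    {
        bytes = null;
        if (string.IsNullOrEmpty(s))
            return false;

        // Only trailing '=' is legal, so strip it before the alphabet check.
        foreach (char c in s.TrimEnd('='))
        {
            if (!IsBase64Char(c))
                return false;
        }

        try
        {
            bytes = Convert.FromBase64String(s); // checks length and padding
            return true;
        }
        catch (FormatException)
        {
            return false;
        }
    }
}
```

With this sketch, "Hello" and "Hello World" are rejected, while "Hell" is accepted and yields three bytes, which shows why a successful decode is only a hint.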
 
Comments
Sergey Alexandrovich Kryukov 12-Oct-11 10:47am    
My 5. Good point about divisibility. At the same time, our answers conclude the same thing, I'm sure -- correctly. :-)
--SA
Strictly speaking, there is no way to check this. You can only do some guesswork: try to decode the string as base64 and see whether the result makes any sense, no more.

A base64 string simply consists of ASCII characters, 8-bit padded: '0'..'9', 'A'..'Z', 'a'..'z', '+', '/', plus '=' for padding. There are several variants of the encoding. Usually, the MIME specification (based on RFC 1421) is implied. See:
http://en.wikipedia.org/wiki/Base64
http://en.wikipedia.org/wiki/Base64#MIME
http://tools.ietf.org/html/rfc1421

Imagine you generated a completely random sequence of such characters and formatted the text exactly as base64 encoders usually do, using the same delimiters, including end-of-line characters (this formatting is purely optional for base64). Is it base64-encoded data? Yes and no; it depends on how you look at it. What is a "simple string" anyway? There is no such thing. (By the way, I don't understand how division by 4 can be relevant here.)

The problem is: the standard for base64 encoding does not include any meta-data which would help to recognize the format. Such meta-data is always placed outside of base64 data. For example, in e-mail you will usually get a header Content-Type: multipart/mixed; and one or more of the parts will have part headers with Content-Type: application/octet-stream and Content-Transfer-Encoding: base64. You can consider this as the meta-data showing you how to use the string. You should do something like that if you use base64 in your software, but what to do with available legacy data? Too late…
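For newly written values, the out-of-band marker idea could look something like the sketch below. The "b64:" prefix and the TaggedValue class are assumptions made up for illustration, and the sketch further assumes the stored payload is UTF-8 text. It does nothing for untagged legacy rows, and a legacy plain string that happens to start with "b64:" would be misread.

```csharp
using System;
using System.Text;

static class TaggedValue
{
    // Hypothetical marker; ':' is not a base-64 character, so the prefix
    // cannot be confused with the encoded payload that follows it.
    const string Prefix = "b64:";

    // Encodes text and tags it so it can be recognized later.
    public static string Encode(string text)
    {
        return Prefix + Convert.ToBase64String(Encoding.UTF8.GetBytes(text));
    }

    // Decodes tagged values; anything without the tag is returned as-is,
    // on the assumption that it is plain text.
    public static string Decode(string stored)
    {
        if (!stored.StartsWith(Prefix, StringComparison.Ordinal))
            return stored;
        byte[] bytes = Convert.FromBase64String(stored.Substring(Prefix.Length));
        return Encoding.UTF8.GetString(bytes);
    }
}
```

A separate flag column in the table would serve the same purpose without touching the stored string.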

—SA
 
Comments
dasblinkenlight 12-Oct-11 10:43am    
> By the way, I don't understand how division by 4 can be relevant here
Since every three bytes of the original sequence are encoded as four base-64 characters, the length of a properly padded result is always divisible by four. However, padding is mandatory only when multiple base-64 items are concatenated as a single string; in all other cases, it is optional. That is why the divisibility of the length by four has rather limited applicability.
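The three-bytes-to-four-characters arithmetic can be checked with a small sketch. PaddedLength is a hypothetical helper name; Convert.ToBase64String always emits padded output, so its result length should match.

```csharp
using System;

static class Base64Length
{
    // Length of padded base-64 output for byteCount input bytes:
    // four characters for every started group of three bytes.
    public static int PaddedLength(int byteCount)
    {
        return 4 * ((byteCount + 2) / 3);
    }
}
```

So 1, 2 or 3 bytes all give 4 characters, 4 bytes give 8, and so on, which is also where the "about one third larger" rule of thumb comes from.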
Sergey Alexandrovich Kryukov 12-Oct-11 10:46am    
Sure, thanks for answering.
--SA
As the other answers state, in general it is not possible. However, depending on what the field is supposed to hold, you can specify some rules to make a good guess. These three rules (in this order, I think) will work for most types of data.

How long is the data supposed to be? If it has a fixed length (e.g. an ID), look at the length of the string; if it is about 1/3 larger than you expect, it is probably base64.

Would the data normally contain characters outside the base64 range? Any character which is not valid in the encoding (e.g. spaces, punctuation, accented characters) means it's not base64 encoded. Names and addresses will usually have spaces.

Is it supposed to be human-readable text? Try base64-decoding it; if any byte of the output is not a readable character (anything below 32 except for \r, \n and \t, possibly punctuation characters), then it was not base64.
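The second and third rules above might be combined along these lines. This is only a heuristic sketch: LooksLikeBase64Text is a hypothetical name, it assumes the original data was meant to be readable text, and it leaves out the length rule because that depends on the field.

```csharp
using System;

static class Base64Guess
{
    // Hypothetical heuristic: guesses whether a stored value is base-64
    // encoded text. It can be wrong in both directions.
    public static bool LooksLikeBase64Text(string stored)
    {
        if (string.IsNullOrEmpty(stored))
            return false;

        // Rule 2: any character outside the base-64 alphabet (spaces,
        // punctuation, accented letters) means plain text.
        foreach (char c in stored)
        {
            bool legal = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                         (c >= '0' && c <= '9') || c == '+' || c == '/' || c == '=';
            if (!legal)
                return false;
        }

        byte[] bytes;
        try
        {
            bytes = Convert.FromBase64String(stored); // rejects bad length or padding
        }
        catch (FormatException)
        {
            return false;
        }

        // Rule 3: decoded text should not contain control characters
        // other than \r, \n and \t.
        foreach (byte b in bytes)
        {
            bool allowedControl = b == (byte)'\r' || b == (byte)'\n' || b == (byte)'\t';
            if (b < 32 && !allowedControl)
                return false;
        }
        return true;
    }
}
```

With this sketch, "Hello" fails on length, "John Smith" fails on the space, and "Hell" decodes but its first byte is 0x1D, a control character, so it is treated as plain text. A short plain value can still pass by coincidence, so the result is a guess, not a proof.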
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


