I was able to encrypt text documents by using StreamReader and StreamWriter to read and change the file content.
With a Word document, I tried renaming it to .txt first, then opening it and encrypting whatever was inside, but when I decrypt it, the file stays corrupted.
I need an explanation of how to read the text in a Word document file.

PS: I use my own Caesar cipher method.

Edit:
I don't care whether it's weak or not.
I just want to know the steps to encrypt a Word document file.

For example, I have a Word document with the text "ABCDEFGHIJKLMNOPQRSTUVWXYZ" inside,
and I'm going to encrypt it using a Vernam cipher.
How?
Posted
Updated 16-Nov-19 10:48am
v3
Comments
Member 13566383 18-Nov-19 16:27pm
   
If your encryption/decryption software cannot handle binary files but only text files, you should convert the binary file stream into Base64 strings or character arrays (every 3 bytes are represented by 4 characters).
The methods to use are Convert.ToBase64String / Convert.ToBase64CharArray.
Decrypting your encrypted file will produce Base64 strings / characters, which you can convert back to binary format by using the corresponding methods Convert.FromBase64...
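To illustrate the comment above, here is a minimal sketch of a Base64 round trip, assuming the file has already been loaded as a byte array (for example with File.ReadAllBytes). The class and method names are made up for this example; only the Convert methods come from the comment.

```csharp
using System;
using System.Linq;

public static class Base64RoundTrip
{
    // Encode arbitrary bytes (including NUL bytes) as Base64 text.
    public static string ToText(byte[] data) => Convert.ToBase64String(data);

    // Decode Base64 text back into the original bytes.
    public static byte[] FromText(string text) => Convert.FromBase64String(text);

    // True when encoding and then decoding returns exactly the same bytes.
    public static bool RoundTripsCleanly(byte[] data) =>
        FromText(ToText(data)).SequenceEqual(data);
}
```

For instance, Base64RoundTrip.ToText(new byte[] { 0x00, 0x01, 0x02 }) yields "AAEC". Note that a text-only cipher applied to the Base64 string must produce output that can be turned back into valid Base64 on decryption, otherwise FromText will throw.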
Member 14596277 19-Nov-19 17:41pm
   
Why convert to Base64? Why not do File.ReadAllBytes(path), then convert the bytes to Hexadecimal Characters, so you may view in, say, a RichTextBox? :D
Member 14596277 19-Nov-19 17:44pm
   
Base64 is hardly ideal for this, because if you attempted to Convert.FromBase64String(string) then you would see nothing, if displayed in a RichTextBox or a label, etc. Hexadecimal allows you to 100% view the bytes, even NUL bytes("\x0", or "00"), perfectly.
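For comparison, a minimal sketch of the hexadecimal approach described in these comments might look like the following. The HexView class is a hypothetical helper written for this example; it relies only on BitConverter.ToString and Convert.ToByte.

```csharp
using System;

public static class HexView
{
    // Render bytes as space-separated hex pairs, e.g. "00 41 FF".
    public static string ToHex(byte[] data) =>
        BitConverter.ToString(data).Replace("-", " ");

    // Parse space-separated hex pairs back into bytes.
    public static byte[] FromHex(string hex)
    {
        string[] parts = hex.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        byte[] result = new byte[parts.Length];
        for (int i = 0; i < parts.Length; i++)
        {
            result[i] = Convert.ToByte(parts[i], 16);
        }
        return result;
    }
}
```

The resulting string could be assigned to a RichTextBox for viewing. Hex uses two characters per byte (plus separators here), so it is bulkier than Base64 but easier to read byte by byte.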
Member 13566383 20-Nov-19 2:09am
   
I have never had any problems reading Base64 characters. A Base64 stream consists of valid ASCII characters only. There is no problem with NUL bytes; see the MSDN documentation for the Convert.ToBase64String method:

".....The following example demonstrates the ToBase64String method. The input is divided into groups of three bytes (24 bits) each. Consequently, each group consists of four 6-bit numbers where each number ranges from decimal 0 to 63. In this example, there are 85 3-byte groups with one byte remaining. The first group consists of the hexadecimal values 00, 01, and 02, which yield four 6-bit values equal to decimal 0, 0, 4, and 2. Those four values correspond to the base-64 digits "A", "A", "E", and "C" at the beginning of the output."

Solution 3

As everyone keeps pointing out, a Word document is not plain text.

What nobody has asked is which Word format you are processing.
Are you processing a DOC or a DOCX file? That is, does it conform to the OpenXML format?

If you are reading a DOC file, then you need to use the correct interop/library to extract the raw text data, which you can then encrypt. (I suggest you start with the Word Interop; an alternative that doesn't require Word to be installed is NPOI, whose 2.0 beta version also supports DOCX files.)

If you are reading a DOCX file, then you need to use an OpenXML library to extract the raw text data, then encrypt it and insert it back into the document (see the Microsoft Help Page).
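As a rough sketch of the DOCX case, the text can also be pulled out without a dedicated OpenXML library, because a DOCX file is a ZIP archive of XML parts. The example below assumes the main part is stored at word/document.xml, which is the usual location but is strictly determined by the package relationships. The DocxText class is hypothetical, and this only reads the text; writing encrypted text back into the document is not covered.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class DocxText
{
    // WordprocessingML namespace used by paragraph (w:p) and text (w:t) elements.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the document text, one line per paragraph.
    public static string Extract(Stream docx)
    {
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            // Assumption: the main document part lives at the conventional path.
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
            {
                throw new InvalidDataException("word/document.xml not found");
            }

            using (Stream xml = entry.Open())
            {
                XDocument doc = XDocument.Load(xml);
                var paragraphs = doc.Descendants(W + "p")
                    .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)));
                return string.Join(Environment.NewLine, paragraphs);
            }
        }
    }
}
```

A call such as DocxText.Extract(File.OpenRead(path)) would return the plain text, which could then be passed to a text cipher. On .NET Framework this needs a reference to the System.IO.Compression assembly.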
   
Comments
Midnight Ahri 3-Dec-13 3:26am
   
Thank you very much for the information.
What if I ignore whether it is doc / docx / whatever?
I'll just read the file byte by byte and encrypt it using a Vernam cipher.
I can encrypt anything with that.
Is this a good solution?
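The byte-by-byte idea in this comment could be sketched as below. This is an illustration under assumptions, not the poster's actual code: the VernamFile class, its method names, and the choice to store the key in a separate file are all invented for the example. A Vernam cipher needs a random key at least as long as the data, and that key must never be reused.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class VernamFile
{
    // XOR every data byte with the key byte at the same position.
    // Applying the same key a second time restores the original bytes.
    public static byte[] Xor(byte[] data, byte[] key)
    {
        if (key.Length < data.Length)
        {
            throw new ArgumentException("Key must be at least as long as the data.");
        }

        var result = new byte[data.Length];
        for (int i = 0; i < data.Length; i++)
        {
            result[i] = (byte)(data[i] ^ key[i]);
        }
        return result;
    }

    // Generate a cryptographically random key of the requested length.
    public static byte[] NewKey(int length)
    {
        var key = new byte[length];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(key);
        }
        return key;
    }

    // Encrypt any file (Word or otherwise) as raw bytes; the key goes to keyPath.
    public static void EncryptFile(string inputPath, string outputPath, string keyPath)
    {
        byte[] plain = File.ReadAllBytes(inputPath);
        byte[] key = NewKey(plain.Length);
        File.WriteAllBytes(keyPath, key);
        File.WriteAllBytes(outputPath, Xor(plain, key));
    }
}
```

Decryption is the same XOR with the stored key. Because the file is never treated as text, the DOC or DOCX structure does not matter, but the encrypted output is no longer a document that Word can open.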
Pheonyx 3-Dec-13 3:33am
   
It depends on your actual objective. Personally, I don't know that much about encryption done that way.

If it was me I would probably just use something like this:
Click me
Midnight Ahri 3-Dec-13 3:36am
   
Thank you very much for the information, it helps a lot. =D

Solution 1

PS: I use my own encryption method.

There's your big mistake. Usually, unless you're holding a PhD in Math, rolling your own "encryption" method gives you about the most insecure encryption you can come up with.

Your second mistake is that you haven't posted any code having anything to do with your "encryption" and "decryption", so it's impossible for anyone to tell you what you have done wrong there.
   
Comments
Midnight Ahri 2-Dec-13 22:46pm
   
Sorry for the poor information.
I'm using a Caesar cipher that encrypts characters one by one:
read the text file, encrypt, then modify the text file.
I need an explanation of how to read the text in a Word document file without the XML format.
Dave Kreskowiak 2-Dec-13 23:57pm
   
Yeah, well, that's the problem. The Caesar Cipher only works on text files. Word documents are NOT text files. They are binary!

Also, the Caesar Cipher is very weak and easily broken. You're really not encrypting anything with that.
Midnight Ahri 3-Dec-13 1:53am
   
Yes it's true that Caesar Cipher is very weak.
But my main point is to get plain text from word documents.
Can you provide me information about that?

Solution 2

Your approach goes wrong because Word documents are not plain text files; they are binary files.
So, in your process of
- change extension to txt
- open the file as a text one
- encrypt it as if it was a true text file
you are corrupting it.
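A small sketch can show why those steps corrupt the file. It assumes the text is decoded as UTF-8, which is what StreamReader does by default; the TextRoundTripDemo class is hypothetical. The sample bytes D0 CF 11 E0 are the first four bytes of the signature found at the start of legacy .doc files.

```csharp
using System;
using System.Linq;
using System.Text;

public static class TextRoundTripDemo
{
    // Decode bytes as UTF-8 text and encode the text again,
    // which is roughly what reading and rewriting a file "as text" does.
    public static byte[] ThroughText(byte[] data)
    {
        string asText = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(asText);
    }

    // True when the bytes come back unchanged from the text round trip.
    public static bool Survives(byte[] data) =>
        ThroughText(data).SequenceEqual(data);
}
```

Plain ASCII bytes survive this round trip, but byte sequences that are not valid UTF-8 are replaced during decoding, so TextRoundTripDemo.Survives(new byte[] { 0xD0, 0xCF, 0x11, 0xE0 }) is false. The damage happens before any cipher is applied, which is why decrypting cannot restore the document.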

You need a binary encryption process, which is far more complicated than a simple Caesar encryption. Some brilliant mathematicians have worked and still work on the subject nowadays.
   
Comments
Midnight Ahri 3-Dec-13 3:00am
   
Now I have my Vernam cipher encryption, but I still need a way to read the plain text of a Word document.
I can't simply choose the file and encrypt it, right?
phil.o 16-Nov-19 16:14pm
   
What problem?
Member 14596277 19-Nov-19 17:39pm
   
By "problem", I mean his original question. Plus, I did answer his question "Encrypt Word Document File". -And in C#, too. He commented and asked a question, but no-one answered. I answered :)
phil.o 19-Nov-19 17:50pm
   
I still wonder what this has to do with me. But nevermind, that is not so important anyway.
Member 14596277 2-Dec-19 14:14pm
   
No, I wasn't referring to you -- I was talking to Midnight Ahri :) Sorry for confusion

Solution 4

- I am classifying this question as not solved, since his question was not directly answered. You need to use
Spire.Doc

It can be downloaded from NuGet, and is useful when you need to decrypt a Word document.

Install
Spire.Doc
- Add a using Spire.Doc; directive, then paste / type this code into the code that handles your OpenFileDialog:

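// Load the file chosen in the OpenFileDialog (ofd) as a DOCX document.
Document document = new Document();
document.LoadFromFile(ofd.FileName, FileFormat.Docx);

// Remove the document's encryption, then show its plain text.
document.RemoveEncryption();
richTextBox1.Text = document.GetText();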


- I assume you are using an OpenFileDialog; if you are not, replace ofd.FileName with your file name. - If you get an error saying that ofd is not found, declare it first:
OpenFileDialog ofd = new OpenFileDialog();


- And, technically, I have an answer to your question :) Here:

document.Encrypt("Password!");


-That encrypts the file with a password :D

-Happy to help :)
   
v2
Comments
phil.o 16-Nov-19 16:11pm
   
6 years later, I would bet this question is no longer a concern for the original poster; either he/she solved it already, or decided to do something else.
Anyway, even though I praise your urge to help others, you should probably stick to answering recent questions. Some could consider that you are reputation-hunting and flag your account for that.
Cheers :)
Member 14596277 19-Nov-19 16:51pm
   
I don't see the problem with helping some 6-year-old-questions with up-to-date answers, but I'll keep that in mind. -I don't think that I will get flagged, because I am not "reputation-hunting" - I don't use this site very much, I use StackOverFlow.com a lot more. Plus, I don't think that I will get flagged just because people are "considering" that I am reputation-hunting. Thanks, though :)

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



