 |
|
 |
For a couple of reasons I decided to post another piece of code that may be useful to you.
1. The article's code has some problems that were already mentioned in other messages below.
2. My variation is based on Encoding. People who search the web and land on this page may not be concerned about Encoding dropping some letters, or may not even believe such a problem exists. So this variation is for those who want to use Encoding for converting text to byte[] and vice versa.
Here is the code:
public class Zip
{
    public static byte[] Compress(string text, Encoding Encoding = null)
    {
        if (text == null) return null;
        Encoding = Encoding ?? System.Text.Encoding.Unicode;
        var textBytes = Encoding.GetBytes(text);
        var textStream = new MemoryStream();
        var zip = new GZipStream(textStream, CompressionMode.Compress);
        zip.Write(textBytes, 0, textBytes.Length);
        zip.Close();
        return textStream.ToArray();
    }

    public static string Decompress(byte[] value, Encoding Encoding = null)
    {
        if (value == null) return null;
        Encoding = Encoding ?? System.Text.Encoding.Unicode;
        var inputStream = new MemoryStream(value);
        var outputStream = new MemoryStream();
        var zip = new GZipStream(inputStream, CompressionMode.Decompress);
        byte[] bytes = new byte[4096];
        int n;
        while ((n = zip.Read(bytes, 0, bytes.Length)) != 0)
        {
            outputStream.Write(bytes, 0, n);
        }
        zip.Close();
        return Encoding.GetString(outputStream.ToArray());
    }
}
Sam Naseri
Software Developer
Blog : http://samondotnet.blgospot.com
|
|
|
|
 |
|
 |
Great code, Sam. In my tests (on long UTF-8 encoded strings) it works great.
I modified it slightly to use using statements to make sure everything is disposed:
public class Zip
{
    public static byte[] Compress(string text, Encoding Encoding = null)
    {
        if (text == null) return null;
        Encoding = Encoding ?? System.Text.Encoding.Unicode;
        var textBytes = Encoding.GetBytes(text);
        using (var textStream = new MemoryStream())
        {
            using (var zip = new GZipStream(textStream, CompressionMode.Compress))
            {
                zip.Write(textBytes, 0, textBytes.Length);
                zip.Close();
            }
            return textStream.ToArray();
        }
    }

    public static string Decompress(byte[] value, Encoding Encoding = null)
    {
        if (value == null) return null;
        Encoding = Encoding ?? System.Text.Encoding.Unicode;
        using (var inputStream = new MemoryStream(value))
        using (var outputStream = new MemoryStream())
        {
            using (var zip = new GZipStream(inputStream, CompressionMode.Decompress))
            {
                byte[] bytes = new byte[4096];
                int n;
                while ((n = zip.Read(bytes, 0, bytes.Length)) != 0)
                {
                    outputStream.Write(bytes, 0, n);
                }
                zip.Close();
            }
            return Encoding.GetString(outputStream.ToArray());
        }
    }
}
|
|
|
|
 |
|
 |
Thanks for the improvement, now it's much better.
|
|
|
|
 |
|
 |
Thanks for this constructive help.
"Nothing is lost, Nothing is created, Everything is transformed" Lavoisier
http://wlwilliamsiv.com
|
|
|
|
 |
|
 |
In the UnZip function, the length of the array has to be set to the length of the original string.
That is, the first line of UnZip(),
//Transform string into byte[]
byte[] byteArray = new byte[value.Length];
should be replaced with
//Transform string into byte[]
byte[] byteArray = new byte[original text length];
|
|
|
|
 |
|
 |
The string "encoding" used by this article is all wrong: it assumes that all characters are single bytes, which is not true even for English text. Internally .NET stores all characters as UTF-16, but even then some symbols may need two chars to describe them. (Chinese text, for example.) Encoding.UTF8.GetBytes() should have been used instead.
There are other errors too, as others have pointed out.
Avoid this garbage.
|
|
|
|
 |
|
 |
Converting a string to a byte array assuming a character is a single byte is wrong.
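To make the point concrete, here is a minimal sketch of what the char-to-byte cast does to a character above U+00FF, compared with a round trip through Encoding.UTF8. The class and method names (CharCastDemo, CastToBytes, CastToString) are made up for this illustration and are not from the article.

```csharp
using System;
using System.Text;

public static class CharCastDemo
{
    // Mimics the article's approach: keep only the low 8 bits of each char.
    public static byte[] CastToBytes(string text)
    {
        var bytes = new byte[text.Length];
        for (int i = 0; i < text.Length; i++)
            bytes[i] = unchecked((byte)text[i]);
        return bytes;
    }

    // Reverses the cast by widening each byte back to a char.
    public static string CastToString(byte[] bytes)
    {
        var sb = new StringBuilder(bytes.Length);
        foreach (byte b in bytes)
            sb.Append((char)b);
        return sb.ToString();
    }

    public static void Main()
    {
        string text = "price: 5\u20AC"; // ends with the euro sign, U+20AC

        string viaCast = CastToString(CastToBytes(text));
        string viaUtf8 = Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(text));

        Console.WriteLine(viaCast == text); // False: U+20AC was truncated to 0xAC
        Console.WriteLine(viaUtf8 == text); // True
    }
}
```

Pure ASCII text survives the cast, which is probably why the problem is easy to miss in a quick test.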
|
|
|
|
 |
|
 |
Did you even test the code before posting, or did you just copy it from somewhere? When decompressing, you are using a buffer of the same size as the compressed data. Since compression is supposed to make the data smaller, the decompressed output will not fit in that buffer, so this code will always fail to decompress correctly.
|
|
|
|
 |
|
 |
Dah!
Did that for a special need...this code is not meant to satisfy everybody...adapt it to your needs or move on to what you need...
DA
"Nothing is lost, Nothing is created, Everything is transformed" Lavoisier
http://wlwilliamsiv.com
|
|
|
|
 |
|
 |
public static string DecompressData(string sData)
{
    byte[] byteArray = new byte[sData.Length];
    int indexBa = 0;
    foreach (char item in sData)
        byteArray[indexBa++] = (byte)item;
    MemoryStream memoryStream = new MemoryStream(byteArray);
    GZipStream gZipStream = new GZipStream(memoryStream, CompressionMode.Decompress);
    byteArray = new byte[1024];
    StringBuilder stringBuilder = new StringBuilder();
    int readBytes;
    while ((readBytes = gZipStream.Read(byteArray, 0, byteArray.Length)) != 0)
    {
        for (int i = 0; i < readBytes; i++)
            stringBuilder.Append((char)byteArray[i]);
    }
    gZipStream.Close();
    memoryStream.Close();
    gZipStream.Dispose();
    memoryStream.Dispose();
    return stringBuilder.ToString();
}
|
|
|
|
 |
|
 |
What is this? Please explain what was fixed.
|
|
|
|
 |
|
 |
Tried to compress an HTML page. Source length = 1764, compressed = 1764. Something is wrong.
|
|
|
|
 |
|
 |
Hi there.
This code works fine.
Imports System.IO
Imports System.IO.Compression
Public Class CZip
    Public Shared Function Compress(ByVal Buff() As Byte) As Byte()
        Dim MS As New MemoryStream()
        Dim ZipStream As New GZipStream(MS, CompressionMode.Compress)
        ZipStream.Write(Buff, 0, Buff.Length)
        ZipStream.Close()
        Compress = MS.ToArray
        MS.Close()
        ZipStream.Dispose()
        MS.Dispose()
        GC.Collect()
    End Function
    Public Shared Function DeCompress(ByVal Buff() As Byte) As Byte()
        Dim MS As New MemoryStream(Buff)
        Dim ResultMS As New MemoryStream
        Dim ZipStream As New GZipStream(MS, CompressionMode.Decompress)
        Dim br As New BinaryReader(ZipStream)
        Dim BW As New BinaryWriter(ResultMS)
        While True
            Dim arr() As Byte = br.ReadBytes(1000000)
            If arr Is Nothing OrElse arr.Length = 0 Then
                Exit While
            End If
            BW.Write(arr)
        End While
        BW.Flush()
        ResultMS.Flush()
        DeCompress = ResultMS.ToArray
        ZipStream.Close()
        MS.Close()
        ResultMS.Close()
        ZipStream.Dispose()
        MS.Dispose()
        ResultMS.Dispose()
        GC.Collect()
    End Function
End Class
|
|
|
|
 |
|
 |
Hi,
I am doing like this:
GZipStream objCompressedStream = new GZipStream(objmod, CompressionMode.Compress, true);
objCompressedStream.Write(btReadArray, 0, aiNo_of_bytes_read);
Now when I try to get the length of objCompressedStream using the objCompressedStream.Length property, it throws an exception saying that the operation is not supported.
Can anyone suggest an alternative, or some other way to get the length of the compressed stream?
Thanks
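One possible way around this, sketched under the assumption that the underlying stream (objmod in the snippet above) supports Length, as MemoryStream and FileStream do: close the GZipStream first so that all compressed data is written out, then read Length from the underlying stream instead. The names CompressedLengthDemo and GetCompressedLength are invented for this example.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class CompressedLengthDemo
{
    // Returns the number of compressed bytes produced for the given input.
    public static long GetCompressedLength(byte[] data)
    {
        using (var target = new MemoryStream())
        {
            // leaveOpen = true keeps 'target' usable after the GZipStream is disposed.
            using (var gzip = new GZipStream(target, CompressionMode.Compress, true))
            {
                gzip.Write(data, 0, data.Length);
            } // disposing the GZipStream writes the remaining compressed data

            // GZipStream.Length is not supported, but the underlying stream's Length is.
            return target.Length;
        }
    }

    public static void Main()
    {
        byte[] input = new byte[10000]; // highly compressible: all zeros
        Console.WriteLine(GetCompressedLength(input));
    }
}
```

Reading the length before the GZipStream is closed may give a value that is too small, because part of the output can still be buffered inside the compressor.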
|
|
|
|
 |
|
 |
From my point of view the code gets a lot faster and cleaner if you use the right encoding from System.Text to create the byte array and also to change the byte array back into a string.
|
|
|
|
 |
|
 |
Sorry, I missed earlier what you wrote in the text. I guess that there is a bug somewhere else, or that you are not 100% sure about the right decoding. At least for the second part, an encoding should be used anyway.
|
|
|
|
 |
|
 |
//Reset variable to collect uncompressed result
//There was a mistake here in the original code.
//A buffer that is the same size may not hold the
//result of what is decompressed. The decompressed
//version may be larger.
byteArray = new byte[byteArray.Length*5];
This worked for me. The buffer size is probably overkill. Some of you can probably figure out how the ratio between gzip-compressed and uncompressed data affects the buffer size needed for decompression.
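An alternative that avoids guessing a multiplier at all is to let a MemoryStream grow as the data arrives. The following is only a sketch: it assumes .NET 4 or later (for Stream.CopyTo), and the names GrowingBufferDemo, Compress and Decompress are made up for the example.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class GrowingBufferDemo
{
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            // ToArray still works on a closed MemoryStream.
            return output.ToArray();
        }
    }

    // Decompresses without knowing the original size in advance.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output); // reads in chunks until the end of the data
            return output.ToArray();
        }
    }

    public static void Main()
    {
        byte[] original = new byte[50000]; // compresses far better than 5:1
        byte[] restored = Decompress(Compress(original));
        Console.WriteLine(restored.Length); // 50000
    }
}
```

With highly repetitive input like this, a fixed buffer of five times the compressed size would be far too small, which is why a growing output stream is the safer choice.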
|
|
|
|
 |
|
 |
You are correct that GZipStream.Flush() doesn't work. In fact, this problem can be traced to the underlying DeflateStream.Flush() implementation.
As you say, a workaround is to call GZipStream.Close(). However, this will also close the underlying stream. Luckily, GZipStream has a third parameter on its constructor (leaveOpen) which tells GZipStream/DeflateStream not to close the underlying stream when disposing.
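A small sketch of the difference the third constructor argument makes, using MemoryStream.CanRead to check whether the underlying stream is still open afterwards. LeaveOpenDemo and UnderlyingStaysOpen are hypothetical names used only here.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class LeaveOpenDemo
{
    // Reports whether the underlying stream is still open after the GZipStream is disposed.
    public static bool UnderlyingStaysOpen(bool leaveOpen)
    {
        var target = new MemoryStream();
        using (var gzip = new GZipStream(target, CompressionMode.Compress, leaveOpen))
        {
            gzip.Write(new byte[] { 1, 2, 3 }, 0, 3);
        }

        bool stillOpen = target.CanRead; // false once a MemoryStream has been closed
        target.Dispose();
        return stillOpen;
    }

    public static void Main()
    {
        Console.WriteLine(UnderlyingStaysOpen(true));  // True
        Console.WriteLine(UnderlyingStaysOpen(false)); // False
    }
}
```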
|
|
|
|
 |
|
 |
I have the following two functions:
public void Compress(string inFile,string outFile)
{
    FileStream fs = File.OpenRead(inFile);
    FileStream fw = File.Create(outFile);
    GZipStream gs = new GZipStream(fw, CompressionMode.Compress);
    int theByte = fs.ReadByte();
    while (theByte != -1)
    {
        gs.WriteByte((byte)theByte);
        theByte = fs.ReadByte();
    }
    fs.Close();
    fw.Close();
}
public void Decompress(string inFile,string outFile)
{
    FileStream fs = File.OpenRead(inFile);
    FileStream fw = File.Create(outFile);
    GZipStream gs = new GZipStream(fs, CompressionMode.Decompress);
    int theByte = gs.ReadByte();
    while (theByte != -1)
    {
        fw.WriteByte((byte)theByte);
        theByte = gs.ReadByte();
    }
    fs.Close();
    fw.Close();
}
But if you try them, they don't restore the original file completely; some data is missing.
Can you help me understand why this happens?
MSaty Bottom Up Gamer
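A likely cause: in Compress, the GZipStream gs is never closed, so whatever it still holds in its internal buffer is not written to fw before fw.Close() runs. The sketch below closes the streams in the right order with using statements. It assumes .NET 4 or later for Stream.CopyTo, and the class name FileGzip is made up for the example.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class FileGzip
{
    public static void Compress(string inFile, string outFile)
    {
        using (FileStream fs = File.OpenRead(inFile))
        using (FileStream fw = File.Create(outFile))
        using (var gs = new GZipStream(fw, CompressionMode.Compress))
        {
            fs.CopyTo(gs);
        } // gs is disposed first, which writes its remaining data to fw
    }

    public static void Decompress(string inFile, string outFile)
    {
        using (FileStream fs = File.OpenRead(inFile))
        using (FileStream fw = File.Create(outFile))
        using (var gs = new GZipStream(fs, CompressionMode.Decompress))
        {
            gs.CopyTo(fw);
        }
    }

    public static void Main()
    {
        string original = Path.GetTempFileName();
        string packed = Path.GetTempFileName();
        string restored = Path.GetTempFileName();
        try
        {
            File.WriteAllText(original, new string('x', 100000));
            Compress(original, packed);
            Decompress(packed, restored);
            Console.WriteLine(File.ReadAllText(restored) == File.ReadAllText(original)); // True
        }
        finally
        {
            File.Delete(original);
            File.Delete(packed);
            File.Delete(restored);
        }
    }
}
```

Copying in chunks with CopyTo is also considerably faster than calling ReadByte and WriteByte once per byte.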
|
|
|
|
 |
|
 |
Did you come across the problem of decompressing binary-concatenated files?
You can create a sample with the following command:
copy /b file1.gz + file2.gz file_out.gz
Upon decompressing file_out.gz, only the contents of file1.gz are retrieved when using the .NET class GZipStream!
If you have a solution for this, please let me know.
Thanks a lot.
|
|
|
|
 |
|
 |
Sorry Robin, I have not played with compressing/decompressing files directly. Sorry for this unhelpful answer, I was away for a while.
Hope you found a solution to your problem.
"Nothing is lost, Nothing is created, Everything is transformed" Lavoisier
http://wlwilliamsiv.com
|
|
|
|
 |
|
 |
Is it possible to compress and decompress a Persian or Arabic string with this code?
If it is possible, please tell me how.
Thanks
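For what it's worth, here is a hedged sketch of a round trip with a Persian string. It does not use the article's char-to-byte cast. Instead it follows the Encoding-based approach posted elsewhere in this thread, with UTF-8 chosen as an assumption. The names ZipUtf8Demo, Compress and Decompress are only for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ZipUtf8Demo
{
    public static byte[] Compress(string text)
    {
        byte[] textBytes = Encoding.UTF8.GetBytes(text);
        using (var output = new MemoryStream())
        {
            using (var zip = new GZipStream(output, CompressionMode.Compress))
            {
                zip.Write(textBytes, 0, textBytes.Length);
            }
            return output.ToArray();
        }
    }

    public static string Decompress(byte[] value)
    {
        using (var input = new MemoryStream(value))
        using (var zip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = zip.Read(buffer, 0, buffer.Length)) != 0)
            {
                output.Write(buffer, 0, n);
            }
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }

    public static void Main()
    {
        string text = "سلام دنیا"; // Persian sample text
        string restored = Decompress(Compress(text));
        Console.WriteLine(restored == text); // True
    }
}
```

The same encoding has to be used on both sides. Very short strings may come out larger after compression because of the fixed gzip header and footer.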
|
|
|
|
 |
|
 |
Hi!
elahe babaee wrote: If it is possible, please tell me how.
What do you mean by how? Have you tried the code?
I guess it should work, as it directly manages bytes and chars.
Tell me more about what you need so I can help you.
Have a nice one!
"Nothing is lost, Nothing is created, Everything is transformed" Lavoisier
http://wlwilliamsiv.com
|
|
|
|
 |
|
 |
System.Text.Encoding.Unicode is a good but limited encoding, that's why you ran into problems with losing chars. XML serialization, for example, produces UTF-16 encoded strings, not UTF-8. The trap is that transforming UTF-16 encoded text into UTF-8 often seems to work, but in fact you're always risking loss of chars.
I suggest using System.Text.Encoding.XXX according to your source text, then you'll get correct results and much clearer code. If you're absolutely unsure about the source of your input string, use UTF-32 when transforming the string into a byte array and back. That will preserve all your chars during the process.
Take a look at: http://www.codeproject.com/KB/string/string_compression.aspx
|
|
|
|
 |