See more: C#
Hey guys!
 
I've been working on an LZW compressor in C# and I've run into some problems.
 
I initially build the dictionary with the 256 single-byte codes (0 to 255) and then I start coding. The problem is that when I try to write a code such as the int 256 as a single byte, it goes wrong :S Can someone help me?
 
Regards, and thanks in advance.
 
This part builds the dictionary and compresses a file:
 
            while (br.BaseStream.Position < br.BaseStream.Length)
            {
 
                Console.WriteLine(omg);
                omg++;
                
                t = br.ReadByte();
                chr = t;
                int aux=-1;
                
                byte[] res = new byte[str.Count() + 1];
 
                for (int i = 0; i < str.Count(); i++)
                {
                    res[i] = str[i];
                }
 
                res[str.Count()] = chr;
 
                int pos = isEqual(res, lista);
 
                if (pos != -1)
                {
                    str = new byte[res.Count()];
                    for(int k=0;k<res.Count();k++)
                    {
                        str[k] = res[k];
                    }
                    
                }
                else if (pos==-1)
                {
 
                    aux = isEqual(str, lista);
 
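                    // aux is an int dictionary index. Casting it to byte keeps only
                    // the low 8 bits, so 256 is written as 0, 257 as 1, and so on.
                    byte uh = (byte)aux;
                    _FileStream.WriteByte(uh);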
 
                    Node nv = new Node();
                    nv.by = new Byte[res.Count()];
 
                    for (int k = 0; k < res.Count(); k++)
                    {
                        nv.by[k] = res[k];
                    }
 
                    lista.Add(nv);
 
                    str = new byte[1];
                    str[0] = chr;
                        
                }
 
            }
 
lista = the dictionary.
isEqual = a function that returns the position in the dictionary of the byte sequence we are searching for (or -1 if it is not there).
 

This part decompresses the file, and this is where the problem shows up: when I read the bytes back, I don't get what I wrote.
 
            while (br.BaseStream.Position < br.BaseStream.Length)
            {
                t = br.ReadByte();
                if (cnt == 0)
                {
                    NCODE = new byte[1];
                    NCODE[0] = t;
                }
                else
                {
                    NCODE = new byte[NCODE.Count()];
                    NCODE[NCODE.Count()] = t;
                }
 
                pcr = isEqual(NCODE, lista);
 
                if (pcr == -1)
                {
                    pcr = isEqual(OCODE, lista);
                    str = new byte[OCODE.Count()];
 
                    for (int i = 0; i < OCODE.Count(); i++)
                    {
                        str[i] = OCODE[i];
                    }
 
                    if (cnt > 0)
                    {
                        str = new byte[OCODE.Count() + 1];
                        str[OCODE.Count()] = chr;
                    }
                }
                else if (pcr > -1)
                {
                    pcr = isEqual(NCODE, lista);
                    str = new byte[OCODE.Count()];
 
                    for (int i = 0; i < OCODE.Count(); i++)
                    {
                        str[i] = OCODE[i];
                    }
                }
 
                for (int i = 0; i < str.Count(); i++)
                {
                    _FileStream.WriteByte((byte)str[i]);
                }
 
                chr = str[0];
 
                Node nv = new Node();
                nv.by = new byte[OCODE.Count() + 1];
 
                for (int i = 0; i < OCODE.Count(); i++)
                {
                    nv.by[i] = OCODE[i];
                }
 
                nv.by[OCODE.Count()] = chr;
 
                OCODE = new byte[NCODE.Count()];
 
                for (int i = 0; i < NCODE.Count(); i++)
                {
                    OCODE[i] = NCODE[i];
                }
 
            }
 
            _FileStream.Close();
Posted 12-Nov-12 15:18pm
Edited 19-Nov-12 23:05pm
v3
Comments
lewax00 at 12-Nov-12 21:37pm
   
Well for starters, you can't represent 256 with a single byte...
Sergey Alexandrovich Kryukov at 12-Nov-12 22:28pm
   
:-)
SSilver009 at 13-Nov-12 1:02am
   
I've read that it's possible, through shifts and ORs I think, to have 9 bits in one byte.
lewax00 at 13-Nov-12 9:45am
   
Then what you read is mistaken. In a PC CPU a byte is 8 bits. Period. This is a limitation that exists on a physical level. 8 bits have only 256 possible values, and since one of those is 0, the maximum is 255.
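 
A single byte cannot hold 9 bits, but a 9-bit code can be spread across neighbouring bytes with shifts and ORs, which may be what the earlier comment was referring to. The following is only a minimal sketch of that idea, not code from the question: the names BitPackSketch, Pack and Unpack are made up for illustration, the bit order (least significant bit first) is an arbitrary choice, and widths from 9 to 16 bits are assumed.

```csharp
using System;
using System.Collections.Generic;

static class BitPackSketch
{
    // Packs fixed-width codes (assumed 9..16 bits each) into bytes, LSB first.
    public static byte[] Pack(IEnumerable<int> codes, int width)
    {
        var bytes = new List<byte>();
        int buffer = 0;    // bits waiting to be written
        int bitCount = 0;  // number of valid bits in buffer (always < 8 between codes)
        foreach (int code in codes)
        {
            buffer |= code << bitCount;   // append the code above the pending bits
            bitCount += width;
            while (bitCount >= 8)
            {
                bytes.Add((byte)(buffer & 0xFF));  // emit the low 8 bits
                buffer >>= 8;
                bitCount -= 8;
            }
        }
        if (bitCount > 0)
            bytes.Add((byte)(buffer & 0xFF));      // flush the last partial byte
        return bytes.ToArray();
    }

    // Reads back 'count' codes of 'width' bits each from the packed bytes.
    public static List<int> Unpack(byte[] data, int width, int count)
    {
        var codes = new List<int>();
        int buffer = 0, bitCount = 0, pos = 0;
        int mask = (1 << width) - 1;
        while (codes.Count < count)
        {
            while (bitCount < width)
            {
                buffer |= data[pos++] << bitCount;  // pull in the next byte
                bitCount += 8;
            }
            codes.Add(buffer & mask);
            buffer >>= width;
            bitCount -= width;
        }
        return codes;
    }

    public static void Main()
    {
        byte[] packed = Pack(new[] { 65, 256, 257, 511 }, 9);
        Console.WriteLine(packed.Length + " bytes: " + BitConverter.ToString(packed));
        Console.WriteLine(string.Join(", ", Unpack(packed, 9, 4)));
    }
}
```

Four 9-bit codes take 36 bits, so they fit in 5 bytes instead of the 8 that two bytes per code would need. The reader has to know how many codes to expect, because the last byte may contain padding bits.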
Sergey Alexandrovich Kryukov at 12-Nov-12 22:28pm
   
"Gives problems..." What problems?
--SA
SSilver009 at 13-Nov-12 1:03am
   
When I write the byte 256 to the file, it writes the byte 0 instead :S and this causes problems when I try to decompress the file :X
Sergey Alexandrovich Kryukov at 13-Nov-12 9:55am
   
There is no such thing as byte 256! I would advise you to gain more confidence with simpler tasks before coming to compression.
--SA
lukeer at 20-Nov-12 3:20am
   
Some comments would be great, especially where exactly the code doesn't behave like it should.
 
For now I suspect that you see an error in
_FileStream.WriteByte((byte)str[i]); correct?
 
If that's correct, the next interesting part is the byte-copying loop just above: str is created with the size of NCODE, but is copied from OCODE. Is that intentional, or a possible source of the error?
SSilver009 at 20-Nov-12 4:59am
   
Yes, you're right. It doesn't raise an error, but it writes the wrong thing :X It's what I said before: it writes 1 instead of 257 :X so when I decode, it decodes the wrong bytes :X
 
You were right, the byte-copying loop into str was an error.

Solution 2

If you need to encode more than 256 different values (more than the range 0-255), you need more than 8 bits. In any normal scenario that means going from one byte to two, giving you a 16-bit ushort (0-65535).
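 
As a rough illustration of the two-byte approach, here is a minimal sketch that writes every dictionary index as a 16-bit value and reads it back. The class and method names (CodeStream16, WriteCodes, ReadCodes) are hypothetical and not part of the question's code; BinaryWriter.Write(ushort) and BinaryReader.ReadUInt16() are the standard .NET calls. It assumes the dictionary never grows beyond 65536 entries.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class CodeStream16
{
    // Writes each dictionary index as two bytes (little-endian ushort).
    public static byte[] WriteCodes(IEnumerable<int> codes)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms))
        {
            foreach (int code in codes)
            {
                // Refuse values that would be silently truncated.
                if (code < 0 || code > ushort.MaxValue)
                    throw new ArgumentOutOfRangeException("codes", "Code does not fit in 16 bits.");
                bw.Write((ushort)code);
            }
            bw.Flush();
            return ms.ToArray();
        }
    }

    // Reads the stream back two bytes at a time.
    public static List<int> ReadCodes(byte[] data)
    {
        var codes = new List<int>();
        using (var ms = new MemoryStream(data))
        using (var br = new BinaryReader(ms))
        {
            while (ms.Position < ms.Length)
                codes.Add(br.ReadUInt16());
        }
        return codes;
    }

    public static void Main()
    {
        byte[] stored = WriteCodes(new[] { 65, 256, 257, 300 });
        Console.WriteLine(stored.Length + " bytes written");
        Console.WriteLine(string.Join(", ", ReadCodes(stored)));
    }
}
```

The price is that every code, including the plain single-byte ones, now costs two bytes, so short inputs can come out larger than they went in.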
Comments
Sergey Alexandrovich Kryukov at 13-Nov-12 9:57am
   
No wonder OP has problems -- look at the comments to the question -- trying to write byte 256 (!). See what I advised...
--SA
SSilver009 at 13-Nov-12 10:15am
   
It's like lukeer said: "Do you mean your problem is not a value greater than 255 but instead the bytes at indexes greater than 255 in an input file?" It's supposed to write the dictionary index to the file when coding with LZW, right?
BobJanova at 13-Nov-12 12:01pm
   
If it's file offsets it probably needs to be 4 or maybe even 8 bytes (though I doubt this person is working on seriously large 64-bit data just yet).

Solution 1

Before compressing with whatever technique, you first have to serialize your data properly. That means transforming what you have, in a reversible way, into a series of individual bytes.
 
An int is an alias for Int32. The name hints at its 32 bits of storage. You therefore have to break every int up into four bytes.
 
Handle all other types in a similar way. Always keep in mind that you have to restore your data from that byte sequence afterwards (after decompression). Then, you can compress byte by byte.
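 
To make the "four bytes per int" step concrete, here is a small sketch using explicit shifts. The names IntBytesDemo, ToBytes and FromBytes are made up for illustration, and the byte order (least significant byte first) is an assumption; BitConverter.GetBytes and BitConverter.ToInt32 do the same job using the platform's byte order.

```csharp
using System;

static class IntBytesDemo
{
    // Splits a 32-bit int into four bytes, least significant byte first.
    public static byte[] ToBytes(int value)
    {
        return new[]
        {
            (byte)(value & 0xFF),
            (byte)((value >> 8) & 0xFF),
            (byte)((value >> 16) & 0xFF),
            (byte)((value >> 24) & 0xFF)
        };
    }

    // Rebuilds the int from the four bytes produced by ToBytes.
    public static int FromBytes(byte[] b)
    {
        return b[0] | (b[1] << 8) | (b[2] << 16) | (b[3] << 24);
    }

    public static void Main()
    {
        byte[] bytes = ToBytes(257);
        Console.WriteLine(BitConverter.ToString(bytes));
        Console.WriteLine(FromBytes(bytes));
    }
}
```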
Comments
SSilver009 at 13-Nov-12 1:18am
   
I understand what you're saying, but all bytes in the file are in the range 0 to 255, so is that really necessary? The LZW algorithm adds new dictionary entries after 255; the problem is writing those :S
lukeer at 13-Nov-12 2:02am
   
Do you mean your problem is not a value greater than 255 but instead the bytes at indexes greater than 255 in an input file?
SSilver009 at 13-Nov-12 10:12am
   
Exactly! It's supposed to write the dictionary index to the file when coding with LZW, right?
lukeer at 14-Nov-12 2:59am
   
IIRC, you're right. You create a dictionary of frequently used byte sequences. Whenever one of those appears in the input file, you replace it with its dictionary index in the output file.
 
This only reduces file size if there are frequently used byte sequences longer than one byte. Otherwise all you're doing is adding overhead and complexity.
 
Use the "Improve question" link beneath your original question and add the portion of your source that "gives problems". At least show definitions of the dictionary and indexing variables.
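 
The dictionary scheme described above can be sketched roughly as follows. This is not the code from the question: LzwSketch and Compress are hypothetical names, the dictionary is a Dictionary<string, int> whose keys hold one char per byte (an implementation shortcut), and the output is a list of int codes so that nothing is truncated before it is written.

```csharp
using System;
using System.Collections.Generic;

static class LzwSketch
{
    // Returns the sequence of dictionary indexes for the input bytes.
    public static List<int> Compress(byte[] input)
    {
        // Start with the 256 single-byte entries, codes 0..255.
        var dict = new Dictionary<string, int>();
        for (int i = 0; i < 256; i++)
            dict.Add(((char)i).ToString(), i);

        var output = new List<int>();
        string current = "";
        foreach (byte b in input)
        {
            string extended = current + (char)b;
            if (dict.ContainsKey(extended))
            {
                current = extended;            // keep growing the match
            }
            else
            {
                output.Add(dict[current]);     // emit the longest known sequence
                int nextCode = dict.Count;     // first new entry gets code 256
                dict.Add(extended, nextCode);
                current = ((char)b).ToString();
            }
        }
        // Emit whatever is still pending once the input is exhausted.
        if (current.Length > 0)
            output.Add(dict[current]);
        return output;
    }

    public static void Main()
    {
        byte[] input = System.Text.Encoding.ASCII.GetBytes("ABABABA");
        Console.WriteLine(string.Join(", ", Compress(input)));
    }
}
```

For the input "ABABABA" the codes 256 and 258 appear in the output, which is exactly the kind of value that cannot be stored in a single byte. Note the final step after the loop that emits the pending sequence; the snippet posted in the question does not show that step.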
SSilver009 at 16-Nov-12 20:17pm
   
The thing is that I'm only building my dictionary and then writing the index codes to a file. The problem starts when I get to indexes bigger than 255 :/ Writing doesn't raise any error, but for example when I write byte 256 and then read it back, the program reads byte 0 (257 = 1, 258 = 2) :/ What should I do? :X Do you have any idea?
lukeer at 19-Nov-12 1:33am
   
If the dictionary holds only 256 entries, then you shouldn't ask for indexes above 255.
 
If the dictionary is larger, then search your code for a cast. There may be an int, a long or whatever that holds the index to read from the dictionary. If this is being cast to byte, then the behaviour you describe occurs.
 
There are many "if"s in this post. Seeing your code would make it easier for us to help you (remember the "Improve question" link).
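 
The effect of such a cast is easy to reproduce. A tiny sketch (ByteCastDemo and Truncate are made-up names) showing that narrowing an int to a byte keeps only the low 8 bits:

```csharp
using System;

static class ByteCastDemo
{
    // Narrowing an int to a byte discards everything above the low 8 bits,
    // which is the same as taking the value modulo 256 for non-negative ints.
    public static byte Truncate(int code)
    {
        return unchecked((byte)code);
    }

    public static void Main()
    {
        foreach (int code in new[] { 255, 256, 257, 258 })
            Console.WriteLine(code + " -> " + Truncate(code));
    }
}
```

This matches the symptoms reported in the thread: 256 comes back as 0, 257 as 1, 258 as 2.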
SSilver009 at 19-Nov-12 8:29am
   
Lukeer, I've updated the question with the parts of the code that do the compression and decompression. I will always write values above 255, because the dictionary is already full up to that position with the ASCII bytes. I think the problem is in the write, because I'm not reading back what I'm supposed to read :X (it's like I said before: I write 257 and when I read it back I get 1) :X Thanks
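 
For the reading side, here is a minimal decoder sketch. It assumes the codes have already been read back from the file at their full width (as ints, not as single bytes), and the names LzwDecodeSketch, Decompress and Append are hypothetical. It rebuilds the dictionary while decoding, including the special case where a code refers to the entry that is just about to be created.

```csharp
using System;
using System.Collections.Generic;

static class LzwDecodeSketch
{
    // Turns a list of LZW codes back into the original bytes.
    public static byte[] Decompress(IList<int> codes)
    {
        // Entries 0..255 are the single bytes; new entries are appended after them.
        var dict = new List<byte[]>();
        for (int i = 0; i < 256; i++)
            dict.Add(new[] { (byte)i });

        var output = new List<byte>();
        if (codes.Count == 0)
            return output.ToArray();
        if (codes[0] < 0 || codes[0] > 255)
            throw new ArgumentException("The first code must be a single byte.");

        byte[] previous = dict[codes[0]];
        output.AddRange(previous);

        for (int n = 1; n < codes.Count; n++)
        {
            int code = codes[n];
            byte[] entry;
            if (code >= 0 && code < dict.Count)
                entry = dict[code];                      // known sequence
            else if (code == dict.Count)
                entry = Append(previous, previous[0]);   // entry being defined right now
            else
                throw new ArgumentException("Invalid code " + code);

            output.AddRange(entry);
            // Mirror the encoder: previous sequence plus first byte of this one.
            dict.Add(Append(previous, entry[0]));
            previous = entry;
        }
        return output.ToArray();
    }

    // Returns a copy of prefix with one extra byte at the end.
    static byte[] Append(byte[] prefix, byte last)
    {
        var result = new byte[prefix.Length + 1];
        Array.Copy(prefix, result, prefix.Length);
        result[prefix.Length] = last;
        return result;
    }

    public static void Main()
    {
        byte[] decoded = Decompress(new[] { 65, 66, 256, 258 });
        Console.WriteLine(System.Text.Encoding.ASCII.GetString(decoded));
    }
}
```

If the codes had been truncated to one byte when written, the decoder would receive 0 and 2 in place of 256 and 258 and produce different bytes without raising any error, which fits the behaviour described in this thread.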

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Web03 | 2.8.140709.1 | Last Updated 20 Nov 2012
Copyright © CodeProject, 1999-2014
All Rights Reserved. Terms of Service