Qbasicnews.com
April 05, 2020, 07:41:37 PM *
Welcome, Guest. Please login or register.

Login with username, password and session length
News: Back to Qbasicnews.com | QB Online Help | FAQ | Chat | All Basic Code | QB Knowledge Base
 
   Home   Help Search Login Register  
Poll
Question: Lemon trees are nice...  (Voting closed: June 04, 2002, 03:33:53 AM)
Yes - 6 (100%)
Total Voters: 6

Pages: [1] 2
  Print  
Author Topic: Does anybody know anything about compression?  (Read 8322 times)
R@dioman
Ancient QBer
****
Posts: 410



« on: May 28, 2002, 03:33:53 AM »

Not just algorithms, but actually knew the math behind compressing information? Because I tried and managed to compress only a few digits using an elaborate method using a 36 base number system (involving greek letters, our letters and our numbers). I would like to get into that. Is there a really good tutorial out there? Let me know  Smiley
Logged

wizardlife
Na_th_an
*****
Posts: 1456


WWW
« Reply #1 on: May 28, 2002, 11:51:52 AM »

The basic principle is that you assign characters to certain recurring symbols that come up in the file a lot. I think I could set it up, but I'm not sure if it would be that great, or that fast.
Logged

Agamemnus
x/ \z
*****
Posts: 3491



« Reply #2 on: January 28, 2003, 08:23:13 PM »

data+compression+methods

 :bounce:
Logged

Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."

Visit www.neobasic.net to see rubbish in all its finest.
relsoft
*/-\*
*****
Posts: 3927



WWW
« Reply #3 on: January 29, 2003, 12:38:42 AM »

RLE =Just about Anybody can....

LZW = Antoni Gual/Rich GelDreich
Logged

y smiley is 24 bit.


Genso's Junkyard:
http://rel.betterwebber.com/
wizardlife
Na_th_an
*****
Posts: 1456


WWW
« Reply #4 on: January 29, 2003, 01:59:23 AM »

Quote from: "relsoft"
RLE =Just about Anybody can....


lol. Depends what you're compressing. You won't get very far compressing a novel that way, but it works for images...
Logged

LooseCaboose
I hold this place together
*****
Posts: 981



« Reply #5 on: January 29, 2003, 08:07:11 AM »

*Digs out notes on compression*

If you just want algorithms then youll find plenty of examples on the net, with RLE, LZH, gzip, bzip, huffman encoding, etc. Also look at prefix codes and block codes for simple compression techniques.

The actual theoritical math is (IMHO) rather ugly. To start off with you have compression ratio
Code:

ratio = (LenghtAfter / LengthBefore) * 100

and compression rate
Code:

rate = ((LengthBefore - LengthAfter) / LengthBefore) * 100


Those two are rather straight forward and are the basis of measurement for any given compression method. There are limits to compression, not all data can be compressed (all messages are equally likely), and not all data can be compressed equally well.

The main mathmatical basis for compression is around message probabilites and distributions. For example using the ascii set, you have 256 possible messages, in a novel message 'e' is far more probable than other messages such as 'z' and 'x'. Encoding message 'e' with a short bit string and message 'z' with a longer one is how compression is achieved.

There are formulas for the upper and lower bounds of compression, as well as things like noiseless coding, block code constrution etc. (I wont post them here because they are too complex to write in text, unless you have LaTeX).

Do a google search on Shannon, Kraft and Huffman who all did large amounts of work with the mathmatics and theory behind compression. Hope this helps?
Logged

esus saves.... Passes to Moses, shoots, he scores!
Neo
Na_th_an
*****
Posts: 2150



« Reply #6 on: January 29, 2003, 08:44:05 AM »

There are different ways of compressing, of which RLE is (I think) the easiest.

A harder one is the RDX compression.
Logged
toonski84
__/--\__
*****
Posts: 2567



« Reply #7 on: January 29, 2003, 08:53:58 AM »

rle = representing repeated pixels longer than 3 as 3 characters.

lzw = representing repeated patterns as 2 characters.

huffman/jpg = placing a voodoo doll in the center of a room, light a circle of fire around you and dance around the room praying to antoni gual to send magic jpeg decoding rays from the sky.
Logged

i]"I know what you're thinking. Did he fire six shots or only five? Well, to tell you the truth, in all this excitement, I've kinda lost track myself. But being as this is a .44 Magnum ... you've got to ask yourself one question: 'Do I feel lucky?' Well, do ya punk?"[/i] - Dirty Harry
relsoft
*/-\*
*****
Posts: 3927



WWW
« Reply #8 on: January 29, 2003, 08:57:04 AM »

All hail Antoni!!!!!


 :rotfl:
Logged

y smiley is 24 bit.


Genso's Junkyard:
http://rel.betterwebber.com/
Neo
Na_th_an
*****
Posts: 2150



« Reply #9 on: January 29, 2003, 09:11:53 AM »

:rotfl:  :bounce:  :king:  :bounce:  :rotfl:
Logged
LooseCaboose
I hold this place together
*****
Posts: 981



« Reply #10 on: January 29, 2003, 06:07:30 PM »

I always thought Huffman encoding was easy, I had to write a Huffman encoder in Java as a first year project, it gives really nice results for the right kinds of data as well. :-)

* digs out notes on JPEG format *

Yeah that is nasty, even so far as to have different compression formulas for PAL and NTSC JPEGs. Compression is done around two chrominance and one luminance values (because the eyes responds more to the former) and breaking the image into 8x8 blocks. The blocks are then passed through a Discrete Cosine Transformation (DCT) to create a set of 8x8 matrices. Quantization then removes the less important DCT values, each individual application must provide its own quantization tables for loss-compression tradeoff (these arent part of the JPEG standard). The differences of the {0, 0} values are then calculated, reducing most of them to small values.

The 64 elements are then linearized with a zig-zag and compressed using RLE (;-)), usually there are lots of zeros in the lower right corner of the matrix so this compresses quite well. Then the resulting data is Huffman encoded for easy transmition. JPEG gives about 20:1 compression ratio. Running the algorithm backwards will decompress it in about the same time taken to compress it.

JPEG is a bit tricky, but at least it isnt as bad as MPEG (which is the same only worse) or bzip which one of my tutors commented on "The author of bzip compression was clearly on heavy drugs at the time and I could show you how the compression algorithm works, but you honestly wouldnt believe me" ;-).
Logged

esus saves.... Passes to Moses, shoots, he scores!
Antoni Gual
Na_th_an
*****
Posts: 1434



WWW
« Reply #11 on: January 29, 2003, 08:23:22 PM »

The idea of how the LZW decoder predicts next character is just genial!  Shocked (a very simple and effective algorithm, not so easy to catch).

Huffman algorithm is easy to understand, very theoric, not a spark of imagination, just statistics.
Jpeg is a funny thing: an algorithm on the top of an algorithm on the top of.. It's difficult to put all together and it's harder to guess why all that complication is needed. Because it was designed by a comission?  :Huh:
 
Some day I will try progressive JPEG's, they add a new layer of complexity, more fun!. I need to master EMS before.. :-?


Bzip sounds interesting. Where i can find something about it? :bounce:
Logged

Antoni
LooseCaboose
I hold this place together
*****
Posts: 981



« Reply #12 on: January 29, 2003, 08:53:51 PM »

The home page for bzip is at:
http://sources.redhat.com/bzip2/#what-is-bzip2
You can get the freely available source there which is fully ANSI complaint so it should compile on any platform. Check the futher reading section in the help page for information regarding the algorithm. The author comments that bzip doesnt actually present any new ideas in compression, but is more of an engineering of old ideas. I think you would have to use the source code in conjunction with the cited papers in order to get a full grasp on how the compression method works. I havent actually looked at it myself because I havent had the time with so many other projects and issues in real life(tm).
Logged

esus saves.... Passes to Moses, shoots, he scores!
red_Marvin
Na_th_an
*****
Posts: 1509



WWW
« Reply #13 on: January 30, 2003, 12:24:16 PM »

I have worked out a genial but bad way of compressing
files: More or less a file is made off ascii chars which is
values from 0-255. I f you then could make a mathematical
algorithm to decide what ascii number a character in the
nth position
=
Code:

asciiChar=(filePosition*52)^filePosition/LotsOfOtherMathStuff


If you know what I mean

...But the disadvantage of it is that the equation tend to be
bigger than the initial file...

but it would be quite nice :wink: ...
Logged

/post]
Agamemnus
x/ \z
*****
Posts: 3491



« Reply #14 on: January 30, 2003, 02:16:46 PM »

Yep, I thought of the same thing before. A minimal equation to represent your file. That should compress it to the theoretical maximum.

But how does one get such an equation?

Plot the ASCII characters on an X (LOF) / Y (0 to 255) graph..
 
You could use a genetic algorithm to find approximate matches.. but if the approximate match corrupts the whole file, what then?

It has to be perfect.

I suppose you could start with a bunch of *exact* (no decimals for ASCII chars!!) sine/cosine curve approximations... Then you could use some exotic logic system to shorten your N equations to M equations, where M is the theoretical maximum.

You could start by enveloping most of the curves as determined by a density analysis into one sine/cosine curve and then using another analysis make another one, and so forth...

The COS/SIN format can be provably the smallest in existence for the current data when you can't compress the actual SIN/COS data anymore!!!!
Logged

Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."

Visit www.neobasic.net to see rubbish in all its finest.
Pages: [1] 2
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.21 | SMF © 2015, Simple Machines Valid XHTML 1.0! Valid CSS!