:Hello,
:
:I'm new to this group, so I apologise for this...
:but, could ppl send me some /really/ good compression algorithms, please?
Universal, binary, lossless?
F. M. J. Willems, Y. M. Shtarkov, and T. J. Tjalkens, "The Context-Tree
Weighting Method: Basic Properties", IEEE Trans. Inform. Theory, vol. 41,
no. 3, pp. 653-664, May 1995.
Abstract:
We describe a sequential universal data compression procedure
for binary tree sources that performs the "double mixture". Using a
context tree, this method weights in an efficient recursive way the
coding distributions corresponding to all bounded memory tree sources,
and achieves a desirable coding distribution for tree sources with an
unknown model and unknown parameters. Computational and storage
complexity of the proposed procedure are both linear in the source
sequence length. We derive a natural upper bound on the cumulative
redundancy of our method for individual sequences. The three terms in
this bound can be identified as coding, parameter and model
redundancy. The bound holds for all source sequence lengths, not only
for asymptotically large lengths. The analysis that leads to this
bound is based on standard techniques and turns out to be extremely
simple. Our upper bound on the redundancy shows that the proposed
context-tree weighting procedure is optimal in the sense that it
achieves the Rissanen (1984) lower bound.
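To give a flavor of the "double mixture": the method keeps, at every
node s of a context tree of depth D, a Krichevsky-Trofimov estimate
P_e(s) of the bits seen in that context, and mixes recursively via
P_w(s) = 1/2 P_e(s) + 1/2 P_w(0s) P_w(1s), with P_w = P_e at the
leaves. A toy sketch, assuming that recursion (the names and the
zero-padded initial context are my own choices, not the authors'):

```python
# Toy context-tree weighting for a binary source.
#   P_w(s) = 1/2 * P_e(s) + 1/2 * P_w(0s) * P_w(1s)   at internal nodes
#   P_w(s) = P_e(s)                                    at depth-D leaves
# where P_e is the Krichevsky-Trofimov (KT) estimate.
from math import log2

class Node:
    __slots__ = ("a", "b", "pe", "pw", "children")
    def __init__(self):
        self.a = 0           # zeros seen in this context so far
        self.b = 0           # ones seen in this context so far
        self.pe = 1.0        # KT estimate of this node's subsequence
        self.pw = 1.0        # weighted (mixture) probability
        self.children = {}   # keyed by the next older context bit

def ctw_update(root, context, bit, depth):
    """Account for one bit, given its context (most recent bit first)."""
    path = [root]
    for c in context[:depth]:
        path.append(path[-1].children.setdefault(c, Node()))
    for i in range(depth, -1, -1):           # leaf back up to the root
        node = path[i]
        # sequential KT update: P(next = 0) = (a + 1/2) / (a + b + 1)
        if bit == 0:
            node.pe *= (node.a + 0.5) / (node.a + node.b + 1)
            node.a += 1
        else:
            node.pe *= (node.b + 0.5) / (node.a + node.b + 1)
            node.b += 1
        if i == depth:                       # leaf: nothing to mix
            node.pw = node.pe
        else:                                # mix with the two subtrees
            c0 = node.children.get(0)
            c1 = node.children.get(1)
            node.pw = 0.5 * node.pe + 0.5 * (c0.pw if c0 else 1.0) \
                                          * (c1.pw if c1 else 1.0)

def ctw_codelength(bits, depth):
    """Ideal codelength of the whole sequence, -log2 P_w(x^n), in bits."""
    root = Node()
    for t, bit in enumerate(bits):
        ctx = [bits[t - 1 - k] if t - 1 - k >= 0 else 0
               for k in range(depth)]
        ctw_update(root, ctx, bit, depth)
    return -log2(root.pw)
```

E.g. ctw_codelength([0, 1] * 50, depth=2) comes out well under the
raw 100 bits, since the alternation is memorized after a few symbols.
Each bit touches only the depth+1 nodes on its context path, which is
where the linear time and storage claimed in the abstract come from.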
:I'm looking to write a powerful, if slow, general purpose compression
:prog. and research can be such a pain at times.
That's the ticket. They have some successor papers on models more
general than context trees, and also on how to apply the method well to
e.g. 8-bit text sources.
You might want to try to get in contact with them, as I believe they
have students working on implementing it.
The paper above will not be enough to actually implement the method;
you will also have to learn how to implement arithmetic coding.
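For the arithmetic coding half, the classic integer scheme of Witten,
Neal & Cleary (CACM 1987) is the usual starting point. A minimal
sketch from memory, not from the paper (class and method names are my
own); you feed encode/decode the model's probability of a 0 bit:

```python
# Toy 32-bit binary arithmetic coder, Witten/Neal/Cleary style.
HALF = 1 << 31
QUARTER = 1 << 30
MASK = (1 << 32) - 1

class Encoder:
    def __init__(self):
        self.low, self.high = 0, MASK
        self.pending = 0          # underflow bits awaiting output
        self.out = []

    def _emit(self, bit):
        self.out.append(bit)
        self.out.extend([1 - bit] * self.pending)
        self.pending = 0

    def encode(self, bit, p0):
        # p0 = model probability of a 0 (keep it away from 0 and 1)
        span = self.high - self.low + 1
        split = self.low + int(span * p0) - 1
        if bit == 0:
            self.high = split
        else:
            self.low = split + 1
        while True:                       # renormalize
            if self.high < HALF:
                self._emit(0)
            elif self.low >= HALF:
                self._emit(1)
                self.low -= HALF; self.high -= HALF
            elif self.low >= QUARTER and self.high < 3 * QUARTER:
                self.pending += 1         # near-miss around the midpoint
                self.low -= QUARTER; self.high -= QUARTER
            else:
                break
            self.low *= 2
            self.high = self.high * 2 + 1

    def finish(self):
        # two final bits pin the interval down
        self.pending += 1
        self._emit(0 if self.low < QUARTER else 1)
        return self.out

class Decoder:
    def __init__(self, bits):
        self.bits = iter(bits)
        self.low, self.high = 0, MASK
        self.code = 0
        for _ in range(32):               # pad exhausted input with zeros
            self.code = self.code * 2 + next(self.bits, 0)

    def decode(self, p0):
        span = self.high - self.low + 1
        split = self.low + int(span * p0) - 1
        if self.code <= split:
            bit = 0; self.high = split
        else:
            bit = 1; self.low = split + 1
        while True:                       # mirror the encoder's renorm
            if self.high < HALF:
                pass
            elif self.low >= HALF:
                self.low -= HALF; self.high -= HALF; self.code -= HALF
            elif self.low >= QUARTER and self.high < 3 * QUARTER:
                self.low -= QUARTER; self.high -= QUARTER
                self.code -= QUARTER
            else:
                break
            self.low *= 2
            self.high = self.high * 2 + 1
            self.code = self.code * 2 + next(self.bits, 0)
        return bit
```

Any adaptive model works as long as encoder and decoder update it in
lockstep, which is exactly the hook a CTW front end plugs into.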
The magic part of that method is their upper bound on redundancy (coded
bits in excess of the source entropy, in effect) for individual,
deterministic sequences, not just 'in probability' (i.e. for the most
likely sequences) or asymptotically (as n -> infinity). It won the IEEE
best-paper-of-the-year award in information theory.
In a follow-up paper, they extend their method to infinite-depth
sources, and show that this infinite-depth method achieves the entropy
rate for _any_ stationary and ergodic source. (wow!)
:Matt. W.
--
* Matthew B. Kennel/Institute for Nonlinear Science, UCSD
*
* "do || !do; try: Command not found"
* /sbin/yoda --help