We are looking for a fast way to extract a miniature 'thumbnail'
representation from a given JPEG file for preview purposes.
The comp.compression FAQ lists these steps in its description of JPEG
*compression*:
" 2. (Optional) Downsample each component by averaging together
groups of pixels. The luminance component is left at full resolution,
while the color components are usually reduced 2:1 horizontally and
either 2:1 or 1:1 (no change) vertically. In JPEG-speak these
alternatives are usually called 2h2v and 2h1v sampling, but you may
also see the terms "411" and "422" sampling. This step immediately
reduces the data volume by one-half or one-third, while having almost
no impact on perceived quality. (Obviously this would not be true
if you tried it in RGB color space...) Note that downsampling is not
applicable to gray-scale data.
3. Group the pixel values for each component into 8x8 blocks.
Transform each 8x8 block through a discrete cosine transform (DCT);
this is a relative of the Fourier transform and likewise gives a
frequency map, with 8x8 components. Thus you now have numbers
representing the average value in each block and successively
higher-frequency changes within the block. The motivation for
doing this is that you can now throw away high-frequency information
without affecting low-frequency information. (The DCT transform
itself is reversible except for roundoff error.)
See question 25 for fast DCT algorithms.
4. In each block, divide each of the 64 frequency components by a
separate "quantization coefficient", and round the results to integers.
This is the fundamental information-losing step. A Q.C. of 1 loses
no information; larger Q.C.s lose successively more info. The higher
frequencies are normally reduced much more than the lower. (All 64
Q.C.s are parameters to the compression process; tuning them for best
results is a black art. It seems likely that the best values are yet
to be discovered. Most existing coders use simple multiples of the
example tables given in the JPEG standard.) "
The basic idea is to take advantage of two facts:
1. The image is compressed in 8x8 blocks.
2. The first (DC) frequency component represents the average value of
the whole block (?)
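On point 2, a quick check may help. With the forward DCT as normalized
in the JPEG standard, the DC term should come out as 8 times the mean
of the (level-shifted) block, so it is the average up to a constant
scale factor. The sketch below computes one coefficient directly from
that definition; the function names and the sample block are made up
for illustration.

```python
import math

def dct_coefficient(block, u, v):
    """One coefficient of the 8x8 forward DCT, JPEG normalization."""
    cu = 1 / math.sqrt(2) if u == 0 else 1.0
    cv = 1 / math.sqrt(2) if v == 0 else 1.0
    total = 0.0
    for y in range(8):
        for x in range(8):
            total += (block[y][x]
                      * math.cos((2 * x + 1) * u * math.pi / 16)
                      * math.cos((2 * y + 1) * v * math.pi / 16))
    return 0.25 * cu * cv * total

def block_mean(block):
    """Plain arithmetic mean of the 64 samples in an 8x8 block."""
    return sum(sum(row) for row in block) / 64.0

# Hypothetical level-shifted block (sample value minus 128).
example_block = [[(3 * x + 5 * y) % 17 - 8 for x in range(8)]
                 for y in range(8)]

dc = dct_coefficient(example_block, 0, 0)
mean = block_mean(example_block)
# dc and 8 * mean should agree up to floating-point roundoff.
```

So dividing the dequantized DC value by 8 and adding back the level
shift of 128 should give the block's average sample value.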
By taking just the DC (lowest-frequency) component of each block, we
immediately reduce the bitmap size by a factor of 8 both horizontally
and vertically. This should be much faster than decompressing the
whole image and throwing away pixels, because there is much less
decoding to do (i.e. no inverse DCT, and 64 times fewer pixels to deal
with).
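A minimal sketch of the DC-only idea, under several assumptions: a
single-component (gray-scale) baseline image, so blocks arrive in
raster order; no restart markers; and the entropy decoder has already
produced the list of DC differences (baseline JPEG codes each DC value
as a difference from the previous block's DC). The names dc_thumbnail,
dc_diffs, blocks_per_row and q_dc are hypothetical, and the entropy
decoding itself is not shown.

```python
def dc_thumbnail(dc_diffs, blocks_per_row, q_dc):
    """Build a 1/8-scale gray-scale thumbnail from DC differences.

    dc_diffs       -- decoded DC differences, one per 8x8 block
    blocks_per_row -- number of 8x8 blocks across the image
    q_dc           -- quantization table entry for the DC term
    """
    rows = []
    row = []
    predictor = 0
    for diff in dc_diffs:
        predictor += diff              # undo differential coding of DC
        dc = predictor * q_dc          # dequantize
        # DC is 8 times the block mean; then undo the 128 level shift.
        value = int(round(dc / 8.0)) + 128
        row.append(max(0, min(255, value)))
        if len(row) == blocks_per_row:
            rows.append(row)
            row = []
    return rows

# Four blocks arranged 2x2, with a DC quantizer of 16.
thumb = dc_thumbnail([10, -2, 0, 5], blocks_per_row=2, q_dc=16)
```

Each output pixel stands for one 8x8 block, so no inverse DCT is
needed; the remaining cost is the entropy decoding, which still has to
step past the AC coefficients of every block.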
Of course, the above only gives you the color information, which needs
to be combined with the luminance information to get an image, but
that shouldn't be too difficult or compute-intensive, should it?
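One way the combining step might look, assuming DC-derived planes are
already available for the luminance (Y) and both color (Cb, Cr)
components, and assuming 2h2v sampling, where each color block covers
16x16 pixels and the color planes are therefore half the size of the
luminance plane in each direction. The conversion uses the usual JFIF
YCbCr-to-RGB equations; combine_planes and its parameters are
hypothetical names.

```python
def clamp(value):
    """Round to the nearest integer and limit to the 0..255 range."""
    return max(0, min(255, int(round(value))))

def ycbcr_to_rgb(y, cb, cr):
    """Convert one JFIF YCbCr sample to an (R, G, B) tuple."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return (clamp(r), clamp(g), clamp(b))

def combine_planes(y_plane, cb_plane, cr_plane, h_factor=2, v_factor=2):
    """Merge a Y plane with subsampled Cb/Cr planes into RGB rows.

    Each color sample is simply repeated h_factor x v_factor times
    (nearest-neighbor upsampling), which is crude but cheap.
    """
    out = []
    for j, y_row in enumerate(y_plane):
        out_row = []
        for i, y in enumerate(y_row):
            cb = cb_plane[j // v_factor][i // h_factor]
            cr = cr_plane[j // v_factor][i // h_factor]
            out_row.append(ycbcr_to_rgb(y, cb, cr))
        out.append(out_row)
    return out

# A 2x2 luminance thumbnail sharing one neutral color sample.
rgb = combine_planes([[100, 110], [120, 130]], [[128]], [[128]])
```

That is a few multiplies per thumbnail pixel, so it should be cheap
next to the entropy decoding.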
Any comments, criticisms, etc. welcome.
MediaFlex Software Engineer