> Thanks for your response. Maybe you'll have to explain this from a

> different angle for me to "get it."

OK, here is an example:

Assume that zz[k=20] = 1; (zz = zigzag coeff)

zz[k=21...39] = 0;

zz[k=40] = 2;

zz[k=41...63] = 0;

Al = 0;

Se = 63 (full AC band).

Now note that EOB = 20, *not* 40, since EOB is defined here as last

*newly nonzero* coefficient. zz[40] = 2 = binary_10 is *previously*

nonzero, thus doesn't count as EOB.

If we now come to k=40 in the loop, we have r=19 > 15 *and* k=40 >

EOB (=20). Thus this is *not* the last run of zeros, we don't

escape the loop, and we still have to code the correction bit for

zz[40] = 2. But we avoid emitting a ZRL by the while construct,

since we are *after* EOB per definition and it thus can be folded

into EOBRUN.

If you would leave off the k <= EOB condition in the while, you

would emit an obsolete ZRL.

It appears that your confusion comes from the different EOB definition

in the progressive mode: It is here not just the last nonzero

coefficient (after point transform by Al, as counted with zero run r),

but the last *newly* nonzero coefficient (absvalue=1 after point

transform by Al). Thus there's not the straight relationship between

EOB and zero run r as in the sequential mode.

Quote:> I think it _is_ quite a source of slowdown--The first loop and the top of

> the second loop (through the continue statement) account for 20% of the

> instructions run for a complete encode of a small image. This is because

> the code walks through every entry in presumably every block--and when you

> have two loops to do it instead of merging it into one, you have to touch

> memory twice as many times--loads and stores are expensive. Perhaps you

> can run a profile on the code and see if the most concentrated part of the

> execution falls in this same function, or somewhere else.

Well, you can modify the code somewhat according to my arithmetic coding

example:

1. Remove the absvalues array and calculate the absvalues on the fly.

2. Step through the array backwards from the end to find the EOB:

for (EOB = cinfo->Se; EOB > 0; EOB--) {

temp = (*block)[jpeg_natural_order[EOB]];

if (temp < 0)

temp = -temp; /* temp is abs value of input */

temp >>= Al; /* apply the point transform */

if (temp == 1) break; /* EOB = index of last newly-nonzero coef */

}

This way you avoid stepping through the whole array twice.

You could possibly furthermore reduce the array accesses if you establish

an additional 'conventional' EOB index cEOB:

2a.

for (cEOB = cinfo->Se; cEOB > 0; cEOB--) {

temp = (*block)[jpeg_natural_order[cEOB]];

if (temp < 0)

temp = -temp; /* temp is abs value of input */

temp >>= Al; /* apply the point transform */

if (temp) break; /* cEOB = index of last nonzero coef */

}

for (EOB = cEOB; EOB > 0; EOB--) {

temp = (*block)[jpeg_natural_order[EOB]];

if (temp < 0)

temp = -temp; /* temp is abs value of input */

temp >>= Al; /* apply the point transform */

if (temp == 1) break; /* EOB = index of last newly-nonzero coef */

}

Now use cEOB as main loop limit instead of Se, thus saving

to step through the last run of zeros again.

The condition after loop exit

if (r > 0 || BR > 0) { /* If there are trailing zeroes, */

has then also to be changed to

if (cEOB < Se || BR > 0) { /* If there are trailing zeroes, */

I haven't actually tried that further modification and am not quite

sure about it, just an idea (cEOB would be 40 in the above example).

Regards

Guido