How to clear MSB in a value

How to clear MSB in a value

Post by Donald McLachl » Thu, 31 Jan 2002 03:50:23



I'm not sure this is the right newsgroup, so please redirect me if you know of
a better one.  ... but other Unix programmers might have seen something like this.

I am aware of a "trick" to clear the LS '1' bit in a value:
        X = X & (X-1)

Does anyone know a similar trick to clear the MS '1' bit in a value?

Thanks,
Don

--

Communications Research Centre / RNS    Tel     (613) 998-2845
3701 Carling Ave.,                      Fax     (613) 998-9648
Ottawa, Ontario
K2H 8S2
Canada

 
 
 

How to clear MSB in a value

Post by Barry Margoli » Thu, 31 Jan 2002 04:09:42




>I'm not sure this is the right newsgroup, so please redirect me if you know of
>a better one.  

comp.lang.c seems like it would be OK.  But you're not the first to post
generic C questions here, and I'm sure you won't be the last.

Quote:>            ... but other Unix programmers might have seen something
>like this.

>I am aware of a "trick" to clear the LS '1' bit in a value:
>    X = X & (X-1)

That will work if X is odd, not if it's even.  E.g. if X is 4, (4 & 3)
results in 0.  The correct expression is:

X = X & ~1

Quote:

>Does anyone know a similar trick to clear the MS '1' bit in a value?

There are macros that expand into the number of bits in each integer type,
e.g. CHAR_BITS, UINT_BITS, etc.  Replace <type> by the appopriate prefix in
the expression below:

X = X & ~(1 << (<type>_BITS - 1))

--

Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

 
 
 

How to clear MSB in a value

Post by Donald McLachl » Thu, 31 Jan 2002 05:22:48


Maybe I did not phrase my original question precisely enough.




> >I'm not sure this is the right newsgroup, so please redirect me if you know of
> >a better one.  

> comp.lang.c seems like it would be OK.  But you're not the first to post
> generic C questions here, and I'm sure you won't be the last.

Not so much a C question as an algorythm question I think.

Quote:> >I am aware of a "trick" to clear the LS '1' bit in a value:
> >       X = X & (X-1)

> That will work if X is odd, not if it's even.  E.g. if X is 4, (4 & 3)
> results in 0.

which is exactly what I wanted. ie:

given 9, (1001) LS '1' bit is in pos 0.  9 & 8 = 8.  LS '1' bit was cleared.
given 6, (1000) LS '1' bit is in pos 3.  8 & 7 = 0.  LS '1' bit was cleared.

Is there a way given an arbitrary int value, to turn the bits off starting
from the other end?  ie:

given 9, (1001) => (0001)
given 1, (0001) => (0000)

Unless you have a really efficent bit reverser, please don't suggest the following

        y = bitrev(x);
        z = y & y-1
        x = bitrev(z);

Thanks,
Don

--

Communications Research Centre / RNS    Tel     (613) 998-2845
3701 Carling Ave.,                      Fax     (613) 998-9648
Ottawa, Ontario
K2H 8S2
Canada

 
 
 

How to clear MSB in a value

Post by Tobias Oe » Thu, 31 Jan 2002 05:21:54



> I'm not sure this is the right newsgroup, so please redirect me if you
> know of
> a better one.  ... but other Unix programmers might have seen something
> like this.

> I am aware of a "trick" to clear the LS '1' bit in a value:

> X = X & (X-1)

I assume this is C (or C++)

If X is some unsigned integer type, to clear the least significant bit do

X&=~1;  /* ~1 is 111111...110 in binary */

The most significant bit is a little harder, but you can use the fact that
unsigned arithmetic is done modulo 2. So (unsigned type) -1 is all bits
set, you just need to shift it on position to the right to get the binary
pattern 01111...11111

X&=((unsigned type) -1) >> 1;

Tobias.
btw: comp.lang.c is the place to ask C (NOT C++) questions. They're so good
that someone over there will find out something not quite right in the
above...

 
 
 

How to clear MSB in a value

Post by Barry Margoli » Thu, 31 Jan 2002 06:37:32




>Maybe I did not phrase my original question precisely enough.





>> >I'm not sure this is the right newsgroup, so please redirect me if you
>know of
>> >a better one.  

>> comp.lang.c seems like it would be OK.  But you're not the first to post
>> generic C questions here, and I'm sure you won't be the last.

>Not so much a C question as an algorythm question I think.

>> >I am aware of a "trick" to clear the LS '1' bit in a value:
>> >   X = X & (X-1)

>> That will work if X is odd, not if it's even.  E.g. if X is 4, (4 & 3)
>> results in 0.

>which is exactly what I wanted. ie:

>given 9, (1001) LS '1' bit is in pos 0.  9 & 8 = 8.  LS '1' bit was cleared.
>given 6, (1000) LS '1' bit is in pos 3.  8 & 7 = 0.  LS '1' bit was cleared.

Oh, I thought in "LS 1 bit" the 1 was a count of bits, not a bit value
(i.e. if you had said "LS 3 bits", I would have suggested X=X&~7).

Quote:>Is there a way given an arbitrary int value, to turn the bits off starting
>from the other end?  ie:

>given 9, (1001) => (0001)
>given 1, (0001) => (0000)

The best techniques I've seen for finding the MS 1 bit involve table
lookups.  To avoid having a table of 2^32 entries, a divide-and-conquer
approach is usually taken: grab the high-order 8 bits and use it as an
entry into a table of 256 entries.  If it was all zero, then repeat with
the next block of 8 bits, and so on until you find a non-zero block.

--

Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

 
 
 

How to clear MSB in a value

Post by Logan Sh » Thu, 31 Jan 2002 06:43:35





>The most significant bit is a little harder, but you can use the fact that
>unsigned arithmetic is done modulo 2.

  :
  :

Quote:>X&=((unsigned type) -1) >> 1;

It seems like you could also do it by shoving the integer so that the
MSB falls off into the abyss, then moving the integer back to where it
was originally:

        X = X << 1 >> 1;

A modern processor should be able to do this very fast:  two very
simple instructions and no pipeline stalls.

  - Logan
--
"I'll tell you something.  Luxury disgusts me."  Giorgio Armani, Jan 17, 2002
( http://dailynews.yahoo.com/h/nm/20020117/re/life_fashion_armani_dc_1.... )

 
 
 

How to clear MSB in a value

Post by Tobias Oe » Thu, 31 Jan 2002 06:44:54






>>The most significant bit is a little harder, but you can use the fact that
>>unsigned arithmetic is done modulo 2.
>   :
>   :
>>X&=((unsigned type) -1) >> 1;

> It seems like you could also do it by shoving the integer so that the
> MSB falls off into the abyss, then moving the integer back to where it
> was originally:

> X = X << 1 >> 1;

> A modern processor should be able to do this very fast:  two very
> simple instructions and no pipeline stalls.

Yes, but that's not what the OP wants either. I think we all got confused
by his use of LS and MS. For me the least significant bit is bit nr 0, for
him it's the first non zero bit (starting from the right). Similar for his
MS. No idea for a fast solution for his problem.
Tobias.
 
 
 

How to clear MSB in a value

Post by Donald McLachl » Thu, 31 Jan 2002 07:06:09



> Oh, I thought in "LS 1 bit" the 1 was a count of bits, not a bit value
> (i.e. if you had said "LS 3 bits", I would have suggested X=X&~7).

That is still not what I'm describing.  If any bits are set to 1 in a value,
I want to locate the least significant bit which is set to 1, and change its
value to 0.  x & (x-1) does exactly that. Try it for a few bit patterns and
you will see what I mean.  ie:

                           ---- ls '1' cleared ---
                          |                       |
                          v                       v
        func(0010 0001 0001 0000) => 0010 0001 0000 0000;
        func(0000 0010 0110 1001) => 0000 0010 0110 1000;
                               ^                       ^
                               |                       |
                                ---- ms '1' cleared ---

I was hoping to find a similarly clever (read fast) way of clearing the most
significant 1 bit in a value.  ie:

                ---- ms '1' cleared ---
               |                       |
               v                       v
        func(0010 0001 0001 0000) => 0000 0001 0001 0000;
        func(0000 0010 0110 1001) => 0000 0000 0110 1001;
                    ^                       ^
                    |                       |
                     ---- ms '1' cleared ---

Quote:> The best techniques I've seen for finding the MS 1 bit involve table
> lookups.  To avoid having a table of 2^32 entries, a divide-and-conquer
> approach is usually taken: grab the high-order 8 bits and use it as an
> entry into a table of 256 entries.  If it was all zero, then repeat with
> the next block of 8 bits, and so on until you find a non-zero block.

That is what I'm doing now.  I was hoping for a "trick" solution.  Maybe I'm
out of luck.

Thanks,
Don

--

Communications Research Centre / RNS    Tel     (613) 998-2845
3701 Carling Ave.,                      Fax     (613) 998-9648
Ottawa, Ontario
K2H 8S2
Canada

 
 
 

How to clear MSB in a value

Post by glmcl.. » Thu, 31 Jan 2002 06:36:31



> Is there a way given an arbitrary int value, to turn the bits off starting
> from the other end?  ie:

> given 9, (1001) => (0001)
> given 1, (0001) => (0000)

        Hmmm, well, I think the following should work. I used the modulus
of nearest powers of 2:

        x is the number you have
        y is the number you will end up with

        z = log(x)/log(2)       // log_2 (x)
        q = 2^(floor(z))        // get nearest power of 2
        y = x % q

Of course, this will break down if q is 0 or if x is <= 0, but I'm sure
you can deal with that.

Running a quick test seems to confirm that this is okay.
I'm sure there is some optimization you can do to this; I didn't look into
optimizing it.

#include <math.h>
#include <stdio.h>

int main (void)  {
  int i = 0;
  for (i = 1; i < 256; i ++)  {
    double z = log (i)/log (2);
    double q = pow (2, (int)z);
    printf ("%d -> %d\n", i, (z > 0 ? i % (int)(q) : 0));
  }

  return 0;

Quote:}

-- Greg McLearn

CS Graduate Student, University of Waterloo
Lab:    DC3548  Programming Languages Group
        +1 (519) 888-4567 x4822

 
 
 

How to clear MSB in a value

Post by Donald McLachl » Thu, 31 Jan 2002 07:18:58




> > X = X << 1 >> 1;

> > A modern processor should be able to do this very fast:  two very
> > simple instructions and no pipeline stalls.

> Yes, but that's not what the OP wants either. I think we all got confused
> by his use of LS and MS. For me the least significant bit is bit nr 0, for
> him it's the first non zero bit (starting from the right). Similar for his
> MS. No idea for a fast solution for his problem.
> Tobias.

Tobias,  that is exactly what I want, but I was not sure how to describe said bit
because:

- don't some differnet chip manufacturers number the bits from opposite ends? (TI?)
- if one does in-memory pointer tricks big/little endian comes into
  play (unless you do htonl()/ntohl() before and after parsing the memory).

Bit nr 0 is the LSB for me also.  Thus I was calling the bit containing the value '1'
which was closest to the LSB the LS '1' bit.  How else should one describe it in
an architecture independent way?

(oops, I just noticed I called it the MSB in the subject, not the MS '1' bit)

Don

--

Communications Research Centre / RNS    Tel     (613) 998-2845
3701 Carling Ave.,                      Fax     (613) 998-9648
Ottawa, Ontario
K2H 8S2
Canada

 
 
 

How to clear MSB in a value

Post by Donald McLachl » Thu, 31 Jan 2002 07:23:40



>    x is the number you have
>    y is the number you will end up with

>    z = log(x)/log(2)       // log_2 (x)
>    q = 2^(floor(z))        // get nearest power of 2
>    y = x % q

> Of course, this will break down if q is 0 or if x is <= 0, but I'm sure
> you can deal with that.

> Running a quick test seems to confirm that this is okay.
> I'm sure there is some optimization you can do to this; I didn't look into
> optimizing it.

> #include <math.h>
> #include <stdio.h>

> int main (void)  {
>   int i = 0;
>   for (i = 1; i < 256; i ++)  {
>     double z = log (i)/log (2);
>     double q = pow (2, (int)z);
>     printf ("%d -> %d\n", i, (z > 0 ? i % (int)(q) : 0));
>   }

>   return 0;
> }

I'll try profiling that against the method I currently have.

Thanks,
Don

--

Communications Research Centre / RNS    Tel     (613) 998-2845
3701 Carling Ave.,                      Fax     (613) 998-9648
Ottawa, Ontario
K2H 8S2
Canada

 
 
 

How to clear MSB in a value

Post by glmcl.. » Thu, 31 Jan 2002 07:16:02



> I'm sure there is some optimization you can do to this; I didn't look into
> optimizing it.

        That's because I am an idiot. :-)  In a moment of clarity, the
"correct" algorithm (or at least one that was thought of in more than 2
minutes), is one that you start from the end of the integer (ie. in the
31nd bit) and scan toward bit 0. The first 1 you find, change to a 0.
Of course, this may wreck havoc on integers that are otherwise supposed to
be interpreted as negative numbers, but I'm sure that can be dealt with.
You get an algorithm that is O(n) (n is the number of bits). I suppose this
could be further optimized by using a lookup table to locate the upper bits
of the number and starting there, rather than always looking from bit 31
(or however large your integers are).

-- Greg McLearn

CS Graduate Student, University of Waterloo
Lab:    DC3548  Programming Languages Group
        +1 (519) 888-4567 x4822

 
 
 

How to clear MSB in a value

Post by Barry Margoli » Thu, 31 Jan 2002 07:54:49






>> Oh, I thought in "LS 1 bit" the 1 was a count of bits, not a bit value
>> (i.e. if you had said "LS 3 bits", I would have suggested X=X&~7).

>That is still not what I'm describing.

I knew that.  I was explaining how I had misunderstood your original post.
If you had set "least/most significant bit that is set" you would have
avoided lots of confusion.

--

Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

 
 
 

How to clear MSB in a value

Post by Logan Sh » Thu, 31 Jan 2002 09:29:12




>Unless you have a really efficent bit reverser, please don't suggest the following

>    y = bitrev(x);
>    z = y & y-1
>    x = bitrev(z);

The fastest way I know to reverse bits looks something like this for a
32-bit value:

        x = ((x & 0x0000ffff) << 16) | ((x & 0xffff0000) >> 16);
        x = ((x & 0x00ff00ff) << 8)  | ((x & 0xff00ff00) >> 8);
        x = ((x & 0x0f0f0f0f) << 4)  | ((x & 0xf0f0f0f0) >> 4);
        x = ((x & 0x33333333) << 2)  | ((x & 0xcccccccc) >> 2);
        x = ((x & 0x55555555) << 1)  | ((x & 0xaaaaaaaa) >> 1);

A slightly less clear but more optimized version (which uses fewer
literal values):

        x = (x << 16) | (x >> 16);
        x = ((x << 8) & 0xff00ff00) | ((x & 0xff00ff00) >> 8);
        x = ((x << 4) & 0xf0f0f0f0) | ((x & 0xf0f0f0f0) >> 4);
        x = ((x << 2) & 0xcccccccc) | ((x & 0xcccccccc) >> 2);
        x = ((x << 1) & 0xaaaaaaaa) | ((x & 0xaaaaaaaa) >> 1);

I'm not sure if that would qualify as really efficient, especially
considering since you have to do it twice.  But really, it isn't all
that bad.  Sure, it has a boatload of ALU operations, but modern
processors specialize in chewing through register-to-register ALU
operations fast.  And all the code (for reversing and for clearing
the bit) can form a single basic block, which is a good thing.

My simple benchmarks indicate it seems to take about 15 clock cycles
to do a reverse on an Athlon system.  (I can reverse the bits of
about 40 million integers in one second on a 600 MHz machine.)

  - Logan
--
"I'll tell you something.  Luxury disgusts me."  Giorgio Armani, Jan 17, 2002
( http://dailynews.yahoo.com/h/nm/20020117/re/life_fashion_armani_dc_1.... )

 
 
 

How to clear MSB in a value

Post by Donald McLachl » Fri, 01 Feb 2002 02:28:24


Thanks to everyone for your help and patience.

Logan,

I was aware of those algorythms (I have a copy of this comp.arch message on this:
1991Oct1.100233.1...@odin.diku.dk> torb...@diku.dk (Torben [gidius Mogensen))

[ BTW, does anyone know a good source of similar simple but efficient algorythms? ]

I coded up different versions of bitrev, and your second version was the
fastest (but x must be unsigned).

I coded up 4 versions of msb_clear() and while

        bitrev(x); x = x & (x-1) ; bitrev(x)

was fast, it was not the fastest (but there was not much between them).
I included a copy of the lsb_clear() for comparison purposes.
Source code follows.

obelix don> uname -a
SunOS obelix 5.7 Generic_106541-17 sun4u sparc SUNW,UltraSPARC-IIi-Engine
obelix don> !gcc
gcc -O4 -fno-inline -ansi -pedantic -Wall clearmsb.c -pg
obelix don> a.out
obelix don> gprof a.out
 [ snip ]
granularity: each sample hit covers 4 byte(s) for 0.00% of 643.48 seconds

   %  cumulative    self              self    total          
 time   seconds   seconds    calls  ms/call  ms/call name    
 56.4     362.97   362.97                            internal_mcount [1]
  9.9     426.40    63.43 330000000     0.00     0.00  msb_clear_v2 [4]
  9.8     489.37    62.97 330000000     0.00     0.00  msb_clear_v1 [5]
  8.7     545.35    55.98 330000000     0.00     0.00  msb_clear_v4 [6]
  8.5     599.87    54.52 330000000     0.00     0.00  msb_clear_v3 [7]
  2.8     618.19    18.32        1 18320.00 267590.00  main [2]
  2.0     631.11    12.92                            mcount (139)
  1.9     643.48    12.37 330000000     0.00     0.00  lsb_clear [8]
 [ snip ]

-----

#include <stdio.h>

int lsb_clear(int n)
{
        return(n & (n-1));

}

int msb_clear_v1(int x)
{
        x = ((x >> 16) & 0x0000ffff) | ((x & 0x0000ffff) << 16);
        x = ((x >>  8) & 0x00ff00ff) | ((x & 0x00ff00ff) <<  8);
        x = ((x >>  4) & 0x0f0f0f0f) | ((x & 0x0f0f0f0f) <<  4);
        x = ((x >>  2) & 0x33333333) | ((x & 0x33333333) <<  2);
        x = ((x >>  1) & 0x55555555) | ((x & 0x55555555) <<  1);

        x = x & (x-1);

        x = ((x >> 16) & 0x0000ffff) | ((x & 0x0000ffff) << 16);
        x = ((x >>  8) & 0x00ff00ff) | ((x & 0x00ff00ff) <<  8);
        x = ((x >>  4) & 0x0f0f0f0f) | ((x & 0x0f0f0f0f) <<  4);
        x = ((x >>  2) & 0x33333333) | ((x & 0x33333333) <<  2);
        x = ((x >>  1) & 0x55555555) | ((x & 0x55555555) <<  1);

        return(x);

}

int msb_clear_v2(int x)
{
        int y, i;
        static int mask[32], initialised;

        if(!initialised)
        {
                for(i = 0; i < 32; ++i)
                        mask[i] = ~(1 << i);

                initialised = 1;
        }

        for(y = x, i = 0; y > 0; ++i, y = y << 1)
                ;

        return(x & mask[31 - i]);

}

int msb_clear_v3(int x)
{
        int i, j;
        static unsigned char lut[256], initialised;
        unsigned char *c;

        if(!initialised)
        {
                for(i = 0; i < 256; ++i)
                {
                        for(j = 0x80; j > 0; j = j >> 1)
                        {
                                if(i & j)
                                {
                                        lut[i] = i ^ j;
                                        break;
                                }
                        }
                }

                initialised = 1;
        }

/*      x = htonl(x);   */                      /* big/little endian fix */
        c = (unsigned char *)&x;

        for(i = 0; i < sizeof(int); ++i, ++c)
        {
                if(*c != 0)
                {
                        *c = lut[(unsigned int)*c];
                        break;
                }
        }

/*      x = ntohl(x);   */                      /* big/little endian fix */
        return(x);

}

int msb_clear_v4(unsigned int x)
{
        x = (x << 16) | (x >> 16);
        x = ((x << 8) & 0xff00ff00) | ((x & 0xff00ff00) >> 8);
        x = ((x << 4) & 0xf0f0f0f0) | ((x & 0xf0f0f0f0) >> 4);
        x = ((x << 2) & 0xcccccccc) | ((x & 0xcccccccc) >> 2);
        x = ((x << 1) & 0xaaaaaaaa) | ((x & 0xaaaaaaaa) >> 1);

        x = x & (x-1);

        x = (x << 16) | (x >> 16);
        x = ((x << 8) & 0xff00ff00) | ((x & 0xff00ff00) >> 8);
        x = ((x << 4) & 0xf0f0f0f0) | ((x & 0xf0f0f0f0) >> 4);
        x = ((x << 2) & 0xcccccccc) | ((x & 0xcccccccc) >> 2);
        x = ((x << 1) & 0xaaaaaaaa) | ((x & 0xaaaaaaaa) >> 1);

        return(x);

}

int main()
{
        int i, j, x, a, b, c, d;

        for(j = 0; j < 10000000; ++j)
        {
                for(x = i = 0; i < 33; ++i)
                {
                        if(i)
                                x = (x << 1) | 1;
                        a = lsb_clear(x);
                        a = msb_clear_v1(x);
                        b = msb_clear_v2(x);
                        c = msb_clear_v3(x);
                        d = msb_clear_v4(x);

                        if(a != b || b != c || c != d)
                        {
                                printf("x = %08x, a = %08x, b = %08x,\
                                        c = %08x, d = %08x\n",
                                        x, a, b, c, d);
                                break;
                        }
                }
        }

        return(0);

}

--
Donald McLachlan                        E-mail  Donald.McLach...@crc.ca
Communications Research Centre / RNS    Tel     (613) 998-2845
3701 Carling Ave.,                      Fax     (613) 998-9648
Ottawa, Ontario
K2H 8S2
Canada