# Finding the highest and lowest set bits in a value.

Hurray! ! ! Today’s entry is not about SQL!

Well, not directly, anyway. I needed to do this for a problem I was solving in SQL, but the solution is not specific to SQL.

So given an integer `x`, how can you find the values of the highest (leftmost) and/or lowest (rightmost) bits that are set? Can it done without looping or bit-by-bit comparisons?

YES

If you know you will only be dealing with positive numbers, finding the indices of the high and low bits can be done using:

```highbit = FLOOR(LOG(x) / LOG(2))
lowbit = LOG(x & -x) / LOG(2)

C# users:
highbit = Math.Floor(Math.Log(x, 2));
lowbit = Math.Log(x & -x, 2)```

where the `bits` will be numbered from the rightmost bit starting at 0.

If you think you may also be processing negative numbers then you also need to know the number of bits that the integer is being stored in.

```bits = 8
highbit = FLOOR(LOG(x & (2 ^ bits - 1)) / LOG(2))
lowbit = LOG(x & -x) / LOG(2)

C# users:
highbit = Math.Floor(Math.Log(x & (int)Math.Pow(2, bits) - 1, 2));
lowbit = Math.Log(x & -x, 2);```

In both cases, test for a value of 0 first, as `highbit` and `lowbit` will both be undefined for 0.

So, how does this work? Warning: math ahead !

First, let’s look at the logarithmic (`LOG`) operations:

A common trick used in programming is to set a bit to 1 by `OR`ing a value with `2(index of the bit)`.

`flags = flags | (2 ^ bitnum)`

This works because `ny` is 1 followed by n 0s in base n.

```27 = 010000000 in binary (1 followed by 7 0s)
104 = 10000 in decimal (1 followed by 4 0s)```

Since logarithm is the inverse of exponentiation, then the log of `ny` in base n is y

```LOG(ny)n = y
e.g.:
Log(27)2 = Log(128)2 = 7
Log(104)10 = Log(10000)10 = 4```

So `LOG(x)n` is the number of digits (less 1) in x when x is represented in base n. Which is another way of saying the index of the highest digit (starting from 0).

If x is not a power of n then LOG(x)n still returns the number of digits, but that number is now a non-integer. So we apply FLOOR to get the closest lower integer value. Therefore

`FLOOR(LOG(x)n)`

gets us index of the highest digit.

```e.g.:
FLOOR(LOG(0110012)2) = FLOOR(LOG(25)2) = FLOOR(4.64385619) = 4 (011001 has 5 significant digits)
FLOOR(LOG(2001)10) = FLOOR(3.301247089) = 3 (2001 has 4 digits)```

But why divide by LOG(2)?

Well, unlike C#, most languages do not have a convenient way of calculating `LOG(x)n` directly. In most programming languages, `LOG(x)` is the natural logarithm of x (log of x in base e: `LOG(x)e`).

Fortunately, it doesn’t matter what base your system uses for `LOG()`. It just so happens that

`LOG(x)a / LOG(b)a = LOG(x)b`

So we can get `LOG(x)2 `by dividing `LOG(x)` by `LOG(2)` regardless of what base is used by the system. Hence

`highbit = FLOOR(LOG(x) / LOG(2))`

So wherever you see `LOG(x) / LOG(2)` think ‘index of the highest bit in x’.

For the `lowbit`, before getting the LOG() we `AND` the value with the negative of the value.

`(x & -x)`

Why?

Because of the way computers represent negative numbers. Almost all systems use the two’s complement of a number to represent the negative of a number. The two’s complement is the number subtracted from `2b` where `b` is the number bits of storage being used.

```e.g.                  28 is 100000000
28 stored in 8 bits is  00011100
10000000 - 00011100 =  11100100 = the two's complement of 28 = -28```

The two’s complement can also be seen as the binary representation of the number with all of the bits flipped, plus 1.

```e.g. 28 stored in 8 bits is  00011100
flipping the bits gives 11100011
+ 1 = 11100100 = the two's complement of 28 = -28```

This value has an interesting property:

Since it is the original number `x` with all the bits flipped (e.g. 11100011), `AND`ing it with `x` would produce 0. However, since 1 is added, all the consecutive 1 bits from the right (0 bits in the original `x`) are flipped to 0 and the rightmost 0 (1 in `x`) is flipped to 1. So, in the two’s complement:

• all digits to the left of `x`‘s lowest bit are flipped: when `ANDed with x, they will result in 0s.`
• all digits to the right of `x`‘s lowest bit are 0: when `ANDed with x, they will result in 0s.`
• `x`‘s lowest bit is set to 1: when `ANDed with x, it will be set to 1.`
```e.g. 28 stored in 8 bits is 00011100
-28 is 11100100
AND gives 00000100```

Taking the `LOG(x) / LOG(2)` of this result gives us the index of the `x`‘s lowest bit. And since one, and only one, bit is set to 1 in the result, the result is always a power of 2 and we do not need to use the `FLOOR()` function. So:

`lowbit = LOG(x & -x) / LOG(2)`

Handling negative numbers:

The last item to explain is why, when dealing with possible negative numbers, the formula for the `highbit` changes from

`highbit = FLOOR(LOG(x) / LOG(2))`

to

`highbit = FLOOR(LOG(x & (2 ^ bits - 1)) / LOG(2))`

This is because we cannot take the `LOG()` of a negative number: the result is an imaginary number. So we have to engage in a little skullduggery.

Negative numbers (as we saw above) are stored as two’s complements of the corresponding positive number. As a result of this, all positive numbers begin with 0 in the leftmost bit and all negative numbers have 1 there. In other words, the leftmost bit is just a sign bit and not actually part of the number. It should be ignored when determining the highest order bit for either positive or negative values.

To do this we create a mask: `2(number of bits) - 1`. This will give us a value equal to 1 in every bit except the leftmost one. When we `AND` this value with `x`, we effectively eliminate the leftmost bit.

```-17 in 16 bits is 11111111111101111
216  - 1 is 01111111111111111
AND gives 01111111111101111 (the original value without the leftmost bit)

+17 in 16 bits is 00000000000010001
216  - 1 is 01111111111111111
AND gives 00000000000010001 (the original value without the leftmost bit)
```

Since the result of this operation always has a 0 in the leftmost bit, it is seen as a positive number by the system and can safely be used in the LOG() operation. Since the remaining bits have the same bit pattern as the original value, our formula will return the correct `highbit` value for the original number.

We do not need to do this for the `lowbit` because we `AND` the number with its negative, which ensures that the leftmost bit is always `AND`ed with its opposite, thus always resulting in 0 in the leftmost bit, and so, the result is safe to use with `LOG()`.