Encoding multiple values in a single number


The Problem:

There have been countless times when I have needed to combine multiple values into a single number. This could be because of storage space constraints, the need to conform to a restricted format that does not allow the multiple values to be supplied separately, transmission efficiency, or a host of other reasons.

The usual way that I have seen this done, and I have used myself, is to multiply each of values by an appropriate power of 10 and add the numbers together, For instance, if I was storing the numbers 3, 9 and 4, I could just use

(v1 * 102) + (v2 * 101) + (v3 * 100) => x

(3 * 100) + (9 * 10) + (4 * 1) => 394

This is easy enough. However, a downside of this method is the operations needed to extract the individual numbers:

v1 = floor(x / 102) 
v2 = floor((x % 102) / 101) 
v3 = x % 10

v1 = floor(394 / 100) => 3
v2 = floor((394 % 100) / 10) => 9
v3 = 394 % 10 => 4

Of course, with this example, v1, v2 and v3 are limited to being between 0 and 9 inclusive. To store larger numbers, the powers of 10 would have to adjusted.

This method is simple enough and I have used it countless times. But I never liked the complexity of the extractions (especially when dealing with storing more that 3 values). I wondered if there was a different, and perhaps, better way.

Is There A Solution?

There is, but, while it is easier to use, it is much harder to set up. It uses the Chinese Remainder Theorem.

Firstly though: I am not a mathematician and understanding this does require some math skills. It took me some time to figure out what all the terminology and symbols meant before I could understand how this worked. That is one reason for this blog entry: to allow me to re-understand all this in the future (i.e. the day after tomorrow).

For example, the first line of the Wikipedia entry for the Chinese Remainder Theorem reads

In number theory, the Chinese remainder theorem states that if one knows the remainders of the Euclidean division of an integer n by several integers, then one can determine uniquely the remainder of the division of n by the product of these integers, under the condition that the divisors are pairwise coprime.

Did you follow that? I didn’t! I figured it out and I still can’t follow that!

I’ll try to explain some of the terminology and I will avoid all the math symbols that I can (however, I will still be using % to represent modulo). And please note, I am not a mathematics teacher and I will not be explaining in detail the Chinese Remainder Theorem, the Extended Euclidean Algorithm, Modulus Arithmetic, nor any of the mathematical techniques being used. I hope this article may help you if you are trying to understand any of them, but that was not my goal in writing this.

Modulo:
This is one I thought I knew: the remainder after a division, right? Wrong. Modulo and Remainder are different operations that give different results when dealing with negative numbers. This one bit me in the butt when I was working with Javascript which uses the % symbol, which most languages use to represent modulo, but Javascript uses to represent the remainder operation.
So, what is a modulo operation? The simple answer is that gives the signed remainder after a division (except in Javascript). The actual answer is a little fuzzy and I won’t get into it here. You can look it up on Wikipedia, or read Rob Conery’s excellent summary here.

Modulus:
This is another one I always got wrong. I was in the habit of referring to the % symbol as the ‘modulus‘ operator. It is actually the modulo operator. The modulus is the the divisor in the modulo operation. In 7 % 5 = 2, 5 is the modulus.

Coprime:
I’d seen this term, but until now, I never had to learn what it meant. Simply put, if a set of numbers do not share any prime factors then they are coprime. 10 and 9 are coprime since 10’s prime factors (2, 5) do not appear in 9’s prime factors (3). 10 and 12 are not coprime since 2 appears in both 10’s prime factors (2, 5) and 12’s prime factors (2, 3).

Modular Multiplicative Inverse:
This one is key to this method. The modular multiplicative inverse can be thought of this way: given two numbers n and m, what number, y, can we multiply n by so that y*n % m = 1. So for n=7 and m=5, a modular multiplicative inverse could be 3, since (3*7) % 5 = 21 % 5 = 1. 8 and 13 would also work as inverses, as would a whole series of other numbers.

A Simple Example:

The purpose of this example is to show the primary concepts involved in this method of encoding multiple values. Using the Chinese Remainder Theorem for this problem is definitely overkill, but I will do it anyway.

In this example two teams are competing in a best-out-of-seven competition and we wish to record the number of games won so far by each. We want to record both scores in a single number, yet be able to get each score out of that number.

So, in this case we have 2 values (v1, v2), each of which can have 1 of 8 values (0-7) that we wish to encode in a single number, x.

The first step is pick a number that is larger than v1‘s range. We will use 9. This will be the modulus that we will use to get v1‘s values from x. We will call this value n1.

Next, we pick a second number that is larger than v2‘s range, but it must be coprime with n1. We can use 10. So n2 = 10.

Note that there is nothing special about 9 and 10 (other than they meet the criteria that they are large enough to hold the various values of v1 and v2 and they are coprime). We could have chosen 8 and 51, or 240 and 11. But let’s stick with 9 and 10.

So that we can work this out, let us assign values to v1 and v2. The first team has won 3 games, and second team has won 2.

v1 = 3, n1 = 9
v2 = 2, n2 = 10

Our goal is find a number x so that x % 9 will give us a value of 3, and x % 10 will give us 2. To do this we need two more numbers that we will label m1 and m2. Each of these numbers will be the product of all the n#s for the other values. An easy way to obtain these numbers is get the product of all the n# values and divide out the particular n# that corresponds to the v#:

m1 = (n1*n2) / n1
m2 = (n1*n2) / n2
m1 = (9*10) / 9 = 10
m2 = (9*10) / 10 = 9

Since we only have 2 values, it is easy to see that the m# for one value is the n# of the other. But this won’t be the case when we have more than 2 values later.

For each value, we also want to find a modular multiplicative inverse of m# and n#. There is a process to find this number, but I won’t show that until we are working on the more complex example below. Besides, you have probably already worked out the easy-to-determine inverses for 10 and 9 (y=1) and 9 and 10 (y=9).

y1 = 1 : (m1*y1) % n1 = (10*1) % 9 = 1
y2 = 9 : (m2*y2) % n2 = (9*9) % 10 = 1

However, though we have worked out m1, m2, y1, and y2, those are not the numbers we are actually interested in. The numbers we want to remember are m1*y1 and m2*y2:

z1 = y1*m1 = 10*1 = 10
z2 = y2*m2 = 9*9 = 81

So we have

v1 = 3, n1 = 9,  m1 = 10, y1 = 1, z1 = 10
v2 = 2, n2 = 10, m2 = 9,  y2 = 9, z2 = 81

Why do we need all these numbers? What are they for?
– We know v1 and v2 are the values we are storing.
n1 and n2 will be used to get our v1 and v2 out of x (once we determine what x is).
m1, m2, y1 and y2 were only need to calculate z1 and z2. We can ignore them from now on.
z1 and z2 are new numbers that we will use to encode v1 and v2 in x.

Our final value x can now be calculated by getting the sum of the products of v# and z#.

x = v1*z1 + v2*z2 = 3*10 + 2*81 = 30 + 162 = 192

We finally have x, 192. When we wish to extract v1 and/or v2 from x, we can just do a modulo operation with n1 or n2

v1 = x % n1 = 192 % 9 = 3
v2 = x % n2 = 192 % 10 = 2

Tada! That was a lot of work just to encode two numbers, I agree. But, now that we have n1, n2, z1 and z2, we can easily encode and retrieve any v1 and v2.

v1 = 5, v2 = 1
x = 5*10 + 1*81 = 131

v1 = 131 % 9 = 5
v2 = 131 % 10 = 1

Okay, but why does this work? The secret is in how we derived z1 and z2.

Let’s look at z1:
z1 10 is m1*y1, and we know that (m1*y1) % n1 = 1 (10 % 9 = 1). Note then that if we multiply z1 by 2 we will double the result when we modulo with n1. Multiply by 3 and we triple the result:

2*(m1*y1) % n1 = 2
3*(m1*y1) % n1 = 3
...
2*(10) % 9 = 2
3*(10) % 9 = 3
...

This is key to how we will encode v1 3. We multiply v1*z1 3*10 and end up with a number 30 that modulo n1 9 gives us v1 3. The same relationship applies to v2, z2 and n2.

But we added the products together (v1*z1 + v2*z2). How does that not mess up the modulo operations?

Again, looking at z1:
z1 10 is the product of m1 and y1, and m1 10 is a multiple of all the other n# values (in our case, just the one: n2=10). That means:
z1 10 is a multiple of the other n#s (10).
v1*z1 is always a multiple of the other n#s, so (v1*z1) % n# is always 0 (e.g for v1=3: 30 % 10 = 0).
– we can add as many z1s as we want to into x and it won’t change the result of x % n2.
– multiplying z1 10 by any number only affects the result of x % 9, and never affects x % 10.

Similarly, z2 81 is a multiple of n1 9, so we can add as many 81s as we want into x (changing the result of x % 10) without ever changing x % 9.

Since each of these products (v#*z#) does not affect the modulo operation used to extract the other values, we can safely add the products together to get x.

A More Complicated Example (including the determination of y#):

In this case, we wish to store multiple values, each with differing ranges. In the case that I was working on, I was storing a game state which had to include the following information:
v1: A value indicating the current number of players (range: 1-4)
v2: A value indicating the current player number (range: 1-4)
v3: A value indicating the number of rounds played so far (range: 1-50)
v4: A value indicating the step in current round (range: 1-35)
v5: A value indicating the points collectively earned by the players (range: 1-50)

First, we obtain the numbers we will use to extract the v# values from x:

n1 = 5
n2 = 7
n3 = 53
n4 = 37
n5 = 59

Note: In this example, all the n# are primes. As we saw in the first example, this is not necessary: the requirement is that all the n# are coprime with each other. However, prime numbers are, by definition, coprime with all other prime numbers, and it is easier to pick prime numbers from a list than to examine the prime factors of different values in a search for suitable n#s.

We can create the m# values

n1*n2*n3*n4*n5 = 4049465

n1 = 5,  m1 = 4049465 / 5 = 809893   (= n2*n3*n4*n5)
n2 = 7,  m2 = 4049465 / 7 = 578495   (= n1*n3*n4*n5)
n3 = 53, m3 = 4049465 / 53 = 76405   (= n1*n2*n4*n5)
n4 = 37, m4 = 4049465 / 37 = 109445  (= n1*n2*n3*n5)
n5 = 59, m5 = 4049465 / 59 = 68635   (= n1*n2*n3*n4)

Next we need the y# values, the modulo multiplicative inverses for each m# and n#. As you can see, it is going to be a little harder to work this one out mentally. To find these values we must employ the Extended Euclidean Algorithm.

I used the Euclidean Algorithm in an earlier post. This extended version is similar and just adds the calculation of two addition values (one of which we will not be using and is not included in the code below).

public double MMI(double m, double n) 
{ 
  double r1 = m;
  double r2 = n;
  double s1 = 1;
  double s2 = 0;
  double r, s, q;
  
  while (r1 != 0) 
  {
    q = Math.Floor(r1 / r2); 
  
    r = r1;
    r1 = r2;
    r2 = r - q * r2;  
  
    s = s1;
    s1 = s2;
    s2 = s - q * s2;
  } 
  gcd = r; // this is the greatest common divisor of m and n
  mmi = s + ((s < 0) ? n : 0); // this is the modulo multiplicative inverse for m and n
  return mmi;
}

The value gcd is not used in our operations and can be ignored. The value y is our desired inverse. Using this method we can obtain

n1 = 5,  m1 = 809893, y1 = 2,  z1 = 809893 * 2  = 1619786
n2 = 7,  m2 = 578495, y2 = 1,  z2 = 578495 * 1  = 578495
n3 = 53, m3 = 76405,  y3 = 5,  z3 = 76405 * 5   = 382025
n4 = 37, m4 = 109445, y4 = 36, z4 = 109445 * 36 = 3940020 
n5 = 59, m5 = 68635,  y5 = 23, z5 = 68635 * 23  = 1578605

With our set of z#s we can now encode any set of 5 v#s into an x.

To encode that player 3 of 4 is on step 12 after 6 rounds and that the players have accumulated 22 points:

x = v1*z1 + v2*z2 + v3*z3 + v4*z4  + v5*z5 
x = 4*1619786 + 3*578495 + 6*382085 + 12*3940020 + 22*1578605 = 92516329

So the number 92516329 encapsulates our game state. Decoding the values just needs the use of our n#s:

numPlayers = x % n1 = 4
player     = x % n2 = 3
round      = x % n3 = 6
step       = x % n4 = 12
points     = x % n5 = 22

One reason why I like this method is illustrated above. The modulus operations are all independent of each other. I do not need to involve any other values, their ranges, their storage sizes in x, etc. in order to obtain a value. If I want v4, I just modulo x with n4.

And to show how and why this works, let’s repeat the logic we used from the first example:
Let’s look at z1:
z1 1619786 is m1*y1, and we know that (m1*y1) % n1 = 1 (1619786 % 5 = 1). Note then that if we multiply z1 by 2 we will double the result when we modulo with n1. Multiply by 3 and we triple the result:

2*(m1*y1) % n1 = 2
3*(m1*y1) % n1 = 3
...
2*1619786 % 5 = 2
3*1619786 % 5 = 3
...

This is how we encoded v1 4. We multiplied v1*z1 4*1619786 and ended up with a number 6479144 that modulo n1 5 gives us v1 4. The same relationship applies to v2, z2 and n2, v3, z3 and n3, etc.

To show that all the products can be added to create a viable x, we will look at z1 again:
z1 1619786 is the product of m1 and y1, and m1 809893 is a multiple of all the other n# values (n2*n3*n4*n5 = 809893). That means:
16198786 is a multiple of each of the other n#s (7, 53, 37, and 59).
v1*z1 is always a multiple of each of the other n#s, so (v1*z1) % n# is always 0 (e.g. for v1=4: 6479144 % 37 = 0).
– we can add as many z1s as we want to into x and it won’t change the result of x % n2, x % n3, x % n4, or x % n5.
– multiplying z1 1619786 by any number only affects the result of x % 5, and never affects x % 7, x % 53, x % 37, or x % 59.

Similarly, z2 578495 is a multiple of n1, n3, n4 and n5 so we can add as many 578495s as we want into x (changing the result of x % 7) without ever changing x % 5, x % 53, x % 37, or x % 59. And the same applies to z3, z4 and z5.

Since each of these products (v#*z#) does not affect the modulo operation used to extract the other values, we can safely add all the products together to get x 92516329.

Conclusion:

As I said earlier, this method is harder to set up than others (which is probably why I have not seen it being used before). But once the n# and z# numbers have been determined, it is easier and cleaner to use than any other method I have seen to date. The method works for 2 values, 5 values, 10 values, or however many values you can fit in whatever data type you are using. And it nicely handles sets of numbers with different ranges. It allows independent extraction of values. And I find that the non-obvious value x deters others from trying to tamper with stored values.

I hope that these examples will be enough to understand how this method works and how to make use of it.

Edit:

I can’t believe it! I got caught by the ‘does % mean modulus or remainder‘ gotcha again!

C#, the language I use the most, and which I used to write the example code to find the modular multiplicative inverse above, is another language that incorrectly says that the % operator is a modulus operator! It is not. % in C# returns the remainder, not the result of a modulus operation (how have I used this language for 2 decades and not run across this before?)

To fix this, I made a small correction to the MMI function above, which I highlighted in yellow. This change just ensures that the value returned for the z#s is not negative. As long as you don’t use negative v#s then this should address the problem. I have also changed the values shown for y4, z4 and x.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s