# Encoding multiple values in a single number

## The Problem:

There have been countless times when I have needed to combine multiple values into a single number. This could be because of storage space constraints, the need to conform to a restricted format that does not allow the multiple values to be supplied separately, transmission efficiency, or a host of other reasons.

The usual way that I have seen this done, and I have used myself, is to multiply each of the values by an appropriate power of 10 and add the numbers together, For instance, if I was storing a 1-digit number (3), a 2-digit number (62), and 2 more 1-digit numbers (9, 4), I could just use

```(v1 * 104) +
(v2 * 102) +
(v3 * 101) *
(v4 * 100) => x```

```(3 * 10000) +
(62 * 100) +
(9 * 10) +
(4 * 1) => 36294```

This is easy enough. However, a downside of this method is the operations needed to extract the individual numbers:

```v1 = floor(x / 104)
v2 = floor((x % 104) / 102)
v3 = floor((x % 102) / 101)
v4 = x % 10```

```v1 = floor(36294 / 10000) => 3
v2 = floor((36294 % 10000) / 100) => 62
v3 = floor((36294 % 100) / 10) => 9
v4 = 394 % 10 => 4```

The method is simple enough and I have utilized it countless times. But I never liked the complexity of the extractions (especially when dealing with storing more that 2 or 3 values). I wondered if there was a different, and perhaps, better way.

## Is There A Solution?

There is, but, while it is easier to use, it is much harder to set up. It uses the Chinese Remainder Theorem.

Firstly though: I am not a mathematician and understanding this does require some math skills beyond basic chequebook balancing. It took me some time to figure out what all the terminology and symbols meant before I could understand how this worked. That is one reason for this blog entry: to allow me to re-understand all this in the future (i.e. the day after tomorrow).

For example, the first line of the Wikipedia entry for the Chinese Remainder Theorem reads

In number theory, the Chinese remainder theorem states that if one knows the remainders of the Euclidean division of an integer n by several integers, then one can determine uniquely the remainder of the division of n by the product of these integers, under the condition that the divisors are pairwise coprime.

Did you follow that? I didn’t! I figured it out and I still can’t follow that!

I’ll try to explain some of the terminology and I will avoid all the math symbols that I can (however, I will still be using `%` to represent modulo). And please note, I am not a mathematics teacher and I will not be explaining in detail the Chinese Remainder Theorem, the Extended Euclidean Algorithm, Modulo Arithmetic, nor any of the mathematical techniques being used. I hope this article may help you if you are trying to understand any of them, but that was not my goal in writing this.

But first, some terminology.

Modulo:
This is one I thought I knew: the remainder after a division, right? Wrong. Modulo and Remainder are different operations that give different results when dealing with negative numbers. This one bit me in the butt when I was programming (since most programming languages have a `%` operation (or Mod function), which they call a “modulo operation”, but most often is a “remainder operation”).
So, what is a modulo operation? The simple answer is that gives a set of signed values equivalent to the possible remainders after divisions. The actual answer is a little fuzzy and I won’t get into it here. You can look it up on Wikipedia, or read Rob Conery’s excellent summary here.

So, while we programmers are used to seeing

`7 % 5 = 2`
`-7 % 5 = -2`
`7 % -5 = 2`

`7 % 5 = 2, 7, 12, 17...`
`-7 % 5 = you don't want to know`
`7 % -5 = OMG ! ! !`

For the rest of this post, a modulo operator, %, is assumed to return the lowest positive remainder after a division (formally: the remainder of the Euclidean division), so `7 % 5 = 2`.

Modulus:
This is another one I always got wrong. I was in the habit of referring to the `%` symbol as the ‘modulus‘ operator (in fact, many programming languages do this too). It is actually the modulo operator. The modulus is the the divisor in the modulo operation. In `7 % 5 = 2`, 5 is the modulus.

Coprime:
I’d seen this term, but until now, I never had to learn what it meant. Simply put, if a set of numbers do not share any prime factors then they are coprime. 10 and 9 are coprime since 10’s prime factors (2, 5) do not appear in 9’s prime factors (3). 10 and 12 are not coprime since 2 appears in both 10’s prime factors (2, 5) and 12’s prime factors (2, 3).

Modular Multiplicative Inverse:
This one is key to this method. The modular multiplicative inverse can be thought of this way: given two numbers `m` and `n`, what number, `y`, can we multiply `m` by so that the result is 1 greater than a multiple of `n` (i.e. `y*m % n = 1`). So for `m=7` and `n=5`, a modular multiplicative inverse could be 3, since `(3*7) % 5 = 21 % 5 = 1`. But 8 and 13 would also work as inverses, as would a whole series of other numbers.

## A Simple Example:

The purpose of this example is to show the primary concepts involved in this method of encoding multiple values. Using the Chinese Remainder Theorem for this problem is definitely overkill, but I will do it anyway.

In this example two teams are competing in a set of 7 games and we wish to record the number of games won so far by each. We want to record both scores in a single number, yet be able to get each score out of that number.

So, in this case we have 2 values (`v1, v2`), each of which can have 1 of 8 values (0-7) that we wish to encode in a single number, `x`.

The first step is pick a number that is larger than `v1`‘s range. We will use 9. This will be the modulus that we will use to get `v1`‘s values from `x`. We will call this value `n1`.

Next, we pick a second number that is larger than `v2`‘s range, but it must be coprime with `n1`. We can use 10. So `n2 = 10`.

Note that there is nothing special about 9 and 10 (other than they meet the criteria that they are large enough to hold the various values of `v1` and `v2 `and they are coprime). We could have chosen 8 and 51, or 240 and 11. But let’s stick with 9 and 10.

So that we can work this out, let us assign values to `v1` and `v2`. The first team has won 3 games, and the second team has won 2.

```v1 = 3, n1 = 9
v2 = 2, n2 = 10```

Our goal is find a number `x` so that `x % 9` will give us a value of 3, and `x % 10` will give us 2. To do this we need two more numbers that we will label m1 and m2. Each of these numbers will be the product of all the n#s for the other values. An easy way to obtain these numbers is get the product of all the n# values and divide out the particular n# that corresponds to the v#:

```m1 = (n1*n2) / n1
m2 = (n1*n2) / n2```
```m1 = (9*10) / 9 = 10
m2 = (9*10) / 10 = 9```

Since we only have 2 values, it is easy to see that the `m#` for one value is the `n#` of the other. But this won’t be the case when we have more than 2 values later.

For each value, we also want to find a modular multiplicative inverse of `m#` and `n#`. There is a process to find this number, but I won’t show that until we are working on the more complex example below. Besides, you have probably already worked out the easy-to-determine inverses for 10 and 9 (`y=1`) and 9 and 10 (`y=9`).

```y1 = 1 : (m1*y1) % n1 = (10*1) % 9 = 1
y2 = 9 : (m2*y2) % n2 = (9*9) % 10 = 1```

However, though we have worked out `m1`, `m2`, `y1`, and `y2`, those are not the numbers we are actually interested in. The numbers we want to remember are `m1*y1` and `m2*y2`:

```z1 = y1*m1 = 10*1 = 10
z2 = y2*m2 = 9*9 = 81```

So we have

```v1 = 3, n1 = 9,  m1 = 10, y1 = 1, z1 = 10
v2 = 2, n2 = 10, m2 = 9,  y2 = 9, z2 = 81```

Why do we need all these numbers? What are they for?
– We know `v1` and `v2` are the values we are storing.
`n1` and `n2` will be used to get our `v1` and `v2` out of `x` (once we determine what `x` is).
`m1`, `m2`, `y1` and `y2` were only need to calculate `z1` and `z2`. We can ignore them from now on.
`z1` and `z2` are new numbers that we will use to encode `v1` and `v2` in `x`.

Our final value `x` can now be calculated by getting the sum of the products of `v#` and `z#`.

`x = v1*z1 + v2*z2 = 3*10 + 2*81 = 30 + 162 = 192`

We finally have `x`, 192. When we wish to extract `v1` or `v2` from `x`, we can just do a modulo operation with `n1` or `n2`

```v1 = x % n1 = 192 % 9 = 3
v2 = x % n2 = 192 % 10 = 2```

Tada! That was a lot of work just to encode two numbers, I agree. But, now that we have n1, n2, z1 and z2, we can easily encode and retrieve any `v1` and `v2`.

```v1 = 5, v2 = 1
x = 5*10 + 1*81 = 131

v1 = 131 % 9 = 5
v2 = 131 % 10 = 1```

Okay, but why does this work? The secret is in how we derived `z1` and `z2`.

Let’s look at `z1`:
`z1 10` is `m1*y1`, and we know that `(m1*y1) % n1 = 1` (`10 % 9 = 1`). Note then that if we multiply `z1` by 2 we will double the result when we modulo with n1. Multiply by 3 and we triple the result:

```2*(m1*y1) % n1 = 2
3*(m1*y1) % n1 = 3
...```
```2*(10) % 9 = 2
3*(10) % 9 = 3
...```

This is key to how we will encode `v1 3`. We multiply `v1*z1 3*10` and end up with a number `30` that modulo `n1 9` gives us `v1 3`. The same relationship applies to `v2`, `z2` and `n2`.

But we added the products together (`v1*z1 + v2*z2`). How does that not mess up the modulo operations?

Again, looking at `z1`:
`z1 10` is the product of `m1` and `y1`, and `m1 10` is a multiple of all the other `n#` values (in our case, just the one: `n2=10`). That means:
`z1 10` is a multiple of the other `n#`s (`10`).
`v1*z1`, also, is always a multiple of the other `n#`s, so `(v1*z1) % n#` is always 0 (e.g: 30 % 10 = 0).
– we can add as many `z1`s `10`s as we want to into `x` and it won’t change the result of `x % n2`.
– adding the product of `z1` `10` and any number to `x` only affects the result of `x % 9`, and never affects `x % 10`.

Similarly, `z2 81` is a multiple of `n1 9`, so we can add as many `81`s as we want into `x` (changing the result of `x % 10`) without ever changing `x % 9`.

Since each of these products (`v#*z#`) can be added into x without affecting the result when a modulo operation is done with a different n#, then we can safely add the products together to get `x`.

## A More Complicated Example (including the determination of y#):

In this case, we wish to store multiple values, each with differing ranges. In the case that I was working on, I was storing a game state which had to include the following information:
`v1`: A value indicating the current number of players (range: 1-4)
`v2`: A value indicating the current player number (range: 1-4)
`v3`: A value indicating the number of rounds played so far (range: 1-50)
`v4`: A value indicating the step in the current round (range: 1-35)
`v5`: A value indicating the points collectively earned by the players (range: 1-50)

First, we obtain the numbers we will use to extract the v# values from x:

```n1 = 5
n2 = 7
n3 = 53
n4 = 37
n5 = 59```

Note: In this example, all the `n#` are primes. As we saw in the first example, this is not necessary: the requirement is that all the `n#` are coprime with each other. However, prime numbers are, by definition, coprime with all other prime numbers, and it is easier to pick prime numbers from a list than to examine the prime factors of different values in a search for suitable `n#`s.

We can create the m# values

```n1*n2*n3*n4*n5 = 4049465

n1 = 5,  m1 = 4049465 / 5 = 809893   (= n2*n3*n4*n5)
n2 = 7,  m2 = 4049465 / 7 = 578495   (= n1*n3*n4*n5)
n3 = 53, m3 = 4049465 / 53 = 76405   (= n1*n2*n4*n5)
n4 = 37, m4 = 4049465 / 37 = 109445  (= n1*n2*n3*n5)
n5 = 59, m5 = 4049465 / 59 = 68635   (= n1*n2*n3*n4)```

Next we need the y# values, the modulo multiplicative inverses for each m# and n#. As you can see, it is going to be a little harder to work this one out mentally. To find these values we must employ the Extended Euclidean Algorithm.

I used the Euclidean Algorithm in an earlier post. This extended version is similar and just adds the calculation of two addition values (one of which we will not be using and is not included in the code below).

```public double MMI(double m, double n)
{
double r1 = m;
double r2 = n;
double s1 = 1;
double s2 = 0;
double r, s, q, gcd, mmi;

while (r1 != 0)
{
q = Math.Floor(r1 / r2);

r = r1;
r1 = r2;
r2 = r - q * r2;

s = s1;
s1 = s2;
s2 = s - q * s2;
}
gcd = r; // this is the greatest common divisor of m and n
mmi = s + ((s < 0) ? n : 0); // this is the modulo multiplicative inverse for m and n
return mmi;
}```

The value `gcd `is not used in our operations and can be ignored. The value `mmi` is our desired `y#`. Using this method we can obtain

```n1 = 5,  m1 = 809893, y1 = 2,  z1 = 809893 * 2  = 1619786
n2 = 7,  m2 = 578495, y2 = 1,  z2 = 578495 * 1  = 578495
n3 = 53, m3 = 76405,  y3 = 5,  z3 = 76405 * 5   = 382025
n4 = 37, m4 = 109445, y4 = 36, z4 = 109445 * 36 = 3940020
n5 = 59, m5 = 68635,  y5 = 23, z5 = 68635 * 23  = 1578605```

With our set of `z#`s we can now encode any set of 5 `v#`s into an `x`.

To encode that player 3 of 4 is on step 12 after 6 rounds and that the players have accumulated 22 points:

```x = v1*z1     + v2*z2    + v3*z3    + v4*z4      + v5*z5
x = 4*1619786 + 3*578495 + 6*382085 + 12*3940020 + 22*1578605
x = 6479144   + 1735485  + 2292510  + 47280240   + 34729310 = 92516329```

So the number `92516329` encapsulates our game state. Decoding the values just needs the use of our `n#`s:

```numPlayers = x % n1 = 4
player     = x % n2 = 3
round      = x % n3 = 6
step       = x % n4 = 12
points     = x % n5 = 22```

One reason why I like this method is illustrated above. The modulus operations are all independent of each other. I do not need to involve any other values, their ranges, their storage sizes in x, etc. in order to obtain a value. If I want `v4`, I just modulo `x` with `n4`.

And to show how and why this works, let’s repeat the logic we used from the first example:
Let’s look at `z1`:
`z1 1619786 `is `m1*y1`, and we know that `(m1*y1) % n1 = 1` (`1619786 % 5 = 1`). Note then that if we multiply `z1` by 2 we will double the result when we modulo with n1. Multiply by 3 and we triple the result:

```2*(m1*y1) % n1 = 2
3*(m1*y1) % n1 = 3
...```
```2*1619786 % 5 = 2
3*1619786 % 5 = 3
...```

This is how we encoded `v1 4`. We multiplied `v1*z1` `4*1619786` and ended up with a number `6479144` that modulo `n1 ``5` gives us `v1 4`. The same relationship applies to `v2`, `z2` and `n2`, `v3`, `z3` and `n3`, etc.

To show that all the products can be added to create a viable `x`, we will look at `z1` again:
`z1 1619786` is the product of `m1` and `y1`, and `m1 809893` is a multiple of all the other `n#` values (`n2*n3*n4*n5 = 809893`). That means:
`z1` `16198786` is a multiple of each of the other `n#`s (`7`, `53`, `37`, and `59`).
`v1*z1` is always a multiple of each of the other `n#`s, so `(v1*z1) % n#` is always 0 (e.g.: `6479144 % 7 = 0`, `6479144 % 53 = 0`, `6479144 % 37 = 0`, `6479144 % 59 = 0`).
– we can add as many `z1`s `1619786`s as we want to into `x` and it won’t change the result of `x % n2`, `x % n3`, `x % n4`, or `x % n5`.
– adding the product of `z1` `1619786` and any number to `x` only affects the result of `x % 5`, and never affects `x % 7`, `x % 53`, `x % 37`, or `x % 59`.

Similarly, `z2 578495` is a multiple of `n1`, `n3`, `n4` and `n5` so we can add as many `578495`s as we want into `x` (changing the result of `x % 7`) without ever changing `x % 5`, `x % 53`, `x % 37`, or `x % 59`. The same relationships applies to `z3`, `z4` and `z5`.

Since each of these products (`v#*z#`) can be added into x without affecting the result when a modulo operation is done with a different n#, then we can safely add all the products together to get `x` `92516329`.

## Conclusion:

As I said earlier, this method is harder to set up than others (which is probably why I have not seen it being used before). But once the `n#` and `z#` numbers have been determined, it is easier and cleaner to use than any other method I have seen to date. The method works for 2 values, 5 values, 10 values, or however many values you can fit in whatever data type you are using. And it nicely handles sets of numbers with different ranges. It allows independent extraction of values. And I find that the non-obvious value `x `deters others from trying to tamper with stored values.

I hope that these examples will be enough to understand how this method works and how to make use of it.

Edit:

I can’t believe it! I got caught by the ‘does `%` mean modulo or remainder‘ gotcha again!

C#, the language I use the most, and which I used to write the example code to find the modular multiplicative inverse above, is another language that incorrectly says that the `%` operator is a modulo operator! It is not. `% `in C# returns the remainder, not the result of a modulo operation (how have I used this language for 2 decades and not run across this before?)

To fix this, I made a small correction to the MMI function above, which I highlighted in yellow. This change just ensures that the value returned for the `z#`s is not negative. As long as you don’t use negative `v#`s then this should address the problem. I have also changed the values shown for `y4`, `z4` and `x`.