Vitalik Buterin46:34
If you just think of this as being just a multiplication, then if it was an encryption of zero, it's just going to give you a multiple of 2. If it was an encryption of one, then this gives you a multiple of Q plus half of Q. If you multiply this by P over Q, then multiples of Q would turn into multiples of P, and multiples of half of Q turn into half of P. One example: if Q is 10,000 and P is 1,000, then if a prime dot s is say 50,000 plus a bit of error, then if you go down from Q to P, multiplied by P over Q, it's basically P is 1,000, Q is 10,000, so divided by 10. Your 50,000 becomes 5,000, and 5,000 mod 1,000 is still 0. But if you instead had say 55,000, then 55,000 times P over Q is 5,500, and mod 1,000 is 500. The error introduced by the division not being perfect is still a small error, so it just mixes together with the error that already exists. This should really say e prime is supposed to be e, it's a good correction. Yes, sorry, that's correct, this should be prime. Even the contribution from e also changes. Going back to this example, if a prime dot s was 55,032, so your error is 32, then if you multiply by P over Q, 55,032 becomes 5,503.2, so mod 1,000 it's 503. 500 is P over 2, so your new error is 3. The ratio between the error and the modulus size remains the same. It seems like you're not accomplishing anything, because the error relative to the modulus is the same. But the key is that you can do this repeatedly. You can keep doing this rescaling operation, and each time you reduce the absolute size of the error. If you do it enough times, you can bring the error down to a small constant. This is the key insight that allows you to do many multiplications without the error blowing up. So that's the magic trick.
Ratio is the same but you're actually accomplishing something really important, which I'll get into in the next slide. So I guess, so right now, right? Yes. Are people satisfied that if you have a ciphertext whose error gets bigger, you can turn it into a ciphertext whose error as a number is bounded by some constant, but it's just a modulus as to keep, like, is being reduced, right? So this is kind of the lesson here.
Yeah, okay. So here is why we switch, right? So there's two reasons to switch, actually. So the first reason to switch is this makes bootstrapping actually practical, right? So bootstrapping involves basically computing this dot product mod Q, and computing the star product mod Q has a depth circuit depth of log Q. And by reducing the modulus, what we can do is we can basically turn something which to its decryption requires a circuit depth of log Q to something that has a circuit depth of log log P, and P here, like I said, can be very small, right? Once you do the modulus switch, you don't have to actually do any more computation, so P can be a pretty small constant. And so this basically means that the circuit depth of bootstrapping becomes a constant, and so you can just increase Q to as much as you need in order to make the bootstrapping procedure actually possible.
Yeah, so in the bootstrapping key, basically the way that we do it, bootstrapping is just kind of computing this dot product mod P, or kind of as a circuit. And so to make this sum computable, you basically just provide the s_i x 2^d values in binary representation, right? So for every s_i times 2^d, you get, well, you basically, and this is modulo P, then you provide a binary representation of the s's, so we have n log n log P numbers in binary representation. So for every one of these terms, you have your s_n, and basically to compute an s_n, you take, for every one bit that's in a_n, because you have a_n in the clear, you multiply that by the corresponding bit representation of the power of s_n. And so you just have a bunch of these numbers, and these are numbers in binary representation, right? So they're not single ciphertexts, they're log P ciphertexts. And then you just add a whole bunch of these numbers together, and then you take modulo. If you like, the best choice for P is going to be 2^n - 1, and that makes modulo really easy because you just make your addition circuits wraparound.
So how do we implement these addition circuits, right? So basically, the addition circuit is just going to be, you have these n log P numbers, and you just have to add together these numbers, right? And then the modulo, basically your addition circuits, instead of carrying out onto even higher bits, every time you carry past the size of P, you just kind of wrap around, right? And then that just gives you modulo 2^n - 1 for free, right? So if your modulus was 1023, then your bits are in 128, 256, 512, 1024 is equal to 1, so you just kind of wrap that around to 1, so modulo-ing is really easy. So your setting is you just have to add together a whole bunch of numbers, right? And the first step is that you can reduce three numbers to two numbers, so you can have a kind of three-to-two adder that has multiplicative depth one, right? And the way that this works is basically that every bit of P is going to be just the XOR of the corresponding bits of A, B, and C, and every bit of Q is going to be the two-of-three function of A, B, and C, right? So this is 0 if A+B+C is 0 or 1, and 1 if A+B+C is 2 or 3. And you can think of the S as being the carry digit, right? And the Q you also kind of multiply by 2, which basically means you just kind of left shift Q by 1, right? So the reason why this works is basically because every bit in A, B, and C gets represented inside of P, but then in the specific case where you have kind of two bits that turned onto one, then inside of here they get turned into 0, but you get, you flip the corresponding bit from a 0 to a 1, and then the Q gets kind of moved to the left by 1, and so that multiplies it by 2. And so basically, every individual bit in A, B, or C gets some kind of correctly represented in P plus Q, right? So this is a kind of three-to-two adder, and this has multiplicative depth of one. Adding two numbers X plus Y does require logarithmic multiplicative depth, right? So when you go from N numbers to two numbers, that requires a multiplicative depth which is basically logarithmic in N, so order of making the number of numbers, but then adding the last two does require multiplicative depth log log (X+Y), and here X and Y are mod P, so basically saying log log P, or log of the number of digits of P, right? So if P is 1023, then P has 10 digits, the log of 10 is going to be about 4, and so your multiplicative depth is just going to be 4. Now the reason why adding circuits have this non-trivial logarithmic multiplicative depth is basically because you have to worry about threading the carry bits through the number. And in the code that I'll link to later, I have algorithms for doing this. So in general, computing a linear combination of size n log P is going to take size, so first of all you have these N numbers, and second, you have this kind of, the sum is going to be a size n log P, and so it's kind of log of n log P, and so this becomes log n plus log log P, and so the total O is going to be log n plus log log P. And you can pretty easily just set Q to be high enough that this is less than log log Q, and so you can basically implement the decryption circuit, and so we can bootstrap.
So here's optimization number two, right? So I mentioned back when we were over here, right, that there are two tricks to overcome overflow: one is bootstrapping, and the other is clever tricks to make your multiplication only increase your error by a constant factor. So this is the clever tricks that I'm going to talk about, right? So basically, imagine if instead of just doing a modulus switch before bootstrapping, we do a modulus switch after every multiplication, right? So if you don't do this, right, so if you imagine a circuit that has some multiplicative depth, then your error is going to square every time you do a multiplication, right? So if you imagine your first ciphertext, let's say the modulus is 10^16 - 1 and your ciphertext has error 10^2, if you square it, the error is going to be 10^4, actually it'll be 10^4 times some constant, but let's say it's 10^4. Then the third time you multiply, the error is 10^8, and the fourth time you multiply, the error is 10^16, and then the error kind of overwhelms the ciphertext, and then decryption starts failing. So if you just do this kind of fully homomorphic encryption naively, then after a logarithmic number of multiplications, you're screwed. Now what happens if you do modulus reduction, right? So you have X which has an error of 10^2 modulo 10^16 - 1, then you multiply it, and so you have X^2 with an error of 10^4. Now if you perform a reduction, and so here you have X with error 10^4 modulo 10^16 - 1, we're going to divide the modulus by a factor of 100, and so your error gets cut from 10^4 to 10^2, and your modulus gets cut from 10^16 to 10^14. And so now you have this term over here, and now if you square this again, right, then your X^2 goes up to X^4, your error of 10^2 goes back up to being an error of 10^4, your modulus is the same, then you reduce it again, and now your error goes down to 10^2, your modulus kind of hops down again, then you square it again, the error goes up from 10^2 to 10^4, and then you just reduce the modulus again. And so notice that instead of the error squaring every time, basically we have the modulus just kind of very nicely and orderly stepping down by a constant factor every time, right? So the clever trick here is basically that when you multiply the ciphertexts, the errors get multiplied. And so if you just make sure that the errors as numbers stay small, then squaring the error is always just going to be a constant factor multiplication. And so instead of having a logarithmic number of steps, you can have a linear number of steps. And so your multiplicative depth goes up from log log Q to log Q.
Pause for questions again.