Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Are all lines necessary in Shuffle? #79

Closed
faykurt opened this issue Nov 22, 2023 · 6 comments
Closed

Are all lines necessary in Shuffle? #79

faykurt opened this issue Nov 22, 2023 · 6 comments

Comments

@faykurt
Copy link

faykurt commented Nov 22, 2023

Hi, below part is from shuffle function. Are all the lines necessary or is it just to increase the randomness? x_u is already random. What is the purpose of scalar_mul and vector_add? Can't we get rid of them to increase the performance of shuffling by decreasing the execution time?

Also could you name the paper, study that shuffling algorithm originated?

    for i in range(n-1):
        u = random_unit_vector(sectype, n - i)
        x_u = runtime.in_prod(x[i:], u)
        d = runtime.scalar_mul(x[i] - x_u, u)
        x[i] = x_u
        x[i:] = runtime.vector_add(x[i:], d)
    return
@faykurt faykurt changed the title Are all lines in Shuffle is necessary? Are all lines in Shuffle necessary? Nov 22, 2023
@faykurt faykurt changed the title Are all lines in Shuffle necessary? Are all lines necessary in Shuffle? Nov 22, 2023
@lschoe
Copy link
Owner

lschoe commented Nov 23, 2023

You can take a look at SecretSantaExplained.ipynb for an explanation of this shuffling algorithm. The algorithm generates each permutation with probability $1/n!$. If you stop earlier, e.g, if you replace for i in range(n-1) by for i in range(n-2) then you only get half of the permutations, each with probability $2/n!$.

@faykurt
Copy link
Author

faykurt commented Nov 23, 2023

Dear @lschoe, for a very big n, 1/n! and 2/n! have no difference, they are both small enough. Can we optimize for i in range(n-1) part? Instead of 1, we may put a variable that relates with n size?

@lschoe
Copy link
Owner

lschoe commented Nov 24, 2023

Well, $n$ need not be very large here at all, and certain permutations are excluded with certainty. Also, dropping a few steps at the end does not lower the overall complexity very much, as these later steps require an entropy of only $\log_2 (n-i)$ secure random bits for $i$ close to $n$.

@MarcT0K
Copy link
Contributor

MarcT0K commented Nov 24, 2023

@faykurt If you want to speed up the process, I implemented a parallelized version of the shuffle function (see PR #68).

On a private repo, I also implemented the Laur et al. shuffle. Its complexity is linear contrary to the current MPyC shuffle that is quadratic. However, Laur et al.'s shuffle has a much worse scaling in the number of parties.

@faykurt
Copy link
Author

faykurt commented Nov 28, 2023

Thanks for your all replies, very appreciate and one more question :) Why did you implement Knuth shuffle instead of Sattolo shuffle?

@lschoe
Copy link
Owner

lschoe commented Nov 29, 2023

Sattolo's shuffle is not a uniform permutation, but a uniform cyclic permutation, which is something else.

@faykurt faykurt closed this as completed Nov 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants