Simple Differential Privacy — and how to code it

Today, we are going to code a simple differential privacy algorithm in Python. So, let’s get started!

Photo by Franki Chamaki on Unsplash

Sorry, what are we coding again?

Differential privacy is simple at its core. Let’s say I have a data set of information about people that I want to publish while still keeping each person’s data private. That’s where differential privacy comes in. It alters your data in a way that keeps overall facts about the data set roughly the same (with more complex algorithms you can tune how close they stay) while keeping individuals’ data private.

The Random Response Mechanism

Okay, let’s get to the details of how we are going to alter this data. We want to start simple, so we’ll code the Random Response mechanism (often called randomized response), the simplest mechanism for differential privacy. For each value, it flips a coin. If the coin lands heads, it keeps the original value. If it lands tails, it flips the coin again and returns true for heads and false for tails. As a result, each value is reported truthfully with probability 3/4, so no single published value can be taken at face value. Because this is a very simple algorithm, it only works with binary data: zeros and ones, or true and false.

Coding The Random Response Mechanism

Our first step is to get a data set for our algorithm to run on. If you are going to use real data, you’ll need to convert it to an array of zeros and ones. You can also use this sample data to test whether your algorithm is working:
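The sample data embedded in the original post is not reproduced here. As a stand-in, any short list of zeros and ones will do for testing; the values below are hypothetical and made up purely for illustration.

```python
# Hypothetical sample data: ten made-up binary answers (1 = true, 0 = false).
data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]

print(len(data), "values,", sum(data), "of them are 1")
```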

You could also run a for loop that generates random zeros and ones and appends them to a list.
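A minimal sketch of that loop might look like the following. It assumes a list length of 20 (an arbitrary choice) and uses Python’s built-in random module, since numpy is only introduced later in this post. Note that the built-in randint(0, 1) includes both endpoints.

```python
from random import randint

# Build a list of 20 random binary values (the length is an arbitrary choice).
data = []
for _ in range(20):
    data.append(randint(0, 1))  # built-in randint includes both 0 and 1

print(data)
```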

Now that we have our data, let’s start by creating a function called rand_response, which we will call to privatize our data. It takes one parameter, data, which is the array to privatize.
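As a starting point, the empty function could be sketched like this. The docstring and the placeholder body are assumptions added for illustration; the real body is built up step by step in the rest of this section.

```python
def rand_response(data):
    """Privatize a list of 0/1 values in place (body added in the next steps)."""
    pass  # placeholder so the function is valid Python before the loop is written
```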

Now, we use a for loop to go through the array by index, so that we can overwrite each value in place:

for i in range(len(data)):

Then we set a variable, b, to a random number and use if statements to check whether it is 0 or 1. We’ll need numpy for this, so install it via pip if you haven’t already:

pip install numpy

and now import it at the top of our code:

from numpy import random

We imported numpy’s random module, which handles the random number generation.

Now let’s generate the number inside the loop:

b = random.randint(2)

This generates a random integer, either 0 or 1.

Next, we add the if statements that check the number and change the value accordingly. If b is 0, we keep the original value, so there is nothing to do. If b is 1, we generate a second random number, b1, and store it in place of the original value:

if b == 1:
    b1 = random.randint(2)

    if b1 == 0:
        data[i] = 0

    if b1 == 1:
        data[i] = 1

And now we’ve finished! You should have something like this:
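The original embedded listing is not reproduced here. The following is a minimal sketch that assembles the steps above into one function, assuming numpy is installed; the sample values and the comments are illustrative additions, and the two inner if statements are condensed into a single assignment.

```python
from numpy import random


def rand_response(data):
    """Apply the coin-flip mechanism to a list of 0/1 values, in place."""
    for i in range(len(data)):
        b = random.randint(2)  # first coin: 0 or 1
        if b == 1:
            # Second coin: its result replaces the original value.
            b1 = random.randint(2)
            data[i] = int(b1)
        # If b == 0, the original value is kept unchanged.


data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]  # hypothetical sample values
rand_response(data)
print(data)
```

Because the function modifies the list in place, make a copy first (for example with list(data)) if you still need the original values.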

When you run rand_response(data) and then print data, you should see that some of the values have changed. Now, this isn’t very private, but it is the first step. Now that you know how differential privacy works, try other mechanisms like the Exponential Mechanism and Laplace Mechanism. If you want to use these algorithms, you can use my Python package DiffPriv.
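One way to check the earlier claim that overall facts survive the noise: with fair coins, the expected share of ones after randomizing is 0.5 times the true share plus 0.25, so the true share can be estimated as 2 times the observed share minus 0.5. The sketch below illustrates this under a few assumptions: the data set is hypothetical (30% ones), the helper estimate_true_rate is a name invented for this example, and Python’s built-in random module is swapped in for numpy with a fixed seed so the run is repeatable.

```python
import random


def rand_response(data, rng):
    """Coin-flip mechanism on a list of 0/1 values, modified in place."""
    for i in range(len(data)):
        if rng.randint(0, 1) == 1:       # first coin says "flip again"
            data[i] = rng.randint(0, 1)  # second coin replaces the value


def estimate_true_rate(noisy):
    """Undo the expected bias: E[observed] = 0.5 * true_rate + 0.25."""
    observed = sum(noisy) / len(noisy)
    return 2 * observed - 0.5


rng = random.Random(0)               # fixed seed for repeatability
true_data = [1] * 3000 + [0] * 7000  # hypothetical data set, 30% ones
noisy = list(true_data)              # copy so the original stays intact
rand_response(noisy, rng)

print("true rate:     ", sum(true_data) / len(true_data))
print("observed rate: ", sum(noisy) / len(noisy))
print("estimated rate:", estimate_true_rate(noisy))
```

The estimate gets closer to the true rate as the data set grows, while any individual value in the noisy list remains uncertain.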

Computer Programmer, Musician, and hobbyist. View on GitHub -https://github.com/Quantalabs. Follow on DEV @Quantalabs
