Effortlessly Anonymize Personal Data with Anonymize API for GDPR Compliance
Discover how Anonymize API replaces personal data with synthetic data, ensuring full anonymity and GDPR compliance without the need for software installation.
How to anonymize personal data inside a database
Added on 09/29/2024

Speaker 1: Hey, hello. During this demo, I would like to show you how easy it is to anonymize your personal or customer data inside your database. First things first, a little bit about Anonymize. Anonymize is an API which you can use to replace your personal or customer data with synthetic data, which is the only way to be fully anonymous and GDPR compliant. If you Google a bit, there are a couple of things you can do to anonymize data, like masking, replacing, or pseudonymizing. But replacing it with synthetic data is the only way to be fully, 100 percent anonymous. What about anonymous data? Anonymous data gives you the ability to process location-related data with full confidence. You can be sure that it's synthetic, so it doesn't contain any personal data anymore, and you know that you comply with privacy regulations. And compared to competitors, you don't have to upload data, you don't have to install software, you don't have to do other things. No, it's just an API, as you can see here on the left side, and you actually determine yourself what you would like to share with the API and become the location a bit. But basically, you're still in control of your data. Keep in mind, Anonymize does not require you to share or expose personal data. That's the most important thing. On the right side, here's an example. These are the original values, and you can see here in bold that I'm only going to share a couple of things, like the zip code or postal code, the city, and the birth date. On the left side, you can see what we're actually pushing or posting towards the API later on. Then on the right side, you can see that the anonymized values, or in this case the synthetic data, are actually being returned. Now, there are two things that I would like to highlight here. You can see here there's a range, a minimum distance and a maximum distance, as in the picture you see here. What it's going to do is try to pinpoint the geolocation based on the given inputs.
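
The idea of sharing only a few coarse fields can be sketched in Python as follows. This is a minimal illustration, not the API's documented interface: the endpoint URL, the header name, and every field name are assumptions, since the demo shows them only on screen. The sketch sends the birth year rather than the full birth date, which is what the demo goes on to do, and it builds the request without sending it.

```python
import json
import urllib.request

# Hypothetical endpoint; the real URL is not given in the transcript.
API_URL = "https://api.example.invalid/anonymize"


def build_request(record, api_key, min_distance_m=100, max_distance_m=500):
    """Build (but do not send) a POST request with only coarse fields.

    All payload keys and the "X-Api-Key" header are assumed names.
    """
    payload = {
        "postal_code": record["postal_code"],
        "city": record["city"],
        "country_code": record["country_code"],
        "gender": record["gender"],
        # Share only the year, never the full birth date.
        "birth_year": int(record["birth_date"][:4]),
        "min_distance": min_distance_m,
        "max_distance": max_distance_m,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
        method="POST",
    )


# Made-up customer record; name and street never leave the database.
customer = {
    "first_name": "Jan",
    "street": "Example Street 1",
    "postal_code": "1012",
    "city": "Amsterdam",
    "country_code": "NL",
    "gender": "male",
    "birth_date": "1985-06-15",
}

request = build_request(customer, api_key="YOUR_API_KEY")
sent = json.loads(request.data)
print(sorted(sent))
```

Directly identifying fields such as the name and street are simply never copied into the payload, which is the point the speaker makes about staying in control of what is shared.
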
In this case, for a zip code or postal code in Amsterdam, some kind of location is found, so a latitude and longitude. Then, based on that location and your settings for the range, the minimum distance and the maximum distance, it will try to find different points or geolocations which actually are resolved into a house. It doesn't need to be a fully real address, so to say. Let's do something, and I would like to share this one with you guys. Let me click send. Here you go. You can see here, these are the results. A couple of things I would like to highlight. We're not sharing the birth date, we're just sharing the birth year, as you can see, which is more anonymous, of course. Based on this birth year, you still get something returned like an age generation, an age group, and a synthetic replacement for the birth date inside your database. Of course, the birth year is identical. Then a CVV and a credit card number, which are both technically valid, so you can actually use these for software testing. Of course, city, country, county, and also the country code and country name. A random customer lifetime value which you can use. You can see here the district inside Amsterdam, so the district or the area in this case. A sample e-mail address, so Thijs is also the first name, and the last name is Holmpa. As you can see here, and as you can hear from how it's pronounced, it's Dutch, so it uses local names. Nothing generic in English; it returns local names. Of course, an expiry date, which comes along with the credit card, and a random GUID. The gender is, of course, the same, and based on this gender, it will return a first name, so this is actually important. The house number is, of course, tied to the street, and then this sums up into a geolocation which has an actual address, which is real. You can use it on Google Maps or for different things. It also generates an IBAN number, an international bank account number, which is actually also technically valid.
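
What "technically valid" usually means for test data can be sketched as below. This is an assumption about the term, not something the demo confirms: card numbers are conventionally checked with the Luhn algorithm and IBANs with the ISO 13616 mod-97 check. Both functions test the checksum only; they say nothing about whether an account or card actually exists.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def iban_valid(iban: str) -> bool:
    """Return True if `iban` passes the mod-97 checksum.

    Country-specific length rules are not checked here.
    """
    compact = iban.replace(" ", "").upper()
    if len(compact) < 5 or not (compact.isascii() and compact.isalnum()):
        return False
    # Move the country code and check digits to the end,
    # then map letters to numbers (A=10 ... Z=35).
    rearranged = compact[4:] + compact[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


# Well-known public test values, not output from the API.
print(luhn_valid("4111 1111 1111 1111"))
print(iban_valid("GB82 WEST 1234 5698 7654 32"))
```

A check like this is a quick way to confirm that returned synthetic values will pass input validation in the software being tested.
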
A random IP address and a few random numbers you can use later on. Of course, the state codes are tied to Amsterdam, there's a telephone number which has the prefix of the country code, and a random value for the yearly income in USD, which is actually also based on the country, so it isn't completely random. If I now submit the same settings again, let me make a change. Let's say I want to have 150 meters of distance and then 350, for example, and here you go,

Speaker 2: and now, oh, let me switch to female, which is nicer. As you can see in here, first name, Jill, and female.
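
The effect of a minimum and maximum distance, such as the 150 and 350 meters entered above, can be illustrated with a little geometry. The sketch below is an assumption about the general idea, not the API's actual method: it picks a random point in a ring around a starting coordinate on a spherical Earth. The API additionally resolves the point to an address, which this sketch does not attempt. The Amsterdam coordinates are approximate.

```python
import math
import random

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def random_point_in_ring(lat, lon, min_m, max_m, rng=random):
    """Return (lat, lon) between min_m and max_m meters from the start."""
    distance = rng.uniform(min_m, max_m)
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    lat1, lon1 = math.radians(lat), math.radians(lon)
    angular = distance / EARTH_RADIUS_M
    # Standard great-circle destination formula.
    lat2 = math.asin(
        math.sin(lat1) * math.cos(angular)
        + math.cos(lat1) * math.sin(angular) * math.cos(bearing)
    )
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(angular) * math.cos(lat1),
        math.cos(angular) - math.sin(lat1) * math.sin(lat2),
    )
    return math.degrees(lat2), math.degrees(lon2)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two coordinates."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


start = (52.3702, 4.8952)  # roughly central Amsterdam
point = random_point_in_ring(*start, min_m=150, max_m=350)
print(point, round(haversine_m(*start, *point)))
```

Drawing the distance uniformly favors points closer to the inner edge of the ring; sampling uniformly by area would need a square-root transform on the radius.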

Speaker 1: You can see. So let's do something else. Let's try Istanbul, Turkey, country code TR, female.

Speaker 2: Here you go.

Speaker 1: Turkey, which is actually written differently. I'll come back to that later on, but I'm looking for the first name. So Sevelyn, Sevelyn, and then if I switch back to male, first name, Jigit, and here my pronunciation is going wrong, but of course, let's make it easier.

Speaker 2: U.S., Los Angeles, first name, Dawson, and let's switch to female, Adley, which makes more sense, of course. California, CA, street name,

Speaker 1: and of course, the prefix for the country code. So this is pretty straightforward how it works. You don't share a lot of information and you're still in control of what you share. So in this case, Los Angeles and citywide. I can also fill in something like downtown Los Angeles, I think. Let me try. And then the downtown district is actually returned. So this is a nice thing to do. You're fully in control, and you don't share any sensitive information. In order to have a practical example, I have a dummy database, which contains a customer table. As you can see here, it only holds gender, postal code, country code, country name, and a birth date. These can also be blank, but now it holds these values. If I go to my Python script, imports, here I have that same sample table, including the three records that we're gonna use. And you can see here also that I'm gonna fill in Turkey. Turkey was the old or the previous name of the country. Today, it's known as the Republic of Türkiye, with an I, so that's different. And then also the birth date will become a birth year towards the API, which is more anonymous, but you're still getting returned, of course, a generated birth date and also an age group and an age generation, as we've seen before. Now, let me just go back to the script: connect to your database, select the dummy data in this case, and the birth date will become a birth year. So we're gonna remove the suffix, we're gonna remove the month and the day. We're gonna post it and use the data towards the API. So this is actually identical to the one you saw before: minimum distance, maximum distance. And then what it does is that, based on the returned response, it will update this data inside that sample table. So let me just run this one. Let's see, interactive, run the file. And you can see here, this is a post, and this is a returned or update statement.
And now if I go back to the database and click on refresh, you can see here that Turkey has become Türkiye, and of course the latitude, longitude, telephone numbers and also the birth dates are now changed. So it seems to be all right. So keep in mind, Mace Isabel is met. I'm gonna run it another time, clear this one.

Speaker 2: Right click, run the file. Yeah, here we go. So Mace Isabel is met.

Speaker 1: And now everything's shuffled again. So this is how it works. Thank you for watching. If you have any questions, please let me know. I think the link to the full script is attached in the description, so you can download it there. Contact me if you want an API key. It's pretty straightforward, as you've seen, and you can use it to anonymize your data inside your database. Well, have a nice day. Bye-bye.
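
The database workflow described in the demo (select the rows, reduce the birth date to a birth year, post to the API, write the response back) might look roughly like the sketch below. It is not the script linked from the video. The table layout follows the columns named in the demo, but the column names, the sample rows, the payload keys, and the `fake_anonymize` stand-in for the real API call are all assumptions made so the example runs offline against an in-memory SQLite database.

```python
import sqlite3


def fake_anonymize(payload):
    """Stand-in for the real API call; returns made-up synthetic values."""
    return {
        "birth_date": f"{payload['birth_year']}-01-01",
        "latitude": 0.0,
        "longitude": 0.0,
    }


def anonymize_customers(conn, call_api=fake_anonymize,
                        min_distance=150, max_distance=350):
    """Replace stored values with the values returned by `call_api`."""
    rows = conn.execute(
        "SELECT id, gender, postal_code, country_code, birth_date FROM customer"
    ).fetchall()
    for row_id, gender, postal_code, country_code, birth_date in rows:
        payload = {
            "gender": gender,
            "postal_code": postal_code,
            "country_code": country_code,
            # Drop month and day; only the year is shared.
            "birth_year": int(birth_date[:4]) if birth_date else None,
            "min_distance": min_distance,
            "max_distance": max_distance,
        }
        result = call_api(payload)
        conn.execute(
            "UPDATE customer SET birth_date = ?, latitude = ?, longitude = ? "
            "WHERE id = ?",
            (result["birth_date"], result["latitude"], result["longitude"], row_id),
        )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, gender TEXT, "
    "postal_code TEXT, country_code TEXT, country_name TEXT, "
    "birth_date TEXT, latitude REAL, longitude REAL)"
)
# Three illustrative records, loosely mirroring the cities used in the demo.
conn.executemany(
    "INSERT INTO customer (gender, postal_code, country_code, country_name, "
    "birth_date) VALUES (?, ?, ?, ?, ?)",
    [
        ("male", "1012", "NL", "Netherlands", "1985-06-15"),
        ("female", "34000", "TR", "Turkey", "1992-11-03"),
        ("female", "90012", "US", "United States", "1978-02-27"),
    ],
)

updated = anonymize_customers(conn)
print(updated, conn.execute("SELECT birth_date FROM customer").fetchall())
```

Passing the API call in as a function keeps the loop testable without network access; a real call would replace `fake_anonymize` with a function that posts the payload and parses the response.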
