k-Anonymization & l-Diversity Toolkit

This repository provides a toolkit for ensuring that datasets meet k-anonymity and l-diversity privacy requirements. It is aimed at data privacy enthusiasts and professionals, and includes functions for transforming datasets, checking privacy compliance, and handling numerical, categorical, and date columns.
The toolkit is particularly useful for those looking to anonymize sensitive data while retaining its utility for analysis. It includes easy-to-use Python functions, a Jupyter notebook for step-by-step guidance, and thorough documentation to help you anonymize your data effectively.
The functions fall into three groups: data transformation, anonymity checks, and data-handling utilities. They are designed to be used with a variety of datasets. The main operations are:
- Convert date columns to MM/YYYY format.
- Group ages into 5-year bands.
- Merge categorical fields into broader categories.
- Check for k-anonymity and l-diversity.
- Drop columns with all unique values.
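The transformation steps listed above can be sketched with pandas as follows. This is only an illustration of the ideas, not the toolkit's actual code: the function names (`generalize_dates`, `band_ages`, `merge_categories`, `drop_unique_columns`), the column names, and the sample data are all hypothetical, and the use of pandas is itself an assumption.

```python
import pandas as pd


def generalize_dates(df, column):
    """Replace full dates with MM/YYYY strings (assumes parseable dates)."""
    out = df.copy()
    out[column] = pd.to_datetime(out[column]).dt.strftime("%m/%Y")
    return out


def band_ages(df, column, width=5):
    """Replace exact ages with bands such as '30-34' (assumes integer ages)."""
    out = df.copy()
    lower = (out[column] // width) * width
    out[column] = lower.astype(str) + "-" + (lower + width - 1).astype(str)
    return out


def merge_categories(df, column, mapping):
    """Map fine-grained categories onto broader ones; unmapped values are kept."""
    out = df.copy()
    out[column] = out[column].map(lambda value: mapping.get(value, value))
    return out


def drop_unique_columns(df):
    """Drop columns in which every row has a distinct value (likely identifiers)."""
    unique_cols = [c for c in df.columns if df[c].nunique() == len(df)]
    return df.drop(columns=unique_cols)


# Hypothetical sample data, for illustration only.
records = pd.DataFrame({
    "patient_id": [101, 102, 103, 104],
    "birth_date": ["1990-03-14", "1990-03-02", "1985-11-30", "1985-11-05"],
    "age": [31, 34, 36, 39],
    "occupation": ["nurse", "surgeon", "teacher", "professor"],
})

broader = {
    "nurse": "healthcare",
    "surgeon": "healthcare",
    "teacher": "education",
    "professor": "education",
}

anonymized = generalize_dates(records, "birth_date")
anonymized = band_ages(anonymized, "age")
anonymized = merge_categories(anonymized, "occupation", broader)
anonymized = drop_unique_columns(anonymized)
print(anonymized)
```

Each helper returns a modified copy, so the original DataFrame stays available for comparing utility before and after generalization. Note that on a single-row dataset every column counts as "all unique", so `drop_unique_columns` is only meaningful for larger inputs.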
- Clone the repository
- Navigate to the project directory
- Install the required Python packages by running:
  pip install -r requirements.txt
You can use the provided functions by importing them into your Python scripts or Jupyter notebooks.
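The toolkit's own module and function names are not shown in this README, so the snippet below does not import from it. Instead it is a minimal pandas sketch of what a k-anonymity and l-diversity check involves, under stated assumptions: `is_k_anonymous` and `is_l_diverse` are hypothetical names, the column names and data are made up, and l-diversity is taken in its simplest "distinct values" form.

```python
import pandas as pd


def is_k_anonymous(df, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    # dropna=False keeps rows with missing quasi-identifiers as their own groups.
    group_sizes = df.groupby(quasi_identifiers, dropna=False).size()
    return bool((group_sizes >= k).all())


def is_l_diverse(df, quasi_identifiers, sensitive, l):
    """True if every quasi-identifier group holds at least l distinct sensitive values."""
    distinct = df.groupby(quasi_identifiers, dropna=False)[sensitive].nunique()
    return bool((distinct >= l).all())


# Hypothetical, already-generalized data for illustration only.
table = pd.DataFrame({
    "age_band": ["30-34", "30-34", "35-39", "35-39"],
    "birth_month": ["03/1990", "03/1990", "11/1985", "11/1985"],
    "diagnosis": ["flu", "asthma", "flu", "flu"],
})

qi = ["age_band", "birth_month"]
print(is_k_anonymous(table, qi, k=2))             # each group has 2 rows
print(is_l_diverse(table, qi, "diagnosis", l=2))  # the 35-39 group has only "flu"
```

The example shows why both checks matter: the table is 2-anonymous, yet anyone known to be in the 35-39 group is revealed to have the same diagnosis, so it fails 2-diversity.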
Open Source: Released under the MIT License, allowing for wide usage and collaboration. See the LICENSE file for more details.