- Introduction
- Prerequisites
- Installation
- Usage
- YAML File Format
- Customization
- Dependencies
- Contributing
- Acknowledgments
This Python project automatically generates and populates a MySQL database from a YAML configuration file. You specify the database details and table structure in the YAML file, and the script creates the database schema and populates the tables with fake data. It is useful for setting up test databases during development.
Before you begin, ensure you have met the following requirements:
- Python 3.x: Installed on your system.
- MySQL Server: Installed and running. You should have the necessary credentials (username, password, database name) for connecting to the MySQL server.
- Git: Installed if you want to clone this project from a Git repository.
- Required libraries: Install them using pip from a terminal:
pip install -r requirement.txt
If you don't have Git installed, follow the instructions below based on your operating system:
Linux (Debian/Ubuntu):
sudo apt update
sudo apt install git
macOS (with Homebrew):
brew install git
Windows:
Download the Git installer from [Git for Windows](https://gitforwindows.org/) and follow the installation steps.
You can download Python from the official website, python.org, and follow the installation instructions for your operating system.
Follow the official installation instructions for MySQL based on your OS:
- MySQL Installation Guide for Linux
- MySQL Installation Guide for macOS
- MySQL Installation Guide for Windows
To get started, clone this repository using Git:
git clone https://github.com/Ps1231/Automatic-Database-Generator-and-Populator
Change your working directory to the project folder:
cd Automatic-Database-Generator-and-Populator
Create a YAML configuration file that specifies the database details and table structure. See the YAML File Format section for details.
Run the main.py script with the YAML file as an argument:
python main.py your_yaml_file.yaml
The script will generate the database schema and populate the tables with data based on the configuration in the YAML file.
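As a rough illustration of how a script can accept the YAML path from the command line, here is a minimal sketch using the standard-library argparse module. The function name parse_args and the argument name config_path are assumptions made for this example; the actual main.py may be organized differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; the single positional argument is the YAML path."""
    parser = argparse.ArgumentParser(
        description="Generate and populate a database from a YAML configuration file."
    )
    # 'config_path' is an illustrative name, not necessarily the one used in main.py.
    parser.add_argument("config_path", help="path to the YAML configuration file")
    return parser.parse_args(argv)


# Passing an explicit list mimics: python main.py your_yaml_file.yaml
args = parse_args(["your_yaml_file.yaml"])
print(f"Using configuration file: {args.config_path}")
```

Accepting an optional argv list keeps the parser easy to exercise without touching the real command line.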
mysql:
  host: your_database_host
  user: your_database_user
  password: your_database_password
  database: your_database_name
num_records: no_of_records_to_be_generated
table_definition:
  TableName:
    columns:
      - Column1_Name: Column1_DataType
      - Column2_Name: Column2_DataType
      # Add more columns as needed
    foreign_keys:
      - fk_column: ForeignKey_Column_Name
        references_table: Referenced_Table_Name
        references_column: Referenced_Column_Name
      # Add more foreign keys as needed
    primary_key:
      - PrimaryKey_Column_Name
      # Add more primary key columns as needed
  # Define more tables as needed
# Define additional tables and configurations as needed
A sample YAML file is provided as config.yaml.
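Once the YAML file is parsed (for example with PyYAML), the configuration is an ordinary Python dictionary. The sketch below checks such a dictionary for the keys shown in the template. It is not part of the project: the function validate_config, the example values, and the assumption that num_records is a top-level integer key are all illustrative.

```python
REQUIRED_MYSQL_KEYS = ("host", "user", "password", "database")


def validate_config(config):
    """Return a list of problems found in a parsed configuration dict."""
    problems = []
    mysql = config.get("mysql") or {}
    for key in REQUIRED_MYSQL_KEYS:
        if key not in mysql:
            problems.append(f"mysql.{key} is missing")
    # Assumption: num_records sits at the top level and is an integer.
    if not isinstance(config.get("num_records"), int):
        problems.append("num_records must be an integer")
    tables = config.get("table_definition") or {}
    if not tables:
        problems.append("table_definition must define at least one table")
    for name, spec in tables.items():
        if not (spec or {}).get("columns"):
            problems.append(f"table {name} has no columns")
    return problems


# Hypothetical configuration, shaped like the template after parsing.
example_config = {
    "mysql": {"host": "localhost", "user": "dev", "password": "secret", "database": "shop"},
    "num_records": 10,
    "table_definition": {
        "customers": {
            "columns": [{"customer_id": "INT"}, {"name": "VARCHAR(100)"}],
            "primary_key": ["customer_id"],
        }
    },
}
print(validate_config(example_config))
```

Returning a list of messages instead of raising on the first error lets a user fix every problem in the file in one pass.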
- <TableName>: The name of the table.
- <Column1_Name>: The name of a column in the table.
- <Column1_DataType>: The data type of the column.
- primary_key: The primary key column(s) of the table. A composite primary key is allowed.
- <PrimaryKey_Column_Name>: The name of a primary key column.
- foreign_keys: The foreign key details of the table.
- <fk_column>: The name of the foreign key column.
- <references_table>: The name of the table referenced by the foreign key.
- <references_column>: The name of the column referenced by the foreign key.
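To make the mapping from these fields to SQL concrete, the following sketch turns one table entry into a CREATE TABLE statement. It is an illustration under assumptions, not the project's actual code: build_create_table and the example orders table are hypothetical, and identifiers are only backtick-quoted, not validated.

```python
def build_create_table(table_name, spec):
    """Build a CREATE TABLE statement from one table_definition entry."""
    lines = []
    # Each item under 'columns' is a one-entry mapping: {column_name: data_type}.
    for column in spec.get("columns", []):
        for col_name, col_type in column.items():
            lines.append(f"`{col_name}` {col_type}")
    primary_key = spec.get("primary_key") or []
    if primary_key:
        # Several names here produce a composite primary key.
        cols = ", ".join(f"`{c}`" for c in primary_key)
        lines.append(f"PRIMARY KEY ({cols})")
    for fk in spec.get("foreign_keys") or []:
        lines.append(
            f"FOREIGN KEY (`{fk['fk_column']}`) "
            f"REFERENCES `{fk['references_table']}` (`{fk['references_column']}`)"
        )
    body = ",\n  ".join(lines)
    return f"CREATE TABLE `{table_name}` (\n  {body}\n)"


# Hypothetical table entry, shaped like the parsed YAML.
orders = {
    "columns": [{"order_id": "INT"}, {"customer_id": "INT"}],
    "primary_key": ["order_id"],
    "foreign_keys": [
        {
            "fk_column": "customer_id",
            "references_table": "customers",
            "references_column": "customer_id",
        }
    ],
}
print(build_create_table("orders", orders))
```

Because a foreign key can only reference a table that already exists, tables that are referenced by others generally need to be created first.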
You can customize various aspects of the database generation and population process to suit your specific needs.
To change the number of entries generated for each table, set num_records in your YAML file. Use the template above as a starting point for your own customized YAML file.
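The effect of num_records can be sketched as follows: one row of values is produced per record, and the rows are paired with a parameterized INSERT statement. The project itself uses Faker for the values; this example substitutes the standard-library random module so that it runs without extra packages. The helper names and the type mapping are assumptions, and unique primary key values are not guaranteed.

```python
import random
import string


def fake_value(data_type, rng):
    """Return a placeholder value for a SQL data type (very rough mapping)."""
    base = data_type.upper().split("(")[0].strip()
    if base in ("INT", "INTEGER", "BIGINT", "SMALLINT"):
        return rng.randint(1, 10_000)
    if base in ("FLOAT", "DOUBLE", "DECIMAL"):
        return round(rng.uniform(0, 1000), 2)
    # Fall back to a short random string for VARCHAR, TEXT, and anything else.
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(8))


def generate_rows(columns, num_records, seed=0):
    """Generate num_records tuples, one value per column, in column order."""
    rng = random.Random(seed)
    types = [col_type for column in columns for col_type in column.values()]
    return [tuple(fake_value(t, rng) for t in types) for _ in range(num_records)]


def build_insert(table_name, columns):
    """Build a parameterized INSERT statement using %s placeholders."""
    names = [name for column in columns for name in column]
    placeholders = ", ".join(["%s"] * len(names))
    col_list = ", ".join(f"`{n}`" for n in names)
    return f"INSERT INTO `{table_name}` ({col_list}) VALUES ({placeholders})"


# Hypothetical column list, shaped like the 'columns' entry in the YAML file.
columns = [{"customer_id": "INT"}, {"name": "VARCHAR(100)"}]
rows = generate_rows(columns, num_records=5)
print(build_insert("customers", columns))
print(len(rows), "rows generated")
```

The statement and the rows have the shape expected by a DB-API call such as cursor.executemany(sql, rows) in PyMySQL.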
- argparse (ArgumentParser): Used to pass the YAML file as an argument from the terminal.
- PyYAML: Used for parsing YAML configuration files.
- Faker: Used for generating fake data.
- PyMySQL: Used for database connectivity.
Contributions are welcome! If you have any suggestions or improvements, please open an issue or create a pull request.
The project uses the Faker library for generating fake data. Thanks to SQLAlchemy and PyYAML for their great libraries that make this project possible.