Still a WIP: do not use in production.
CatDB allows you to migrate data between various databases.
It is available on PyPI, so you can install it as follows:
```
$ pip install catdb
```
Create a file `$HOME/.catdb` with the list of databases to use:
```
databases {
  pg_testdb: ${defaults.credentials} {
    hostname: localhost
    database: testdb
    type: postgres
  }
  my_testdb: ${defaults.credentials} {
    hostname: localhost
    database: testdb
    type: mysql
  }
}

defaults {
  credentials {
    username: scott
    password: tiger
  }
}
```
The file uses the HOCON format.
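Since the file is plain HOCON, you can sanity-check it from Python. A minimal sketch, assuming the pyhocon library (a HOCON parser for Python, `pip install pyhocon`); catdb may load the file differently internally:

```python
# Minimal sketch (assumes pyhocon): parse ~/.catdb and list the
# configured database aliases with their merged credentials.
import os
from pyhocon import ConfigFactory

conf = ConfigFactory.parse_file(os.path.expanduser('~/.catdb'))

# ${defaults.credentials} is resolved at parse time, so every database
# entry already carries the shared username/password.
for alias, db in conf['databases'].items():
    print(alias, db['type'], db['hostname'], db['database'], db['username'])
```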
At the moment the following databases are supported:
- Postgres / Redshift
- MySQL
```
catdb list -d <database> -t <table filter>
```
Options:
- `database`: database alias set in `.catdb`
- `table filter`: table filter (e.g., `doc%`)
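For example, to list all tables in the Postgres database defined above whose names start with `doc` (the filter appears to use SQL `LIKE`-style wildcards, as the `doc%` example suggests):

```
$ catdb list -d pg_testdb -t doc%
```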
```
catdb ddl -d <database> -t <table filter> -e <file>
```
Options:
- `database`: database alias set in `.catdb`
- `table filter`: table filter (e.g., `doc%`)
- `file`: file to export the DDL to. If `-` is used, standard output is used.
```
catdb ddl -d <database> -i <file> [-dr]
```
Options:
- `database`: database alias set in `.catdb`
- `file`: file to import the DDL from. If `-` is used, standard input is used.
- `dr`: dry-run; print the statements that would be run on the database.
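Because `-` selects standard output and input, a DDL export can be piped straight into an import to copy a table definition between databases. This chaining is a sketch based on the option descriptions above, not an example from the original docs:

```
$ catdb ddl -d my_testdb -t employee -e - | catdb ddl -d pg_testdb -i - -dr
```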
```
catdb data -d <database> -t <table> -e <file>
```
Options:
- `database`: database alias set in `.catdb`
- `table`: table to dump the data from
- `file`: file to export the data to. If `-` is used, standard output is used.
```
catdb data -d <database> -t <table> -i <file>
```
Options:
- `database`: database alias set in `.catdb`
- `table`: table that will receive the data
- `file`: file to import the data from. If `-` is used, standard input is used.
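The same trick works for data: pipe an export straight into an import to copy rows between databases (again a sketch based on the `-` behavior described above):

```
$ catdb data -d my_testdb -t employee -e - | catdb data -d pg_testdb -t employee -i -
```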
Create a table in MySQL and populate it:
```
mysql> CREATE TABLE employee(
  id INT NOT NULL,
  name VARCHAR(20) DEFAULT '(no name)' NOT NULL,
  dept CHAR(2),
  age INT,
  height DOUBLE(2,1),
  created_on TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
mysql> INSERT INTO employee(id, name, dept, age, height)
       VALUES (1, 'John Doe', 'IT', 28, 6.3), (1, 'Mary Gray', 'IT', 30, 6.8);
```
Export the DDL definition
```
$ catdb ddl -d my_testdb -t employee -e /tmp/employee.json
$ cat /tmp/employee.json
{
  "database": "testdb",
  "schema": null,
  "tables": [
    {
      "columns": [
        {
          "column": "id",
          "nullable": false,
          "size": 11,
          "type": "integer"
        },
        {
          "column": "name",
          "default": "(no name)",
          "nullable": false,
          "size": 20,
          "type": "varchar"
        },
        {
          "column": "dept",
          "nullable": true,
          "size": 2,
          "type": "char"
        },
        {
          "column": "age",
          "nullable": true,
          "size": 11,
          "type": "integer"
        },
        {
          "column": "height",
          "nullable": true,
          "scale": 1,
          "size": 2,
          "type": "real"
        },
        {
          "column": "created_on",
          "default": "current_timestamp",
          "nullable": false,
          "type": "timestamp"
        }
      ],
      "name": "employee"
    }
  ]
}
```
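The export is plain JSON, so it is easy to inspect or post-process. A minimal sketch using only the standard library, following the structure shown above:

```python
# Minimal sketch: inspecting a catdb DDL export with the standard
# json module; the structure follows the example above.
import json

with open('/tmp/employee.json') as f:
    ddl = json.load(f)

for table in ddl['tables']:
    print(table['name'])
    for col in table['columns']:
        # 'default', 'size' and 'scale' are only present for some columns
        print('  {} {} nullable={} size={}'.format(
            col['column'], col['type'], col['nullable'], col.get('size')))
```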
Convert the DDL definition to a CREATE TABLE statement for Postgres. Note how the MySQL types are mapped along the way (e.g., `DOUBLE(2,1)` becomes `real`, and the `CURRENT_TIMESTAMP` default becomes `now()`):
```
$ catdb ddl -d pg_testdb -t employee -i /tmp/employee.json -dr
CREATE TABLE employee (
  id integer,
  name character varying(20) DEFAULT '(no name)',
  dept character(2),
  age integer,
  height real,
  created_on timestamp without time zone DEFAULT now()
);
```
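Judging by the description of `-dr` above, dropping the flag should execute the statement against the target database rather than just print it:

```
$ catdb ddl -d pg_testdb -t employee -i /tmp/employee.json
```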
Export data
```
$ catdb data -d my_testdb -t employee -e /tmp/export.csv
$ cat /tmp/export.csv
id|name|dept|age|height|created_on
1|John Doe|IT|28|6.3|2015-04-28 22:17:57
1|Mary Gray|IT|30|6.8|2015-04-28 22:17:57
```
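The export is a pipe-delimited file with a header row, so it can also be consumed outside catdb. A minimal sketch using Python's csv module:

```python
# Minimal sketch: reading a catdb data export, which is pipe-delimited
# with a header row (see /tmp/export.csv above).
import csv

with open('/tmp/export.csv') as f:
    for row in csv.DictReader(f, delimiter='|'):
        print(row['name'], row['age'])
```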
Import data (dry-run)
```
$ catdb data -d pg_testdb -t employee -i /tmp/export.csv -dr
INSERT INTO employee (id,name,dept,age,height,created_on)
VALUES('1','John Doe','IT','28','6.3','2015-04-28 22:17:57'),
('1','Mary Gray','IT','30','6.8','2015-04-28 22:17:57');
```
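As with the DDL import, dropping `-dr` should actually run the INSERT statements on the target database:

```
$ catdb data -d pg_testdb -t employee -i /tmp/export.csv
```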
| Items | Status |
|---|---|
| DDL export | ✅ |
| DDL import | ✅ |
| Data export | ✅ |
| Data import | ✅ |
| Constraints export | ❌ |
| Constraints import | ❌ |
| Postgres | ✅ |
| Redshift | ✅ |
| MySQL | ✅ |
| Oracle | ❌ |
| Vertica | ❌ |
| SQLite | ❌ |
| Dynamo | ❌ |
| Hive | ❌ |
| SOLR | ❌ |
| Cassandra | ❌ |
| Elastic Search | ❌ |
| MongoDB | ❌ |
| Export to S3 | ❌ |
| Import from S3 | ❌ |
| Compatible with Turbine XML format | ❌ |
| Export/Import Compression | ❌ |