-
Notifications
You must be signed in to change notification settings - Fork 0
Postgresql
PostgreSQL is a relational database management system (RDBMS). That means it is a system for managing data stored in relations. Relation is essentially a mathematical term for table. The notion of storing data in tables is so commonplace today that it might seem inherently obvious, but there are a number of other ways of organizing databases. Files and directories on Unix-like operating systems form an example of a hierarchical database. A more modern development is the object-oriented database.
Each table is a named collection of rows. Each row of a given table has the same set of named columns, and each column is of a specific data type. Whereas columns have a fixed order in each row, it is important to remember that SQL does not guarantee the order of the rows within the table in any way (although they can be explicitly sorted for display).
Tables are grouped into databases, and a collection of databases managed by a single PostgreSQL server instance constitutes a database cluster.
PostgreSQL vs MySQL:
- MySQL has the most bizarre security leaks and defaults (you need to install a package called mysql_secure_installation to...well...secure MySQL).
- Feed MySQL with a million records and it will die a horrible, resource eating death.
- Try rebooting a crashed database (after its horrible death), you will love the way your tables have "crashed and burned".
- Data integrity and reliability...MySQL will give you a looooong blank stare.
NOTE: There is a postgresql Red Hat familly repo which can be very useful to solve upgrade issues. It will allow you to download previous major versions.
Postgresql default data directory is usually located in /var/lib/pgsql
. This has been replaced by /db/pgsql
, place for all of our Data bases.
postgres user has to be created first.
Before PostgreSQL can function correctly, the database cluster must be initialized:
# mkdir -p /db/pgsql
# chown -R postgres:postgres /db/pgsql
% sudo -i -u postgres
-bash-4.3$ initdb -D /db/postgres/data
..............
Success. You can now start the database server using:
postgres -D /db/postgres/data
NOTES :
- postgresql superuser is name postgres.
- postgres user is not member of wheel group and can not sudo.
- the command will fill the
/db/pgsql
directory with a data and backups directories. - it is possible to initialize as many data bases if using another location (e.g
/db/pgsql/data1
) - to avoid typing password for user postgres at each command from unix shell, a
~/.pgpass
has been created. This file accepts multi lines in this format:
hostname:port:database1:username:password
hostname:port:database2:username:password
Upgrading minor versions requires some extra maintenance. Some commands have to be run manually to be able to use the old database with the new major postgresql version. It is thus strongly advise to manually do the upgrade with pg_upgrade
and not let dnf do it.
Check that permissions are correctly set:
# ls -al /db/pgsql
drwx------ 1 postgres postgres 64 May 31 13:48 .
drwxrwxr-x 1 root root 36 May 31 13:11 ..
drwx------ 1 postgres postgres 40 May 31 13:48 backups
drwx------ 1 postgres postgres 502 May 31 13:48 data
-rw------- 1 postgres postgres 1348 May 31 13:48 initdb_postgresql.log
The postgresql server is using two main configuration files
/db/pgsql/data/postgresql.conf
/db/pgsql/data/pg_hba.conf
NOTES:
- custom settings in posthresql.conf has been made following this site. More setting details below this page.
- host-based authentication is configured in pg_hba.conf. This file controls which hosts are allowed to connect.
- postgresql socket location:
/run/postgresql/.s.PGSQL.5432=
. Permission is set to 0777 - the location of postgresql data is a system variable called PGDATA. This variable is exported by
/etc/profile.d/custom-profile.sh
- after each modification of one of these files, you need to restart posgresql:
# systemctl restart postgresql
- Modify. The default is:
/db/pgsql/data/pg_hba.conf
-----------------------------
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all md5
# IPv4 local connections:
host all all 127.0.0.1/32 md5
# IPv6 local connections:
host all all ::1/128 md5
But, this would enable all databases to be connected to on the tcp/ip protocol. This could create a possible backdoor. So to tighten things up we need to set the database user and database name in the file.
Example:
/db/pgsql/data/pg_hba.conf
-----------------------------
# TYPE DATABASE USER ADDRESS METHOD
local postfix mailreader md5
A enlgroup has been created with these privileges:
- create DB
- create Role
- login
Role mediawiki has been created and is member of group enlglobal
List roles and attributes:
postgres=# \du
List of roles
Role name | Attributes | Member of
-----------+------------------------------------------------+-------------
enlglobal | Create role, Create DB | {}
mediawiki | | {enlglobal}
postgres | Superuser, Create role, Create DB, Replication | {}
Switch user to interact with postgres. The user must be listed as a role in postgresql data base.
% sudo -i -u MyUser
Password: XXXXXX
-bash-4.3$ psql
Password:XXXXXX
psql (9.4.1)
Type "help" for help.
MyUser=#
journalctl -u postgresql
Logs:
# cat /storage/log/postgresql/postgresql-XXX.log
# systemctl start postgresql
Systemd unit file:
/usr/lib/systemd/system/postgresql.service
-------------------------------------------------
# It's not recommended to modify this file in-place, because it will be
# overwritten during package upgrades. It is recommended to use systemd
# "dropin" feature; i.e. create file with suffix .conf under
# /etc/systemd/system/UNITNAME.service.d directory overriding the
# unit's defaults. Look at systemd.unit(5) manual page for more info.
[Unit]
Description=PostgreSQL database server
After=network.target
[Service]
Type=forking
User=postgres
Group=postgres
# Where to send early-startup messages from the server (before the logging
# options of postgresql.conf take effect)
# This is normally controlled by the global default set by systemd
# StandardOutput=syslog
# Disable OOM kill on the postmaster
OOMScoreAdjust=-1000
# ... but allow it still to be effective for child processes
# (note that these settings are ignored by Postgres releases before 9.5)
Environment=PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
Environment=PG_OOM_ADJUST_VALUE=0
Environment=PGDATA=/db/postgres/data
ExecStartPre=/usr/libexec/postgresql-check-db-dir %N
# Temporarily use postgresql-ctl wrapper around pg_ctl until there will
# exist clients which set PGPORT in service file.
ExecStart=/usr/libexec/postgresql-ctl start -D ${PGDATA} -s -w -t 300
ExecStop=/usr/bin/pg_ctl stop -D ${PGDATA} -s -m fast
ExecReload=/usr/bin/pg_ctl reload -D ${PGDATA} -s
# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=300
[Install]
WantedBy=multi-user.target
NOTE: systemd environment is not the same as the system environment. Thus, it is needed to explicitly add some Environment= in the unit file.
% systemctl status postgresql
● postgresql.service - PostgreSQL database server
Loaded: loaded (/usr/lib/systemd/system/postgresql.service; disabled; vendor preset: disabled)
Active: active (running) since Fri 2015-05-08 16:01:01 CEST; 6s ago
Process: 13296 ExecStart=/usr/libexec/postgresql-ctl start -D ${PGDATA} -s -w -t 300 (code=exited, status=0/SUCCESS)
Process: 13294 ExecStartPre=/usr/libexec/postgresql-check-db-dir %N (code=exited, status=0/SUCCESS)
CGroup: /system.slice/system-systemd\x2dnspawn.slice/[email protected]/system.slice/postgresql.service
├─13300 /usr/bin/postgres -D /var/lib/pgsql/data
├─13311 postgres: logger process
├─13325 postgres: checkpointer process
├─13326 postgres: writer process
├─13327 postgres: wal writer process
├─13328 postgres: autovacuum launcher process
└─13329 postgres: stats collector process
May 08 16:00:39 poppy systemd[1]: Starting PostgreSQL database server...
May 08 16:00:50 poppy postgresql-ctl[13296]: LOG: redirecting log output to logging collector process
May 08 16:00:50 poppy postgresql-ctl[13296]: HINT: Future log output will appear in directory "pg_log".
May 08 16:01:01 poppy systemd[1]: Started PostgreSQL database server.
pg_dump is a utility for backing up a PostgreSQL database. It makes consistent backups even if the database is being used concurrently.
Dumps can be output in script or archive file formats. Script dumps are plain-text files containing the SQL commands required to reconstruct the database to the state it was in at the time it was saved. To restore from such a script, feed it to psql. Script files can be used to reconstruct the database even on other machines and other architectures; with some modifications, even on other SQL database products.
The alternative archive file formats must be used with pg_restore to rebuild the database. They allow pg_restore to be selective about what is restored, or even to reorder the items prior to being restored. The archive file formats are designed to be portable across architectures.
- dump a whole data base in your current directory. Will need to use psql to restore
[postgres@poppy]~/backups% pg_dump myDB > myDB.sql
- dump in custom format (-Fc) with no access privileges dumped (-x) in your current directory. Can be restored with pg_restore.
[postgres@poppy]~/backups% pg_dump -x -Fc myDB > myDB.dump
TIP: best is to always dump everything in a custom format. Table and data only can thereafter be choosen with pg_restore.
- restore data only (-a) from a specific table (-t) and a specific schema (-n) into database (-d)
postgres@poppy ➤➤ backups/mediawiki % pg_restore -a -d myDB -n mySchema -t myTable < myDB.dump
TIP: use --disable-triggers to ignore constraints.
WARNING: cross-database references are not implemented. To work on a specific database, first connect to it.
-
Log from Unix session $ psql -d MyDb -U myUser
-
List tables in DB:
\dt MySchema.*
\dt+ MySchema.* <--- give size and description
- List content in table:
SELECT * FROM MySchema.MyTable;
- create role and data base:
postgres=# CREATE USER myUser WITH PASSWORD 'XXXXX';
CREATE ROLE
postgres=# CREATE DATABASE myDB TEMPLATE template0 OWNER myUser;
CREATE DATABASE
NOTE : CREATE USER will allow by default the user to login. CREATE ROLE will not allow by default, WITH LOGIN must be specified.
- List role attribute
postgres=# \du
List of roles
Role name | Attributes | Member of
-----------+------------------------------------------------+-----------
mailman | | {}
mediawiki | | {}
postgres | Superuser, Create role, Create DB, Replication | {}
vmail |
postgres=# \du mailman
List of roles
Role name | Attributes | Member of
-----------+------------+-----------
mailman | | {}
- insert new column
myDB=# ALTER TABLE mySchema.myTable ADD COLUMN myColName ColumnType NOT NULL ;
TIP: if you have this error ERROR: column "myColName" contains null values
, then it is probably because you droped this column sometimes earlier. The trick is to first enter a default value, then field the column and remove the default value.
myDB=# ALTER TABLE mySchema.myTable ADD COLUMN myColName ColumnType NOT NULL DEFAULT 'foo';
.....
do some work
ALTER TABLE mySchema.myTable ALTER COLUMN myColName DROP DEFAULT;
- Update value in a column
UPDATE MyTable SET myColName='xxyyc' WHERE myColName1 = myValue ;
- grand privileges:
myDB=# GRANT ALL ON DATABASE myDB TO myUser;
myDB=# GRANT ALL ON SCHEMA mySchema TO myUser;
myDB=# GRANT ALL ON ALL TABLES IN SCHEMA mySchema TO myUser;
myDB=# ALTER DATABASE myDB OWNER TO myUser ;
This command define access privileges.
- connect to a database as user:
postgres=# \c MyDataBase MyUser
You are now connected to database "MyDataBase" as user "MyUser".
MyDataBase=>
- disconnect from data base
enl=# \c postgres postgres
You are now connected to database "postgres" as user "postgres".
postgres=#
There is no disconnect command
- print help and quit
postgres=# \help
postgres=# \quit
- display the privileges granted on existing tables and columns
postgres=# \dp
Schema | Name | Type | Access privileges | Column access privileges
--------+------+------+-------------------+--------------------------
- remove database
postgres=# DROP DATABASE MyDb;
- show db directory
postgres=# SHOW data_directory;
data_directory
----------------
/db/pgsql/data
(1 row)
- List database:
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------------+-----------------+----------+------------+------------+-----------------------
django_slacklog | django_slacklog | UTF8 | en_US.UTF8 | en_US.UTF8 |
enlwiki | mediawiki | UTF8 | en_US.UTF8 | en_US.UTF8 |
postgres | postgres | UTF8 | en_US.UTF8 | en_US.UTF8 |
template0 | postgres | UTF8 | en_US.UTF8 | en_US.UTF8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.UTF8 | en_US.UTF8 |
(5 rows)
- get size of data base
postgres=# SELECT pg_size_pretty(pg_database_size('myDB'));
pg_size_pretty
----------------
11 MB
(1 row)
- get size of all data bases
postgres=# SELECT pg_database.datname,
pg_size_pretty(pg_database_size(pg_database.datname)) AS size FROM pg_database;
- list role and attributes:
postgres=# \du
List of roles
Role name | Attributes | Member of
-----------------+------------------------------------------------+-----------
enlglobal | Create role, Create DB | {}
django_slacklog | | {}
mediawiki | | {enlglobal}
postgres | Superuser, Create role, Create DB, Replication | {}
- Alter role
postgres=# ALTER ROLE RoleName WITH .... ;
Please refer to Postgresql doc to a complete list of commands to alter role.
- Remove role
postgres=# DROP ROLE roleName;
- delete table row with condition
myDB=# DELETE FROM mySchema.myTable WHERE columnName [CONDITION];
E.g:
myDB=# DELETE FROM mywiki.page WHERE page_touched > '2015-01-08 17:28:48+01';
- garbage-collect and optionally analyze a database
myDB=# VACUUM (ANALYZE) mySchema.myTable ;
- empty a whole table. Use TRUNCATE
myDB=# TRUNCATE mySchema.myTable CASCADE;
- create schema for a data base and add to your path
postgres=# \c myDB
You are now connected to database "myDB" as user "postgres".
myDB=# CREATE SCHEMA mySchema AUTHORIZATION myRole;
myDB=# SET search_path TO mySchema;
myDB=# SHOW search_path;
search_path
-------------
mySchema
(1 row)
A schema is essentially a namespace: it contains named objects (tables, data types, functions, and operators) whose names can duplicate those of other objects existing in other schemas.
-
to create or access objects in a schema write a qualified name consisting of the schema name and table name separated by a dot: schema.table
-
list schema of enlwiki data base
enlwiki=# \dn
List of schemas
Name | Owner
--------+-----------
mywiki | mediawiki
public | mediawiki
(2 rows)
- drop constraint (key)
myDB=# ALTER TABLE mySchema.myColumn DROP CONSTRAINT keyName CASCADE;
The postgresql data base can be managed from web interface phppgadmin URL. postgres user is not allowed to connect because of security reasons.
Once a dump file has been downloaded from Heroku:
postgres@poppy $ pg_restore --verbose --clean --no-acl --no-owner -h localhost -d mediawiki HerokuDumpFile
This will create a new database called mylocaldb wich will dump MyApp postgresql db. The mylocaldb shall NOT exists before the command is run.
Command: pg_restore --verbose --clean --no-acl --no-owner -h localhost -d [db_name] production.dump
They all are done in the /db/pgsql/data/postgresql.con
. The settings have been based according to a 32G RAM box.
Shared memory allows processes to access common structures and data by placing them in shared memory segments. It is the fastest form of inter-process communication available since no kernel involvement occurs when data is passed between the processes. In fact, data does not need to be copied between the processes.
To see all shared memory settings, execute:
% ipcs -lm
------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 18014398509465599
max total shared memory (kbytes) = 18014398442373116
min seg size (bytes) = 1
Setting SHMMAX Parameter defines the maximum size in bytes of a single shared memory segment that a Linux process can allocate in its virtual address space
To determine the maximum size of a shared memory segment, run:
# cat /proc/sys/kernel/shmmax
18446744073692774399
Then:
# echo "kernel.shmmax=18446744073692774399" >> /etc/sysctl.conf
This parameter sets the system wide maximum number of shared memory segments. 4096 is the recommended value
# cat /proc/sys/kernel/shmmni
4096
This parameter sets the total amount of shared memory pages that can be used system wide. Hence, SHMALL should always be at least ceil(shmmax/PAGE_SIZE)
In most cases this setting should be sufficient since it means that the total amount of shared memory available on the system is 20971524096 bytes (shmallPAGE_SIZE) which is 8 GB. PAGE_SIZE is usually 4096 bytes
echo "kernel.shmall=2097152" >> /etc/sysctl.conf
CREATE EXTENSION loads a new extension into the current database.
To have an effective monitoring you should install the following extensions : pg_stat_statements, pg_buffercache
# dnf install postgresql-contrib
Then log as postgres user
postgres=# CREATE EXTENSION pg_buffercache;
CREATE EXTENSION
postgres=# CREATE EXTENSION pg_stat_statements;
CREATE EXTENSION
List extension
postgres=# \dx
List of installed extensions
Name | Version | Schema | Description
--------------------+---------+------------+-------------------------------------
----------------------
pg_buffercache | 1.0 | public | examine the shared buffer cache
pg_stat_statements | 1.2 | public | track execution statistics of all SQ
L statements executed
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
Then update /db/postgres/data/postgresql.conf
shared_preload_libraries = 'pg_stat_statements, auto_explain'
pg_stat_statements.max = 10000
pg_stat_statements.track = top
auto_explain.log_min_duration = -1
auto_explain.log_analyze = 'false'
Pgcluu, a PostgreSQL Cluster performances monitoring and auditing tool has been installed.
pgCluu comes with two Perl scripts:
- one for PostgreSQL and System metrics collector
pgcluu_collectd [options] output_dir
- one for report generator.
pgcluu [options] -o report_dir input_dir
Output directory has been created as /storage/log/psqlStat
. To analyze a special DB, make a new directory /storage/log/psqlStat/MyDb1
.
Generated reports directory has been created as /storage/log/psqlReport
A systemd service file has been created
/etc/systemd/system/pgcluu-collectd.service
--------------------------------------------
[Unit]
Description=Postgresql performance and auditing
Requires=postgresql.service
After=postgresql.service
[Service]
User=postgres
Group=postgres
PIDFile=/tmp/pgcluu_collectd.pid
Environment=OUTPUTDIR=/storage/log/psqlStat
Environment=PGPASSWORD=XXXXXX
ExecStart=/usr/local/bin/pgcluu_collectd $OUTPUTDIR
ExecStop=/usr/local/bin/pgcluu_collectd -k
[Install]
WantedBy=multi-user.target
Start the service:
# systemctl start pgcluu-collectd
To generate the report:
# pgcluu rm /storage/log/psqlReport/*
# pgcluu -o /storage/log/psqlReport/ /storage/log/psqlStat/
To see the report, a Nginx conf file has been written:
/etc/nginx/conf.d/psqlstat.conf
----------------------------
server {
listen 80;
server_name localhost;
root /storage/log/psqlReport;
location / {
index index.php index.html;
}
}
This allows to see the report in html on localhost poppy.