Useful SQL scripts, split from DevOps Bash tools, for which this is now a submodule.
Hari Sekhon
Cloud & Big Data Contractor, United Kingdom
(you're welcome to connect with me on LinkedIn)
aws_athena_cloudtrail_ddl.sql
- AWS Athena DDL to setup up integration to query CloudTrail logs from Athenabigquery_*.sql
- Google BigQuery scripts:bigquery_billing_*.sql
- billing queries for GCP usage eg. highest cost services, most used GCP products, recent charges etc.bigquery_info_*.sql
- information schema queries for datasets, tables, columns, partitioning, clustering etc.
mysql_*.sql
:postgres_*.sql
:- PostgreSQL queries for DBA investigating + performance tuning
- postgres_info.sql - big summary overview, recommend you start here
- tested on PostgreSQL 8.4, 9.x, 10.x, 11.x, 12.x, 13.x
oracle_*.sql
:- Oracle queries for DBA investigating
- tested on Oracle 9i, 10g, 11g, 19c
bigquery_*.sql
- Google BigQuery scripts:bigquery_billing_*.sql
- billing queries for GCP usage eg. highest cost services, most used GCP products, recent charges etc.bigquery_info_*.sql
- information schema queries for datasets, tables, columns, partitioning, clustering etc.- analytics/
bigquery_*.sql
- ecommerce queries and BigQuery ML machine learning classification logistic regression models and purchasing predictions - for more BigQuery examples, see Data Engineering demos
See the pages for:
in the HariSekhon/Knowledge-Base repo:
You can quickly test the PostgreSQL / MySQL scripts using postgres.sh
/ mysqld.sh
/ mariadb.sh
in the DevOps Bash tools repo, which boots a docker container and drops straight in to a mysql
/ psql
shell with this directory mounted at /sql
and used as $PWD
for fast easy sourcing eg.
postgres:
\i /sql/postgres_query_times.sql
\i postgres_query_times.sql
mysql:
source /sql/mysql_sessions.sql
\. mysql_sessions.sql
- .psqlrc - advanced PostgreSQL psql client config
- psql.sh - quickly connect to PostgreSQL with command line switches inferred from environment variables
- mysql.sh - quickly connect to MySQL / MariaDB with command line switches inferred from environment variables
- postgres.sh - one-touch PostgreSQL, boots docker container and drops you in to
psql
shell. Version can be given as an argument - mysqld.sh / mariadb.sh - one-touch MySQL / MariaDB, boots docker container and drops you in to
mysql
shell. Version can be given as an argument - postgres_foreach_table.sh / mysql_foreach_table.sh - execute templated SQL queries/statements against all or a subset of tables
- postgres_tables_row_counts.sh / mysql_tables_row_counts.sh - get row counts for all or a subset of tables
- sqlcase.pl - autocases your SQL code
- I use this a lot and call it via hotkey configured in my .vimrc
- there are
*case.pl
specializations for most of the major RDBMS and distributed SQL systems, even several NoSQL systems, using each one's language specific keywords
- Hive & Impala SQL:
- beeline.sh - quickly connect to Hive, auto-determines HiveServer2 address, Kerberos & SSL options, ZooKeeper quorum
- impala_shell.sh - quickly connect to Impala, auto-determines a Hadoop worker node address and Kerberos options (can use an environment variable for a Load Balancer setup)
- hive_foreach_table.sh / impala_foreach_table.sh - execute templated SQL queries/statements against all or a subset of tables
- hive_tables_row_counts.sh / impala_tables_row_counts.sh - get row counts for all or a subset of tables
- hive_tables_column_counts.sh / impala_tables_column_counts.sh - get the column counts for big tables in Hive / Impala
- hive_tables_metadata.sh / impala_tables_metadata.sh / hive_tables_locations.sh / impala_tables_locations.sh - get Hive / Impala metadata for all or a subset of tables, eg. Location to determine where the external tables data is being stored (HDFS / S3 paths)
The rest of my original source repos are here.
Pre-built Docker images are available on my DockerHub.