Paul Sebenick edited this page Apr 17, 2024 · 4 revisions

Objectives

  • Capture the critical technical metadata of relational databases
  • Utilize a set of templates (SQL queries/scripts) that can be tailored to collect required data
  • Retain results in a common, queryable format
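
As a rough illustration of the last two objectives, the sketch below tailors a small SQL template to one table and retains the result in a table that can itself be queried. It assumes SQLite through Python's standard `sqlite3` module purely so it runs anywhere; `ROW_COUNT_TEMPLATE`, `store_row_count`, `profile_table`, and the `customer` sample table are hypothetical names, not part of this wiki's templates.

```python
import sqlite3
from string import Template

# Hypothetical template: a table-level row count, tailored per table.
# Table names cannot be bound as SQL parameters, so the template substitutes
# them as text; only use it with trusted names.
ROW_COUNT_TEMPLATE = Template(
    "SELECT '$table_name' AS table_name, COUNT(*) AS row_count "
    'FROM "$table_name"'
)

def store_row_count(conn, table_name):
    """Run the template for one table and retain the result in profile_table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS profile_table ("
        "table_name TEXT, row_count INTEGER)"
    )
    query = ROW_COUNT_TEMPLATE.substitute(table_name=table_name)
    name, count = conn.execute(query).fetchone()
    conn.execute("INSERT INTO profile_table VALUES (?, ?)", (name, count))
    return count

# Assumed sample data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, region TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "east"), (2, "west"), (3, None)],
)

store_row_count(conn, "customer")
# The retained results are queryable like any other table.
stored = conn.execute("SELECT table_name, row_count FROM profile_table").fetchall()
```

Keeping the results in an ordinary table means later profiling runs can be compared with plain SQL.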

Data profiling is a process

Data profiling generally consists of a series of steps that dig progressively deeper into the details of the data sets. The high-level steps in the data profiling process are:

  1. Inventory of databases (schemas)
  2. Inventory and metadata of tables within the databases
  3. Inventory and metadata of columns within the tables (count of distinct values, number of nulls, min, max)
  4. For each column, a deeper look at the data contents (e.g. frequency distribution or list of values) and relationships to other columns
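
Steps 2 through 4 above can be sketched as follows. This is a minimal example that assumes SQLite via Python's standard `sqlite3` module; other databases expose their catalogs differently (for example through `information_schema`), so the catalog queries would need tailoring. The function names and the `orders` sample table are hypothetical.

```python
import sqlite3

def list_tables(conn):
    """Step 2: inventory of tables (SQLite keeps its catalog in sqlite_master)."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [r[0] for r in rows]

def list_columns(conn, table):
    """Step 3a: inventory of columns; each PRAGMA row is (cid, name, type, ...)."""
    return [r[1] for r in conn.execute(f'PRAGMA table_info("{table}")')]

def profile_column(conn, table, column):
    """Step 3b: distinct count, null count, min and max for one column."""
    sql = (
        f'SELECT COUNT(DISTINCT "{column}"), '
        f'SUM(CASE WHEN "{column}" IS NULL THEN 1 ELSE 0 END), '
        f'MIN("{column}"), MAX("{column}") FROM "{table}"'
    )
    distinct, nulls, lo, hi = conn.execute(sql).fetchone()
    return {"distinct": distinct, "nulls": nulls, "min": lo, "max": hi}

def value_frequencies(conn, table, column):
    """Step 4: frequency distribution of the values in one column."""
    sql = (
        f'SELECT "{column}", COUNT(*) FROM "{table}" '
        f'GROUP BY "{column}" ORDER BY COUNT(*) DESC, "{column}"'
    )
    return conn.execute(sql).fetchall()

# Assumed sample data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "open"), (2, "closed"), (3, "open"), (4, None)],
)

for table in list_tables(conn):
    for column in list_columns(conn, table):
        print(table, column, profile_column(conn, table, column))
```

Table and column names are interpolated as text because SQL parameters cannot stand in for identifiers, so the sketch should only be fed names read from the catalog itself. Note that `COUNT(DISTINCT ...)` ignores nulls, which is why the null count is collected separately.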


