SteadyGiant/dbt_pendo_source

 
 

Apache License

Pendo (Source)

This package models Pendo data from Fivetran's connector. It uses data in the format described by this ERD.

This package enriches your Fivetran data by doing the following:

  • Adds descriptions to tables and columns that are synced using Fivetran
  • Adds column-level testing where applicable. For example, all primary keys are tested for uniqueness and non-null values.
  • Models staging tables, which will be used in our transform package
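As a rough illustration of the column-level testing mentioned above, a primary key test in a dbt schema file might look like the sketch below. The model name `stg_pendo__feature` and column name `feature_id` are assumptions chosen for illustration and may not match the package's actual files; `unique` and `not_null` are dbt's built-in generic tests. Newer dbt versions also accept `data_tests:` in place of `tests:`.

```yaml
# models/schema.yml (illustrative sketch; model and column names are assumptions)
version: 2

models:
  - name: stg_pendo__feature      # hypothetical staging model name
    description: Staging model for a Pendo source table.
    columns:
      - name: feature_id          # hypothetical primary key column
        description: Primary key of the table.
        tests:
          - unique                # every value appears at most once
          - not_null              # no missing primary keys
```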

Models

This package contains staging models designed to work alongside our Pendo modeling package. The staging models:

  • Remove any rows that are soft-deleted
  • Name columns consistently across all packages:
    • Boolean fields are prefixed with is_ or has_
    • Timestamp fields are suffixed with _at
    • ID primary keys are prefixed with the name of the table. For example, a user table's ID column is renamed user_id

Installation Instructions

Check dbt Hub for the latest installation instructions, or read the dbt docs for more information on installing packages.

Include in your packages.yml

packages:
  - package: fivetran/pendo_source
    version: [">=0.2.0", "<0.3.0"]

Package Maintenance

The Fivetran team maintains only the latest version of this package. We highly recommend keeping your packages.yml updated to the latest version listed on dbt Hub. Refer to the CHANGELOG and release notes for details on changes across versions.

Configuration

By default, this package looks for your Pendo data in the pendo schema of your target database. If this is not where your Pendo data is, add the following configuration to your dbt_project.yml file:

# dbt_project.yml

...
config-version: 2

vars:
    pendo_database: your_database_name
    pendo_schema: your_schema_name 

Passthrough Columns

This package includes all of the source columns that are defined in the macros folder. The staging models only bring in the standard columns for the EVENT, FEATURE_EVENT, PAGE_EVENT, ACCOUNT_HISTORY, and VISITOR_HISTORY tables, so if you rely on custom columns in those tables, we recommend adding them to this package.

You can add more columns using our passthrough column variables. Each passthrough column can optionally be aliased (alias) and cast (transform_sql); neither is required. To cast a column, put a SQL snippet in the transform_sql key and omit the as field_name part of the casting statement, since the package names the resulting column using the alias attribute (or the original name if no alias is set).

Use the following format for declaring the respective passthrough variables:

# dbt_project.yml

vars:

  pendo__feature_event_pass_through_columns:
    - name:           "custom_crazy_field_name"
      alias:          "normal_field_name"
  pendo__page_event_pass_through_columns:
    - name:           "property_field_id"
      alias:          "new_name_for_this_field_id"
      transform_sql:  "cast(new_name_for_this_field_id as int64)"
    - name:           "this_other_field"
      transform_sql:  "cast(this_other_field as string)"
  pendo__event_pass_through_columns:
    - name:           "well_named_field_1"
  pendo__account_history_pass_through_columns:
    - name:           "well_named_field_2"
  pendo__visitor_history_pass_through_columns:
    - name:           "well_named_field_3"

Changing the Build Schema

By default, this package builds the Pendo staging models within a schema titled (<target_schema> + _stg_pendo) in your target database. If you would like your Pendo staging data written to a different schema, add the following configuration to your dbt_project.yml file:

# dbt_project.yml

...
models:
    pendo_source:
        +schema: my_new_schema_name # leave blank for just the target_schema

Note that if your profile does not have permissions to create schemas in your warehouse, you can set the +schema to blank. The package will then write all tables to your pre-existing target schema.
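
For instance, the blank-schema case described above might look like the following sketch. It assumes dbt treats an empty +schema value as "no custom schema", so the models land in the target schema; check this against your dbt version before relying on it.

```yaml
# dbt_project.yml

models:
    pendo_source:
        +schema:   # intentionally left blank: build in the existing target schema
```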

Contributions

Don't see a model or specific metric you would like to be included? Notice any bugs when installing and running the package? If so, we highly encourage and welcome contributions to this package! Please create issues or open PRs against master. Check out this post on the best workflow for contributing to a package.

Database Support

This package has been tested on BigQuery, Snowflake, Redshift, Postgres, and Databricks.

Resources: