Releases: open-metadata/OpenMetadata

1.5.7-release

17 Oct 17:04

What's Changed

  • Feature: Add table-type custom property.
  • Feature: Support Persian language option.
  • Feature: Postgres stored procedures support.
  • Feature: Allow Custom Property Update in Glossary Bulk Import/export.
  • Improvement: Remove table details from table level Import/Export, allowing updates only for column details.
  • MINOR: looker exclude version.
  • MINOR: Add deleteEntityByFQNPrefix.
  • MINOR: Reduce lineage response size.
  • MINOR: Updated pyiceberg version to 0.5.1
  • MINOR: Fix dark border shown in navbar on UI.
  • MINOR: Add column case sensitivity parameter.
  • MINOR: Pagination with search on service page.
  • MINOR: Added loader in activity feed open and closed count.
  • MINOR: Superset get primitive datatype in case of array, struct.
  • MINOR: fix term references validation msg on glossary import.
  • MINOR: Supported search filter, and show only "All" when all node values are selected.
  • Fix: Fix PinotDB Ingestion.
  • Fix: MSAL popup auth issue.
  • Fix: Fix Alerts for Test Suites.
  • Fix: Added Glue Pipeline Lineage.
  • Fix: ClassGraph performance issue.
  • Fix: Superset query for MySQL connection.
  • Fix: Empty Connection Overwrite Logic.
  • Fix: Couchbase columns not fetched.
  • Fix: Quicksight Ingestion Error handled.
  • Fix: DBT Manifest and Run results parsing.
  • Fix: Increase MAX_AGGREGATE_SIZE in search.
  • Fix: Add display name field in the advanced search filter.
  • Fix: On dashboard soft delete, chart should not be visible.
  • Fix: Fix the automator page breaking when no source is selected.
  • Fix: Salesforce table description from label if not through query.
  • Fix: Add Import/export support for table type custom property in glossary.
  • Fix: Fix exception in search due to exception in database.displayName and databaseSchema.aggregation.
  • MINOR: Knowledge Center publicationDate mismatch error (Collate).
  • MINOR: Add owner label for knowledge center right panel (Collate).
  • Fix: Automator pagination & improvements (Collate).
  • Fix: ArchiveLog to FALSE for test connection (Collate).
  • Fix: Knowledge Page deletion is not deleting from the search index (Collate).

Full Changelog: 1.5.6-release...1.5.7-release

1.5.6-release

03 Oct 13:42

What's Changed

  • Fixed MSTR connector import.
  • Show displayName for database and databaseSchema in explore tree.
  • Allow PowerBI datamodel children in column lineage.
  • Fixed manifest not being parsed correctly on dbt versionless.
  • Fixed lineage & queries in dbt.
  • Added DBT tests with versionless and fixed v7 parsing.
  • Reset displayName to avoid being persisted while editing user display name.
  • Fixed incorrect schema implementations in Swagger annotations.
  • Resolved type null exception on user feed.
  • Addressed missing cast to str.
  • Fixed DI Missing Dashboard Description Status.
  • Fixed SAML redirect leads to 404 page on UI.
  • Fixed General Profiler Bugs.
  • Fixed time format for the created_at of the DBT cloud pipeline status.
  • Fixed role page size from 10 to 50.
  • Fixed Search Indexing.
  • Improved AlationSink connector.
  • Fixed sktime version to fix AUT.
  • Fixed "Expected ColumnLineage but got dict" error.
  • Improved Collate API with Knowledge Center routes (Collate).

Full Changelog: 1.5.5-release...1.5.6-release

1.5.5-release

25 Sep 04:48
f80afe6

What's Changed

  • Made the type optional in ES Response.
  • Added support for refresh tokens with multiple tabs open.
  • Resolved issue of overriding user info after login.
  • Updated the custom property entities data model, along with the data product and database schema icons.
  • Ensured Teams and Owner fields are correctly passed in the policy API call.
  • Enhanced PII logging information.
  • Addressed the paginate_es issue in OpenSearch.
  • Decrypted JWT internally for system health checks.
  • Implemented multithreading in View Lineage Processing.
  • Improved search relevancy.
  • Resolved issue with owners patch.
  • Fixed Snowflake data diff issue.
  • Updated Presidio Analyzer version and validated support for legal entities.
  • Added validations for Salesforce connection.
  • Allowed PII Processor to operate without storing sample data.
  • Added seconds to the human-readable format scale for test case graphs.
  • Added missing field in glossary term.
  • Excluded defaultPersona if not present in personas.
  • Resolved team export issue.
  • Updated Python lineage SDK to work with UUID and FQN models.
  • Fixed LDAP login issue.
  • Column sizing of data quality and pipeline widget (Collate).
  • Export with new line in description (Collate).
  • Fix Page entity publicationDate datatype (Collate).

Full Changelog: 1.5.4-release...1.5.5-release

1.5.4-release

13 Sep 12:43

What's Changed

OpenMetadata

  • Hotfix to the Term Aggregation size on Data Insights
  • ES pagination with error handling
  • Updated Domain in Docker Compose & Docs
  • Fix Classification API returns Table class for restore
  • Fix Redshift View Def regex_replace Error
  • Make ingestion pipeline APIs public
  • Updating the domain PRINCIPAL DOMAIN
  • Glossary list selector for bulk import
  • Unable to access the import glossary page

Full Changelog: 1.5.3-release...1.5.4-release

Collate

  • Fix token limitations using config
  • Fix Automator pagination
  • Fix MetaPilot push for no constraint

1.5.3-release

09 Sep 14:04
Pre-release

What's Changed

OpenMetadata

  • Added resizable columns for custom properties
  • Added support for automated ingestion of Tableau data source tags and description
  • Improved "follow data" landing page module performance
  • Improved search result suggestion by showing display name instead of FQN
  • Fixed Cost Analysis issue when service has no connection
  • Improved PII classification for JSON data types
  • Fixed issue with expand all operation on terms page
  • Fixed feed freezing when large images are part of the feed results
  • Fixed dbt run_results file name with dbt cloud connection

Full Changelog: 1.5.2-release...1.5.3-release

Collate

  • Cleaned Argo logs artifacts
  • Shipped VertexAI Connector
  • Fixed automator lineage propagation issues with possible None entities

1.5.2-release

02 Sep 10:05

What's Changed

  • [Fix]: Resolved issue with lineage lookup for long Fully Qualified Names (FQNs), ensuring accurate lineage tracking and display.
  • [Improve]: Fixed the 'Edit Reviewers' permission issue, allowing correct permission management for editing reviewers.
  • [Improve]: Addressed email update issues to ensure that email addresses are properly updated throughout the system.
  • [Improve]: Fixed the delete lineage functionality to handle cases where override lineage is enabled, preventing errors and ensuring consistency.
  • [Improve]: Added support for the 'Edit Assign' button in the Incident Manager, allowing for easier assignment changes.
  • [Improve]: Introduced a resizable layout for the glossary page, improving usability and adaptability to different screen sizes.
  • [Improve]: Enhanced the display of tier tags with improved styling for better visibility and distinction.
  • [Improve]: Pick email and name based on claim values at login. This update ensures that user details are automatically populated during the login process, streamlining user experience.
  • [Improve]: Added custom properties support in Data Product

Full Changelog: 1.5.1-release...1.5.2-release

1.5.2-rc1-release

30 Aug 12:26
Pre-release

What's Changed

  • [Fix]: Resolved issue with lineage lookup for long Fully Qualified Names (FQNs), ensuring accurate lineage tracking and display.
  • [Improve]: Fixed the 'Edit Reviewers' permission issue, allowing correct permission management for editing reviewers.
  • [Improve]: Addressed email update issues to ensure that email addresses are properly updated throughout the system.
  • [Improve]: Fixed the delete lineage functionality to handle cases where override lineage is enabled, preventing errors and ensuring consistency.
  • [Improve]: Added support for the 'Edit Assign' button in the Incident Manager, allowing for easier assignment changes.
  • [Improve]: Introduced a resizable layout for the glossary page, improving usability and adaptability to different screen sizes.
  • [Improve]: Enhanced the display of tier tags with improved styling for better visibility and distinction.
  • [Improve]: Pick email and name based on claim values at login. This update ensures that user details are automatically populated during the login process, streamlining user experience.
  • [Improve]: Added custom properties support in Data Product

Full Changelog: 1.5.1-release...1.5.2-rc1-release

1.5.1-release

28 Aug 14:05

Backward Incompatible Changes

Multi Owners

Previously, OpenMetadata allowed a single user or a team to be tagged as the owner of a data asset. Starting with Release 1.5.0, users can tag multiple individual owners or a single team. This lets organizations assign ownership to several individuals without having to create a team around them, as was previously required.

This is a backward incompatible change. If you are using the APIs, please make sure the owner field is changed to “owners”.
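For API clients that still build payloads with the old field, the rename can be sketched roughly as follows. This is a minimal illustration only: the payload shape, the `migrate_owner_field` helper, and the sample values are hypothetical and are not taken from the OpenMetadata API reference.

```python
from typing import Any


def migrate_owner_field(payload: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of payload with a legacy 'owner' value moved into an 'owners' list.

    Hypothetical helper: the real request schema should be checked in the API docs.
    """
    migrated = dict(payload)  # shallow copy so the caller's dict is left untouched
    legacy_owner = migrated.pop("owner", None)
    # Only fill 'owners' when the payload does not already use the new field.
    if legacy_owner is not None and "owners" not in migrated:
        migrated["owners"] = [legacy_owner]
    return migrated


# Assumed example payload; the id and type values are placeholders.
old_payload = {"name": "orders", "owner": {"id": "hypothetical-id", "type": "user"}}
new_payload = migrate_owner_field(old_payload)
print(new_payload)
```

The helper wraps the single legacy owner in a list rather than mutating the input, so it can be applied safely to payloads that have already been migrated.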

Import/Export Format

To support the multi-owner format, we have changed how CSV files are exported and imported for glossary, services, database, schema, table, etc. The new format is:
user:userName;team:TeamName

If you are importing an older file, please make sure to make this change.
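As a rough aid for checking older files, the sketch below parses and re-serializes an owner cell in the `user:userName;team:TeamName` format shown above. The function names are hypothetical, and it assumes that owner names contain neither `;` nor `:`.

```python
def parse_owners(cell: str) -> list[tuple[str, str]]:
    """Split an owner cell such as 'user:userName;team:TeamName' into (type, name) pairs."""
    owners = []
    for entry in cell.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty segments such as a trailing ';'
        owner_type, separator, name = entry.partition(":")
        if not separator or owner_type not in ("user", "team") or not name:
            raise ValueError(f"Unexpected owner entry: {entry!r}")
        owners.append((owner_type, name))
    return owners


def format_owners(owners: list[tuple[str, str]]) -> str:
    """Serialize (type, name) pairs back into the semicolon-separated cell format."""
    return ";".join(f"{owner_type}:{name}" for owner_type, name in owners)


parsed = parse_owners("user:userName;team:TeamName")
print(parsed)
print(format_owners(parsed))
```

An entry without a `user:` or `team:` prefix raises a `ValueError`, which is a convenient way to spot cells that still use the older format.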

Pydantic V2

At the core of OpenMetadata are the JSON Schemas that define the metadata standard. These schemas are automatically translated into Java, TypeScript, and Python code with Pydantic classes.

In this release, we have migrated the codebase from Pydantic V1 to Pydantic V2.

Deployment Related Changes (OSS only)

./bootstrap/bootstrap_storage.sh removed

The OpenMetadata community has built rolling upgrades to the database schema and the data to make upgrades easier. This tool is now called ./bootstrap/openmetadata-ops.sh and has been part of our releases since 1.3. The bootstrap_storage.sh script doesn’t support the new native schemas in OpenMetadata, so we have removed it in this release.

While upgrading, please refer to our Upgrade Notes in the documentation. Always follow the best practices provided there.

Database Connection Pooling

OpenMetadata uses Jdbi to handle database-related operations such as read/write/delete. In this release, we introduced additional configs to help with connection pooling, allowing the efficient use of a database with low resources.

Please update the defaults if your cluster is running at a large scale to scale up the connections efficiently.

For the new configuration, please refer to the doc here

Data Insights

The Data Insights application is meant to give you a quick glance at your data's state and allow you to take action based on the information you receive. To continue pursuing this objective, the application was completely refactored to allow customizability.

Part of this refactor was making Data Insights an internal application, no longer relying on an external pipeline. This means triggering Data Insights from the Python SDK will no longer be possible.

With this change you will need to run a backfill on the Data Insights for the last couple of days since the Data Assets data changed.

UI Changes

New Explore Page

The Explore page displays hierarchically organized data assets by grouping them into services > database > schema > tables/stored procedures. This helps users organically find the data asset they are looking for based on a known database or schema they were using. This is a new feature and changes the way the Explore page was built in previous releases.
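The grouping described above can be pictured with a small sketch. It is not the Explore page implementation: it assumes each asset is identified by a four-part, dot-separated name (service, database, schema, asset) with no dots inside the individual parts, and the sample names are made up.

```python
def build_explore_tree(fqns: list[str]) -> dict[str, dict[str, dict[str, list[str]]]]:
    """Group dot-separated asset names into service > database > schema > assets."""
    tree: dict[str, dict[str, dict[str, list[str]]]] = {}
    for fqn in fqns:
        parts = fqn.split(".")
        if len(parts) != 4:
            raise ValueError(f"Expected service.database.schema.asset, got {fqn!r}")
        service, database, schema, asset = parts
        # setdefault creates each level of the hierarchy on first use.
        tree.setdefault(service, {}).setdefault(database, {}).setdefault(schema, []).append(asset)
    return tree


# Hypothetical asset names used only for illustration.
explore_tree = build_explore_tree(
    [
        "mysql_prod.shop.public.orders",
        "mysql_prod.shop.public.customers",
        "mysql_prod.shop.audit.refresh_orders_proc",
        "snowflake_dw.analytics.marts.daily_sales",
    ]
)
print(explore_tree)
```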

Connector Schema Changes

In the latest release, several updates and enhancements have been made to the JSON schema across various connectors. These changes aim to improve security, configurability, and expand integration capabilities. Here's a detailed breakdown of the updates:

  • KafkaConnect: Added schemaRegistryTopicSuffixName to enhance topic configuration flexibility for schema registries.
  • GCS Datalake: Introduced bucketNames field, allowing users to specify targeted storage buckets within the Google Cloud Storage environment.
  • OpenLineage: Added saslConfig to enhance security by enabling SASL (Simple Authentication and Security Layer) configuration.
  • Salesforce: Added sslConfig to strengthen the security layer for Salesforce connections by supporting SSL.
  • DeltaLake: Updated schema by moving metastoreConnection to a newly created metastoreConfig.json file. Additionally, introduced configSource to better define source configurations, with new support for metastoreConfig.json and storageConfig.json.
  • Iceberg RestCatalog: Removed clientId and clientSecret as mandatory fields, making the schema more flexible for different authentication methods.
  • DBT Cloud Pipelines: Added as a new connector to support cloud-native data transformation workflows using DBT.
  • Looker: Expanded support to include connections using GitLab integration, offering more flexible and secure version control.
  • Tableau: Enhanced support by adding capabilities for connecting with TableauPublishedDatasource and TableauEmbeddedDatasource, providing more granular control over data visualization and reporting.

Include DDL

During the Database Metadata ingestion, we can optionally pick up the DDL for both tables and views. During the metadata ingestion, we use the view DDLs to generate the View Lineage.

To reduce the processing time for out-of-the-box workflows, we are disabling the include DDL by default, whereas before, it was enabled, which potentially led to long-running workflows.

Secrets Manager

Starting with release 1.5.0, the JWT Token for the bots will be sent to the Secrets Manager if you have configured one. It will no longer appear in your dag_generated_configs in Airflow.

Python SDK

The metadata insight command has been removed. Since the Data Insights application was moved to be an internal system application instead of relying on external pipelines, the SDK command to run the pipeline was removed.

What's New

Data Observability with Anomaly Detection (Collate)

OpenMetadata has been driving innovation in Data Quality in Open Source. Many organizations are taking advantage of the following Data Quality features to achieve better-quality data:

  1. A Native Profiler to understand the shape of the data, its freshness, completeness, and volume, with the ability to add your own metrics, including a column-level profiler over time series and dashboards.
  2. No-code data quality tests that you can deploy, with results collected back and shown in a dashboard, all within OpenMetadata.
  3. Alerts that notify you of test results through email, Slack, MS Teams, GChat, and Webhook.
  4. An Incident Manager to collaborate around test failures and give downstream consumers visibility into failures from upstream.

In 1.5.0, we are bringing in AI-based Anomaly Detection, which learns from historical data to predict when an anomaly will happen and automatically notifies the owners of the table to warn them of impending incidents.

Enhanced Data Quality Dashboard (Collate)

We also have improved the Table Data quality dashboard to showcase the tests categorized and make it easy for everyone to consume. When there are issues, the new dashboard makes it easier to understand the Data Quality coverage of your tables and the possible impact each test failure has by organizing tests into different groups.

Freshness Data Quality Tests (Collate)

Working with old data can lead to making wrong decisions. With the new Freshness test, you can validate that your data arrives at the right time. Freshness tests are a critical part of any data team's toolset. Bringing these tests together with lineage information and the Incident Manager, your team will be able to quickly detect issues related to missing data or stuck pipelines.
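The idea behind a freshness check can be sketched in a few lines. This is an illustration of the concept only, not the Collate test definition: the `is_fresh` function, its parameters, and the sample timestamps are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_fresh(last_updated: datetime, max_age: timedelta, now: Optional[datetime] = None) -> bool:
    """Return True if the newest record is no older than max_age."""
    # Allow the caller to pin 'now' so the check is reproducible.
    current = now if now is not None else datetime.now(timezone.utc)
    return current - last_updated <= max_age


# Assumed values: the newest row arrived 2.5 hours before the check ran.
check_time = datetime(2024, 8, 28, 12, 0, tzinfo=timezone.utc)
latest_row = datetime(2024, 8, 28, 9, 30, tzinfo=timezone.utc)
print(is_fresh(latest_row, timedelta(hours=6), now=check_time))  # inside a 6-hour window
print(is_fresh(latest_row, timedelta(hours=1), now=check_time))  # outside a 1-hour window
```

A failing check of this kind is what would typically point to missing data or a stuck upstream pipeline.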

Data Diff Data Quality Tests

Data quality checks are important not only within a single table but also between different tables. These data diff checks can ensure key data remains unchanged after transformation, or conversely, ensure that the transformations were actually performed.

We are introducing the table difference data quality test to validate that multiple appearances of the same information remain consistent. Note that the test allows you to specify which column to use as a key and which columns you want to compare, and even add filters in the data to give you more control over multiple use cases.
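To make the key column, compared columns, and filter options concrete, here is a minimal in-memory sketch of a table difference check. It is not the OpenMetadata test implementation: the `diff_tables` function, its parameter names, and the sample rows are hypothetical, and it assumes key values are unique within each table.

```python
from typing import Any, Callable, Optional

Row = dict[str, Any]


def diff_tables(
    source: list[Row],
    target: list[Row],
    key: str,
    compare: list[str],
    row_filter: Optional[Callable[[Row], bool]] = None,
) -> dict[str, list[Any]]:
    """Compare two tables row by row, matching rows on the key column."""
    if row_filter is not None:
        # Apply the same filter to both sides before comparing.
        source = [row for row in source if row_filter(row)]
        target = [row for row in target if row_filter(row)]

    source_by_key = {row[key]: row for row in source}
    target_by_key = {row[key]: row for row in target}
    shared_keys = source_by_key.keys() & target_by_key.keys()

    return {
        "missing_in_target": sorted(source_by_key.keys() - target_by_key.keys()),
        "extra_in_target": sorted(target_by_key.keys() - source_by_key.keys()),
        "changed": sorted(
            k
            for k in shared_keys
            if any(source_by_key[k][col] != target_by_key[k][col] for col in compare)
        ),
    }


# Made-up sample data for illustration.
source_rows = [
    {"id": 1, "amount": 10, "status": "paid"},
    {"id": 2, "amount": 20, "status": "open"},
    {"id": 3, "amount": 30, "status": "paid"},
]
target_rows = [
    {"id": 1, "amount": 10, "status": "paid"},
    {"id": 2, "amount": 25, "status": "open"},
    {"id": 4, "amount": 40, "status": "paid"},
]
result = diff_tables(source_rows, target_rows, key="id", compare=["amount"])
print(result)
```

Passing `row_filter=lambda row: row["status"] == "paid"` restricts the comparison to paid rows, which mirrors the idea of adding filters to narrow a test to a specific use case.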

Domains RBAC & Subdomains

OpenMetadata introduced Domains & Data Products in 1.3.0. Since then, many large organizations have started using Domains & Data Products to achieve better ownership and collaboration around domains that can span multiple teams.

In the 1.5.0 release, we added support for subdomains. This will help teams to organize into multiple subdomains within each domain.

RBAC for Domains

With the 1.5.0 release, we are adding stricter controls around Domains. Now, teams, data assets, glossaries, and classifications can belong to a domain and can be assigned a policy such that only users within a domain can access the data within that domain. Domain owners can use Data Products to publish and showcase publicly available data assets from a specific domain.

This will help large companies use a single OpenMetadata platform to unify all of their data and teams while also providing more stringent controls to segment the data between domains.

Improved Explore Page & Data Asset Widget

OpenMetadata, with its simple UI/UX and data collaboration features, is becoming more attractive to non-technical users as well. Data Governance teams are using OpenMetadata to add glossary terms and policies around metadata. Teams using the Collate SaaS product are taking advantage of our Automations feature to gain productivity in their governance tasks.

Our new improved navigation on the Explore page will help users navigate hierarchically and find the data they are looking for. Users will see the data assets now grouped by service name -> database -> schema -> tables/stored procedures.

We are also making the discovery of data more accessible for users introducing a data asset widget, which will gro...


OpenMetadata 1.5.0 Release

26 Aug 05:19
Pre-release

Please wait for 1.5.1

We've found some issues that would degrade the user experience of 1.5.0. Please use 1.5.1 instead. Thanks

Backward Incompatible Changes

Multi Owners

Previously, OpenMetadata allowed a single user or a team to be tagged as the owner of a data asset. Starting with Release 1.5.0, users can tag multiple individual owners or a single team. This lets organizations assign ownership to several individuals without having to create a team around them, as was previously required.

This is a backward incompatible change. If you are using the APIs, please make sure the owner field is changed to “owners”.

Import/Export Format

To support the multi-owner format, we have changed how CSV files are exported and imported for glossary, services, database, schema, table, etc. The new format is:
user:userName;team:TeamName

If you are importing an older file, please make sure to make this change.

Pydantic V2

At the core of OpenMetadata are the JSON Schemas that define the metadata standard. These schemas are automatically translated into Java, TypeScript, and Python code with Pydantic classes.

In this release, we have migrated the codebase from Pydantic V1 to Pydantic V2.

Deployment Related Changes (OSS only)

./bootstrap/bootstrap_storage.sh removed

The OpenMetadata community has built rolling upgrades to the database schema and the data to make upgrades easier. This tool is now called ./bootstrap/openmetadata-ops.sh and has been part of our releases since 1.3. The bootstrap_storage.sh script doesn’t support the new native schemas in OpenMetadata, so we have removed it in this release.

While upgrading, please refer to our Upgrade Notes in the documentation. Always follow the best practices provided there.

Database Connection Pooling

OpenMetadata uses Jdbi to handle database-related operations such as read/write/delete. In this release, we introduced additional configs to help with connection pooling, allowing the efficient use of a database with low resources.

Please update the defaults if your cluster is running at a large scale to scale up the connections efficiently.

For the new configuration, please refer to the doc here

Data Insights

The Data Insights application is meant to give you a quick glance at your data's state and allow you to take action based on the information you receive. To continue pursuing this objective, the application was completely refactored to allow customizability.

Part of this refactor was making Data Insights an internal application, no longer relying on an external pipeline. This means triggering Data Insights from the Python SDK will no longer be possible.

With this change you will need to run a backfill on the Data Insights for the last couple of days since the Data Assets data changed.

UI Changes

New Explore Page

The Explore page displays hierarchically organized data assets by grouping them into services > database > schema > tables/stored procedures. This helps users organically find the data asset they are looking for based on a known database or schema they were using. This is a new feature and changes the way the Explore page was built in previous releases.

Connector Schema Changes

In the latest release, several updates and enhancements have been made to the JSON schema across various connectors. These changes aim to improve security, configurability, and expand integration capabilities. Here's a detailed breakdown of the updates:

  • KafkaConnect: Added schemaRegistryTopicSuffixName to enhance topic configuration flexibility for schema registries.
  • GCS Datalake: Introduced bucketNames field, allowing users to specify targeted storage buckets within the Google Cloud Storage environment.
  • OpenLineage: Added saslConfig to enhance security by enabling SASL (Simple Authentication and Security Layer) configuration.
  • Salesforce: Added sslConfig to strengthen the security layer for Salesforce connections by supporting SSL.
  • DeltaLake: Updated schema by moving metastoreConnection to a newly created metastoreConfig.json file. Additionally, introduced configSource to better define source configurations, with new support for metastoreConfig.json and storageConfig.json.
  • Iceberg RestCatalog: Removed clientId and clientSecret as mandatory fields, making the schema more flexible for different authentication methods.
  • DBT Cloud Pipelines: Added as a new connector to support cloud-native data transformation workflows using DBT.
  • Looker: Expanded support to include connections using GitLab integration, offering more flexible and secure version control.
  • Tableau: Enhanced support by adding capabilities for connecting with TableauPublishedDatasource and TableauEmbeddedDatasource, providing more granular control over data visualization and reporting.

Include DDL

During the Database Metadata ingestion, we can optionally pick up the DDL for both tables and views. During the metadata ingestion, we use the view DDLs to generate the View Lineage.

To reduce the processing time for out-of-the-box workflows, we are disabling the include DDL by default, whereas before, it was enabled, which potentially led to long-running workflows.

Secrets Manager

Starting with release 1.5.0, the JWT Token for the bots will be sent to the Secrets Manager if you have configured one. It will no longer appear in your dag_generated_configs in Airflow.

Python SDK

The metadata insight command has been removed. Since the Data Insights application was moved to be an internal system application instead of relying on external pipelines, the SDK command to run the pipeline was removed.

What's New

Data Observability with Anomaly Detection (Collate)

OpenMetadata has been driving innovation in Data Quality in Open Source. Many organizations are taking advantage of the following Data Quality features to achieve better-quality data:

  1. A Native Profiler to understand the shape of the data, its freshness, completeness, and volume, with the ability to add your own metrics, including a column-level profiler over time series and dashboards.
  2. No-code data quality tests that you can deploy, with results collected back and shown in a dashboard, all within OpenMetadata.
  3. Alerts that notify you of test results through email, Slack, MS Teams, GChat, and Webhook.
  4. An Incident Manager to collaborate around test failures and give downstream consumers visibility into failures from upstream.

In 1.5.0, we are bringing in AI-based Anomaly Detection, which learns from historical data to predict when an anomaly will happen and automatically notifies the owners of the table to warn them of impending incidents.

Enhanced Data Quality Dashboard (Collate)

We also have improved the Table Data quality dashboard to showcase the tests categorized and make it easy for everyone to consume. When there are issues, the new dashboard makes it easier to understand the Data Quality coverage of your tables and the possible impact each test failure has by organizing tests into different groups.

Freshness Data Quality Tests (Collate)

Working with old data can lead to making wrong decisions. With the new Freshness test, you can validate that your data arrives at the right time. Freshness tests are a critical part of any data team's toolset. Bringing these tests together with lineage information and the Incident Manager, your team will be able to quickly detect issues related to missing data or stuck pipelines.

Data Diff Data Quality Tests

Data quality checks are important not only within a single table but also between different tables. These data diff checks can ensure key data remains unchanged after transformation, or conversely, ensure that the transformations were actually performed.

We are introducing the table difference data quality test to validate that multiple appearances of the same information remain consistent. Note that the test allows you to specify which column to use as a key and which columns you want to compare, and even add filters in the data to give you more control over multiple use cases.

Domains RBAC & Subdomains

OpenMetadata introduced Domains & Data Products in 1.3.0. Since then, many large organizations have started using Domains & Data Products to achieve better ownership and collaboration around domains that can span multiple teams.

In the 1.5.0 release, we added support for subdomains. This will help teams to organize into multiple subdomains within each domain.

RBAC for Domains

With the 1.5.0 release, we are adding stricter controls around Domains. Now, teams, data assets, glossaries, and classifications can belong to a domain and can be assigned a policy such that only users within a domain can access the data within that domain. Domain owners can use Data Products to publish and showcase publicly available data assets from a specific domain.

This will help large companies use a single OpenMetadata platform to unify all of their data and teams while also providing more stringent controls to segment the data between domains.

Improved Explore Page & Data Asset Widget

OpenMetadata, with its simple UI/UX and data collaboration features, is becoming more attractive to non-technical users as well. Data Governance teams are using OpenMetadata to add glossary terms and policies around metadata. Teams using the Collate SaaS product are taking advantage of our Automations feature to gain productivity in their governance tasks.

Our new improved navigation on the Explore page will help users navigate hierarchically and find the data they are looking for. Users will see the data assets now grouped by `service name -> database -> schema -> tables/st...


1.5.0-rc2-release

22 Aug 13:33
Pre-release

What's Changed

Read more