diff --git a/README.md b/README.md
index 9c67c5e8f..f7b367339 100644
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@
 Apache2
 Maintenance
 GitHub contributors
-GitHub issues by-label
+GitHub issues by-label

@@ -31,7 +31,7 @@ Play with our [demo app](https://demo.oddp.io)!
 
 ## Introduction
 
-ODD is an open-source data discovery and observability tool for data teams that helps to efficiently democratise data, power collaboration and reduce time on data discovery through modern user-friendly environment.
+ODD is an open-source data discovery and observability tool for data teams that helps to efficiently democratise data, power collaboration and reduce time on data discovery through modern user-friendly environment.
 
 ### Key wins
 
@@ -41,9 +41,7 @@ ODD is an open-source data discovery and observability tool for data teams that
 * Accelerate data insights
 * Know the sources of your dashboards and ad hoc reports
 * Deprecate outdated objects responsibly by assessing and mitigating the risks
-
-
-* :point_right: ODD Platform is a reference implementation of **[Open Data Discovery Spec](https://github.com/opendatadiscovery/opendatadiscovery-specification)**.
+* :point_right: ODD Platform is a reference implementation of **[Open Data Discovery Spec](https://github.com/opendatadiscovery/opendatadiscovery-specification)**
 
 ## Features
 
@@ -53,14 +51,12 @@ ODD is an open-source data discovery and observability tool for data teams that
 * Gain observability through E2E Data objects Lineage
 * Benefit from cutting-edge E2E microservices Lineage feature in tracking your data flow through the whole data landscape
 * Be warned and alerted by Pipeline Monitoring tools
-* Store your metadata
+* Store your metadata
 * Use ODD-native modern lightweight UI
-
 
 ### ML First citizen
 
-* Save results of your ML Experiments by automatically logging its parameters
-
+* Save results of your ML Experiments by automatically logging its parameters
 
 ### Data Security & Compliance
 
@@ -68,11 +64,10 @@ ODD is an open-source data discovery and observability tool for data teams that
 * Refer to Tags to stay compliant with data security standards
 * Have full transparency on how and by whom the data is used
 
-
-### Data Quality
+### Data Quality
 
 * Utilize advanced Data Quality Dashboard to gain insights into data quality metrics, trends, and issues across your datasets, enabling proactive data quality management
-* Simplify DQ processes by using ODD with Great Expectations and DBT tests compatibility
+* Simplify DQ processes by using ODD with Great Expectations and DBT tests compatibility
 * Integrate ODD with any custom DQ framework
 
 ### Reference Data Management (Lookup Tables) - a part of Master Data Management (MDM)
@@ -81,37 +76,41 @@ ODD is an open-source data discovery and observability tool for data teams that
 * Easily integrate Lookup Tables with data pipelines and transformations, enhancing data enrichment and validation processes
 * Support data governance and compliance efforts by maintaining accurate and consistent reference data across all data assets
 
+## Getting Started
 
-## Getting Started
 ### Running as a separate container
 
 Setting up PostgreSQL connection details, for example:
-```
-export POSTGRES_HOST=172.17.0.1 \
-export POSTGRES_PORT=5432 \
-export POSTGRES_DATABASE=postgres \
-export POSTGRES_USER=postgres \
+
+```shell
+export POSTGRES_HOST=172.17.0.1
+export POSTGRES_PORT=5432
+export POSTGRES_DATABASE=postgres
+export POSTGRES_USER=postgres
 export POSTGRES_PASSWORD=mysecretpassword
 ```
+
 Starting new instance of the platform:
-```
+
+```shell
 docker run -d \
     --name odd-platform \
     -e SPRING_DATASOURCE_URL=jdbc:postgresql://${POSTGRES_HOST}:${POSTGRES_PORT}/${POSTGRES_DATABASE} \
     -e SPRING_DATASOURCE_USERNAME=${POSTGRES_USER} \
-    -e SPRING_DATASOURCE_PASSWORD=${POSTGRES_PASSWORD} \
+    -e SPRING_DATASOURCE_PASSWORD=${POSTGRES_PASSWORD} \
     -p 8080:8080 \
     ghcr.io/opendatadiscovery/odd-platform:latest
 ```
-Go to [localhost:8080](http://localhost:8080) in case of local environment
+
+Go to [localhost:8080](http://localhost:8080) in case of local environment.
 
 ### Running Locally with Docker Compose
 
-```
+```shell
 docker-compose -f docker/demo.yaml up -d odd-platform-enricher
 ```
 
-* :point_right: **[QUICKSTART](./docker/README.md)**
+* :point_right: **[QUICKSTART](./docker/README.md)**
 
 ### Deploying to Kubernetes with Helm Charts
 
@@ -123,19 +122,21 @@ There are various example configurations (via docker-compose) within **[docker/e
 
 ## Contributing
 
-Contributing to ODD Platform is very welcome. For basic contributions, all you need is being comfortable with GitHub and Git. The best ways to contribute are:
-* Work on new adapters
+Contributing to ODD Platform is very welcome. For basic contributions, all you need is being comfortable with GitHub and Git. The best ways to contribute are:
+
+* Work on new adapters
 * Work on documentation
 
-To ensure equal and positive communication, we adhere to our [Code of Conduct](./CODE_OF_CONDUCT.md). Before starting any interactions with this repository, please read it and make sure to follow.
+To ensure equal and positive communication, we adhere to our [Code of Conduct](./CODE_OF_CONDUCT.md). Before starting any interactions with this repository, please read it and make sure to follow.
 
-Please before contributing check out our [Contributing Guide](./CONTRIBUTING.md) and issues labeled "good first issue":
+Please before contributing check out our [Contributing Guide](./CONTRIBUTING.md) and issues labeled "good first issue":
 
 [![GitHub issues by-label](https://img.shields.io/github/issues/opendatadiscovery/odd-platform/good%20first%20issue?style=for-the-badge)](https://github.com/opendatadiscovery/odd-platform/contribute)
 
 ## Integrations
+
 OpenDataDiscovery Platform offers comprehensive data source support to meet your needs.
 
@@ -146,94 +147,107 @@ OpenDataDiscovery Platform offers comprehensive data source support to meet your
 | | | |
 | --- | --- | --- |
 | Proxy Adapter | Airflow | Airflow 2+ |
 | Apache Druid | Cassandra | Clickhouse |
 | Elasticsearch | Hive | Kafka |
 | Feast | MSSQL | MySQL |
 | Microsoft ODBC | MongoDB | Neo4j |
 | MariaDB | Oracle | PostgreSQL |
 | Redshift | Snowflake | Vertica |
 | Tarantool | Athena | DynamoDB |
 | Glue | Kinesis | Quicksight |
 | S3 | SageMaker | SageMaker Featurestore |
 | SQS | Delta lake S3 | Tableau |
-| Cube | SuperSet | PowerBi |
+| Cube | SuperSet | PowerBI |
 | Trino | Presto | DBT |
 | Redash | Spark | MLflow |
 | Kubeflow | Databricks Unity Catalog | Great Expectations |
 | SQLite | Couchbase | Cockroachdb |
 | Fivetran | Airbyte | Metabase |
 | Mode | BigQuery | Singlestore |
+| BigTable | GoogleCloudStorage | GoogleCloudStoraDeltaTables |
+| Blob Storage | Duckdb | ScyllaDB |
+| CKAN | | |
@@ -244,29 +258,26 @@ ODD operates the following high-level types of entities:

 1. Datasets (collections of data: tables, topics, files, feature groups)
-2. Transformers (transformers of data: ETL or ML training jobs, experiments)
-3. Data Consumers (data consumers: ML models or BI dashboards)
-4. Data Quality Tests (data quality tests for datasets)
+2. Transformers (transformers of data: ETL or ML training jobs, experiments)
+3. Data Consumers (data consumers: ML models or BI dashboards)
+4. Data Quality Tests (data quality tests for datasets)
 5. Data Inputs (sources of data)
 6. Transformer Runs (executions of ETL or ML training jobs)
-7. Quality Test Runs executions of data quality tests
+7. Quality Test Runs executions of data quality tests
 
 For more information, please check **[specification.md](https://github.com/opendatadiscovery/opendatadiscovery-specification/blob/main/specification/specification.md)**.
 
-
 ## Community Support
 
 Join our community if you need help, want to chat or have any other questions for us:
 
-- [GitHub](https://github.com/opendatadiscovery/odd-platform/discussions) - Discussion forums and issues
-- [Slack](https://go.opendatadiscovery.org/slack) - Join the conversation! Get all the latest updates and chat to the devs
-
+* [GitHub](https://github.com/opendatadiscovery/odd-platform/discussions) - Discussion forums and issues
+* [Slack](https://go.opendatadiscovery.org/slack) - Join the conversation! Get all the latest updates and chat to the devs
 
 ## Contacts
 
-If you have any questions or ideas, please don't hesitate to drop a line to any of us.
-
+If you have any questions or ideas, please don't hesitate to drop a line to any of us.
 
 | Team Member | LinkedIn | GitHub |
 | ---------------- | ------------------------------------------------------------------ | --------------------------------------------------- |
@@ -276,6 +287,7 @@ If you have any questions or ideas, please don't hesitate to drop a line to any
 | Alexey Kozyurov | [LinkedIn](https://www.linkedin.com/in/aleksei-koziurov/) | [Leshe4ka](https://github.com/Leshe4ka) |
 | Pavel Makarichev | [LinkedIn](https://www.linkedin.com/in/pavel-makarichev-8a8730a4/) | [vixtir](https://github.com/vixtir) |
 | Roman Zabaluev | [LinkedIn](https://www.linkedin.com/in/haarolean/) | [Haarolean](https://github.com/haarolean) |
+
 ## License
 
 ODD Platform uses the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.txt).