-
Data Warehouse Tech Stack with PostgreSQL, DBT, Airflow, and Redash
-
2022 Technology Landscape to Democratize Data
from raw data to insights: discover, prep, build, operationalize
-
What’s in Store for the Future of the Modern Data Stack? 😐
2022 predictions
-
Data Engineering Technologies in 2021 😄
data ecosystem overview
-
Open Source Analytics Stack: Bringing Control, Flexibility, and Data-Privacy to Your Analytics 🍇
open source data platform solutions
-
How we’re building our data platform as a product 🍒
data as product
-
Building an End-to-End Open-Source Modern Data Platform
end to end data platform
-
The Best Way to Manage Unstructured Data Efficiently
unstructured data
-
SaaS Foundations: Building scale-up data infrastructure
SaaS data infrastructure
-
The State of Data Infrastructure Landscape in 2022 and Beyond
MDS 2022
-
Fybrik Open Architecture and Ecosystem
data architecture
-
How ManyPets Implemented The Modern Data Stack
some open source stack
Metadata is data about other data: it describes that data and provides information about it.
Knowing the metadata means knowing the “who, what, where, why, when and how” of the data.
- Data Governance Part 2 — Data Governance Platforms
- Data catalog ROI — A Primer
- What is metadata? 👑
metadata tutorial
- Announcing OpenMetadata
- Democratizing Data at Airbnb
- Creating a Metadata Architecture From the Ground Up
- Data Discovery in 2020
- Rethink Metadata … It’s Facets ❄️
metadata thinking
- The Missing Piece of Data Discovery and Observability Platforms: Open Standard for Metadata 🍊
the first structure I have come across that matches my own ideas
- Metadata Management, A Critical Element in Data Governance
- Metacat: Making Big Data Discoverable and Meaningful at Netflix
- Hive Metastore – Why It’s Still Here and What Can Replace It?
- Level Up Your Data Lake
- Open Source Data Lake Table Formats: Evaluating Current Interest and Rate of Adoption
Hudi vs Iceberg vs Delta
- Apache HUDI vs Delta Lake
Hudi vs Delta Lake
- Building a Data Lake with Apache Airflow 🚤
datalake airflow
- How to build a Data Lake in your early-stage startup 🌊
data lake tutorial
- Anatomy of Modern Data Platform 🍔
data infrastructure overview
- Universal Data Lake: The Future of Data 🚐
I couldn't agree more
- Apache Hudi — The Streaming Data Lake Platform
- DataLake - In-Depth Comparison of DeltaLake and Apache HUDI
hudi vs deltalake
- Hudi, Iceberg and Delta Lake: Data Lake Table Formats Compared
hudi vs deltalake vs iceberg
- Road to Lakehouse - Part 1: Delta Lake data pipeline overview
Delta Lake tutorial
- Road to Lakehouse — Part 2: Ingest and process data from Kafka with CDC and Delta Lake’s CDF
- Iceberg at Adobe
iceberg practice
- Challenges with data lakes
- Modern data architecture recipes
-
What is a Data Lakehouse? 👔
Lakehouse tutorial
-
The Modern Cloud Data Platform war — DataBricks (Part 2) 🎅
data warehouse, data lake, data lakehouse
-
Demystifying Data Lake Architecture
architecture
- The 10 Best Data Visualizations of 2021
- Anomaly Detection Part 1: The Key to Effective Data Observability 🍔
observability why
- Data Contracts — The Mesh glue
- Data Mesh Architecture Patterns
Pattern
- Awesome Data Mesh
- How to Build the Data Mesh Foundation: A Principled Approach
- DataOps Technologies — Why Meta-Orchestration?
- How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh
- Data Mesh Set
- Data Mesh: The Four Principles of a Distributed Architecture
- Data Movement in Netflix Studio via Data Mesh
data mesh studio
- An Architecture for the Data Mesh
data mesh architecture
- A Meta-architecture for Data Mesh
data mesh meta
- Data Mesh Architecture and the Role of APIs & JSON Schemas
APIs & JSON Schemas
- Data Mesh Pattern Deep Dive: Change Data Capture
data mesh Change Data Capture
- Top MLOps Data Versioning tools — 2021
- Machine Learning Data Visualization
- Five Predictions for the Future of the Modern Data Stack
- A Primer on Data Drift
- Optimization of the life cycle of an ML project using MLflow and DVC 📘
DVC + MLflow
- A Quick Start To Data Quality Monitoring For Machine Learning
- How to ensure data quality with Great Expectations 🐷
Great Expectations tutorial
- A tale of data quality checks at the runtime
Kedro + Great Expectations
- ETL Testing in a nutshell
- 14 Principles To Secure Your Data Pipelines
- Setting up Airbyte ETL: Minimum Viable Data Stack Part II 🎨
airbyte tutorial
- Airbyte: Data Integration / CDC Solution for Modern Data Teams! 🎸
airbyte tutorial
- (P)TL, a new data engineering architecture 🎇
new data integration view
- How we build a Cloud Data lake using ELT instead of ETL
data lake + ELT
- Airbyte or Meltano — and why I use neither of them
Airbyte vs Meltano
- 6 data integration principles for data engineers to live by
- Python ETL Pipeline: The Incremental data load Techniques