Trusted Across the Industry
How does it work?
Automated Metadata Ingestion
Push-based ingestion can use a prebuilt emitter or emit custom events using our framework.
Pull-based ingestion crawls a metadata source. We have prebuilt integrations with Kafka, MySQL, MS SQL, Postgres, LDAP, Snowflake, Hive, BigQuery, and more. Ingestion can be automated using our Airflow integration or another scheduler of choice.
DataHub's push-based architecture also supports pull, but pull-first systems cannot support push. Learn more about metadata ingestion with DataHub in the docs.
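To make the push-versus-pull point concrete, here is a minimal Python sketch. It does not use the DataHub API: `MetadataEvent`, `InMemorySink`, and `crawl_source` are hypothetical names invented for illustration, and the "source" is a plain dictionary standing in for a real database. The idea it shows is that a sink which accepts pushed events can also serve pull-based ingestion, because a crawler is just a loop that reads a source and pushes what it finds.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class MetadataEvent:
    """Hypothetical event: one dataset and its column names."""
    dataset: str
    columns: Tuple[str, ...]


@dataclass
class InMemorySink:
    """Hypothetical push-based sink: callers hand it events one at a time."""
    events: List[MetadataEvent] = field(default_factory=list)

    def emit(self, event: MetadataEvent) -> None:
        self.events.append(event)


def crawl_source(tables: Dict[str, Iterable[str]], sink: InMemorySink) -> int:
    """Pull-based ingestion layered on push: read each table, then emit it."""
    count = 0
    for name, columns in sorted(tables.items()):
        sink.emit(MetadataEvent(dataset=name, columns=tuple(columns)))
        count += 1
    return count


sink = InMemorySink()

# Push: a producer emits an event directly when something changes.
sink.emit(MetadataEvent(dataset="orders", columns=("id", "total")))

# Pull: a crawler scans a stand-in source and pushes what it finds.
fake_mysql = {"users": ["id", "email"], "payments": ["id", "amount"]}
crawled = crawl_source(fake_mysql, sink)
```

Both paths end in the same `emit` call, so the sink does not need to know whether an event came from a producer or a crawler. A system whose only entry point is a crawler has no equivalent hook for producers to call.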
source:
  type: "mysql"
  config:
    username: "datahub"
    password: "datahub"
    host_port: "localhost:3306"
sink:
  type: "datahub-rest"
  config:
    server: 'http://localhost:8080'
datahub ingest -c recipe.yml
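For a scheduler other than Airflow, one simple option is to wrap the command above in a small script that the scheduler calls. The Python sketch below is an assumption about how that might look, not an official integration: `build_ingest_command` and `run_ingestion` are hypothetical helper names, and the script assumes the `datahub` CLI is installed and on the PATH.

```python
import subprocess
from typing import List


def build_ingest_command(recipe_path: str) -> List[str]:
    """Return the argv for the CLI call shown above."""
    return ["datahub", "ingest", "-c", recipe_path]


def run_ingestion(recipe_path: str) -> int:
    """Run the ingestion and return its exit code.

    Assumes the `datahub` executable is available on the PATH.
    """
    completed = subprocess.run(build_ingest_command(recipe_path), check=False)
    return completed.returncode


# Building the command has no side effects; run_ingestion() executes it.
command = build_ingest_command("recipe.yml")
```

A cron job or any other scheduler could then call `run_ingestion("recipe.yml")` and treat a non-zero return value as a failed run.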
Discover Trusted Data
Browse and search over a continuously updated catalog of datasets, dashboards, charts, ML models, and more.
Understand Data in Context
DataHub is the one-stop shop for documentation, schemas, ownership, lineage, pipelines, and usage information. Data quality and data preview information are coming soon.