Kafka Connect

For context on getting started with ingestion, check out our metadata ingestion guide.


To install this plugin, run pip install 'acryl-datahub[kafka-connect]'.


This plugin extracts the following:

  • Each Kafka Connect connector, as an individual DataFlowSnapshotClass entity
  • An individual DataJobSnapshotClass entity per source dataset, named {connector_name}:{source_dataset}
  • Lineage from the source database to the Kafka topic
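To make the naming scheme concrete, here is a minimal Python sketch of how a job name and a lineage edge could be derived. The helper names, the example connector, and the example table are hypothetical. The dataset URN layout follows DataHub's general urn:li:dataset form and is an assumption here, not something this page specifies.

```python
from typing import Tuple


def data_job_id(connector_name: str, source_dataset: str) -> str:
    """Build a job id using the {connector_name}:{source_dataset} naming."""
    return f"{connector_name}:{source_dataset}"


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Assumed DataHub dataset URN layout (not taken from this page)."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def lineage_edge(
    source_platform: str, source_dataset: str, topic: str, env: str = "PROD"
) -> Tuple[str, str]:
    """Return an (upstream, downstream) pair: source table -> Kafka topic."""
    upstream = dataset_urn(source_platform, source_dataset, env)
    downstream = dataset_urn("kafka", topic, env)
    return upstream, downstream


# Hypothetical connector and table names, for illustration only.
job = data_job_id("mysql-connector", "librarydb.member")
edge = lineage_edge("mysql", "librarydb.member", "librarydb.member")
```

The sketch only builds identifiers; the plugin itself emits the corresponding entities during ingestion.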

Current limitations:

  • Only JDBC and Debezium source connectors are supported.

Quickstart recipe

Check out the following recipe to get started with ingestion! See below for full configuration options.

For general pointers on writing and running a recipe, see our main recipe guide.

source:
  type: "kafka-connect"
  config:
    # Coordinates
    connect_uri: "http://localhost:8083"
    cluster_name: "connect-cluster"

    # Credentials
    username: admin
    password: password

sink:
  # sink configs
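The same recipe can also be assembled as a Python dictionary, which may be convenient when the values come from code rather than a YAML file. This is a sketch under assumptions: build_recipe is a hypothetical helper, and the console sink in the usage lines is only a stand-in, since the recipe here leaves the sink unspecified.

```python
from typing import Any, Dict, Optional


def build_recipe(
    sink: Dict[str, Any],
    connect_uri: str = "http://localhost:8083",
    cluster_name: str = "connect-cluster",
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a kafka-connect recipe mirroring the YAML layout."""
    config: Dict[str, Any] = {
        "connect_uri": connect_uri,
        "cluster_name": cluster_name,
    }
    # Credentials are only included when supplied.
    if username is not None:
        config["username"] = username
    if password is not None:
        config["password"] = password
    return {"source": {"type": "kafka-connect", "config": config}, "sink": sink}


# Placeholder sink (an assumption); substitute your real sink configuration.
recipe = build_recipe(
    sink={"type": "console"}, username="admin", password="password"
)

# With acryl-datahub installed, a dict like this can typically be run with:
#   from datahub.ingestion.run.pipeline import Pipeline
#   Pipeline.create(recipe).run()
```

Keeping the credentials optional avoids writing empty username and password keys into the recipe when the Connect cluster is unauthenticated.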

Config details

Note that a . is used to denote nested fields in the YAML recipe.

| Field | Default | Description |
| --- | --- | --- |
| connect_uri | "http://localhost:8083/" | URI to connect to. |
| username | | Kafka Connect username. |
| password | | Kafka Connect password. |
| cluster_name | "connect-cluster" | Cluster to ingest from. |
| construct_lineage_workunits | True | Whether to create the input and output Dataset entities. |
| connector_patterns.deny | | List of regex patterns for connectors to exclude from ingestion. |
| connector_patterns.allow | | List of regex patterns for connectors to include in ingestion. |
| connector_patterns.ignoreCase | True | Whether to ignore case during pattern matching. |
| env | "PROD" | Environment to use in namespace when constructing URNs. |
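The allow and deny patterns work together as a filter on connector names: allow patterns select connectors to ingest and deny patterns exclude them. The following Python sketch shows one plausible reading of those semantics. The function name is hypothetical. Three behaviors are assumptions rather than documented here: that an empty allow list means "allow everything", that deny takes precedence over allow, and that patterns are anchored at the start of the name (re.match).

```python
import re
from typing import Sequence


def connector_allowed(
    name: str,
    allow: Sequence[str] = (".*",),  # assumed default: allow everything
    deny: Sequence[str] = (),
    ignore_case: bool = True,
) -> bool:
    """Return True if the connector name passes the allow/deny filter."""
    flags = re.IGNORECASE if ignore_case else 0
    # Assumption: a deny match always wins over an allow match.
    if any(re.match(pattern, name, flags) for pattern in deny):
        return False
    return any(re.match(pattern, name, flags) for pattern in allow)


# Hypothetical connector names, for illustration only.
names = ["mysql-source", "MySQL-audit", "postgres-source"]
selected = [
    n for n in names
    if connector_allowed(n, allow=["mysql-.*"], deny=[".*-audit"])
]
```

With ignore_case left at True, "MySQL-audit" matches the allow pattern but is still dropped by the deny pattern, so only "mysql-source" is selected in this example.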


Coming soon!


If you've got any questions on configuring this source, feel free to ping us on our Slack!