Metabase

Certified

Important Capabilities

Capability | Notes
Platform Instance | Enabled by default
Table-Level Lineage | Supported by default

This plugin extracts charts, dashboards, and associated metadata. It is in beta and has only been tested on PostgreSQL and H2 databases.

Collection

The /api/collection endpoint is used to retrieve the available collections.

The /api/collection/<COLLECTION_ID>/items?models=dashboard endpoint is used to retrieve a given collection and list its dashboards.

Dashboard

The /api/dashboard/<DASHBOARD_ID> endpoint is used to retrieve a given dashboard and extract the following information:

  • Title and description
  • Last edited by
  • Owner
  • Link to the dashboard in Metabase
  • Associated charts
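
The collection and dashboard endpoints above can be chained: list the collections, list each collection's dashboards, then fetch each dashboard. The following is a minimal sketch that only builds the request URLs. The helper names and the example host are hypothetical and not part of the plugin, and no request is sent.

```python
from urllib.parse import urlencode


def collections_url(connect_uri: str) -> str:
    """URL that lists the available collections."""
    return f"{connect_uri.rstrip('/')}/api/collection"


def collection_dashboards_url(connect_uri: str, collection_id) -> str:
    """URL that lists the dashboards in one collection."""
    query = urlencode({"models": "dashboard"})
    return f"{connect_uri.rstrip('/')}/api/collection/{collection_id}/items?{query}"


def dashboard_url(connect_uri: str, dashboard_id) -> str:
    """URL that returns the details of one dashboard."""
    return f"{connect_uri.rstrip('/')}/api/dashboard/{dashboard_id}"


if __name__ == "__main__":
    base = "http://localhost:3000"  # hypothetical Metabase host
    print(collections_url(base))
    print(collection_dashboards_url(base, 7))
    print(dashboard_url(base, 12))
```

Stripping a trailing slash from the base URL keeps the paths well-formed whether or not connect_uri is written with one.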

Chart

The /api/card endpoint is used to retrieve the following information:

  • Title and description
  • Last edited by
  • Owner
  • Link to the chart in Metabase
  • Datasource and lineage

The following chart properties are ingested into DataHub:

Name | Description
Dimensions | Column names
Filters | Any filters applied to the chart
Metrics | All columns that are being used for aggregation

CLI based Ingestion

Install the Plugin

pip install 'acryl-datahub[metabase]'

Config Details

Note that a . is used to denote nested fields in the YAML recipe.

Field | Type | Description | Default
connect_uri | string | Metabase host URL. | localhost:3000
database_alias_map | object | Database name map to use when constructing dataset URN. |
database_id_to_instance_map | map(str,string) | |
default_schema | string | Default schema name to use when schema is not provided in an SQL query. | public
display_uri | string | Optional URL to use in links (if connect_uri is only for ingestion). |
engine_platform_map | map(str,string) | |
exclude_other_user_collections | boolean | If true, exclude other users' collections. | False
password | string(password) | Metabase password. |
platform_instance_map | map(str,string) | |
username | string | Metabase username. |
env | string | The environment that all assets produced by this connector belong to. | PROD
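
To show how these fields fit together, here is a minimal sketch that assembles a recipe as a Python dictionary mirroring the YAML layout. The function name is hypothetical. The source type string "metabase", the "datahub-rest" sink, and its "server" key are assumptions not stated on this page, so check them against your DataHub version.

```python
def build_metabase_recipe(connect_uri: str, username: str, password: str,
                          sink_server: str) -> dict:
    """Assemble a recipe dict; only fields from the table above are set."""
    return {
        "source": {
            "type": "metabase",  # assumed source type name
            "config": {
                "connect_uri": connect_uri,
                "username": username,
                "password": password,
                # The next three values repeat the documented defaults.
                "default_schema": "public",
                "exclude_other_user_collections": False,
                "env": "PROD",
            },
        },
        # Assumed sink definition; replace with the sink you actually use.
        "sink": {"type": "datahub-rest", "config": {"server": sink_server}},
    }


if __name__ == "__main__":
    recipe = build_metabase_recipe(
        "http://localhost:3000", "metabase_user", "change-me",
        "http://localhost:8080",
    )
    print(recipe["source"]["config"]["connect_uri"])
```

In practice the password would normally come from an environment variable or a secret store rather than a literal.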

Metabase databases will be mapped to a DataHub platform based on the engine listed in the api/database response. This mapping can be customized by using the engine_platform_map config option. For example, to map databases using the athena engine to the underlying datasets in the glue platform, the following snippet can be used:

  engine_platform_map:
    athena: glue
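
The effect of engine_platform_map can be pictured as a dictionary lookup. The sketch below is illustrative only: the function name is hypothetical, and falling back to the engine name when no override exists is an assumption (the plugin may apply its own built-in mappings first).

```python
from typing import Dict, Optional


def resolve_platform(engine: str,
                     engine_platform_map: Optional[Dict[str, str]] = None) -> str:
    """Return the DataHub platform for a Metabase engine name."""
    overrides = engine_platform_map or {}
    # Assumption: an engine without an override maps to a platform of the same name.
    return overrides.get(engine, engine)


if __name__ == "__main__":
    print(resolve_platform("athena", {"athena": "glue"}))
    print(resolve_platform("postgres", {"athena": "glue"}))
```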

DataHub tries to determine the database name from the Metabase api/database payload. However, the name can be overridden via database_alias_map for a given database connected to Metabase.

If several platform instances of the same platform (e.g. several distinct ClickHouse clusters) are present in DataHub, the mapping between the database id in Metabase and the platform instance in DataHub can be configured with the following map:

  database_id_to_instance_map:
    "42": platform_instance_in_datahub

Keys in this map must be strings, not integers, even though the Metabase API returns the id as a number. If database_id_to_instance_map is not specified, platform_instance_map is used for platform instance mapping. If neither is specified, no platform instance is used when constructing URNs to look up dataset relations.
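
The precedence described above can be sketched as follows. This is not the plugin's code: the function name is hypothetical, and treating a database id that is missing from database_id_to_instance_map the same as an unspecified map is an assumption of this sketch.

```python
from typing import Dict, Optional


def resolve_platform_instance(
    database_id: int,
    platform: str,
    database_id_to_instance_map: Optional[Dict[str, str]] = None,
    platform_instance_map: Optional[Dict[str, str]] = None,
) -> Optional[str]:
    """Pick the platform instance for a Metabase database, or None."""
    if database_id_to_instance_map:
        # Metabase returns numeric ids, but the map is keyed by strings.
        instance = database_id_to_instance_map.get(str(database_id))
        if instance is not None:
            return instance
    if platform_instance_map:
        return platform_instance_map.get(platform)
    # Neither map applies: no platform instance is used in the URN.
    return None


if __name__ == "__main__":
    print(resolve_platform_instance(
        42, "clickhouse",
        {"42": "platform_instance_in_datahub"},
        {"clickhouse": "default_instance"},
    ))
```

The explicit str(database_id) conversion is the reason the recipe must quote the key ("42" rather than 42): an integer key would never match the lookup.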

If needed, collections belonging to other users can be excluded with the following configuration:

exclude_other_user_collections: true

Compatibility

Metabase version v0.48.3

Code Coordinates

  • Class Name: datahub.ingestion.source.metabase.MetabaseSource

Questions

If you've got any questions on configuring ingestion for Metabase, feel free to ping us on our Slack.