
How to onboard an entity?

Refer to this doc if you're only interested in adding a new aspect to an existing entity.

Currently, DataHub supports only three entity types: datasets, users, and groups. If you want to extend DataHub for your own use cases, such as metrics, charts, or dashboards, follow the steps below in order.

The following diagram helps you visualize the process.

[Figure: onboard-a-new-entity]

1. Define URN

Refer to the URN documentation for how to define a URN for your entity.
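As a rough illustration of what a URN for a new entity involves, the sketch below models a URN for a hypothetical "metric" entity as a small standalone Java class. The `MetricUrn` name, the `metric` entity type, and the single-field key are assumptions made for this example; a real DataHub URN class would follow the project's URN definition and build on its own base classes instead of being written from scratch.

```java
/**
 * Illustrative only: a standalone URN for a hypothetical "metric" entity.
 * This is not DataHub's URN implementation; it only shows the idea of a
 * typed, parseable identifier of the form urn:li:<entityType>:<key>.
 */
final class MetricUrn {
    // Hypothetical entity type; the "urn:li:" prefix matches DataHub's URN namespace.
    static final String ENTITY_TYPE = "metric";
    private static final String PREFIX = "urn:li:" + ENTITY_TYPE + ":";

    private final String name;

    MetricUrn(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("Metric name must be non-empty");
        }
        this.name = name;
    }

    String getName() {
        return name;
    }

    /** Parses a string such as "urn:li:metric:daily_active_users". */
    static MetricUrn parse(String raw) {
        if (raw == null || !raw.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a metric URN: " + raw);
        }
        return new MetricUrn(raw.substring(PREFIX.length()));
    }

    /** Serializes back to the canonical string form. */
    @Override
    public String toString() {
        return PREFIX + name;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof MetricUrn && name.equals(((MetricUrn) other).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}
```

With this sketch, `MetricUrn.parse("urn:li:metric:daily_active_users").getName()` returns `daily_active_users`, and calling `toString()` on the result yields the original string.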

2. Model your metadata

Refer to the metadata modelling section. Make sure to do the following:

  1. Define Aspect models.
  2. Define aspect union model. Refer to DatasetAspect as an example.
  3. Define Snapshot model. Refer to DatasetSnapshot as an example.
  4. Add your newly defined snapshot to Snapshot Union model.
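The relationship between these models can be sketched in plain Java. This is only a conceptual illustration: the actual models are schema definitions (see DatasetAspect and DatasetSnapshot), and every name below (`MetricAspect`, `MetricInfo`, `MetricOwner`, `MetricSnapshot`) is hypothetical. The marker interface stands in for the aspect union, and the snapshot pairs an entity URN with a list of aspects drawn from that union.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stands in for the aspect union: only types implementing this marker
// may be attached to the hypothetical metric entity.
interface MetricAspect {}

// A hypothetical aspect carrying descriptive information.
final class MetricInfo implements MetricAspect {
    final String description;

    MetricInfo(String description) {
        this.description = description;
    }
}

// A second hypothetical aspect, referencing its owner by URN string.
final class MetricOwner implements MetricAspect {
    final String ownerUrn;

    MetricOwner(String ownerUrn) {
        this.ownerUrn = ownerUrn;
    }
}

// Stands in for the snapshot model: the entity URN plus its aspects.
final class MetricSnapshot {
    final String urn;
    final List<MetricAspect> aspects;

    MetricSnapshot(String urn, List<MetricAspect> aspects) {
        this.urn = urn;
        // Defensive copy so the snapshot cannot be changed after creation.
        this.aspects = Collections.unmodifiableList(new ArrayList<>(aspects));
    }
}
```

A snapshot for one metric could then be built as `new MetricSnapshot("urn:li:metric:daily_active_users", List.of(new MetricInfo("Daily active users"), new MetricOwner("urn:li:corpuser:jdoe")))` (using `List.of` from Java 9 or later). The final step, adding the new snapshot to the Snapshot Union model, corresponds to registering `MetricSnapshot` alongside the existing entity snapshots.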

3. GMA search onboarding

Refer to search onboarding if you need to search the entity.

4. GMA graph onboarding

Refer to graph onboarding if you need to perform graph queries against the entity.

5. Add rest.li resource endpoints

See CorpUsers for an example of a top-level resource endpoint. Optionally, add an aspect-specific sub-resource endpoint such as CorpUsersEditableInfoResource.

If you want to use this new entity type from the ingestion framework's REST-based sink, you'll need to add the new endpoint to the resource list.

6. Configure dependency injection

GMS uses the Spring Framework for dependency injection. You'll need to add factories to create any custom DAOs used by the rest.li endpoint. You'll also need to add any custom package to the base-package attribute of the <context:component-scan> tag in beans.xml.

7. UI for entity onboarding [WIP]