
About DataHub Personal Access Tokens

Feature Availability
Self-Hosted DataHub
Managed DataHub

Personal Access Tokens, or PATs for short, allow users to represent themselves in code and use DataHub's APIs programmatically in deployments where security is a concern.

Used alongside an authentication-enabled metadata service, PATs add a layer of protection to DataHub: only authorized users can perform actions in an automated way.

Personal Access Tokens Setup, Prerequisites, and Permissions

To use PATs, two things are required:

  1. Metadata Authentication must have been enabled in GMS. See Configuring Metadata Service Authentication in authentication-enabled metadata service.
  2. Users must have been granted the Generate Personal Access Tokens or Manage All Access Tokens Privilege via a DataHub Policy.

Once configured, users should be able to navigate to 'Settings' > 'Access Tokens' > 'Generate Personal Access Token' to generate a token:

If you have configured permissions correctly, the Generate new token button should be clickable.

note

If you see the message "Token based authentication is currently disabled. Contact your DataHub administrator to enable this feature.", you must enable authentication in the metadata service (step 1 of the prerequisites).

Creating Personal Access Tokens

Once in the Manage Access Tokens Settings Tab:

  1. Click Generate new token; a form should appear.

  2. Fill out the information as needed and click Create.

  3. Save the token text somewhere secure. You will need it later to authenticate requests.

Using Personal Access Tokens

Once a token has been generated, the user who created it can make authenticated HTTP requests to the DataHub frontend proxy or to DataHub GMS directly, provided they have permission to perform the requested action. To authenticate, supply the generated Access Token as a Bearer token in the Authorization header:

Authorization: Bearer <generated-access-token> 

For example, using curl against the frontend proxy (preferred in production):

curl 'http://localhost:9002/api/gms/entities/urn:li:corpuser:datahub' -H 'Authorization: Bearer <access-token>'

or against the Metadata Service directly:

curl 'http://localhost:8080/entities/urn:li:corpuser:datahub' -H 'Authorization: Bearer <access-token>'
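The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_authenticated_request` and `fetch_status` are made up for this example, and the URL and token are placeholders to replace with your own values.

```python
import urllib.request


def build_authenticated_request(url: str, token: str) -> urllib.request.Request:
    """Return a GET request that carries the PAT as a Bearer token."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
    )


def fetch_status(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code of the response."""
    with urllib.request.urlopen(request) as response:
        return response.status


# Example (not executed here): request an entity through the frontend proxy.
# req = build_authenticated_request(
#     "http://localhost:9002/api/gms/entities/urn:li:corpuser:datahub",
#     "<access-token>",
# )
# print(fetch_status(req))
```

Keeping the token out of source code (for example, reading it from an environment variable of your choosing) avoids leaking it through version control.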

Because authorization happens at the GMS level, ingestion is also protected by access tokens. To use a token during ingestion, add it to the sink configuration of your ingestion recipe.
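As a rough illustration of a sink config that carries a token, the sketch below builds a recipe as a plain Python dictionary. It assumes the `datahub-rest` sink with `server` and `token` config keys; `build_recipe` is a hypothetical helper, and the source section is a placeholder to replace with your real source configuration.

```python
import json


def build_recipe(source: dict, server: str, token: str) -> dict:
    """Return an ingestion recipe whose REST sink authenticates with a PAT."""
    return {
        "source": source,  # placeholder: your actual source config goes here
        "sink": {
            "type": "datahub-rest",  # assumed sink type
            "config": {
                "server": server,  # GMS address, e.g. http://localhost:8080
                "token": token,    # the generated personal access token
            },
        },
    }


def recipe_as_json(recipe: dict) -> str:
    """Serialize the recipe, e.g. to write it to a file or inspect it."""
    return json.dumps(recipe, indent=2)
```

The resulting structure mirrors what a YAML recipe file would contain, with the token nested under the sink's `config` section.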

note

If Metadata Service Authentication is enabled, programmatic requests made without an access token receive a 401 response from the server.
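A client can surface this case explicitly. The sketch below is one possible way to do it with the standard library; `request_status` and `explain_status` are hypothetical helper names, and the messages are illustrative wording rather than anything DataHub returns.

```python
import urllib.error
import urllib.request


def request_status(request: urllib.request.Request) -> int:
    """Send the request and return its status code, including error codes."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still available.
        return err.code


def explain_status(status: int, sent_token: bool) -> str:
    """Map a status code to a short hint about what to check next."""
    if status == 401 and not sent_token:
        return "401: no access token was sent; add an Authorization: Bearer header."
    if status == 401:
        return "401: the token was rejected, or its user lacks permission for this action."
    if 200 <= status < 300:
        return f"{status}: request succeeded."
    return f"{status}: request failed for a reason unrelated to authentication."
```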

Additional Resources

GraphQL

FAQ and Troubleshooting

The button to create tokens is greyed out - why can’t I click on it?

This means that the user currently logged in to DataHub has neither the Generate Personal Access Tokens nor the Manage All Access Tokens privilege. Ask your DataHub administrator to grant you one of them.

When using a token, I get 401 unauthorized - why?

A PAT represents a user in DataHub. If that user does not have permission for a given action, neither does the token.

Can I create a PAT that represents some other user?

Yes, although not through the UI currently. You will have to use the token management GraphQL API, and the user making the request must have the Manage All Access Tokens privilege.
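A request of this kind might look like the following sketch. Treat the details as assumptions to verify against your deployment's GraphQL schema: the mutation name `createAccessToken`, the input fields (`type`, `actorUrn`, `duration`, `name`), the enum values, and the GraphQL endpoint URL are not confirmed by this page, and `build_create_token_request` is a hypothetical helper.

```python
import json
import urllib.request

# Assumed mutation shape; check it against your DataHub GraphQL schema.
CREATE_TOKEN_MUTATION = """
mutation createAccessToken($input: CreateAccessTokenInput!) {
  createAccessToken(input: $input) {
    accessToken
  }
}
"""


def build_create_token_request(
    graphql_url: str,
    admin_token: str,
    actor_urn: str,
    name: str,
    duration: str = "ONE_MONTH",  # assumed enum value
) -> urllib.request.Request:
    """Build a POST request asking DataHub to mint a PAT for another user."""
    payload = {
        "query": CREATE_TOKEN_MUTATION,
        "variables": {
            "input": {
                "type": "PERSONAL",      # assumed token type
                "actorUrn": actor_urn,   # the user the new token represents
                "duration": duration,
                "name": name,
            }
        },
    }
    return urllib.request.Request(
        graphql_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The caller's own token; its user needs Manage All Access Tokens.
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request is only constructed here; sending it with `urllib.request.urlopen` and reading `accessToken` from the JSON response is left to the caller.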

Need more help? Join the conversation in Slack!