There are two ways to delete data from DataHub: deleting a single entity by its urn, and rolling back an ingestion run.
The CLI points to a DataHub instance on localhost by default. Running
datahub init
lets you choose the DataHub instance the CLI communicates with.
Note: Provide your GMS instance's host when the prompt asks you for the DataHub host.
Alternatively, if you don't want to use a config file, you can set environment variables instead.
Environment variables take precedence over what is in the config file.
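As a rough sketch of the environment-variable approach, the exports below use the variable names documented for the DataHub CLI at the time of writing. Names can differ between CLI versions, so check the documentation for your version. The host value is an assumption that matches a default local GMS.

```shell
# Skip the config file entirely and rely on environment variables.
export DATAHUB_SKIP_CONFIG=True

# Assumed value: a GMS instance running locally. Replace with your GMS host.
export DATAHUB_GMS_HOST=http://localhost:8080

# Leave empty if your instance does not require an access token.
export DATAHUB_GMS_TOKEN=
```

These exports only last for the current shell session. Add them to your shell profile if you want them to persist.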
To delete all the data related to a single entity, run
datahub delete --urn "<my urn>"
Note: Make sure you surround your urn with quotes! Urns often contain parentheses and commas, so without the quotes your terminal may misinterpret the command.
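One way to avoid quoting mistakes is to keep the urn in a shell variable. The sketch below uses a hypothetical dataset urn (the table name example_table is made up) and prints the command before running it, so you can confirm the urn arrives as a single argument.

```shell
# Hypothetical urn, for illustration only. Single quotes stop the shell
# from interpreting the parentheses and commas.
URN='urn:li:dataset:(urn:li:dataPlatform:hive,example_table,PROD)'

# Preview the exact command that would be run.
printf 'datahub delete --urn "%s"\n' "$URN"

# When the preview looks right, run it with the double quotes kept in place:
# datahub delete --urn "$URN"
```

The delete command is left commented out so the snippet is safe to paste. Uncomment it only once the previewed urn is the one you intend to remove.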
Whenever you run
datahub ingest -c ..., all the metadata ingested with that run will have the same run id.
To view the ids of the most recent set of ingestion batches, execute
datahub ingest list-runs
That will print out a table of all the runs. Once you have an idea of which run you want to roll back, run
datahub ingest show --run-id <run-id>
to see more information about the run.
Finally, run
datahub ingest rollback --run-id <run-id>
to roll back all aspects added by this run and all entities created by it.
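The inspect-then-rollback sequence can be wrapped in a small dry-run script, sketched below. The `run` helper and the `DRY_RUN` and `RUN_ID` variables are hypothetical names introduced here and are not part of the DataHub CLI. The default run id is a placeholder, so set `RUN_ID` to a value copied from the `datahub ingest list-runs` table.

```shell
# Placeholder run id; replace it with an id taken from `datahub ingest list-runs`.
RUN_ID="${RUN_ID:-example-run-id}"

# Hypothetical helper: prints each command, and executes it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Inspect the run first, then roll it back.
run datahub ingest show --run-id "$RUN_ID"
run datahub ingest rollback --run-id "$RUN_ID"
```

By default the script only prints the commands. Setting `DRY_RUN=0` executes them, which is a deliberate extra step because a rollback removes metadata.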