Make sure to allocate enough hardware resources to the Docker engine. Tested & confirmed config: 2 CPUs, 8GB RAM, 2GB swap area.
You can easily download and run all these images and their dependencies with our quick start guide.
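For reference, the quick start flow boils down to installing the DataHub CLI and running one command (this sketch assumes Python 3 with pip and a running Docker daemon are available):

```shell
# Install the DataHub CLI from the acryl-datahub Python package
python3 -m pip install --upgrade acryl-datahub

# Pull and start all DataHub images and their dependencies locally
datahub docker quickstart
```

Once the containers are healthy, the DataHub UI is served locally (by default on port 9002).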
DataHub Docker Images:
Do not use `debug` tags for any of the images, as those are not supported and are present only for legacy reasons. Please use `head` or version-specific tags like `v0.8.40`. For production we recommend using version-specific tags, not `head`.
- `linkedin/datahub-ingestion` - This contains the Python CLI. If you are looking for a Docker image for every minor CLI release, you can find them under `acryldata/datahub-ingestion`.
- `acryldata/datahub-actions` - Do not use `acryldata/acryl-datahub-actions`, as that is deprecated and no longer used.
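To illustrate the tagging policy above, pulling looks like this (the `v0.8.40` tag is only an example; substitute whichever release you are deploying):

```shell
# For development: `head` tracks the latest build from master
docker pull acryldata/datahub-actions:head

# For production: pin to a specific release tag, never `head`
docker pull acryldata/datahub-ingestion:v0.8.40
```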
Ingesting demo data
If you want to test ingesting some data once DataHub is up, use the `./docker/ingestion/ingestion.sh` script or `datahub docker ingest-sample-data`. See the quickstart guide for more details.
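The two equivalent ways to load the demo metadata described above look like this (both assume the quickstart containers are already up):

```shell
# Option 1: use the CLI to load the bundled sample metadata
datahub docker ingest-sample-data

# Option 2: run the ingestion script from the repository root
./docker/ingestion/ingestion.sh
```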
Using Docker Images During Development
Building And Deploying Docker Images
We use GitHub Actions to build and continuously deploy our images. There should be no need to do this manually; a successful release on GitHub will automatically publish the images.
This is not our recommended development flow and most developers should be following the Using Docker Images During Development guide.
To build the full images (that we are going to publish), you need to run the following:
```shell
COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose -p datahub build
```
This is because we're relying on BuildKit for multi-stage builds. It does not hurt to also set
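If you build images frequently, exporting the BuildKit variables once per shell session saves repeating them on every invocation (same variables as the build command above):

```shell
# Enable BuildKit for both `docker build` and `docker-compose build`
export DOCKER_BUILDKIT=1
export COMPOSE_DOCKER_CLI_BUILD=1

# Build the full set of DataHub images under the `datahub` project name
docker-compose -p datahub build
```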