Infer Docker Registry Hashes for Local Image Layers
In recent weeks I spent some time on security analysis of Docker container images in an environment that used multiple container registries. The goal of the project was to ensure that application images are built against known-good / certified base images. One unforeseen factor complicated this work: the organizationally approved base images reside in an old Quay Enterprise 2.9.x server that does not support the latest Docker registry API (Image Manifest Version 2, Schema 2). That ruled out a simple check of image layer hashes, because the hashes are calculated differently and don't match up.
To get around this I crafted a solution that calculates the 'new' hash for each layer of the approved base images and uses the calculated hashes to compare against application images. If you want to jump to the code, see this repo: InferDockerRegistryHash. For more details, read on below.
References
- InferDockerRegistryHash [github.com/7thzero]
- Docker Registry HTTP API V2 [github.com/docker]
- The new stored format of Docker image on disk and Distribution [hustcat.github.io]
- SO, what is the way to compute the docker image id from a v2 registry? [github.com/deitch]
- v1.Manifest is no longer compatible with docker [github.com/opencontainers]
- go-digest [github.com/opencontainers]
- Red Hat Quay Release Notes [access.redhat.com]
- Image Manifest Version 2, Schema 2 [github.com/docker]
- Image Manifest Version 2, Schema 1 [docs.docker.com]
- Docker-Content-Digest in GET does not match docker push [github.com/docker]
- Local images ID does not match registry manifest digest [github.com/docker]
- Duplicate Blobsum in Manifest.FSLayers [github.com/docker]
- Docker Registry HTTP API V2 [docs.docker.com]
- How Docker calculates the hash of each layer? Is it deterministic? [stackoverflow.com]
- 1.10 Distribution Changes Design Doc: ID definitions and calculations [gist.github.com/aaronlehmann]
- Check if a specific layer exists in my private Docker registry [stackoverflow.com]
- Where are Docker Images Stored? Docker Container Paths Explained [freecodecamp.org]
- docker image - merged/diff/work/LowerDir components of GraphDriver [stackoverflow.com]
- Where does the docker images stored in local machine [stackoverflow.com]
- Docker's layer content hashing scheme doesn't follow the canonicalization rules [github.com/moby]
- Can you use the Docker Registry to recreate a Dockerfile? [medium.com]
- Downloading Docker Images from Docker Hub without using Docker [devops.stackexchange.com]
- How to compare two tarball's content [stackoverflow.com]
- tar without preserving user [duplicate] [unix.stackexchange.com]
- Source file src/archive/tar/example_test.go [golang.org]
- Golang tarring directory [stackoverflow.com]
- How to calculate checksum of a file in GO [stackoverflow.com]
- Streaming IO in Go [medium.com]
- How to Tar and Un-tar Files in Golang [medium.com]
- Reading through a tar.gz file in Go / golang [gist.github.com/indraniel]
Strategy
The most straightforward solution in this case was to recalculate the image layer hashes for the base images, since there is a fixed number of base images and a near-infinite supply of derived application images. With the base image layer hashes calculated and stored, it is easy to compare the registry manifest of any application image against them without having to download each and every application image.
Boiled down to steps, it looks like this:
- Download each base image/tag that is considered 'good'/current/compliant
- Calculate the Docker registry hashes for each layer of the image
- Store the image layer hashes
- For application images, hit the registry API and download the manifest
- Compare the registry layer hashes in the manifest with the stored base image layer hashes
- If there's no match, it means the application image is not derived from an approved/current base image
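The comparison in the last two steps can be sketched in Go. This is a minimal sketch rather than the code from the linked repo. It assumes that a 'match' means the base image's layer hashes appear, in order, at the start of the application image's layer list. The function names and the short placeholder digests are hypothetical.

```go
package main

import "fmt"

// hasBasePrefix reports whether appLayers begins with every digest in
// baseLayers, in the same order. An empty base never matches.
func hasBasePrefix(appLayers, baseLayers []string) bool {
	if len(baseLayers) == 0 || len(baseLayers) > len(appLayers) {
		return false
	}
	for i, digest := range baseLayers {
		if appLayers[i] != digest {
			return false
		}
	}
	return true
}

// findApprovedBase returns the approved base image with the longest layer
// list that prefixes appLayers. ok is false when no approved base matches.
func findApprovedBase(appLayers []string, approved map[string][]string) (name string, ok bool) {
	bestLen := 0
	for candidate, layers := range approved {
		if len(layers) > bestLen && hasBasePrefix(appLayers, layers) {
			name, bestLen = candidate, len(layers)
		}
	}
	return name, bestLen > 0
}

func main() {
	// Placeholder digests for illustration only; real values are
	// "sha256:" followed by 64 hex characters.
	approved := map[string][]string{
		"base/debian:buster": {"sha256:aaa", "sha256:bbb"},
	}
	appLayers := []string{"sha256:aaa", "sha256:bbb", "sha256:ccc"}

	if name, ok := findApprovedBase(appLayers, approved); ok {
		fmt.Println("derived from approved base:", name)
	} else {
		fmt.Println("no approved base image found")
	}
}
```

Keeping the stored hashes as an ordered list per base image means the check needs only the application image's manifest, never its layer contents.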
Key Notes
1) While attempting to calculate Docker registry base image layer hashes, I found that I had to use golang to perform the operations. Attempting to use offline tools gave inconsistent results. YMMV: I was time constrained and ran with a solution in golang.
2) You can query the Docker registry API to obtain a manifest of all the layers (and the registry hashes for those layers) associated with each image. Here's a script that pulls the manifest for the 'redis' image using Docker Hub auth:
#!/bin/bash
#
# Constants
registryBase='https://registry-1.docker.io'
authBase='https://auth.docker.io'
authService='registry.docker.io'
#
# Approved/official images live in the 'library/$image' path
image="library/redis"
#
# Get a token to use to query the dockerhub API
# If you are targeting a private or internal docker registry, you may have basic auth or other authentication to account for!
token="$(curl -fsSL "$authBase/token?service=$authService&scope=repository:$image:pull" | jq --raw-output '.token')"
echo "$token"
#
# Pull the manifest
curl -fsSL -H "Authorization: Bearer $token" -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' "$registryBase/v2/$image/manifests/buster"
The manifest output looks like this:
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 6928,
    "digest": "sha256:50541622f4f179450f4acec7d16964499525932a81263eafb91e699671f58ee4"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 27098544,
      "digest": "sha256:6ec8c9369e08152361a01729f2c8a1e7aae898426c6e67267f41894bf9524827"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1729,
      "digest": "sha256:efe6cceb88f84ac331c72b04faa346f6b123e634f224e345b275b8cbccf185f3"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1417693,
      "digest": "sha256:cdb6bd1ce7c51976204e8123e897ffb59a1fc021ef30d861fdebf4e1578549be"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 9659754,
      "digest": "sha256:9d80498f79fe7167276124a6511df0dbcd0c38431a90d9134668131b10425c46"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 98,
      "digest": "sha256:b7cd40c9247bb101ac7830f19ef1d265235fef9b37fb2c5f3b49e9aabe8ceb88"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 407,
      "digest": "sha256:96403647fb55fbeb37f670b5db5cc679057353d4bc078c99bbc5307cfaceb64c"
    }
  ]
}
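To use a manifest like the one above in a comparison, the layer digests need to be pulled out in order. Here is a minimal Go sketch using only the standard library. The type and function names are hypothetical, and the embedded sample is the manifest above trimmed to its first two layers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// descriptor mirrors the mediaType/size/digest objects in the manifest.
type descriptor struct {
	MediaType string `json:"mediaType"`
	Size      int64  `json:"size"`
	Digest    string `json:"digest"`
}

// manifestV2 holds the fields of a Schema 2 image manifest used here.
type manifestV2 struct {
	SchemaVersion int          `json:"schemaVersion"`
	MediaType     string       `json:"mediaType"`
	Config        descriptor   `json:"config"`
	Layers        []descriptor `json:"layers"`
}

// layerDigests parses a Schema 2 manifest and returns its layer digests in
// manifest order. It rejects any other schemaVersion.
func layerDigests(raw []byte) ([]string, error) {
	var m manifestV2
	if err := json.Unmarshal(raw, &m); err != nil {
		return nil, err
	}
	if m.SchemaVersion != 2 {
		return nil, fmt.Errorf("unexpected schemaVersion %d", m.SchemaVersion)
	}
	digests := make([]string, 0, len(m.Layers))
	for _, layer := range m.Layers {
		digests = append(digests, layer.Digest)
	}
	return digests, nil
}

func main() {
	// The redis manifest shown above, trimmed to its first two layers.
	raw := []byte(`{
	  "schemaVersion": 2,
	  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
	  "layers": [
	    {"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", "size": 27098544,
	     "digest": "sha256:6ec8c9369e08152361a01729f2c8a1e7aae898426c6e67267f41894bf9524827"},
	    {"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", "size": 1729,
	     "digest": "sha256:efe6cceb88f84ac331c72b04faa346f6b123e634f224e345b275b8cbccf185f3"}
	  ]
	}`)

	digests, err := layerDigests(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for i, d := range digests {
		fmt.Printf("layer %d: %s\n", i, d)
	}
}
```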
3) It is technically possible for an image to have multiple manifests, even if it is uncommon.
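One case where a single tag resolves to more than one manifest is a manifest list (media type application/vnd.docker.distribution.manifest.list.v2+json), which points at one manifest per platform. Reading note 3 as referring to this case is an assumption. The Go sketch below picks the entry for a given platform out of a manifest list. The type and function names are hypothetical and the digests are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const manifestListType = "application/vnd.docker.distribution.manifest.list.v2+json"

// platform and manifestRef mirror the entries of a manifest list.
type platform struct {
	Architecture string `json:"architecture"`
	OS           string `json:"os"`
}

type manifestRef struct {
	MediaType string   `json:"mediaType"`
	Digest    string   `json:"digest"`
	Platform  platform `json:"platform"`
}

type manifestList struct {
	MediaType string        `json:"mediaType"`
	Manifests []manifestRef `json:"manifests"`
}

// selectManifest returns the digest of the manifest for osName/arch.
// ok is false when raw is not a manifest list or has no matching entry.
func selectManifest(raw []byte, osName, arch string) (digest string, ok bool, err error) {
	var list manifestList
	if err = json.Unmarshal(raw, &list); err != nil {
		return "", false, err
	}
	if list.MediaType != manifestListType {
		return "", false, nil
	}
	for _, m := range list.Manifests {
		if m.Platform.OS == osName && m.Platform.Architecture == arch {
			return m.Digest, true, nil
		}
	}
	return "", false, nil
}

func main() {
	// Placeholder manifest list for illustration only.
	raw := []byte(`{
	  "schemaVersion": 2,
	  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
	  "manifests": [
	    {"digest": "sha256:aaa", "platform": {"architecture": "amd64", "os": "linux"}},
	    {"digest": "sha256:bbb", "platform": {"architecture": "arm64", "os": "linux"}}
	  ]
	}`)

	digest, ok, err := selectManifest(raw, "linux", "amd64")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if ok {
		fmt.Println("manifest for linux/amd64:", digest)
	} else {
		fmt.Println("not a manifest list, or no entry for linux/amd64")
	}
}
```

The selected digest can then be requested from the same manifests endpoint in place of a tag to fetch the platform-specific manifest and its layers.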
Example/Prototype Library
To see the code, head on over to InferDockerRegistryHash.