Podman invocations using different --roots cannot share an --imagestore #2257

Open
seanlafferty-ibm opened this issue Feb 13, 2025 · 10 comments · May be fixed by #2258

seanlafferty-ibm commented Feb 13, 2025

Issue Description

I suspect this was never an intended feature, but our use case is CI runners. We have hundreds of replicas, each of which has podman installed. Jobs always start with container pulls (which is the slowest part of the job). If a job pulls an image, we'd like for it to be written to a shared imagestore for subsequent jobs to reuse.

I created an imgs volume on the host and mounted it into all of the runners. Then all the runners were given a storage.conf with the following contents:

[storage]
  driver = "overlay"
  imagestore = "/home/runner/imgs"
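For completeness, the shared volume reaches each runner roughly like this (a sketch only; the runner image name and flags are illustrative, not our exact invocation):

# create the shared volume once on the host, then mount it into every runner replica
podman volume create imgs
podman run -d -v imgs:/home/runner/imgs ci-runner-image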

My expectation was that any podman pull inside a runner would read from and write to this shared imagestore. However, pulls from different roots step on each other, corrupting the imagestore/root.

Steps to reproduce the issue

# pull with root1, shared imgstore (works)
podman --root /tmp/root1 --imagestore /tmp/imgs pull mirror.gcr.io/library/bash

# pull with root2, shared imgstore (doesn't work)
podman --root /tmp/root2 --imagestore /tmp/imgs pull mirror.gcr.io/library/bash

ERRO[0000] Image mirror.gcr.io/library/bash exists in local storage but may be corrupted (remove the image to resolve the issue): layer not known

Describe the results you received

Corrupted imagestore

Describe the results you expected

Non-corrupted imagestore

podman info output

host:
  arch: amd64
  buildahVersion: 1.37.6
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-1.el9.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: c0564282e9befb7804c3642230f8e94f1b2ba9f8'
  cpuUtilization:
    idlePercent: 95.6
    systemPercent: 1.94
    userPercent: 2.47
  cpus: 2
  databaseBackend: sqlite
  distribution:
    distribution: rhel
    version: "9.4"
  eventLogger: journald
  freeLocks: 2033
  hostname: slaffy-github1.fyre.ibm.com
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.14.0-427.42.1.el9_4.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 357359616
  memTotal: 3837394944
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.1-1.el9.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.1
    package: netavark-1.12.2-1.el9.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.2
  ociRuntime:
    name: crun
    package: crun-1.16.1-1.el9.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.16.1
      commit: afa829ca0122bd5e1d67f1f38e6cc348027e3c32
      rundir: /run/user/0/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240806.gee36266-2.el9.x86_64
    version: |
      pasta 0^20240806.gee36266-2.el9.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /run/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.3.1-1.el9.x86_64
    version: |-
      slirp4netns version 1.3.1
      commit: e5e368c4f5db6ae75c2fce786e31eef9da6bf236
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.2
  swapFree: 17118425088
  swapTotal: 17175670784
  uptime: 453h 46m 52.00s (Approximately 18.88 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 7
    paused: 0
    running: 7
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 250059685888
  graphRootUsed: 124696444928
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1970
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 5.2.2
  Built: 1737721907
  BuiltTime: Fri Jan 24 04:31:47 2025
  GitCommit: ""
  GoVersion: go1.22.9 (Red Hat 1.22.9-2.el9_5)
  Os: linux
  OsArch: linux/amd64
  Version: 5.2.2

Podman in a container

Yes

Privileged Or Rootless

Rootless

Upstream Latest Release

No

Additional environment details

No response

Additional information

No response

Luap99 commented Feb 13, 2025

I am not sure that imagestores are supposed to be shared between different podman instances; my understanding was that one uses additionalimagestores= for that. That, of course, is read-only and would not pull new images into the store.

cc @giuseppe

baude commented Feb 13, 2025

That is also my understanding... I think https://www.redhat.com/en/blog/image-stores-podman might be what you are after?

seanlafferty-ibm commented Feb 13, 2025

I was taking a bit of a logical leap based on how additionalimagestores behaves:

  1. two podmans that do not share storage = set different roots
  2. two podmans that share a pre-populated image cache = set different roots, but the same additionalimagestores
  3. two podmans that share a read/write image cache = set different roots, but the same imagestore?

I am doubtful imagestore was designed to serve that use case, but I couldn't find explicit documentation about it, so I figured I'd ask. It would be handy, since the read-only nature of additionalimagestores makes it hard to populate in environments where the images are not known ahead of time. 🙂
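To make the leap concrete, the three setups map to flags roughly like this (paths are placeholders; case 3 is what this issue is about, and it is the one that corrupts the store today):

# 1. isolated storage: nothing shared
podman --root /srv/storage-a pull mirror.gcr.io/library/bash
podman --root /srv/storage-b pull mirror.gcr.io/library/bash

# 2. shared, pre-populated, read-only cache
podman --root /srv/storage-a --storage-opt additionalimagestore=/srv/shared pull mirror.gcr.io/library/bash

# 3. shared read/write cache (this issue)
podman --root /srv/storage-a --imagestore /srv/shared pull mirror.gcr.io/library/bash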

seanlafferty-ibm commented Feb 14, 2025

Another issue with additionalimagestores is that podman pull does not seem to consult it when determining if an image already exists. So even if we populate additionalimagestores via some back channel, the RO store doesn't provide any benefit to jobs that explicitly invoke podman pull.

Ex:

podman --root /tmp/root1 pull mirror.gcr.io/library/bash
# pull again to the same root; it's cached as expected
podman --root /tmp/root1 pull mirror.gcr.io/library/bash

# use a different root, with an additionalimagestore pointing at the first root
# we download the image even though it's present in the RO additionalimagestore :(
podman --root /tmp/root2 --storage-opt additionalimagestore=/tmp/root1 pull mirror.gcr.io/library/bash

Not sure if that's expected behavior. It seems quirky, but on the other hand, if an image's existence in an additionalimagestore short-circuited the podman pull, you wouldn't be able to download the image explicitly to your root even if you wanted to. So maybe that's an intentional design decision?
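A possible mitigation on older versions would be to guard the pull explicitly (a sketch; this assumes podman image exists consults additional image stores, which it should since they are part of local storage, but I'd treat that as an assumption to verify):

img=mirror.gcr.io/library/bash
if ! podman --root /tmp/root2 --storage-opt additionalimagestore=/tmp/root1 image exists "$img"; then
  podman --root /tmp/root2 --storage-opt additionalimagestore=/tmp/root1 pull "$img"
fi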

@giuseppe

I've tried with Podman 5.3.2 and it works well for me, as the issue you've seen was fixed with defaae6

Can you please try with a newer version?
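If it helps when checking what the runners actually have installed, this should print just the client version (a small sketch, assuming the Go-template format flag available on recent podman):

podman version --format '{{.Client.Version}}'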

Luap99 commented Feb 14, 2025

I used the original reproducer on main yesterday and it failed.

$ bin/podman version
Client:       Podman Engine
Version:      5.5.0-dev
API Version:  5.5.0-dev
Go Version:   go1.22.10
Git Commit:   53c9100c72db4d612e05c230df8dc2e702c5f4da
Built:        Fri Feb 14 11:26:59 2025
OS/Arch:      linux/amd64
$ bin/podman --root /tmp/root1 --imagestore /tmp/imgs pull mirror.gcr.io/library/bash
Trying to pull mirror.gcr.io/library/bash:latest...
Getting image source signatures
Copying blob 1e14325cdcb9 done   | 
Copying blob 1f3e46996e29 done   | 
Copying blob dc5efae3f74e done   | 
Copying config 2a658e2e2b done   | 
Writing manifest to image destination
2a658e2e2bab7c6a5fb70fada1e493be70faf89a941fbb5fd52b69b439781c64
$ bin/podman --root /tmp/root2 --imagestore /tmp/imgs pull mirror.gcr.io/library/bash
ERRO[0000] Image mirror.gcr.io/library/bash exists in local storage but may be corrupted (remove the image to resolve the issue): layer not known 
Trying to pull mirror.gcr.io/library/bash:latest...
Getting image source signatures
Copying blob 1f3e46996e29 [================================>-----] 3.0MiB / 3.5MiB | 3.9 MiB/s
Copying blob dc5efae3f74e [==================>-------------------] 1.3MiB / 2.6MiB | 1.6 MiB/s
Copying blob 1f3e46996e29 done   | 
Copying blob dc5efae3f74e done   | 
Copying blob 1f3e46996e29 done   | 
Copying blob dc5efae3f74e done   | 
Copying blob 1f3e46996e29 done   | 
Copying blob dc5efae3f74e done   | 
Copying blob 1e14325cdcb9 done   | 
Copying config 2a658e2e2b done   | 
Writing manifest to image destination
2a658e2e2bab7c6a5fb70fada1e493be70faf89a941fbb5fd52b69b439781c64

@giuseppe

Yeah, sorry, I've only tested the last sequence of commands in #2257.

The issue is that we write layers.json to the root store; I'm not sure how easy it is to fix considering the messy locking we already have.
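To see the split concretely: with the overlay driver each store keeps its own layer metadata, and today the layer entries for an image pulled through --root /tmp/root1 land in the graphroot's layers.json instead of the imagestore's, so a second root sharing /tmp/imgs finds the image but not its layers ("layer not known"). The paths below assume the standard overlay layout from the reproducer and are only illustrative:

# layer metadata recorded under the per-invocation root
cat /tmp/root1/overlay-layers/layers.json
# layer metadata under the shared image store, which is where the R/O layers should be recorded
cat /tmp/imgs/overlay-layers/layers.json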

giuseppe transferred this issue from containers/podman Feb 14, 2025
@giuseppe

I think we need something like the following diff:

diff --git a/layers.go b/layers.go
index 69a4887bf..ed10340fa 100644
--- a/layers.go
+++ b/layers.go
@@ -51,6 +51,7 @@ type layerLocations uint8
 // unclean shutdown
 const (
 	stableLayerLocation layerLocations = 1 << iota
+	imageStoreLayerLocation
 	volatileLayerLocation
 
 	numLayerLocationIndex = iota
@@ -167,6 +168,9 @@ type Layer struct {
 	// volatileStore is true if the container is from the volatile json file
 	volatileStore bool `json:"-"`
 
+	// imageStore is true if the layer is from the image store json file
+	imageStore bool `json:"-"`
+
 	// BigDataNames is a list of names of data items that we keep for the
 	// convenience of the caller.  They can be large, and are only in
 	// memory when being read from or written to disk.
@@ -435,6 +439,9 @@ func layerLocation(l *Layer) layerLocations {
 	if l.volatileStore {
 		return volatileLayerLocation
 	}
+	if l.imageStore {
+		return imageStoreLayerLocation
+	}
 	return stableLayerLocation
 }
 
@@ -456,6 +463,7 @@ func copyLayer(l *Layer) *Layer {
 		CompressionType:    l.CompressionType,
 		ReadOnly:           l.ReadOnly,
 		volatileStore:      l.volatileStore,
+		imageStore:         l.imageStore,
 		BigDataNames:       copySlicePreferringNil(l.BigDataNames),
 		Flags:              copyMapPreferringNil(l.Flags),
 		UIDMap:             copySlicePreferringNil(l.UIDMap),
@@ -823,6 +831,9 @@ func (r *layerStore) load(lockedForWriting bool) (bool, error) {
 			if location == volatileLayerLocation {
 				layer.volatileStore = true
 			}
+			if location == imageStoreLayerLocation {
+				layer.imageStore = true
+			}
 			layers = append(layers, layer)
 			ids[layer.ID] = layer
 		}
@@ -1144,6 +1155,7 @@ func (s *store) newLayerStore(rundir, layerdir, imagedir string, driver drivers.
 		rundir:         rundir,
 		jsonPath: [numLayerLocationIndex]string{
 			filepath.Join(layerdir, "layers.json"),
+			filepath.Join(imagedir, "layers.json"),
 			filepath.Join(volatileDir, "volatile-layers.json"),
 		},
 		layerdir: layerdir,
@@ -1180,6 +1192,7 @@ func newROLayerStore(rundir string, layerdir string, driver drivers.Driver) (roL
 		mountsLockfile: nil,
 		rundir:         rundir,
 		jsonPath: [numLayerLocationIndex]string{
+			filepath.Join(layerdir, "layers.json"),
 			filepath.Join(layerdir, "layers.json"),
 			filepath.Join(layerdir, "volatile-layers.json"),
 		},
@@ -1422,6 +1435,7 @@ func (r *layerStore) create(id string, parentLayer *Layer, names []string, mount
 		GIDMap:             copySlicePreferringNil(moreOptions.GIDMap),
 		BigDataNames:       []string{},
 		volatileStore:      moreOptions.Volatile,
+		imageStore:         moreOptions.ImageStore,
 	}
 	layer.Flags[incompleteFlag] = true
 
diff --git a/store.go b/store.go
index 053c31c42..88e7e088e 100644
--- a/store.go
+++ b/store.go
@@ -667,6 +667,8 @@ type LayerOptions struct {
 	UncompressedDigest digest.Digest
 	// True is the layer info can be treated as volatile
 	Volatile bool
+	// True is the layer info must be written to the image store
+	ImageStore bool
 	// BigData is a set of items which should be stored with the layer.
 	BigData []LayerBigDataOption
 	// Flags is a set of named flags and their values to store with the layer.
@@ -1545,6 +1547,9 @@ func (s *store) putLayer(rlstore rwLayerStore, rlstores []roLayerStore, id, pare
 			GIDMap:         copySlicePreferringNil(gidMap),
 		}
 	}
+	if !writeable && s.imageStoreDir != "" {
+		options.ImageStore = true
+	}
 	return rlstore.create(id, parentLayer, names, mountLabel, nil, &options, writeable, diff, slo)
 }
 

giuseppe added a commit to giuseppe/storage that referenced this issue Feb 14, 2025
when an imagestore is used, a R/O layer must be written to the
layers.json under the imagestore, not graphroot.

The lock on the imagestore is already taken through the
multipleLockFile{} mechanism in place.

Closes: containers#2257

Signed-off-by: Giuseppe Scrivano <[email protected]>
giuseppe linked a pull request Feb 14, 2025 that will close this issue
@giuseppe

opened a PR: #2258

seanlafferty-ibm commented Feb 14, 2025

Thanks for looking into it! I do have a bit of a hacky workaround in the meantime that anyone else in the same boat may be interested in. I populate additionalimagestores on the fly via podman save | podman load at the end of my jobs. Obviously it's a bit resource-heavy, since it's compressing and decompressing images redundantly. For the morbidly curious:

Workaround

Step 1. Give all the runners an additionalimagestores entry that points at a shared volume (to begin with, nothing is in it):

[storage]
  driver = "overlay"
[storage.options]
  additionalimagestores=[ "/home/runner/.local/share/podman-pull-storage" ]

Step 2. On job completion, find all images that exist and copy them into the shared storage by finagling --root:

# collect every tagged image as a compact JSON object per line: {Id, Names: [all tags for that Id]}
images=$(podman images --format json | jq -c 'map(select((.Names | length > 0))) | group_by(.Id) | map({Id: .[0].Id, Names: [.[] | .Names] | flatten | unique }) | .[]')
pull_cache="/home/runner/.local/share/podman-pull-storage"
for image in $images
do
  id=$(jq -r '.Id' <<< "$image")
  names=$(jq -r '.Names | join(" ")' <<< "$image")
  echo "attempting to cache image with id: $id"
  if podman --root "$pull_cache" image exists "$id"; then
    echo "  image already exists in the cache"
    # the blobs are already cached; just make sure every tag we know about is present too
    for name in $names
    do
      if podman --root "$pull_cache" image inspect "$id" | jq --exit-status --arg imageName "$name" 'map(select(.RepoTags[] | contains($imageName))) | length == 0' > /dev/null; then
        echo "    adding missing tag to cache: $name"
        podman --root "$pull_cache" tag "$id" "$name"
      fi
    done
  else
    echo "  image not found in cache, writing it with tags: $names"
    # save/load recompresses the image, which is the resource-heavy part of this workaround
    podman save $id $names | podman --root "$pull_cache" load
  fi
done

Step 3. On subsequent runs, images will be resolved from the additionalimagestores if they exist (assuming Podman v5.3.2 or newer, where pull consults additional stores).
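A quick way to check that a later run actually sees the cache (this assumes the ReadOnly format field is available in your podman version; images that come from an additional store are reported as read-only):

podman images --format '{{.Repository}}:{{.Tag}} read-only={{.ReadOnly}}'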
