
MountVolume.MountDevice fails with "wrong fs type, bad option, bad superblock" after AKS cluster start #59

mloskot opened this issue Feb 24, 2025 · 2 comments



mloskot commented Feb 24, 2025

This is not a bug report against Seq or its Helm chart, but my attempt to discuss an issue I've been observing while evaluating Seq in a fairly typical Kubernetes cluster created with Azure Kubernetes Service (AKS). Normally, I wouldn't bother the Seq community about a general AKS or Kubernetes issue, but I've been running a number of clusters with a variety of deployments mounting PVs from Azure Storage (Azure Files and Azure Disks), and I have not observed such an issue with any of the applications I run. Although my problem may be caused by a bug in the Azure CSI driver, or by the fairly recent Kubernetes version I'm using on AKS, I thought I'd try to brainstorm it here first.

Context

To the point, here is my test environment where I'm evaluating Seq:

  • AKS cluster with Kubernetes 1.31.5
  • AKS hybrid node pools running both Linux and Windows
  • Azure Automation with a scheduled runbook that stops the AKS cluster every evening and starts it every morning - this may be considered an uncommon set-up
  • Flux deployed for GitOps
  • Seq deployed with mostly default Helm values and a dynamically provisioned PV from an Azure Disk

Here is the manifest with the Helm release of Seq using the official Helm chart, which Flux observes in my GitOps repository and reconciles:

---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: seq
  namespace: common
spec:
  interval: 10m
  chart:
    spec:
      chart: seq
      version: "2024.3.1"
      sourceRef:
        kind: HelmRepository
        name: datalust
        namespace: flux-system
      interval: 10m
  values:
    image:
      pullPolicy: IfNotPresent
      repository: datalust/seq
      tag: "2024.3"
    persistence:
      storageClass: managed-csi

where managed-csi is one of the built-in AKS storage classes.
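For reference, the built-in managed-csi class is roughly equivalent to the definition below; the parameters are from memory and may differ between AKS versions, so treat it as a sketch rather than the exact class AKS ships:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-csi
provisioner: disk.csi.azure.com
parameters:
  skuName: StandardSSD_LRS        # locally-redundant standard SSD managed disk
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer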

Problem

The first time Seq is deployed, everything is perfectly fine.

Then, almost every time after AKS is started in the morning, the Kubernetes cluster is resurrected with all pods except Seq, which fails due to this PV issue:

MountVolume.MountDevice failed for volume "pvc-11ab2543-35e2-44cc-88a4-2640bca6396e" : rpc error: code = Internal
desc = could not format /dev/disk/azure/scsi1/lun0(lun: 0), and mount it at /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/56dd34d1e64485d92930f8e0a3873a31e4030b01790c2c2f45d2de222c3a52b0/globalmount,
failed with mount failed: exit status 32 Mounting command: mount
Mounting arguments: -t ext4 -o defaults /dev/disk/azure/scsi1/lun0 /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/56dd34d1e64485d92930f8e0a3873a31e4030b01790c2c2f45d2de222c3a52b0/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/56dd34d1e64485d92930f8e0a3873a31e4030b01790c2c2f45d2de222c3a52b0/globalmount:
wrong fs type, bad option, bad superblock on /dev/sdb, missing codepage or helper program, or other error. dmesg(1) may have more information after failed mount system call.
kubectl describe pod -n common seq-7d575bc88b-n76rq
Name:             seq-7d575bc88b-n76rq
Namespace:        common
Priority:         0
Service Account:  default
Node:             aks-default-29286985-vmss00000j/10.3.0.4
Start Time:       Mon, 24 Feb 2025 08:07:31 +0100
Labels:           app=seq
                  pod-template-hash=7d575bc88b
                  release=seq
Annotations:      
Status:           Pending
IP:               
IPs:              
Controlled By:    ReplicaSet/seq-7d575bc88b
Containers:
  seq:
    Container ID:   
    Image:          datalust/seq:2024.3
    Image ID:       
    Ports:          5341/TCP, 80/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:ui/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:ui/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Startup:        http-get http://:ui/ delay=0s timeout=1s period=10s #success=1 #failure=30
    Environment:
      ACCEPT_EULA:  Y
    Mounts:
      /data from seq-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-96n6k (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   False 
  Initialized                 True 
  Ready                       False 
  ContainersReady             False 
  PodScheduled                True 
Volumes:
  seq-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  seq
    ReadOnly:   false
  kube-api-access-96n6k:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age                    From     Message
  ----     ------       ----                   ----     -------
  Warning  FailedMount  3m30s (x306 over 10h)  kubelet  MountVolume.MountDevice failed for volume "pvc-11ab2543-35e2-44cc-88a4-2640bca6396e" : rpc error: code = Internal desc = could not format /dev/disk/azure/scsi1/lun0(lun: 0), and mount it at /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/56dd34d1e64485d92930f8e0a3873a31e4030b01790c2c2f45d2de222c3a52b0/globalmount, failed with mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t ext4 -o defaults /dev/disk/azure/scsi1/lun0 /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/56dd34d1e64485d92930f8e0a3873a31e4030b01790c2c2f45d2de222c3a52b0/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/disk.csi.azure.com/56dd34d1e64485d92930f8e0a3873a31e4030b01790c2c2f45d2de222c3a52b0/globalmount: wrong fs type, bad option, bad superblock on /dev/sdb, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.
kubectl describe pvc -n common seq
Name:          seq
Namespace:     common
StorageClass:  managed-premium
Status:        Bound
Volume:        pvc-11ab2543-35e2-44cc-88a4-2640bca6396e
Labels:        app=seq
               app.kubernetes.io/managed-by=Helm
               chart=seq-2024.3.1
               helm.toolkit.fluxcd.io/name=seq
               helm.toolkit.fluxcd.io/namespace=common
               heritage=Helm
               release=seq
Annotations:   meta.helm.sh/release-name: seq
               meta.helm.sh/release-namespace: common
               pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: disk.csi.azure.com
               volume.kubernetes.io/selected-node: aks-default-10614471-vmss00000n
               volume.kubernetes.io/storage-provisioner: disk.csi.azure.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      8Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       seq-7d575bc88b-n76rq
Events:        
kubectl describe pv pvc-11ab2543-35e2-44cc-88a4-2640bca6396e
                                                                                                                                
Name:              pvc-11ab2543-35e2-44cc-88a4-2640bca6396e                                                                                                                                   
Labels:                                                                                                                                                                                 
Annotations:       pv.kubernetes.io/provisioned-by: disk.csi.azure.com                                                                                                                        
                   volume.kubernetes.io/provisioner-deletion-secret-name:                                                                                                                     
                   volume.kubernetes.io/provisioner-deletion-secret-namespace:                                                                                                                
Finalizers:        [external-provisioner.volume.kubernetes.io/finalizer kubernetes.io/pv-protection external-attacher/disk-csi-azure-com]                                                     
StorageClass:      managed-premium                                                                                                                                                            
Status:            Bound                                                                                                                                                                      
Claim:             common/seq                                                                                                                                                                 
Reclaim Policy:    Delete                                                                                                                                                                     
Access Modes:      RWO                                                                                                                                                                        
VolumeMode:        Filesystem                                                                                                                                                                 
Capacity:          8Gi                                                                                                                                                                        
Node Affinity:                                                                                                                                                                                
  Required Terms:                                                                                                                                                                             
    Term 0:        topology.disk.csi.azure.com/zone in []                                                                                                                                     
Message:                                                                                                                                                                                      
Source:                                                                                                                                                                                       
    Type:              CSI (a Container Storage Interface (CSI) volume source)                                                                                                                
    Driver:            disk.csi.azure.com                                                                                                                                                     
    FSType:                                                                                                                                                                                   
    VolumeHandle:      /subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx/resourceGroups/rg-aks-mloskot-uks-tst-nodes/providers/Microsoft.Compute/disks/pvc-11ab2543-35e2-44cc-88a4-2640bca6396e
    ReadOnly:          false                                                                                                                                                                  
    VolumeAttributes:      cachingmode=ReadOnly                                                                                                                                               
                           csi.storage.k8s.io/pv/name=pvc-11ab2543-35e2-44cc-88a4-2640bca6396e                                                                                                
                           csi.storage.k8s.io/pvc/name=seq                                                                                                                                    
                           csi.storage.k8s.io/pvc/namespace=common                                                                                                                            
                           kind=Managed                                                                                                                                                       
                           requestedsizegib=8                                                                                                                                                 
                           storage.kubernetes.io/csiProvisionerIdentity=1739257573427-7568-disk.csi.azure.com                                                                                 
                           storageaccounttype=Premium_LRS                                                                                                                                     
Events:                

Brainstorm

If we take Kubernetes out of the picture and focus on the common Linux error wrong fs type, bad option, bad superblock on /dev/sdb, then I suspect some of the following situations might be happening during the scheduled stopping of my AKS cluster:

  • (Azure) disk not properly unmounted (similar issue discussed here)
  • (Azure) disk not properly detached from the node (similar issue described here)
  • Seq container not being terminated gracefully but forcibly, while Seq is still writing data to the volume being unmounted, leading to disk corruption (a possible mitigation is sketched right after this list)
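For that last point, one mitigation I may try is extending the pod's termination grace period, so Seq has more time to flush and close its storage files before the kubelet sends SIGKILL. The Seq chart does not obviously expose this value, so the sketch below assumes a Flux post-render patch on the rendered Deployment; the 120-second value is an arbitrary guess, not a recommendation:

spec:
  # Added to the HelmRelease above; patches the chart's rendered Deployment.
  postRenderers:
    - kustomize:
        patches:
          - target:
              kind: Deployment
              name: seq
            patch: |
              - op: add
                path: /spec/template/spec/terminationGracePeriodSeconds
                value: 120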

This issue could be caused by a bug in the CSI driver, as I mentioned earlier, but web searching for "disk.csi.azure.com"+"wrong fs type" does not bring up any helpful results.

Perhaps the root of this issue lies in the hybrid node pools in my cluster, which may lead to this peculiar situation: at AKS restart, the Azure Disk PV pvc-11ab2543-35e2-44cc-88a4-2640bca6396e is mounted on a Windows node because Kubernetes (randomly) tries to schedule the Seq pod there. That can happen because I do not specify a nodeSelector in my Helm values above, and neither does the Helm chart here specify one as a reasonable default to ensure Seq is scheduled only to Linux nodes:

nodeSelector:
  kubernetes.io/os: linux

instead of the chart's current default:

nodeSelector: {}

and, perhaps, such an unexpected mounting of the pvc-11ab2543-35e2-44cc-88a4-2640bca6396e disk on a Windows node somehow corrupts its ext4 filesystem, making it unusable later. A long shot, I do realise :)
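If that theory holds, the fix should be as simple as pinning Seq to Linux nodes via the chart's existing nodeSelector value. Here is a minimal sketch of the adjusted HelmRelease values (everything else as in the manifest above); whether the chart passes this straight through to the pod spec is my assumption, although the nodeSelector: {} default suggests it does:

  values:
    image:
      pullPolicy: IfNotPresent
      repository: datalust/seq
      tag: "2024.3"
    persistence:
      storageClass: managed-csi
    # Pin the pod to Linux nodes so the Azure Disk is never attached to a Windows node
    nodeSelector:
      kubernetes.io/os: linux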

Outro

I have not tried to fsck the disk from the node. Since I'm deploying Seq to a test cluster for evaluation, I simply re-deploy it to trigger re-creation of the Azure Disk, PV and PVC, which works around the problem until another AKS start/stop cycle breaks it again.

Next, I am going to try the following:

  • Specify an explicit nodeSelector with kubernetes.io/os: linux
  • Use static provisioning of the Azure Disk to see if it makes any difference (roughly sketched below)
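For the second point, the rough shape of the statically provisioned volume I have in mind is below. The disk name and resource IDs are placeholders rather than my actual values, and how the chart is then pointed at the claim (e.g. an existing-claim value) is a detail I still need to check:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: seq-data
spec:
  capacity:
    storage: 8Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain   # keep the disk across cluster stop/start cycles
  storageClassName: managed-csi
  csi:
    driver: disk.csi.azure.com
    # Placeholder resource ID of a pre-created Azure Disk
    volumeHandle: /subscriptions/<subscription-id>/resourceGroups/<node-resource-group>/providers/Microsoft.Compute/disks/seq-data
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: seq
  namespace: common
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: managed-csi
  volumeName: seq-data
  resources:
    requests:
      storage: 8Gi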

I'm sharing my experience in the hope that either Seq users who have seen similar issues, or Seq team folks who know the Seq internals, may be able to provide feedback that helps diagnose the problem better.

I'd appreciate any ideas.

@liammclennan

Hi @mloskot ,

Regarding, "Seq container not being terminated gracefully, but forcibly, while Seq still writing data to volume that is being unmounted leading to disk corruption". You could corrupt Seq's storage this way, but I don't think you can break the filesystem.


mloskot commented Feb 25, 2025

Hi @liammclennan

You could corrupt Seq's storage this way, but I don't think you can break the filesystem.

That's my general understanding too, unless such a situation hits a subtle bug in the CSI driver.

As I mentioned, I'll be tweaking and testing my setup. I hope to share an update on my experience here in a couple of days or next week.
