What steps did you take and what happened:
First of all, thanks to all contributors for their work and for providing this
CAPI provider!
I was following the quickstart guide to set up a Kubernetes v1.30.8 cluster on
one of my Proxmox instances (running Proxmox 8.2.2).
The Ubuntu 24.04 VM template was created with Image Builder; I selected Kubernetes 1.30.8 by setting PACKER_FLAGS. Apart from having to change builder.disks.format from qcow2 to raw, this went smoothly; the template was
created and received ID 114 on my Proxmox instance (Intel).
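I fired up a local kind cluster on my macOS machine (ARM) for management and
initialized Cluster API like so:
clusterctl init --core cluster-api \
  --config ${CTL_CONFIG} \
  --bootstrap kubeadm \
  --control-plane kubeadm \
  --infrastructure proxmox \
  --ipam in-cluster \
  -v5
where CTL_CONFIG points to my clusterctl.yaml (contents: see below). clusterctl picked Cluster API and kubeadm v1.9.4, proxmox provider v0.6.2 and ipam-in-cluster v1.0.0.
The next step was to create the workload cluster manifest: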
The first VM being created is always a control plane and, as expected, it receives two
IPs: the CONTROL_PLANE_ENDPOINT_IP and the first IP from the pool defined in NODE_IP_RANGES.
PROBLEM 1: Watching the summary of the VM in Proxmox, I see that soon after
creation of more nodes began, the CONTROL_PLANE_ENDPOINT_IP disappears
and I am no longer able to ping the associated IP. It occasionally comes back at random, only to
disappear again soon after ...
As far as I understand kube-vip, the CONTROL_PLANE_ENDPOINT_IP should move
between control plane nodes from time to time? However, if no other control plane node is
ready, this should not happen? At least I see no other control plane node holding it ...
PROBLEM 2: Although I asked for 3 control planes and 3 worker nodes, fewer
VMs get created (due to the size of my management cluster?); the number varies
from try to try. Every VM except the first is labeled with go-proxmox+cloud-init, but unlike on the first VM, this label never disappears.
I was able to ssh into the nodes, and journalctl -u kubelet revealed that all
kubelets (except on the first created VM) crashed because the file /var/lib/kubelet/config.yaml is missing.
There was also a warning that the flag --pod-infra-container-image has been
deprecated ...
I also tried --flavor calico after creating a config map (as described in the quickstart guide), but it had no effect.
Please let me know if I should provide further information and, if so, how to get
my hands on it (I am a Kubernetes newbie).
What did you expect to happen:
A running Kubernetes cluster on my Proxmox instance ;-)
Anything else you would like to add:
Side note 1: Connecting to the machines via ssh works but is very slow on the first attempt.
Side note 2: When running kubectl apply -f kubemox.yaml, I randomly encountered the
following error:
cluster.cluster.x-k8s.io/kubemox created
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/kubemox-control-plane created
proxmoxmachinetemplate.infrastructure.cluster.x-k8s.io/kubemox-control-plane created
machinedeployment.cluster.x-k8s.io/kubemox-workers created
proxmoxmachinetemplate.infrastructure.cluster.x-k8s.io/kubemox-worker created
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/kubemox-worker created
clusterresourceset.addons.cluster.x-k8s.io/kubemox-crs-0 created
Error from server (InternalError): error when creating "kubemox.yaml": Internal error occurred: failed calling webhook "validation.proxmoxcluster.infrastructure.cluster.x-k8s.io": failed to call webhook: Post "https://capmox-webhook-service.capmox-system.svc:443/validate-infrastructure-cluster-x-k8s-io-v1alpha1-proxmoxcluster?timeout=10s": dial tcp 10.96.178.119:443: connect: connection refused
However, when it occurs, Proxmox starts to instantiate machines from my
templates after several retries. I have no idea whether this is related to my problem.
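A "connection refused" from the webhook service may simply mean that the capmox webhook was not yet accepting connections when the manifest was applied; that is an assumption, not something confirmed in this thread. If so, the manual retries described above could be wrapped in a small helper. The following is a minimal sketch; the function name `retry` and its arguments are hypothetical and not part of clusterctl or kubectl.

```shell
#!/usr/bin/env bash
# retry MAX DELAY CMD...: run CMD until it succeeds or MAX attempts are used up.
# Hypothetical helper for illustration; not part of any Cluster API tooling.
retry() {
  local max="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Intended usage (not executed here): up to 5 attempts, 10 seconds apart.
# retry 5 10 kubectl apply -f kubemox.yaml
```

Since kubectl apply is idempotent, re-running it after a partial failure should only create the objects that are still missing.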
Meanwhile, I dug a bit deeper: it seems that something goes wrong during creation of the VMs.
From the Proxmox logs:
Jan 29 15:59:20 pve-mp pvedaemon[1095]: <capmox@pve!capi> end task UPID:pve-mp:00000ACD:00006AE8:679A423B:qmclone:112:capmox@pve!capi: clone failed: can't lock file '/var/lock/pve-manager/pve-storage-local-lvm' - got timeout
VMs 115, 116 and 118 were created. 116 and 118 hung with the label go-proxmox+cloud-init attached.
capmox-controller-manager repeatedly complains that it cannot find VM 117:
E0129 18:26:36.117277 1 find.go:53] "unable to find vm" err="cannot find vm with id 117: 500 Configuration file 'nodes/pve-mp/qemu-server/117.conf' does not exist" controller="proxmoxmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="ProxmoxMachine" ProxmoxMachine="default/kubemox-workers-zz9t9-v47sc" namespace="default" name="kubemox-workers-zz9t9-v47sc" reconcileID="9c44a201-9739-4e92-b534-2fa659c58af8" machine="default/kubemox-workers-zz9t9-v47sc" cluster="default/kubemox"
Maybe this is a pointer in the right direction? Any ideas?
Thank you for your response. I did not know that Ceph storage is mandatory for Proxmox. Where did you get this information from? A distributed file system may indeed speed up the cloning process, but unfortunately I only have a single node.
In the Proxmox support forums I found a post that addresses the same problem. In a response to that post, a Proxmox staff member pointed out that sending clone requests in parallel, as the Proxmox provider evidently does, can lead to problems.
Maybe this is not a problem on faster machines, and you may prefer to close this issue. However, as the recommendation from the Proxmox team stands, it would be helpful if you would consider an option to handle this case, e.g. by cloning the machines one after another, as recommended.
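To illustrate what "one after another" means here, the sketch below issues clone requests strictly in sequence and stops at the first failure. It is not how the provider is implemented (the provider is a Go controller); the function `clone_sequentially` and the injected clone command are hypothetical names used only for this illustration.

```shell
#!/usr/bin/env bash
# clone_sequentially CLONE_CMD TEMPLATE_ID NEW_ID...
# Calls "CLONE_CMD TEMPLATE_ID NEW_ID" once per NEW_ID, waiting for each call
# to finish before starting the next, so clones never compete for the storage lock.
clone_sequentially() {
  local clone_cmd="$1" template_id="$2"
  shift 2
  local new_id
  for new_id in "$@"; do
    "$clone_cmd" "$template_id" "$new_id" || return 1
  done
}

# Intended usage on a Proxmox node (not executed here); pve_clone is a
# hypothetical wrapper around the qm CLI:
# pve_clone() { qm clone "$1" "$2"; }
# clone_sequentially pve_clone 114 115 116 117
```

The IDs in the usage comment are the template ID and VM IDs mentioned earlier in this issue.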
Environment:
Cluster-api-provider-proxmox version: 0.6.2
Kubernetes version: (use kubectl version):
OS (e.g. from /etc/os-release): Ubuntu 24.04 (VM Image), Proxmox 8.2.2
My clusterctl.yaml: