# Workers: Draining
## Usage
```hcl
worker_pool_mode = "node-pool"
worker_pool_size = 1

# Configuration for draining nodes through the operator
worker_drain_ignore_daemonsets = true
worker_drain_delete_local_data = true
worker_drain_timeout_seconds   = 900

worker_pools = {
  oke-vm-active = {
    description = "Node pool with active workers",
    size        = 2,
  },
  oke-vm-draining = {
    description = "Node pool with scheduling disabled and draining through operator",
    drain       = true,
  },
  oke-vm-disabled = {
    description = "Node pool with resource creation disabled (destroyed)",
    create      = false,
  },
  oke-managed-drain = {
    description = "Node pool with custom settings for managed cordon & drain",
    eviction_grace_duration              = 30, # specified in seconds
    is_force_delete_after_grace_duration = true,
  },
}
```
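With this configuration, pools marked `drain = true` are cordoned and drained from the operator host. The variables map directly onto `kubectl drain` flags: `worker_drain_timeout_seconds` to `--timeout`, `worker_drain_ignore_daemonsets` to `--ignore-daemonsets`, and `worker_drain_delete_local_data` to `--delete-emptydir-data`. The command below mirrors the one that appears verbatim in the plan output in the next section:

```shell
kubectl drain -l oke.oraclecloud.com/pool.name=oke-vm-draining \
  --timeout=900s \
  --ignore-daemonsets=true \
  --delete-emptydir-data=true
```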
## Example
```text
Terraform will perform the following actions:

  # module.workers_only.module.utilities[0].null_resource.drain_workers[0] will be created
  + resource "null_resource" "drain_workers" {
      + id       = (known after apply)
      + triggers = {
          + "drain_commands" = jsonencode(
                [
                  + "kubectl drain --timeout=900s --ignore-daemonsets=true --delete-emptydir-data=true -l oke.oraclecloud.com/pool.name=oke-vm-draining",
                ]
            )
          + "drain_pools"    = jsonencode(
                [
                  + "oke-vm-draining",
                ]
            )
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

module.workers_only.module.utilities[0].null_resource.drain_workers[0] (remote-exec): node/10.200.220.157 cordoned
module.workers_only.module.utilities[0].null_resource.drain_workers[0] (remote-exec): WARNING: ignoring DaemonSet-managed Pods: kube-system/csi-oci-node-99x74, kube-system/kube-flannel-ds-spvsp, kube-system/kube-proxy-6m2kk, ...
module.workers_only.module.utilities[0].null_resource.drain_workers[0] (remote-exec): node/10.200.220.157 drained
module.workers_only.module.utilities[0].null_resource.drain_workers[0]: Creation complete after 18s [id=7686343707387113624]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
```
Observe that the node(s) are now disabled for scheduling and free of workloads other than DaemonSet-managed Pods when `worker_drain_ignore_daemonsets = true` (the default):
```shell
kubectl get nodes -l oke.oraclecloud.com/pool.name=oke-vm-draining

NAME             STATUS                     ROLES   AGE   VERSION
10.200.220.157   Ready,SchedulingDisabled   node    24m   v1.26.2

kubectl get pods --all-namespaces --field-selector spec.nodeName=10.200.220.157

NAMESPACE     NAME                    READY   STATUS    RESTARTS   AGE
kube-system   csi-oci-node-99x74      1/1     Running   0          50m
kube-system   kube-flannel-ds-spvsp   1/1     Running   0          50m
kube-system   kube-proxy-6m2kk        1/1     Running   0          50m
kube-system   proxymux-client-2r6lk   1/1     Running   0          50m
```
Run the following command to uncordon a previously drained worker pool. The `drain = true` setting should also be removed from the pool's `worker_pools` entry (see the example below) to avoid re-draining the pool on future Terraform runs.
```shell
kubectl uncordon -l oke.oraclecloud.com/pool.name=oke-vm-draining

node/10.200.220.157 uncordoned
```
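For example, the `oke-vm-draining` entry from the Usage section above could be updated along these lines before the next apply (a sketch; the description is illustrative, and any other attributes set on the pool should be retained):

```hcl
worker_pools = {
  oke-vm-draining = {
    description = "Node pool with active workers",
    # drain = true removed so future Terraform runs
    # do not re-drain this pool
  },
}
```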