Oracle Container Engine for Kubernetes (OKE) Terraform Module

Introduction

This module automates the provisioning of an OKE cluster.

Note

The documentation here is for 5.x only. The documentation for earlier versions can be found on the GitHub repo on the relevant branch.

Warning

The documentation here is still being reorganised.

News


May 20 2024: Announcing v5.1.7

  • fix symlinks issues and cluster autoscaler defaults override

May 18 2024: Announcing v5.1.6

  • fix: versions file in submodules

April 11 2024: Announcing v5.1.5

  • Create OKE VCN DRG attachment when existing DRG is specified
  • fix FSS NSGs

March 28 2024: Announcing v5.1.4

  • add nodepool support for max_pods_per_node
  • Add service account extension
  • Improve logic for kube_config datasource
  • fix: Remove unknown resource counts from derived inputs
  • fix issue introduced by #909 with new clusters and #873

March 4 2024: Announcing v5.1.3

  • Fix NSG rule for health check (incorrect direction and description)
  • feat: Configurable boot volume VPUs/GB on self-managed
  • docs: example of using this module in multi-cluster mode with Istio
  • fix: Wrong control_plane_is_public behavior for OKE cluster
  • Update drg module version.

February 6 2024: Announcing v5.1.2

  • Improve operator package installation

January 17 2024: Announcing v5.1.1

  • feat: upgraded default Autonomous Linux to 8.8
  • fix: operator nsg is not created when cluster is disabled
  • feat: added ability to create rpc to peer vcn to other vcns

November 29 2023: Announcing release v5.1.0

  • added Cilium CNI
  • https://github.com/oracle-terraform-modules/terraform-oci-oke/releases/tag/v5.1.0

October 25 2023: Announcing release v5.0.0

  • https://github.com/oracle-terraform-modules/terraform-oci-oke/releases

May 9 2023: Announcing release v4.5.9

  • Make the default freeform_tags empty
  • Use lower bound version specification for oci provider

Changelog

View the CHANGELOG.

Security

Please consult the security guide for our responsible security vulnerability disclosure process.

License

Copyright (c) 2019-2023 Oracle and/or its affiliates.

Released under the Universal Permissive License v1.0 as shown at https://oss.oracle.com/licenses/upl/.

Getting started

This module automates the provisioning of an OKE cluster.

Note

The documentation here is for 5.x only. The documentation for earlier versions can be found on the GitHub repo.

Usage

Clone the repo

Clone the git repo:

git clone https://github.com/oracle-terraform-modules/terraform-oci-oke.git tfoke
cd tfoke

Create

  1. Create 2 OCI providers and add them to providers.tf:
provider "oci" {
  fingerprint      = var.api_fingerprint
  private_key_path = var.api_private_key_path
  region           = var.region
  tenancy_ocid     = var.tenancy_id
  user_ocid        = var.user_id
}

provider "oci" {
  fingerprint      = var.api_fingerprint
  private_key_path = var.api_private_key_path
  region           = var.home_region
  tenancy_ocid     = var.tenancy_id
  user_ocid        = var.user_id
  alias            = "home"
}
  2. Initialize a working directory containing Terraform configuration files, and optionally upgrade module dependencies:
terraform init --upgrade
  3. Create a terraform.tfvars and provide the necessary parameters:
# Identity and access parameters
api_fingerprint      = "00:ab:12:34:56:cd:78:90:12:34:e5:fa:67:89:0b:1c"
api_private_key_path = "~/.oci/oci_rsa.pem"

home_region = "us-ashburn-1"
region      = "ap-sydney-1"
tenancy_id  = "ocid1.tenancy.oc1.."
user_id     = "ocid1.user.oc1.."

# general oci parameters
compartment_id = "ocid1.compartment.oc1.."
timezone       = "Australia/Sydney"

# ssh keys
ssh_private_key_path = "~/.ssh/id_ed25519"
ssh_public_key_path  = "~/.ssh/id_ed25519.pub"

# networking
create_vcn               = true
assign_dns               = true
lockdown_default_seclist = true
vcn_cidrs                = ["10.0.0.0/16"]
vcn_dns_label            = "oke"
vcn_name                 = "oke"

# Subnets
subnets = {
  bastion  = { newbits = 13, netnum = 0, dns_label = "bastion", create="always" }
  operator = { newbits = 13, netnum = 1, dns_label = "operator", create="always" }
  cp       = { newbits = 13, netnum = 2, dns_label = "cp", create="always" }
  int_lb   = { newbits = 11, netnum = 16, dns_label = "ilb", create="always" }
  pub_lb   = { newbits = 11, netnum = 17, dns_label = "plb", create="always" }
  workers  = { newbits = 2, netnum = 1, dns_label = "workers", create="always" }
  pods     = { newbits = 2, netnum = 2, dns_label = "pods", create="always" }
}

# bastion
create_bastion           = true
bastion_allowed_cidrs    = ["0.0.0.0/0"]
bastion_user             = "opc"

# operator
create_operator                = true
operator_install_k9s           = true


# iam
create_iam_operator_policy   = "always"
create_iam_resources         = true

create_iam_tag_namespace = false // true/*false
create_iam_defined_tags  = false // true/*false
tag_namespace            = "oke"
use_defined_tags         = false // true/*false

# cluster
create_cluster     = true
cluster_name       = "oke"
cni_type           = "flannel"
kubernetes_version = "v1.29.1"
pods_cidr          = "10.244.0.0/16"
services_cidr      = "10.96.0.0/16"

# Worker pool defaults
worker_pool_size = 0
worker_pool_mode = "node-pool"

# Worker defaults
await_node_readiness     = "none"

worker_pools = {
  np1 = {
    shape              = "VM.Standard.E4.Flex",
    ocpus              = 2,
    memory             = 32,
    size               = 1,
    boot_volume_size   = 50,
    kubernetes_version = "v1.29.1"
  }
  np2 = {
     shape            = "VM.Standard.E4.Flex",
     ocpus            = 2,
     memory           = 32,
     size             = 3,
     boot_volume_size = 150,
     kubernetes_version = "v1.29.1"
  }
}

# Security
allow_node_port_access       = false
allow_worker_internet_access = true
allow_worker_ssh_access      = true
control_plane_allowed_cidrs  = ["0.0.0.0/0"]
control_plane_is_public      = false
load_balancers               = "both"
preferred_load_balancer      = "public"

  4. Run the plan and apply commands to create the OKE cluster and other components:
terraform plan
terraform apply

You can create a Kubernetes cluster with the latest version of Kubernetes available in OKE using this terraform script.

Connect

NOTE: TODO Add content

kubectl is installed on the operator host by default and the kubeconfig file is set in the default location (~/.kube/config) so you don't need to set the KUBECONFIG environment variable every time you log in to the operator host.


The instance principal of the operator host must be granted MANAGE permission on the target cluster to configure an admin user context.


An alias "k" will be created for kubectl on the operator host.

If you would like to use kubectl locally, first install and configure OCI CLI locally. Then, install kubectl and set the KUBECONFIG to the config file path.

export KUBECONFIG=path/to/kubeconfig
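
Alternatively, if you have the OCI CLI configured locally, you can generate a kubeconfig for the cluster directly (the cluster OCID, region and endpoint type below are placeholders to replace with your own values):

oci ce cluster create-kubeconfig \
  --cluster-id ocid1.cluster.oc1.. \
  --file ~/.kube/config \
  --region ap-sydney-1 \
  --token-version 2.0.0 \
  --kube-endpoint PUBLIC_ENDPOINT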

To get the kubeconfig file, retrieve the credentials with Terraform and store them in your preferred format (e.g. file, vault, bucket):

# OKE cluster creation.
module "oke_my_cluster" {
#...
}

# Obtain cluster Kubeconfig.
data "oci_containerengine_cluster_kube_config" "kube_config" {
  cluster_id = module.oke_my_cluster.cluster_id
}

# Store kubeconfig in vault.
resource "vault_generic_secret" "kube_config" {
  path = "my/cluster/path/kubeconfig"
  data_json = jsonencode({
    "data" : data.oci_containerengine_cluster_kube_config.kube_config.content
  })
}

# Store kubeconfig in file.
resource "local_file" "kube_config" {
  content         = data.oci_containerengine_cluster_kube_config.kube_config.content
  filename        = "/tmp/kubeconfig"
  file_permission = "0600"
}

Tip

Ensure you install the same kubectl version as the OKE Kubernetes version for compatibility.
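
For example, a quick way to compare your local client version with the cluster's server version (output format varies slightly between kubectl releases):

kubectl version --output=yaml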

Update

NOTE: TODO Add content

Destroy

Run the below command to destroy the infrastructure created by Terraform:

terraform destroy

You can also do a targeted destroy, e.g.:

terraform destroy --target=module.workers

Note

Only infrastructure created by Terraform will be destroyed.

Prerequisites

This section will guide you through the prerequisites before you can use this project.

Identity and Access Management Rights

The Terraform user must have the following permissions (an illustrative policy sketch is shown after this list):

  • MANAGE dynamic groups (instance_principal and KMS integration)
  • MANAGE cluster-family in compartment
  • MANAGE virtual-network-family in compartment
  • MANAGE instance-family in compartment
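
As an illustration only (the group name oke-provisioners and the compartment name oke are hypothetical, and the module does not create this policy for you), an equivalent policy could be defined in Terraform as follows:

resource "oci_identity_policy" "oke_terraform_user" {
  provider       = oci.home       # IAM resources are managed in the home region
  compartment_id = var.tenancy_id # "manage dynamic-groups" is granted at tenancy level
  name           = "oke-terraform-user"
  description    = "Permissions required by the user running this module"
  statements = [
    "Allow group oke-provisioners to manage dynamic-groups in tenancy",
    "Allow group oke-provisioners to manage cluster-family in compartment oke",
    "Allow group oke-provisioners to manage virtual-network-family in compartment oke",
    "Allow group oke-provisioners to manage instance-family in compartment oke",
  ]
}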

Install Terraform

Start by installing Terraform and configuring your path.

Download Terraform

  1. Open your browser and navigate to the Terraform download page. You need version 1.0.0+.
  2. Download the appropriate version for your operating system
  3. Extract the contents of the compressed file and copy the terraform binary to a location in your path (see the next section)

Configure path on Linux/macOS

Open a terminal and enter the following:

# replace /path/to/terraform with the location of the extracted binary:
sudo mv /path/to/terraform /usr/local/bin

Configure path on Windows

Follow the steps below to configure your path on Windows:

  1. Click on 'Start', type 'Control Panel' and open it
  2. Select System > Advanced System Settings > Environment Variables
  3. Select System variables > PATH and click 'Edit'
  4. Click New and paste the location of the directory where you have extracted the terraform.exe
  5. Close all open windows by clicking OK
  6. Open a new terminal and verify terraform has been properly installed

Testing Terraform installation

Open a terminal and test:

terraform -v

Generate API keys

Follow the documentation for generating keys on OCI Documentation.

Upload your API keys

Follow the documentation for uploading your keys on OCI Documentation.

Note the fingerprint.

Create an OCI compartment

Follow the documentation for creating a compartment.

Obtain the necessary OCIDs

The following OCIDs are required:

  • Compartment OCID
  • Tenancy OCID
  • User OCID

Follow the documentation for obtaining the tenancy and user ids on OCI Documentation.

To obtain the compartment OCID:

  1. Navigate to Identity > Compartments
  2. Click on your Compartment
  3. Locate OCID on the page and click on 'Copy'

If you wish to encrypt Kubernetes secrets with a key from OCI KMS, you also need to create a vault and a key and obtain the key id.
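
If you do, the key OCID can then be passed to the module, for example (a minimal sketch; the OCID is a placeholder):

cluster_kms_key_id    = "ocid1.key.oc1.."
create_iam_kms_policy = "auto" // never/*auto/always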

Configure OCI Policy for OKE

Follow the documentation to create the necessary OKE policy.

Deploy the OKE Terraform Module

Prerequisites

Provisioning from an OCI Resource Manager Stack

Network

Deploy to Oracle Cloud

Network resources configured for an OKE cluster.

The following resources may be created depending on provided configuration:

Cluster

Deploy to Oracle Cloud

An OKE-managed Kubernetes cluster.

The following resources may be created depending on provided configuration:

Node Pool

Deploy to Oracle Cloud

A standard OKE-managed pool of worker nodes with enhanced feature support.

Configured with mode = "node-pool" on a worker_pools entry, or with worker_pool_mode = "node-pool" to use as the default for all pools unless otherwise specified.

You can set the image_type attribute to one of the following values:

  • oke (default)
  • platform
  • custom

When image_type is set to oke or platform, the node pool image is likely to be updated on subsequent terraform apply executions, because the module uses a data source to fetch the latest available images.

To avoid this situation, you can set the image_type to custom and the image_id to the OCID of the image you want to use for the node-pool.
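
For example, to pin a node pool to a specific image (a sketch; the image OCID is a placeholder):

worker_pools = {
  pinned-pool = {
    image_type = "custom",
    image_id   = "ocid1.image...",
    size       = 1,
  }
}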

The following resources may be created depending on provided configuration:

Virtual Node Pool

Deploy to Oracle Cloud

An OKE-managed Virtual Node Pool.

Configured with mode = "virtual-node-pool" on a worker_pools entry, or with worker_pool_mode = "virtual-node-pool" to use as the default for all pools unless otherwise specified.

The following resources may be created depending on provided configuration:

Instance

Deploy to Oracle Cloud

A set of self-managed Compute Instances for custom user-provisioned worker nodes not managed by an OCI pool, but individually by Terraform.

Configured with mode = "instance" on a worker_pools entry, or with worker_pool_mode = "instance" to use as the default for all pools unless otherwise specified.

The following resources may be created depending on provided configuration:

Instance Pool

Deploy to Oracle Cloud

A self-managed Compute Instance Pool for custom user-provisioned worker nodes.

Configured with mode = "instance-pool" on a worker_pools entry, or with worker_pool_mode = "instance-pool" to use as the default for all pools unless otherwise specified.

The following resources may be created depending on provided configuration:

Cluster Network

Deploy to Oracle Cloud

A self-managed HPC Cluster Network.

Configured with mode = "cluster-network" on a worker_pools entry, or with worker_pool_mode = "cluster-network" to use as the default for all pools unless otherwise specified.

The following resources may be created depending on provided configuration:

User Guide

Topology

The following resources are created by default:

  • 1 VCN with Internet, NAT and Service Gateways
  • Route tables for Internet, NAT and Service Gateways
  • 1 regional public subnet for the bastion host along with its security list
  • 1 regional private subnet for the operator host along with its NSG
  • 1 public control plane subnet
  • 1 private regional worker subnet
  • 1 public regional load balancer
  • 1 bastion host
  • 1 operator host
  • 1 public Kubernetes Cluster with private worker nodes
  • 1 Network Security Group (NSG) for each of control plane, workers and load balancers

Note

The Kubernetes Control Plane Nodes run in Oracle's tenancy and are not shown here.

Although the recommended approach is now to deploy private clusters, we are currently keeping the default setting as public. This gives our users time to adjust other configurations, e.g. their CI/CD tools.

The Load Balancers are only created when Kubernetes services of type LoadBalancer are deployed or you manually create Load Balancers yourself.

The diagrams below depict the default deployment in multi-AD regions:

Figure 1: Multi-AD Default Deployment

and single-AD regions:

Figure 2: Single-AD Default Deployment

Note

The node pools above are depicted for illustration purposes only. By default, the clusters are now created without any node pools.


Networking and Gateways

Figure 3: Networking and Gateways

The following subnets are created by default:

  • 1 public regional control plane subnet: this subnet hosts an endpoint for the Kubernetes API server publicly accessible from the Internet. Typically, only 1 IP address is sufficient if you intend to host only 1 OKE cluster in this VCN. However, if you intend to host many OKE or Kubernetes clusters in this VCN and you intend to reuse the same subnets, you need to increase the default size of this subnet.
  • 1 private regional worker subnet: this subnet hosts the worker nodes where your workloads will be running. By default, they are private. If you need admin access to the worker nodes e.g. SSH, you'll need to enable and use either the bastion host or the OCI Bastion Service.
  • 1 public regional load balancer subnet: this subnet hosts your OCI Load Balancer which acts as a network ingress point into your OKE cluster.
  • 1 public regional bastion subnet: this subnet hosts an optional bastion host. See additional documentation on the purpose of the bastion host.
  • 1 private regional operator subnet: this subnet hosts an optional operator host that is used for admin purposes. See additional documentation on the purpose of the operator host

Warning

Do not confuse the bastion host with the OCI Bastion Service.

The bastion subnet is regional i.e. in multi-AD regions, the subnet spans all Availability Domains. By default, the bastion subnet is assigned a CIDR of 10.0.0.0/29 giving a maximum possible of 5 assignable IP addresses in the bastion subnet.

The workers subnet has a CIDR of 10.0.144.0/20 assigned by default. This gives the subnet a maximum possible of 4093 IP addresses. This is enough to scale the cluster to the maximum number of worker nodes (2000) currently allowed by Oracle Container Engine.

The load balancer subnets are of 2 types:

  • public
  • private

By default, only the public load balancer subnet is created. See Public and Internal Load Balancers for more details. The private load balancer subnet has a CIDR of 10.0.32.0/27 whereas the public load balancer subnet has a CIDR of 10.0.128.0/27 assigned by default. This allows both subnets to assign a maximum of 29 IP addresses and therefore 9 load balancers can be created in each. You can control the size of your subnets and have more load balancers if required by adjusting the newbits and netnum values for the subnets parameter.

The subnets parameter governs the boundaries and sizes of the subnets. If you need to change the default values, refer to the Networking Documentation to see how. We recommend working with your network administrator to design your network. The following additional documentation is useful in designing your network:

The following gateways are also created:

  • Internet Gateway: this is required if the application is public-facing or a public bastion host is used
  • NAT Gateway if deployed in private mode
  • Service Gateway: this is required for connectivity between worker nodes and the control plane

The Service Gateway also allows OCI cloud resources without public IP addresses to privately access Oracle services, without the traffic going over the public Internet. Refer to the OCI Service Gateway documentation to understand whether you need to enable it.


Bastion Host

Figure 4: Bastion Host

The bastion host is created in a public regional subnet. You can create or destroy it at any time, with no effect on the Kubernetes cluster, by setting create_bastion = true or false in your variable file. You can also turn it on or off by changing bastion_state to RUNNING or STOPPED respectively.

By default, the bastion host can be accessed from anywhere. However, you can restrict its access to a defined list of CIDR blocks using the bastion_allowed_cidrs parameter. You can also make the bastion host private if you have some alternative connectivity method to your VCN, e.g. a VPN.

You can use the bastion host for the following:

  • SSH to the worker nodes
  • SSH to the operator host to manage your Kubernetes cluster

To SSH to the bastion, copy the command that terraform outputs at the end of its run:

ssh_to_bastion = ssh -i /path/to/private_key opc@bastion_ip

To SSH to the worker nodes, you can do the following:

ssh -i /path/to/private_key -J <username>@bastion_ip opc@worker_node_private_ip

Tip

If your private ssh key has a different name or path than the default ~/.ssh/id_* that ssh expects, e.g. if your private key is ~/.ssh/dev_rsa, you must add it to your ssh agent:

eval $(ssh-agent -s)
ssh-add ~/.ssh/dev_rsa

Public vs Private Clusters

When deployed in public mode, the Kubernetes API endpoint is publicly accessible.

Figure 5: Accessing a public cluster


You can set the Kubernetes cluster to be public and restrict its access to the CIDR blocks A.B.C.D/M and W.X.Y.Z/N by using the following parameters:

control_plane_is_public     = true # *true/false
control_plane_allowed_cidrs = ["A.B.C.D/M", "W.X.Y.Z/N"]

When deployed in private mode, the Kubernetes endpoint can only be accessed from the operator host or from a defined list of CIDR blocks specified in control_plane_allowed_cidrs. This assumes that you have established some form of connectivity with the VCN via VPN or FastConnect from the networks listed in control_plane_allowed_cidrs.
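
For a private cluster, the equivalent settings are (illustrative; replace the CIDR with the range of your VPN or FastConnect-connected network):

control_plane_is_public     = false
control_plane_allowed_cidrs = ["10.1.0.0/16"]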

Figure 5: Accessing the Kubernetes API endpoint from the operator host

The following table maps all possible cluster and workers deployment combinations:

Workers / Control plane   public   private
worker_type=public        X        X
worker_type=private       X        X

Important

We strongly recommend you use private clusters.

Public vs Private worker nodes

Public workers

Figure 6: Deploying public workers

When deployed in public mode, all worker subnets will be deployed as public subnets and route to the Internet Gateway directly. Worker nodes will have both private and public IP addresses. Their private IP addresses will be from the range of the worker subnet they are part of whereas the public IP addresses will be allocated from Oracle's pool of public IP addresses.

If you intend to use Kubernetes NodePort services on your public workers or SSH to them, you must explicitly enable these in order for the security rules to be properly configured and allow access:

allow_node_port_access  = true
allow_worker_ssh_access = true

Danger

Because of the increased attack surface area, we do not recommend running your worker nodes publicly. However, there are some valid use cases for these and you have the option to make this choice.

Private workers

Figure 7: Deploying private workers

When deployed in private mode, the worker subnet will be deployed as a private subnet and route to the NAT Gateway instead. This considerably reduces the attack surface and improves the security posture of your OKE cluster, as well as the rest of your infrastructure.

Tip

We strongly recommend you run your worker nodes in private mode.

Irrespective of whether you run your worker nodes publicly or privately, if you ssh to them, you must do so through the bastion host or the OCI Bastion Service. Ensure you have enabled the bastion host.

Public vs. Internal Load Balancers

Figure 8: Using load balancers

You can use both public and internal load balancers. By default, OKE creates public load balancers whenever you deploy services of type LoadBalancer. As public load balancers are allocated public IP addresses, they require a public subnet, and the default service load balancer is therefore set to use the public subnet pub_lb.

You can change this default behaviour and use internal load balancers instead. Internal load balancers have only private IP addresses and are not accessible from the Internet. Although you can place internal load balancers in public subnets (they just will not be allocated public IP addresses), we recommend you use a different subnet for internal load balancers.

Depending on your use case, you can also have both public and private load balancers.

Refer to the user guide on load balancers for more details.

Using Public Load Balancers

When creating a service of type LoadBalancer, you must specify the list of NSGs using OCI Load Balancer annotations e.g.:

apiVersion: v1
kind: Service
metadata:
  name: acme-website
  annotations:
    oci.oraclecloud.com/oci-network-security-groups: "ocid1.networksecuritygroup...."
    service.beta.kubernetes.io/oci-load-balancer-security-list-management-mode: "None"
spec:
  type: LoadBalancer
  ....

Note

Since we have already added the NodePort range to the public load balancer NSG, you can also disable the security list management and set its value to "None".

Using Internal Load Balancers

When creating an internal load balancer, you must ensure the following:

  • load_balancers is set to both or internal.

Setting the load_balancers parameter to both or internal only ensures that the private subnet and the required NSG for internal load balancers are created. To set it as the default subnet for your service load balancers, set preferred_load_balancer to internal. In this way, if you happen to use both types of load balancers, the cluster will prefer the internal load balancer subnets instead.
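
For example, to create the internal load balancer subnet and NSG and make it the default for service load balancers (values as described above):

load_balancers          = "both"
preferred_load_balancer = "internal"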

Important

Even if you set the preferred_load_balancer to internal, you still need to set the correct service annotation when creating internal load balancers. Setting the subnet to private is not sufficient, e.g.:

service.beta.kubernetes.io/oci-load-balancer-internal: "true"

Refer to OCI Documentation for further guidance on internal load balancers.
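
Putting this together, an internal load balancer service might look like the following (a sketch; the service name and NSG OCID are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: acme-internal
  annotations:
    service.beta.kubernetes.io/oci-load-balancer-internal: "true"
    oci.oraclecloud.com/oci-network-security-groups: "ocid1.networksecuritygroup...."
    service.beta.kubernetes.io/oci-load-balancer-security-list-management-mode: "None"
spec:
  type: LoadBalancer
  ....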

Creating LoadBalancers using IngressControllers

You may want to refer to the following articles exploring Ingress Controllers and Load Balancers for additional information:

Identity

Optional creation of Identity Dynamic Groups, Policies, and Tags.

Identity: Policies

Usage

create_iam_autoscaler_policy = "auto" // never/*auto/always
create_iam_kms_policy        = "auto" // never/*auto/always
create_iam_operator_policy   = "auto" // never/*auto/always
create_iam_worker_policy     = "auto" // never/*auto/always

References

Identity: Tags

Usage

create_iam_tag_namespace = false // true/*false
create_iam_defined_tags  = false // true/*false
tag_namespace            = "oke"
use_defined_tags         = false // true/*false

References

Network

Optional creation of VCN subnets, Network Security Groups, NSG Rules, and more.

Examples

Create Minimal Network Resources

TODO: ../../../examples/network/vars-network-only-minimal.auto.tfvars

# All configuration for network sub-module w/ defaults

# Virtual Cloud Network (VCN)
assign_dns               = true # *true/false
create_vcn               = true # *true/false
local_peering_gateways   = {}
lockdown_default_seclist = true            # *true/false
vcn_id                   = null            # Ignored if create_vcn = true
vcn_cidrs                = ["10.0.0.0/16"] # Ignored if create_vcn = false
vcn_dns_label            = "oke"           # Ignored if create_vcn = false
vcn_name                 = "oke"           # Ignored if create_vcn = false

Create Common Network Resources

# All configuration for network sub-module w/ defaults

# Virtual Cloud Network (VCN)
assign_dns               = true # *true/false
create_vcn               = true # *true/false
local_peering_gateways   = {}
lockdown_default_seclist = true            # *true/false
vcn_id                   = null            # Ignored if create_vcn = true
vcn_cidrs                = ["10.0.0.0/16"] # Ignored if create_vcn = false
vcn_dns_label            = "oke"           # Ignored if create_vcn = false
vcn_name                 = "oke"           # Ignored if create_vcn = false

# Subnets
subnets = {
  bastion  = { newbits = 13 }
  operator = { newbits = 13 }
  cp       = { newbits = 13 }
  int_lb   = { newbits = 11 }
  pub_lb   = { newbits = 11 }
  workers  = { newbits = 2 }
  pods     = { newbits = 2 }
}

# Security
allow_node_port_access       = true          # *true/false
allow_pod_internet_access    = true          # *true/false
allow_worker_internet_access = false         # true/*false
allow_worker_ssh_access      = false         # true/*false
control_plane_allowed_cidrs  = ["0.0.0.0/0"] # e.g. "0.0.0.0/0"
control_plane_nsg_ids        = []            # Additional NSGs combined with created
control_plane_type           = "public"      # public/*private
enable_waf                   = false         # true/*false
load_balancers               = "both"        # public/private/*both
preferred_load_balancer      = "public"      # public/*private
worker_nsg_ids               = []            # Additional NSGs combined with created
worker_type                  = "private"     # public/*private

# See https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml
# Protocols: All = "all"; ICMP = 1; TCP  = 6; UDP  = 17
# Source/destination type: NSG ID: "NETWORK_SECURITY_GROUP"; CIDR range: "CIDR_BLOCK"
allow_rules_internal_lb = {
  # "Allow TCP ingress to internal load balancers for port 8080 from VCN" : {
  #   protocol = 6, port = 8080, source = "10.0.0.0/16", source_type = "CIDR_BLOCK",
  # },
}

allow_rules_public_lb = {
  # "Allow TCP ingress to public load balancers for SSL traffic from anywhere" : {
  #   protocol = 6, port = 443, source = "0.0.0.0/0", source_type = "CIDR_BLOCK",
  # },
}

allow_rules_workers = {
  # "Allow TCP ingress to workers for port 8080 from VCN" : {
  #   protocol = 6, port = 8080, source = "10.0.0.0/16", source_type = "CIDR_BLOCK",
  # },
}

# Dynamic routing gateway (DRG)
create_drg       = false # true/*false
drg_display_name = "drg"
drg_id           = null

# Routing
ig_route_table_id = null # Optional ID of existing internet gateway route table
internet_gateway_route_rules = [
  #   {
  #     destination       = "192.168.0.0/16" # Route Rule Destination CIDR
  #     destination_type  = "CIDR_BLOCK"     # only CIDR_BLOCK is supported at the moment
  #     network_entity_id = "drg"            # for internet_gateway_route_rules input variable, you can use special strings "drg", "internet_gateway" or pass a valid OCID using string or any Named Values
  #     description       = "Terraformed - User added Routing Rule: To drg provided to this module. drg_id, if available, is automatically retrieved with keyword drg"
  #   },
]

nat_gateway_public_ip_id = "none"
nat_route_table_id       = null # Optional ID of existing NAT gateway route table
nat_gateway_route_rules = [
  #   {
  #     destination       = "192.168.0.0/16" # Route Rule Destination CIDR
  #     destination_type  = "CIDR_BLOCK"     # only CIDR_BLOCK is supported at the moment
  #     network_entity_id = "drg"            # for nat_gateway_route_rules input variable, you can use special strings "drg", "nat_gateway" or pass a valid OCID using string or any Named Values
  #     description       = "Terraformed - User added Routing Rule: To drg provided to this module. drg_id, if available, is automatically retrieved with keyword drg"
  #   },
]

References

Subnets

Subnets are created for core components managed within the module, namely:

  • Bastion
  • Operator
  • Control plane (cp)
  • Workers
  • Pods
  • Internal load balancers (int_lb)
  • Public load balancers (pub_lb)

Create new subnets (automatic)

subnets = {
  bastion  = { newbits = 13 }
  operator = { newbits = 13 }
  cp       = { newbits = 13 }
  int_lb   = { newbits = 11 }
  pub_lb   = { newbits = 11 }
  workers  = { newbits = 2 }
  pods     = { newbits = 2 }
}

Create new subnets (forced)

subnets = {
  bastion = {
    create  = "always",
    netnum  = 0,
    newbits = 13
  }

  operator = {
    create  = "always",
    netnum  = 1,
    newbits = 13
  }

  cp = {
    create  = "always",
    netnum  = 2,
    newbits = 13
  }

  int_lb = {
    create  = "always",
    netnum  = 16,
    newbits = 11
  }

  pub_lb = {
    create  = "always",
    netnum  = 17,
    newbits = 11
  }

  workers = {
    create  = "always",
    netnum  = 1,
    newbits = 2
  }
}

Create new subnets (CIDR notation)

subnets = {
  bastion  = { cidr = "10.0.0.0/29" }
  operator = { cidr = "10.0.0.64/29" }
  cp       = { cidr = "10.0.0.8/29" }
  int_lb   = { cidr = "10.0.0.32/27" }
  pub_lb   = { cidr = "10.0.128.0/27" }
  workers  = { cidr = "10.0.144.0/20" }
  pods     = { cidr = "10.0.64.0/18" }
}

Use existing subnets

subnets = {
  operator = { id = "ocid1.subnet..." }
  cp       = { id = "ocid1.subnet..." }
  int_lb   = { id = "ocid1.subnet..." }
  pub_lb   = { id = "ocid1.subnet..." }
  workers  = { id = "ocid1.subnet..." }
  pods     = { id = "ocid1.subnet..." }
}

References

Network Security Groups

Network Security Groups (NSGs) are used to permit network access between resources created by the module, namely:

  • Bastion
  • Operator
  • Control plane (cp)
  • Workers
  • Pods
  • Internal load balancers (int_lb)
  • Public load balancers (pub_lb)

Create new NSGs

nsgs = {
  bastion  = {}
  operator = {}
  cp       = {}
  int_lb   = {}
  pub_lb   = {}
  workers  = {}
  pods     = {}
}

Use existing NSGs

nsgs = {
  bastion  = { id = "ocid1.networksecuritygroup..." }
  operator = { id = "ocid1.networksecuritygroup..." }
  cp       = { id = "ocid1.networksecuritygroup..." }
  int_lb   = { id = "ocid1.networksecuritygroup..." }
  pub_lb   = { id = "ocid1.networksecuritygroup..." }
  workers  = { id = "ocid1.networksecuritygroup..." }
  pods     = { id = "ocid1.networksecuritygroup..." }
}

References

Cluster

See also:

The OKE parameters concern mainly the following:

  • whether you want your OKE control plane to be public or private
  • whether to assign a public IP address to the API endpoint for public access
  • whether you want to deploy public or private worker nodes
  • whether you want to allow NodePort or ssh access to the worker nodes
  • Kubernetes options such as dashboard, networking
  • the number of node pools and their respective sizes
  • services and pods cidr blocks
  • whether to use encryption

Note

If you need to change the default services and pods' CIDRs, note the following:

  • The CIDR block you specify for the VCN must not overlap with the CIDR block you specify for the Kubernetes services.
  • The CIDR blocks you specify for pods running in the cluster must not overlap with CIDR blocks you specify for worker node and load balancer subnets.

Example usage

Basic cluster with defaults:

cluster_name       = "oke-example"
kubernetes_version = "v1.26.2"

Enhanced cluster with extra configuration:

create_cluster                    = true // *true/false
cluster_dns                       = null
cluster_kms_key_id                = null
cluster_name                      = "oke"
cluster_type                      = "enhanced" // *basic/enhanced
cni_type                          = "flannel"  // *flannel/npn
assign_public_ip_to_control_plane = true // true/*false
image_signing_keys                = []
kubernetes_version                = "v1.26.2"
pods_cidr                         = "10.244.0.0/16"
services_cidr                     = "10.96.0.0/16"
use_signed_images                 = false // true/*false

Workers

The worker_pools input defines worker node configuration for the cluster.

Many of the global configuration values below may be overridden on each pool definition (with the worker_ or worker_pool_ variable prefix removed, e.g. worker_image_id is overridden by image_id), or omitted to use the defaults.

For example:

worker_pool_mode = "node-pool"
worker_pool_size = 1

worker_pools = {
  oke-vm-standard = {},

  oke-vm-standard-large = {
    description      = "OKE-managed Node Pool with OKE Oracle Linux 8 image",
    shape            = "VM.Standard.E4.Flex",
    create           = true,
    ocpus            = 8,
    memory           = 128,
    boot_volume_size = 200,
    os               = "Oracle Linux",
    os_version       = "8",
  },
}

Workers: Mode

The mode parameter controls the type of resources provisioned in OCI for OKE worker nodes.

Workers / Mode: Node Pool

Deploy to Oracle Cloud

A standard OKE-managed pool of worker nodes with enhanced feature support.

Configured with mode = "node-pool" on a worker_pools entry, or with worker_pool_mode = "node-pool" to use as the default for all pools unless otherwise specified.

You can set the image_type attribute to one of the following values:

  • oke (default)
  • platform
  • custom

When image_type is set to oke or platform, the node pool image is likely to be updated on subsequent terraform apply executions, because the module uses a data source to fetch the latest available images.

To avoid this situation, you can set the image_type to custom and the image_id to the OCID of the image you want to use for the node-pool.

The following resources may be created depending on provided configuration:

Usage

worker_pool_mode = "node-pool"
worker_pool_size = 1

worker_pools = {
  oke-vm-standard = {},

  oke-vm-standard-large = {
    size             = 1,
    shape            = "VM.Standard.E4.Flex",
    ocpus            = 8,
    memory           = 128,
    boot_volume_size = 200,
    create           = false,
  },

  oke-vm-standard-ol7 = {
    description = "OKE-managed Node Pool with OKE Oracle Linux 7 image",
    size        = 1,
    os          = "Oracle Linux",
    os_version  = "7",
    create      = false,
  },

  oke-vm-standard-ol8 = {
    description = "OKE-managed Node Pool with OKE Oracle Linux 8 image",
    size        = 1,
    os          = "Oracle Linux",
    os_version  = "8",
  },

  oke-vm-standard-custom = {
    description = "OKE-managed Node Pool with custom image",
    image_type  = "custom",
    image_id    = "ocid1.image...",
    size        = 1,
    create      = false,
  },
}

References

Workers / Mode: Virtual Node Pool

Deploy to Oracle Cloud

An OKE-managed Virtual Node Pool.

Configured with mode = "virtual-node-pool" on a worker_pools entry, or with worker_pool_mode = "virtual-node-pool" to use as the default for all pools unless otherwise specified.

The following resources may be created depending on provided configuration:

Usage

worker_pools = {
  oke-virtual = {
    description = "OKE-managed Virtual Node Pool",
    mode        = "virtual-node-pool",
    size        = 1,
  },
}

References

Workers / Mode: Instance

Deploy to Oracle Cloud

A set of self-managed Compute Instances for custom user-provisioned worker nodes not managed by an OCI pool, but individually by Terraform.

Configured with mode = "instance" on a worker_pools entry, or with worker_pool_mode = "instance" to use as the default for all pools unless otherwise specified.

The following resources may be created depending on provided configuration:

Usage

worker_pools = {
  oke-vm-instance = {
    description = "Self-managed Instances",
    mode        = "instance",
    size        = 1,
    node_labels = {
      "keya" = "valuea",
      "keyb" = "valueb"
    },
    secondary_vnics = {
      "vnic-display-name" = {},
    },
  },
}

Instance agent configuration:

worker_pools = {
  oke-instance = {
    agent_config = {
      are_all_plugins_disabled = false,
      is_management_disabled   = false,
      is_monitoring_disabled   = false,
      plugins_config = {
        "Bastion"                             = "DISABLED",
        "Block Volume Management"             = "DISABLED",
        "Compute HPC RDMA Authentication"     = "DISABLED",
        "Compute HPC RDMA Auto-Configuration" = "DISABLED",
        "Compute Instance Monitoring"         = "ENABLED",
        "Compute Instance Run Command"        = "ENABLED",
        "Compute RDMA GPU Monitoring"         = "DISABLED",
        "Custom Logs Monitoring"              = "ENABLED",
        "Management Agent"                    = "ENABLED",
        "Oracle Autonomous Linux"             = "DISABLED",
        "OS Management Service Agent"         = "DISABLED",
      }
    }
  },
}

References

Workers / Mode: Instance Pool

Deploy to Oracle Cloud

A self-managed Compute Instance Pool for custom user-provisioned worker nodes.

Configured with mode = "instance-pool" on a worker_pools entry, or with worker_pool_mode = "instance-pool" to use as the default for all pools unless otherwise specified.

The following resources may be created depending on provided configuration:

Usage

worker_pools = {
  oke-vm-instance-pool = {
    description = "Self-managed Instance Pool with custom image",
    mode        = "instance-pool",
    size        = 1,
    node_labels = {
      "keya" = "valuea",
      "keyb" = "valueb"
    },
    secondary_vnics = {
      "vnic-display-name" = {},
    },
  },
}

Instance agent configuration:

worker_pools = {
  oke-instance = {
    agent_config = {
      are_all_plugins_disabled = false,
      is_management_disabled   = false,
      is_monitoring_disabled   = false,
      plugins_config = {
        "Bastion"                             = "DISABLED",
        "Block Volume Management"             = "DISABLED",
        "Compute HPC RDMA Authentication"     = "DISABLED",
        "Compute HPC RDMA Auto-Configuration" = "DISABLED",
        "Compute Instance Monitoring"         = "ENABLED",
        "Compute Instance Run Command"        = "ENABLED",
        "Compute RDMA GPU Monitoring"         = "DISABLED",
        "Custom Logs Monitoring"              = "ENABLED",
        "Management Agent"                    = "ENABLED",
        "Oracle Autonomous Linux"             = "DISABLED",
        "OS Management Service Agent"         = "DISABLED",
      }
    }
  },
}

References

Workers / Mode: Cluster Network

Deploy to Oracle Cloud

A self-managed HPC Cluster Network.

Configured with mode = "cluster-network" on a worker_pools entry, or with worker_pool_mode = "cluster-network" to use as the default for all pools unless otherwise specified.

The following resources may be created depending on provided configuration:

Usage

worker_pools = {
  oke-vm-standard = {
    description      = "Managed node pool for operational workloads without GPU toleration"
    mode             = "node-pool",
    size             = 1,
    shape            = "VM.Standard.E4.Flex",
    ocpus            = 2,
    memory           = 16,
    boot_volume_size = 50,
  },

  oke-bm-gpu-rdma = {
    description   = "Self-managed nodes in a Cluster Network with RDMA networking",
    mode          = "cluster-network",
    size          = 1,
    shape         = "BM.GPU.B4.8",
    placement_ads = [1],
    image_id      = "ocid1.image..."
    cloud_init = [
      {
        content = <<-EOT
        #!/usr/bin/env bash
        echo "Pool-specific cloud_init using shell script"
        EOT
      },
    ],
    secondary_vnics = {
      "vnic-display-name" = {
        nic_index = 1,
        subnet_id = "ocid1.subnet..."
      },
    },
  }
}

Instance agent configuration:

worker_pools = {
  oke-instance = {
    agent_config = {
      are_all_plugins_disabled = false,
      is_management_disabled   = false,
      is_monitoring_disabled   = false,
      plugins_config = {
        "Bastion"                             = "DISABLED",
        "Block Volume Management"             = "DISABLED",
        "Compute HPC RDMA Authentication"     = "DISABLED",
        "Compute HPC RDMA Auto-Configuration" = "DISABLED",
        "Compute Instance Monitoring"         = "ENABLED",
        "Compute Instance Run Command"        = "ENABLED",
        "Compute RDMA GPU Monitoring"         = "DISABLED",
        "Custom Logs Monitoring"              = "ENABLED",
        "Management Agent"                    = "ENABLED",
        "Oracle Autonomous Linux"             = "DISABLED",
        "OS Management Service Agent"         = "DISABLED",
      }
    }
  },
}

References

Workers: Network

Subnets

worker_pool_mode = "node-pool"
worker_pool_size = 1

worker_subnet_id = "ocid1.subnet..."

worker_pools = {
  oke-vm-custom-subnet-flannel = {
    subnet_id = "ocid1.subnet..."
  },

  oke-vm-custom-subnet-npn = {
    subnet_id     = "ocid1.subnet..."
    pod_subnet_id = "ocid1.subnet..." // when cni_type = "npn"
  },
}

Network Security Groups

worker_pool_mode = "node-pool"
worker_pool_size = 1

worker_nsg_ids = ["ocid1.networksecuritygroup..."]
pod_nsg_ids    = [] // when cni_type = "npn"

worker_pools = {
  oke-vm-custom-nsgs-flannel = {
    nsg_ids = ["ocid1.networksecuritygroup..."]
  },

  oke-vm-custom-nsgs-npn = {
    nsg_ids     = ["ocid1.networksecuritygroup..."]
    pod_nsg_ids = ["ocid1.networksecuritygroup..."] // when cni_type = "npn"
  },
}

Secondary VNICs

On pools with a self-managed mode:

worker_pool_mode = "node-pool"
worker_pool_size = 1

kubeproxy_mode    = "iptables" // *iptables/ipvs
worker_is_public  = false
assign_public_ip  = false
worker_nsg_ids    = ["ocid1.networksecuritygroup..."]
worker_subnet_id  = "ocid1.subnet..."
max_pods_per_node = 110
pod_nsg_ids       = [] // when cni_type = "npn"

worker_pools = {
  oke-vm-custom-network-flannel = {
    assign_public_ip = false,
    create           = false,
    subnet_id        = "ocid1.subnet..."
    nsg_ids          = ["ocid1.networksecuritygroup..."]
  },

  oke-vm-custom-network-npn = {
    assign_public_ip = false,
    create           = false,
    subnet_id        = "ocid1.subnet..."
    pod_subnet_id    = "ocid1.subnet..."
    nsg_ids          = ["ocid1.networksecuritygroup..."]
    pod_nsg_ids      = ["ocid1.networksecuritygroup..."]
  },

  oke-vm-vnics = {
    mode   = "instance-pool",
    size   = 1,
    create = false,
    secondary_vnics = {
      vnic0 = {
        nic_index = 0,
        subnet_id = "ocid1.subnet..."
      },
      vnic1 = {
        nic_index = 1,
        subnet_id = "ocid1.subnet..."
      },
    },
  },

  oke-bm-vnics = {
    mode          = "cluster-network",
    size          = 2,
    shape         = "BM.GPU.B4.8",
    placement_ads = [1],
    create        = false,
    secondary_vnics = {
      gpu0 = {
        nic_index = 0,
        subnet_id = "ocid1.subnet..."
      },
      gpu1 = {
        nic_index = 1,
        subnet_id = "ocid1.subnet..."
      },
    },
  },
}

Workers: Image

The operating system image for worker nodes may be defined both globally and on each worker pool.
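
For example, a sketch combining a global default with a per-pool override (the image OCIDs are placeholders, and the global worker_image_type/worker_image_id names assume the worker_-prefixed counterparts of the pool attributes described above):

worker_image_type = "custom"
worker_image_id   = "ocid1.image..." # default image for all pools

worker_pools = {
  oke-vm-custom-image = {
    image_type = "custom",
    image_id   = "ocid1.image...", # overrides the global default for this pool
  }
}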

Recommended base images:

Workers: Cloud-Init

Custom actions may be configured on instance startup in a number of ways depending on the use case and preferences.

See also:

Global

Cloud init configuration applied to all workers:

worker_cloud_init = [
  {
    content      = <<-EOT
    runcmd:
    - echo "Global cloud_init using cloud-config"
    EOT
    content_type = "text/cloud-config",
  },
  {
    content      = "/path/to/file"
    content_type = "text/cloud-boothook",
  },
  {
    content      = "<Base64-encoded content>"
    content_type = "text/x-shellscript",
  },
]

Pool-specific

Cloud init configuration applied to a specific worker pool:

worker_pools = {
  pool_default = {}
  pool_custom = {
    cloud_init = [
      {
        content      = <<-EOT
        runcmd:
        - echo "Pool-specific cloud_init using cloud-config"
        EOT
        content_type = "text/cloud-config",
      },
      {
        content      = "/path/to/file"
        content_type = "text/cloud-boothook",
      },
      {
        content      = "<Base64-encoded content>"
        content_type = "text/x-shellscript",
      },
    ]
  }
}

Default Cloud-Init Disabled

When providing a custom script that calls OKE initialization:

worker_disable_default_cloud_init = true

Workers: Scaling

There are two easy ways to add worker nodes to a cluster:

  • Add entries to worker_pools.
  • Increase the size of a worker_pools entry.

Worker pools can be added and removed, and their size and boot volume size can be updated. After each change, run terraform apply.

Changes to the number and size of pools take effect immediately after updating the parameters and running terraform apply. Changes to boot volume size only take effect for nodes created after the change.
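
For example, to scale an existing pool up and add a new pool (a sketch based on the earlier np1/np2 example; apply with terraform apply):

worker_pools = {
  np1 = {
    shape  = "VM.Standard.E4.Flex",
    ocpus  = 2,
    memory = 32,
    size   = 3, # previously 1; scaled up in place
  }
  np3 = {       # new pool added to the cluster
    shape  = "VM.Standard.E4.Flex",
    ocpus  = 2,
    memory = 32,
    size   = 2,
  }
}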

Autoscaling

See Extensions/Cluster Autoscaler.

Examples

Workers: Storage

TODO

Workers: Draining

Usage

worker_pool_mode = "node-pool"
worker_pool_size = 1

# Configuration for draining nodes through operator
worker_drain_ignore_daemonsets = true
worker_drain_delete_local_data = true
worker_drain_timeout_seconds   = 900

worker_pools = {
  oke-vm-active = {
    description = "Node pool with active workers",
    size        = 2,
  },
  oke-vm-draining = {
    description = "Node pool with scheduling disabled and draining through operator",
    drain       = true,
  },
  oke-vm-disabled = {
    description = "Node pool with resource creation disabled (destroyed)",
    create      = false,
  },
  oke-managed-drain = {
    description                          = "Node pool with custom settings for managed cordon & drain",
    eviction_grace_duration              = 30, # specified in seconds
    is_force_delete_after_grace_duration = true
  },
}

Example

Terraform will perform the following actions:

  # module.workers_only.module.utilities[0].null_resource.drain_workers[0] will be created
  + resource "null_resource" "drain_workers" {
      + id       = (known after apply)
      + triggers = {
          + "drain_commands" = jsonencode(
                [
                  + "kubectl drain --timeout=900s --ignore-daemonsets=true --delete-emptydir-data=true -l oke.oraclecloud.com/pool.name=oke-vm-draining",
                ]
            )
          + "drain_pools"    = jsonencode(
                [
                  + "oke-vm-draining",
                ]
            )
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.
module.workers_only.module.utilities[0].null_resource.drain_workers[0] (remote-exec): node/10.200.220.157 cordoned
module.workers_only.module.utilities[0].null_resource.drain_workers[0] (remote-exec): WARNING: ignoring DaemonSet-managed Pods: kube-system/csi-oci-node-99x74, kube-system/kube-flannel-ds-spvsp, kube-system/kube-proxy-6m2kk, ...
module.workers_only.module.utilities[0].null_resource.drain_workers[0] (remote-exec): node/10.200.220.157 drained
module.workers_only.module.utilities[0].null_resource.drain_workers[0]: Creation complete after 18s [id=7686343707387113624]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Observe that the node(s) are now disabled for scheduling, and free of workloads other than DaemonSet-managed Pods when worker_drain_ignore_daemonsets = true (default):

kubectl get nodes -l oke.oraclecloud.com/pool.name=oke-vm-draining
NAME             STATUS                     ROLES   AGE   VERSION
10.200.220.157   Ready,SchedulingDisabled   node    24m   v1.26.2

kubectl get pods --all-namespaces --field-selector spec.nodeName=10.200.220.157
NAMESPACE     NAME                    READY   STATUS    RESTARTS   AGE
kube-system   csi-oci-node-99x74      1/1     Running   0          50m
kube-system   kube-flannel-ds-spvsp   1/1     Running   0          50m
kube-system   kube-proxy-6m2kk        1/1     Running   0          50m
kube-system   proxymux-client-2r6lk   1/1     Running   0          50m

Run the following command to uncordon a previously drained worker pool. The drain = true setting should be removed from the worker_pools entry to avoid re-draining the pool when running Terraform in the future.

kubectl uncordon -l oke.oraclecloud.com/pool.name=oke-vm-draining
node/10.200.220.157 uncordoned

References

Workers: Node Cycle

Cycling nodes simplifies upgrading the Kubernetes and host OS versions running on the managed worker nodes, as well as updating other worker node properties.

When you set node_cycling_enabled to true for a node pool, Container Engine for Kubernetes will compare the properties of the existing nodes in the node pool with the properties of the node_pool. If any of the following attributes is not aligned, the node is marked for replacement:

  • kubernetes_version
  • node_labels
  • compute_shape (shape, ocpus, memory)
  • boot_volume_size
  • image_id
  • node_metadata
  • ssh_public_key
  • cloud_init
  • nsg_ids
  • volume_kms_key_id
  • pv_transit_encryption

The node_cycling_max_surge (default: 1) and node_cycling_max_unavailable (default: 0) node_pool attributes can be configured with absolute values or percentage values, calculated relative to the node_pool size. These attributes determine how the Container Engine for Kubernetes will replace the nodes with a stale config in the node_pool.

When cycling nodes, the Container Engine for Kubernetes cordons, drains, and terminates nodes according to the node pool's cordon and drain options.

Notes:

  • It's strongly recommended to use readiness probes and PodDisruptionBudgets to reduce the impact of the node replacement operation.
  • This operation is supported only with the enhanced OKE clusters.
  • New nodes will be created within the same AD/FD as the ones they replace.
  • Node cycle requests can be canceled but can't be reverted.
  • When setting a high node_cycling_max_surge value, check your tenancy compute limits to confirm availability of resources for the new worker nodes.
  • Compatible with the cluster_autoscaler. During node-cycling execution, the request to reduce node_pool size is rejected, and all the worker nodes within the cycled node_pool are annotated with "cluster-autoscaler.kubernetes.io/scale-down-disabled": "true" to prevent the termination of the newly created nodes.
  • node_cycling_enabled = true is incompatible with changes to the node_pool placement_config (subnet_id, availability_domains, placement_fds, etc.)
  • If the kubernetes_version attribute is changed when image_type = custom, ensure a compatible image_id with the new Kubernetes version is provided.

Usage

# Example worker pool node-cycle configuration.

worker_pools = {
  cycled-node-pool = {
    description                  = "Cycling nodes in a node_pool.",
    size                         = 4,
    node_cycling_enabled         = true
    node_cycling_max_surge       = "25%"
    node_cycling_max_unavailable = 0
  }
}

References

Load Balancers

Using Dynamic and Flexible Load Balancers

When you create a service of type LoadBalancer, by default, an OCI Load Balancer with dynamic shape 100Mbps will be created.

You can override this shape by using the OCI Load Balancer annotations. To keep using the dynamic shape but change the available total bandwidth to 400Mbps, use the following annotation on your LoadBalancer service:

service.beta.kubernetes.io/oci-load-balancer-shape: "400Mbps"

Configure flexible shape with bandwidth:

service.beta.kubernetes.io/oci-load-balancer-shape: "flexible"
service.beta.kubernetes.io/oci-load-balancer-shape-flex-min: "50"
service.beta.kubernetes.io/oci-load-balancer-shape-flex-max: "200"

References

Bastion

The bastion instance provides a public SSH entry point into the VCN from which resources in private subnets may be accessed. It is recommended as a way to limit public IP usage and exposure.

The bastion host parameters concern:

  1. whether you want to enable the bastion
  2. from where you can access the bastion
  3. the different parameters about the bastion host e.g. shape, image id etc.

Image

The OS image for the created bastion instance.

Recommended: Oracle Autonomous Linux 8.x

Example usage

create_bastion              = true           # *true/false
bastion_allowed_cidrs       = []             # e.g. ["0.0.0.0/0"] to allow traffic from all sources
bastion_availability_domain = null           # Defaults to first available
bastion_image_id            = null           # Ignored when bastion_image_type = "platform"
bastion_image_os            = "Oracle Linux" # Ignored when bastion_image_type = "custom"
bastion_image_os_version    = "8"            # Ignored when bastion_image_type = "custom"
bastion_image_type          = "platform"     # platform/custom
bastion_nsg_ids             = []             # Combined with created NSG when enabled in var.nsgs
bastion_public_ip           = null           # Ignored when create_bastion = true
bastion_type                = "public"       # *public/private
bastion_upgrade             = false          # true/*false
bastion_user                = "opc"

bastion_shape = {
  shape            = "VM.Standard.E4.Flex",
  ocpus            = 1,
  memory           = 4,
  boot_volume_size = 50
}

Bastion: SSH

Command usage for ssh through the created bastion to the operator host is included in the module's output:

$ terraform output
cluster = {
  "bastion_public_ip" = "138.0.0.1"
  "ssh_to_operator" = "ssh -J opc@138.0.0.1 opc@10.0.0.16"
  ...
  }

$ ssh -J opc@138.0.0.1 opc@10.0.0.16 kubectl get nodes
NAME          STATUS   ROLES    AGE     VERSION
10.1.48.175   Ready    node     7d10h   v1.25.6
10.1.50.102   Ready    node     3h12m   v1.25.6
10.1.52.76    Ready    node     7d10h   v1.25.6
10.1.54.237   Ready    node     5h41m   v1.25.6
10.1.58.74    Ready    node     5h22m   v1.25.4
10.1.62.90    Ready    node     3h12m   v1.25.6

$ ssh -J opc@138.0.0.1 opc@10.1.54.237 systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
     Active: active (running) since Tue 2023-03-28 01:48:08 UTC; 5h 48min ago
...

Operator

The operator instance provides an optional environment within the VCN from which the OKE cluster can be managed.

The operator host parameters concern:

  1. whether you want to enable the operator
  2. from where you can access the operator
  3. the different parameters about the operator host e.g. shape, image id etc.

Example usage

create_operator                = true # *true/false
operator_availability_domain   = null
operator_cloud_init            = []
operator_image_id              = null           # Ignored when operator_image_type = "platform"
operator_image_os              = "Oracle Linux" # Ignored when operator_image_type = "custom"
operator_image_os_version      = "8"            # Ignored when operator_image_type = "custom"
operator_image_type            = "platform"
operator_nsg_ids               = []
operator_private_ip            = null
operator_pv_transit_encryption = false # true/*false
operator_upgrade               = false # true/*false
operator_user                  = "opc"
operator_volume_kms_key_id     = null

operator_shape = {
  shape            = "VM.Standard.E4.Flex",
  ocpus            = 1,
  memory           = 4,
  boot_volume_size = 50
}

Operator: Cloud-Init

Custom actions may be configured on instance startup in a number of ways, depending on the use case and preferences.

See also:

Cloud init configuration applied to the operator host:

operator_cloud_init = [
  {
    content      = <<-EOT
    runcmd:
    - echo "Operator cloud_init using cloud-config"
    EOT
    content_type = "text/cloud-config",
  },
  {
    content      = "/path/to/file"
    content_type = "text/cloud-boothook",
  },
  {
    content      = "<Base64-encoded content>"
    content_type = "text/x-shellscript",
  },
]

Operator: Identity

instance_principal

Instance_principal is an IAM service feature that enables instances to be authorized actors (or principals) to perform actions on service resources. Each compute instance has its own identity, and it authenticates using the certificates that are added to it. These certificates are automatically created, assigned to instances and rotated, preventing the need for you to distribute credentials to your hosts and rotate them.

Dynamic Groups group OCI instances as principal actors, similar to user groups. IAM policies can then be created to allow instances in these groups to make calls against OCI infrastructure services. For example, on the operator host, this permits kubectl to access the OKE cluster.

Any user who has access to the instance (i.e. who can SSH to the instance) automatically inherits the privileges granted to the instance. Before you enable this feature, ensure that you know who can access it, and that those users should be authorized with the permissions you are granting to the instance.

By default, this feature is disabled. However, it is required at the time of cluster creation if you wish to enable KMS Integration or Extensions.

When you enable this feature, by default, the operator host will have privileges to all resources in the compartment. If you are enabling it for KMS Integration, the operator host will also have rights to create policies in the root tenancy.

Enabling instance_principal for the operator instance

instance_principal for the operator instance can be enabled or disabled at any time without impact on the operator or the cluster.

To enable this feature, specify the following to create the necessary IAM policies, Dynamic Groups, and Matching Rules:

create_iam_resources = true
create_iam_operator_policy = "always"

To disable this feature, specify:

create_iam_operator_policy = "never"

Operator: SSH

Command usage for ssh through the created bastion to the operator host is included in the module's output:

$ terraform output
cluster = {
  "bastion_public_ip" = "138.0.0.1"
  "ssh_to_operator" = "ssh -J opc@138.0.0.1 opc@10.0.0.16"
  ...
}

$ ssh -J opc@138.0.0.1 opc@10.0.0.16 kubectl get nodes
NAME          STATUS   ROLES    AGE     VERSION
10.1.48.175   Ready    node     7d10h   v1.25.6
10.1.50.102   Ready    node     3h12m   v1.25.6
10.1.52.76    Ready    node     7d10h   v1.25.6
10.1.54.237   Ready    node     5h41m   v1.25.6
10.1.58.74    Ready    node     5h22m   v1.25.4
10.1.62.90    Ready    node     3h12m   v1.25.6

Utilities

OCIR

NOTE: TODO Pending validation in 5.x

The auth token must first be created manually and stored as a secret in OCI Vault. It will subsequently be used to create a Kubernetes secret, which can then be referenced as an imagePullSecret in a deployment. If you do not need to use private OCIR repositories, leave the secret_id parameter empty.

The secret is created in the "default" namespace. To copy it to your namespace, use the following command:

kubectl --namespace=default get secret ocirsecret -o yaml | sed -E '/(namespace|resourceVersion|uid|creationTimestamp):/d' | kubectl apply --namespace=<newnamespace> -f -

Creating a Secret

Oracle Cloud Infrastructure Registry is a highly available private container registry service for storing and sharing container images within the same regions as the OKE Cluster. Use the following rules to determine if you need to create a Kubernetes Secret for OCIR:

  • If your container repository is public, you do not need to create a secret.
  • If your container repository is private, you need to create a secret before OKE can pull your images from the private repository.

If you plan on creating a Kubernetes Secret for OCIR, you must first create an Auth Token. Copy and temporarily save the value of the Auth Token.

You must then create a Secret in OCI Vault to store the value of the Auth Token in it.

Finally, assign the Secret OCID to secret_id in terraform.tfvars. Refer to the OCIR parameters in the inputs reference for the other parameters to be set.
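
As a sketch, the related ocir_* inputs in terraform.tfvars might look like the following (all values are illustrative):

ocir_email_address    = "user@example.com"           # illustrative
ocir_secret_id        = "ocid1.vaultsecret.oc1.."    # OCID of the Vault secret holding the auth token
ocir_secret_name      = "ocirsecret"
ocir_secret_namespace = "default"
ocir_username         = "mytenancy/user@example.com" # illustrative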

NOTE: Installing the Vertical Pod Autoscaler also requires installing the Metrics Server, so you need to enable that too.

Service account

NOTE: TODO Pending validation in 5.x

OKE now uses Kubeconfig v2 which means the default token has a limited lifespan. In order to allow CI/CD tools to deploy to OKE, a service account must be created.

Set create_service_account = true and adjust the other parameters as appropriate:

create_service_account = true
service_account_name = "kubeconfigsa"
service_account_namespace = "kube-system"
service_account_cluster_role_binding = ""
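
Once the service account exists, a token for CI/CD use can be requested from the operator host with kubectl create token (available in Kubernetes 1.24 and later); the duration shown is illustrative and may be capped by the cluster:

$ kubectl -n kube-system create token kubeconfigsa --duration=24h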

KMS

The KMS integration parameters control whether OCI Key Management Service (KMS) will be used for encrypting Kubernetes secrets and boot/block volumes. KMS integration also requires the bastion and operator hosts to be enabled, as well as instance_principal on the operator.
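
A sketch of the related inputs, assuming pre-existing KMS keys (the key OCIDs are illustrative) and the variable names from the inputs reference:

create_bastion             = true
create_operator            = true
create_iam_resources       = true
create_iam_kms_policy      = "always"
cluster_kms_key_id         = "ocid1.key.oc1.." # Kubernetes secrets encryption
worker_volume_kms_key_id   = "ocid1.key.oc1.." # worker boot/block volume encryption
operator_volume_kms_key_id = "ocid1.key.oc1.." # operator boot volume encryption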

OKE also supports enforcing the use of signed images. You can enforce the use of signed images using the following parameters:

use_signed_images  = false
image_signing_keys = ["ocid1.key.oc1....", "ocid1.key.oc1...."]

Reference

Extensions


WARNING: The following options are provided as a reference for evaluation only, and may install software to the cluster that is not supported by or sourced from Oracle. These features should be enabled with caution as their operation is not guaranteed!


Gatekeeper

Usage

gatekeeper_install           = true
gatekeeper_namespace         = "kube-system"
gatekeeper_helm_version      = "3.11.0"
gatekeeper_helm_values       = {}
gatekeeper_helm_values_files = []

References


MPI Operator

Usage

mpi_operator_install        = true
mpi_operator_namespace      = "default"
mpi_operator_deployment_url = null // determined automatically for version by default
mpi_operator_version        = "0.4.0"

References


Extensions: Standalone Cluster Autoscaler

Deployed using the cluster-autoscaler Helm chart with configuration from the worker_pools variable.

The module uses the oke.oraclecloud.com/cluster_autoscaler nodepool label to indicate how the Kubernetes cluster autoscaler will interact with a given node:

  • allowed - cluster-autoscaler deployment will be allowed to run on the nodes with this label
  • managed - cluster-autoscaler is managing this node (may terminate it if required)
  • disabled - cluster-autoscaler will not run nor manage the node.

The following parameters may be added on each pool definition to enable management or scheduling of the cluster autoscaler:

  • allow_autoscaler: Enable scheduling of the cluster autoscaler deployment on a pool by adding a node label matching the deployment's nodeSelector (oke.oraclecloud.com/cluster_autoscaler: allowed), and an OCI defined tag for use with IAM tag-based policies granting access to the instances (${var.tag_namespace}.cluster_autoscaler: allowed).
  • autoscale: Enable cluster autoscaler management of the pool by appending --nodes <nodepool-ocid> argument to the CMD of the cluster-autoscaler container. Nodes part of these nodepools will have the label oke.oraclecloud.com/cluster_autoscaler: managed and an OCI defined tag ${var.tag_namespace}.cluster_autoscaler: managed.
  • min_size: Define the minimum scale of a pool managed by the cluster autoscaler. Defaults to size when not provided.
  • max_size: Define the maximum scale of a pool managed by the cluster autoscaler. Defaults to size when not provided.

The cluster autoscaler will manage the size of the nodepools with the attribute autoscale = true. To avoid conflicts between the actual size of a nodepool and the size defined in the Terraform configuration, you can add the ignore_initial_pool_size = true attribute to the nodepool definition in the worker_pools variable. This allows Terraform to ignore drift of the size parameter for that specific nodepool.

This setting is strongly recommended for nodepools configured with autoscale = true.

Example:

worker_pools = {
  np-autoscaled = {
    description              = "Node pool managed by cluster autoscaler",
    size                     = 2,
    min_size                 = 1,
    max_size                 = 3,
    autoscale                = true,
    ignore_initial_pool_size = true # allows nodepool size drift
  },
  np-autoscaler = {
    description      = "Node pool with cluster autoscaler scheduling allowed",
    size             = 1,
    allow_autoscaler = true,
  },
}

For existing deployments, it is necessary to use the terraform state mv command.

Example for nodepool resource:


$ terraform plan
...
Terraform will perform the following actions:
  
  # module.oke.module.workers[0].oci_containerengine_node_pool.tfscaled_workers["np-autoscaled"] will be destroyed
...

  # module.oke.module.workers[0].oci_containerengine_node_pool.autoscaled_workers["np-autoscaled"] will be created


$ terraform state mv module.oke.module.workers[0].oci_containerengine_node_pool.tfscaled_workers[\"np-autoscaled\"]  module.oke.module.workers[0].oci_containerengine_node_pool.autoscaled_workers[\"np-autoscaled\"]

Successfully moved 1 object(s).

$ terraform plan
...
No changes. Your infrastructure matches the configuration.

Example for instance_pool resource:

$ terraform state mv module.oke.module.workers[0].oci_core_instance_pool.tfscaled_workers[\"np-autoscaled\"] module.oke.module.workers[0].oci_core_instance_pool.autoscaled_workers[\"np-autoscaled\"]

Successfully moved 1 object(s).

Notes

Don't set allow_autoscaler and autoscale to true on the same pool. This will cause the cluster autoscaler pod to be unschedulable as the oke.oraclecloud.com/cluster_autoscaler: managed node label will override the oke.oraclecloud.com/cluster_autoscaler: allowed node label specified by the cluster autoscaler nodeSelector pod attribute.

Usage

cluster_autoscaler_install           = true
cluster_autoscaler_namespace         = "kube-system"
cluster_autoscaler_helm_version      = "9.24.0"
cluster_autoscaler_helm_values       = {}
cluster_autoscaler_helm_values_files = []
# Example worker pool configurations with cluster autoscaler

worker_pools = {
  np-autoscaled = {
    description              = "Node pool managed by cluster autoscaler",
    size                     = 2,
    min_size                 = 1,
    max_size                 = 3,
    autoscale                = true,
    ignore_initial_pool_size = true
  },
  np-autoscaler = {
    description      = "Node pool with cluster autoscaler scheduling allowed",
    size             = 1,
    allow_autoscaler = true,
  },
}

References

Extensions: Networking


WARNING: The following options are provided as a reference for evaluation only, and may install software to the cluster that is not supported by or sourced from Oracle. These features should be enabled with caution as their operation is not guaranteed!


Multus CNI

Usage

multus_install       = true
multus_namespace     = "network"
multus_daemonset_url = null // determined automatically for version by default
multus_version       = "3.9.3"

References


Cilium CNI

Usage

cilium_install           = true
cilium_reapply           = false
cilium_namespace         = "network"
cilium_helm_version      = "1.14.4"
cilium_helm_values       = {}
cilium_helm_values_files = []

References


Whereabouts IPAM plugin

Usage

whereabouts_install       = true
whereabouts_namespace     = "network"
whereabouts_daemonset_url = null // determined automatically for version by default
whereabouts_version       = "master"

References


SR-IOV Device plugin

Usage

sriov_device_plugin_install       = true
sriov_device_plugin_namespace     = "network"
sriov_device_plugin_daemonset_url = null // determined automatically for version by default
sriov_device_plugin_version       = "master"

References


SR-IOV CNI plugin

Usage

sriov_cni_plugin_install       = true
sriov_cni_plugin_namespace     = "network"
sriov_cni_plugin_daemonset_url = null // determined automatically for version by default
sriov_cni_plugin_version       = "master"

References


RDMA CNI plugin

Usage

rdma_cni_plugin_install       = true
rdma_cni_plugin_namespace     = "network"
rdma_cni_plugin_daemonset_url = null // determined automatically for version by default
rdma_cni_plugin_version       = "master"

References


Extensions: Monitoring


WARNING: The following options are provided as a reference for evaluation only, and may install software to the cluster that is not supported by or sourced from Oracle. These features should be enabled with caution as their operation is not guaranteed!


Metrics Server

Usage

metrics_server_install           = true
metrics_server_namespace         = "metrics"
metrics_server_helm_version      = "3.8.3"
metrics_server_helm_values       = {}
metrics_server_helm_values_files = []

References


Prometheus

Usage

prometheus_install           = true
prometheus_reapply           = false
prometheus_namespace         = "metrics"
prometheus_helm_version      = "45.2.0"
prometheus_helm_values       = {}
prometheus_helm_values_files = []
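
Individual chart values can be overridden through prometheus_helm_values. As an example, assuming the kube-prometheus-stack chart's grafana.enabled value:

prometheus_helm_values = {
  "grafana.enabled" = "false"
}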

References


DCGM Exporter

Usage

dcgm_exporter_install           = true
dcgm_exporter_reapply           = false
dcgm_exporter_namespace         = "metrics"
dcgm_exporter_helm_version      = "3.1.5"
dcgm_exporter_helm_values       = {}
dcgm_exporter_helm_values_files = []

References


Upgrading

TODO Update content

This section documents how to upgrade the OKE cluster using this project. At a high level, upgrading the OKE cluster is fairly straightforward:

  1. Upgrade the control plane nodes
  2. Upgrade the worker nodes using either the in-place or out-of-place approach

These steps must be performed in order.

Prerequisites

For in-place upgrade:

  • Enhanced cluster

For out-of-place upgrade:

  • Bastion host is created
  • Operator host is created
  • instance_principal is enabled on operator

Upgrading the control plane nodes

Locate your kubernetes_version in your Terraform variable file and change:

kubernetes_version = "v1.22.5" 

to

kubernetes_version = "v1.23.4"

Run terraform apply. This will upgrade the control plane nodes. You can verify this in the OCI Console.

Tip

If you have modified the default resources e.g. security lists, you will need to use a targeted apply:

terraform apply --target=module.oke.k8s_cluster

Upgrading the worker nodes using the in-place method

In-place worker node upgrade is performed using the node_pool node_cycle operation.

Set node_cycling_enabled = true for the existing node_pools you want to upgrade, and control the node replacement strategy using node_cycling_max_surge and node_cycling_max_unavailable.

worker_pools = {
  cycled-node-pool = {
    description                  = "Cycling nodes in a node_pool.",
    size                         = 2,
    node_cycling_enabled         = true
    node_cycling_max_surge       = 1
    node_cycling_max_unavailable = 0
  }
}

By default, the node_pools are using the same Kubernetes version as the control plane (defined in the kubernetes_version variable).

Note: You can override each node_pool Kubernetes version via the kubernetes_version attribute in the worker_pools variable.

kubernetes_version = "v1.26.7" # control plane Kubernetes version (used by default for the node_pools).

worker_pools = {
  cycled-node-pool = {
    description                  = "Cycling nodes in a node_pool.",
    size                         = 2,
    kubernetes_version           = "v1.26.2" # override the default Kubernetes version
  }
}

Worker node image compatibility

If the node_pool is configured to use a custom worker node image (image_type = custom), make sure that the worker node image referenced in the image_id attribute of the worker_pools is compatible with the new kubernetes_version.

kubernetes_version = "v1.26.7" # control plane Kubernetes version (used by default for the node_pools).

worker_pools = {
  cycled-node-pool = {
    description                  = "Cycling nodes in a node_pool.",
    size                         = 2,
    image_type                   = "custom",
    image_id                     = "ocid1.image..."
  }
}

Note: A new image_id compatible with the node_pool kubernetes_version is automatically selected when image_type is not configured for the node_pool or is set to "oke" or "platform".

Upgrading the worker nodes using the out-of-place method

Add new node pools

Add a new node pool in your list of node pools e.g. change:

worker_pools = {
  np1 = ["VM.Standard.E2.2", 7, 50]
  np2 = ["VM.Standard2.8", 5, 50]
}

to

worker_pools = {
  np1 = ["VM.Standard.E2.2", 7, 50]
  np2 = ["VM.Standard2.8", 5, 50]
  np3 = ["VM.Standard.E2.2", 7, 50]
  np4 = ["VM.Standard2.8", 5, 50]
}

and run terraform apply again. (See note above about targeted apply). If you are using Kubernetes labels for your existing applications, you will need to ensure the new node pools also have the same labels. Refer to the terraform.tfvars.example file for the format to specify the labels.

When node pools 3 and 4 are created, they will be created with the newer cluster version of Kubernetes. Since you have already upgraded your cluster to v1.23.4, node pools 3 and 4 will be running Kubernetes v1.23.4.

Drain older nodepools

Set upgrade_nodepool = true. This instructs the module that some node pools will be drained.

Provide the list of node pools to drain. This should usually be only the old node pools. You don't need to upgrade all the node pools at once.

worker_pools_to_drain = [ "np1", "np2"] 
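
Combined with the upgrade_nodepool flag described above, the drain step in terraform.tfvars might look like:

upgrade_nodepool      = true
worker_pools_to_drain = ["np1", "np2"]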

Rerun terraform apply (see note above about targeted apply).

Delete node pools with older Kubernetes version

When you are ready, you can now delete the old node pools by removing them from the list of node pools:

worker_pools = {
  np3 = ["VM.Standard.E2.2", 7, 50]
  np4 = ["VM.Standard2.8", 5, 50]
}

Rerun terraform apply. This completes the upgrade process. Finally, set upgrade_nodepool = false to prevent the current nodes from being drained by mistake.

Deploy the OKE Terraform Module

Prerequisites

Provisioning from an OCI Resource Manager Stack

Network

Deploy to Oracle Cloud

Network resources configured for an OKE cluster.

The following resources may be created depending on provided configuration:

Cluster

Deploy to Oracle Cloud

An OKE-managed Kubernetes cluster.

The following resources may be created depending on provided configuration:

Node Pool

Deploy to Oracle Cloud

A standard OKE-managed pool of worker nodes with enhanced feature support.

Configured with mode = "node-pool" on a worker_pools entry, or with worker_pool_mode = "node-pool" to use as the default for all pools unless otherwise specified.

You can set the image_type attribute to one of the following values:

  • oke (default)
  • platform
  • custom.

When image_type is set to oke or platform, there is a high risk of the node-pool image being updated on subsequent terraform apply executions, because the module uses a data source to fetch the latest available images.

To avoid this situation, you can set the image_type to custom and the image_id to the OCID of the image you want to use for the node-pool.
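
For example, a sketch of a worker_pools entry pinning a custom image (the pool name and image OCID are illustrative):

worker_pools = {
  np-custom-image = {
    description = "Node pool with a pinned custom image",
    mode        = "node-pool",
    size        = 1,
    image_type  = "custom",
    image_id    = "ocid1.image.oc1.." # illustrative image OCID
  }
}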

The following resources may be created depending on provided configuration:

Virtual Node Pool

Deploy to Oracle Cloud

An OKE-managed Virtual Node Pool.

Configured with mode = "virtual-node-pool" on a worker_pools entry, or with worker_pool_mode = "virtual-node-pool" to use as the default for all pools unless otherwise specified.

The following resources may be created depending on provided configuration:

Instance

Deploy to Oracle Cloud

A set of self-managed Compute Instances for custom user-provisioned worker nodes not managed by an OCI pool, but individually by Terraform.

Configured with mode = "instance" on a worker_pools entry, or with worker_pool_mode = "instance" to use as the default for all pools unless otherwise specified.

The following resources may be created depending on provided configuration:

Instance Pool

Deploy to Oracle Cloud

A self-managed Compute Instance Pool for custom user-provisioned worker nodes.

Configured with mode = "instance-pool" on a worker_pools entry, or with worker_pool_mode = "instance-pool" to use as the default for all pools unless otherwise specified.

The following resources may be created depending on provided configuration:

Cluster Network

Deploy to Oracle Cloud

A self-managed HPC Cluster Network.

Configured with mode = "cluster-network" on a worker_pools entry, or with worker_pool_mode = "cluster-network" to use as the default for all pools unless otherwise specified.

The following resources may be created depending on provided configuration:

Reference

Inputs

The module supports the following configuration for created resources:

NameDescriptionTypeDefaultRequired
allow_rules_internal_lbA map of additional rules to allow incoming traffic for internal load balancers.any{}no
allow_rules_public_lbA map of additional rules to allow incoming traffic for public load balancers.any{}no
allow_rules_workersA map of additional rules to allow traffic for the workers.any{}no
defined_tagsDefined tags to be applied to created resources. Must already exist in the tenancy.any{
"bastion": {},
"cluster": {},
"iam": {},
"network": {},
"operator": {},
"persistent_volume": {},
"service_lb": {},
"workers": {}
}
no
drg_attachmentsDRG attachment configurations.any{}no
freeform_tagsFreeform tags to be applied to created resources.any{
"bastion": {},
"cluster": {},
"iam": {},
"network": {},
"operator": {},
"persistent_volume": {},
"service_lb": {},
"workers": {}
}
no
worker_poolsTuple of OKE worker pools where each key maps to the OCID of an OCI resource, and value contains its definition.any{}no
allow_bastion_cluster_accessWhether to allow access to the Kubernetes cluster endpoint from the bastion host.boolfalseno
allow_node_port_accessWhether to allow access from worker NodePort range to load balancers.boolfalseno
allow_pod_internet_accessAllow pods to egress to internet. Ignored when cni_type != 'npn'.booltrueno
allow_worker_internet_accessAllow worker nodes to egress to internet. Required if container images are in a registry other than OCIR.booltrueno
allow_worker_ssh_accessWhether to allow SSH access to worker nodes.boolfalseno
assign_dnsWhether to assign DNS records to created instances or disable DNS resolution of hostnames in the VCN.booltrueno
assign_public_ip_to_control_planeWhether to assign a public IP address to the API endpoint for public access. Requires the control plane subnet to be public to assign a public IP address.boolfalseno
bastion_is_publicWhether to allocate a public IP and subnet for the created bastion host.booltrueno
bastion_upgradeWhether to upgrade bastion packages after provisioning.boolfalseno
cilium_installWhether to deploy the Cilium Helm chart. May only be enabled when cni_type = 'flannel'. See https://docs.cilium.io. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.boolfalseno
cilium_reapplyWhether to force reapply of the chart when no changes are detected, e.g. with state modified externally.boolfalseno
cluster_autoscaler_installWhether to deploy the Kubernetes Cluster Autoscaler Helm chart. See kubernetes/autoscaler. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.boolfalseno
control_plane_is_publicWhether the Kubernetes control plane endpoint should be allocated a public IP address to enable access over public internet.boolfalseno
create_bastionWhether to create a bastion host.booltrueno
create_clusterWhether to create the OKE cluster and dependent resources.booltrueno
create_drgWhether to create a Dynamic Routing Gateway and attach it to the VCN.boolfalseno
create_iam_defined_tagsWhether to create defined tags used for IAM policy and tracking. Ignored when 'create_iam_resources' is false.boolfalseno
create_iam_resourcesWhether to create IAM dynamic groups, policies, and tags. Resources for components may be controlled individually with 'create_iam_*' variables when enabled.boolfalseno
create_iam_tag_namespaceWhether to create a namespace for defined tags used for IAM policy and tracking. Ignored when 'create_iam_resources' is false.boolfalseno
create_operatorWhether to create an operator server in a private subnet.booltrueno
create_service_accountWhether to create a service account.boolfalseno
create_vcnWhether to create a Virtual Cloud Network.booltrueno
dcgm_exporter_installWhether to deploy the DCGM exporter Helm chart. See DCGM-Exporter. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.boolfalseno
dcgm_exporter_reapplyWhether to force reapply of the Helm chart when no changes are detected, e.g. with state modified externally.boolfalseno
enable_wafWhether to enable WAF monitoring of load balancers.boolfalseno
gatekeeper_installWhether to deploy the Gatekeeper Helm chart. See https://github.com/open-policy-agent/gatekeeper. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.boolfalseno
lockdown_default_seclistWhether to remove all default security rules from the VCN Default Security List.booltrueno
metrics_server_installWhether to deploy the Kubernetes Metrics Server Helm chart. See kubernetes-sigs/metrics-server. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.boolfalseno
mpi_operator_installWhether to deploy the MPI Operator. See kubeflow/mpi-operator. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.boolfalseno
multus_installWhether to deploy Multus. See k8snetworkplumbingwg/multus-cni. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.boolfalseno
operator_install_helmWhether to install Helm on the created operator host.booltrueno
operator_install_istioctlWhether to install istioctl on the created operator host.boolfalseno
operator_install_k9sWhether to install k9s on the created operator host. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.boolfalseno
operator_install_kubectl_from_repoWhether to install kubectl on the created operator host from olcne repo.booltrueno
operator_install_kubectxWhether to install kubectx/kubens on the created operator host. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.booltrueno
operator_pv_transit_encryptionWhether to enable in-transit encryption for the data volume's paravirtualized attachment.boolfalseno
operator_upgradeWhether to upgrade operator packages after provisioning.boolfalseno
output_detailWhether to include detailed output in state.boolfalseno
prometheus_installWhether to deploy the Prometheus Helm chart. See https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.boolfalseno
prometheus_reapplyWhether to force reapply of the Prometheus Helm chart when no changes are detected, e.g. with state modified externally.boolfalseno
rdma_cni_plugin_installWhether to deploy the RDMA CNI plugin. See https://github.com/openshift/sriov-cni. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.boolfalseno
sriov_cni_plugin_installWhether to deploy the SR-IOV CNI Plugin. See https://github.com/openshift/sriov-cni. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.boolfalseno
sriov_device_plugin_installWhether to deploy the SR-IOV Network Device Plugin. See k8snetworkplumbingwg/sriov-network-device-plugin. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.boolfalseno
use_defined_tagsWhether to apply defined tags to created resources for IAM policy and tracking.boolfalseno
use_signed_imagesWhether to enforce the use of signed images. If set to true, at least 1 RSA key must be provided through image_signing_keys.boolfalseno
whereabouts_installWhether to deploy the Whereabouts IPAM plugin. See k8snetworkplumbingwg/whereabouts. NOTE: Provided only as a convenience and not supported by or sourced from Oracle - use at your own risk.boolfalseno
worker_disable_default_cloud_initWhether to disable the default OKE cloud init and only use the cloud init explicitly passed to the worker pool in 'worker_cloud_init'.boolfalseno
worker_drain_delete_local_dataWhether to accept removal of data stored locally on draining worker pools. See kubectl drain for more information.booltrueno
worker_drain_ignore_daemonsetsWhether to ignore DaemonSet-managed Pods when draining worker pools. See kubectl drain for more information.booltrueno
worker_is_publicWhether to provision workers with public IPs allocated by default when unspecified on a pool.boolfalseno
worker_pv_transit_encryptionWhether to enable in-transit encryption for the data volume's paravirtualized attachment by default when unspecified on a pool.boolfalseno
internet_gateway_route_rules(Updatable) List of routing rules to add to Internet Gateway Route Table.list(map(string))nullno
nat_gateway_route_rules(Updatable) List of routing rules to add to NAT Gateway Route Table.list(map(string))nullno
operator_cloud_initList of maps containing cloud init MIME part configuration for operator host. See https://registry.terraform.io/providers/hashicorp/template/latest/docs/data-sources/cloudinit_config.html#part for expected schema of each element.list(map(string))[]no
worker_cloud_initList of maps containing cloud init MIME part configuration for worker nodes. Merged with pool-specific definitions. See https://registry.terraform.io/providers/hashicorp/template/latest/docs/data-sources/cloudinit_config.html#part for expected schema of each element.list(map(string))[]no
bastion_allowed_cidrsA list of CIDR blocks to allow SSH access to the bastion host. NOTE: Default is empty i.e. no access permitted. Allow access from anywhere with '0.0.0.0/0'.list(string)[]no
bastion_nsg_idsAn additional list of network security group (NSG) IDs for bastion security.list(string)[]no
cilium_helm_values_filesPaths to a local YAML files with Helm chart values (as with helm install -f which supports multiple). Generate with defaults using helm show values [CHART] [flags].list(string)[]no
cluster_autoscaler_helm_values_filesPaths to a local YAML files with Helm chart values (as with helm install -f which supports multiple). Generate with defaults using helm show values [CHART] [flags].list(string)[]no
control_plane_allowed_cidrsThe list of CIDR blocks from which the control plane can be accessed.list(string)[]no
dcgm_exporter_helm_values_filesPaths to a local YAML files with Helm chart values (as with helm install -f which supports multiple). Generate with defaults using helm show values [CHART] [flags].list(string)[]no
gatekeeper_helm_values_filesPaths to a local YAML files with Helm chart values (as with helm install -f which supports multiple). Generate with defaults using helm show values [CHART] [flags].list(string)[]no
metrics_server_helm_values_filesPaths to a local YAML files with Helm chart values (as with helm install -f which supports multiple). Generate with defaults using helm show values [CHART] [flags].list(string)[]no
operator_nsg_idsAn optional and updatable list of network security groups that the operator will be part of.list(string)[]no
pod_nsg_idsAn additional list of network security group (NSG) IDs for pod security. Combined with 'pod_nsg_ids' specified on each pool.list(string)[]no
prometheus_helm_values_filesPaths to a local YAML files with Helm chart values (as with helm install -f which supports multiple). Generate with defaults using helm show values [CHART] [flags].list(string)[]no
vcn_cidrsThe list of IPv4 CIDR blocks the VCN will use.list(string)[
"10.0.0.0/16"
]
no
worker_nsg_idsAn additional list of network security group (NSG) IDs for node security. Combined with 'nsg_ids' specified on each pool.list(string)[]no
bastion_shapeThe shape of bastion instance.map(any){
"boot_volume_size": 50,
"memory": 4,
"ocpus": 1,
"shape": "VM.Standard.E4.Flex"
}
no
local_peering_gatewaysMap of Local Peering Gateways to attach to the VCN.map(any)nullno
operator_shapeShape of the created operator instance.map(any){
"boot_volume_size": 50,
"memory": 4,
"ocpus": 1,
"shape": "VM.Standard.E4.Flex"
}
no
remote_peering_connectionsMap of parameters to add and optionally to peer to remote peering connections. Key-only items represent local acceptors and no peering attempted; items containing key and values represent local requestor and must have the OCID and region of the remote acceptor to peer tomap(any){}no
service_accountsMap of service accounts and associated parameters.map(any){
"kubeconfigsa": {
"sa_cluster_role": "cluster-admin",
"sa_cluster_role_binding": "kubeconfigsa-crb",
"sa_name": "kubeconfigsa",
"sa_namespace": "kube-system"
}
}
no
worker_preemptible_configDefault preemptible Compute configuration when unspecified on a pool. See Preemptible Worker Nodes for more information.map(any){
"enable": false,
"is_preserve_boot_volume": false
}
no
worker_shapeDefault shape of the created worker instance when unspecified on a pool.map(any){
"boot_volume_size": 50,
"boot_volume_vpus_per_gb": 10,
"memory": 16,
"ocpus": 2,
"shape": "VM.Standard.E4.Flex"
}
no
subnetsConfiguration for standard subnets. The 'create' parameter of each entry defaults to 'auto', creating subnets when other enabled components are expected to utilize them, and may be configured with 'never' or 'always' to force disabled/enabled.map(object({
create = optional(string)
id = optional(string)
newbits = optional(string)
netnum = optional(string)
cidr = optional(string)
dns_label = optional(string)
}))
{
"bastion": {
"newbits": 13
},
"cp": {
"newbits": 13
},
"int_lb": {
"newbits": 11
},
"operator": {
"newbits": 13
},
"pods": {
"newbits": 2
},
"pub_lb": {
"newbits": 11
},
"workers": {
"newbits": 4
}
}
no
nsgsConfiguration for standard network security groups (NSGs). The 'create' parameter of each entry defaults to 'auto', creating NSGs when other enabled components are expected to utilize them, and may be configured with 'never' or 'always' to force disabled/enabled.map(object({
create = optional(string)
id = optional(string)
}))
{
"bastion": {},
"cp": {},
"int_lb": {},
"operator": {},
"pods": {},
"pub_lb": {},
"workers": {}
}
no
bastion_defined_tagsDefined tags applied to created resources.map(string){}no
bastion_freeform_tagsFreeform tags applied to created resources.map(string){}no
cilium_helm_valuesMap of individual Helm chart values. See https://registry.terraform.io/providers/hashicorp/helm/latest/docs/data-sources/template.map(string){}no
cluster_autoscaler_helm_valuesMap of individual Helm chart values. See data.helm_template.map(string){}no
cluster_defined_tagsDefined tags applied to created resources.map(string){}no
cluster_freeform_tagsFreeform tags applied to created resources.map(string){}no
dcgm_exporter_helm_valuesMap of individual Helm chart values. See data.helm_template.map(string){}no
gatekeeper_helm_valuesMap of individual Helm chart values. See data.helm_template.map(string){}no
iam_defined_tagsDefined tags applied to created resources.map(string){}no
iam_freeform_tagsFreeform tags applied to created resources.map(string){}no
metrics_server_helm_valuesMap of individual Helm chart values. See data.helm_template.map(string){}no
network_defined_tagsDefined tags applied to created resources.map(string){}no
network_freeform_tagsFreeform tags applied to created resources.map(string){}no
operator_defined_tagsDefined tags applied to created resources.map(string){}no
operator_freeform_tagsFreeform tags applied to created resources.map(string){}no
persistent_volume_defined_tagsDefined tags applied to created resources.map(string){}no
persistent_volume_freeform_tagsFreeform tags applied to created resources.map(string){}no
prometheus_helm_valuesMap of individual Helm chart values. See data.helm_template.map(string){}no
service_lb_defined_tagsDefined tags applied to created resources.map(string){}no
service_lb_freeform_tagsFreeform tags applied to created resources.map(string){}no
worker_node_labelsDefault worker node labels. Merged with labels defined on each pool.map(string){}no
worker_node_metadataMap of additional worker node instance metadata. Merged with metadata defined on each pool.map(string){}no
workers_defined_tagsDefined tags applied to created resources.map(string){}no
workers_freeform_tagsFreeform tags applied to created resources.map(string){}no
max_pods_per_nodeThe default maximum number of pods to deploy per node when unspecified on a pool. Absolute maximum is 110. Ignored when cni_type != 'npn'.number31no
worker_drain_timeout_secondsThe length of time to wait before giving up on draining nodes in a pool. See kubectl drain for more information.number900no
worker_pool_sizeDefault size for worker pools when unspecified on a pool.number0no
agent_configDefault agent_config for self-managed worker pools created with mode: 'instance', 'instance-pool', or 'cluster-network'. See https://docs.oracle.com/en-us/iaas/api/#/en/iaas/20160918/datatypes/InstanceAgentConfig for more information.object({
are_all_plugins_disabled = bool,
is_management_disabled = bool,
is_monitoring_disabled = bool,
plugins_config = map(string),
})
nullno
platform_configDefault platform_config for self-managed worker pools created with mode: 'instance', 'instance-pool', or 'cluster-network'. See PlatformConfig for more information.object({
type = optional(string),
are_virtual_instructions_enabled = optional(bool),
is_access_control_service_enabled = optional(bool),
is_input_output_memory_management_unit_enabled = optional(bool),
is_measured_boot_enabled = optional(bool),
is_memory_encryption_enabled = optional(bool),
is_secure_boot_enabled = optional(bool),
is_symmetric_multi_threading_enabled = optional(bool),
is_trusted_platform_module_enabled = optional(bool),
numa_nodes_per_socket = optional(number),
percentage_of_cores_enabled = optional(bool),
})
nullno
control_plane_nsg_idsAn additional list of network security groups (NSG) ids for the cluster endpoint.set(string)[]no
image_signing_keysA list of KMS key ids used by the worker nodes to verify signed images. The keys must use RSA algorithm.set(string)[]no
api_fingerprintFingerprint of the API private key to use with OCI API.stringnullno
api_private_keyThe contents of the private key file to use with OCI API, optionally base64-encoded. This takes precedence over private_key_path if both are specified in the provider.stringnullno
api_private_key_passwordThe corresponding private key password to use with the api private key if it is encrypted.stringnullno
api_private_key_pathThe path to the OCI API private key.stringnullno
await_node_readinessOptionally block completion of Terraform apply until one/all worker nodes become ready.string"none"no
bastion_availability_domainThe availability domain for bastion placement. Defaults to first available.stringnullno
bastion_image_idImage ID for created bastion instance.stringnullno
bastion_image_osBastion image operating system name when bastion_image_type = 'platform'.string"Oracle Autonomous Linux"no
bastion_image_os_versionBastion image operating system version when bastion_image_type = 'platform'.string"8"no
bastion_image_typeWhether to use a platform or custom image for the created bastion instance. When custom is set, the bastion_image_id must be specified.string"platform"no
bastion_public_ipThe IP address of an existing bastion host, if create_bastion = false.stringnullno
bastion_userUser for SSH access through bastion host.string"opc"no
cilium_helm_versionVersion of the Helm chart to install. List available releases using helm search repo [keyword] --versions.string"1.14.4"no
cilium_namespaceKubernetes namespace for deployed resources.string"kube-system"no
cluster_autoscaler_helm_versionVersion of the Helm chart to install. List available releases using helm search repo [keyword] --versions.string"9.24.0"no
cluster_autoscaler_namespaceKubernetes namespace for deployed resources.string"kube-system"no
cluster_ca_certBase64+PEM-encoded cluster CA certificate for unmanaged instance pools. Determined automatically when 'create_cluster' = true or 'cluster_id' is provided.stringnullno
cluster_dnsCluster DNS resolver IP address. Determined automatically when not set (recommended).stringnullno
cluster_idAn existing OKE cluster OCID when create_cluster = false.stringnullno
cluster_kms_key_idThe id of the OCI KMS key to be used as the master encryption key for Kubernetes secrets encryption.string""no
cluster_nameThe name of oke cluster.string"oke"no
cluster_typeThe cluster type. See Working with Enhanced Clusters and Basic Clusters for more information.string"basic"no
cni_typeThe CNI for the cluster: 'flannel' or 'npn'. See Pod Networking.string"flannel"no
compartment_idThe compartment id where resources will be created.stringnullno
compartment_ocidA compartment OCID automatically populated by Resource Manager.stringnullno
config_file_profileThe profile within the OCI config file to use.string"DEFAULT"no
create_iam_autoscaler_policyWhether to create an IAM dynamic group and policy rules for Cluster Autoscaler management. Depends on configuration of associated component when set to 'auto'. Ignored when 'create_iam_resources' is false.string"auto"no
create_iam_kms_policyWhether to create an IAM dynamic group and policy rules for OCI KMS key access. Depends on configuration of associated components when set to 'auto'. Ignored when 'create_iam_resources' is false.string"auto"no
create_iam_operator_policyWhether to create an IAM dynamic group and policy rules for operator access to the OKE control plane. Depends on configuration of associated components when set to 'auto'. Ignored when 'create_iam_resources' is false.string"auto"no
create_iam_worker_policyWhether to create an IAM dynamic group and policy rules for self-managed worker nodes. Depends on configuration of associated components when set to 'auto'. Ignored when 'create_iam_resources' is false.string"auto"no
current_user_ocidA user OCID automatically populated by Resource Manager.stringnullno
dcgm_exporter_helm_versionVersion of the Helm chart to install. List available releases using helm search repo [keyword] --versions.string"3.1.5"no
dcgm_exporter_namespaceKubernetes namespace for deployed resources.string"metrics"no
drg_compartment_idCompartment for the DRG resource. Can be used to override network_compartment_id.stringnullno
drg_display_name(Updatable) Name of the created Dynamic Routing Gateway. Does not have to be unique. Defaults to 'oke' suffixed with the generated Terraform 'state_id' value.stringnullno
drg_idID of an external created Dynamic Routing Gateway to be attached to the VCN.stringnullno
gatekeeper_helm_versionVersion of the Helm chart to install. List available releases using helm search repo [keyword] --versions.string"3.11.0"no
gatekeeper_namespaceKubernetes namespace for deployed resources.string"kube-system"no
home_regionThe tenancy's home region. Required to perform identity operations.stringnullno
ig_route_table_idOptional ID of existing internet gateway in VCN.stringnullno
kubeproxy_modeThe mode in which to run kube-proxy when unspecified on a pool.string"iptables"no
kubernetes_versionThe version of kubernetes to use when provisioning OKE or to upgrade an existing OKE cluster to.string"v1.26.2"no
load_balancersThe type of subnets to create for load balancers.string"both"no
metrics_server_helm_versionVersion of the Helm chart to install. List available releases using helm search repo [keyword] --versions.string"3.8.3"no
metrics_server_namespaceKubernetes namespace for deployed resources.string"metrics"no
mpi_operator_deployment_urlThe URL path to the manifest. Leave unset for tags of kubeflow/mpi-operator using mpi_operator_version.stringnullno
mpi_operator_namespaceKubernetes namespace for deployed resources.string"default"no
mpi_operator_versionVersion to install. Ignored when an explicit value for mpi_operator_deployment_url is provided.string"0.4.0"no
multus_daemonset_urlThe URL path to the Multus manifest. Leave unset for tags of k8snetworkplumbingwg/multus-cni using multus_version.stringnullno
multus_namespaceKubernetes namespace for deployed resources.string"network"no
multus_versionVersion of Multus to install. Ignored when an explicit value for multus_daemonset_url is provided.string"3.9.3"no
nat_gateway_public_ip_idOCID of reserved IP address for NAT gateway. The reserved public IP address needs to be manually created.stringnullno
nat_route_table_idOptional ID of existing NAT gateway in VCN.stringnullno
network_compartment_idThe compartment id where network resources will be created.stringnullno
ocir_email_addressThe email address used for the Oracle Container Image Registry (OCIR).stringnullno
ocir_secret_idThe OCI Vault secret ID for the OCIR authentication token.stringnullno
ocir_secret_nameThe name of the Kubernetes secret to be created with the OCIR authentication token.string"ocirsecret"no
ocir_secret_namespaceThe Kubernetes namespace in which to create the OCIR secret.string"default"no
ocir_usernameA username with access to the OCI Vault secret for OCIR access. Required when 'ocir_secret_id' is provided.stringnullno
operator_availability_domainThe availability domain for operator placement. Defaults to first available.stringnullno
operator_image_idImage ID for created operator instance.stringnullno
operator_image_osOperator image operating system name when operator_image_type = 'platform'.string"Oracle Linux"no
operator_image_os_versionOperator image operating system version when operator_image_type = 'platform'.string"8"no
operator_image_typeWhether to use a platform or custom image for the created operator instance. When custom is set, the operator_image_id must be specified.string"platform"no
operator_private_ipThe IP address of an existing operator host. Ignored when create_operator = true.stringnullno
operator_userUser for SSH access to operator host.string"opc"no
operator_volume_kms_key_idThe OCID of the OCI KMS key to assign as the master encryption key for the boot volume.stringnullno
pods_cidrThe CIDR range used for IP addresses by the pods. A /16 CIDR is generally sufficient. This CIDR should not overlap with any subnet range in the VCN (it can also be outside the VCN CIDR range). Ignored when cni_type = 'npn'.string"10.244.0.0/16"no
preferred_load_balancerThe preferred load balancer subnets that OKE will automatically choose when creating a load balancer. Valid values are 'public' or 'internal'. If 'public' is chosen, the value for load_balancers must be either 'public' or 'both'. If 'internal' is chosen, the value for load_balancers must be either 'internal' or 'both'. NOTE: Service annotations for internal load balancers must still be specified regardless of this setting. See Load Balancer Annotations for more information.string"public"no
prometheus_helm_versionVersion of the Helm chart to install. List available releases using helm search repo [keyword] --versions.string"45.2.0"no
prometheus_namespaceKubernetes namespace for deployed resources.string"metrics"no
rdma_cni_plugin_daemonset_urlThe URL path to the manifest. Leave unset for tags of https://github.com/openshift/sriov-cni using rdma_cni_plugin_version.stringnullno
rdma_cni_plugin_namespaceKubernetes namespace for deployed resources.string"network"no
rdma_cni_plugin_versionVersion to install. Ignored when an explicit value for rdma_cni_plugin_daemonset_url is provided.string"master"no
regionThe OCI region where OKE resources will be created.string"us-ashburn-1"no
services_cidrThe CIDR range used within the cluster by Kubernetes services (ClusterIPs). This CIDR should not overlap with the VCN CIDR range.string"10.96.0.0/16"no
sriov_cni_plugin_daemonset_urlThe URL path to the manifest. Leave unset for tags of https://github.com/openshift/sriov-cni using sriov_cni_plugin_version.stringnullno
sriov_cni_plugin_namespaceKubernetes namespace for deployed resources.string"network"no
sriov_cni_plugin_versionVersion to install. Ignored when an explicit value for sriov_cni_plugin_daemonset_url is provided.string"master"no
sriov_device_plugin_daemonset_urlThe URL path to the manifest. Leave unset for tags of k8snetworkplumbingwg/sriov-network-device-plugin using sriov_device_plugin_version.stringnullno
sriov_device_plugin_namespaceKubernetes namespace for deployed resources.string"network"no
sriov_device_plugin_versionVersion to install. Ignored when an explicit value for sriov_device_plugin_daemonset_url is provided.string"master"no
ssh_private_keyThe contents of the SSH private key file, optionally base64-encoded. May be provided via SSH agent when unset.stringnullno
ssh_private_key_pathA path on the local filesystem to the SSH private key. May be provided via SSH agent when unset.stringnullno
ssh_public_keyThe contents of the SSH public key file, optionally base64-encoded. Used to allow login for workers/bastion/operator with corresponding private key.stringnullno
ssh_public_key_pathA path on the local filesystem to the SSH public key. Used to allow login for workers/bastion/operator with corresponding private key.stringnullno
state_idOptional Terraform state_id from an existing deployment of the module to re-use with created resources.stringnullno
tag_namespaceThe tag namespace for standard OKE defined tags.string"oke"no
tenancy_idThe tenancy id of the OCI Cloud Account in which to create the resources.stringnullno
tenancy_ocidA tenancy OCID automatically populated by Resource Manager.stringnullno
timezoneThe preferred timezone for workers, operator, and bastion instances.string"Etc/UTC"no
user_idThe id of the user that terraform will use to create the resources.stringnullno
vcn_create_internet_gatewayWhether to create an internet gateway with the VCN. Defaults to automatic creation when public network resources are expected to utilize it.string"auto"no
vcn_create_nat_gatewayWhether to create a NAT gateway with the VCN. Defaults to automatic creation when private network resources are expected to utilize it.string"auto"no
vcn_create_service_gatewayWhether to create a service gateway with the VCN. Defaults to always created.string"always"no
vcn_dns_labelA DNS label for the VCN, used in conjunction with the VNIC's hostname and subnet's DNS label to form a fully qualified domain name (FQDN) for each VNIC within this subnet. Defaults to the generated Terraform 'state_id' value.stringnullno
vcn_idOptional ID of existing VCN. Takes priority over vcn_name filter. Ignored when create_vcn = true.stringnullno
vcn_nameDisplay name for the created VCN. Defaults to 'oke' suffixed with the generated Terraform 'state_id' value.stringnullno
whereabouts_daemonset_urlThe URL path to the manifest. Leave unset for tags of k8snetworkplumbingwg/whereabouts using whereabouts_version.stringnullno
whereabouts_namespaceKubernetes namespace for deployed resources.string"default"no
whereabouts_versionVersion to install. Ignored when an explicit value for whereabouts_daemonset_url is provided.string"master"no
worker_block_volume_typeDefault block volume attachment type for Instance Configurations when unspecified on a pool.string"paravirtualized"no
worker_capacity_reservation_idThe ID of the Compute capacity reservation the worker node will be launched under. See Capacity Reservations for more information.stringnullno
worker_compartment_idThe compartment id where worker group resources will be created.stringnullno
worker_image_idDefault image for worker pools when unspecified on a pool.stringnullno
worker_image_osDefault worker image operating system name when worker_image_type = 'oke' or 'platform' and unspecified on a pool.string"Oracle Linux"no
worker_image_os_versionDefault worker image operating system version when worker_image_type = 'oke' or 'platform' and unspecified on a pool.string"8"no
worker_image_typeWhether to use a platform, OKE, or custom image for worker nodes by default when unspecified on a pool. When custom is set, the worker_image_id must be specified.string"oke"no
worker_pool_modeDefault management mode for workers when unspecified on a pool.string"node-pool"no
worker_volume_kms_key_idThe ID of the OCI KMS key to be used as the master encryption key for Boot Volume and Block Volume encryption by default when unspecified on a pool.stringnullno

Inputs

Sub-modules currently use a sparse definition of inputs required from the root:

Identity Access Management (IAM)

NameDescriptionTypeDefaultRequired
create_iam_autoscaler_policyn/abooln/ayes
create_iam_defined_tagsTagsbooln/ayes
create_iam_kms_policyn/abooln/ayes
create_iam_operator_policyn/abooln/ayes
create_iam_resourcesn/abooln/ayes
create_iam_tag_namespacen/abooln/ayes
create_iam_worker_policyn/abooln/ayes
use_defined_tagsn/abooln/ayes
autoscaler_compartmentsPolicylist(string)n/ayes
worker_compartmentsn/alist(string)n/ayes
defined_tagsn/amap(string)n/ayes
freeform_tagsn/amap(string)n/ayes
cluster_idCommonstringn/ayes
cluster_kms_key_idKMSstringn/ayes
compartment_idn/astringn/ayes
operator_volume_kms_key_idn/astringn/ayes
state_idn/astringn/ayes
tag_namespacen/astringn/ayes
tenancy_idn/astringn/ayes
worker_volume_kms_key_idn/astringn/ayes

Network

NameDescriptionTypeDefaultRequired
allow_rules_internal_lbn/aanyn/ayes
allow_rules_public_lbn/aanyn/ayes
allow_rules_workersn/aanyn/ayes
drg_attachmentsn/aanyn/ayes
allow_bastion_cluster_accessn/abooln/ayes
allow_node_port_accessNetworkbooln/ayes
allow_pod_internet_accessn/abooln/ayes
allow_worker_internet_accessn/abooln/ayes
allow_worker_ssh_accessn/abooln/ayes
assign_dnsn/abooln/ayes
bastion_is_publicn/abooln/ayes
control_plane_is_publicn/abooln/ayes
create_bastionn/abooln/ayes
create_clustern/abooln/ayes
create_operatorn/abooln/ayes
enable_wafn/abooln/ayes
use_defined_tagsn/abooln/ayes
worker_is_publicn/abooln/ayes
vcn_cidrsn/alist(string)n/ayes
subnetsn/amap(object({
create = optional(string)
id = optional(string)
newbits = optional(string)
netnum = optional(string)
cidr = optional(string)
dns_label = optional(string)
}))
n/ayes
nsgsn/amap(object({
create = optional(string)
id = optional(string)
}))
n/ayes
defined_tagsTagsmap(string)n/ayes
freeform_tagsn/amap(string)n/ayes
bastion_allowed_cidrsn/aset(string)n/ayes
control_plane_allowed_cidrsn/aset(string)n/ayes
cni_typen/astringn/ayes
compartment_idCommonstringn/ayes
ig_route_table_idn/astringn/ayes
load_balancersn/astringn/ayes
nat_route_table_idn/astringn/ayes
state_idn/astringn/ayes
tag_namespacen/astringn/ayes
vcn_idn/astringn/ayes

Bastion

NameDescriptionTypeDefaultRequired
assign_dnsBastionbooln/ayes
is_publicn/abooln/ayes
upgraden/abooln/ayes
use_defined_tagsn/abooln/ayes
nsg_idsn/alist(string)n/ayes
shapen/amap(any)n/ayes
defined_tagsTagsmap(string)n/ayes
freeform_tagsn/amap(string)n/ayes
availability_domainn/astringn/ayes
bastion_image_os_versionn/astringn/ayes
compartment_idCommonstringn/ayes
image_idn/astringn/ayes
ssh_private_keyn/astringn/ayes
ssh_public_keyn/astringn/ayes
state_idn/astringn/ayes
subnet_idn/astringn/ayes
tag_namespacen/astringn/ayes
timezonen/astringn/ayes
usern/astringn/ayes

Cluster

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| assign_public_ip_to_control_plane | n/a | bool | n/a | yes |
| control_plane_is_public | n/a | bool | n/a | yes |
| use_signed_images | n/a | bool | n/a | yes |
| cluster_defined_tags | Tagging | map(string) | n/a | yes |
| cluster_freeform_tags | n/a | map(string) | n/a | yes |
| persistent_volume_defined_tags | n/a | map(string) | n/a | yes |
| persistent_volume_freeform_tags | n/a | map(string) | n/a | yes |
| service_lb_defined_tags | n/a | map(string) | n/a | yes |
| service_lb_freeform_tags | n/a | map(string) | n/a | yes |
| control_plane_nsg_ids | n/a | set(string) | n/a | yes |
| image_signing_keys | n/a | set(string) | n/a | yes |
| cluster_kms_key_id | Cluster | string | n/a | yes |
| cluster_name | n/a | string | n/a | yes |
| cluster_type | n/a | string | n/a | yes |
| cni_type | n/a | string | n/a | yes |
| compartment_id | Common | string | n/a | yes |
| control_plane_subnet_id | n/a | string | n/a | yes |
| kubernetes_version | n/a | string | n/a | yes |
| pods_cidr | n/a | string | n/a | yes |
| service_lb_subnet_id | n/a | string | n/a | yes |
| services_cidr | n/a | string | n/a | yes |
| state_id | n/a | string | n/a | yes |
| tag_namespace | n/a | string | n/a | yes |
| use_defined_tags | n/a | string | n/a | yes |
| vcn_id | n/a | string | n/a | yes |

Workers

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| image_ids | Map of images for filtering with image_os and image_os_version. | any | {} | no |
| worker_pools | Tuple of OKE worker pools where each key maps to the OCID of an OCI resource, and value contains its definition. | any | {} | no |
| assign_dns | n/a | bool | n/a | yes |
| assign_public_ip | n/a | bool | n/a | yes |
| disable_default_cloud_init | Whether to disable the default OKE cloud init and only use the cloud init explicitly passed to the worker pool in 'worker_cloud_init'. | bool | false | no |
| pv_transit_encryption | Whether to enable in-transit encryption for the data volume's paravirtualized attachment by default when unspecified on a pool. | bool | false | no |
| use_defined_tags | Whether to apply defined tags to created resources for IAM policy and tracking. | bool | false | no |
| cloud_init | List of maps containing cloud init MIME part configuration for worker nodes. Merged with pool-specific definitions. See https://registry.terraform.io/providers/hashicorp/template/latest/docs/data-sources/cloudinit_config.html#part for expected schema of each element. | list(map(string)) | [] | no |
| ad_numbers | n/a | list(number) | n/a | yes |
| pod_nsg_ids | An additional list of network security group (NSG) IDs for pod security. Combined with 'pod_nsg_ids' specified on each pool. | list(string) | [] | no |
| worker_nsg_ids | An additional list of network security group (NSG) IDs for node security. Combined with 'nsg_ids' specified on each pool. | list(string) | [] | no |
| preemptible_config | Default preemptible Compute configuration when unspecified on a pool. See Preemptible Worker Nodes for more information. | map(any) | { "enable": false, "is_preserve_boot_volume": false } | no |
| shape | Default shape of the created worker instance when unspecified on a pool. | map(any) | { "boot_volume_size": 50, "memory": 16, "ocpus": 2, "shape": "VM.Standard.E4.Flex" } | no |
| ad_numbers_to_names | n/a | map(string) | n/a | yes |
| defined_tags | Defined tags to be applied to created resources. Must already exist in the tenancy. | map(string) | {} | no |
| freeform_tags | Freeform tags to be applied to created resources. | map(string) | {} | no |
| node_labels | Default worker node labels. Merged with labels defined on each pool. | map(string) | {} | no |
| node_metadata | Map of additional worker node instance metadata. Merged with metadata defined on each pool. | map(string) | {} | no |
| max_pods_per_node | The default maximum number of pods to deploy per node when unspecified on a pool. Absolute maximum is 110. Ignored when cni_type != 'npn'. | number | 31 | no |
| worker_pool_size | Default size for worker pools when unspecified on a pool. | number | 0 | no |
| agent_config | Default agent_config for self-managed worker pools created with mode: 'instance', 'instance-pool', or 'cluster-network'. See https://docs.oracle.com/en-us/iaas/api/#/en/iaas/20160918/datatypes/InstanceAgentConfig for more information. | object({ are_all_plugins_disabled = bool, is_management_disabled = bool, is_monitoring_disabled = bool, plugins_config = map(string) }) | n/a | yes |
| platform_config | Default platform_config for self-managed worker pools created with mode: 'instance', 'instance-pool', or 'cluster-network'. See PlatformConfig for more information. | object({ type = optional(string), are_virtual_instructions_enabled = optional(bool), is_access_control_service_enabled = optional(bool), is_input_output_memory_management_unit_enabled = optional(bool), is_measured_boot_enabled = optional(bool), is_memory_encryption_enabled = optional(bool), is_secure_boot_enabled = optional(bool), is_symmetric_multi_threading_enabled = optional(bool), is_trusted_platform_module_enabled = optional(bool), numa_nodes_per_socket = optional(number), percentage_of_cores_enabled = optional(bool) }) | null | no |
| apiserver_private_host | n/a | string | n/a | yes |
| block_volume_type | Default block volume attachment type for Instance Configurations when unspecified on a pool. | string | "paravirtualized" | no |
| capacity_reservation_id | The ID of the Compute capacity reservation the worker node will be launched under. See Capacity Reservations for more information. | string | null | no |
| cluster_ca_cert | Base64+PEM-encoded cluster CA certificate for unmanaged instance pools. Determined automatically when 'create_cluster' = true or 'cluster_id' is provided. | string | null | no |
| cluster_dns | Cluster DNS resolver IP address. Determined automatically when not set (recommended). | string | null | no |
| cluster_id | An existing OKE cluster OCID when create_cluster = false. | string | null | no |
| cluster_type | The cluster type. See Working with Enhanced Clusters and Basic Clusters for more information. | string | "basic" | no |
| cni_type | The CNI for the cluster: 'flannel' or 'npn'. See Pod Networking. | string | "flannel" | no |
| compartment_id | The compartment id where resources will be created. | string | null | no |
| image_id | Default image for worker pools when unspecified on a pool. | string | null | no |
| image_os | Default worker image operating system name when worker_image_type = 'oke' or 'platform' and unspecified on a pool. | string | "Oracle Linux" | no |
| image_os_version | Default worker image operating system version when worker_image_type = 'oke' or 'platform' and unspecified on a pool. | string | "8" | no |
| image_type | Whether to use a platform, OKE, or custom image for worker nodes by default when unspecified on a pool. When custom is set, the worker_image_id must be specified. | string | "oke" | no |
| kubeproxy_mode | The mode in which to run kube-proxy when unspecified on a pool. | string | "iptables" | no |
| kubernetes_version | The version of Kubernetes used for worker nodes. | string | "v1.26.2" | no |
| pod_subnet_id | n/a | string | n/a | yes |
| ssh_public_key | The contents of the SSH public key file. Used to allow login for workers/bastion/operator with corresponding private key. | string | null | no |
| state_id | Optional Terraform state_id from an existing deployment of the module to re-use with created resources. | string | null | no |
| tag_namespace | The tag namespace for standard OKE defined tags. | string | "oke" | no |
| tenancy_id | The tenancy id of the OCI Cloud Account in which to create the resources. | string | null | no |
| timezone | n/a | string | n/a | yes |
| volume_kms_key_id | The ID of the OCI KMS key to be used as the master encryption key for Boot Volume and Block Volume encryption by default when unspecified on a pool. | string | null | no |
| worker_pool_mode | Default management mode for workers when unspecified on a pool. Only 'node-pool' is currently supported. | string | "node-pool" | no |
| worker_subnet_id | n/a | string | n/a | yes |
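
As a minimal sketch of how these defaults interact with per-pool settings, a root configuration might look like the following. It assumes the root module exposes worker_pools and worker_pool_size as shown above, and that per-pool keys such as size, shape, ocpus, and memory override the corresponding defaults; values are illustrative:

# terraform.tfvars (sketch): pool-wide defaults with per-pool overrides
worker_pool_size  = 1      # default size when a pool does not set one
worker_image_type = "oke"  # platform | oke | custom

worker_pools = {
  small = {
    size = 2                         # overrides worker_pool_size
  }
  large = {
    size   = 3
    shape  = "VM.Standard.E4.Flex"   # per-pool keys assumed; check the worker pools documentation
    ocpus  = 4
    memory = 64
  }
}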

Operator

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| assign_dns | Operator | bool | n/a | yes |
| install_cilium | n/a | bool | n/a | yes |
| install_helm | n/a | bool | n/a | yes |
| install_istioctl | n/a | bool | n/a | yes |
| install_k9s | n/a | bool | n/a | yes |
| install_kubectl_from_repo | n/a | bool | true | no |
| install_kubectx | n/a | bool | n/a | yes |
| pv_transit_encryption | n/a | bool | n/a | yes |
| upgrade | n/a | bool | n/a | yes |
| use_defined_tags | n/a | bool | n/a | yes |
| cloud_init | n/a | list(map(string)) | n/a | yes |
| nsg_ids | n/a | list(string) | n/a | yes |
| shape | n/a | map(any) | n/a | yes |
| defined_tags | Tags | map(string) | n/a | yes |
| freeform_tags | n/a | map(string) | n/a | yes |
| availability_domain | n/a | string | n/a | yes |
| bastion_host | Bastion (to await cloud-init completion) | string | n/a | yes |
| bastion_user | n/a | string | n/a | yes |
| compartment_id | Common | string | n/a | yes |
| image_id | n/a | string | n/a | yes |
| kubeconfig | n/a | string | n/a | yes |
| kubernetes_version | n/a | string | n/a | yes |
| operator_image_os_version | n/a | string | n/a | yes |
| ssh_private_key | n/a | string | n/a | yes |
| ssh_public_key | n/a | string | n/a | yes |
| state_id | n/a | string | n/a | yes |
| subnet_id | n/a | string | n/a | yes |
| tag_namespace | n/a | string | n/a | yes |
| timezone | n/a | string | n/a | yes |
| user | n/a | string | n/a | yes |
| volume_kms_key_id | n/a | string | n/a | yes |

Outputs

Identity Access Management (IAM)

  • dynamic_group_ids   Cluster IAM dynamic group IDs
  • policy_statements   Cluster IAM policy statements

Network

  • bastion_nsg_id  
  • bastion_subnet_cidr  
  • bastion_subnet_id   Return configured/created subnet IDs and CIDRs when applicable
  • control_plane_nsg_id  
  • control_plane_subnet_cidr  
  • control_plane_subnet_id  
  • fss_nsg_id  
  • fss_subnet_cidr  
  • fss_subnet_id  
  • int_lb_nsg_id  
  • int_lb_subnet_cidr  
  • int_lb_subnet_id  
  • network_security_rules  
  • nsg_ids  
  • operator_nsg_id  
  • operator_subnet_cidr  
  • operator_subnet_id  
  • pod_nsg_id  
  • pod_subnet_cidr  
  • pod_subnet_id  
  • pub_lb_nsg_id  
  • pub_lb_subnet_cidr  
  • pub_lb_subnet_id  
  • worker_nsg_id  
  • worker_subnet_cidr  
  • worker_subnet_id  

Bastion

  • id  
  • public_ip  

Cluster

  • cluster_id  
  • endpoints  

Workers

  • worker_count_expected   # of nodes expected from created worker pools
  • worker_drain_expected   # of nodes expected to be draining in worker pools
  • worker_instances   Created worker pools (mode == 'instance')
  • worker_pool_autoscale_expected   # of worker pools expected with autoscale enabled from created worker pools
  • worker_pool_ids   Created worker pool IDs
  • worker_pool_ips   Created worker instance private IPs by pool for available modes ('node-pool', 'instance').
  • worker_pools   Created worker pools (mode != 'instance')

Operator

  • id  
  • private_ip  
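
When this module is consumed from a wrapping root configuration, selected values can be re-exported. A minimal sketch, assuming a module block named "oke" and that the root module surfaces the cluster OCID and operator private IP under the names shown here (check the outputs*.tf files for the authoritative names):

# Sketch: re-export selected module outputs from a wrapping configuration (names assumed)
output "cluster_id" {
  description = "OCID of the created OKE cluster"
  value       = module.oke.cluster_id
}

output "operator_private_ip" {
  description = "Private IP address of the operator host"
  value       = module.oke.operator_private_ip
}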

Resources

Identity Access Management (IAM)

Network

Bastion

Cluster

Workers

Operator

Version 5.x

Summary

  • Improved config flexibility, e.g.:
    • All resources in same tfstate
    • Identity resources only/enabled individually
    • Network resources only/enabled individually
    • Cluster with existing network VCN/subnets/NSGs
    • Cluster & isolated NSGs with existing network VCN/subnets
    • Workers with existing cluster/network
    • Workers with tag-based group/policy for Cluster Autoscaler, ...
    • Operator with existing cluster & group/policy for cluster access
  • Workers: resource type configuration (Self-Managed, Virtual); see the sketch after this list
    • mode="node-pool"
    • New mode="virtual-node-pool"
    • New mode="instance"
    • New mode="instance-pool"
    • New mode="cluster-network"
  • Workers: merge/override global & pool-specific for most inputs
  • Network: Referential NSG security rule definitions
  • Sub-module refactor
    • iam: Dynamic groups, policies, defined tags
    • network: VCN, subnets, NSGs, DRGs
    • bastion: Bastion host for external VCN access
    • cluster: OKE managed Kubernetes cluster
    • workers: Compute pools for cluster workloads with configurable resource types
    • operator: Operator instance with access to the OKE cluster endpoint
    • utilities: Additional automation for cluster operations performed by the module
    • extensions: Optional cluster software for evaluation
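
A minimal sketch of a worker_pools map mixing the modes listed above; pool names, sizes, and per-pool keys other than mode are illustrative:

# Sketch: one worker_pools map combining different worker resource types
worker_pools = {
  managed = {
    mode = "node-pool"          # OKE-managed node pool
    size = 3
  }
  self_managed = {
    mode  = "instance-pool"     # self-managed Compute instance pool
    size  = 2
    shape = "VM.Standard.E4.Flex"
  }
  virtual = {
    mode = "virtual-node-pool"  # OKE virtual nodes
    size = 1
  }
}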

Status

Pre-release / Beta

Core features of the module are working.

Some features under utilities need re-implementation/testing:

  • OCIR
  • Worker pool drain

Documentation in progress.

Breaking changes

  • Input variables
  • Pending

Migration

Pending

Version 4.x

Summary

  • ...?

Status

Released

This is the latest supported version of the module.

Migration

Pending

Version 3.x

Summary

Status

Maintenance

Migration

Pending

Version 2.x

Status

Maintenance

Version 1.x

Status

Unsupported

Coding conventions

This project adheres to the following conventions:

New conventions may be added to the list in future. All contributions should adhere to the list as published when the contribution is made.

Use PR comments and the GitHub suggestion feature to agree on the final result.

Module Structure

  • This project adheres to the [Terraform Standard Module Structure](https://developer.hashicorp.com/terraform/language/modules/develop/structure)
  • Any nested module calls are in the appropriate module-<name>.tf file at the root of the project.
  • All variable declarations must be in variables.tf or variables-<group>.tf
  • All output declarations must be in outputs.tf, outputs-<group>.tf, or colocated with their values.
  • All variables and outputs must have descriptions.
  • Nested modules must exist under the modules subdirectory.
  • Examples of how to use the module must be placed in the examples subdirectory, with documentation under docs.

Documentation format

This project uses Markdown with the *.md file extension.

HashiCorp Terraform Registry

  • README files must be in Markdown format
  • All links must use absolute paths; relative links are not supported

Terraform code

Case type, Files, Names

  • Use snake_case when naming Terraform files, variables and resources
  • If you need a new .tf file for better clarity, use the naming scheme <resource_group>.tf, e.g. subnets.tf, nsgs.tf
  • If your variable controls a behaviour, use an imperative style to name it, e.g. create_internet_gateway, use_cluster_encryption

Formatting

The following should be performed as needed before committing changes:

  • Run terraform fmt -recursive from the project root directory.
  • Run tflint --recursive from the project root directory (see the tflint documentation) and address new warnings.

Variable blocks

Variables should always be in the format below:

variable "xyz" {
  default     = "A default value"
  description = "Add '(Updatable)' at the beginning of the description if changing this value does not trigger a resource recreation"
  type        = string
}

Variables exposed by the root module:

  • must be included with their default values in the appropriate tfvars example file.
  • must define a default value that matches the declared type and does not alter the existing behavior of the module.
  • must define a description that explains how changes to the value will impact resources and interact with other variables, where applicable.
  • should be prefixed with the name of the component they pertain to unless shared across more than one, e.g. worker_, operator_, etc.
  • should use imperative verbs when controlling behavior, e.g. create, use, etc.
  • should include preconditions for input validation where possible, as in the sketch after this list.
  • should prefer null for empty/unset defaults over an empty string or other values.
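
For instance, a root-level variable following these conventions could look like the sketch below. The variable name, default, and allowed values are purely illustrative and are not taken from the module; the validation block shows one way to express the input precondition:

# Sketch: an illustrative root-level variable with a safe default and input validation
variable "example_lb_shape" {
  default     = "flexible"
  description = "(Updatable) Shape of the example load balancer. Used only when the example load balancer is created."
  type        = string

  validation {
    condition     = contains(["flexible", "100Mbps"], var.example_lb_shape)
    error_message = "Accepted values are 'flexible' or '100Mbps'."
  }
}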

Variables within submodules:

  • must define only a type matching that of the root module.
  • must omit defaults to ensure they are referenced from the root module.
  • must omit descriptions to avoid maintaining them in multiple places.
  • should match the name of their root module counterparts, with the possible exception of a component prefix when redundant and unambiguous, e.g. worker_, operator_, etc.

Do not hesitate to insert a brief comment in the variable block if it helps to clarify your intention.

WARNING: Do not set default values for compartment_id or any other variables related to provider authentication in module or example files. The user must explicitly set these values.

Examples

Examples should promote good practices as much as possible, e.g. avoid creating resources in the tenancy root compartment. Please review the OCI Security Guide.

Support

Report an issue
