1
0
Fork 0
mirror of https://github.com/poseidon/typhoon synced 2024-05-09 00:56:11 +02:00
typhoon/search/search_index.json
2022-07-27 16:54:22 -07:00

1 line
357 KiB
JSON

{"config":{"indexing":"full","lang":["en"],"min_search_length":3,"prebuild_index":false,"separator":"[\\s\\-]+"},"docs":[{"location":"","text":"Typhoon \u00b6 Typhoon is a minimal and free Kubernetes distribution. Minimal, stable base Kubernetes distribution Declarative infrastructure and configuration Free (freedom and cost) and privacy-respecting Practical for labs, datacenters, and clouds Typhoon distributes upstream Kubernetes, architectural conventions, and cluster addons, much like a GNU/Linux distribution provides the Linux kernel and userspace components. Features \u00b6 Kubernetes v1.24.3 (upstream) Single or multi-master, Calico or Cilium or flannel networking On-cluster etcd with TLS, RBAC -enabled, network policy , SELinux enforcing Advanced features like worker pools , preemptible workers, and snippets customization Ready for Ingress, Prometheus, Grafana, CSI, or other addons Modules \u00b6 Typhoon provides a Terraform Module for each supported operating system and platform. Typhoon is available for Fedora CoreOS . Platform Operating System Terraform Module Status AWS Fedora CoreOS aws/fedora-coreos/kubernetes stable Azure Fedora CoreOS azure/fedora-coreos/kubernetes alpha Bare-Metal Fedora CoreOS bare-metal/fedora-coreos/kubernetes stable DigitalOcean Fedora CoreOS digital-ocean/fedora-coreos/kubernetes beta Google Cloud Fedora CoreOS google-cloud/fedora-coreos/kubernetes stable Platform Operating System Terraform Module Status AWS Fedora CoreOS (ARM64) aws/fedora-coreos/kubernetes alpha Typhoon is available for Flatcar Linux . Platform Operating System Terraform Module Status AWS Flatcar Linux aws/flatcar-linux/kubernetes stable Azure Flatcar Linux azure/flatcar-linux/kubernetes alpha Bare-Metal Flatcar Linux bare-metal/flatcar-linux/kubernetes stable DigitalOcean Flatcar Linux digital-ocean/flatcar-linux/kubernetes beta Google Cloud Flatcar Linux google-cloud/flatcar-linux/kubernetes stable Platform Operating System Terraform Module Status AWS Flatcar Linux (ARM64) aws/flatcar-linux/kubernetes alpha Documentation \u00b6 Architecture concepts and operating-systems Fedora CoreOS tutorials for AWS , Azure , Bare-Metal , DigitalOcean , and Google Cloud Flatcar Linux tutorials for AWS , Azure , Bare-Metal , DigitalOcean , and Google Cloud Example \u00b6 Define a Kubernetes cluster by using the Terraform module for your chosen platform and operating system. Here's a minimal example. module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.24.3\" # Google Cloud cluster_name = \"yavin\" region = \"us-central1\" dns_zone = \"example.com\" dns_zone_name = \"example-zone\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional worker_count = 2 } # Obtain cluster kubeconfig resource \"local_file\" \"kubeconfig-yavin\" { content = module.yavin.kubeconfig-admin filename = \"/home/user/.kube/configs/yavin-config\" } Initialize modules, plan the changes to be made, and apply the changes. $ terraform init $ terraform plan Plan: 62 to add, 0 to change, 0 to destroy. $ terraform apply Apply complete! Resources: 62 added, 0 changed, 0 destroyed. In 4-8 minutes (varies by platform), the cluster will be ready. This Google Cloud example creates a yavin.example.com DNS record to resolve to a network load balancer across controller nodes. $ export KUBECONFIG=/home/user/.kube/configs/yavin-config $ kubectl get nodes NAME ROLES STATUS AGE VERSION yavin-controller-0.c.example-com.internal <none> Ready 6m v1.24.3 yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.24.3 yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-1cs8z 2/2 Running 0 6m kube-system calico-node-d1l5b 2/2 Running 0 6m kube-system calico-node-sp9ps 2/2 Running 0 6m kube-system coredns-1187388186-dkh3o 1/1 Running 0 6m kube-system coredns-1187388186-zj5dl 1/1 Running 0 6m kube-system kube-apiserver-controller-0 1/1 Running 0 6m kube-system kube-controller-manager-controller-0 1/1 Running 0 6m kube-system kube-proxy-117v6 1/1 Running 0 6m kube-system kube-proxy-9886n 1/1 Running 0 6m kube-system kube-proxy-njn47 1/1 Running 0 6m kube-system kube-scheduler-controller-0 1/1 Running 0 6m Help \u00b6 Schedule a meeting via Github Sponsors to discuss your use case. Motivation \u00b6 Typhoon powers the author's cloud and colocation clusters. The project has evolved through operational experience and Kubernetes changes. Typhoon is shared under a free license to allow others to use the work freely and contribute to its upkeep. Typhoon addresses real world needs, which you may share. It is honest about limitations or areas that aren't mature yet. It avoids buzzword bingo and hype. It does not aim to be the one-solution-fits-all distro. An ecosystem of Kubernetes distributions is healthy. Social Contract \u00b6 Typhoon is not a product, trial, or free-tier. Typhoon does not offer support, services, or charge money. And Typhoon is independent of operating system or platform vendors. Typhoon clusters will contain only free components. Cluster components will not collect data on users without their permission. Sponsors \u00b6 Poseidon's Github Sponsors support the infrastructure and operational costs of providing Typhoon. If you'd like your company here, please contact dghubble at psdn.io.","title":"Home"},{"location":"#typhoon","text":"Typhoon is a minimal and free Kubernetes distribution. Minimal, stable base Kubernetes distribution Declarative infrastructure and configuration Free (freedom and cost) and privacy-respecting Practical for labs, datacenters, and clouds Typhoon distributes upstream Kubernetes, architectural conventions, and cluster addons, much like a GNU/Linux distribution provides the Linux kernel and userspace components.","title":"Typhoon "},{"location":"#features","text":"Kubernetes v1.24.3 (upstream) Single or multi-master, Calico or Cilium or flannel networking On-cluster etcd with TLS, RBAC -enabled, network policy , SELinux enforcing Advanced features like worker pools , preemptible workers, and snippets customization Ready for Ingress, Prometheus, Grafana, CSI, or other addons","title":"Features "},{"location":"#modules","text":"Typhoon provides a Terraform Module for each supported operating system and platform. Typhoon is available for Fedora CoreOS . Platform Operating System Terraform Module Status AWS Fedora CoreOS aws/fedora-coreos/kubernetes stable Azure Fedora CoreOS azure/fedora-coreos/kubernetes alpha Bare-Metal Fedora CoreOS bare-metal/fedora-coreos/kubernetes stable DigitalOcean Fedora CoreOS digital-ocean/fedora-coreos/kubernetes beta Google Cloud Fedora CoreOS google-cloud/fedora-coreos/kubernetes stable Platform Operating System Terraform Module Status AWS Fedora CoreOS (ARM64) aws/fedora-coreos/kubernetes alpha Typhoon is available for Flatcar Linux . Platform Operating System Terraform Module Status AWS Flatcar Linux aws/flatcar-linux/kubernetes stable Azure Flatcar Linux azure/flatcar-linux/kubernetes alpha Bare-Metal Flatcar Linux bare-metal/flatcar-linux/kubernetes stable DigitalOcean Flatcar Linux digital-ocean/flatcar-linux/kubernetes beta Google Cloud Flatcar Linux google-cloud/flatcar-linux/kubernetes stable Platform Operating System Terraform Module Status AWS Flatcar Linux (ARM64) aws/flatcar-linux/kubernetes alpha","title":"Modules"},{"location":"#documentation","text":"Architecture concepts and operating-systems Fedora CoreOS tutorials for AWS , Azure , Bare-Metal , DigitalOcean , and Google Cloud Flatcar Linux tutorials for AWS , Azure , Bare-Metal , DigitalOcean , and Google Cloud","title":"Documentation"},{"location":"#example","text":"Define a Kubernetes cluster by using the Terraform module for your chosen platform and operating system. Here's a minimal example. module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.24.3\" # Google Cloud cluster_name = \"yavin\" region = \"us-central1\" dns_zone = \"example.com\" dns_zone_name = \"example-zone\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional worker_count = 2 } # Obtain cluster kubeconfig resource \"local_file\" \"kubeconfig-yavin\" { content = module.yavin.kubeconfig-admin filename = \"/home/user/.kube/configs/yavin-config\" } Initialize modules, plan the changes to be made, and apply the changes. $ terraform init $ terraform plan Plan: 62 to add, 0 to change, 0 to destroy. $ terraform apply Apply complete! Resources: 62 added, 0 changed, 0 destroyed. In 4-8 minutes (varies by platform), the cluster will be ready. This Google Cloud example creates a yavin.example.com DNS record to resolve to a network load balancer across controller nodes. $ export KUBECONFIG=/home/user/.kube/configs/yavin-config $ kubectl get nodes NAME ROLES STATUS AGE VERSION yavin-controller-0.c.example-com.internal <none> Ready 6m v1.24.3 yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.24.3 yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-1cs8z 2/2 Running 0 6m kube-system calico-node-d1l5b 2/2 Running 0 6m kube-system calico-node-sp9ps 2/2 Running 0 6m kube-system coredns-1187388186-dkh3o 1/1 Running 0 6m kube-system coredns-1187388186-zj5dl 1/1 Running 0 6m kube-system kube-apiserver-controller-0 1/1 Running 0 6m kube-system kube-controller-manager-controller-0 1/1 Running 0 6m kube-system kube-proxy-117v6 1/1 Running 0 6m kube-system kube-proxy-9886n 1/1 Running 0 6m kube-system kube-proxy-njn47 1/1 Running 0 6m kube-system kube-scheduler-controller-0 1/1 Running 0 6m","title":"Example"},{"location":"#help","text":"Schedule a meeting via Github Sponsors to discuss your use case.","title":"Help"},{"location":"#motivation","text":"Typhoon powers the author's cloud and colocation clusters. The project has evolved through operational experience and Kubernetes changes. Typhoon is shared under a free license to allow others to use the work freely and contribute to its upkeep. Typhoon addresses real world needs, which you may share. It is honest about limitations or areas that aren't mature yet. It avoids buzzword bingo and hype. It does not aim to be the one-solution-fits-all distro. An ecosystem of Kubernetes distributions is healthy.","title":"Motivation"},{"location":"#social-contract","text":"Typhoon is not a product, trial, or free-tier. Typhoon does not offer support, services, or charge money. And Typhoon is independent of operating system or platform vendors. Typhoon clusters will contain only free components. Cluster components will not collect data on users without their permission.","title":"Social Contract"},{"location":"#sponsors","text":"Poseidon's Github Sponsors support the infrastructure and operational costs of providing Typhoon. If you'd like your company here, please contact dghubble at psdn.io.","title":"Sponsors"},{"location":"announce/","text":"Announce \u00b6 Jan 23, 2020 \u00b6 Typhoon for Fedora CoreOS promoted to alpha! Last summer, Typhoon released the first preview of Kubernetes on Fedora CoreOS for bare-metal and AWS, developing many ideas and patterns from Typhoon for Container Linux and Fedora Atomic. Since then, Typhoon for Fedora CoreOS has evolved and gained features alongside Typhoon, while Fedora CoreOS itself has evolved and improved too. Fedora recently announced that Fedora CoreOS is available for general use. To align with that change and to better indicate the maturing status, Typhoon for Fedora CoreOS has been promoted to alpha. Many thanks to folks who have worked to make this possbile! About: For newcomers, Typhoon is a minimal and free (cost and freedom) Kubernetes distribution providing upstream Kubernetes, declarative configuration via Terraform, and support for AWS, Azure, Google Cloud, DigitalOcean, and bare-metal. It is run by former CoreOS engineer @dghubble to power his clusters, with freedom motivations . Jul 18, 2019 \u00b6 Introducing a preview of Typhoon Kubernetes clusters with Fedora CoreOS! Fedora recently announced the first preview release of Fedora CoreOS, aiming to blend the best of CoreOS and Fedora for containerized workloads. To spur testing, Typhoon is sharing preview modules for Kubernetes v1.15 on AWS and bare-metal using the new Fedora CoreOS preview. What better way to test drive than by running Kubernetes? While Typhoon uses Container Linux (or Flatcar Linux) for stable modules, the project hasn't been a stranger to Fedora ideas, once developing a Fedora Atomic variant in 2018. That makes the Fedora CoreOS fushion both exciting and familiar. Typhoon with Fedora CoreOS uses Ignition v3 for provisioning, uses rpm-ostree for layering and updates, tries swapping system containers for podman, and brings SELinux enforcement ( table ). This is an early preview (don't go to prod), but do try it out and help identify and solve issues (getting started links above). March 27, 2019 \u00b6 Last April, Typhoon introduced alpha support for creating Kubernetes clusters with Fedora Atomic on AWS, Google Cloud, DigitalOcean, and bare-metal. Fedora Atomic shared many of Container Linux's aims for a container-optimized operating system, introduced novel ideas, and provided technical diversification for an uncertain future. However, Project Atomic efforts were merged into Fedora CoreOS and future Fedora Atomic releases are not expected . Typhoon modules for Fedora Atomic will not be updated much beyond Kubernetes v1.13 . They may later be removed. Typhoon for Fedora Atomic fell short of goals to provide a consistent, practical experience across operating systems and platforms. The modules have remained alpha, despite improvements. Features like coordinated OS updates and boot-time declarative customization were not realized. Inelegance of Cloud-Init/kickstart loomed large. With that brief but obligatory summary, I'd like to change gears and celebrate the many positives. Fedora Atomic showcased rpm-ostree as a different approach to Container Linux's AB update scheme. It provided a viable route toward CRI-O to replace Docker as the container engine. And Fedora Atomic devised system containers as a way to package and run raw OCI images through runc for host-level containers 1 . Many of these ideas will live on in Fedora CoreOS, which is exciting! For Typhoon, Fedora Atomic brought fresh ideas and broader perspectives about different container-optimized base operating systems and related tools. Its sad to let go of so much work, but I think its time. Many of the concepts and technologies that were explored will surface again and Typhoon is better positioned as a result. Thank you Project Atomic team members for your work! - dghubble May 23, 2018 \u00b6 Starting in v1.10.3, Typhoon AWS and bare-metal container-linux modules allow picking between the Red Hat Container Linux (formerly CoreOS Container Linux) and Kinvolk Flatcar Linux operating system. Flatcar Linux serves as a drop-in compatible \"friendly fork\" of Container Linux. Flatcar Linux publishes the same channels and versions as Container Linux and gets provisioned, managed, and operated in an identical way (e.g. login as user \"core\"). On AWS, pick the Container Linux derivative channel by setting os_image to coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, or flatcar-alpha. On bare-metal, pick the Container Linux derivative channel by setting os_channel to coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, or flatcar-alpha. Set the os_version number to PXE boot and install. Variables container_linux_channel and container_linux_version have been dropped. Flatcar Linux provides a familar Container Linux experience, with support from Kinvolk as an alternative to Red Hat. Typhoon offers the choice of Container Linux vendor to satisfy differing preferences and to diversify technology underpinnings, while providing a consistent Kubernetes experience across operating systems, clouds, and on-premise. April 26, 2018 \u00b6 Introducing Typhoon Kubernetes clusters for Fedora Atomic! Fedora Atomic is a container-optimized operating system designed for large-scale clustered operation, immutable infrastructure, and atomic operating system upgrades. Its part of Fedora and Project Atomic , a Red Hat sponsored project working on rpm-ostree, buildah, skopeo, CRI-O, and the related CentOS/RHEL Atomic. For newcomers, Typhoon is a free (cost and freedom) Kubernetes distribution providing upstream Kubernetes, declarative configuration via Terraform , and support for AWS, Google Cloud, DigitalOcean, and bare-metal. Typhoon clusters use a self-hosted control plane, support Calico and flannel CNI networking, and enable etcd TLS, RBAC , and network policy. Typhoon for Fedora Atomic reflects many of the same principles that created Typhoon for Container Linux. Clusters are declared using plain Terraform configs that can be versioned. In lieu of Ignition, instances are declaratively provisioned with Cloud-Init and kickstart (bare-metal only). TLS assets are generated. Hosts run only a kubelet service, other components are scheduled (i.e. self-hosted). The upstream hyperkube is used directly 2 . And clusters are kept minimal by offering optional addons for Ingress , Prometheus , and Grafana . Typhoon compliments and enhances Fedora Atomic as a choice of operating system for Kubernetes. Meanwhile, Fedora Atomic adds some promising new low-level technologies: ostree & rpm-ostree - a hybrid, layered, image and package system that lets you perform atomic updates and rollbacks, layer on packages, \"rebase\" your system, or manage a remote tree repo. See Dusty Mabe's great intro . system containers - OCI container images that embed systemd and runc metadata for starting low-level host services before container runtimes are ready. Typhoon uses system containers under runc for etcd , kubelet , and bootkube on Fedora Atomic (instead of rkt-fly). CRI-O - CRI-O is a kubernetes-incubator implementation of the Kubernetes Container Runtime Interface. Typhoon uses Docker as the container runtime today, but its a goal to gradually introduce CRI-O as an alternative runtime as it matures. Typhoon has long aspired to add a dissimilar operating system to compliment Container Linux. Operating Typhoon clusters across colocations and multiple clouds was driven by our own real need and has provided healthy perspective and clear direction. Adding Fedora Atomic is exciting for the same reasons. Fedora Atomic diversifies Typhoon's technology underpinnings, uniting the Container Linux and Fedora Atomic ecosystems to provide a consistent Kubernetes experience across operating systems, clouds, and on-premise. Get started with the basics or read the OS comparison . If you're familiar with Terraform, follow the new tutorials for Fedora Atomic on AWS , Google Cloud , DigitalOcean , and bare-metal . Typhoon is not affiliated with Red Hat or Project Atomic. Warning Heed the warnings. Typhoon for Fedora Atomic is still alpha. Container Linux continues to be the recommended flavor for production clusters. Atomic is not meant to detract from efforts on Container Linux or its derivatives. Tip For bare-metal, you may continue to use your v0.7+ Matchbox service and terraform-provider-matchbox plugin to provision both Container Linux and Fedora Atomic clusters. No changes needed. Container Linux's own primordial rkt-fly shim dates back to the pre-OCI era. In some ways, rkt drove the OCI standards that made newer ideas, like system containers, appealing. \u21a9 Using etcd , kubelet , and bootkube as system containers required metadata files be added in system-containers \u21a9","title":"Announce"},{"location":"announce/#announce","text":"","title":"Announce "},{"location":"announce/#jan-23-2020","text":"Typhoon for Fedora CoreOS promoted to alpha! Last summer, Typhoon released the first preview of Kubernetes on Fedora CoreOS for bare-metal and AWS, developing many ideas and patterns from Typhoon for Container Linux and Fedora Atomic. Since then, Typhoon for Fedora CoreOS has evolved and gained features alongside Typhoon, while Fedora CoreOS itself has evolved and improved too. Fedora recently announced that Fedora CoreOS is available for general use. To align with that change and to better indicate the maturing status, Typhoon for Fedora CoreOS has been promoted to alpha. Many thanks to folks who have worked to make this possbile! About: For newcomers, Typhoon is a minimal and free (cost and freedom) Kubernetes distribution providing upstream Kubernetes, declarative configuration via Terraform, and support for AWS, Azure, Google Cloud, DigitalOcean, and bare-metal. It is run by former CoreOS engineer @dghubble to power his clusters, with freedom motivations .","title":"Jan 23, 2020"},{"location":"announce/#jul-18-2019","text":"Introducing a preview of Typhoon Kubernetes clusters with Fedora CoreOS! Fedora recently announced the first preview release of Fedora CoreOS, aiming to blend the best of CoreOS and Fedora for containerized workloads. To spur testing, Typhoon is sharing preview modules for Kubernetes v1.15 on AWS and bare-metal using the new Fedora CoreOS preview. What better way to test drive than by running Kubernetes? While Typhoon uses Container Linux (or Flatcar Linux) for stable modules, the project hasn't been a stranger to Fedora ideas, once developing a Fedora Atomic variant in 2018. That makes the Fedora CoreOS fushion both exciting and familiar. Typhoon with Fedora CoreOS uses Ignition v3 for provisioning, uses rpm-ostree for layering and updates, tries swapping system containers for podman, and brings SELinux enforcement ( table ). This is an early preview (don't go to prod), but do try it out and help identify and solve issues (getting started links above).","title":"Jul 18, 2019"},{"location":"announce/#march-27-2019","text":"Last April, Typhoon introduced alpha support for creating Kubernetes clusters with Fedora Atomic on AWS, Google Cloud, DigitalOcean, and bare-metal. Fedora Atomic shared many of Container Linux's aims for a container-optimized operating system, introduced novel ideas, and provided technical diversification for an uncertain future. However, Project Atomic efforts were merged into Fedora CoreOS and future Fedora Atomic releases are not expected . Typhoon modules for Fedora Atomic will not be updated much beyond Kubernetes v1.13 . They may later be removed. Typhoon for Fedora Atomic fell short of goals to provide a consistent, practical experience across operating systems and platforms. The modules have remained alpha, despite improvements. Features like coordinated OS updates and boot-time declarative customization were not realized. Inelegance of Cloud-Init/kickstart loomed large. With that brief but obligatory summary, I'd like to change gears and celebrate the many positives. Fedora Atomic showcased rpm-ostree as a different approach to Container Linux's AB update scheme. It provided a viable route toward CRI-O to replace Docker as the container engine. And Fedora Atomic devised system containers as a way to package and run raw OCI images through runc for host-level containers 1 . Many of these ideas will live on in Fedora CoreOS, which is exciting! For Typhoon, Fedora Atomic brought fresh ideas and broader perspectives about different container-optimized base operating systems and related tools. Its sad to let go of so much work, but I think its time. Many of the concepts and technologies that were explored will surface again and Typhoon is better positioned as a result. Thank you Project Atomic team members for your work! - dghubble","title":"March 27, 2019"},{"location":"announce/#may-23-2018","text":"Starting in v1.10.3, Typhoon AWS and bare-metal container-linux modules allow picking between the Red Hat Container Linux (formerly CoreOS Container Linux) and Kinvolk Flatcar Linux operating system. Flatcar Linux serves as a drop-in compatible \"friendly fork\" of Container Linux. Flatcar Linux publishes the same channels and versions as Container Linux and gets provisioned, managed, and operated in an identical way (e.g. login as user \"core\"). On AWS, pick the Container Linux derivative channel by setting os_image to coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, or flatcar-alpha. On bare-metal, pick the Container Linux derivative channel by setting os_channel to coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, or flatcar-alpha. Set the os_version number to PXE boot and install. Variables container_linux_channel and container_linux_version have been dropped. Flatcar Linux provides a familar Container Linux experience, with support from Kinvolk as an alternative to Red Hat. Typhoon offers the choice of Container Linux vendor to satisfy differing preferences and to diversify technology underpinnings, while providing a consistent Kubernetes experience across operating systems, clouds, and on-premise.","title":"May 23, 2018"},{"location":"announce/#april-26-2018","text":"Introducing Typhoon Kubernetes clusters for Fedora Atomic! Fedora Atomic is a container-optimized operating system designed for large-scale clustered operation, immutable infrastructure, and atomic operating system upgrades. Its part of Fedora and Project Atomic , a Red Hat sponsored project working on rpm-ostree, buildah, skopeo, CRI-O, and the related CentOS/RHEL Atomic. For newcomers, Typhoon is a free (cost and freedom) Kubernetes distribution providing upstream Kubernetes, declarative configuration via Terraform , and support for AWS, Google Cloud, DigitalOcean, and bare-metal. Typhoon clusters use a self-hosted control plane, support Calico and flannel CNI networking, and enable etcd TLS, RBAC , and network policy. Typhoon for Fedora Atomic reflects many of the same principles that created Typhoon for Container Linux. Clusters are declared using plain Terraform configs that can be versioned. In lieu of Ignition, instances are declaratively provisioned with Cloud-Init and kickstart (bare-metal only). TLS assets are generated. Hosts run only a kubelet service, other components are scheduled (i.e. self-hosted). The upstream hyperkube is used directly 2 . And clusters are kept minimal by offering optional addons for Ingress , Prometheus , and Grafana . Typhoon compliments and enhances Fedora Atomic as a choice of operating system for Kubernetes. Meanwhile, Fedora Atomic adds some promising new low-level technologies: ostree & rpm-ostree - a hybrid, layered, image and package system that lets you perform atomic updates and rollbacks, layer on packages, \"rebase\" your system, or manage a remote tree repo. See Dusty Mabe's great intro . system containers - OCI container images that embed systemd and runc metadata for starting low-level host services before container runtimes are ready. Typhoon uses system containers under runc for etcd , kubelet , and bootkube on Fedora Atomic (instead of rkt-fly). CRI-O - CRI-O is a kubernetes-incubator implementation of the Kubernetes Container Runtime Interface. Typhoon uses Docker as the container runtime today, but its a goal to gradually introduce CRI-O as an alternative runtime as it matures. Typhoon has long aspired to add a dissimilar operating system to compliment Container Linux. Operating Typhoon clusters across colocations and multiple clouds was driven by our own real need and has provided healthy perspective and clear direction. Adding Fedora Atomic is exciting for the same reasons. Fedora Atomic diversifies Typhoon's technology underpinnings, uniting the Container Linux and Fedora Atomic ecosystems to provide a consistent Kubernetes experience across operating systems, clouds, and on-premise. Get started with the basics or read the OS comparison . If you're familiar with Terraform, follow the new tutorials for Fedora Atomic on AWS , Google Cloud , DigitalOcean , and bare-metal . Typhoon is not affiliated with Red Hat or Project Atomic. Warning Heed the warnings. Typhoon for Fedora Atomic is still alpha. Container Linux continues to be the recommended flavor for production clusters. Atomic is not meant to detract from efforts on Container Linux or its derivatives. Tip For bare-metal, you may continue to use your v0.7+ Matchbox service and terraform-provider-matchbox plugin to provision both Container Linux and Fedora Atomic clusters. No changes needed. Container Linux's own primordial rkt-fly shim dates back to the pre-OCI era. In some ways, rkt drove the OCI standards that made newer ideas, like system containers, appealing. \u21a9 Using etcd , kubelet , and bootkube as system containers required metadata files be added in system-containers \u21a9","title":"April 26, 2018"},{"location":"addons/fleetlock/","text":"fleetlock \u00b6 fleetlock is a reboot coordinator for Fedora CoreOS nodes. It implements the FleetLock protocol for use as a Zincati lock strategy backend. Declare a Zincati fleet_lock strategy when provisioning Fedora CoreOS nodes via snippets . variant : fcos version : 1.1.0 storage : files : - path : /etc/zincati/config.d/55-update-strategy.toml contents : inline : | [updates] strategy = \"fleet_lock\" [updates.fleet_lock] base_url = \"http://10.3.0.15/\" module \"nemo\" { ... controller_snippets = [ file ( \"./snippets/zincati-strategy.yaml\" ), ] worker_snippets = [ file ( \"./snippets/zincati-strategy.yaml\" ), ] } Apply fleetlock based on the example manifests. git clone git@github.com:poseidon/fleetlock.git kubectl apply -f examples/k8s","title":"fleetlock"},{"location":"addons/fleetlock/#fleetlock","text":"fleetlock is a reboot coordinator for Fedora CoreOS nodes. It implements the FleetLock protocol for use as a Zincati lock strategy backend. Declare a Zincati fleet_lock strategy when provisioning Fedora CoreOS nodes via snippets . variant : fcos version : 1.1.0 storage : files : - path : /etc/zincati/config.d/55-update-strategy.toml contents : inline : | [updates] strategy = \"fleet_lock\" [updates.fleet_lock] base_url = \"http://10.3.0.15/\" module \"nemo\" { ... controller_snippets = [ file ( \"./snippets/zincati-strategy.yaml\" ), ] worker_snippets = [ file ( \"./snippets/zincati-strategy.yaml\" ), ] } Apply fleetlock based on the example manifests. git clone git@github.com:poseidon/fleetlock.git kubectl apply -f examples/k8s","title":"fleetlock"},{"location":"addons/grafana/","text":"Grafana \u00b6 Grafana can be used to build dashboards and visualizations that use Prometheus as the datasource. Create the grafana deployment and service. kubectl apply -f addons/grafana -R Use kubectl to authenticate to the apiserver and create a local port-forward to the Grafana pod. kubectl port-forward grafana-POD-ID 8080 -n monitoring Visit 127.0.0.1:8080 to view the bundled dashboards.","title":"Grafana"},{"location":"addons/grafana/#grafana","text":"Grafana can be used to build dashboards and visualizations that use Prometheus as the datasource. Create the grafana deployment and service. kubectl apply -f addons/grafana -R Use kubectl to authenticate to the apiserver and create a local port-forward to the Grafana pod. kubectl port-forward grafana-POD-ID 8080 -n monitoring Visit 127.0.0.1:8080 to view the bundled dashboards.","title":"Grafana"},{"location":"addons/ingress/","text":"Nginx Ingress Controller \u00b6 Nginx Ingress controller pods accept and demultiplex HTTP, HTTPS, TCP, or UDP traffic to backend services. Ingress controllers watch the Kubernetes API for Ingress resources and update their configuration accordingly. Ingress resources for HTTP(S) applications support virtual hosts (FQDNs), path rules, TLS termination, and SNI. AWS \u00b6 On AWS, a network load balancer (NLB) distributes TCP traffic across two target groups (port 80 and 443) of worker nodes running an Ingress controller deployment. Security groups rules allow traffic to ports 80 and 443. Health checks ensure only workers with a healthy Ingress controller receive traffic. Create the Ingress controller deployment, service, RBAC roles, RBAC bindings, and namespace. kubectl apply -R -f addons/nginx-ingress/aws For each application, add a DNS CNAME resolving to the NLB's DNS record. app1.example.com -> tempest-ingress.123456.us-west2.elb.amazonaws.com app2.example.com -> tempest-ingress.123456.us-west2.elb.amazonaws.com app3.example.com -> tempest-ingress.123456.us-west2.elb.amazonaws.com Find the NLB's DNS name through the console or use the Typhoon module's output ingress_dns_name . For example, you might use Terraform to manage a Google Cloud DNS record: resource \"google_dns_record_set\" \"some-application\" { # DNS zone name managed_zone = \"example-zone\" # DNS record name = \"app.example.com.\" type = \"CNAME\" ttl = 300 rrdatas = [ \"${module.tempest.ingress_dns_name}.\" ] } Azure \u00b6 On Azure, a load balancer distributes traffic across a backend address pool of worker nodes running an Ingress controller deployment. Security group rules allow traffic to ports 80 and 443. Health probes ensure only workers with a healthy Ingress controller receive traffic. Create the Ingress controller deployment, service, RBAC roles, RBAC bindings, and namespace. kubectl apply -R -f addons/nginx-ingress/azure For each application, add a DNS record resolving to the load balancer's IPv4 address. app1.example.com -> 11.22.33.44 app2.example.com -> 11.22.33.44 app3.example.com -> 11.22.33.44 Find the load balancer's IPv4 address with the Azure console or use the Typhoon module's output ingress_static_ipv4 . For example, you might use Terraform to manage a Google Cloud DNS record: resource \"google_dns_record_set\" \"some-application\" { # DNS zone name managed_zone = \"example-zone\" # DNS record name = \"app.example.com.\" type = \"A\" ttl = 300 rrdatas = [ module.ramius.ingress_static_ipv4 ] } Bare-Metal \u00b6 On bare-metal, routing traffic to Ingress controller pods can be done in number of ways. Equal-Cost Multi-Path \u00b6 Create the Ingress controller deployment, service, RBAC roles, and RBAC bindings. The service should use a fixed ClusterIP (e.g. 10.3.0.12) in the Kubernetes service IPv4 CIDR range. kubectl apply -R -f addons/nginx-ingress/bare-metal There is no need for pods to use host networking or for the ingress service to use NodePort or LoadBalancer. Nodes already proxy packets destined for the service's ClusterIP to node(s) with a pod endpoint. Configure the network router or load balancer with a static route for the Kubernetes service range and set the next hop to a node. Repeat for each node, as desired, and set the metric (i.e. cost) of each. Finally, DNAT traffic destined for the WAN on ports 80 or 443 to the service's fixed ClusterIP. For each application, add a DNS record resolving to the WAN(s). resource \"google_dns_record_set\" \"some-application\" { # Managed DNS Zone name managed_zone = \"zone-name\" # Name of the DNS record name = \"app.example.com.\" type = \"A\" ttl = 300 rrdatas = [ \"SOME-WAN-IP\" ] } Digital Ocean \u00b6 On DigitalOcean, DNS A and AAAA records (e.g. FQDN nemo-workers.example.com ) resolve to each worker 1 running an Ingress controller DaemonSet on host ports 80 and 443. Firewall rules allow IPv4 and IPv6 traffic to ports 80 and 443. Create the Ingress controller daemonset, service, RBAC roles, RBAC bindings, and namespace. kubectl apply -R -f addons/nginx-ingress/digital-ocean For each application, add a CNAME record resolving to the worker(s) DNS record. Use the Typhoon module's output workers_dns to find the worker DNS value. For example, you might use Terraform to manage a Google Cloud DNS record: resource \"google_dns_record_set\" \"some-application\" { # DNS zone name managed_zone = \"example-zone\" # DNS record name = \"app.example.com.\" type = \"CNAME\" ttl = 300 rrdatas = [ \"${module.nemo.workers_dns}.\" ] } Note Hosting IPv6 apps is possible, but requires editing the nginx-ingress addon to use hostNetwork: true . Google Cloud \u00b6 On Google Cloud, a TCP Proxy load balancer distributes IPv4 and IPv6 TCP traffic across a backend service of worker nodes running an Ingress controller deployment. Firewall rules allow traffic to ports 80 and 443. Health check rules ensure only workers with a healthy Ingress controller receive traffic. Create the Ingress controller deployment, service, RBAC roles, RBAC bindings, and namespace. kubectl apply -R -f addons/nginx-ingress/google-cloud For each application, add DNS A records resolving to the load balancer's IPv4 address and DNS AAAA records resolving to the load balancer's IPv6 address. app1.example.com -> 11.22.33.44 app2.example.com -> 11.22.33.44 app3.example.com -> 11.22.33.44 Find the IPv4 address with gcloud compute addresses list or use the Typhoon module's outputs ingress_static_ipv4 and ingress_static_ipv6 . For example, you might use Terraform to manage a Google Cloud DNS record: resource \"google_dns_record_set\" \"app-record-a\" { # DNS zone name managed_zone = \"example-zone\" # DNS record name = \"app.example.com.\" type = \"A\" ttl = 300 rrdatas = [ module.yavin.ingress_static_ipv4 ] } resource \"google_dns_record_set\" \"app-record-aaaa\" { # DNS zone name managed_zone = \"example-zone\" # DNS record name = \"app.example.com.\" type = \"AAAA\" ttl = 300 rrdatas = [ module.yavin.ingress_static_ipv6 ] } DigitalOcean does offer load balancers. We've opted not to use them to keep the DigitalOcean cluster cheap for developers. \u21a9","title":"Nginx Ingress"},{"location":"addons/ingress/#nginx-ingress-controller","text":"Nginx Ingress controller pods accept and demultiplex HTTP, HTTPS, TCP, or UDP traffic to backend services. Ingress controllers watch the Kubernetes API for Ingress resources and update their configuration accordingly. Ingress resources for HTTP(S) applications support virtual hosts (FQDNs), path rules, TLS termination, and SNI.","title":"Nginx Ingress Controller"},{"location":"addons/ingress/#aws","text":"On AWS, a network load balancer (NLB) distributes TCP traffic across two target groups (port 80 and 443) of worker nodes running an Ingress controller deployment. Security groups rules allow traffic to ports 80 and 443. Health checks ensure only workers with a healthy Ingress controller receive traffic. Create the Ingress controller deployment, service, RBAC roles, RBAC bindings, and namespace. kubectl apply -R -f addons/nginx-ingress/aws For each application, add a DNS CNAME resolving to the NLB's DNS record. app1.example.com -> tempest-ingress.123456.us-west2.elb.amazonaws.com app2.example.com -> tempest-ingress.123456.us-west2.elb.amazonaws.com app3.example.com -> tempest-ingress.123456.us-west2.elb.amazonaws.com Find the NLB's DNS name through the console or use the Typhoon module's output ingress_dns_name . For example, you might use Terraform to manage a Google Cloud DNS record: resource \"google_dns_record_set\" \"some-application\" { # DNS zone name managed_zone = \"example-zone\" # DNS record name = \"app.example.com.\" type = \"CNAME\" ttl = 300 rrdatas = [ \"${module.tempest.ingress_dns_name}.\" ] }","title":"AWS"},{"location":"addons/ingress/#azure","text":"On Azure, a load balancer distributes traffic across a backend address pool of worker nodes running an Ingress controller deployment. Security group rules allow traffic to ports 80 and 443. Health probes ensure only workers with a healthy Ingress controller receive traffic. Create the Ingress controller deployment, service, RBAC roles, RBAC bindings, and namespace. kubectl apply -R -f addons/nginx-ingress/azure For each application, add a DNS record resolving to the load balancer's IPv4 address. app1.example.com -> 11.22.33.44 app2.example.com -> 11.22.33.44 app3.example.com -> 11.22.33.44 Find the load balancer's IPv4 address with the Azure console or use the Typhoon module's output ingress_static_ipv4 . For example, you might use Terraform to manage a Google Cloud DNS record: resource \"google_dns_record_set\" \"some-application\" { # DNS zone name managed_zone = \"example-zone\" # DNS record name = \"app.example.com.\" type = \"A\" ttl = 300 rrdatas = [ module.ramius.ingress_static_ipv4 ] }","title":"Azure"},{"location":"addons/ingress/#bare-metal","text":"On bare-metal, routing traffic to Ingress controller pods can be done in number of ways.","title":"Bare-Metal"},{"location":"addons/ingress/#equal-cost-multi-path","text":"Create the Ingress controller deployment, service, RBAC roles, and RBAC bindings. The service should use a fixed ClusterIP (e.g. 10.3.0.12) in the Kubernetes service IPv4 CIDR range. kubectl apply -R -f addons/nginx-ingress/bare-metal There is no need for pods to use host networking or for the ingress service to use NodePort or LoadBalancer. Nodes already proxy packets destined for the service's ClusterIP to node(s) with a pod endpoint. Configure the network router or load balancer with a static route for the Kubernetes service range and set the next hop to a node. Repeat for each node, as desired, and set the metric (i.e. cost) of each. Finally, DNAT traffic destined for the WAN on ports 80 or 443 to the service's fixed ClusterIP. For each application, add a DNS record resolving to the WAN(s). resource \"google_dns_record_set\" \"some-application\" { # Managed DNS Zone name managed_zone = \"zone-name\" # Name of the DNS record name = \"app.example.com.\" type = \"A\" ttl = 300 rrdatas = [ \"SOME-WAN-IP\" ] }","title":"Equal-Cost Multi-Path"},{"location":"addons/ingress/#digital-ocean","text":"On DigitalOcean, DNS A and AAAA records (e.g. FQDN nemo-workers.example.com ) resolve to each worker 1 running an Ingress controller DaemonSet on host ports 80 and 443. Firewall rules allow IPv4 and IPv6 traffic to ports 80 and 443. Create the Ingress controller daemonset, service, RBAC roles, RBAC bindings, and namespace. kubectl apply -R -f addons/nginx-ingress/digital-ocean For each application, add a CNAME record resolving to the worker(s) DNS record. Use the Typhoon module's output workers_dns to find the worker DNS value. For example, you might use Terraform to manage a Google Cloud DNS record: resource \"google_dns_record_set\" \"some-application\" { # DNS zone name managed_zone = \"example-zone\" # DNS record name = \"app.example.com.\" type = \"CNAME\" ttl = 300 rrdatas = [ \"${module.nemo.workers_dns}.\" ] } Note Hosting IPv6 apps is possible, but requires editing the nginx-ingress addon to use hostNetwork: true .","title":"Digital Ocean"},{"location":"addons/ingress/#google-cloud","text":"On Google Cloud, a TCP Proxy load balancer distributes IPv4 and IPv6 TCP traffic across a backend service of worker nodes running an Ingress controller deployment. Firewall rules allow traffic to ports 80 and 443. Health check rules ensure only workers with a healthy Ingress controller receive traffic. Create the Ingress controller deployment, service, RBAC roles, RBAC bindings, and namespace. kubectl apply -R -f addons/nginx-ingress/google-cloud For each application, add DNS A records resolving to the load balancer's IPv4 address and DNS AAAA records resolving to the load balancer's IPv6 address. app1.example.com -> 11.22.33.44 app2.example.com -> 11.22.33.44 app3.example.com -> 11.22.33.44 Find the IPv4 address with gcloud compute addresses list or use the Typhoon module's outputs ingress_static_ipv4 and ingress_static_ipv6 . For example, you might use Terraform to manage a Google Cloud DNS record: resource \"google_dns_record_set\" \"app-record-a\" { # DNS zone name managed_zone = \"example-zone\" # DNS record name = \"app.example.com.\" type = \"A\" ttl = 300 rrdatas = [ module.yavin.ingress_static_ipv4 ] } resource \"google_dns_record_set\" \"app-record-aaaa\" { # DNS zone name managed_zone = \"example-zone\" # DNS record name = \"app.example.com.\" type = \"AAAA\" ttl = 300 rrdatas = [ module.yavin.ingress_static_ipv6 ] } DigitalOcean does offer load balancers. We've opted not to use them to keep the DigitalOcean cluster cheap for developers. \u21a9","title":"Google Cloud"},{"location":"addons/overview/","text":"Addons \u00b6 Typhoon clusters are verified to work well with several post-install addons. Nginx Ingress Controller Prometheus Grafana fleetlock","title":"Overview"},{"location":"addons/overview/#addons","text":"Typhoon clusters are verified to work well with several post-install addons. Nginx Ingress Controller Prometheus Grafana fleetlock","title":"Addons"},{"location":"addons/prometheus/","text":"Prometheus \u00b6 Prometheus collects metrics (e.g. node_memory_usage_bytes ) from targets by scraping their HTTP metrics endpoints. Targets are organized into jobs , defined in the Prometheus config. Targets may expose counter, gauge, histogram, or summary metrics. Here's a simple config from the Prometheus tutorial . global: scrape_interval: 15s scrape_configs: - job_name: 'prometheus' scrape_interval: 5s static_configs: - targets: ['localhost:9090'] On Kubernetes clusters, Prometheus is run as a Deployment, configured with a ConfigMap, and accessed via a Service or Ingress. kubectl apply -f addons/prometheus -R The ConfigMap configures Prometheus to discover apiservers, kubelets, cAdvisor, services, endpoints, and exporters. By default, data is kept in an emptyDir so it is persisted until the pod is rescheduled. Exporters \u00b6 Exporters expose metrics for 3 rd -party systems that don't natively expose Prometheus metrics. node_exporter - DaemonSet that exposes a machine's hardware and OS metrics kube-state-metrics - Deployment that exposes Kubernetes object metrics blackbox_exporter - Scrapes HTTP, HTTPS, DNS, TCP, or ICMP endpoints and exposes availability as metrics Queries and Alerts \u00b6 Prometheus provides a basic UI for querying metrics and viewing alerts. Use kubectl to authenticate to the apiserver and create a local port-forward to the Prometheus pod. kubectl get pods -n monitoring kubectl port-forward prometheus-POD-ID 9090 -n monitoring Visit 127.0.0.1:9090 to query expressions , view targets , or check alerts . Use Grafana to view or build dashboards that use Prometheus as the datasource.","title":"Prometheus"},{"location":"addons/prometheus/#prometheus","text":"Prometheus collects metrics (e.g. node_memory_usage_bytes ) from targets by scraping their HTTP metrics endpoints. Targets are organized into jobs , defined in the Prometheus config. Targets may expose counter, gauge, histogram, or summary metrics. Here's a simple config from the Prometheus tutorial . global: scrape_interval: 15s scrape_configs: - job_name: 'prometheus' scrape_interval: 5s static_configs: - targets: ['localhost:9090'] On Kubernetes clusters, Prometheus is run as a Deployment, configured with a ConfigMap, and accessed via a Service or Ingress. kubectl apply -f addons/prometheus -R The ConfigMap configures Prometheus to discover apiservers, kubelets, cAdvisor, services, endpoints, and exporters. By default, data is kept in an emptyDir so it is persisted until the pod is rescheduled.","title":"Prometheus"},{"location":"addons/prometheus/#exporters","text":"Exporters expose metrics for 3 rd -party systems that don't natively expose Prometheus metrics. node_exporter - DaemonSet that exposes a machine's hardware and OS metrics kube-state-metrics - Deployment that exposes Kubernetes object metrics blackbox_exporter - Scrapes HTTP, HTTPS, DNS, TCP, or ICMP endpoints and exposes availability as metrics","title":"Exporters"},{"location":"addons/prometheus/#queries-and-alerts","text":"Prometheus provides a basic UI for querying metrics and viewing alerts. Use kubectl to authenticate to the apiserver and create a local port-forward to the Prometheus pod. kubectl get pods -n monitoring kubectl port-forward prometheus-POD-ID 9090 -n monitoring Visit 127.0.0.1:9090 to query expressions , view targets , or check alerts . Use Grafana to view or build dashboards that use Prometheus as the datasource.","title":"Queries and Alerts"},{"location":"advanced/arm64/","text":"ARM64 \u00b6 Typhoon has experimental support for ARM64 on AWS, with Fedora CoreOS or Flatcar Linux. Clusters can be created with ARM64 controller and worker nodes. Or worker pools of ARM64 nodes can be attached to an AMD64 cluster to create a hybrid/mixed architecture cluster. Note Currently, CNI networking must be set to flannel or cilium . Cluster \u00b6 Create a cluster with ARM64 controller and worker nodes. Container workloads must be arm64 compatible and use arm64 container images. Fedora CoreOS Cluster (arm64) Flatcar Linux Cluster (arm64) module \"gravitas\" { source = \"git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.24.3\" # AWS cluster_name = \"gravitas\" dns_zone = \"aws.example.com\" dns_zone_id = \"Z3PAABBCFAKEC0\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional arch = \"arm64\" networking = \"cilium\" worker_count = 2 worker_price = \"0.0168\" controller_type = \"t4g.small\" worker_type = \"t4g.small\" } module \"gravitas\" { source = \"git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes?ref=v1.24.3\" # AWS cluster_name = \"gravitas\" dns_zone = \"aws.example.com\" dns_zone_id = \"Z3PAABBCFAKEC0\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional arch = \"arm64\" networking = \"cilium\" worker_count = 2 worker_price = \"0.0168\" controller_type = \"t4g.small\" worker_type = \"t4g.small\" } Verify the cluster has only arm64 ( aarch64 ) nodes. For Flatcar Linux, describe nodes. $ kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME ip-10-0-21-119 Ready <none> 77s v1.24.3 10.0.21.119 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.aarch64 containerd://1.5.8 ip-10-0-32-166 Ready <none> 80s v1.24.3 10.0.32.166 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.aarch64 containerd://1.5.8 ip-10-0-5-79 Ready <none> 77s v1.24.3 10.0.5.79 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.aarch64 containerd://1.5.8 Hybrid \u00b6 Create a hybrid/mixed arch cluster by defining an AWS cluster. Then define a worker pool with ARM64 workers. Optional taints are added to aid in scheduling. FCOS Cluster Flatcar Cluster FCOS ARM64 Workers Flatcar ARM64 Workers module \"gravitas\" { source = \"git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.24.3\" # AWS cluster_name = \"gravitas\" dns_zone = \"aws.example.com\" dns_zone_id = \"Z3PAABBCFAKEC0\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional networking = \"cilium\" worker_count = 2 worker_price = \"0.021\" daemonset_tolerations = [ \"arch\" ] # important } module \"gravitas\" { source = \"git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes?ref=v1.24.3\" # AWS cluster_name = \"gravitas\" dns_zone = \"aws.example.com\" dns_zone_id = \"Z3PAABBCFAKEC0\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional networking = \"cilium\" worker_count = 2 worker_price = \"0.021\" daemonset_tolerations = [ \"arch\" ] # important } module \"gravitas-arm64\" { source = \"git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes/workers?ref=v1.24.3\" # AWS vpc_id = module.gravitas.vpc_id subnet_ids = module.gravitas.subnet_ids security_groups = module.gravitas.worker_security_groups # configuration name = \"gravitas-arm64\" kubeconfig = module.gravitas.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional arch = \"arm64\" instance_type = \"t4g.small\" spot_price = \"0.0168\" node_taints = [ \"arch=arm64:NoSchedule\" ] } module \"gravitas-arm64\" { source = \"git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes/workers?ref=v1.24.3\" # AWS vpc_id = module.gravitas.vpc_id subnet_ids = module.gravitas.subnet_ids security_groups = module.gravitas.worker_security_groups # configuration name = \"gravitas-arm64\" kubeconfig = module.gravitas.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional arch = \"arm64\" instance_type = \"t4g.small\" spot_price = \"0.0168\" node_taints = [ \"arch=arm64:NoSchedule\" ] } Verify amd64 (x86_64) and arm64 (aarch64) nodes are present. $ kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME ip-10-0-1-73 Ready <none> 111m v1.24.3 10.0.1.73 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.x86_64 containerd://1.5.8 ip-10-0-22-79... Ready <none> 111m v1.24.3 10.0.22.79 <none> Flatcar Container Linux by Kinvolk 3033.2.0 (Oklo) 5.10.84-flatcar containerd://1.5.8 ip-10-0-24-130 Ready <none> 111m v1.24.3 10.0.24.130 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.x86_64 containerd://1.5.8 ip-10-0-39-19 Ready <none> 111m v1.24.3 10.0.39.19 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.x86_64 containerd://1.5.8","title":"ARM64"},{"location":"advanced/arm64/#arm64","text":"Typhoon has experimental support for ARM64 on AWS, with Fedora CoreOS or Flatcar Linux. Clusters can be created with ARM64 controller and worker nodes. Or worker pools of ARM64 nodes can be attached to an AMD64 cluster to create a hybrid/mixed architecture cluster. Note Currently, CNI networking must be set to flannel or cilium .","title":"ARM64"},{"location":"advanced/arm64/#cluster","text":"Create a cluster with ARM64 controller and worker nodes. Container workloads must be arm64 compatible and use arm64 container images. Fedora CoreOS Cluster (arm64) Flatcar Linux Cluster (arm64) module \"gravitas\" { source = \"git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.24.3\" # AWS cluster_name = \"gravitas\" dns_zone = \"aws.example.com\" dns_zone_id = \"Z3PAABBCFAKEC0\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional arch = \"arm64\" networking = \"cilium\" worker_count = 2 worker_price = \"0.0168\" controller_type = \"t4g.small\" worker_type = \"t4g.small\" } module \"gravitas\" { source = \"git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes?ref=v1.24.3\" # AWS cluster_name = \"gravitas\" dns_zone = \"aws.example.com\" dns_zone_id = \"Z3PAABBCFAKEC0\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional arch = \"arm64\" networking = \"cilium\" worker_count = 2 worker_price = \"0.0168\" controller_type = \"t4g.small\" worker_type = \"t4g.small\" } Verify the cluster has only arm64 ( aarch64 ) nodes. For Flatcar Linux, describe nodes. $ kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME ip-10-0-21-119 Ready <none> 77s v1.24.3 10.0.21.119 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.aarch64 containerd://1.5.8 ip-10-0-32-166 Ready <none> 80s v1.24.3 10.0.32.166 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.aarch64 containerd://1.5.8 ip-10-0-5-79 Ready <none> 77s v1.24.3 10.0.5.79 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.aarch64 containerd://1.5.8","title":"Cluster"},{"location":"advanced/arm64/#hybrid","text":"Create a hybrid/mixed arch cluster by defining an AWS cluster. Then define a worker pool with ARM64 workers. Optional taints are added to aid in scheduling. FCOS Cluster Flatcar Cluster FCOS ARM64 Workers Flatcar ARM64 Workers module \"gravitas\" { source = \"git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.24.3\" # AWS cluster_name = \"gravitas\" dns_zone = \"aws.example.com\" dns_zone_id = \"Z3PAABBCFAKEC0\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional networking = \"cilium\" worker_count = 2 worker_price = \"0.021\" daemonset_tolerations = [ \"arch\" ] # important } module \"gravitas\" { source = \"git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes?ref=v1.24.3\" # AWS cluster_name = \"gravitas\" dns_zone = \"aws.example.com\" dns_zone_id = \"Z3PAABBCFAKEC0\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional networking = \"cilium\" worker_count = 2 worker_price = \"0.021\" daemonset_tolerations = [ \"arch\" ] # important } module \"gravitas-arm64\" { source = \"git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes/workers?ref=v1.24.3\" # AWS vpc_id = module.gravitas.vpc_id subnet_ids = module.gravitas.subnet_ids security_groups = module.gravitas.worker_security_groups # configuration name = \"gravitas-arm64\" kubeconfig = module.gravitas.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional arch = \"arm64\" instance_type = \"t4g.small\" spot_price = \"0.0168\" node_taints = [ \"arch=arm64:NoSchedule\" ] } module \"gravitas-arm64\" { source = \"git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes/workers?ref=v1.24.3\" # AWS vpc_id = module.gravitas.vpc_id subnet_ids = module.gravitas.subnet_ids security_groups = module.gravitas.worker_security_groups # configuration name = \"gravitas-arm64\" kubeconfig = module.gravitas.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional arch = \"arm64\" instance_type = \"t4g.small\" spot_price = \"0.0168\" node_taints = [ \"arch=arm64:NoSchedule\" ] } Verify amd64 (x86_64) and arm64 (aarch64) nodes are present. $ kubectl get nodes -o wide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME ip-10-0-1-73 Ready <none> 111m v1.24.3 10.0.1.73 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.x86_64 containerd://1.5.8 ip-10-0-22-79... Ready <none> 111m v1.24.3 10.0.22.79 <none> Flatcar Container Linux by Kinvolk 3033.2.0 (Oklo) 5.10.84-flatcar containerd://1.5.8 ip-10-0-24-130 Ready <none> 111m v1.24.3 10.0.24.130 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.x86_64 containerd://1.5.8 ip-10-0-39-19 Ready <none> 111m v1.24.3 10.0.39.19 <none> Fedora CoreOS 35.20211215.3.0 5.15.7-200.fc35.x86_64 containerd://1.5.8","title":"Hybrid"},{"location":"advanced/customization/","text":"Customization \u00b6 Typhoon provides Kubernetes clusters with defaults recommended for production. Terraform variables expose supported customization options. Advanced options are available for customizing the architecture or hosts as well. Variables \u00b6 Typhoon modules accept Terraform input variables for customizing clusters in meritorious ways (e.g. worker_count , etc). Variables are carefully considered to provide essentials, while limiting complexity and test matrix burden. See each platform's tutorial for options. Addons \u00b6 Clusters are kept to a minimal Kubernetes control plane by offering components like Nginx Ingress Controller, Prometheus, and Grafana as optional post-install addons . Customize addons by modifying a copy of our addon manifests. Hosts \u00b6 Typhoon uses the Ignition system of Fedora CoreOS and Flatcar Linux to immutably declare a system via first-boot disk provisioning. Fedora CoreOS uses a Butane Config and Flatcar Linux uses a Container Linux Config (CLC). These define disk partitions, filesystems, systemd units, dropins, config files, mount units, raid arrays, and users. Controller and worker instances form a minimal and secure Kubernetes cluster on each platform. Typhoon provides the snippets feature to accept Butane or Container Linux Configs to validate and additively merge into instance declarations. This allows advanced host customization and experimentation. Note Snippets cannot be used to modify an already existing instance, the antithesis of immutable provisioning. Ignition fully declares a system on first boot only. Danger Snippets provide the powerful host customization abilities of Ignition. You are responsible for additional units, configs, files, and conflicts. Danger Edits to snippets for controller instances can (correctly) cause Terraform to observe a diff (if not otherwise suppressed) and propose destroying and recreating controller(s). Recognize that this is destructive since controllers run etcd and are stateful. See blue/green clusters. Fedora CoreOS \u00b6 Note Fedora CoreOS snippets require terraform-provider-ct v0.5+ Define a Butane Config ( docs , config ) in version control near your Terraform workspace directory (e.g. perhaps in a snippets subdirectory). You may organize snippets into multiple files, if desired. For example, ensure an /opt/hello file is created with permissions 0644. # custom-files variant : fcos version : 1.4.0 storage : files : - path : /opt/hello contents : inline : | Hello World mode : 0644 Reference the FCC contents by location (e.g. file(\"./custom-units.yaml\") ). On AWS or Google Cloud extend the controller_snippets or worker_snippets list variables. module \"nemo\" { ... controller_count = 1 worker_count = 2 controller_snippets = [ file ( \"./custom-files\" ), file ( \"./custom-units\" ), ] worker_snippets = [ file ( \"./custom-files\" ), file ( \"./custom-units\")\" , ] ... } On Bare-Metal , different FCCs may be used for each node (since hardware may be heterogeneous). Extend the snippets map variable by mapping a controller or worker name key to a list of snippets. module \"mercury\" { ... snippets = { \"node2\" = [ file ( \"./units/hello.yaml\" )] \"node3\" = [ file ( \"./units/world.yaml\" ), file ( \"./units/hello.yaml\" ), ] } ... } Flatcar Linux \u00b6 Define a Container Linux Config (CLC) ( config , examples ) in version control near your Terraform workspace directory (e.g. perhaps in a snippets subdirectory). You may organize snippets into multiple files, if desired. For example, ensure an /opt/hello file is created with permissions 0644. # custom-files storage : files : - path : /opt/hello filesystem : root contents : inline : | Hello World mode : 0644 Or ensure a systemd unit hello.service is created and a dropin 50-etcd-cluster.conf is added for etcd-member.service . # custom-units systemd : units : - name : hello.service enable : true contents : | [Unit] Description=Hello World [Service] Type=oneshot ExecStart=/usr/bin/echo Hello World! [Install] WantedBy=multi-user.target - name : etcd-member.service enable : true dropins : - name : 50-etcd-cluster.conf contents : | Environment=\"ETCD_LOG_PACKAGE_LEVELS=etcdserver=WARNING,security=DEBUG\" Reference the CLC contents by location (e.g. file(\"./custom-units.yaml\") ). On AWS , Azure , DigitalOcean , or Google Cloud extend the controller_snippets or worker_snippets list variables. module \"nemo\" { ... controller_count = 1 worker_count = 2 controller_snippets = [ file ( \"./custom-files\" ), file ( \"./custom-units\" ), ] worker_snippets = [ file ( \"./custom-files\" ), file ( \"./custom-units\")\" , ] ... } On Bare-Metal , different CLCs may be used for each node (since hardware may be heterogeneous). Extend the snippets map variable by mapping a controller or worker name key to a list of snippets. module \"mercury\" { ... snippets = { \"node2\" = [ file ( \"./units/hello.yaml\" )] \"node3\" = [ file ( \"./units/world.yaml\" ), file ( \"./units/hello.yaml\" ), ] } ... } Architecture \u00b6 Typhoon chooses variables to expose with purpose. If you must customize clusters in ways that aren't supported by input variables, fork Typhoon and maintain a repository with customizations. Reference the repository by changing the username. module \"nemo\" { source = \"git::https://github.com/USERNAME/typhoon//digital-ocean/flatcar-linux/kubernetes?ref=myspecialcase\" ... } To customize low-level Kubernetes control plane bootstrapping, see the poseidon/terraform-render-bootstrap Terraform module. System Images \u00b6 Typhoon publishes Kubelet container images to Quay.io (default) and to Dockerhub (in case of a Quay outage or breach). Quay automated builds also provide the option for fully verifiable tagged images ( build-{short_sha} ). To set an alternative etcd image or Kubelet image, use a snippet to set a systemd dropin. Kubelet etcd # kubelet-image-override.yaml variant : fcos <- remove for Flatcar Linux version : 1.4.0 <- remove for Flatcar Linux systemd : units : - name : kubelet.service dropins : - name : 10-image-override.conf contents : | [Service] Environment=KUBELET_IMAGE=docker.io/psdn/kubelet:v1.18.3 # etcd-image-override.yaml variant : fcos <- remove for Flatcar Linux version : 1.4.0 <- remove for Flatcar Linux systemd : units : - name : etcd-member.service dropins : - name : 10-image-override.conf contents : | [Service] Environment=ETCD_IMAGE=quay.io/mymirror/etcd:v3.4.12 Then reference the snippet in the cluster or worker pool definition. module \"nemo\" { ... worker_snippets = [ file ( \"./snippets/kubelet-image-override.yaml\" ) ] ... }","title":"Customization"},{"location":"advanced/customization/#customization","text":"Typhoon provides Kubernetes clusters with defaults recommended for production. Terraform variables expose supported customization options. Advanced options are available for customizing the architecture or hosts as well.","title":"Customization"},{"location":"advanced/customization/#variables","text":"Typhoon modules accept Terraform input variables for customizing clusters in meritorious ways (e.g. worker_count , etc). Variables are carefully considered to provide essentials, while limiting complexity and test matrix burden. See each platform's tutorial for options.","title":"Variables"},{"location":"advanced/customization/#addons","text":"Clusters are kept to a minimal Kubernetes control plane by offering components like Nginx Ingress Controller, Prometheus, and Grafana as optional post-install addons . Customize addons by modifying a copy of our addon manifests.","title":"Addons"},{"location":"advanced/customization/#hosts","text":"Typhoon uses the Ignition system of Fedora CoreOS and Flatcar Linux to immutably declare a system via first-boot disk provisioning. Fedora CoreOS uses a Butane Config and Flatcar Linux uses a Container Linux Config (CLC). These define disk partitions, filesystems, systemd units, dropins, config files, mount units, raid arrays, and users. Controller and worker instances form a minimal and secure Kubernetes cluster on each platform. Typhoon provides the snippets feature to accept Butane or Container Linux Configs to validate and additively merge into instance declarations. This allows advanced host customization and experimentation. Note Snippets cannot be used to modify an already existing instance, the antithesis of immutable provisioning. Ignition fully declares a system on first boot only. Danger Snippets provide the powerful host customization abilities of Ignition. You are responsible for additional units, configs, files, and conflicts. Danger Edits to snippets for controller instances can (correctly) cause Terraform to observe a diff (if not otherwise suppressed) and propose destroying and recreating controller(s). Recognize that this is destructive since controllers run etcd and are stateful. See blue/green clusters.","title":"Hosts"},{"location":"advanced/customization/#fedora-coreos","text":"Note Fedora CoreOS snippets require terraform-provider-ct v0.5+ Define a Butane Config ( docs , config ) in version control near your Terraform workspace directory (e.g. perhaps in a snippets subdirectory). You may organize snippets into multiple files, if desired. For example, ensure an /opt/hello file is created with permissions 0644. # custom-files variant : fcos version : 1.4.0 storage : files : - path : /opt/hello contents : inline : | Hello World mode : 0644 Reference the FCC contents by location (e.g. file(\"./custom-units.yaml\") ). On AWS or Google Cloud extend the controller_snippets or worker_snippets list variables. module \"nemo\" { ... controller_count = 1 worker_count = 2 controller_snippets = [ file ( \"./custom-files\" ), file ( \"./custom-units\" ), ] worker_snippets = [ file ( \"./custom-files\" ), file ( \"./custom-units\")\" , ] ... } On Bare-Metal , different FCCs may be used for each node (since hardware may be heterogeneous). Extend the snippets map variable by mapping a controller or worker name key to a list of snippets. module \"mercury\" { ... snippets = { \"node2\" = [ file ( \"./units/hello.yaml\" )] \"node3\" = [ file ( \"./units/world.yaml\" ), file ( \"./units/hello.yaml\" ), ] } ... }","title":"Fedora CoreOS"},{"location":"advanced/customization/#flatcar-linux","text":"Define a Container Linux Config (CLC) ( config , examples ) in version control near your Terraform workspace directory (e.g. perhaps in a snippets subdirectory). You may organize snippets into multiple files, if desired. For example, ensure an /opt/hello file is created with permissions 0644. # custom-files storage : files : - path : /opt/hello filesystem : root contents : inline : | Hello World mode : 0644 Or ensure a systemd unit hello.service is created and a dropin 50-etcd-cluster.conf is added for etcd-member.service . # custom-units systemd : units : - name : hello.service enable : true contents : | [Unit] Description=Hello World [Service] Type=oneshot ExecStart=/usr/bin/echo Hello World! [Install] WantedBy=multi-user.target - name : etcd-member.service enable : true dropins : - name : 50-etcd-cluster.conf contents : | Environment=\"ETCD_LOG_PACKAGE_LEVELS=etcdserver=WARNING,security=DEBUG\" Reference the CLC contents by location (e.g. file(\"./custom-units.yaml\") ). On AWS , Azure , DigitalOcean , or Google Cloud extend the controller_snippets or worker_snippets list variables. module \"nemo\" { ... controller_count = 1 worker_count = 2 controller_snippets = [ file ( \"./custom-files\" ), file ( \"./custom-units\" ), ] worker_snippets = [ file ( \"./custom-files\" ), file ( \"./custom-units\")\" , ] ... } On Bare-Metal , different CLCs may be used for each node (since hardware may be heterogeneous). Extend the snippets map variable by mapping a controller or worker name key to a list of snippets. module \"mercury\" { ... snippets = { \"node2\" = [ file ( \"./units/hello.yaml\" )] \"node3\" = [ file ( \"./units/world.yaml\" ), file ( \"./units/hello.yaml\" ), ] } ... }","title":"Flatcar Linux"},{"location":"advanced/customization/#architecture","text":"Typhoon chooses variables to expose with purpose. If you must customize clusters in ways that aren't supported by input variables, fork Typhoon and maintain a repository with customizations. Reference the repository by changing the username. module \"nemo\" { source = \"git::https://github.com/USERNAME/typhoon//digital-ocean/flatcar-linux/kubernetes?ref=myspecialcase\" ... } To customize low-level Kubernetes control plane bootstrapping, see the poseidon/terraform-render-bootstrap Terraform module.","title":"Architecture"},{"location":"advanced/customization/#system-images","text":"Typhoon publishes Kubelet container images to Quay.io (default) and to Dockerhub (in case of a Quay outage or breach). Quay automated builds also provide the option for fully verifiable tagged images ( build-{short_sha} ). To set an alternative etcd image or Kubelet image, use a snippet to set a systemd dropin. Kubelet etcd # kubelet-image-override.yaml variant : fcos <- remove for Flatcar Linux version : 1.4.0 <- remove for Flatcar Linux systemd : units : - name : kubelet.service dropins : - name : 10-image-override.conf contents : | [Service] Environment=KUBELET_IMAGE=docker.io/psdn/kubelet:v1.18.3 # etcd-image-override.yaml variant : fcos <- remove for Flatcar Linux version : 1.4.0 <- remove for Flatcar Linux systemd : units : - name : etcd-member.service dropins : - name : 10-image-override.conf contents : | [Service] Environment=ETCD_IMAGE=quay.io/mymirror/etcd:v3.4.12 Then reference the snippet in the cluster or worker pool definition. module \"nemo\" { ... worker_snippets = [ file ( \"./snippets/kubelet-image-override.yaml\" ) ] ... }","title":"System Images"},{"location":"advanced/nodes/","text":"Nodes \u00b6 Typhoon clusters consist of controller node(s) and a (default) set of worker nodes. Overview \u00b6 Typhoon nodes use the standard set of Kubernetes node labels. Labels : kubernetes.io/arch=amd64 kubernetes.io/hostname=node-name kubernetes.io/os=linux Controller node(s) are labeled to allow node selection (for rare components that run on controllers) and tainted to prevent ordinary workloads running on controllers. Labels : node.kubernetes.io/controller=true Taints : node-role.kubernetes.io/controller:NoSchedule Worker nodes are labeled to allow node selection and untainted. Workloads will schedule on worker nodes by default, baring any contraindications. Labels : node.kubernetes.io/node= Taints : <none> On auto-scaling cloud platforms, you may add worker pools with different groups of nodes with their own labels and taints. On platforms like bare-metal, with heterogeneous machines, you may manage node labels and taints per node. Node Labels \u00b6 Add custom initial worker node labels to default workers or worker pool nodes to allow workloads to select among nodes that differ. Cluster Worker Pool module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.24.3\" # Google Cloud cluster_name = \"yavin\" region = \"us-central1\" dns_zone = \"example.com\" dns_zone_name = \"example-zone\" # configuration ssh_authorized_key = local.ssh_key # optional worker_count = 2 worker_node_labels = [ \"pool=default\" ] } module \"yavin-pool\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.24.3\" # Google Cloud cluster_name = \"yavin\" region = \"europe-west2\" network = module.yavin.network_name # configuration name = \"yavin-16x\" kubeconfig = module.yavin.kubeconfig ssh_authorized_key = local.ssh_key # optional worker_count = 1 machine_type = \"n1-standard-16\" node_labels = [ \"pool=big\" ] } In the example above, the two default workers would be labeled pool: default and the additional worker would be labeled pool: big . Node Taints \u00b6 Add custom initial taints on worker pool nodes to indicate a node is unique and should only schedule workloads that explicitly tolerate a given taint key. Warning Since taints prevent workloads scheduling onto a node, you must decide whether kube-system DaemonSets (e.g. flannel, Calico, Cilium) should tolerate your custom taint by setting daemonset_tolerations . If you don't list your custom taint(s), important components won't run on these nodes. Cluster Worker Pool module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.24.3\" # Google Cloud cluster_name = \"yavin\" region = \"us-central1\" dns_zone = \"example.com\" dns_zone_name = \"example-zone\" # configuration ssh_authorized_key = local.ssh_key # optional worker_count = 2 daemonset_tolerations = [ \"role\" ] } module \"yavin-pool\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.24.3\" # Google Cloud cluster_name = \"yavin\" region = \"europe-west2\" network = module.yavin.network_name # configuration name = \"yavin-16x\" kubeconfig = module.yavin.kubeconfig ssh_authorized_key = local.ssh_key # optional worker_count = 1 accelerator_type = \"nvidia-tesla-p100\" accelerator_count = 1 node_taints = [ \"role=gpu:NoSchedule\" ] } In the example above, the the additional worker would be tainted with role=gpu:NoSchedule to prevent workloads scheduling, but kube-system components like flannel, Calico, or Cilium would tolerate that custom taint to run there.","title":"Nodes"},{"location":"advanced/nodes/#nodes","text":"Typhoon clusters consist of controller node(s) and a (default) set of worker nodes.","title":"Nodes"},{"location":"advanced/nodes/#overview","text":"Typhoon nodes use the standard set of Kubernetes node labels. Labels : kubernetes.io/arch=amd64 kubernetes.io/hostname=node-name kubernetes.io/os=linux Controller node(s) are labeled to allow node selection (for rare components that run on controllers) and tainted to prevent ordinary workloads running on controllers. Labels : node.kubernetes.io/controller=true Taints : node-role.kubernetes.io/controller:NoSchedule Worker nodes are labeled to allow node selection and untainted. Workloads will schedule on worker nodes by default, baring any contraindications. Labels : node.kubernetes.io/node= Taints : <none> On auto-scaling cloud platforms, you may add worker pools with different groups of nodes with their own labels and taints. On platforms like bare-metal, with heterogeneous machines, you may manage node labels and taints per node.","title":"Overview"},{"location":"advanced/nodes/#node-labels","text":"Add custom initial worker node labels to default workers or worker pool nodes to allow workloads to select among nodes that differ. Cluster Worker Pool module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.24.3\" # Google Cloud cluster_name = \"yavin\" region = \"us-central1\" dns_zone = \"example.com\" dns_zone_name = \"example-zone\" # configuration ssh_authorized_key = local.ssh_key # optional worker_count = 2 worker_node_labels = [ \"pool=default\" ] } module \"yavin-pool\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.24.3\" # Google Cloud cluster_name = \"yavin\" region = \"europe-west2\" network = module.yavin.network_name # configuration name = \"yavin-16x\" kubeconfig = module.yavin.kubeconfig ssh_authorized_key = local.ssh_key # optional worker_count = 1 machine_type = \"n1-standard-16\" node_labels = [ \"pool=big\" ] } In the example above, the two default workers would be labeled pool: default and the additional worker would be labeled pool: big .","title":"Node Labels"},{"location":"advanced/nodes/#node-taints","text":"Add custom initial taints on worker pool nodes to indicate a node is unique and should only schedule workloads that explicitly tolerate a given taint key. Warning Since taints prevent workloads scheduling onto a node, you must decide whether kube-system DaemonSets (e.g. flannel, Calico, Cilium) should tolerate your custom taint by setting daemonset_tolerations . If you don't list your custom taint(s), important components won't run on these nodes. Cluster Worker Pool module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.24.3\" # Google Cloud cluster_name = \"yavin\" region = \"us-central1\" dns_zone = \"example.com\" dns_zone_name = \"example-zone\" # configuration ssh_authorized_key = local.ssh_key # optional worker_count = 2 daemonset_tolerations = [ \"role\" ] } module \"yavin-pool\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.24.3\" # Google Cloud cluster_name = \"yavin\" region = \"europe-west2\" network = module.yavin.network_name # configuration name = \"yavin-16x\" kubeconfig = module.yavin.kubeconfig ssh_authorized_key = local.ssh_key # optional worker_count = 1 accelerator_type = \"nvidia-tesla-p100\" accelerator_count = 1 node_taints = [ \"role=gpu:NoSchedule\" ] } In the example above, the the additional worker would be tainted with role=gpu:NoSchedule to prevent workloads scheduling, but kube-system components like flannel, Calico, or Cilium would tolerate that custom taint to run there.","title":"Node Taints"},{"location":"advanced/overview/","text":"Advanced \u00b6 Typhoon clusters offer several advanced features for skilled users. ARM64 Customization Nodes Worker Pools","title":"Overview"},{"location":"advanced/overview/#advanced","text":"Typhoon clusters offer several advanced features for skilled users. ARM64 Customization Nodes Worker Pools","title":"Advanced"},{"location":"advanced/worker-pools/","text":"Worker Pools \u00b6 Typhoon AWS, Azure, and Google Cloud allow additional groups of workers to be defined and joined to a cluster. For example, add worker pools of instances with different types, disk sizes, Container Linux channels, or preemptibility modes. Internal Terraform Modules: aws/flatcar-linux/kubernetes/workers aws/fedora-coreos/kubernetes/workers azure/flatcar-linux/kubernetes/workers azure/fedora-coreos/kubernetes/workers google-cloud/flatcar-linux/kubernetes/workers google-cloud/fedora-coreos/kubernetes/workers AWS \u00b6 Create a cluster following the AWS tutorial . Define a worker pool using the AWS internal workers module. Fedora CoreOS Flatcar Linux module \"tempest-worker-pool\" { source = \"git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes/workers?ref=v1.24.3\" # AWS vpc_id = module.tempest.vpc_id subnet_ids = module.tempest.subnet_ids security_groups = module.tempest.worker_security_groups # configuration name = \"tempest-pool\" kubeconfig = module.tempest.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional worker_count = 2 instance_type = \"m5.large\" os_stream = \"next\" } module \"tempest-worker-pool\" { source = \"git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes/workers?ref=v1.24.3\" # AWS vpc_id = module.tempest.vpc_id subnet_ids = module.tempest.subnet_ids security_groups = module.tempest.worker_security_groups # configuration name = \"tempest-pool\" kubeconfig = module.tempest.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional worker_count = 2 instance_type = \"m5.large\" os_image = \"flatcar-beta\" } Apply the change. terraform apply Verify an auto-scaling group of workers joins the cluster within a few minutes. Variables \u00b6 The AWS internal workers module supports a number of variables . Required \u00b6 Name Description Example name Unique name (distinct from cluster name) \"tempest-m5s\" vpc_id Must be set to vpc_id output by cluster module.cluster.vpc_id subnet_ids Must be set to subnet_ids output by cluster module.cluster.subnet_ids security_groups Must be set to worker_security_groups output by cluster module.cluster.worker_security_groups kubeconfig Must be set to kubeconfig output by cluster module.cluster.kubeconfig ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3NZ...\" Optional \u00b6 Name Description Default Example worker_count Number of instances 1 3 instance_type EC2 instance type \"t3.small\" \"t3.medium\" os_image AMI channel for a Container Linux derivative \"flatcar-stable\" flatcar-stable, flatcar-beta, flatcar-alpha os_stream Fedora CoreOS stream for compute instances \"stable\" \"testing\", \"next\" disk_size Size of the EBS volume in GB 40 100 disk_type Type of the EBS volume \"gp3\" standard, gp2, gp3, io1 disk_iops IOPS of the EBS volume 0 (i.e. auto) 400 spot_price Spot price in USD for worker instances or 0 to use on-demand instances 0 0.10 snippets Fedora CoreOS or Container Linux Config snippets [] examples service_cidr Must match service_cidr of cluster \"10.3.0.0/16\" \"10.3.0.0/24\" node_labels List of initial node labels [] [\"worker-pool=foo\"] node_taints List of initial node taints [] [\"role=gpu:NoSchedule\"] Check the list of valid instance types or per-region and per-type spot prices . Azure \u00b6 Create a cluster following the Azure tutorial . Define a worker pool using the Azure internal workers module. Fedora CoreOS Flatcar Linux module \"ramius-worker-pool\" { source = \"git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes/workers?ref=v1.24.3\" # Azure region = module.ramius.region resource_group_name = module.ramius.resource_group_name subnet_id = module.ramius.subnet_id security_group_id = module.ramius.security_group_id backend_address_pool_id = module.ramius.backend_address_pool_id # configuration name = \"ramius-spot\" kubeconfig = module.ramius.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional worker_count = 2 vm_type = \"Standard_F4\" priority = \"Spot\" os_image = \"/subscriptions/some/path/Microsoft.Compute/images/fedora-coreos-31.20200323.3.2\" } module \"ramius-worker-pool\" { source = \"git::https://github.com/poseidon/typhoon//azure/flatcar-linux/kubernetes/workers?ref=v1.24.3\" # Azure region = module.ramius.region resource_group_name = module.ramius.resource_group_name subnet_id = module.ramius.subnet_id security_group_id = module.ramius.security_group_id backend_address_pool_id = module.ramius.backend_address_pool_id # configuration name = \"ramius-spot\" kubeconfig = module.ramius.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional worker_count = 2 vm_type = \"Standard_F4\" priority = \"Spot\" os_image = \"flatcar-beta\" } Apply the change. terraform apply Verify a scale set of workers joins the cluster within a few minutes. Variables \u00b6 The Azure internal workers module supports a number of variables . Required \u00b6 Name Description Example name Unique name (distinct from cluster name) \"ramius-f4\" region Must be set to region output by cluster module.cluster.region resource_group_name Must be set to resource_group_name output by cluster module.cluster.resource_group_name subnet_id Must be set to subnet_id output by cluster module.cluster.subnet_id security_group_id Must be set to security_group_id output by cluster module.cluster.security_group_id backend_address_pool_id Must be set to backend_address_pool_id output by cluster module.cluster.backend_address_pool_id kubeconfig Must be set to kubeconfig output by cluster module.cluster.kubeconfig ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3NZ...\" Optional \u00b6 Name Description Default Example worker_count Number of instances 1 3 vm_type Machine type for instances \"Standard_DS1_v2\" See below os_image Channel for a Container Linux derivative \"flatcar-stable\" flatcar-stable, flatcar-beta, flatcar-alpha priority Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time \"Regular\" \"Spot\" snippets Container Linux Config snippets [] examples service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" node_labels List of initial node labels [] [\"worker-pool=foo\"] node_taints List of initial node taints [] [\"role=gpu:NoSchedule\"] Check the list of valid machine types and their specs . Use az vm list-skus to get the identifier. Google Cloud \u00b6 Create a cluster following the Google Cloud tutorial . Define a worker pool using the Google Cloud internal workers module. Fedora CoreOS Flatcar Linux module \"yavin-worker-pool\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.24.3\" # Google Cloud region = \"europe-west2\" network = module.yavin.network_name cluster_name = \"yavin\" # configuration name = \"yavin-16x\" kubeconfig = module.yavin.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional worker_count = 2 machine_type = \"n1-standard-16\" os_stream = \"testing\" preemptible = true } module \"yavin-worker-pool\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/flatcar-linux/kubernetes/workers?ref=v1.24.3\" # Google Cloud region = \"europe-west2\" network = module.yavin.network_name cluster_name = \"yavin\" # configuration name = \"yavin-16x\" kubeconfig = module.yavin.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional worker_count = 2 machine_type = \"n1-standard-16\" os_image = \"flatcar-stable\" preemptible = true } Apply the change. terraform apply Verify a managed instance group of workers joins the cluster within a few minutes. $ kubectl get nodes NAME STATUS AGE VERSION yavin-controller-0.c.example-com.internal Ready 6m v1.24.3 yavin-worker-jrbf.c.example-com.internal Ready 5m v1.24.3 yavin-worker-mzdm.c.example-com.internal Ready 5m v1.24.3 yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.24.3 yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.24.3 Variables \u00b6 The Google Cloud internal workers module supports a number of variables . Required \u00b6 Name Description Example name Unique name (distinct from cluster name) \"yavin-16x\" cluster_name Must be set to cluster_name of cluster \"yavin\" region Region for the worker pool instances. May differ from the cluster's region \"europe-west2\" network Must be set to network_name output by cluster module.cluster.network_name kubeconfig Must be set to kubeconfig output by cluster module.cluster.kubeconfig os_image Container Linux image for compute instances \"uploaded-flatcar-image\" ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3NZ...\" Check the list of regions docs or with gcloud compute regions list . Optional \u00b6 Name Description Default Example worker_count Number of instances 1 3 machine_type Compute instance machine type \"n1-standard-1\" See below os_stream Fedora CoreOS stream for compute instances \"stable\" \"testing\", \"next\" disk_size Size of the disk in GB 40 100 preemptible If true, Compute Engine will terminate instances randomly within 24 hours false true snippets Container Linux Config snippets [] examples service_cidr Must match service_cidr of cluster \"10.3.0.0/16\" \"10.3.0.0/24\" node_labels List of initial node labels [] [\"worker-pool=foo\"] node_taints List of initial node taints [] [\"role=gpu:NoSchedule\"] Check the list of valid machine types .","title":"Worker Pools"},{"location":"advanced/worker-pools/#worker-pools","text":"Typhoon AWS, Azure, and Google Cloud allow additional groups of workers to be defined and joined to a cluster. For example, add worker pools of instances with different types, disk sizes, Container Linux channels, or preemptibility modes. Internal Terraform Modules: aws/flatcar-linux/kubernetes/workers aws/fedora-coreos/kubernetes/workers azure/flatcar-linux/kubernetes/workers azure/fedora-coreos/kubernetes/workers google-cloud/flatcar-linux/kubernetes/workers google-cloud/fedora-coreos/kubernetes/workers","title":"Worker Pools"},{"location":"advanced/worker-pools/#aws","text":"Create a cluster following the AWS tutorial . Define a worker pool using the AWS internal workers module. Fedora CoreOS Flatcar Linux module \"tempest-worker-pool\" { source = \"git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes/workers?ref=v1.24.3\" # AWS vpc_id = module.tempest.vpc_id subnet_ids = module.tempest.subnet_ids security_groups = module.tempest.worker_security_groups # configuration name = \"tempest-pool\" kubeconfig = module.tempest.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional worker_count = 2 instance_type = \"m5.large\" os_stream = \"next\" } module \"tempest-worker-pool\" { source = \"git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes/workers?ref=v1.24.3\" # AWS vpc_id = module.tempest.vpc_id subnet_ids = module.tempest.subnet_ids security_groups = module.tempest.worker_security_groups # configuration name = \"tempest-pool\" kubeconfig = module.tempest.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional worker_count = 2 instance_type = \"m5.large\" os_image = \"flatcar-beta\" } Apply the change. terraform apply Verify an auto-scaling group of workers joins the cluster within a few minutes.","title":"AWS"},{"location":"advanced/worker-pools/#variables","text":"The AWS internal workers module supports a number of variables .","title":"Variables"},{"location":"advanced/worker-pools/#required","text":"Name Description Example name Unique name (distinct from cluster name) \"tempest-m5s\" vpc_id Must be set to vpc_id output by cluster module.cluster.vpc_id subnet_ids Must be set to subnet_ids output by cluster module.cluster.subnet_ids security_groups Must be set to worker_security_groups output by cluster module.cluster.worker_security_groups kubeconfig Must be set to kubeconfig output by cluster module.cluster.kubeconfig ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3NZ...\"","title":"Required"},{"location":"advanced/worker-pools/#optional","text":"Name Description Default Example worker_count Number of instances 1 3 instance_type EC2 instance type \"t3.small\" \"t3.medium\" os_image AMI channel for a Container Linux derivative \"flatcar-stable\" flatcar-stable, flatcar-beta, flatcar-alpha os_stream Fedora CoreOS stream for compute instances \"stable\" \"testing\", \"next\" disk_size Size of the EBS volume in GB 40 100 disk_type Type of the EBS volume \"gp3\" standard, gp2, gp3, io1 disk_iops IOPS of the EBS volume 0 (i.e. auto) 400 spot_price Spot price in USD for worker instances or 0 to use on-demand instances 0 0.10 snippets Fedora CoreOS or Container Linux Config snippets [] examples service_cidr Must match service_cidr of cluster \"10.3.0.0/16\" \"10.3.0.0/24\" node_labels List of initial node labels [] [\"worker-pool=foo\"] node_taints List of initial node taints [] [\"role=gpu:NoSchedule\"] Check the list of valid instance types or per-region and per-type spot prices .","title":"Optional"},{"location":"advanced/worker-pools/#azure","text":"Create a cluster following the Azure tutorial . Define a worker pool using the Azure internal workers module. Fedora CoreOS Flatcar Linux module \"ramius-worker-pool\" { source = \"git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes/workers?ref=v1.24.3\" # Azure region = module.ramius.region resource_group_name = module.ramius.resource_group_name subnet_id = module.ramius.subnet_id security_group_id = module.ramius.security_group_id backend_address_pool_id = module.ramius.backend_address_pool_id # configuration name = \"ramius-spot\" kubeconfig = module.ramius.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional worker_count = 2 vm_type = \"Standard_F4\" priority = \"Spot\" os_image = \"/subscriptions/some/path/Microsoft.Compute/images/fedora-coreos-31.20200323.3.2\" } module \"ramius-worker-pool\" { source = \"git::https://github.com/poseidon/typhoon//azure/flatcar-linux/kubernetes/workers?ref=v1.24.3\" # Azure region = module.ramius.region resource_group_name = module.ramius.resource_group_name subnet_id = module.ramius.subnet_id security_group_id = module.ramius.security_group_id backend_address_pool_id = module.ramius.backend_address_pool_id # configuration name = \"ramius-spot\" kubeconfig = module.ramius.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional worker_count = 2 vm_type = \"Standard_F4\" priority = \"Spot\" os_image = \"flatcar-beta\" } Apply the change. terraform apply Verify a scale set of workers joins the cluster within a few minutes.","title":"Azure"},{"location":"advanced/worker-pools/#variables_1","text":"The Azure internal workers module supports a number of variables .","title":"Variables"},{"location":"advanced/worker-pools/#required_1","text":"Name Description Example name Unique name (distinct from cluster name) \"ramius-f4\" region Must be set to region output by cluster module.cluster.region resource_group_name Must be set to resource_group_name output by cluster module.cluster.resource_group_name subnet_id Must be set to subnet_id output by cluster module.cluster.subnet_id security_group_id Must be set to security_group_id output by cluster module.cluster.security_group_id backend_address_pool_id Must be set to backend_address_pool_id output by cluster module.cluster.backend_address_pool_id kubeconfig Must be set to kubeconfig output by cluster module.cluster.kubeconfig ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3NZ...\"","title":"Required"},{"location":"advanced/worker-pools/#optional_1","text":"Name Description Default Example worker_count Number of instances 1 3 vm_type Machine type for instances \"Standard_DS1_v2\" See below os_image Channel for a Container Linux derivative \"flatcar-stable\" flatcar-stable, flatcar-beta, flatcar-alpha priority Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time \"Regular\" \"Spot\" snippets Container Linux Config snippets [] examples service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" node_labels List of initial node labels [] [\"worker-pool=foo\"] node_taints List of initial node taints [] [\"role=gpu:NoSchedule\"] Check the list of valid machine types and their specs . Use az vm list-skus to get the identifier.","title":"Optional"},{"location":"advanced/worker-pools/#google-cloud","text":"Create a cluster following the Google Cloud tutorial . Define a worker pool using the Google Cloud internal workers module. Fedora CoreOS Flatcar Linux module \"yavin-worker-pool\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes/workers?ref=v1.24.3\" # Google Cloud region = \"europe-west2\" network = module.yavin.network_name cluster_name = \"yavin\" # configuration name = \"yavin-16x\" kubeconfig = module.yavin.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional worker_count = 2 machine_type = \"n1-standard-16\" os_stream = \"testing\" preemptible = true } module \"yavin-worker-pool\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/flatcar-linux/kubernetes/workers?ref=v1.24.3\" # Google Cloud region = \"europe-west2\" network = module.yavin.network_name cluster_name = \"yavin\" # configuration name = \"yavin-16x\" kubeconfig = module.yavin.kubeconfig ssh_authorized_key = var.ssh_authorized_key # optional worker_count = 2 machine_type = \"n1-standard-16\" os_image = \"flatcar-stable\" preemptible = true } Apply the change. terraform apply Verify a managed instance group of workers joins the cluster within a few minutes. $ kubectl get nodes NAME STATUS AGE VERSION yavin-controller-0.c.example-com.internal Ready 6m v1.24.3 yavin-worker-jrbf.c.example-com.internal Ready 5m v1.24.3 yavin-worker-mzdm.c.example-com.internal Ready 5m v1.24.3 yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.24.3 yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.24.3","title":"Google Cloud"},{"location":"advanced/worker-pools/#variables_2","text":"The Google Cloud internal workers module supports a number of variables .","title":"Variables"},{"location":"advanced/worker-pools/#required_2","text":"Name Description Example name Unique name (distinct from cluster name) \"yavin-16x\" cluster_name Must be set to cluster_name of cluster \"yavin\" region Region for the worker pool instances. May differ from the cluster's region \"europe-west2\" network Must be set to network_name output by cluster module.cluster.network_name kubeconfig Must be set to kubeconfig output by cluster module.cluster.kubeconfig os_image Container Linux image for compute instances \"uploaded-flatcar-image\" ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3NZ...\" Check the list of regions docs or with gcloud compute regions list .","title":"Required"},{"location":"advanced/worker-pools/#optional_2","text":"Name Description Default Example worker_count Number of instances 1 3 machine_type Compute instance machine type \"n1-standard-1\" See below os_stream Fedora CoreOS stream for compute instances \"stable\" \"testing\", \"next\" disk_size Size of the disk in GB 40 100 preemptible If true, Compute Engine will terminate instances randomly within 24 hours false true snippets Container Linux Config snippets [] examples service_cidr Must match service_cidr of cluster \"10.3.0.0/16\" \"10.3.0.0/24\" node_labels List of initial node labels [] [\"worker-pool=foo\"] node_taints List of initial node taints [] [\"role=gpu:NoSchedule\"] Check the list of valid machine types .","title":"Optional"},{"location":"architecture/aws/","text":"AWS \u00b6 Load Balancing \u00b6 kube-apiserver \u00b6 A network load balancer (NLB) distributes IPv4 TCP/6443 traffic across a target group of controller nodes with a healthy kube-apiserver . Clusters with multiple controllers span zones in a region to tolerate zone outages. HTTP/HTTPS Ingress \u00b6 A network load balancer (NLB) distributes IPv4 TCP/80 and TCP/443 traffic across two target groups of worker nodes with a healthy Ingress controller. Workers span the zones in a region to tolerate zone outages. The AWS NLB has a DNS alias record (regional) resolving to 3 zonal IPv4 addresses. The alias record is output as ingress_dns_name for use in application DNS CNAME records. See Ingress on AWS . TCP Services \u00b6 Load balance TCP applications by adding a listener and target group. A listener and target group may map different ports (e.g 3333 external, 30333 internal). # Forward TCP traffic to a target group resource \"aws_lb_listener\" \"some-app\" { load_balancer_arn = module.tempest.nlb_id protocol = \"TCP\" port = \"3333\" default_action { type = \"forward\" target_group_arn = aws_lb_target_group.some-app.arn } } # Target group of workers for some-app resource \"aws_lb_target_group\" \"some-app\" { name = \"some-app\" vpc_id = module.tempest.vpc_id target_type = \"instance\" protocol = \"TCP\" port = 3333 health_check { protocol = \"TCP\" port = 30333 } } Pass worker_target_groups to the cluster to register worker instances into custom target groups. module \"tempest\" { ... worker_target_groups = [ aws_lb_target_group.some-app.id , ] } Notes: AWS NLBs and target groups do not support UDP Global Accelerator does support UDP, but its expensive Firewalls \u00b6 Add firewall rules to the worker security group. resource \"aws_security_group_rule\" \"some-app\" { security_group_id = module.tempest.worker_security_groups[0 ] type = \"ingress\" protocol = \"tcp\" from_port = 3333 to_port = 30333 cidr_blocks = [ \"0.0.0.0/0\" ] } Routes \u00b6 Add a custom route to the VPC route table. data \"aws_route_table\" \"default\" { vpc_id = module.temptest.vpc_id subnet_id = module.tempest.subnet_ids[0 ] } resource \"aws_route\" \"peering\" { route_table_id = data.aws_route_table.default.id destination_cidr_block = \"192.168.4.0/24\" ... } IPv6 \u00b6 IPv6 Feature Supported Node IPv6 address Yes Node Outbound IPv6 Yes Kubernetes Ingress IPv6 Yes","title":"AWS"},{"location":"architecture/aws/#aws","text":"","title":"AWS"},{"location":"architecture/aws/#load-balancing","text":"","title":"Load Balancing"},{"location":"architecture/aws/#kube-apiserver","text":"A network load balancer (NLB) distributes IPv4 TCP/6443 traffic across a target group of controller nodes with a healthy kube-apiserver . Clusters with multiple controllers span zones in a region to tolerate zone outages.","title":"kube-apiserver"},{"location":"architecture/aws/#httphttps-ingress","text":"A network load balancer (NLB) distributes IPv4 TCP/80 and TCP/443 traffic across two target groups of worker nodes with a healthy Ingress controller. Workers span the zones in a region to tolerate zone outages. The AWS NLB has a DNS alias record (regional) resolving to 3 zonal IPv4 addresses. The alias record is output as ingress_dns_name for use in application DNS CNAME records. See Ingress on AWS .","title":"HTTP/HTTPS Ingress"},{"location":"architecture/aws/#tcp-services","text":"Load balance TCP applications by adding a listener and target group. A listener and target group may map different ports (e.g 3333 external, 30333 internal). # Forward TCP traffic to a target group resource \"aws_lb_listener\" \"some-app\" { load_balancer_arn = module.tempest.nlb_id protocol = \"TCP\" port = \"3333\" default_action { type = \"forward\" target_group_arn = aws_lb_target_group.some-app.arn } } # Target group of workers for some-app resource \"aws_lb_target_group\" \"some-app\" { name = \"some-app\" vpc_id = module.tempest.vpc_id target_type = \"instance\" protocol = \"TCP\" port = 3333 health_check { protocol = \"TCP\" port = 30333 } } Pass worker_target_groups to the cluster to register worker instances into custom target groups. module \"tempest\" { ... worker_target_groups = [ aws_lb_target_group.some-app.id , ] } Notes: AWS NLBs and target groups do not support UDP Global Accelerator does support UDP, but its expensive","title":"TCP Services"},{"location":"architecture/aws/#firewalls","text":"Add firewall rules to the worker security group. resource \"aws_security_group_rule\" \"some-app\" { security_group_id = module.tempest.worker_security_groups[0 ] type = \"ingress\" protocol = \"tcp\" from_port = 3333 to_port = 30333 cidr_blocks = [ \"0.0.0.0/0\" ] }","title":"Firewalls"},{"location":"architecture/aws/#routes","text":"Add a custom route to the VPC route table. data \"aws_route_table\" \"default\" { vpc_id = module.temptest.vpc_id subnet_id = module.tempest.subnet_ids[0 ] } resource \"aws_route\" \"peering\" { route_table_id = data.aws_route_table.default.id destination_cidr_block = \"192.168.4.0/24\" ... }","title":"Routes"},{"location":"architecture/aws/#ipv6","text":"IPv6 Feature Supported Node IPv6 address Yes Node Outbound IPv6 Yes Kubernetes Ingress IPv6 Yes","title":"IPv6"},{"location":"architecture/azure/","text":"Azure \u00b6 Load Balancing \u00b6 kube-apiserver \u00b6 A load balancer distributes IPv4 TCP/6443 traffic across a backend address pool of controllers with a healthy kube-apiserver . Clusters with multiple controllers use an availability set with 2 fault domains to tolerate hardware failures within Azure. HTTP/HTTPS Ingress \u00b6 A load balancer distributes IPv4 TCP/80 and TCP/443 traffic across a backend address pool of workers with a healthy Ingress controller. The Azure LB IPv4 address is output as ingress_static_ipv4 for use in DNS A records. See Ingress on Azure . TCP/UDP Services \u00b6 Load balance TCP/UDP applications by adding rules to the Azure LB (output). A rule may map different ports (e.g. 3333 external, 30333 internal). # Forward traffic to the worker backend address pool resource \"azurerm_lb_rule\" \"some-app-tcp\" { resource_group_name = module.ramius.resource_group_name name = \"some-app-tcp\" loadbalancer_id = module.ramius.loadbalancer_id frontend_ip_configuration_name = \"ingress\" protocol = \"Tcp\" frontend_port = 3333 backend_port = 30333 backend_address_pool_id = module.ramius.backend_address_pool_id probe_id = azurerm_lb_probe.some-app.id } # Health check some-app resource \"azurerm_lb_probe\" \"some-app\" { resource_group_name = module.ramius.resource_group_name name = \"some-app\" loadbalancer_id = module.ramius.loadbalancer_id protocol = \"Tcp\" port = 30333 } Firewalls \u00b6 Add firewall rules to the worker security group. resource \"azurerm_network_security_rule\" \"some-app\" { resource_group_name = \"${module.ramius.resource_group_name}\" name = \"some-app\" network_security_group_name = module.ramius.worker_security_group_name priority = \"3001\" access = \"Allow\" direction = \"Inbound\" protocol = \"Tcp\" source_port_range = \"*\" destination_port_range = \"30333\" source_address_prefix = \"*\" destination_address_prefixes = module.ramius.worker_address_prefixes } IPv6 \u00b6 Azure does not provide public IPv6 addresses at the standard SKU. IPv6 Feature Supported Node IPv6 address No Node Outbound IPv6 No Kubernetes Ingress IPv6 No","title":"Azure"},{"location":"architecture/azure/#azure","text":"","title":"Azure"},{"location":"architecture/azure/#load-balancing","text":"","title":"Load Balancing"},{"location":"architecture/azure/#kube-apiserver","text":"A load balancer distributes IPv4 TCP/6443 traffic across a backend address pool of controllers with a healthy kube-apiserver . Clusters with multiple controllers use an availability set with 2 fault domains to tolerate hardware failures within Azure.","title":"kube-apiserver"},{"location":"architecture/azure/#httphttps-ingress","text":"A load balancer distributes IPv4 TCP/80 and TCP/443 traffic across a backend address pool of workers with a healthy Ingress controller. The Azure LB IPv4 address is output as ingress_static_ipv4 for use in DNS A records. See Ingress on Azure .","title":"HTTP/HTTPS Ingress"},{"location":"architecture/azure/#tcpudp-services","text":"Load balance TCP/UDP applications by adding rules to the Azure LB (output). A rule may map different ports (e.g. 3333 external, 30333 internal). # Forward traffic to the worker backend address pool resource \"azurerm_lb_rule\" \"some-app-tcp\" { resource_group_name = module.ramius.resource_group_name name = \"some-app-tcp\" loadbalancer_id = module.ramius.loadbalancer_id frontend_ip_configuration_name = \"ingress\" protocol = \"Tcp\" frontend_port = 3333 backend_port = 30333 backend_address_pool_id = module.ramius.backend_address_pool_id probe_id = azurerm_lb_probe.some-app.id } # Health check some-app resource \"azurerm_lb_probe\" \"some-app\" { resource_group_name = module.ramius.resource_group_name name = \"some-app\" loadbalancer_id = module.ramius.loadbalancer_id protocol = \"Tcp\" port = 30333 }","title":"TCP/UDP Services"},{"location":"architecture/azure/#firewalls","text":"Add firewall rules to the worker security group. resource \"azurerm_network_security_rule\" \"some-app\" { resource_group_name = \"${module.ramius.resource_group_name}\" name = \"some-app\" network_security_group_name = module.ramius.worker_security_group_name priority = \"3001\" access = \"Allow\" direction = \"Inbound\" protocol = \"Tcp\" source_port_range = \"*\" destination_port_range = \"30333\" source_address_prefix = \"*\" destination_address_prefixes = module.ramius.worker_address_prefixes }","title":"Firewalls"},{"location":"architecture/azure/#ipv6","text":"Azure does not provide public IPv6 addresses at the standard SKU. IPv6 Feature Supported Node IPv6 address No Node Outbound IPv6 No Kubernetes Ingress IPv6 No","title":"IPv6"},{"location":"architecture/bare-metal/","text":"Bare-Metal \u00b6 Load Balancing \u00b6 kube-apiserver \u00b6 Load balancing across controller nodes with a healthy kube-apiserver is determined by your unique bare-metal environment and its capabilities. HTTP/HTTPS Ingress \u00b6 Load balancing across worker nodes with a healthy Ingress Controller is determined by your unique bare-metal environment and its capabilities. See the nginx-ingress addon to run Nginx as the Ingress Controller for bare-metal. TCP/UDP Services \u00b6 Load balancing across worker nodes with TCP/UDP services is determined by your unique bare-metal environment and its capabilities. IPv6 \u00b6 Status of IPv6 on Typhoon bare-metal clusters. IPv6 Feature Supported Node IPv6 address Yes Node Outbound IPv6 Yes Kubernetes Ingress IPv6 Possible IPv6 support depends upon the bare-metal network environment.","title":"Bare-Metal"},{"location":"architecture/bare-metal/#bare-metal","text":"","title":"Bare-Metal"},{"location":"architecture/bare-metal/#load-balancing","text":"","title":"Load Balancing"},{"location":"architecture/bare-metal/#kube-apiserver","text":"Load balancing across controller nodes with a healthy kube-apiserver is determined by your unique bare-metal environment and its capabilities.","title":"kube-apiserver"},{"location":"architecture/bare-metal/#httphttps-ingress","text":"Load balancing across worker nodes with a healthy Ingress Controller is determined by your unique bare-metal environment and its capabilities. See the nginx-ingress addon to run Nginx as the Ingress Controller for bare-metal.","title":"HTTP/HTTPS Ingress"},{"location":"architecture/bare-metal/#tcpudp-services","text":"Load balancing across worker nodes with TCP/UDP services is determined by your unique bare-metal environment and its capabilities.","title":"TCP/UDP Services"},{"location":"architecture/bare-metal/#ipv6","text":"Status of IPv6 on Typhoon bare-metal clusters. IPv6 Feature Supported Node IPv6 address Yes Node Outbound IPv6 Yes Kubernetes Ingress IPv6 Possible IPv6 support depends upon the bare-metal network environment.","title":"IPv6"},{"location":"architecture/concepts/","text":"Concepts \u00b6 Let's cover the concepts you'll need to get started. Kubernetes \u00b6 Kubernetes is an open-source cluster system for deploying, scaling, and managing containerized applications across a pool of compute nodes (bare-metal, droplets, instances). Nodes \u00b6 All cluster nodes provision themselves from a declarative configuration upfront. Nodes run a kubelet service and register themselves with the control plane to join the cluster. All nodes run kube-proxy and calico or flannel pods. Controllers \u00b6 Controller nodes are scheduled to run the Kubernetes apiserver , scheduler , controller-manager , coredns , and kube-proxy . A fully qualified domain name (e.g. cluster_name.domain.com) resolving to a network load balancer or round-robin DNS (depends on platform) is used to refer to the control plane. Workers \u00b6 Worker nodes register with the control plane and run application workloads. Terraform \u00b6 Terraform config files declare resources that Terraform should manage. Resources include infrastructure components created through a provider API (e.g. Compute instances, DNS records) or local assets like TLS certificates and config files. # Declare an instance resource \"google_compute_instance\" \"pet\" { # ... } The terraform tool parses configs, reconciles the desired state with actual state, and updates resources to reach desired state. $ terraform plan Plan: 4 to add, 0 to change, 0 to destroy. $ terraform apply Apply complete! Resources: 4 added, 0 changed, 0 destroyed. With Typhoon, you'll be able to manage clusters with Terraform. Modules \u00b6 Terraform modules allow a collection of resources to be configured and managed together. Typhoon provides a Kubernetes cluster Terraform module for each supported platform and operating system. Clusters are declared in Terraform by referencing the module. module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes\" cluster_name = \"yavin\" ... } Versioning \u00b6 Modules are updated regularly, set the version to a release tag or commit hash. ... source = \"git:https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=hash\" Module versioning ensures terraform get --update only fetches the desired version, so plan and apply don't change cluster resources, unless the version is altered. Organize \u00b6 Maintain Terraform configs for \"live\" infrastructure in a versioned repository. Seek to organize configs to reflect resources that should be managed together in a terraform apply invocation. You may choose to organize resources all together, by team, by project, or some other scheme. Here's an example that manages clusters together: .git/ infra/ \u2514\u2500\u2500 terraform \u2514\u2500\u2500 clusters \u251c\u2500\u2500 aws-tempest.tf \u251c\u2500\u2500 azure-ramius.tf \u251c\u2500\u2500 bare-metal-mercury.tf \u251c\u2500\u2500 google-cloud-yavin.tf \u251c\u2500\u2500 digital-ocean-nemo.tf \u251c\u2500\u2500 providers.tf \u251c\u2500\u2500 terraform.tfvars \u2514\u2500\u2500 remote-backend.tf By convention, providers.tf registers provider APIs, terraform.tfvars stores shared values, and state is written to a remote backend. State \u00b6 Terraform syncs its state with provider APIs to plan changes to reconcile to the desired state. By default, Terraform writes state data (including secrets!) to a terraform.tfstate file. At a minimum , add a .gitignore file (or equivalent) to prevent state from being committed to your infrastructure repository. # .gitignore *.tfstate *.tfstate.backup .terraform/ Remote Backend \u00b6 Later, you may wish to checkout Terraform remote backends which store state in a remote bucket like Google Storage or S3. terraform { backend \"gcs\" { credentials = \"/path/to/credentials.json\" project = \"project-id\" bucket = \"bucket-id\" path = \"metal.tfstate\" } }","title":"Concepts"},{"location":"architecture/concepts/#concepts","text":"Let's cover the concepts you'll need to get started.","title":"Concepts"},{"location":"architecture/concepts/#kubernetes","text":"Kubernetes is an open-source cluster system for deploying, scaling, and managing containerized applications across a pool of compute nodes (bare-metal, droplets, instances).","title":"Kubernetes"},{"location":"architecture/concepts/#nodes","text":"All cluster nodes provision themselves from a declarative configuration upfront. Nodes run a kubelet service and register themselves with the control plane to join the cluster. All nodes run kube-proxy and calico or flannel pods.","title":"Nodes"},{"location":"architecture/concepts/#controllers","text":"Controller nodes are scheduled to run the Kubernetes apiserver , scheduler , controller-manager , coredns , and kube-proxy . A fully qualified domain name (e.g. cluster_name.domain.com) resolving to a network load balancer or round-robin DNS (depends on platform) is used to refer to the control plane.","title":"Controllers"},{"location":"architecture/concepts/#workers","text":"Worker nodes register with the control plane and run application workloads.","title":"Workers"},{"location":"architecture/concepts/#terraform","text":"Terraform config files declare resources that Terraform should manage. Resources include infrastructure components created through a provider API (e.g. Compute instances, DNS records) or local assets like TLS certificates and config files. # Declare an instance resource \"google_compute_instance\" \"pet\" { # ... } The terraform tool parses configs, reconciles the desired state with actual state, and updates resources to reach desired state. $ terraform plan Plan: 4 to add, 0 to change, 0 to destroy. $ terraform apply Apply complete! Resources: 4 added, 0 changed, 0 destroyed. With Typhoon, you'll be able to manage clusters with Terraform.","title":"Terraform"},{"location":"architecture/concepts/#modules","text":"Terraform modules allow a collection of resources to be configured and managed together. Typhoon provides a Kubernetes cluster Terraform module for each supported platform and operating system. Clusters are declared in Terraform by referencing the module. module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes\" cluster_name = \"yavin\" ... }","title":"Modules"},{"location":"architecture/concepts/#versioning","text":"Modules are updated regularly, set the version to a release tag or commit hash. ... source = \"git:https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=hash\" Module versioning ensures terraform get --update only fetches the desired version, so plan and apply don't change cluster resources, unless the version is altered.","title":"Versioning"},{"location":"architecture/concepts/#organize","text":"Maintain Terraform configs for \"live\" infrastructure in a versioned repository. Seek to organize configs to reflect resources that should be managed together in a terraform apply invocation. You may choose to organize resources all together, by team, by project, or some other scheme. Here's an example that manages clusters together: .git/ infra/ \u2514\u2500\u2500 terraform \u2514\u2500\u2500 clusters \u251c\u2500\u2500 aws-tempest.tf \u251c\u2500\u2500 azure-ramius.tf \u251c\u2500\u2500 bare-metal-mercury.tf \u251c\u2500\u2500 google-cloud-yavin.tf \u251c\u2500\u2500 digital-ocean-nemo.tf \u251c\u2500\u2500 providers.tf \u251c\u2500\u2500 terraform.tfvars \u2514\u2500\u2500 remote-backend.tf By convention, providers.tf registers provider APIs, terraform.tfvars stores shared values, and state is written to a remote backend.","title":"Organize"},{"location":"architecture/concepts/#state","text":"Terraform syncs its state with provider APIs to plan changes to reconcile to the desired state. By default, Terraform writes state data (including secrets!) to a terraform.tfstate file. At a minimum , add a .gitignore file (or equivalent) to prevent state from being committed to your infrastructure repository. # .gitignore *.tfstate *.tfstate.backup .terraform/","title":"State"},{"location":"architecture/concepts/#remote-backend","text":"Later, you may wish to checkout Terraform remote backends which store state in a remote bucket like Google Storage or S3. terraform { backend \"gcs\" { credentials = \"/path/to/credentials.json\" project = \"project-id\" bucket = \"bucket-id\" path = \"metal.tfstate\" } }","title":"Remote Backend"},{"location":"architecture/digitalocean/","text":"DigitalOcean \u00b6 Load Balancing \u00b6 kube-apiserver \u00b6 DNS A records round-robin 1 resolve IPv4 TCP/6443 traffic to controller droplets (regardless of whether their kube-apiserver is healthy). Clusters with multiple controllers are supported, but round-robin means \u2153 down causes ~\u2153 of apiserver requests will fail). HTTP/HTTPS Ingress \u00b6 DNS records (A and AAAA) round-robin 1 resolve the workers_dns name (e.g. nemo-workers.example.com ) to a worker droplet's IPv4 and IPv6 address. This allows running an Ingress controller Daemonset across workers (resolved regardless of whether its the controller is healthy). The DNS record name is output as workers_dns for use in application DNS CNAME records. See Ingess on DigitalOcean . TCP/UDP Services \u00b6 DNS records (A and AAAA) round-robin 1 resolve the workers_dns name (e.g. nemo-workers.example.com ) to a worker droplet's IPv4 and IPv6 address. The DNS record name is output as workers_dns for use in application DNS CNAME records. With round-robin as \"load balancing\", TCP/UDP services can be served via the same CNAME. Don't forget to add a firewall rule for the application. Custom Load Balancer \u00b6 Add a DigitalOcean load balancer to distribute IPv4 TCP traffic (HTTP/HTTPS Ingress or TCP service) across worker droplets (tagged with worker_tag ) with a healthy Ingress controller. A load balancer adds cost, but adds redundancy against worker failures (closer to Typhoon clusters on other platforms). resource \"digitalocean_loadbalancer\" \"ingress\" { name = \"ingress\" region = \"fra1\" vpc_uuid = module.nemo.vpc_id droplet_tag = module.nemo.worker_tag healthcheck { protocol = \"http\" port = \"10254\" path = \"/healthz\" healthy_threshold = 2 } forwarding_rule { entry_protocol = \"tcp\" entry_port = 80 target_protocol = \"tcp\" target_port = 80 } forwarding_rule { entry_protocol = \"tcp\" entry_port = 443 target_protocol = \"tcp\" target_port = 443 } forwarding_rule { entry_protocol = \"tcp\" entry_port = 3333 target_protocol = \"tcp\" target_port = 30300 } } Define DNS A records to digitalocean_loadbalancer.ingress.ip instead of CNAMEs. Firewalls \u00b6 Add firewall rules matching worker droplets with worker_tag . resource \"digitalocean_firewall\" \"some-app\" { name = \"some-app\" tags = [ module.nemo.worker_tag ] inbound_rule { protocol = \"tcp\" port_range = \"30300\" source_addresses = [ \"0.0.0.0/0\" ] } } IPv6 \u00b6 DigitalOcean load balancers do not have an IPv6 address. Resolving individual droplets' IPv6 addresses and using an Ingress controller with hostNetwork: true is a possible way to serve IPv6 traffic, if one must. IPv6 Feature Supported Node IPv6 address Yes Node Outbound IPv6 Yes Kubernetes Ingress IPv6 Possible DigitalOcean does offer load balancers. We've opted not to use them to keep the DigitalOcean cluster cheap for developers. \u21a9 \u21a9 \u21a9","title":"DigitalOcean"},{"location":"architecture/digitalocean/#digitalocean","text":"","title":"DigitalOcean"},{"location":"architecture/digitalocean/#load-balancing","text":"","title":"Load Balancing"},{"location":"architecture/digitalocean/#kube-apiserver","text":"DNS A records round-robin 1 resolve IPv4 TCP/6443 traffic to controller droplets (regardless of whether their kube-apiserver is healthy). Clusters with multiple controllers are supported, but round-robin means \u2153 down causes ~\u2153 of apiserver requests will fail).","title":"kube-apiserver"},{"location":"architecture/digitalocean/#httphttps-ingress","text":"DNS records (A and AAAA) round-robin 1 resolve the workers_dns name (e.g. nemo-workers.example.com ) to a worker droplet's IPv4 and IPv6 address. This allows running an Ingress controller Daemonset across workers (resolved regardless of whether its the controller is healthy). The DNS record name is output as workers_dns for use in application DNS CNAME records. See Ingess on DigitalOcean .","title":"HTTP/HTTPS Ingress"},{"location":"architecture/digitalocean/#tcpudp-services","text":"DNS records (A and AAAA) round-robin 1 resolve the workers_dns name (e.g. nemo-workers.example.com ) to a worker droplet's IPv4 and IPv6 address. The DNS record name is output as workers_dns for use in application DNS CNAME records. With round-robin as \"load balancing\", TCP/UDP services can be served via the same CNAME. Don't forget to add a firewall rule for the application.","title":"TCP/UDP Services"},{"location":"architecture/digitalocean/#custom-load-balancer","text":"Add a DigitalOcean load balancer to distribute IPv4 TCP traffic (HTTP/HTTPS Ingress or TCP service) across worker droplets (tagged with worker_tag ) with a healthy Ingress controller. A load balancer adds cost, but adds redundancy against worker failures (closer to Typhoon clusters on other platforms). resource \"digitalocean_loadbalancer\" \"ingress\" { name = \"ingress\" region = \"fra1\" vpc_uuid = module.nemo.vpc_id droplet_tag = module.nemo.worker_tag healthcheck { protocol = \"http\" port = \"10254\" path = \"/healthz\" healthy_threshold = 2 } forwarding_rule { entry_protocol = \"tcp\" entry_port = 80 target_protocol = \"tcp\" target_port = 80 } forwarding_rule { entry_protocol = \"tcp\" entry_port = 443 target_protocol = \"tcp\" target_port = 443 } forwarding_rule { entry_protocol = \"tcp\" entry_port = 3333 target_protocol = \"tcp\" target_port = 30300 } } Define DNS A records to digitalocean_loadbalancer.ingress.ip instead of CNAMEs.","title":"Custom Load Balancer"},{"location":"architecture/digitalocean/#firewalls","text":"Add firewall rules matching worker droplets with worker_tag . resource \"digitalocean_firewall\" \"some-app\" { name = \"some-app\" tags = [ module.nemo.worker_tag ] inbound_rule { protocol = \"tcp\" port_range = \"30300\" source_addresses = [ \"0.0.0.0/0\" ] } }","title":"Firewalls"},{"location":"architecture/digitalocean/#ipv6","text":"DigitalOcean load balancers do not have an IPv6 address. Resolving individual droplets' IPv6 addresses and using an Ingress controller with hostNetwork: true is a possible way to serve IPv6 traffic, if one must. IPv6 Feature Supported Node IPv6 address Yes Node Outbound IPv6 Yes Kubernetes Ingress IPv6 Possible DigitalOcean does offer load balancers. We've opted not to use them to keep the DigitalOcean cluster cheap for developers. \u21a9 \u21a9 \u21a9","title":"IPv6"},{"location":"architecture/google-cloud/","text":"Google Cloud \u00b6 Load Balancing \u00b6 kube-apiserver \u00b6 A global forwarding rule (IPv4 anycast) and TCP Proxy distribute IPv4 TCP/443 traffic across a backend service with zonal instance groups of controller(s) with a healthy kube-apiserver (TCP/6443). Clusters with multiple controllers span zones in a region to tolerate zone outages. Notes: GCP TCP Proxy limits external port options (e.g. must use 443, not 6443) A regional NLB cannot be used for multi-controller (see #190 ) HTTP/HTTP Ingress \u00b6 Global forwarding rules and a TCP Proxy distribute IPv4/IPv6 TCP/80 and TCP/443 traffic across a managed instance group of workers with a healthy Ingress Controller. Workers span zones in a region to tolerate zone outages. The IPv4 and IPv6 anycast addresses are output as ingress_static_ipv4 and ingress_static_ipv6 for use in DNS A and AAAA records. See Ingress on Google Cloud . TCP/UDP Services \u00b6 Load balance TCP/UDP applications by adding a forwarding rule to the worker target pool (output). # Static IPv4 address for some-app Load Balancing resource \"google_compute_address\" \"some-app-ipv4\" { name = \"some-app-ipv4\" } # Forward IPv4 TCP traffic to the target pool resource \"google_compute_forwarding_rule\" \"some-app-tcp\" { name = \"some-app-tcp\" ip_address = google_compute_address.some-app-ipv4.address ip_protocol = \"TCP\" port_range = \"3333\" target = module.yavin.worker_target_pool } # Forward IPv4 UDP traffic to the target pool resource \"google_compute_forwarding_rule\" \"some-app-udp\" { name = \"some-app-udp\" ip_address = google_compute_address.some-app-ipv4.address ip_protocol = \"UDP\" port_range = \"3333\" target = module.yavin.worker_target_pool } Notes: GCP Global Load Balancers aren't appropriate for custom TCP/UDP. Backend Services require a named port corresponding to an instance group (output by Typhoon) port. Typhoon shouldn't accept a list of every TCP/UDP service that may later be hosted on the cluster. Backend Services don't support UDP (i.e. rules out global load balancers) IPv4 Only: Regional Load Balancers use a regional IPv4 address (e.g. google_compute_address ), no IPv6. Forward rules don't support differing external and internal ports. Some Ingress controllers (e.g. nginx) can proxy TCP/UDP traffic to achieve this. Worker target pool health checks workers HTTP:10254/healthz (i.e. nginx-ingress ) Firewalls \u00b6 Add firewall rules to the cluster's network. resource \"google_compute_firewall\" \"some-app\" { name = \"some-app\" network = module.yavin.network_self_link allow { protocol = \"tcp\" ports = [ 3333 ] } allow { protocol = \"udp\" ports = [ 3333 ] } source_ranges = [ \"0.0.0.0/0\" ] target_tags = [ \"yavin-worker\" ] } IPv6 \u00b6 Applications exposed via HTTP/HTTPS Ingress can be served over IPv6. IPv6 Feature Supported Node IPv6 address No Node Outbound IPv6 No Kubernetes Ingress IPv6 Yes","title":"Google Cloud"},{"location":"architecture/google-cloud/#google-cloud","text":"","title":"Google Cloud"},{"location":"architecture/google-cloud/#load-balancing","text":"","title":"Load Balancing"},{"location":"architecture/google-cloud/#kube-apiserver","text":"A global forwarding rule (IPv4 anycast) and TCP Proxy distribute IPv4 TCP/443 traffic across a backend service with zonal instance groups of controller(s) with a healthy kube-apiserver (TCP/6443). Clusters with multiple controllers span zones in a region to tolerate zone outages. Notes: GCP TCP Proxy limits external port options (e.g. must use 443, not 6443) A regional NLB cannot be used for multi-controller (see #190 )","title":"kube-apiserver"},{"location":"architecture/google-cloud/#httphttp-ingress","text":"Global forwarding rules and a TCP Proxy distribute IPv4/IPv6 TCP/80 and TCP/443 traffic across a managed instance group of workers with a healthy Ingress Controller. Workers span zones in a region to tolerate zone outages. The IPv4 and IPv6 anycast addresses are output as ingress_static_ipv4 and ingress_static_ipv6 for use in DNS A and AAAA records. See Ingress on Google Cloud .","title":"HTTP/HTTP Ingress"},{"location":"architecture/google-cloud/#tcpudp-services","text":"Load balance TCP/UDP applications by adding a forwarding rule to the worker target pool (output). # Static IPv4 address for some-app Load Balancing resource \"google_compute_address\" \"some-app-ipv4\" { name = \"some-app-ipv4\" } # Forward IPv4 TCP traffic to the target pool resource \"google_compute_forwarding_rule\" \"some-app-tcp\" { name = \"some-app-tcp\" ip_address = google_compute_address.some-app-ipv4.address ip_protocol = \"TCP\" port_range = \"3333\" target = module.yavin.worker_target_pool } # Forward IPv4 UDP traffic to the target pool resource \"google_compute_forwarding_rule\" \"some-app-udp\" { name = \"some-app-udp\" ip_address = google_compute_address.some-app-ipv4.address ip_protocol = \"UDP\" port_range = \"3333\" target = module.yavin.worker_target_pool } Notes: GCP Global Load Balancers aren't appropriate for custom TCP/UDP. Backend Services require a named port corresponding to an instance group (output by Typhoon) port. Typhoon shouldn't accept a list of every TCP/UDP service that may later be hosted on the cluster. Backend Services don't support UDP (i.e. rules out global load balancers) IPv4 Only: Regional Load Balancers use a regional IPv4 address (e.g. google_compute_address ), no IPv6. Forward rules don't support differing external and internal ports. Some Ingress controllers (e.g. nginx) can proxy TCP/UDP traffic to achieve this. Worker target pool health checks workers HTTP:10254/healthz (i.e. nginx-ingress )","title":"TCP/UDP Services"},{"location":"architecture/google-cloud/#firewalls","text":"Add firewall rules to the cluster's network. resource \"google_compute_firewall\" \"some-app\" { name = \"some-app\" network = module.yavin.network_self_link allow { protocol = \"tcp\" ports = [ 3333 ] } allow { protocol = \"udp\" ports = [ 3333 ] } source_ranges = [ \"0.0.0.0/0\" ] target_tags = [ \"yavin-worker\" ] }","title":"Firewalls"},{"location":"architecture/google-cloud/#ipv6","text":"Applications exposed via HTTP/HTTPS Ingress can be served over IPv6. IPv6 Feature Supported Node IPv6 address No Node Outbound IPv6 No Kubernetes Ingress IPv6 Yes","title":"IPv6"},{"location":"architecture/operating-systems/","text":"Operating Systems \u00b6 Typhoon supports Fedora CoreOS and Flatcar Linux . These operating systems were chosen because they offer: Minimalism and focus on clustered operation Automated and atomic operating system upgrades Declarative and immutable configuration Optimization for containerized applications Together, they diversify Typhoon to support a range of container technologies. Fedora CoreOS: rpm-ostree, podman, moby Flatcar Linux: Gentoo core, rkt-fly, docker Host Properties \u00b6 Property Flatcar Linux Fedora CoreOS Kernel ~5.10.x ~5.16.x systemd 249 249 Username core core Ignition system Ignition v2.x spec Ignition v3.x spec storage driver overlay2 (extfs) overlay2 (xfs) logging driver json-file journald cgroup driver systemd systemd cgroup version v2 v2 Networking systemd-networkd NetworkManager Resolver systemd-resolved systemd-resolved Kubernetes Properties \u00b6 Property Flatcar Linux Fedora CoreOS single-master all platforms all platforms multi-master all platforms all platforms control plane static pods static pods Container Runtime containerd 1.5.9 containerd 1.6.0 kubelet image kubelet image with upstream binary kubelet image with upstream binary control plane images upstream images upstream images on-host etcd docker podman on-host kubelet docker podman CNI plugins calico, cilium, flannel calico, cilium, flannel coordinated drain & OS update FLUO addon fleetlock Directory Locations \u00b6 Typhoon conventional directories. Kubelet setting Host location cni-conf-dir /etc/cni/net.d pod-manifest-path /etc/kubernetes/manifests volume-plugin-dir /var/lib/kubelet/volumeplugins","title":"Operating Systems"},{"location":"architecture/operating-systems/#operating-systems","text":"Typhoon supports Fedora CoreOS and Flatcar Linux . These operating systems were chosen because they offer: Minimalism and focus on clustered operation Automated and atomic operating system upgrades Declarative and immutable configuration Optimization for containerized applications Together, they diversify Typhoon to support a range of container technologies. Fedora CoreOS: rpm-ostree, podman, moby Flatcar Linux: Gentoo core, rkt-fly, docker","title":"Operating Systems"},{"location":"architecture/operating-systems/#host-properties","text":"Property Flatcar Linux Fedora CoreOS Kernel ~5.10.x ~5.16.x systemd 249 249 Username core core Ignition system Ignition v2.x spec Ignition v3.x spec storage driver overlay2 (extfs) overlay2 (xfs) logging driver json-file journald cgroup driver systemd systemd cgroup version v2 v2 Networking systemd-networkd NetworkManager Resolver systemd-resolved systemd-resolved","title":"Host Properties"},{"location":"architecture/operating-systems/#kubernetes-properties","text":"Property Flatcar Linux Fedora CoreOS single-master all platforms all platforms multi-master all platforms all platforms control plane static pods static pods Container Runtime containerd 1.5.9 containerd 1.6.0 kubelet image kubelet image with upstream binary kubelet image with upstream binary control plane images upstream images upstream images on-host etcd docker podman on-host kubelet docker podman CNI plugins calico, cilium, flannel calico, cilium, flannel coordinated drain & OS update FLUO addon fleetlock","title":"Kubernetes Properties"},{"location":"architecture/operating-systems/#directory-locations","text":"Typhoon conventional directories. Kubelet setting Host location cni-conf-dir /etc/cni/net.d pod-manifest-path /etc/kubernetes/manifests volume-plugin-dir /var/lib/kubelet/volumeplugins","title":"Directory Locations"},{"location":"fedora-coreos/aws/","text":"AWS \u00b6 In this tutorial, we'll create a Kubernetes v1.24.3 cluster on AWS with Fedora CoreOS. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster. Requirements \u00b6 AWS Account and IAM credentials AWS Route53 DNS Zone (registered Domain Name or delegated subdomain) Terraform v0.13.0+ Terraform Setup \u00b6 Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters Provider \u00b6 Login to your AWS IAM dashboard and find your IAM user. Select \"Security Credentials\" and create an access key. Save the id and secret to a file that can be referenced in configs. [default] aws_access_key_id = xxx aws_secret_access_key = yyy Configure the AWS provider to use your access key credentials in a providers.tf file. provider \"aws\" { region = \"eu-central-1\" shared_credentials_file = \"/home/user/.config/aws/credentials\" } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } aws = { source = \"hashicorp/aws\" version = \"4.22.0\" } } } Additional configuration options are described in the aws provider docs . Tip Regions are listed in docs or with aws ec2 describe-regions . Cluster \u00b6 Define a Kubernetes cluster using the module aws/fedora-coreos/kubernetes . module \"tempest\" { source = \"git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.24.3\" # AWS cluster_name = \"tempest\" dns_zone = \"aws.example.com\" dns_zone_id = \"Z3PAABBCFAKEC0\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional worker_count = 2 worker_type = \"t3.small\" } Reference the variables docs or the variables.tf source. ssh-agent \u00b6 Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_ed25519 ssh-add -L Apply \u00b6 Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 81 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply ... module.tempest.null_resource.bootstrap: Still creating... ( 4m50s elapsed ) module.tempest.null_resource.bootstrap: Still creating... ( 5m0s elapsed ) module.tempest.null_resource.bootstrap: Creation complete after 5m8s ( ID: 3961816482286168143 ) Apply complete! Resources: 98 added, 0 changed, 0 destroyed. In 4-8 minutes, the Kubernetes cluster will be ready. Verify \u00b6 Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-tempest\" { content = module.tempest.kubeconfig-admin filename = \"/home/user/.kube/configs/tempest-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/tempest-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION ip-10-0-3-155 Ready <none> 10m v1.24.3 ip-10-0-26-65 Ready <none> 10m v1.24.3 ip-10-0-41-21 Ready <none> 10m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-1m5bf 2/2 Running 0 34m kube-system calico-node-7jmr1 2/2 Running 0 34m kube-system calico-node-bknc8 2/2 Running 0 34m kube-system coredns-1187388186-wx1lg 1/1 Running 0 34m kube-system coredns-1187388186-qjnvp 1/1 Running 0 34m kube-system kube-apiserver-ip-10-0-3-155 1/1 Running 0 34m kube-system kube-controller-manager-ip-10-0-3-155 1/1 Running 0 34m kube-system kube-proxy-14wxv 1/1 Running 0 34m kube-system kube-proxy-9vxh2 1/1 Running 0 34m kube-system kube-proxy-sbbsh 1/1 Running 0 34m kube-system kube-scheduler-ip-10-0-3-155 1/1 Running 1 34m Going Further \u00b6 Learn about maintenance and addons . Variables \u00b6 Check the variables.tf source. Required \u00b6 Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"tempest\" dns_zone AWS Route53 DNS zone \"aws.example.com\" dns_zone_id AWS Route53 DNS zone id \"Z3PAABBCFAKEC0\" ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3NZ...\" DNS Zone \u00b6 Clusters create a DNS A record ${cluster_name}.${dns_zone} to resolve a network load balancer backed by controller instances. This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at tempest.aws.example.com . You'll need a registered domain name or delegated subdomain on AWS Route53. You can set this up once and create many clusters with unique names. resource \"aws_route53_zone\" \"zone-for-clusters\" { name = \"aws.example.com.\" } Reference the DNS zone id with aws_route53_zone.zone-for-clusters.zone_id . If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Route53 (e.g. aws.mydomain.com) and update nameservers . Optional \u00b6 Name Description Default Example controller_count Number of controllers (i.e. masters) 1 1 worker_count Number of workers 1 3 controller_type EC2 instance type for controllers \"t3.small\" See below worker_type EC2 instance type for workers \"t3.small\" See below os_stream Fedora CoreOS stream for compute instances \"stable\" \"testing\", \"next\" disk_size Size of the EBS volume in GB 30 100 disk_type Type of the EBS volume \"gp3\" standard, gp2, gp3, io1 disk_iops IOPS of the EBS volume 0 (i.e. auto) 400 worker_target_groups Target group ARNs to which worker instances should be added [] [aws_lb_target_group.app.id] worker_price Spot price in USD for worker instances or 0 to use on-demand instances 0 0.10 controller_snippets Controller Butane snippets [] examples worker_snippets Worker Butane snippets [] examples networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" network_mtu CNI interface MTU (calico only) 1480 8981 host_cidr CIDR IPv4 range to assign to EC2 instances \"10.0.0.0/16\" \"10.1.0.0/16\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" worker_node_labels List of initial worker node labels [] [\"worker-pool=default\"] Check the list of valid instance types . Warning Do not choose a controller_type smaller than t2.small . Smaller instances are not sufficient for running a controller. MTU If your EC2 instance type supports Jumbo frames (most do), we recommend you change the network_mtu to 8981! You will get better pod-to-pod bandwidth. Spot \u00b6 Add worker_price = \"0.10\" to use spot instance workers (instead of \"on-demand\") and set a maximum spot price in USD. Clusters can tolerate spot market interuptions fairly well (reschedules pods, but cannot drain) to save money, with the tradeoff that requests for workers may go unfulfilled.","title":"AWS"},{"location":"fedora-coreos/aws/#aws","text":"In this tutorial, we'll create a Kubernetes v1.24.3 cluster on AWS with Fedora CoreOS. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster.","title":"AWS"},{"location":"fedora-coreos/aws/#requirements","text":"AWS Account and IAM credentials AWS Route53 DNS Zone (registered Domain Name or delegated subdomain) Terraform v0.13.0+","title":"Requirements"},{"location":"fedora-coreos/aws/#terraform-setup","text":"Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters","title":"Terraform Setup"},{"location":"fedora-coreos/aws/#provider","text":"Login to your AWS IAM dashboard and find your IAM user. Select \"Security Credentials\" and create an access key. Save the id and secret to a file that can be referenced in configs. [default] aws_access_key_id = xxx aws_secret_access_key = yyy Configure the AWS provider to use your access key credentials in a providers.tf file. provider \"aws\" { region = \"eu-central-1\" shared_credentials_file = \"/home/user/.config/aws/credentials\" } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } aws = { source = \"hashicorp/aws\" version = \"4.22.0\" } } } Additional configuration options are described in the aws provider docs . Tip Regions are listed in docs or with aws ec2 describe-regions .","title":"Provider"},{"location":"fedora-coreos/aws/#cluster","text":"Define a Kubernetes cluster using the module aws/fedora-coreos/kubernetes . module \"tempest\" { source = \"git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.24.3\" # AWS cluster_name = \"tempest\" dns_zone = \"aws.example.com\" dns_zone_id = \"Z3PAABBCFAKEC0\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional worker_count = 2 worker_type = \"t3.small\" } Reference the variables docs or the variables.tf source.","title":"Cluster"},{"location":"fedora-coreos/aws/#ssh-agent","text":"Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_ed25519 ssh-add -L","title":"ssh-agent"},{"location":"fedora-coreos/aws/#apply","text":"Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 81 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply ... module.tempest.null_resource.bootstrap: Still creating... ( 4m50s elapsed ) module.tempest.null_resource.bootstrap: Still creating... ( 5m0s elapsed ) module.tempest.null_resource.bootstrap: Creation complete after 5m8s ( ID: 3961816482286168143 ) Apply complete! Resources: 98 added, 0 changed, 0 destroyed. In 4-8 minutes, the Kubernetes cluster will be ready.","title":"Apply"},{"location":"fedora-coreos/aws/#verify","text":"Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-tempest\" { content = module.tempest.kubeconfig-admin filename = \"/home/user/.kube/configs/tempest-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/tempest-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION ip-10-0-3-155 Ready <none> 10m v1.24.3 ip-10-0-26-65 Ready <none> 10m v1.24.3 ip-10-0-41-21 Ready <none> 10m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-1m5bf 2/2 Running 0 34m kube-system calico-node-7jmr1 2/2 Running 0 34m kube-system calico-node-bknc8 2/2 Running 0 34m kube-system coredns-1187388186-wx1lg 1/1 Running 0 34m kube-system coredns-1187388186-qjnvp 1/1 Running 0 34m kube-system kube-apiserver-ip-10-0-3-155 1/1 Running 0 34m kube-system kube-controller-manager-ip-10-0-3-155 1/1 Running 0 34m kube-system kube-proxy-14wxv 1/1 Running 0 34m kube-system kube-proxy-9vxh2 1/1 Running 0 34m kube-system kube-proxy-sbbsh 1/1 Running 0 34m kube-system kube-scheduler-ip-10-0-3-155 1/1 Running 1 34m","title":"Verify"},{"location":"fedora-coreos/aws/#going-further","text":"Learn about maintenance and addons .","title":"Going Further"},{"location":"fedora-coreos/aws/#variables","text":"Check the variables.tf source.","title":"Variables"},{"location":"fedora-coreos/aws/#required","text":"Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"tempest\" dns_zone AWS Route53 DNS zone \"aws.example.com\" dns_zone_id AWS Route53 DNS zone id \"Z3PAABBCFAKEC0\" ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3NZ...\"","title":"Required"},{"location":"fedora-coreos/aws/#dns-zone","text":"Clusters create a DNS A record ${cluster_name}.${dns_zone} to resolve a network load balancer backed by controller instances. This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at tempest.aws.example.com . You'll need a registered domain name or delegated subdomain on AWS Route53. You can set this up once and create many clusters with unique names. resource \"aws_route53_zone\" \"zone-for-clusters\" { name = \"aws.example.com.\" } Reference the DNS zone id with aws_route53_zone.zone-for-clusters.zone_id . If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Route53 (e.g. aws.mydomain.com) and update nameservers .","title":"DNS Zone"},{"location":"fedora-coreos/aws/#optional","text":"Name Description Default Example controller_count Number of controllers (i.e. masters) 1 1 worker_count Number of workers 1 3 controller_type EC2 instance type for controllers \"t3.small\" See below worker_type EC2 instance type for workers \"t3.small\" See below os_stream Fedora CoreOS stream for compute instances \"stable\" \"testing\", \"next\" disk_size Size of the EBS volume in GB 30 100 disk_type Type of the EBS volume \"gp3\" standard, gp2, gp3, io1 disk_iops IOPS of the EBS volume 0 (i.e. auto) 400 worker_target_groups Target group ARNs to which worker instances should be added [] [aws_lb_target_group.app.id] worker_price Spot price in USD for worker instances or 0 to use on-demand instances 0 0.10 controller_snippets Controller Butane snippets [] examples worker_snippets Worker Butane snippets [] examples networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" network_mtu CNI interface MTU (calico only) 1480 8981 host_cidr CIDR IPv4 range to assign to EC2 instances \"10.0.0.0/16\" \"10.1.0.0/16\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" worker_node_labels List of initial worker node labels [] [\"worker-pool=default\"] Check the list of valid instance types . Warning Do not choose a controller_type smaller than t2.small . Smaller instances are not sufficient for running a controller. MTU If your EC2 instance type supports Jumbo frames (most do), we recommend you change the network_mtu to 8981! You will get better pod-to-pod bandwidth.","title":"Optional"},{"location":"fedora-coreos/aws/#spot","text":"Add worker_price = \"0.10\" to use spot instance workers (instead of \"on-demand\") and set a maximum spot price in USD. Clusters can tolerate spot market interuptions fairly well (reschedules pods, but cannot drain) to save money, with the tradeoff that requests for workers may go unfulfilled.","title":"Spot"},{"location":"fedora-coreos/azure/","text":"Azure \u00b6 In this tutorial, we'll create a Kubernetes v1.24.3 cluster on Azure with Fedora CoreOS. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a resource group, virtual network, subnets, security groups, controller availability set, worker scale set, load balancer, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster. Requirements \u00b6 Azure account Azure DNS Zone (registered Domain Name or delegated subdomain) Terraform v0.13.0+ Terraform Setup \u00b6 Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters Provider \u00b6 Install the Azure az command line tool to authenticate with Azure . az login Configure the Azure provider in a providers.tf file. provider \"azurerm\" { features {} } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } azurerm = { source = \"hashicorp/azurerm\" version = \"3.14.0\" } } } Additional configuration options are described in the azurerm provider docs . Fedora CoreOS Images \u00b6 Fedora CoreOS publishes images for Azure, but does not yet upload them. Azure allows custom images to be uploaded to a storage account bucket and imported. Download a Fedora CoreOS Azure VHD image and upload it to an Azure storage account container (i.e. bucket) via the UI (quite slow). xz -d fedora-coreos-31.20200323.3.2-azure.x86_64.vhd.xz Create an Azure disk (note disk ID) and create an Azure image from it (note image ID). az disk create --name fedora-coreos-31.20200323.3.2 -g GROUP --source https://BUCKET.blob.core.windows.net/fedora-coreos/fedora-coreos-31.20200323.3.2-azure.x86_64.vhd az image create --name fedora-coreos-31.20200323.3.2 -g GROUP --os-type=linux --source /subscriptions/some/path/providers/Microsoft.Compute/disks/fedora-coreos-31.20200323.3.2 Set the os_image in the next step. Cluster \u00b6 Define a Kubernetes cluster using the module azure/fedora-coreos/kubernetes . module \"ramius\" { source = \"git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes?ref=v1.24.3\" # Azure cluster_name = \"ramius\" region = \"centralus\" dns_zone = \"azure.example.com\" dns_zone_group = \"example-group\" # configuration os_image = \"/subscriptions/some/path/Microsoft.Compute/images/fedora-coreos-31.20200323.3.2\" ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional worker_count = 2 host_cidr = \"10.0.0.0/20\" } Reference the variables docs or the variables.tf source. ssh-agent \u00b6 Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_ed25519 ssh-add -L Apply \u00b6 Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 86 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply ... module.ramius.null_resource.bootstrap: Still creating... ( 6m50s elapsed ) module.ramius.null_resource.bootstrap: Still creating... ( 7m0s elapsed ) module.ramius.null_resource.bootstrap: Creation complete after 7m8s ( ID: 3961816482286168143 ) Apply complete! Resources: 69 added, 0 changed, 0 destroyed. In 4-8 minutes, the Kubernetes cluster will be ready. Verify \u00b6 Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-ramius\" { content = module.ramius.kubeconfig-admin filename = \"/home/user/.kube/configs/ramius-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/ramius-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION ramius-controller-0 Ready <none> 24m v1.24.3 ramius-worker-000001 Ready <none> 25m v1.24.3 ramius-worker-000002 Ready <none> 24m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system coredns-7c6fbb4f4b-b6qzx 1/1 Running 0 26m kube-system coredns-7c6fbb4f4b-j2k3d 1/1 Running 0 26m kube-system calico-node-1m5bf 2/2 Running 0 26m kube-system calico-node-7jmr1 2/2 Running 0 26m kube-system calico-node-bknc8 2/2 Running 0 26m kube-system kube-apiserver-ramius-controller-0 1/1 Running 0 26m kube-system kube-controller-manager-ramius-controller-0 1/1 Running 0 26m kube-system kube-proxy-j4vpq 1/1 Running 0 26m kube-system kube-proxy-jxr5d 1/1 Running 0 26m kube-system kube-proxy-lbdw5 1/1 Running 0 26m kube-system kube-scheduler-ramius-controller-0 1/1 Running 0 26m Going Further \u00b6 Learn about maintenance and addons . Variables \u00b6 Check the variables.tf source. Required \u00b6 Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"ramius\" region Azure region \"centralus\" dns_zone Azure DNS zone \"azure.example.com\" dns_zone_group Resource group where the Azure DNS zone resides \"global\" os_image Fedora CoreOS image for instances \"/subscriptions/..../custom-image\" ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3NZ...\" Tip Regions are shown in docs or with az account list-locations --output table . DNS Zone \u00b6 Clusters create a DNS A record ${cluster_name}.${dns_zone} to resolve a load balancer backed by controller instances. This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at ramius.azure.example.com . You'll need a registered domain name or delegated subdomain on Azure DNS. You can set this up once and create many clusters with unique names. # Azure resource group for DNS zone resource \"azurerm_resource_group\" \"global\" { name = \"global\" location = \"centralus\" } # DNS zone for clusters resource \"azurerm_dns_zone\" \"clusters\" { resource_group_name = azurerm_resource_group.global.name name = \"azure.example.com\" zone_type = \"Public\" } Reference the DNS zone with azurerm_dns_zone.clusters.name and its resource group with \"azurerm_resource_group.global.name . If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Azure DNS (e.g. azure.mydomain.com) and update nameservers . Optional \u00b6 Name Description Default Example controller_count Number of controllers (i.e. masters) 1 1 worker_count Number of workers 1 3 controller_type Machine type for controllers \"Standard_B2s\" See below worker_type Machine type for workers \"Standard_DS1_v2\" See below disk_size Size of the disk in GB 30 100 worker_priority Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time Regular Spot controller_snippets Controller Butane snippets [] example worker_snippets Worker Butane snippets [] example networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" host_cidr CIDR IPv4 range to assign to instances \"10.0.0.0/16\" \"10.0.0.0/20\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" worker_node_labels List of initial worker node labels [] [\"worker-pool=default\"] Check the list of valid machine types and their specs . Use az vm list-skus to get the identifier. Warning Unlike AWS and GCP, Azure requires its virtual networks to have non-overlapping IPv4 CIDRs (yeah, go figure). Instead of each cluster just using 10.0.0.0/16 for instances, each Azure cluster's host_cidr must be non-overlapping (e.g. 10.0.0.0/20 for the 1 st cluster, 10.0.16.0/20 for the 2 nd cluster, etc). Warning Do not choose a controller_type smaller than Standard_B2s . Smaller instances are not sufficient for running a controller. Spot Priority \u00b6 Add worker_priority=Spot to use Spot Priority workers that run on Azure's surplus capacity at lower cost, but with the tradeoff that they can be deallocated at random. Spot priority VMs are Azure's analog to AWS spot instances or GCP premptible instances.","title":"Azure"},{"location":"fedora-coreos/azure/#azure","text":"In this tutorial, we'll create a Kubernetes v1.24.3 cluster on Azure with Fedora CoreOS. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a resource group, virtual network, subnets, security groups, controller availability set, worker scale set, load balancer, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster.","title":"Azure"},{"location":"fedora-coreos/azure/#requirements","text":"Azure account Azure DNS Zone (registered Domain Name or delegated subdomain) Terraform v0.13.0+","title":"Requirements"},{"location":"fedora-coreos/azure/#terraform-setup","text":"Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters","title":"Terraform Setup"},{"location":"fedora-coreos/azure/#provider","text":"Install the Azure az command line tool to authenticate with Azure . az login Configure the Azure provider in a providers.tf file. provider \"azurerm\" { features {} } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } azurerm = { source = \"hashicorp/azurerm\" version = \"3.14.0\" } } } Additional configuration options are described in the azurerm provider docs .","title":"Provider"},{"location":"fedora-coreos/azure/#fedora-coreos-images","text":"Fedora CoreOS publishes images for Azure, but does not yet upload them. Azure allows custom images to be uploaded to a storage account bucket and imported. Download a Fedora CoreOS Azure VHD image and upload it to an Azure storage account container (i.e. bucket) via the UI (quite slow). xz -d fedora-coreos-31.20200323.3.2-azure.x86_64.vhd.xz Create an Azure disk (note disk ID) and create an Azure image from it (note image ID). az disk create --name fedora-coreos-31.20200323.3.2 -g GROUP --source https://BUCKET.blob.core.windows.net/fedora-coreos/fedora-coreos-31.20200323.3.2-azure.x86_64.vhd az image create --name fedora-coreos-31.20200323.3.2 -g GROUP --os-type=linux --source /subscriptions/some/path/providers/Microsoft.Compute/disks/fedora-coreos-31.20200323.3.2 Set the os_image in the next step.","title":"Fedora CoreOS Images"},{"location":"fedora-coreos/azure/#cluster","text":"Define a Kubernetes cluster using the module azure/fedora-coreos/kubernetes . module \"ramius\" { source = \"git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes?ref=v1.24.3\" # Azure cluster_name = \"ramius\" region = \"centralus\" dns_zone = \"azure.example.com\" dns_zone_group = \"example-group\" # configuration os_image = \"/subscriptions/some/path/Microsoft.Compute/images/fedora-coreos-31.20200323.3.2\" ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional worker_count = 2 host_cidr = \"10.0.0.0/20\" } Reference the variables docs or the variables.tf source.","title":"Cluster"},{"location":"fedora-coreos/azure/#ssh-agent","text":"Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_ed25519 ssh-add -L","title":"ssh-agent"},{"location":"fedora-coreos/azure/#apply","text":"Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 86 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply ... module.ramius.null_resource.bootstrap: Still creating... ( 6m50s elapsed ) module.ramius.null_resource.bootstrap: Still creating... ( 7m0s elapsed ) module.ramius.null_resource.bootstrap: Creation complete after 7m8s ( ID: 3961816482286168143 ) Apply complete! Resources: 69 added, 0 changed, 0 destroyed. In 4-8 minutes, the Kubernetes cluster will be ready.","title":"Apply"},{"location":"fedora-coreos/azure/#verify","text":"Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-ramius\" { content = module.ramius.kubeconfig-admin filename = \"/home/user/.kube/configs/ramius-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/ramius-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION ramius-controller-0 Ready <none> 24m v1.24.3 ramius-worker-000001 Ready <none> 25m v1.24.3 ramius-worker-000002 Ready <none> 24m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system coredns-7c6fbb4f4b-b6qzx 1/1 Running 0 26m kube-system coredns-7c6fbb4f4b-j2k3d 1/1 Running 0 26m kube-system calico-node-1m5bf 2/2 Running 0 26m kube-system calico-node-7jmr1 2/2 Running 0 26m kube-system calico-node-bknc8 2/2 Running 0 26m kube-system kube-apiserver-ramius-controller-0 1/1 Running 0 26m kube-system kube-controller-manager-ramius-controller-0 1/1 Running 0 26m kube-system kube-proxy-j4vpq 1/1 Running 0 26m kube-system kube-proxy-jxr5d 1/1 Running 0 26m kube-system kube-proxy-lbdw5 1/1 Running 0 26m kube-system kube-scheduler-ramius-controller-0 1/1 Running 0 26m","title":"Verify"},{"location":"fedora-coreos/azure/#going-further","text":"Learn about maintenance and addons .","title":"Going Further"},{"location":"fedora-coreos/azure/#variables","text":"Check the variables.tf source.","title":"Variables"},{"location":"fedora-coreos/azure/#required","text":"Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"ramius\" region Azure region \"centralus\" dns_zone Azure DNS zone \"azure.example.com\" dns_zone_group Resource group where the Azure DNS zone resides \"global\" os_image Fedora CoreOS image for instances \"/subscriptions/..../custom-image\" ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3NZ...\" Tip Regions are shown in docs or with az account list-locations --output table .","title":"Required"},{"location":"fedora-coreos/azure/#dns-zone","text":"Clusters create a DNS A record ${cluster_name}.${dns_zone} to resolve a load balancer backed by controller instances. This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at ramius.azure.example.com . You'll need a registered domain name or delegated subdomain on Azure DNS. You can set this up once and create many clusters with unique names. # Azure resource group for DNS zone resource \"azurerm_resource_group\" \"global\" { name = \"global\" location = \"centralus\" } # DNS zone for clusters resource \"azurerm_dns_zone\" \"clusters\" { resource_group_name = azurerm_resource_group.global.name name = \"azure.example.com\" zone_type = \"Public\" } Reference the DNS zone with azurerm_dns_zone.clusters.name and its resource group with \"azurerm_resource_group.global.name . If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Azure DNS (e.g. azure.mydomain.com) and update nameservers .","title":"DNS Zone"},{"location":"fedora-coreos/azure/#optional","text":"Name Description Default Example controller_count Number of controllers (i.e. masters) 1 1 worker_count Number of workers 1 3 controller_type Machine type for controllers \"Standard_B2s\" See below worker_type Machine type for workers \"Standard_DS1_v2\" See below disk_size Size of the disk in GB 30 100 worker_priority Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time Regular Spot controller_snippets Controller Butane snippets [] example worker_snippets Worker Butane snippets [] example networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" host_cidr CIDR IPv4 range to assign to instances \"10.0.0.0/16\" \"10.0.0.0/20\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" worker_node_labels List of initial worker node labels [] [\"worker-pool=default\"] Check the list of valid machine types and their specs . Use az vm list-skus to get the identifier. Warning Unlike AWS and GCP, Azure requires its virtual networks to have non-overlapping IPv4 CIDRs (yeah, go figure). Instead of each cluster just using 10.0.0.0/16 for instances, each Azure cluster's host_cidr must be non-overlapping (e.g. 10.0.0.0/20 for the 1 st cluster, 10.0.16.0/20 for the 2 nd cluster, etc). Warning Do not choose a controller_type smaller than Standard_B2s . Smaller instances are not sufficient for running a controller.","title":"Optional"},{"location":"fedora-coreos/azure/#spot-priority","text":"Add worker_priority=Spot to use Spot Priority workers that run on Azure's surplus capacity at lower cost, but with the tradeoff that they can be deallocated at random. Spot priority VMs are Azure's analog to AWS spot instances or GCP premptible instances.","title":"Spot Priority"},{"location":"fedora-coreos/bare-metal/","text":"Bare-Metal \u00b6 In this tutorial, we'll network boot and provision a Kubernetes v1.24.3 cluster on bare-metal with Fedora CoreOS. First, we'll deploy a Matchbox service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Fedora CoreOS to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster. Requirements \u00b6 Machines with 2GB RAM, 30GB disk, PXE-enabled NIC, IPMI PXE-enabled network boot environment (with HTTPS support) Matchbox v0.6+ deployment with API enabled Matchbox credentials client.crt , client.key , ca.crt Terraform v0.13.0+ Machines \u00b6 Collect a MAC address from each machine. For machines with multiple PXE-enabled NICs, pick one of the MAC addresses. MAC addresses will be used to match machines to profiles during network boot. 52:54:00:a1:9c:ae (node1) 52:54:00:b2:2f:86 (node2) 52:54:00:c3:61:77 (node3) Configure each machine to boot from the disk through IPMI or the BIOS menu. ipmitool -H node1 -U USER -P PASS chassis bootdev disk options=persistent During provisioning, you'll explicitly set the boot device to pxe for the next boot only. Machines will install (overwrite) the operating system to disk on PXE boot and reboot into the disk install. Ask your hardware vendor to provide MACs and preconfigure IPMI, if possible. With it, you can rack new servers, terraform apply with new info, and power on machines that network boot and provision into clusters. DNS \u00b6 Create a DNS A (or AAAA) record for each node's default interface. Create a record that resolves to each controller node (or re-use the node record if there's one controller). node1.example.com (node1) node2.example.com (node2) node3.example.com (node3) myk8s.example.com (node1) Cluster nodes will be configured to refer to the control plane and themselves by these fully qualified names and they'll be used in generated TLS certificates. Matchbox \u00b6 Matchbox is an open-source app that matches network-booted bare-metal machines (based on labels like MAC, UUID, etc.) to profiles to automate cluster provisioning. Install Matchbox on a Kubernetes cluster or dedicated server. Installing on Kubernetes (recommended) Installing on a server Tip Deploy Matchbox as service that can be accessed by all of your bare-metal machines globally. This provides a single endpoint to use Terraform to manage bare-metal clusters at different sites. Typhoon will never include secrets in provisioning user-data so you may even deploy matchbox publicly. Matchbox provides a TLS client-authenticated API that clients, like Terraform, can use to manage machine matching and profiles. Think of it like a cloud provider API, but for creating bare-metal instances. Generate TLS client credentials. Save the ca.crt , client.crt , and client.key where they can be referenced in Terraform configs. mv ca.crt client.crt client.key ~/.config/matchbox/ Verify the matchbox read-only HTTP endpoints are accessible (port is configurable). $ curl http://matchbox.example.com:8080 matchbox Verify your TLS client certificate and key can be used to access the Matchbox API (port is configurable). $ openssl s_client -connect matchbox.example.com:8081 \\ -CAfile ~/.config/matchbox/ca.crt \\ -cert ~/.config/matchbox/client.crt \\ -key ~/.config/matchbox/client.key PXE Environment \u00b6 Create an iPXE-enabled network boot environment. Configure PXE clients to chainload iPXE firmware compiled to support HTTPS downloads . Instruct iPXE clients to chainload from your Matchbox service's /boot.ipxe endpoint. For networks already supporting iPXE clients, you can add a default.ipxe config. # /var/www/html/ipxe/default.ipxe chain http://matchbox.foo:8080/boot.ipxe For networks with Ubiquiti Routers, you can configure the router itself to chainload machines to iPXE and Matchbox. Read about the many ways to setup a compliant iPXE-enabled network. There is quite a bit of flexibility: Continue using existing DHCP, TFTP, or DNS services Configure specific machines, subnets, or architectures to chainload from Matchbox Place Matchbox behind a menu entry (timeout and default to Matchbox) TFTP chainloading to modern boot firmware, like iPXE, avoids issues with old NICs and allows faster transfer protocols like HTTP to be used. Warning Compile iPXE from source with support for HTTPS downloads . iPXE's pre-built firmware binaries do not enable this. Fedora CoreOS downloads are HTTPS-only. Terraform Setup \u00b6 Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters Provider \u00b6 Configure the Matchbox provider to use your Matchbox API endpoint and client certificate in a providers.tf file. provider \"matchbox\" { endpoint = \"matchbox.example.com:8081\" client_cert = file ( \"~/.config/matchbox/client.crt\" ) client_key = file ( \"~/.config/matchbox/client.key\" ) ca = file ( \"~/.config/matchbox/ca.crt\" ) } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } matchbox = { source = \"poseidon/matchbox\" version = \"0.5.0\" } } } Cluster \u00b6 Define a Kubernetes cluster using the module bare-metal/fedora-coreos/kubernetes . module \"mercury\" { source = \"git::https://github.com/poseidon/typhoon//bare-metal/fedora-coreos/kubernetes?ref=v1.24.3\" # bare-metal cluster_name = \"mercury\" matchbox_http_endpoint = \"http://matchbox.example.com\" os_stream = \"stable\" os_version = \"32.20201104.3.0\" # configuration k8s_domain_name = \"node1.example.com\" ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # machines controllers = [{ name = \"node1\" mac = \"52:54:00:a1:9c:ae\" domain = \"node1.example.com\" }] workers = [ { name = \"node2\" , mac = \"52:54:00:b2:2f:86\" domain = \"node2.example.com\" }, { name = \"node3\" , mac = \"52:54:00:c3:61:77\" domain = \"node3.example.com\" } ] } Reference the variables docs or the variables.tf source. ssh-agent \u00b6 Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_ed25519 ssh-add -L Apply \u00b6 Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 55 to add, 0 to change, 0 to destroy. Apply the changes. Terraform will generate bootstrap assets and create Matchbox profiles (e.g. controller, worker) and matching rules via the Matchbox API. $ terraform apply module.mercury.null_resource.copy-kubeconfig.0: Provisioning with 'file' ... module.mercury.null_resource.copy-etcd-secrets.0: Provisioning with 'file' ... module.mercury.null_resource.copy-kubeconfig.0: Still creating... ( 10s elapsed ) module.mercury.null_resource.copy-etcd-secrets.0: Still creating... ( 10s elapsed ) ... Apply will then loop until it can successfully copy credentials to each machine and start the one-time Kubernetes bootstrap service. Proceed to the next step while this loops. Power \u00b6 Power on each machine with the boot device set to pxe for the next boot only. ipmitool -H node1.example.com -U USER -P PASS chassis bootdev pxe ipmitool -H node1.example.com -U USER -P PASS power on Machines will network boot, install Fedora CoreOS to disk, reboot into the disk install, and provision themselves as controllers or workers. If this is the first test of your PXE-enabled network boot environment, watch the SOL console of a machine to spot any misconfigurations. Bootstrap \u00b6 Wait for the bootstrap step to finish bootstrapping the Kubernetes control plane. This may take 5-15 minutes depending on your network. module.mercury.null_resource.bootstrap: Still creating... (6m10s elapsed) module.mercury.null_resource.bootstrap: Still creating... (6m20s elapsed) module.mercury.null_resource.bootstrap: Still creating... (6m30s elapsed) module.mercury.null_resource.bootstrap: Still creating... (6m40s elapsed) module.mercury.null_resource.bootstrap: Creation complete (ID: 5441741360626669024) Apply complete! Resources: 55 added, 0 changed, 0 destroyed. To watch the bootstrap process in detail, SSH to the first controller and journal the logs. $ ssh core@node1.example.com $ journalctl -f -u bootstrap podman[1750]: The connection to the server cluster.example.com:6443 was refused - did you specify the right host or port? podman[1750]: Waiting for static pod control plane ... podman[1750]: serviceaccount/calico-node unchanged systemd[1]: Started Kubernetes control plane. Verify \u00b6 Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-mercury\" { content = module.mercury.kubeconfig-admin filename = \"/home/user/.kube/configs/mercury-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/mercury-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION node1.example.com Ready <none> 10m v1.24.3 node2.example.com Ready <none> 10m v1.24.3 node3.example.com Ready <none> 10m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-6qp7f 2/2 Running 1 11m kube-system calico-node-gnjrm 2/2 Running 0 11m kube-system calico-node-llbgt 2/2 Running 0 11m kube-system coredns-1187388186-dj3pd 1/1 Running 0 11m kube-system coredns-1187388186-mx9rt 1/1 Running 0 11m kube-system kube-apiserver-node1.example.com 1/1 Running 0 11m kube-system kube-controller-manager-node1.example.com 1/1 Running 1 11m kube-system kube-proxy-50sd4 1/1 Running 0 11m kube-system kube-proxy-bczhp 1/1 Running 0 11m kube-system kube-proxy-mp2fw 1/1 Running 0 11m kube-system kube-scheduler-node1.example.com 1/1 Running 0 11m Going Further \u00b6 Learn about maintenance and addons . Variables \u00b6 Check the variables.tf source. Required \u00b6 Name Description Example cluster_name Unique cluster name \"mercury\" matchbox_http_endpoint Matchbox HTTP read-only endpoint \" http://matchbox.example.com:port \" os_stream Fedora CoreOS release stream \"stable\" os_version Fedora CoreOS version to PXE and install \"32.20201104.3.0\" k8s_domain_name FQDN resolving to the controller(s) nodes. Workers and kubectl will communicate with this endpoint \"myk8s.example.com\" ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3Nz...\" controllers List of controller machine detail objects (unique name, identifying MAC address, FQDN) [{name=\"node1\", mac=\"52:54:00:a1:9c:ae\", domain=\"node1.example.com\"}] workers List of worker machine detail objects (unique name, identifying MAC address, FQDN) [{name=\"node2\", mac=\"52:54:00:b2:2f:86\", domain=\"node2.example.com\"}, {name=\"node3\", mac=\"52:54:00:c3:61:77\", domain=\"node3.example.com\"}] Optional \u00b6 Name Description Default Example cached_install PXE boot and install from the Matchbox /assets cache. Admin MUST have downloaded Fedora CoreOS images into the cache false true install_disk Disk device where Fedora CoreOS should be installed \"sda\" (not \"/dev/sda\" like Container Linux) \"sdb\" networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" network_mtu CNI interface MTU (calico-only) 1480 - snippets Map from machine names to lists of Butane snippets {} examples network_ip_autodetection_method Method to detect host IPv4 address (calico-only) \"first-found\" \"can-reach=10.0.0.1\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" kernel_args Additional kernel args to provide at PXE boot [] [\"kvm-intel.nested=1\"] worker_node_labels Map from worker name to list of initial node labels {} {\"node2\" = [\"role=special\"]} worker_node_taints Map from worker name to list of initial node taints {} {\"node2\" = [\"role=special:NoSchedule\"]}","title":"Bare-Metal"},{"location":"fedora-coreos/bare-metal/#bare-metal","text":"In this tutorial, we'll network boot and provision a Kubernetes v1.24.3 cluster on bare-metal with Fedora CoreOS. First, we'll deploy a Matchbox service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Fedora CoreOS to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster.","title":"Bare-Metal"},{"location":"fedora-coreos/bare-metal/#requirements","text":"Machines with 2GB RAM, 30GB disk, PXE-enabled NIC, IPMI PXE-enabled network boot environment (with HTTPS support) Matchbox v0.6+ deployment with API enabled Matchbox credentials client.crt , client.key , ca.crt Terraform v0.13.0+","title":"Requirements"},{"location":"fedora-coreos/bare-metal/#machines","text":"Collect a MAC address from each machine. For machines with multiple PXE-enabled NICs, pick one of the MAC addresses. MAC addresses will be used to match machines to profiles during network boot. 52:54:00:a1:9c:ae (node1) 52:54:00:b2:2f:86 (node2) 52:54:00:c3:61:77 (node3) Configure each machine to boot from the disk through IPMI or the BIOS menu. ipmitool -H node1 -U USER -P PASS chassis bootdev disk options=persistent During provisioning, you'll explicitly set the boot device to pxe for the next boot only. Machines will install (overwrite) the operating system to disk on PXE boot and reboot into the disk install. Ask your hardware vendor to provide MACs and preconfigure IPMI, if possible. With it, you can rack new servers, terraform apply with new info, and power on machines that network boot and provision into clusters.","title":"Machines"},{"location":"fedora-coreos/bare-metal/#dns","text":"Create a DNS A (or AAAA) record for each node's default interface. Create a record that resolves to each controller node (or re-use the node record if there's one controller). node1.example.com (node1) node2.example.com (node2) node3.example.com (node3) myk8s.example.com (node1) Cluster nodes will be configured to refer to the control plane and themselves by these fully qualified names and they'll be used in generated TLS certificates.","title":"DNS"},{"location":"fedora-coreos/bare-metal/#matchbox","text":"Matchbox is an open-source app that matches network-booted bare-metal machines (based on labels like MAC, UUID, etc.) to profiles to automate cluster provisioning. Install Matchbox on a Kubernetes cluster or dedicated server. Installing on Kubernetes (recommended) Installing on a server Tip Deploy Matchbox as service that can be accessed by all of your bare-metal machines globally. This provides a single endpoint to use Terraform to manage bare-metal clusters at different sites. Typhoon will never include secrets in provisioning user-data so you may even deploy matchbox publicly. Matchbox provides a TLS client-authenticated API that clients, like Terraform, can use to manage machine matching and profiles. Think of it like a cloud provider API, but for creating bare-metal instances. Generate TLS client credentials. Save the ca.crt , client.crt , and client.key where they can be referenced in Terraform configs. mv ca.crt client.crt client.key ~/.config/matchbox/ Verify the matchbox read-only HTTP endpoints are accessible (port is configurable). $ curl http://matchbox.example.com:8080 matchbox Verify your TLS client certificate and key can be used to access the Matchbox API (port is configurable). $ openssl s_client -connect matchbox.example.com:8081 \\ -CAfile ~/.config/matchbox/ca.crt \\ -cert ~/.config/matchbox/client.crt \\ -key ~/.config/matchbox/client.key","title":"Matchbox"},{"location":"fedora-coreos/bare-metal/#pxe-environment","text":"Create an iPXE-enabled network boot environment. Configure PXE clients to chainload iPXE firmware compiled to support HTTPS downloads . Instruct iPXE clients to chainload from your Matchbox service's /boot.ipxe endpoint. For networks already supporting iPXE clients, you can add a default.ipxe config. # /var/www/html/ipxe/default.ipxe chain http://matchbox.foo:8080/boot.ipxe For networks with Ubiquiti Routers, you can configure the router itself to chainload machines to iPXE and Matchbox. Read about the many ways to setup a compliant iPXE-enabled network. There is quite a bit of flexibility: Continue using existing DHCP, TFTP, or DNS services Configure specific machines, subnets, or architectures to chainload from Matchbox Place Matchbox behind a menu entry (timeout and default to Matchbox) TFTP chainloading to modern boot firmware, like iPXE, avoids issues with old NICs and allows faster transfer protocols like HTTP to be used. Warning Compile iPXE from source with support for HTTPS downloads . iPXE's pre-built firmware binaries do not enable this. Fedora CoreOS downloads are HTTPS-only.","title":"PXE Environment"},{"location":"fedora-coreos/bare-metal/#terraform-setup","text":"Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters","title":"Terraform Setup"},{"location":"fedora-coreos/bare-metal/#provider","text":"Configure the Matchbox provider to use your Matchbox API endpoint and client certificate in a providers.tf file. provider \"matchbox\" { endpoint = \"matchbox.example.com:8081\" client_cert = file ( \"~/.config/matchbox/client.crt\" ) client_key = file ( \"~/.config/matchbox/client.key\" ) ca = file ( \"~/.config/matchbox/ca.crt\" ) } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } matchbox = { source = \"poseidon/matchbox\" version = \"0.5.0\" } } }","title":"Provider"},{"location":"fedora-coreos/bare-metal/#cluster","text":"Define a Kubernetes cluster using the module bare-metal/fedora-coreos/kubernetes . module \"mercury\" { source = \"git::https://github.com/poseidon/typhoon//bare-metal/fedora-coreos/kubernetes?ref=v1.24.3\" # bare-metal cluster_name = \"mercury\" matchbox_http_endpoint = \"http://matchbox.example.com\" os_stream = \"stable\" os_version = \"32.20201104.3.0\" # configuration k8s_domain_name = \"node1.example.com\" ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # machines controllers = [{ name = \"node1\" mac = \"52:54:00:a1:9c:ae\" domain = \"node1.example.com\" }] workers = [ { name = \"node2\" , mac = \"52:54:00:b2:2f:86\" domain = \"node2.example.com\" }, { name = \"node3\" , mac = \"52:54:00:c3:61:77\" domain = \"node3.example.com\" } ] } Reference the variables docs or the variables.tf source.","title":"Cluster"},{"location":"fedora-coreos/bare-metal/#ssh-agent","text":"Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_ed25519 ssh-add -L","title":"ssh-agent"},{"location":"fedora-coreos/bare-metal/#apply","text":"Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 55 to add, 0 to change, 0 to destroy. Apply the changes. Terraform will generate bootstrap assets and create Matchbox profiles (e.g. controller, worker) and matching rules via the Matchbox API. $ terraform apply module.mercury.null_resource.copy-kubeconfig.0: Provisioning with 'file' ... module.mercury.null_resource.copy-etcd-secrets.0: Provisioning with 'file' ... module.mercury.null_resource.copy-kubeconfig.0: Still creating... ( 10s elapsed ) module.mercury.null_resource.copy-etcd-secrets.0: Still creating... ( 10s elapsed ) ... Apply will then loop until it can successfully copy credentials to each machine and start the one-time Kubernetes bootstrap service. Proceed to the next step while this loops.","title":"Apply"},{"location":"fedora-coreos/bare-metal/#power","text":"Power on each machine with the boot device set to pxe for the next boot only. ipmitool -H node1.example.com -U USER -P PASS chassis bootdev pxe ipmitool -H node1.example.com -U USER -P PASS power on Machines will network boot, install Fedora CoreOS to disk, reboot into the disk install, and provision themselves as controllers or workers. If this is the first test of your PXE-enabled network boot environment, watch the SOL console of a machine to spot any misconfigurations.","title":"Power"},{"location":"fedora-coreos/bare-metal/#bootstrap","text":"Wait for the bootstrap step to finish bootstrapping the Kubernetes control plane. This may take 5-15 minutes depending on your network. module.mercury.null_resource.bootstrap: Still creating... (6m10s elapsed) module.mercury.null_resource.bootstrap: Still creating... (6m20s elapsed) module.mercury.null_resource.bootstrap: Still creating... (6m30s elapsed) module.mercury.null_resource.bootstrap: Still creating... (6m40s elapsed) module.mercury.null_resource.bootstrap: Creation complete (ID: 5441741360626669024) Apply complete! Resources: 55 added, 0 changed, 0 destroyed. To watch the bootstrap process in detail, SSH to the first controller and journal the logs. $ ssh core@node1.example.com $ journalctl -f -u bootstrap podman[1750]: The connection to the server cluster.example.com:6443 was refused - did you specify the right host or port? podman[1750]: Waiting for static pod control plane ... podman[1750]: serviceaccount/calico-node unchanged systemd[1]: Started Kubernetes control plane.","title":"Bootstrap"},{"location":"fedora-coreos/bare-metal/#verify","text":"Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-mercury\" { content = module.mercury.kubeconfig-admin filename = \"/home/user/.kube/configs/mercury-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/mercury-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION node1.example.com Ready <none> 10m v1.24.3 node2.example.com Ready <none> 10m v1.24.3 node3.example.com Ready <none> 10m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-6qp7f 2/2 Running 1 11m kube-system calico-node-gnjrm 2/2 Running 0 11m kube-system calico-node-llbgt 2/2 Running 0 11m kube-system coredns-1187388186-dj3pd 1/1 Running 0 11m kube-system coredns-1187388186-mx9rt 1/1 Running 0 11m kube-system kube-apiserver-node1.example.com 1/1 Running 0 11m kube-system kube-controller-manager-node1.example.com 1/1 Running 1 11m kube-system kube-proxy-50sd4 1/1 Running 0 11m kube-system kube-proxy-bczhp 1/1 Running 0 11m kube-system kube-proxy-mp2fw 1/1 Running 0 11m kube-system kube-scheduler-node1.example.com 1/1 Running 0 11m","title":"Verify"},{"location":"fedora-coreos/bare-metal/#going-further","text":"Learn about maintenance and addons .","title":"Going Further"},{"location":"fedora-coreos/bare-metal/#variables","text":"Check the variables.tf source.","title":"Variables"},{"location":"fedora-coreos/bare-metal/#required","text":"Name Description Example cluster_name Unique cluster name \"mercury\" matchbox_http_endpoint Matchbox HTTP read-only endpoint \" http://matchbox.example.com:port \" os_stream Fedora CoreOS release stream \"stable\" os_version Fedora CoreOS version to PXE and install \"32.20201104.3.0\" k8s_domain_name FQDN resolving to the controller(s) nodes. Workers and kubectl will communicate with this endpoint \"myk8s.example.com\" ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3Nz...\" controllers List of controller machine detail objects (unique name, identifying MAC address, FQDN) [{name=\"node1\", mac=\"52:54:00:a1:9c:ae\", domain=\"node1.example.com\"}] workers List of worker machine detail objects (unique name, identifying MAC address, FQDN) [{name=\"node2\", mac=\"52:54:00:b2:2f:86\", domain=\"node2.example.com\"}, {name=\"node3\", mac=\"52:54:00:c3:61:77\", domain=\"node3.example.com\"}]","title":"Required"},{"location":"fedora-coreos/bare-metal/#optional","text":"Name Description Default Example cached_install PXE boot and install from the Matchbox /assets cache. Admin MUST have downloaded Fedora CoreOS images into the cache false true install_disk Disk device where Fedora CoreOS should be installed \"sda\" (not \"/dev/sda\" like Container Linux) \"sdb\" networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" network_mtu CNI interface MTU (calico-only) 1480 - snippets Map from machine names to lists of Butane snippets {} examples network_ip_autodetection_method Method to detect host IPv4 address (calico-only) \"first-found\" \"can-reach=10.0.0.1\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" kernel_args Additional kernel args to provide at PXE boot [] [\"kvm-intel.nested=1\"] worker_node_labels Map from worker name to list of initial node labels {} {\"node2\" = [\"role=special\"]} worker_node_taints Map from worker name to list of initial node taints {} {\"node2\" = [\"role=special:NoSchedule\"]}","title":"Optional"},{"location":"fedora-coreos/digitalocean/","text":"DigitalOcean \u00b6 In this tutorial, we'll create a Kubernetes v1.24.3 cluster on DigitalOcean with Fedora CoreOS. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create controller droplets, worker droplets, DNS records, tags, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster. Requirements \u00b6 Digital Ocean Account and Token Digital Ocean Domain (registered Domain Name or delegated subdomain) Terraform v0.13.0+ Terraform Setup \u00b6 Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters Provider \u00b6 Login to DigitalOcean . Or if you don't have one, create an account with our referral link to get free credits. Generate a Personal Access Token with read/write scope from the API tab . Write the token to a file that can be referenced in configs. mkdir -p ~/.config/digital-ocean echo \"TOKEN\" > ~/.config/digital-ocean/token Configure the DigitalOcean provider to use your token in a providers.tf file. provider \"digitalocean\" { token = \"${chomp(file(\"~/.config/digital-ocean/token\"))}\" } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } digitalocean = { source = \"digitalocean/digitalocean\" version = \"2.21.0\" } } } Fedora CoreOS Images \u00b6 Fedora CoreOS publishes images for DigitalOcean, but does not yet upload them. DigitalOcean allows custom images to be uploaded via URL or file. Import a Fedora CoreOS image via URL to desired a region(s). data \"digitalocean_image\" \"fedora-coreos-31-20200323-3-2\" { name = \"fedora-coreos-31.20200323.3.2-digitalocean.x86_64.qcow2.gz\" } Set the os_image in the next step. Cluster \u00b6 Define a Kubernetes cluster using the module digital-ocean/fedora-coreos/kubernetes . module \"nemo\" { source = \"git::https://github.com/poseidon/typhoon//digital-ocean/fedora-coreos/kubernetes?ref=v1.24.3\" # Digital Ocean cluster_name = \"nemo\" region = \"nyc3\" dns_zone = \"digital-ocean.example.com\" # configuration os_image = data.digitalocean_image.fedora-coreos-31-20200323-3-2.id ssh_fingerprints = [ \"d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7\" ] # optional worker_count = 2 } Reference the variables docs or the variables.tf source. ssh-agent \u00b6 Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_ed25519 ssh-add -L Apply \u00b6 Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 54 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply module.nemo.null_resource.bootstrap: Still creating... ( 30s elapsed ) module.nemo.null_resource.bootstrap: Provisioning with 'remote-exec' ... ... module.nemo.null_resource.bootstrap: Still creating... ( 6m20s elapsed ) module.nemo.null_resource.bootstrap: Creation complete ( ID: 7599298447329218468 ) Apply complete! Resources: 42 added, 0 changed, 0 destroyed. In 3-6 minutes, the Kubernetes cluster will be ready. Verify \u00b6 Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-nemo\" { content = module.nemo.kubeconfig-admin filename = \"/home/user/.kube/configs/nemo-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/nemo-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION 10.132.110.130 Ready <none> 10m v1.24.3 10.132.115.81 Ready <none> 10m v1.24.3 10.132.124.107 Ready <none> 10m v1.24.3 List the pods. NAMESPACE NAME READY STATUS RESTARTS AGE kube-system coredns-1187388186-ld1j7 1/1 Running 0 11m kube-system coredns-1187388186-rdhf7 1/1 Running 0 11m kube-system calico-node-1m5bf 2/2 Running 0 11m kube-system calico-node-7jmr1 2/2 Running 0 11m kube-system calico-node-bknc8 2/2 Running 0 11m kube-system kube-apiserver-ip-10.132.115.81 1/1 Running 0 11m kube-system kube-controller-manager-ip-10.132.115.81 1/1 Running 0 11m kube-system kube-proxy-6kxjf 1/1 Running 0 11m kube-system kube-proxy-fh3td 1/1 Running 0 11m kube-system kube-proxy-k35rc 1/1 Running 0 11m kube-system kube-scheduler-ip-10.132.115.81 1/1 Running 0 11m Going Further \u00b6 Learn about maintenance and addons . Variables \u00b6 Check the variables.tf source. Required \u00b6 Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"nemo\" region Digital Ocean region \"nyc1\", \"sfo2\", \"fra1\", tor1\" dns_zone Digital Ocean domain (i.e. DNS zone) \"do.example.com\" os_image Fedora CoreOS image for instances \"custom-image-id\" ssh_fingerprints SSH public key fingerprints [\"d7:9d...\"] DNS Zone \u00b6 Clusters create DNS A records ${cluster_name}.${dns_zone} to resolve to controller droplets (round robin). This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at nemo.do.example.com . You'll need a registered domain name or delegated subdomain in DigitalOcean Domains (i.e. DNS zones). You can set this up once and create many clusters with unique names. # Declare a DigitalOcean record to also create a zone file resource \"digitalocean_domain\" \"zone-for-clusters\" { name = \"do.example.com\" ip_address = \"8.8.8.8\" } If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on DigitalOcean (e.g. do.mydomain.com) and update nameservers . SSH Fingerprints \u00b6 DigitalOcean droplets are created with your SSH public key \"fingerprint\" (i.e. MD5 hash) to allow access. If your SSH public key is at ~/.ssh/id_ed25519.pub , find the fingerprint with, ssh-keygen -E md5 -lf ~/.ssh/id_ed25519.pub | awk '{print $2}' MD5:d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7 If you use ssh-agent (e.g. Yubikey for SSH), find the fingerprint with, ssh-add -l -E md5 256 MD5:20:d0:eb:ad:50:b0:09:6d:4b:ba:ad:7c:9c:c1:39:24 foo@xample.com (ED25519) Digital Ocean requires the SSH public key be uploaded to your account, so you may also find the fingerprint under Settings -> Security. Finally, if you don't have an SSH key, create one now . Optional \u00b6 Name Description Default Example controller_count Number of controllers (i.e. masters) 1 1 worker_count Number of workers 1 3 controller_type Droplet type for controllers \"s-2vcpu-2gb\" s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... worker_type Droplet type for workers \"s-1vcpu-2gb\" s-1vcpu-2gb, s-2vcpu-2gb, ... controller_snippets Controller Butane snippets [] example worker_snippets Worker Butane snippets [] example networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" Check the list of valid droplet types or use doctl compute size list . Warning Do not choose a controller_type smaller than 2GB. Smaller droplets are not sufficient for running a controller and bootstrapping will fail.","title":"DigitalOcean"},{"location":"fedora-coreos/digitalocean/#digitalocean","text":"In this tutorial, we'll create a Kubernetes v1.24.3 cluster on DigitalOcean with Fedora CoreOS. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create controller droplets, worker droplets, DNS records, tags, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster.","title":"DigitalOcean"},{"location":"fedora-coreos/digitalocean/#requirements","text":"Digital Ocean Account and Token Digital Ocean Domain (registered Domain Name or delegated subdomain) Terraform v0.13.0+","title":"Requirements"},{"location":"fedora-coreos/digitalocean/#terraform-setup","text":"Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters","title":"Terraform Setup"},{"location":"fedora-coreos/digitalocean/#provider","text":"Login to DigitalOcean . Or if you don't have one, create an account with our referral link to get free credits. Generate a Personal Access Token with read/write scope from the API tab . Write the token to a file that can be referenced in configs. mkdir -p ~/.config/digital-ocean echo \"TOKEN\" > ~/.config/digital-ocean/token Configure the DigitalOcean provider to use your token in a providers.tf file. provider \"digitalocean\" { token = \"${chomp(file(\"~/.config/digital-ocean/token\"))}\" } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } digitalocean = { source = \"digitalocean/digitalocean\" version = \"2.21.0\" } } }","title":"Provider"},{"location":"fedora-coreos/digitalocean/#fedora-coreos-images","text":"Fedora CoreOS publishes images for DigitalOcean, but does not yet upload them. DigitalOcean allows custom images to be uploaded via URL or file. Import a Fedora CoreOS image via URL to desired a region(s). data \"digitalocean_image\" \"fedora-coreos-31-20200323-3-2\" { name = \"fedora-coreos-31.20200323.3.2-digitalocean.x86_64.qcow2.gz\" } Set the os_image in the next step.","title":"Fedora CoreOS Images"},{"location":"fedora-coreos/digitalocean/#cluster","text":"Define a Kubernetes cluster using the module digital-ocean/fedora-coreos/kubernetes . module \"nemo\" { source = \"git::https://github.com/poseidon/typhoon//digital-ocean/fedora-coreos/kubernetes?ref=v1.24.3\" # Digital Ocean cluster_name = \"nemo\" region = \"nyc3\" dns_zone = \"digital-ocean.example.com\" # configuration os_image = data.digitalocean_image.fedora-coreos-31-20200323-3-2.id ssh_fingerprints = [ \"d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7\" ] # optional worker_count = 2 } Reference the variables docs or the variables.tf source.","title":"Cluster"},{"location":"fedora-coreos/digitalocean/#ssh-agent","text":"Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_ed25519 ssh-add -L","title":"ssh-agent"},{"location":"fedora-coreos/digitalocean/#apply","text":"Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 54 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply module.nemo.null_resource.bootstrap: Still creating... ( 30s elapsed ) module.nemo.null_resource.bootstrap: Provisioning with 'remote-exec' ... ... module.nemo.null_resource.bootstrap: Still creating... ( 6m20s elapsed ) module.nemo.null_resource.bootstrap: Creation complete ( ID: 7599298447329218468 ) Apply complete! Resources: 42 added, 0 changed, 0 destroyed. In 3-6 minutes, the Kubernetes cluster will be ready.","title":"Apply"},{"location":"fedora-coreos/digitalocean/#verify","text":"Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-nemo\" { content = module.nemo.kubeconfig-admin filename = \"/home/user/.kube/configs/nemo-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/nemo-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION 10.132.110.130 Ready <none> 10m v1.24.3 10.132.115.81 Ready <none> 10m v1.24.3 10.132.124.107 Ready <none> 10m v1.24.3 List the pods. NAMESPACE NAME READY STATUS RESTARTS AGE kube-system coredns-1187388186-ld1j7 1/1 Running 0 11m kube-system coredns-1187388186-rdhf7 1/1 Running 0 11m kube-system calico-node-1m5bf 2/2 Running 0 11m kube-system calico-node-7jmr1 2/2 Running 0 11m kube-system calico-node-bknc8 2/2 Running 0 11m kube-system kube-apiserver-ip-10.132.115.81 1/1 Running 0 11m kube-system kube-controller-manager-ip-10.132.115.81 1/1 Running 0 11m kube-system kube-proxy-6kxjf 1/1 Running 0 11m kube-system kube-proxy-fh3td 1/1 Running 0 11m kube-system kube-proxy-k35rc 1/1 Running 0 11m kube-system kube-scheduler-ip-10.132.115.81 1/1 Running 0 11m","title":"Verify"},{"location":"fedora-coreos/digitalocean/#going-further","text":"Learn about maintenance and addons .","title":"Going Further"},{"location":"fedora-coreos/digitalocean/#variables","text":"Check the variables.tf source.","title":"Variables"},{"location":"fedora-coreos/digitalocean/#required","text":"Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"nemo\" region Digital Ocean region \"nyc1\", \"sfo2\", \"fra1\", tor1\" dns_zone Digital Ocean domain (i.e. DNS zone) \"do.example.com\" os_image Fedora CoreOS image for instances \"custom-image-id\" ssh_fingerprints SSH public key fingerprints [\"d7:9d...\"]","title":"Required"},{"location":"fedora-coreos/digitalocean/#dns-zone","text":"Clusters create DNS A records ${cluster_name}.${dns_zone} to resolve to controller droplets (round robin). This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at nemo.do.example.com . You'll need a registered domain name or delegated subdomain in DigitalOcean Domains (i.e. DNS zones). You can set this up once and create many clusters with unique names. # Declare a DigitalOcean record to also create a zone file resource \"digitalocean_domain\" \"zone-for-clusters\" { name = \"do.example.com\" ip_address = \"8.8.8.8\" } If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on DigitalOcean (e.g. do.mydomain.com) and update nameservers .","title":"DNS Zone"},{"location":"fedora-coreos/digitalocean/#ssh-fingerprints","text":"DigitalOcean droplets are created with your SSH public key \"fingerprint\" (i.e. MD5 hash) to allow access. If your SSH public key is at ~/.ssh/id_ed25519.pub , find the fingerprint with, ssh-keygen -E md5 -lf ~/.ssh/id_ed25519.pub | awk '{print $2}' MD5:d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7 If you use ssh-agent (e.g. Yubikey for SSH), find the fingerprint with, ssh-add -l -E md5 256 MD5:20:d0:eb:ad:50:b0:09:6d:4b:ba:ad:7c:9c:c1:39:24 foo@xample.com (ED25519) Digital Ocean requires the SSH public key be uploaded to your account, so you may also find the fingerprint under Settings -> Security. Finally, if you don't have an SSH key, create one now .","title":"SSH Fingerprints"},{"location":"fedora-coreos/digitalocean/#optional","text":"Name Description Default Example controller_count Number of controllers (i.e. masters) 1 1 worker_count Number of workers 1 3 controller_type Droplet type for controllers \"s-2vcpu-2gb\" s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... worker_type Droplet type for workers \"s-1vcpu-2gb\" s-1vcpu-2gb, s-2vcpu-2gb, ... controller_snippets Controller Butane snippets [] example worker_snippets Worker Butane snippets [] example networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" Check the list of valid droplet types or use doctl compute size list . Warning Do not choose a controller_type smaller than 2GB. Smaller droplets are not sufficient for running a controller and bootstrapping will fail.","title":"Optional"},{"location":"fedora-coreos/google-cloud/","text":"Google Cloud \u00b6 In this tutorial, we'll create a Kubernetes v1.24.3 cluster on Google Compute Engine with Fedora CoreOS. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster. Requirements \u00b6 Google Cloud Account and Service Account Google Cloud DNS Zone (registered Domain Name or delegated subdomain) Terraform v0.13.0+ Terraform Setup \u00b6 Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters Provider \u00b6 Login to your Google Console API Manager and select a project, or signup if you don't have an account. Select \"Credentials\" and create a service account key. Choose the \"Compute Engine Admin\" and \"DNS Administrator\" roles and save the JSON private key to a file that can be referenced in configs. mv ~/Downloads/project-id-43048204.json ~/.config/google-cloud/terraform.json Configure the Google Cloud provider to use your service account key, project-id, and region in a providers.tf file. provider \"google\" { project = \"project-id\" region = \"us-central1\" credentials = file ( \"~/.config/google-cloud/terraform.json\" ) } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } google = { source = \"hashicorp/google\" version = \"4.29.0\" } } } Additional configuration options are described in the google provider docs . Tip Regions are listed in docs or with gcloud compute regions list . A project may contain multiple clusters across different regions. Cluster \u00b6 Define a Kubernetes cluster using the module google-cloud/fedora-coreos/kubernetes . module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=development-sha\" # Google Cloud cluster_name = \"yavin\" region = \"us-central1\" dns_zone = \"example.com\" dns_zone_name = \"example-zone\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional worker_count = 2 } Reference the variables docs or the variables.tf source. ssh-agent \u00b6 Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_ed25519 ssh-add -L Apply \u00b6 Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 64 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply module.yavin.null_resource.bootstrap: Still creating... ( 10s elapsed ) ... module.yavin.null_resource.bootstrap: Still creating... ( 5m30s elapsed ) module.yavin.null_resource.bootstrap: Still creating... ( 5m40s elapsed ) module.yavin.null_resource.bootstrap: Creation complete ( ID: 5768638456220583358 ) Apply complete! Resources: 62 added, 0 changed, 0 destroyed. In 4-8 minutes, the Kubernetes cluster will be ready. Verify \u00b6 Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-yavin\" { content = module.yavin.kubeconfig-admin filename = \"/home/user/.kube/configs/yavin-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/yavin-config $ kubectl get nodes NAME ROLES STATUS AGE VERSION yavin-controller-0.c.example-com.internal <none> Ready 6m v1.24.3 yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.24.3 yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-1cs8z 2/2 Running 0 6m kube-system calico-node-d1l5b 2/2 Running 0 6m kube-system calico-node-sp9ps 2/2 Running 0 6m kube-system coredns-1187388186-dkh3o 1/1 Running 0 6m kube-system coredns-1187388186-zj5dl 1/1 Running 0 6m kube-system kube-apiserver-controller-0 1/1 Running 0 6m kube-system kube-controller-manager-controller-0 1/1 Running 0 6m kube-system kube-proxy-117v6 1/1 Running 0 6m kube-system kube-proxy-9886n 1/1 Running 0 6m kube-system kube-proxy-njn47 1/1 Running 0 6m kube-system kube-scheduler-controller-0 1/1 Running 0 6m Going Further \u00b6 Learn about maintenance and addons . Variables \u00b6 Check the variables.tf source. Required \u00b6 Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"yavin\" region Google Cloud region \"us-central1\" dns_zone Google Cloud DNS zone \"google-cloud.example.com\" dns_zone_name Google Cloud DNS zone name \"example-zone\" ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3NZ...\" Check the list of valid regions and list Fedora CoreOS images with gcloud compute images list | grep fedora-coreos . DNS Zone \u00b6 Clusters create a DNS A record ${cluster_name}.${dns_zone} to resolve a TCP proxy load balancer backed by controller instances. This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at yavin.google-cloud.example.com . You'll need a registered domain name or delegated subdomain on Google Cloud DNS. You can set this up once and create many clusters with unique names. resource \"google_dns_managed_zone\" \"zone-for-clusters\" { dns_name = \"google-cloud.example.com.\" name = \"example-zone\" description = \"Production DNS zone\" } If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Google Cloud (e.g. google-cloud.mydomain.com) and update nameservers . Optional \u00b6 Name Description Default Example controller_count Number of controllers (i.e. masters) 1 3 worker_count Number of workers 1 3 controller_type Machine type for controllers \"n1-standard-1\" See below worker_type Machine type for workers \"n1-standard-1\" See below os_stream Fedora CoreOS stream for compute instances \"stable\" \"stable\", \"testing\", \"next\" disk_size Size of the disk in GB 30 100 worker_preemptible If enabled, Compute Engine will terminate workers randomly within 24 hours false true controller_snippets Controller Butane snippets [] examples worker_snippets Worker Butane snippets [] examples networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" worker_node_labels List of initial worker node labels [] [\"worker-pool=default\"] Check the list of valid machine types . Preemption \u00b6 Add worker_preemptible = \"true\" to allow worker nodes to be preempted at random, but pay significantly less. Clusters tolerate stopping instances fairly well (reschedules pods, but cannot drain) and preemption provides a nice reward for running fault-tolerant cluster systems.`","title":"Google Cloud"},{"location":"fedora-coreos/google-cloud/#google-cloud","text":"In this tutorial, we'll create a Kubernetes v1.24.3 cluster on Google Compute Engine with Fedora CoreOS. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster.","title":"Google Cloud"},{"location":"fedora-coreos/google-cloud/#requirements","text":"Google Cloud Account and Service Account Google Cloud DNS Zone (registered Domain Name or delegated subdomain) Terraform v0.13.0+","title":"Requirements"},{"location":"fedora-coreos/google-cloud/#terraform-setup","text":"Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters","title":"Terraform Setup"},{"location":"fedora-coreos/google-cloud/#provider","text":"Login to your Google Console API Manager and select a project, or signup if you don't have an account. Select \"Credentials\" and create a service account key. Choose the \"Compute Engine Admin\" and \"DNS Administrator\" roles and save the JSON private key to a file that can be referenced in configs. mv ~/Downloads/project-id-43048204.json ~/.config/google-cloud/terraform.json Configure the Google Cloud provider to use your service account key, project-id, and region in a providers.tf file. provider \"google\" { project = \"project-id\" region = \"us-central1\" credentials = file ( \"~/.config/google-cloud/terraform.json\" ) } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } google = { source = \"hashicorp/google\" version = \"4.29.0\" } } } Additional configuration options are described in the google provider docs . Tip Regions are listed in docs or with gcloud compute regions list . A project may contain multiple clusters across different regions.","title":"Provider"},{"location":"fedora-coreos/google-cloud/#cluster","text":"Define a Kubernetes cluster using the module google-cloud/fedora-coreos/kubernetes . module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=development-sha\" # Google Cloud cluster_name = \"yavin\" region = \"us-central1\" dns_zone = \"example.com\" dns_zone_name = \"example-zone\" # configuration ssh_authorized_key = \"ssh-ed25519 AAAAB3Nz...\" # optional worker_count = 2 } Reference the variables docs or the variables.tf source.","title":"Cluster"},{"location":"fedora-coreos/google-cloud/#ssh-agent","text":"Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_ed25519 ssh-add -L","title":"ssh-agent"},{"location":"fedora-coreos/google-cloud/#apply","text":"Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 64 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply module.yavin.null_resource.bootstrap: Still creating... ( 10s elapsed ) ... module.yavin.null_resource.bootstrap: Still creating... ( 5m30s elapsed ) module.yavin.null_resource.bootstrap: Still creating... ( 5m40s elapsed ) module.yavin.null_resource.bootstrap: Creation complete ( ID: 5768638456220583358 ) Apply complete! Resources: 62 added, 0 changed, 0 destroyed. In 4-8 minutes, the Kubernetes cluster will be ready.","title":"Apply"},{"location":"fedora-coreos/google-cloud/#verify","text":"Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-yavin\" { content = module.yavin.kubeconfig-admin filename = \"/home/user/.kube/configs/yavin-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/yavin-config $ kubectl get nodes NAME ROLES STATUS AGE VERSION yavin-controller-0.c.example-com.internal <none> Ready 6m v1.24.3 yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.24.3 yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-1cs8z 2/2 Running 0 6m kube-system calico-node-d1l5b 2/2 Running 0 6m kube-system calico-node-sp9ps 2/2 Running 0 6m kube-system coredns-1187388186-dkh3o 1/1 Running 0 6m kube-system coredns-1187388186-zj5dl 1/1 Running 0 6m kube-system kube-apiserver-controller-0 1/1 Running 0 6m kube-system kube-controller-manager-controller-0 1/1 Running 0 6m kube-system kube-proxy-117v6 1/1 Running 0 6m kube-system kube-proxy-9886n 1/1 Running 0 6m kube-system kube-proxy-njn47 1/1 Running 0 6m kube-system kube-scheduler-controller-0 1/1 Running 0 6m","title":"Verify"},{"location":"fedora-coreos/google-cloud/#going-further","text":"Learn about maintenance and addons .","title":"Going Further"},{"location":"fedora-coreos/google-cloud/#variables","text":"Check the variables.tf source.","title":"Variables"},{"location":"fedora-coreos/google-cloud/#required","text":"Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"yavin\" region Google Cloud region \"us-central1\" dns_zone Google Cloud DNS zone \"google-cloud.example.com\" dns_zone_name Google Cloud DNS zone name \"example-zone\" ssh_authorized_key SSH public key for user 'core' \"ssh-ed25519 AAAAB3NZ...\" Check the list of valid regions and list Fedora CoreOS images with gcloud compute images list | grep fedora-coreos .","title":"Required"},{"location":"fedora-coreos/google-cloud/#dns-zone","text":"Clusters create a DNS A record ${cluster_name}.${dns_zone} to resolve a TCP proxy load balancer backed by controller instances. This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at yavin.google-cloud.example.com . You'll need a registered domain name or delegated subdomain on Google Cloud DNS. You can set this up once and create many clusters with unique names. resource \"google_dns_managed_zone\" \"zone-for-clusters\" { dns_name = \"google-cloud.example.com.\" name = \"example-zone\" description = \"Production DNS zone\" } If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Google Cloud (e.g. google-cloud.mydomain.com) and update nameservers .","title":"DNS Zone"},{"location":"fedora-coreos/google-cloud/#optional","text":"Name Description Default Example controller_count Number of controllers (i.e. masters) 1 3 worker_count Number of workers 1 3 controller_type Machine type for controllers \"n1-standard-1\" See below worker_type Machine type for workers \"n1-standard-1\" See below os_stream Fedora CoreOS stream for compute instances \"stable\" \"stable\", \"testing\", \"next\" disk_size Size of the disk in GB 30 100 worker_preemptible If enabled, Compute Engine will terminate workers randomly within 24 hours false true controller_snippets Controller Butane snippets [] examples worker_snippets Worker Butane snippets [] examples networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" worker_node_labels List of initial worker node labels [] [\"worker-pool=default\"] Check the list of valid machine types .","title":"Optional"},{"location":"fedora-coreos/google-cloud/#preemption","text":"Add worker_preemptible = \"true\" to allow worker nodes to be preempted at random, but pay significantly less. Clusters tolerate stopping instances fairly well (reschedules pods, but cannot drain) and preemption provides a nice reward for running fault-tolerant cluster systems.`","title":"Preemption"},{"location":"flatcar-linux/aws/","text":"AWS \u00b6 In this tutorial, we'll create a Kubernetes v1.24.3 cluster on AWS with Flatcar Linux. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster. Requirements \u00b6 AWS Account and IAM credentials AWS Route53 DNS Zone (registered Domain Name or delegated subdomain) Terraform v0.13.0+ Terraform Setup \u00b6 Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters Provider \u00b6 Login to your AWS IAM dashboard and find your IAM user. Select \"Security Credentials\" and create an access key. Save the id and secret to a file that can be referenced in configs. [default] aws_access_key_id = xxx aws_secret_access_key = yyy Configure the AWS provider to use your access key credentials in a providers.tf file. provider \"aws\" { region = \"eu-central-1\" shared_credentials_file = \"/home/user/.config/aws/credentials\" } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } aws = { source = \"hashicorp/aws\" version = \"4.22.0\" } } } Additional configuration options are described in the aws provider docs . Tip Regions are listed in docs or with aws ec2 describe-regions . Cluster \u00b6 Define a Kubernetes cluster using the module aws/flatcar-linux/kubernetes . module \"tempest\" { source = \"git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes?ref=v1.24.3\" # AWS cluster_name = \"tempest\" dns_zone = \"aws.example.com\" dns_zone_id = \"Z3PAABBCFAKEC0\" # configuration ssh_authorized_key = \"ssh-rsa AAAAB3Nz...\" # optional worker_count = 2 worker_type = \"t3.small\" } Reference the variables docs or the variables.tf source. ssh-agent \u00b6 Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_rsa ssh-add -L Apply \u00b6 Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 80 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply ... module.tempest.null_resource.bootstrap: Still creating... ( 4m50s elapsed ) module.tempest.null_resource.bootstrap: Still creating... ( 5m0s elapsed ) module.tempest.null_resource.bootstrap: Creation complete after 11m8s ( ID: 3961816482286168143 ) Apply complete! Resources: 98 added, 0 changed, 0 destroyed. In 4-8 minutes, the Kubernetes cluster will be ready. Verify \u00b6 Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-tempest\" { content = module.tempest.kubeconfig-admin filename = \"/home/user/.kube/configs/tempest-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/tempest-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION ip-10-0-3-155 Ready <none> 10m v1.24.3 ip-10-0-26-65 Ready <none> 10m v1.24.3 ip-10-0-41-21 Ready <none> 10m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-1m5bf 2/2 Running 0 34m kube-system calico-node-7jmr1 2/2 Running 0 34m kube-system calico-node-bknc8 2/2 Running 0 34m kube-system coredns-1187388186-wx1lg 1/1 Running 0 34m kube-system coredns-1187388186-qjnvp 1/1 Running 0 34m kube-system kube-apiserver-ip-10-0-3-155 1/1 Running 0 34m kube-system kube-controller-manager-ip-10-0-3-155 1/1 Running 0 34m kube-system kube-proxy-14wxv 1/1 Running 0 34m kube-system kube-proxy-9vxh2 1/1 Running 0 34m kube-system kube-proxy-sbbsh 1/1 Running 0 34m kube-system kube-scheduler-ip-10-0-3-155 1/1 Running 1 34m Going Further \u00b6 Learn about maintenance and addons . Variables \u00b6 Check the variables.tf source. Required \u00b6 Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"tempest\" dns_zone AWS Route53 DNS zone \"aws.example.com\" dns_zone_id AWS Route53 DNS zone id \"Z3PAABBCFAKEC0\" ssh_authorized_key SSH public key for user 'core' \"ssh-rsa AAAAB3NZ...\" DNS Zone \u00b6 Clusters create a DNS A record ${cluster_name}.${dns_zone} to resolve a network load balancer backed by controller instances. This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at tempest.aws.example.com . You'll need a registered domain name or delegated subdomain on AWS Route53. You can set this up once and create many clusters with unique names. resource \"aws_route53_zone\" \"zone-for-clusters\" { name = \"aws.example.com.\" } Reference the DNS zone id with aws_route53_zone.zone-for-clusters.zone_id . If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Route53 (e.g. aws.mydomain.com) and update nameservers . Optional \u00b6 Name Description Default Example controller_count Number of controllers (i.e. masters) 1 1 worker_count Number of workers 1 3 controller_type EC2 instance type for controllers \"t3.small\" See below worker_type EC2 instance type for workers \"t3.small\" See below os_image AMI channel for a Container Linux derivative \"flatcar-stable\" flatcar-stable, flatcar-beta, flatcar-alpha disk_size Size of the EBS volume in GB 30 100 disk_type Type of the EBS volume \"gp3\" standard, gp2, gp3, io1 disk_iops IOPS of the EBS volume 0 (i.e. auto) 400 worker_target_groups Target group ARNs to which worker instances should be added [] [aws_lb_target_group.app.id] worker_price Spot price in USD for worker instances or 0 to use on-demand instances 0/null 0.10 controller_snippets Controller Container Linux Config snippets [] example worker_snippets Worker Container Linux Config snippets [] example networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" network_mtu CNI interface MTU (calico only) 1480 8981 host_cidr CIDR IPv4 range to assign to EC2 instances \"10.0.0.0/16\" \"10.1.0.0/16\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" worker_node_labels List of initial worker node labels [] [\"worker-pool=default\"] Check the list of valid instance types . Warning Do not choose a controller_type smaller than t2.small . Smaller instances are not sufficient for running a controller. MTU If your EC2 instance type supports Jumbo frames (most do), we recommend you change the network_mtu to 8981! You will get better pod-to-pod bandwidth. Spot \u00b6 Add worker_price = \"0.10\" to use spot instance workers (instead of \"on-demand\") and set a maximum spot price in USD. Clusters can tolerate spot market interuptions fairly well (reschedules pods, but cannot drain) to save money, with the tradeoff that requests for workers may go unfulfilled.","title":"AWS"},{"location":"flatcar-linux/aws/#aws","text":"In this tutorial, we'll create a Kubernetes v1.24.3 cluster on AWS with Flatcar Linux. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster.","title":"AWS"},{"location":"flatcar-linux/aws/#requirements","text":"AWS Account and IAM credentials AWS Route53 DNS Zone (registered Domain Name or delegated subdomain) Terraform v0.13.0+","title":"Requirements"},{"location":"flatcar-linux/aws/#terraform-setup","text":"Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters","title":"Terraform Setup"},{"location":"flatcar-linux/aws/#provider","text":"Login to your AWS IAM dashboard and find your IAM user. Select \"Security Credentials\" and create an access key. Save the id and secret to a file that can be referenced in configs. [default] aws_access_key_id = xxx aws_secret_access_key = yyy Configure the AWS provider to use your access key credentials in a providers.tf file. provider \"aws\" { region = \"eu-central-1\" shared_credentials_file = \"/home/user/.config/aws/credentials\" } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } aws = { source = \"hashicorp/aws\" version = \"4.22.0\" } } } Additional configuration options are described in the aws provider docs . Tip Regions are listed in docs or with aws ec2 describe-regions .","title":"Provider"},{"location":"flatcar-linux/aws/#cluster","text":"Define a Kubernetes cluster using the module aws/flatcar-linux/kubernetes . module \"tempest\" { source = \"git::https://github.com/poseidon/typhoon//aws/flatcar-linux/kubernetes?ref=v1.24.3\" # AWS cluster_name = \"tempest\" dns_zone = \"aws.example.com\" dns_zone_id = \"Z3PAABBCFAKEC0\" # configuration ssh_authorized_key = \"ssh-rsa AAAAB3Nz...\" # optional worker_count = 2 worker_type = \"t3.small\" } Reference the variables docs or the variables.tf source.","title":"Cluster"},{"location":"flatcar-linux/aws/#ssh-agent","text":"Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_rsa ssh-add -L","title":"ssh-agent"},{"location":"flatcar-linux/aws/#apply","text":"Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 80 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply ... module.tempest.null_resource.bootstrap: Still creating... ( 4m50s elapsed ) module.tempest.null_resource.bootstrap: Still creating... ( 5m0s elapsed ) module.tempest.null_resource.bootstrap: Creation complete after 11m8s ( ID: 3961816482286168143 ) Apply complete! Resources: 98 added, 0 changed, 0 destroyed. In 4-8 minutes, the Kubernetes cluster will be ready.","title":"Apply"},{"location":"flatcar-linux/aws/#verify","text":"Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-tempest\" { content = module.tempest.kubeconfig-admin filename = \"/home/user/.kube/configs/tempest-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/tempest-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION ip-10-0-3-155 Ready <none> 10m v1.24.3 ip-10-0-26-65 Ready <none> 10m v1.24.3 ip-10-0-41-21 Ready <none> 10m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-1m5bf 2/2 Running 0 34m kube-system calico-node-7jmr1 2/2 Running 0 34m kube-system calico-node-bknc8 2/2 Running 0 34m kube-system coredns-1187388186-wx1lg 1/1 Running 0 34m kube-system coredns-1187388186-qjnvp 1/1 Running 0 34m kube-system kube-apiserver-ip-10-0-3-155 1/1 Running 0 34m kube-system kube-controller-manager-ip-10-0-3-155 1/1 Running 0 34m kube-system kube-proxy-14wxv 1/1 Running 0 34m kube-system kube-proxy-9vxh2 1/1 Running 0 34m kube-system kube-proxy-sbbsh 1/1 Running 0 34m kube-system kube-scheduler-ip-10-0-3-155 1/1 Running 1 34m","title":"Verify"},{"location":"flatcar-linux/aws/#going-further","text":"Learn about maintenance and addons .","title":"Going Further"},{"location":"flatcar-linux/aws/#variables","text":"Check the variables.tf source.","title":"Variables"},{"location":"flatcar-linux/aws/#required","text":"Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"tempest\" dns_zone AWS Route53 DNS zone \"aws.example.com\" dns_zone_id AWS Route53 DNS zone id \"Z3PAABBCFAKEC0\" ssh_authorized_key SSH public key for user 'core' \"ssh-rsa AAAAB3NZ...\"","title":"Required"},{"location":"flatcar-linux/aws/#dns-zone","text":"Clusters create a DNS A record ${cluster_name}.${dns_zone} to resolve a network load balancer backed by controller instances. This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at tempest.aws.example.com . You'll need a registered domain name or delegated subdomain on AWS Route53. You can set this up once and create many clusters with unique names. resource \"aws_route53_zone\" \"zone-for-clusters\" { name = \"aws.example.com.\" } Reference the DNS zone id with aws_route53_zone.zone-for-clusters.zone_id . If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Route53 (e.g. aws.mydomain.com) and update nameservers .","title":"DNS Zone"},{"location":"flatcar-linux/aws/#optional","text":"Name Description Default Example controller_count Number of controllers (i.e. masters) 1 1 worker_count Number of workers 1 3 controller_type EC2 instance type for controllers \"t3.small\" See below worker_type EC2 instance type for workers \"t3.small\" See below os_image AMI channel for a Container Linux derivative \"flatcar-stable\" flatcar-stable, flatcar-beta, flatcar-alpha disk_size Size of the EBS volume in GB 30 100 disk_type Type of the EBS volume \"gp3\" standard, gp2, gp3, io1 disk_iops IOPS of the EBS volume 0 (i.e. auto) 400 worker_target_groups Target group ARNs to which worker instances should be added [] [aws_lb_target_group.app.id] worker_price Spot price in USD for worker instances or 0 to use on-demand instances 0/null 0.10 controller_snippets Controller Container Linux Config snippets [] example worker_snippets Worker Container Linux Config snippets [] example networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" network_mtu CNI interface MTU (calico only) 1480 8981 host_cidr CIDR IPv4 range to assign to EC2 instances \"10.0.0.0/16\" \"10.1.0.0/16\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" worker_node_labels List of initial worker node labels [] [\"worker-pool=default\"] Check the list of valid instance types . Warning Do not choose a controller_type smaller than t2.small . Smaller instances are not sufficient for running a controller. MTU If your EC2 instance type supports Jumbo frames (most do), we recommend you change the network_mtu to 8981! You will get better pod-to-pod bandwidth.","title":"Optional"},{"location":"flatcar-linux/aws/#spot","text":"Add worker_price = \"0.10\" to use spot instance workers (instead of \"on-demand\") and set a maximum spot price in USD. Clusters can tolerate spot market interuptions fairly well (reschedules pods, but cannot drain) to save money, with the tradeoff that requests for workers may go unfulfilled.","title":"Spot"},{"location":"flatcar-linux/azure/","text":"Azure \u00b6 In this tutorial, we'll create a Kubernetes v1.24.3 cluster on Azure with Flatcar Linux. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a resource group, virtual network, subnets, security groups, controller availability set, worker scale set, load balancer, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster. Requirements \u00b6 Azure account Azure DNS Zone (registered Domain Name or delegated subdomain) Terraform v0.13.0+ Terraform Setup \u00b6 Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters Provider \u00b6 Install the Azure az command line tool to authenticate with Azure . az login Configure the Azure provider in a providers.tf file. provider \"azurerm\" { features {} } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } azurerm = { source = \"hashicorp/azurerm\" version = \"3.14.0\" } } } Additional configuration options are described in the azurerm provider docs . Flatcar Linux Images \u00b6 Flatcar Linux publishes images to the Azure Marketplace and requires accepting terms. az vm image terms show --publish kinvolk --offer flatcar-container-linux-free --plan stable az vm image terms accept --publish kinvolk --offer flatcar-container-linux-free --plan stable Cluster \u00b6 Define a Kubernetes cluster using the module azure/flatcar-linux/kubernetes . module \"ramius\" { source = \"git::https://github.com/poseidon/typhoon//azure/flatcar-linux/kubernetes?ref=v1.24.3\" # Azure cluster_name = \"ramius\" region = \"centralus\" dns_zone = \"azure.example.com\" dns_zone_group = \"example-group\" # configuration ssh_authorized_key = \"ssh-rsa AAAAB3Nz...\" # optional worker_count = 2 host_cidr = \"10.0.0.0/20\" } Reference the variables docs or the variables.tf source. ssh-agent \u00b6 Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_rsa ssh-add -L Apply \u00b6 Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 86 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply ... module.ramius.null_resource.bootstrap: Still creating... ( 6m50s elapsed ) module.ramius.null_resource.bootstrap: Still creating... ( 7m0s elapsed ) module.ramius.null_resource.bootstrap: Creation complete after 7m8s ( ID: 3961816482286168143 ) Apply complete! Resources: 69 added, 0 changed, 0 destroyed. In 4-8 minutes, the Kubernetes cluster will be ready. Verify \u00b6 Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-ramius\" { content = module.ramius.kubeconfig-admin filename = \"/home/user/.kube/configs/ramius-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/ramius-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION ramius-controller-0 Ready <none> 24m v1.24.3 ramius-worker-000001 Ready <none> 25m v1.24.3 ramius-worker-000002 Ready <none> 24m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system coredns-7c6fbb4f4b-b6qzx 1/1 Running 0 26m kube-system coredns-7c6fbb4f4b-j2k3d 1/1 Running 0 26m kube-system calico-node-1m5bf 2/2 Running 0 26m kube-system calico-node-7jmr1 2/2 Running 0 26m kube-system calico-node-bknc8 2/2 Running 0 26m kube-system kube-apiserver-ramius-controller-0 1/1 Running 0 26m kube-system kube-controller-manager-ramius-controller-0 1/1 Running 0 26m kube-system kube-proxy-j4vpq 1/1 Running 0 26m kube-system kube-proxy-jxr5d 1/1 Running 0 26m kube-system kube-proxy-lbdw5 1/1 Running 0 26m kube-system kube-scheduler-ramius-controller-0 1/1 Running 0 26m Going Further \u00b6 Learn about maintenance and addons . Variables \u00b6 Check the variables.tf source. Required \u00b6 Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"ramius\" region Azure region \"centralus\" dns_zone Azure DNS zone \"azure.example.com\" dns_zone_group Resource group where the Azure DNS zone resides \"global\" ssh_authorized_key SSH public key for user 'core' \"ssh-rsa AAAAB3NZ...\" Tip Regions are shown in docs or with az account list-locations --output table . DNS Zone \u00b6 Clusters create a DNS A record ${cluster_name}.${dns_zone} to resolve a load balancer backed by controller instances. This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at ramius.azure.example.com . You'll need a registered domain name or delegated subdomain on Azure DNS. You can set this up once and create many clusters with unique names. # Azure resource group for DNS zone resource \"azurerm_resource_group\" \"global\" { name = \"global\" location = \"centralus\" } # DNS zone for clusters resource \"azurerm_dns_zone\" \"clusters\" { resource_group_name = azurerm_resource_group.global.name name = \"azure.example.com\" zone_type = \"Public\" } Reference the DNS zone with azurerm_dns_zone.clusters.name and its resource group with \"azurerm_resource_group.global.name . If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Azure DNS (e.g. azure.mydomain.com) and update nameservers . Optional \u00b6 Name Description Default Example controller_count Number of controllers (i.e. masters) 1 1 worker_count Number of workers 1 3 controller_type Machine type for controllers \"Standard_B2s\" See below worker_type Machine type for workers \"Standard_DS1_v2\" See below os_image Channel for a Container Linux derivative \"flatcar-stable\" flatcar-stable, flatcar-beta, flatcar-alpha disk_size Size of the disk in GB 30 100 worker_priority Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time Regular Spot controller_snippets Controller Container Linux Config snippets [] example worker_snippets Worker Container Linux Config snippets [] example networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" host_cidr CIDR IPv4 range to assign to instances \"10.0.0.0/16\" \"10.0.0.0/20\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" worker_node_labels List of initial worker node labels [] [\"worker-pool=default\"] Check the list of valid machine types and their specs . Use az vm list-skus to get the identifier. Warning Unlike AWS and GCP, Azure requires its virtual networks to have non-overlapping IPv4 CIDRs (yeah, go figure). Instead of each cluster just using 10.0.0.0/16 for instances, each Azure cluster's host_cidr must be non-overlapping (e.g. 10.0.0.0/20 for the 1 st cluster, 10.0.16.0/20 for the 2 nd cluster, etc). Warning Do not choose a controller_type smaller than Standard_B2s . Smaller instances are not sufficient for running a controller. Spot Priority \u00b6 Add worker_priority=Spot to use Spot Priority workers that run on Azure's surplus capacity at lower cost, but with the tradeoff that they can be deallocated at random. Spot priority VMs are Azure's analog to AWS spot instances or GCP premptible instances.","title":"Azure"},{"location":"flatcar-linux/azure/#azure","text":"In this tutorial, we'll create a Kubernetes v1.24.3 cluster on Azure with Flatcar Linux. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a resource group, virtual network, subnets, security groups, controller availability set, worker scale set, load balancer, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster.","title":"Azure"},{"location":"flatcar-linux/azure/#requirements","text":"Azure account Azure DNS Zone (registered Domain Name or delegated subdomain) Terraform v0.13.0+","title":"Requirements"},{"location":"flatcar-linux/azure/#terraform-setup","text":"Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters","title":"Terraform Setup"},{"location":"flatcar-linux/azure/#provider","text":"Install the Azure az command line tool to authenticate with Azure . az login Configure the Azure provider in a providers.tf file. provider \"azurerm\" { features {} } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } azurerm = { source = \"hashicorp/azurerm\" version = \"3.14.0\" } } } Additional configuration options are described in the azurerm provider docs .","title":"Provider"},{"location":"flatcar-linux/azure/#flatcar-linux-images","text":"Flatcar Linux publishes images to the Azure Marketplace and requires accepting terms. az vm image terms show --publish kinvolk --offer flatcar-container-linux-free --plan stable az vm image terms accept --publish kinvolk --offer flatcar-container-linux-free --plan stable","title":"Flatcar Linux Images"},{"location":"flatcar-linux/azure/#cluster","text":"Define a Kubernetes cluster using the module azure/flatcar-linux/kubernetes . module \"ramius\" { source = \"git::https://github.com/poseidon/typhoon//azure/flatcar-linux/kubernetes?ref=v1.24.3\" # Azure cluster_name = \"ramius\" region = \"centralus\" dns_zone = \"azure.example.com\" dns_zone_group = \"example-group\" # configuration ssh_authorized_key = \"ssh-rsa AAAAB3Nz...\" # optional worker_count = 2 host_cidr = \"10.0.0.0/20\" } Reference the variables docs or the variables.tf source.","title":"Cluster"},{"location":"flatcar-linux/azure/#ssh-agent","text":"Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_rsa ssh-add -L","title":"ssh-agent"},{"location":"flatcar-linux/azure/#apply","text":"Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 86 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply ... module.ramius.null_resource.bootstrap: Still creating... ( 6m50s elapsed ) module.ramius.null_resource.bootstrap: Still creating... ( 7m0s elapsed ) module.ramius.null_resource.bootstrap: Creation complete after 7m8s ( ID: 3961816482286168143 ) Apply complete! Resources: 69 added, 0 changed, 0 destroyed. In 4-8 minutes, the Kubernetes cluster will be ready.","title":"Apply"},{"location":"flatcar-linux/azure/#verify","text":"Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-ramius\" { content = module.ramius.kubeconfig-admin filename = \"/home/user/.kube/configs/ramius-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/ramius-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION ramius-controller-0 Ready <none> 24m v1.24.3 ramius-worker-000001 Ready <none> 25m v1.24.3 ramius-worker-000002 Ready <none> 24m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system coredns-7c6fbb4f4b-b6qzx 1/1 Running 0 26m kube-system coredns-7c6fbb4f4b-j2k3d 1/1 Running 0 26m kube-system calico-node-1m5bf 2/2 Running 0 26m kube-system calico-node-7jmr1 2/2 Running 0 26m kube-system calico-node-bknc8 2/2 Running 0 26m kube-system kube-apiserver-ramius-controller-0 1/1 Running 0 26m kube-system kube-controller-manager-ramius-controller-0 1/1 Running 0 26m kube-system kube-proxy-j4vpq 1/1 Running 0 26m kube-system kube-proxy-jxr5d 1/1 Running 0 26m kube-system kube-proxy-lbdw5 1/1 Running 0 26m kube-system kube-scheduler-ramius-controller-0 1/1 Running 0 26m","title":"Verify"},{"location":"flatcar-linux/azure/#going-further","text":"Learn about maintenance and addons .","title":"Going Further"},{"location":"flatcar-linux/azure/#variables","text":"Check the variables.tf source.","title":"Variables"},{"location":"flatcar-linux/azure/#required","text":"Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"ramius\" region Azure region \"centralus\" dns_zone Azure DNS zone \"azure.example.com\" dns_zone_group Resource group where the Azure DNS zone resides \"global\" ssh_authorized_key SSH public key for user 'core' \"ssh-rsa AAAAB3NZ...\" Tip Regions are shown in docs or with az account list-locations --output table .","title":"Required"},{"location":"flatcar-linux/azure/#dns-zone","text":"Clusters create a DNS A record ${cluster_name}.${dns_zone} to resolve a load balancer backed by controller instances. This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at ramius.azure.example.com . You'll need a registered domain name or delegated subdomain on Azure DNS. You can set this up once and create many clusters with unique names. # Azure resource group for DNS zone resource \"azurerm_resource_group\" \"global\" { name = \"global\" location = \"centralus\" } # DNS zone for clusters resource \"azurerm_dns_zone\" \"clusters\" { resource_group_name = azurerm_resource_group.global.name name = \"azure.example.com\" zone_type = \"Public\" } Reference the DNS zone with azurerm_dns_zone.clusters.name and its resource group with \"azurerm_resource_group.global.name . If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Azure DNS (e.g. azure.mydomain.com) and update nameservers .","title":"DNS Zone"},{"location":"flatcar-linux/azure/#optional","text":"Name Description Default Example controller_count Number of controllers (i.e. masters) 1 1 worker_count Number of workers 1 3 controller_type Machine type for controllers \"Standard_B2s\" See below worker_type Machine type for workers \"Standard_DS1_v2\" See below os_image Channel for a Container Linux derivative \"flatcar-stable\" flatcar-stable, flatcar-beta, flatcar-alpha disk_size Size of the disk in GB 30 100 worker_priority Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time Regular Spot controller_snippets Controller Container Linux Config snippets [] example worker_snippets Worker Container Linux Config snippets [] example networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" host_cidr CIDR IPv4 range to assign to instances \"10.0.0.0/16\" \"10.0.0.0/20\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" worker_node_labels List of initial worker node labels [] [\"worker-pool=default\"] Check the list of valid machine types and their specs . Use az vm list-skus to get the identifier. Warning Unlike AWS and GCP, Azure requires its virtual networks to have non-overlapping IPv4 CIDRs (yeah, go figure). Instead of each cluster just using 10.0.0.0/16 for instances, each Azure cluster's host_cidr must be non-overlapping (e.g. 10.0.0.0/20 for the 1 st cluster, 10.0.16.0/20 for the 2 nd cluster, etc). Warning Do not choose a controller_type smaller than Standard_B2s . Smaller instances are not sufficient for running a controller.","title":"Optional"},{"location":"flatcar-linux/azure/#spot-priority","text":"Add worker_priority=Spot to use Spot Priority workers that run on Azure's surplus capacity at lower cost, but with the tradeoff that they can be deallocated at random. Spot priority VMs are Azure's analog to AWS spot instances or GCP premptible instances.","title":"Spot Priority"},{"location":"flatcar-linux/bare-metal/","text":"Bare-Metal \u00b6 In this tutorial, we'll network boot and provision a Kubernetes v1.24.3 cluster on bare-metal with Flatcar Linux. First, we'll deploy a Matchbox service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster. Requirements \u00b6 Machines with 2GB RAM, 30GB disk, PXE-enabled NIC, IPMI PXE-enabled network boot environment (with HTTPS support) Matchbox v0.6+ deployment with API enabled Matchbox credentials client.crt , client.key , ca.crt Terraform v0.13.0+ Machines \u00b6 Collect a MAC address from each machine. For machines with multiple PXE-enabled NICs, pick one of the MAC addresses. MAC addresses will be used to match machines to profiles during network boot. 52:54:00:a1:9c:ae (node1) 52:54:00:b2:2f:86 (node2) 52:54:00:c3:61:77 (node3) Configure each machine to boot from the disk through IPMI or the BIOS menu. ipmitool -H node1 -U USER -P PASS chassis bootdev disk options=persistent During provisioning, you'll explicitly set the boot device to pxe for the next boot only. Machines will install (overwrite) the operating system to disk on PXE boot and reboot into the disk install. Ask your hardware vendor to provide MACs and preconfigure IPMI, if possible. With it, you can rack new servers, terraform apply with new info, and power on machines that network boot and provision into clusters. DNS \u00b6 Create a DNS A (or AAAA) record for each node's default interface. Create a record that resolves to each controller node (or re-use the node record if there's one controller). node1.example.com (node1) node2.example.com (node2) node3.example.com (node3) myk8s.example.com (node1) Cluster nodes will be configured to refer to the control plane and themselves by these fully qualified names and they'll be used in generated TLS certificates. Matchbox \u00b6 Matchbox is an open-source app that matches network-booted bare-metal machines (based on labels like MAC, UUID, etc.) to profiles to automate cluster provisioning. Install Matchbox on a Kubernetes cluster or dedicated server. Installing on Kubernetes (recommended) Installing on a server Tip Deploy Matchbox as service that can be accessed by all of your bare-metal machines globally. This provides a single endpoint to use Terraform to manage bare-metal clusters at different sites. Typhoon will never include secrets in provisioning user-data so you may even deploy matchbox publicly. Matchbox provides a TLS client-authenticated API that clients, like Terraform, can use to manage machine matching and profiles. Think of it like a cloud provider API, but for creating bare-metal instances. Generate TLS client credentials. Save the ca.crt , client.crt , and client.key where they can be referenced in Terraform configs. mv ca.crt client.crt client.key ~/.config/matchbox/ Verify the matchbox read-only HTTP endpoints are accessible (port is configurable). $ curl http://matchbox.example.com:8080 matchbox Verify your TLS client certificate and key can be used to access the Matchbox API (port is configurable). $ openssl s_client -connect matchbox.example.com:8081 \\ -CAfile ~/.config/matchbox/ca.crt \\ -cert ~/.config/matchbox/client.crt \\ -key ~/.config/matchbox/client.key PXE Environment \u00b6 Create an iPXE-enabled network boot environment. Configure PXE clients to chainload iPXE firmware compiled to support HTTPS downloads . Instruct iPXE clients to chainload from your Matchbox service's /boot.ipxe endpoint. For networks already supporting iPXE clients, you can add a default.ipxe config. # /var/www/html/ipxe/default.ipxe chain http://matchbox.foo:8080/boot.ipxe For networks with Ubiquiti Routers, you can configure the router itself to chainload machines to iPXE and Matchbox. Read about the many ways to setup a compliant iPXE-enabled network. There is quite a bit of flexibility: Continue using existing DHCP, TFTP, or DNS services Configure specific machines, subnets, or architectures to chainload from Matchbox Place Matchbox behind a menu entry (timeout and default to Matchbox) TFTP chainloading to modern boot firmware, like iPXE, avoids issues with old NICs and allows faster transfer protocols like HTTP to be used. Warning Compile iPXE from source with support for HTTPS downloads . iPXE's pre-built firmware binaries do not enable this. If you cannot enable HTTPS downloads, set download_protocol = \"http\" (discouraged). Terraform Setup \u00b6 Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters Provider \u00b6 Configure the Matchbox provider to use your Matchbox API endpoint and client certificate in a providers.tf file. provider \"matchbox\" { endpoint = \"matchbox.example.com:8081\" client_cert = file ( \"~/.config/matchbox/client.crt\" ) client_key = file ( \"~/.config/matchbox/client.key\" ) ca = file ( \"~/.config/matchbox/ca.crt\" ) } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } matchbox = { source = \"poseidon/matchbox\" version = \"0.5.0\" } } } Cluster \u00b6 Define a Kubernetes cluster using the module bare-metal/flatcar-linux/kubernetes . module \"mercury\" { source = \"git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes?ref=v1.24.3\" # bare-metal cluster_name = \"mercury\" matchbox_http_endpoint = \"http://matchbox.example.com\" os_channel = \"flatcar-stable\" os_version = \"2345.3.1\" # configuration k8s_domain_name = \"node1.example.com\" ssh_authorized_key = \"ssh-rsa AAAAB3Nz...\" # machines controllers = [{ name = \"node1\" mac = \"52:54:00:a1:9c:ae\" domain = \"node1.example.com\" }] workers = [ { name = \"node2\" , mac = \"52:54:00:b2:2f:86\" domain = \"node2.example.com\" }, { name = \"node3\" , mac = \"52:54:00:c3:61:77\" domain = \"node3.example.com\" } ] # set to http only if you cannot chainload to iPXE firmware with https support # download_protocol = \"http\" } Reference the variables docs or the variables.tf source. ssh-agent \u00b6 Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_rsa ssh-add -L Apply \u00b6 Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 55 to add, 0 to change, 0 to destroy. Apply the changes. Terraform will generate bootstrap assets and create Matchbox profiles (e.g. controller, worker) and matching rules via the Matchbox API. $ terraform apply module.mercury.null_resource.copy-controller-secrets.0: Still creating... ( 10s elapsed ) module.mercury.null_resource.copy-worker-secrets.0: Still creating... ( 10s elapsed ) ... Apply will then loop until it can successfully copy credentials to each machine and start the one-time Kubernetes bootstrap service. Proceed to the next step while this loops. Power \u00b6 Power on each machine with the boot device set to pxe for the next boot only. ipmitool -H node1.example.com -U USER -P PASS chassis bootdev pxe ipmitool -H node1.example.com -U USER -P PASS power on Machines will network boot, install Container Linux to disk, reboot into the disk install, and provision themselves as controllers or workers. If this is the first test of your PXE-enabled network boot environment, watch the SOL console of a machine to spot any misconfigurations. Bootstrap \u00b6 Wait for the bootstrap step to finish bootstrapping the Kubernetes control plane. This may take 5-15 minutes depending on your network. module.mercury.null_resource.bootstrap: Still creating... (6m10s elapsed) module.mercury.null_resource.bootstrap: Still creating... (6m20s elapsed) module.mercury.null_resource.bootstrap: Still creating... (6m30s elapsed) module.mercury.null_resource.bootstrap: Still creating... (6m40s elapsed) module.mercury.null_resource.bootstrap: Creation complete (ID: 5441741360626669024) Apply complete! Resources: 55 added, 0 changed, 0 destroyed. To watch the install to disk (until machines reboot from disk), SSH to port 2222. # before v1.10.1 $ ssh debug@node1.example.com # after v1.10.1 $ ssh -p 2222 core@node1.example.com To watch the bootstrap process in detail, SSH to the first controller and journal the logs. $ ssh core@node1.example.com $ journalctl -f -u bootstrap rkt[1750]: The connection to the server cluster.example.com:6443 was refused - did you specify the right host or port? rkt[1750]: Waiting for static pod control plane ... rkt[1750]: serviceaccount/calico-node unchanged systemd[1]: Started Kubernetes control plane. Verify \u00b6 Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-mercury\" { content = module.mercury.kubeconfig-admin filename = \"/home/user/.kube/configs/mercury-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/mercury-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION node1.example.com Ready <none> 10m v1.24.3 node2.example.com Ready <none> 10m v1.24.3 node3.example.com Ready <none> 10m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-6qp7f 2/2 Running 1 11m kube-system calico-node-gnjrm 2/2 Running 0 11m kube-system calico-node-llbgt 2/2 Running 0 11m kube-system coredns-1187388186-dj3pd 1/1 Running 0 11m kube-system coredns-1187388186-mx9rt 1/1 Running 0 11m kube-system kube-apiserver-node1.example.com 1/1 Running 0 11m kube-system kube-controller-node1.example.com 1/1 Running 1 11m kube-system kube-proxy-50sd4 1/1 Running 0 11m kube-system kube-proxy-bczhp 1/1 Running 0 11m kube-system kube-proxy-mp2fw 1/1 Running 0 11m kube-system kube-scheduler-node1.example.com 1/1 Running 0 11m Going Further \u00b6 Learn about maintenance and addons . Variables \u00b6 Check the variables.tf source. Required \u00b6 Name Description Example cluster_name Unique cluster name \"mercury\" matchbox_http_endpoint Matchbox HTTP read-only endpoint \" http://matchbox.example.com:port \" os_channel Channel for a Container Linux derivative flatcar-stable, flatcar-beta, flatcar-alpha os_version Version for a Container Linux derivative to PXE and install \"2345.3.1\" k8s_domain_name FQDN resolving to the controller(s) nodes. Workers and kubectl will communicate with this endpoint \"myk8s.example.com\" ssh_authorized_key SSH public key for user 'core' \"ssh-rsa AAAAB3Nz...\" controllers List of controller machine detail objects (unique name, identifying MAC address, FQDN) [{name=\"node1\", mac=\"52:54:00:a1:9c:ae\", domain=\"node1.example.com\"}] workers List of worker machine detail objects (unique name, identifying MAC address, FQDN) [{name=\"node2\", mac=\"52:54:00:b2:2f:86\", domain=\"node2.example.com\"}, {name=\"node3\", mac=\"52:54:00:c3:61:77\", domain=\"node3.example.com\"}] Optional \u00b6 Name Description Default Example download_protocol Protocol iPXE uses to download the kernel and initrd. iPXE must be compiled with crypto support for https. Unused if cached_install is true \"https\" \"http\" cached_install PXE boot and install from the Matchbox /assets cache. Admin MUST have downloaded Container Linux or Flatcar images into the cache false true install_disk Disk device where Container Linux should be installed \"/dev/sda\" \"/dev/sdb\" networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" network_mtu CNI interface MTU (calico-only) 1480 - snippets Map from machine names to lists of Container Linux Config snippets {} examples network_ip_autodetection_method Method to detect host IPv4 address (calico-only) \"first-found\" \"can-reach=10.0.0.1\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" kernel_args Additional kernel args to provide at PXE boot [] [\"kvm-intel.nested=1\"] worker_node_labels Map from worker name to list of initial node labels {} {\"node2\" = [\"role=special\"]} worker_node_taints Map from worker name to list of initial node taints {} {\"node2\" = [\"role=special:NoSchedule\"]}","title":"Bare-Metal"},{"location":"flatcar-linux/bare-metal/#bare-metal","text":"In this tutorial, we'll network boot and provision a Kubernetes v1.24.3 cluster on bare-metal with Flatcar Linux. First, we'll deploy a Matchbox service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster.","title":"Bare-Metal"},{"location":"flatcar-linux/bare-metal/#requirements","text":"Machines with 2GB RAM, 30GB disk, PXE-enabled NIC, IPMI PXE-enabled network boot environment (with HTTPS support) Matchbox v0.6+ deployment with API enabled Matchbox credentials client.crt , client.key , ca.crt Terraform v0.13.0+","title":"Requirements"},{"location":"flatcar-linux/bare-metal/#machines","text":"Collect a MAC address from each machine. For machines with multiple PXE-enabled NICs, pick one of the MAC addresses. MAC addresses will be used to match machines to profiles during network boot. 52:54:00:a1:9c:ae (node1) 52:54:00:b2:2f:86 (node2) 52:54:00:c3:61:77 (node3) Configure each machine to boot from the disk through IPMI or the BIOS menu. ipmitool -H node1 -U USER -P PASS chassis bootdev disk options=persistent During provisioning, you'll explicitly set the boot device to pxe for the next boot only. Machines will install (overwrite) the operating system to disk on PXE boot and reboot into the disk install. Ask your hardware vendor to provide MACs and preconfigure IPMI, if possible. With it, you can rack new servers, terraform apply with new info, and power on machines that network boot and provision into clusters.","title":"Machines"},{"location":"flatcar-linux/bare-metal/#dns","text":"Create a DNS A (or AAAA) record for each node's default interface. Create a record that resolves to each controller node (or re-use the node record if there's one controller). node1.example.com (node1) node2.example.com (node2) node3.example.com (node3) myk8s.example.com (node1) Cluster nodes will be configured to refer to the control plane and themselves by these fully qualified names and they'll be used in generated TLS certificates.","title":"DNS"},{"location":"flatcar-linux/bare-metal/#matchbox","text":"Matchbox is an open-source app that matches network-booted bare-metal machines (based on labels like MAC, UUID, etc.) to profiles to automate cluster provisioning. Install Matchbox on a Kubernetes cluster or dedicated server. Installing on Kubernetes (recommended) Installing on a server Tip Deploy Matchbox as service that can be accessed by all of your bare-metal machines globally. This provides a single endpoint to use Terraform to manage bare-metal clusters at different sites. Typhoon will never include secrets in provisioning user-data so you may even deploy matchbox publicly. Matchbox provides a TLS client-authenticated API that clients, like Terraform, can use to manage machine matching and profiles. Think of it like a cloud provider API, but for creating bare-metal instances. Generate TLS client credentials. Save the ca.crt , client.crt , and client.key where they can be referenced in Terraform configs. mv ca.crt client.crt client.key ~/.config/matchbox/ Verify the matchbox read-only HTTP endpoints are accessible (port is configurable). $ curl http://matchbox.example.com:8080 matchbox Verify your TLS client certificate and key can be used to access the Matchbox API (port is configurable). $ openssl s_client -connect matchbox.example.com:8081 \\ -CAfile ~/.config/matchbox/ca.crt \\ -cert ~/.config/matchbox/client.crt \\ -key ~/.config/matchbox/client.key","title":"Matchbox"},{"location":"flatcar-linux/bare-metal/#pxe-environment","text":"Create an iPXE-enabled network boot environment. Configure PXE clients to chainload iPXE firmware compiled to support HTTPS downloads . Instruct iPXE clients to chainload from your Matchbox service's /boot.ipxe endpoint. For networks already supporting iPXE clients, you can add a default.ipxe config. # /var/www/html/ipxe/default.ipxe chain http://matchbox.foo:8080/boot.ipxe For networks with Ubiquiti Routers, you can configure the router itself to chainload machines to iPXE and Matchbox. Read about the many ways to setup a compliant iPXE-enabled network. There is quite a bit of flexibility: Continue using existing DHCP, TFTP, or DNS services Configure specific machines, subnets, or architectures to chainload from Matchbox Place Matchbox behind a menu entry (timeout and default to Matchbox) TFTP chainloading to modern boot firmware, like iPXE, avoids issues with old NICs and allows faster transfer protocols like HTTP to be used. Warning Compile iPXE from source with support for HTTPS downloads . iPXE's pre-built firmware binaries do not enable this. If you cannot enable HTTPS downloads, set download_protocol = \"http\" (discouraged).","title":"PXE Environment"},{"location":"flatcar-linux/bare-metal/#terraform-setup","text":"Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters","title":"Terraform Setup"},{"location":"flatcar-linux/bare-metal/#provider","text":"Configure the Matchbox provider to use your Matchbox API endpoint and client certificate in a providers.tf file. provider \"matchbox\" { endpoint = \"matchbox.example.com:8081\" client_cert = file ( \"~/.config/matchbox/client.crt\" ) client_key = file ( \"~/.config/matchbox/client.key\" ) ca = file ( \"~/.config/matchbox/ca.crt\" ) } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } matchbox = { source = \"poseidon/matchbox\" version = \"0.5.0\" } } }","title":"Provider"},{"location":"flatcar-linux/bare-metal/#cluster","text":"Define a Kubernetes cluster using the module bare-metal/flatcar-linux/kubernetes . module \"mercury\" { source = \"git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes?ref=v1.24.3\" # bare-metal cluster_name = \"mercury\" matchbox_http_endpoint = \"http://matchbox.example.com\" os_channel = \"flatcar-stable\" os_version = \"2345.3.1\" # configuration k8s_domain_name = \"node1.example.com\" ssh_authorized_key = \"ssh-rsa AAAAB3Nz...\" # machines controllers = [{ name = \"node1\" mac = \"52:54:00:a1:9c:ae\" domain = \"node1.example.com\" }] workers = [ { name = \"node2\" , mac = \"52:54:00:b2:2f:86\" domain = \"node2.example.com\" }, { name = \"node3\" , mac = \"52:54:00:c3:61:77\" domain = \"node3.example.com\" } ] # set to http only if you cannot chainload to iPXE firmware with https support # download_protocol = \"http\" } Reference the variables docs or the variables.tf source.","title":"Cluster"},{"location":"flatcar-linux/bare-metal/#ssh-agent","text":"Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_rsa ssh-add -L","title":"ssh-agent"},{"location":"flatcar-linux/bare-metal/#apply","text":"Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 55 to add, 0 to change, 0 to destroy. Apply the changes. Terraform will generate bootstrap assets and create Matchbox profiles (e.g. controller, worker) and matching rules via the Matchbox API. $ terraform apply module.mercury.null_resource.copy-controller-secrets.0: Still creating... ( 10s elapsed ) module.mercury.null_resource.copy-worker-secrets.0: Still creating... ( 10s elapsed ) ... Apply will then loop until it can successfully copy credentials to each machine and start the one-time Kubernetes bootstrap service. Proceed to the next step while this loops.","title":"Apply"},{"location":"flatcar-linux/bare-metal/#power","text":"Power on each machine with the boot device set to pxe for the next boot only. ipmitool -H node1.example.com -U USER -P PASS chassis bootdev pxe ipmitool -H node1.example.com -U USER -P PASS power on Machines will network boot, install Container Linux to disk, reboot into the disk install, and provision themselves as controllers or workers. If this is the first test of your PXE-enabled network boot environment, watch the SOL console of a machine to spot any misconfigurations.","title":"Power"},{"location":"flatcar-linux/bare-metal/#bootstrap","text":"Wait for the bootstrap step to finish bootstrapping the Kubernetes control plane. This may take 5-15 minutes depending on your network. module.mercury.null_resource.bootstrap: Still creating... (6m10s elapsed) module.mercury.null_resource.bootstrap: Still creating... (6m20s elapsed) module.mercury.null_resource.bootstrap: Still creating... (6m30s elapsed) module.mercury.null_resource.bootstrap: Still creating... (6m40s elapsed) module.mercury.null_resource.bootstrap: Creation complete (ID: 5441741360626669024) Apply complete! Resources: 55 added, 0 changed, 0 destroyed. To watch the install to disk (until machines reboot from disk), SSH to port 2222. # before v1.10.1 $ ssh debug@node1.example.com # after v1.10.1 $ ssh -p 2222 core@node1.example.com To watch the bootstrap process in detail, SSH to the first controller and journal the logs. $ ssh core@node1.example.com $ journalctl -f -u bootstrap rkt[1750]: The connection to the server cluster.example.com:6443 was refused - did you specify the right host or port? rkt[1750]: Waiting for static pod control plane ... rkt[1750]: serviceaccount/calico-node unchanged systemd[1]: Started Kubernetes control plane.","title":"Bootstrap"},{"location":"flatcar-linux/bare-metal/#verify","text":"Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-mercury\" { content = module.mercury.kubeconfig-admin filename = \"/home/user/.kube/configs/mercury-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/mercury-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION node1.example.com Ready <none> 10m v1.24.3 node2.example.com Ready <none> 10m v1.24.3 node3.example.com Ready <none> 10m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-6qp7f 2/2 Running 1 11m kube-system calico-node-gnjrm 2/2 Running 0 11m kube-system calico-node-llbgt 2/2 Running 0 11m kube-system coredns-1187388186-dj3pd 1/1 Running 0 11m kube-system coredns-1187388186-mx9rt 1/1 Running 0 11m kube-system kube-apiserver-node1.example.com 1/1 Running 0 11m kube-system kube-controller-node1.example.com 1/1 Running 1 11m kube-system kube-proxy-50sd4 1/1 Running 0 11m kube-system kube-proxy-bczhp 1/1 Running 0 11m kube-system kube-proxy-mp2fw 1/1 Running 0 11m kube-system kube-scheduler-node1.example.com 1/1 Running 0 11m","title":"Verify"},{"location":"flatcar-linux/bare-metal/#going-further","text":"Learn about maintenance and addons .","title":"Going Further"},{"location":"flatcar-linux/bare-metal/#variables","text":"Check the variables.tf source.","title":"Variables"},{"location":"flatcar-linux/bare-metal/#required","text":"Name Description Example cluster_name Unique cluster name \"mercury\" matchbox_http_endpoint Matchbox HTTP read-only endpoint \" http://matchbox.example.com:port \" os_channel Channel for a Container Linux derivative flatcar-stable, flatcar-beta, flatcar-alpha os_version Version for a Container Linux derivative to PXE and install \"2345.3.1\" k8s_domain_name FQDN resolving to the controller(s) nodes. Workers and kubectl will communicate with this endpoint \"myk8s.example.com\" ssh_authorized_key SSH public key for user 'core' \"ssh-rsa AAAAB3Nz...\" controllers List of controller machine detail objects (unique name, identifying MAC address, FQDN) [{name=\"node1\", mac=\"52:54:00:a1:9c:ae\", domain=\"node1.example.com\"}] workers List of worker machine detail objects (unique name, identifying MAC address, FQDN) [{name=\"node2\", mac=\"52:54:00:b2:2f:86\", domain=\"node2.example.com\"}, {name=\"node3\", mac=\"52:54:00:c3:61:77\", domain=\"node3.example.com\"}]","title":"Required"},{"location":"flatcar-linux/bare-metal/#optional","text":"Name Description Default Example download_protocol Protocol iPXE uses to download the kernel and initrd. iPXE must be compiled with crypto support for https. Unused if cached_install is true \"https\" \"http\" cached_install PXE boot and install from the Matchbox /assets cache. Admin MUST have downloaded Container Linux or Flatcar images into the cache false true install_disk Disk device where Container Linux should be installed \"/dev/sda\" \"/dev/sdb\" networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" network_mtu CNI interface MTU (calico-only) 1480 - snippets Map from machine names to lists of Container Linux Config snippets {} examples network_ip_autodetection_method Method to detect host IPv4 address (calico-only) \"first-found\" \"can-reach=10.0.0.1\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" kernel_args Additional kernel args to provide at PXE boot [] [\"kvm-intel.nested=1\"] worker_node_labels Map from worker name to list of initial node labels {} {\"node2\" = [\"role=special\"]} worker_node_taints Map from worker name to list of initial node taints {} {\"node2\" = [\"role=special:NoSchedule\"]}","title":"Optional"},{"location":"flatcar-linux/digitalocean/","text":"DigitalOcean \u00b6 In this tutorial, we'll create a Kubernetes v1.24.3 cluster on DigitalOcean with Flatcar Linux. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create controller droplets, worker droplets, DNS records, tags, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster. Requirements \u00b6 Digital Ocean Account and Token Digital Ocean Domain (registered Domain Name or delegated subdomain) Terraform v0.13.0+ Terraform Setup \u00b6 Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters Provider \u00b6 Login to DigitalOcean . Or if you don't have one, create an account with our referral link to get free credits. Generate a Personal Access Token with read/write scope from the API tab . Write the token to a file that can be referenced in configs. mkdir -p ~/.config/digital-ocean echo \"TOKEN\" > ~/.config/digital-ocean/token Configure the DigitalOcean provider to use your token in a providers.tf file. provider \"digitalocean\" { token = \"${chomp(file(\"~/.config/digital-ocean/token\"))}\" } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } digitalocean = { source = \"digitalocean/digitalocean\" version = \"2.21.0\" } } } Flatcar Linux Images \u00b6 Flatcar Linux publishes DigitalOcean images, but does not yet upload them. DigitalOcean allows custom images to be uploaded via URLor file. Download the Flatcar Linux DigitalOcean bin image. Rename the image with the channel and version (to refer to these images over time) and upload it as a custom image. data \"digitalocean_image\" \"flatcar-stable-2303-4-0\" { name = \"flatcar-stable-2303.4.0.bin.bz2\" } Set the os_image in the next step. Cluster \u00b6 Define a Kubernetes cluster using the module digital-ocean/flatcar-linux/kubernetes . module \"nemo\" { source = \"git::https://github.com/poseidon/typhoon//digital-ocean/flatcar-linux/kubernetes?ref=v1.24.3\" # Digital Ocean cluster_name = \"nemo\" region = \"nyc3\" dns_zone = \"digital-ocean.example.com\" # configuration os_image = data.digitalocean_image.flatcar-stable-2303-4-0.id ssh_fingerprints = [ \"d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7\" ] # optional worker_count = 2 } Reference the variables docs or the variables.tf source. ssh-agent \u00b6 Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_rsa ssh-add -L Apply \u00b6 Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 54 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply module.nemo.null_resource.bootstrap: Still creating... ( 30s elapsed ) module.nemo.null_resource.bootstrap: Provisioning with 'remote-exec' ... ... module.nemo.null_resource.bootstrap: Still creating... ( 6m20s elapsed ) module.nemo.null_resource.bootstrap: Creation complete ( ID: 7599298447329218468 ) Apply complete! Resources: 42 added, 0 changed, 0 destroyed. In 3-6 minutes, the Kubernetes cluster will be ready. Verify \u00b6 Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-nemo\" { content = module.nemo.kubeconfig-admin filename = \"/home/user/.kube/configs/nemo-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/nemo-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION 10.132.110.130 Ready <none> 10m v1.24.3 10.132.115.81 Ready <none> 10m v1.24.3 10.132.124.107 Ready <none> 10m v1.24.3 List the pods. NAMESPACE NAME READY STATUS RESTARTS AGE kube-system coredns-1187388186-ld1j7 1/1 Running 0 11m kube-system coredns-1187388186-rdhf7 1/1 Running 0 11m kube-system calico-node-1m5bf 2/2 Running 0 11m kube-system calico-node-7jmr1 2/2 Running 0 11m kube-system calico-node-bknc8 2/2 Running 0 11m kube-system kube-apiserver-ip-10.132.115.81 1/1 Running 0 11m kube-system kube-controller-manager-ip-10.132.115.81 1/1 Running 0 11m kube-system kube-proxy-6kxjf 1/1 Running 0 11m kube-system kube-proxy-fh3td 1/1 Running 0 11m kube-system kube-proxy-k35rc 1/1 Running 0 11m kube-system kube-scheduler-ip-10.132.115.81 1/1 Running 0 11m Going Further \u00b6 Learn about maintenance and addons . Variables \u00b6 Check the variables.tf source. Required \u00b6 Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"nemo\" region Digital Ocean region \"nyc1\", \"sfo2\", \"fra1\", tor1\" dns_zone Digital Ocean domain (i.e. DNS zone) \"do.example.com\" os_image Container Linux image for instances \"uploaded-flatcar-image-id\" ssh_fingerprints SSH public key fingerprints [\"d7:9d...\"] DNS Zone \u00b6 Clusters create DNS A records ${cluster_name}.${dns_zone} to resolve to controller droplets (round robin). This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at nemo.do.example.com . You'll need a registered domain name or delegated subdomain in DigitalOcean Domains (i.e. DNS zones). You can set this up once and create many clusters with unique names. # Declare a DigitalOcean record to also create a zone file resource \"digitalocean_domain\" \"zone-for-clusters\" { name = \"do.example.com\" ip_address = \"8.8.8.8\" } If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on DigitalOcean (e.g. do.mydomain.com) and update nameservers . SSH Fingerprints \u00b6 DigitalOcean droplets are created with your SSH public key \"fingerprint\" (i.e. MD5 hash) to allow access. If your SSH public key is at ~/.ssh/id_rsa , find the fingerprint with, ssh-keygen -E md5 -lf ~/.ssh/id_rsa.pub | awk '{print $2}' MD5:d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7 If you use ssh-agent (e.g. Yubikey for SSH), find the fingerprint with, ssh-add -l -E md5 2048 MD5:d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7 cardno:000603633110 (RSA) Digital Ocean requires the SSH public key be uploaded to your account, so you may also find the fingerprint under Settings -> Security. Finally, if you don't have an SSH key, create one now . Optional \u00b6 Name Description Default Example controller_count Number of controllers (i.e. masters) 1 1 worker_count Number of workers 1 3 controller_type Droplet type for controllers \"s-2vcpu-2gb\" s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... worker_type Droplet type for workers \"s-1vcpu-2gb\" s-1vcpu-2gb, s-2vcpu-2gb, ... controller_snippets Controller Container Linux Config snippets [] example worker_snippets Worker Container Linux Config snippets [] example networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" Check the list of valid droplet types or use doctl compute size list . Warning Do not choose a controller_type smaller than 2GB. Smaller droplets are not sufficient for running a controller and bootstrapping will fail.","title":"DigitalOcean"},{"location":"flatcar-linux/digitalocean/#digitalocean","text":"In this tutorial, we'll create a Kubernetes v1.24.3 cluster on DigitalOcean with Flatcar Linux. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create controller droplets, worker droplets, DNS records, tags, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster.","title":"DigitalOcean"},{"location":"flatcar-linux/digitalocean/#requirements","text":"Digital Ocean Account and Token Digital Ocean Domain (registered Domain Name or delegated subdomain) Terraform v0.13.0+","title":"Requirements"},{"location":"flatcar-linux/digitalocean/#terraform-setup","text":"Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters","title":"Terraform Setup"},{"location":"flatcar-linux/digitalocean/#provider","text":"Login to DigitalOcean . Or if you don't have one, create an account with our referral link to get free credits. Generate a Personal Access Token with read/write scope from the API tab . Write the token to a file that can be referenced in configs. mkdir -p ~/.config/digital-ocean echo \"TOKEN\" > ~/.config/digital-ocean/token Configure the DigitalOcean provider to use your token in a providers.tf file. provider \"digitalocean\" { token = \"${chomp(file(\"~/.config/digital-ocean/token\"))}\" } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } digitalocean = { source = \"digitalocean/digitalocean\" version = \"2.21.0\" } } }","title":"Provider"},{"location":"flatcar-linux/digitalocean/#flatcar-linux-images","text":"Flatcar Linux publishes DigitalOcean images, but does not yet upload them. DigitalOcean allows custom images to be uploaded via URLor file. Download the Flatcar Linux DigitalOcean bin image. Rename the image with the channel and version (to refer to these images over time) and upload it as a custom image. data \"digitalocean_image\" \"flatcar-stable-2303-4-0\" { name = \"flatcar-stable-2303.4.0.bin.bz2\" } Set the os_image in the next step.","title":"Flatcar Linux Images"},{"location":"flatcar-linux/digitalocean/#cluster","text":"Define a Kubernetes cluster using the module digital-ocean/flatcar-linux/kubernetes . module \"nemo\" { source = \"git::https://github.com/poseidon/typhoon//digital-ocean/flatcar-linux/kubernetes?ref=v1.24.3\" # Digital Ocean cluster_name = \"nemo\" region = \"nyc3\" dns_zone = \"digital-ocean.example.com\" # configuration os_image = data.digitalocean_image.flatcar-stable-2303-4-0.id ssh_fingerprints = [ \"d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7\" ] # optional worker_count = 2 } Reference the variables docs or the variables.tf source.","title":"Cluster"},{"location":"flatcar-linux/digitalocean/#ssh-agent","text":"Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_rsa ssh-add -L","title":"ssh-agent"},{"location":"flatcar-linux/digitalocean/#apply","text":"Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 54 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply module.nemo.null_resource.bootstrap: Still creating... ( 30s elapsed ) module.nemo.null_resource.bootstrap: Provisioning with 'remote-exec' ... ... module.nemo.null_resource.bootstrap: Still creating... ( 6m20s elapsed ) module.nemo.null_resource.bootstrap: Creation complete ( ID: 7599298447329218468 ) Apply complete! Resources: 42 added, 0 changed, 0 destroyed. In 3-6 minutes, the Kubernetes cluster will be ready.","title":"Apply"},{"location":"flatcar-linux/digitalocean/#verify","text":"Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-nemo\" { content = module.nemo.kubeconfig-admin filename = \"/home/user/.kube/configs/nemo-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/nemo-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION 10.132.110.130 Ready <none> 10m v1.24.3 10.132.115.81 Ready <none> 10m v1.24.3 10.132.124.107 Ready <none> 10m v1.24.3 List the pods. NAMESPACE NAME READY STATUS RESTARTS AGE kube-system coredns-1187388186-ld1j7 1/1 Running 0 11m kube-system coredns-1187388186-rdhf7 1/1 Running 0 11m kube-system calico-node-1m5bf 2/2 Running 0 11m kube-system calico-node-7jmr1 2/2 Running 0 11m kube-system calico-node-bknc8 2/2 Running 0 11m kube-system kube-apiserver-ip-10.132.115.81 1/1 Running 0 11m kube-system kube-controller-manager-ip-10.132.115.81 1/1 Running 0 11m kube-system kube-proxy-6kxjf 1/1 Running 0 11m kube-system kube-proxy-fh3td 1/1 Running 0 11m kube-system kube-proxy-k35rc 1/1 Running 0 11m kube-system kube-scheduler-ip-10.132.115.81 1/1 Running 0 11m","title":"Verify"},{"location":"flatcar-linux/digitalocean/#going-further","text":"Learn about maintenance and addons .","title":"Going Further"},{"location":"flatcar-linux/digitalocean/#variables","text":"Check the variables.tf source.","title":"Variables"},{"location":"flatcar-linux/digitalocean/#required","text":"Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"nemo\" region Digital Ocean region \"nyc1\", \"sfo2\", \"fra1\", tor1\" dns_zone Digital Ocean domain (i.e. DNS zone) \"do.example.com\" os_image Container Linux image for instances \"uploaded-flatcar-image-id\" ssh_fingerprints SSH public key fingerprints [\"d7:9d...\"]","title":"Required"},{"location":"flatcar-linux/digitalocean/#dns-zone","text":"Clusters create DNS A records ${cluster_name}.${dns_zone} to resolve to controller droplets (round robin). This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at nemo.do.example.com . You'll need a registered domain name or delegated subdomain in DigitalOcean Domains (i.e. DNS zones). You can set this up once and create many clusters with unique names. # Declare a DigitalOcean record to also create a zone file resource \"digitalocean_domain\" \"zone-for-clusters\" { name = \"do.example.com\" ip_address = \"8.8.8.8\" } If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on DigitalOcean (e.g. do.mydomain.com) and update nameservers .","title":"DNS Zone"},{"location":"flatcar-linux/digitalocean/#ssh-fingerprints","text":"DigitalOcean droplets are created with your SSH public key \"fingerprint\" (i.e. MD5 hash) to allow access. If your SSH public key is at ~/.ssh/id_rsa , find the fingerprint with, ssh-keygen -E md5 -lf ~/.ssh/id_rsa.pub | awk '{print $2}' MD5:d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7 If you use ssh-agent (e.g. Yubikey for SSH), find the fingerprint with, ssh-add -l -E md5 2048 MD5:d7:9d:79:ae:56:32:73:79:95:88:e3:a2:ab:5d:45:e7 cardno:000603633110 (RSA) Digital Ocean requires the SSH public key be uploaded to your account, so you may also find the fingerprint under Settings -> Security. Finally, if you don't have an SSH key, create one now .","title":"SSH Fingerprints"},{"location":"flatcar-linux/digitalocean/#optional","text":"Name Description Default Example controller_count Number of controllers (i.e. masters) 1 1 worker_count Number of workers 1 3 controller_type Droplet type for controllers \"s-2vcpu-2gb\" s-2vcpu-2gb, s-2vcpu-4gb, s-4vcpu-8gb, ... worker_type Droplet type for workers \"s-1vcpu-2gb\" s-1vcpu-2gb, s-2vcpu-2gb, ... controller_snippets Controller Container Linux Config snippets [] example worker_snippets Worker Container Linux Config snippets [] example networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" Check the list of valid droplet types or use doctl compute size list . Warning Do not choose a controller_type smaller than 2GB. Smaller droplets are not sufficient for running a controller and bootstrapping will fail.","title":"Optional"},{"location":"flatcar-linux/google-cloud/","text":"Google Cloud \u00b6 In this tutorial, we'll create a Kubernetes v1.24.3 cluster on Google Compute Engine with Flatcar Linux. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster. Requirements \u00b6 Google Cloud Account and Service Account Google Cloud DNS Zone (registered Domain Name or delegated subdomain) Terraform v0.13.0+ Terraform Setup \u00b6 Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters Provider \u00b6 Login to your Google Console API Manager and select a project, or signup if you don't have an account. Select \"Credentials\" and create a service account key. Choose the \"Compute Engine Admin\" and \"DNS Administrator\" roles and save the JSON private key to a file that can be referenced in configs. mv ~/Downloads/project-id-43048204.json ~/.config/google-cloud/terraform.json Configure the Google Cloud provider to use your service account key, project-id, and region in a providers.tf file. provider \"google\" { project = \"project-id\" region = \"us-central1\" credentials = file ( \"~/.config/google-cloud/terraform.json\" ) } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } google = { source = \"hashicorp/google\" version = \"4.29.0\" } } } Additional configuration options are described in the google provider docs . Tip Regions are listed in docs or with gcloud compute regions list . A project may contain multiple clusters across different regions. Cluster \u00b6 Define a Kubernetes cluster using the module google-cloud/flatcar-linux/kubernetes . module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/flatcar-linux/kubernetes?ref=v1.24.3\" # Google Cloud cluster_name = \"yavin\" region = \"us-central1\" dns_zone = \"example.com\" dns_zone_name = \"example-zone\" # configuration ssh_authorized_key = \"ssh-rsa AAAAB3Nz...\" # optional worker_count = 2 } Reference the variables docs or the variables.tf source. ssh-agent \u00b6 Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_rsa ssh-add -L Apply \u00b6 Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 64 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply module.yavin.null_resource.bootstrap: Still creating... ( 10s elapsed ) ... module.yavin.null_resource.bootstrap: Still creating... ( 5m30s elapsed ) module.yavin.null_resource.bootstrap: Still creating... ( 5m40s elapsed ) module.yavin.null_resource.bootstrap: Creation complete ( ID: 5768638456220583358 ) Apply complete! Resources: 62 added, 0 changed, 0 destroyed. In 4-8 minutes, the Kubernetes cluster will be ready. Verify \u00b6 Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-yavin\" { content = module.yavin.kubeconfig-admin filename = \"/home/user/.kube/configs/yavin-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/yavin-config $ kubectl get nodes NAME ROLES STATUS AGE VERSION yavin-controller-0.c.example-com.internal <none> Ready 6m v1.24.3 yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.24.3 yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-1cs8z 2/2 Running 0 6m kube-system calico-node-d1l5b 2/2 Running 0 6m kube-system calico-node-sp9ps 2/2 Running 0 6m kube-system coredns-1187388186-dkh3o 1/1 Running 0 6m kube-system coredns-1187388186-zj5dl 1/1 Running 0 6m kube-system kube-apiserver-controller-0 1/1 Running 0 6m kube-system kube-controller-manager-controller-0 1/1 Running 0 6m kube-system kube-proxy-117v6 1/1 Running 0 6m kube-system kube-proxy-9886n 1/1 Running 0 6m kube-system kube-proxy-njn47 1/1 Running 0 6m kube-system kube-scheduler-controller-0 1/1 Running 0 6m Going Further \u00b6 Learn about maintenance and addons . Variables \u00b6 Check the variables.tf source. Required \u00b6 Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"yavin\" region Google Cloud region \"us-central1\" dns_zone Google Cloud DNS zone \"google-cloud.example.com\" dns_zone_name Google Cloud DNS zone name \"example-zone\" ssh_authorized_key SSH public key for user 'core' \"ssh-rsa AAAAB3NZ...\" Check the list of valid regions and list Container Linux images with gcloud compute images list | grep coreos . DNS Zone \u00b6 Clusters create a DNS A record ${cluster_name}.${dns_zone} to resolve a TCP proxy load balancer backed by controller instances. This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at yavin.google-cloud.example.com . You'll need a registered domain name or delegated subdomain on Google Cloud DNS. You can set this up once and create many clusters with unique names. resource \"google_dns_managed_zone\" \"zone-for-clusters\" { dns_name = \"google-cloud.example.com.\" name = \"example-zone\" description = \"Production DNS zone\" } If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Google Cloud (e.g. google-cloud.mydomain.com) and update nameservers . Optional \u00b6 Name Description Default Example controller_count Number of controllers (i.e. masters) 1 3 worker_count Number of workers 1 3 controller_type Machine type for controllers \"n1-standard-1\" See below worker_type Machine type for workers \"n1-standard-1\" See below os_image Flatcar Linux image for compute instances \"flatcar-stable\" flatcar-stable, flatcar-beta, flatcar-alpha disk_size Size of the disk in GB 30 100 worker_preemptible If enabled, Compute Engine will terminate workers randomly within 24 hours false true controller_snippets Controller Container Linux Config snippets [] example worker_snippets Worker Container Linux Config snippets [] example networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" worker_node_labels List of initial worker node labels [] [\"worker-pool=default\"] Check the list of valid machine types . Preemption \u00b6 Add worker_preemptible = \"true\" to allow worker nodes to be preempted at random, but pay significantly less. Clusters tolerate stopping instances fairly well (reschedules pods, but cannot drain) and preemption provides a nice reward for running fault-tolerant cluster systems.`","title":"Google Cloud"},{"location":"flatcar-linux/google-cloud/#google-cloud","text":"In this tutorial, we'll create a Kubernetes v1.24.3 cluster on Google Compute Engine with Flatcar Linux. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets. Controller hosts are provisioned to run an etcd-member peer and a kubelet service. Worker hosts run a kubelet service. Controller nodes run kube-apiserver , kube-scheduler , kube-controller-manager , and coredns , while kube-proxy and calico (or flannel ) run on every node. A generated kubeconfig provides kubectl access to the cluster.","title":"Google Cloud"},{"location":"flatcar-linux/google-cloud/#requirements","text":"Google Cloud Account and Service Account Google Cloud DNS Zone (registered Domain Name or delegated subdomain) Terraform v0.13.0+","title":"Requirements"},{"location":"flatcar-linux/google-cloud/#terraform-setup","text":"Install Terraform v0.13.0+ on your system. $ terraform version Terraform v1.0.0 Read concepts to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. infra ). cd infra/clusters","title":"Terraform Setup"},{"location":"flatcar-linux/google-cloud/#provider","text":"Login to your Google Console API Manager and select a project, or signup if you don't have an account. Select \"Credentials\" and create a service account key. Choose the \"Compute Engine Admin\" and \"DNS Administrator\" roles and save the JSON private key to a file that can be referenced in configs. mv ~/Downloads/project-id-43048204.json ~/.config/google-cloud/terraform.json Configure the Google Cloud provider to use your service account key, project-id, and region in a providers.tf file. provider \"google\" { project = \"project-id\" region = \"us-central1\" credentials = file ( \"~/.config/google-cloud/terraform.json\" ) } provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" version = \"0.10.0\" } google = { source = \"hashicorp/google\" version = \"4.29.0\" } } } Additional configuration options are described in the google provider docs . Tip Regions are listed in docs or with gcloud compute regions list . A project may contain multiple clusters across different regions.","title":"Provider"},{"location":"flatcar-linux/google-cloud/#cluster","text":"Define a Kubernetes cluster using the module google-cloud/flatcar-linux/kubernetes . module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/flatcar-linux/kubernetes?ref=v1.24.3\" # Google Cloud cluster_name = \"yavin\" region = \"us-central1\" dns_zone = \"example.com\" dns_zone_name = \"example-zone\" # configuration ssh_authorized_key = \"ssh-rsa AAAAB3Nz...\" # optional worker_count = 2 } Reference the variables docs or the variables.tf source.","title":"Cluster"},{"location":"flatcar-linux/google-cloud/#ssh-agent","text":"Initial bootstrapping requires bootstrap.service be started on one controller node. Terraform uses ssh-agent to automate this step. Add your SSH private key to ssh-agent . ssh-add ~/.ssh/id_rsa ssh-add -L","title":"ssh-agent"},{"location":"flatcar-linux/google-cloud/#apply","text":"Initialize the config directory if this is the first use with Terraform. terraform init Plan the resources to be created. $ terraform plan Plan: 64 to add, 0 to change, 0 to destroy. Apply the changes to create the cluster. $ terraform apply module.yavin.null_resource.bootstrap: Still creating... ( 10s elapsed ) ... module.yavin.null_resource.bootstrap: Still creating... ( 5m30s elapsed ) module.yavin.null_resource.bootstrap: Still creating... ( 5m40s elapsed ) module.yavin.null_resource.bootstrap: Creation complete ( ID: 5768638456220583358 ) Apply complete! Resources: 62 added, 0 changed, 0 destroyed. In 4-8 minutes, the Kubernetes cluster will be ready.","title":"Apply"},{"location":"flatcar-linux/google-cloud/#verify","text":"Install kubectl on your system. Obtain the generated cluster kubeconfig from module outputs (e.g. write to a local file). resource \"local_file\" \"kubeconfig-yavin\" { content = module.yavin.kubeconfig-admin filename = \"/home/user/.kube/configs/yavin-config\" } List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/yavin-config $ kubectl get nodes NAME ROLES STATUS AGE VERSION yavin-controller-0.c.example-com.internal <none> Ready 6m v1.24.3 yavin-worker-jrbf.c.example-com.internal <none> Ready 5m v1.24.3 yavin-worker-mzdm.c.example-com.internal <none> Ready 5m v1.24.3 List the pods. $ kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-node-1cs8z 2/2 Running 0 6m kube-system calico-node-d1l5b 2/2 Running 0 6m kube-system calico-node-sp9ps 2/2 Running 0 6m kube-system coredns-1187388186-dkh3o 1/1 Running 0 6m kube-system coredns-1187388186-zj5dl 1/1 Running 0 6m kube-system kube-apiserver-controller-0 1/1 Running 0 6m kube-system kube-controller-manager-controller-0 1/1 Running 0 6m kube-system kube-proxy-117v6 1/1 Running 0 6m kube-system kube-proxy-9886n 1/1 Running 0 6m kube-system kube-proxy-njn47 1/1 Running 0 6m kube-system kube-scheduler-controller-0 1/1 Running 0 6m","title":"Verify"},{"location":"flatcar-linux/google-cloud/#going-further","text":"Learn about maintenance and addons .","title":"Going Further"},{"location":"flatcar-linux/google-cloud/#variables","text":"Check the variables.tf source.","title":"Variables"},{"location":"flatcar-linux/google-cloud/#required","text":"Name Description Example cluster_name Unique cluster name (prepended to dns_zone) \"yavin\" region Google Cloud region \"us-central1\" dns_zone Google Cloud DNS zone \"google-cloud.example.com\" dns_zone_name Google Cloud DNS zone name \"example-zone\" ssh_authorized_key SSH public key for user 'core' \"ssh-rsa AAAAB3NZ...\" Check the list of valid regions and list Container Linux images with gcloud compute images list | grep coreos .","title":"Required"},{"location":"flatcar-linux/google-cloud/#dns-zone","text":"Clusters create a DNS A record ${cluster_name}.${dns_zone} to resolve a TCP proxy load balancer backed by controller instances. This FQDN is used by workers and kubectl to access the apiserver(s). In this example, the cluster's apiserver would be accessible at yavin.google-cloud.example.com . You'll need a registered domain name or delegated subdomain on Google Cloud DNS. You can set this up once and create many clusters with unique names. resource \"google_dns_managed_zone\" \"zone-for-clusters\" { dns_name = \"google-cloud.example.com.\" name = \"example-zone\" description = \"Production DNS zone\" } If you have an existing domain name with a zone file elsewhere, just delegate a subdomain that can be managed on Google Cloud (e.g. google-cloud.mydomain.com) and update nameservers .","title":"DNS Zone"},{"location":"flatcar-linux/google-cloud/#optional","text":"Name Description Default Example controller_count Number of controllers (i.e. masters) 1 3 worker_count Number of workers 1 3 controller_type Machine type for controllers \"n1-standard-1\" See below worker_type Machine type for workers \"n1-standard-1\" See below os_image Flatcar Linux image for compute instances \"flatcar-stable\" flatcar-stable, flatcar-beta, flatcar-alpha disk_size Size of the disk in GB 30 100 worker_preemptible If enabled, Compute Engine will terminate workers randomly within 24 hours false true controller_snippets Controller Container Linux Config snippets [] example worker_snippets Worker Container Linux Config snippets [] example networking Choice of networking provider \"cilium\" \"calico\" or \"cilium\" or \"flannel\" pod_cidr CIDR IPv4 range to assign to Kubernetes pods \"10.2.0.0/16\" \"10.22.0.0/16\" service_cidr CIDR IPv4 range to assign to Kubernetes services \"10.3.0.0/16\" \"10.3.0.0/24\" worker_node_labels List of initial worker node labels [] [\"worker-pool=default\"] Check the list of valid machine types .","title":"Optional"},{"location":"flatcar-linux/google-cloud/#preemption","text":"Add worker_preemptible = \"true\" to allow worker nodes to be preempted at random, but pay significantly less. Clusters tolerate stopping instances fairly well (reschedules pods, but cannot drain) and preemption provides a nice reward for running fault-tolerant cluster systems.`","title":"Preemption"},{"location":"topics/faq/","text":"FAQ \u00b6 Terraform \u00b6 Typhoon provides a Terraform Module for each supported operating system and platform. Terraform is considered a format detail, much like a Linux distro might provide images in the qcow2 or ISO format. It is a mechanism for sharing Typhoon in a way that works for many users. Formats rise and evolve. Typhoon may choose to adapt the format over time (with lots of forewarning). However, the authors' have built several Kubernetes \"distros\" before and learned from mistakes - Terraform modules are the right format for now. Security Issues \u00b6 If you find security issues, please see security disclosures . Maintainers \u00b6 Typhoon clusters are Kubernetes clusters the maintainers use in real-world, production clusters. Maintainers must personally operate a bare-metal and cloud provider cluster and strive to exercise it in real-world scenarios We merge features that are along the \"blessed path\". We minimize options to reduce complexity and matrix size. We remove outdated materials to reduce sprawl. \"Skate where the puck is going\", but also \"wait until the fit is right\". No is temporary, yes is forever.","title":"FAQ"},{"location":"topics/faq/#faq","text":"","title":"FAQ"},{"location":"topics/faq/#terraform","text":"Typhoon provides a Terraform Module for each supported operating system and platform. Terraform is considered a format detail, much like a Linux distro might provide images in the qcow2 or ISO format. It is a mechanism for sharing Typhoon in a way that works for many users. Formats rise and evolve. Typhoon may choose to adapt the format over time (with lots of forewarning). However, the authors' have built several Kubernetes \"distros\" before and learned from mistakes - Terraform modules are the right format for now.","title":"Terraform"},{"location":"topics/faq/#security-issues","text":"If you find security issues, please see security disclosures .","title":"Security Issues"},{"location":"topics/faq/#maintainers","text":"Typhoon clusters are Kubernetes clusters the maintainers use in real-world, production clusters. Maintainers must personally operate a bare-metal and cloud provider cluster and strive to exercise it in real-world scenarios We merge features that are along the \"blessed path\". We minimize options to reduce complexity and matrix size. We remove outdated materials to reduce sprawl. \"Skate where the puck is going\", but also \"wait until the fit is right\". No is temporary, yes is forever.","title":"Maintainers"},{"location":"topics/hardware/","text":"Hardware \u00b6 Typhoon ensures certain networking hardware integrates well with bare-metal Kubernetes. Ubiquiti \u00b6 Ubiquiti EdgeRouters and EdgeOS work well with bare-metal Kubernetes clusters. Familiarity with EdgeRouter setup and CLI usage is required. DHCP \u00b6 Assign static IPs to clients with known MAC addresses. This is called a static mapping by EdgeOS. Configure the router with the commands based on region inventory. configure show service dhcp-server shared-network set service dhcp-server shared-network-name LAN subnet SUBNET static-mapping NAME mac-address MACADDR set service dhcp-server shared-network-name LAN subnet SUBNET static-mapping NAME ip-address 10.0.0.20 DNS \u00b6 Add DNS A records to static IPs as dnsmasq host-records. configure set service dns forwarding options host-record=node.example.com,10.0.0.20 Forward *.svc.cluster.local queries to the CoreDNS Kubernetes service IP to allow clients to resolve Kubernetes services. set service dns forwarding options server=/svc.cluster.local/10.3.0.10 commit-confirm Restart dnsmasq . sudo /etc/init.d/dnsmasq restart PXE \u00b6 Ubiquiti EdgeRouters can provide a PXE-enabled network boot environment for client machines. ISC DHCP \u00b6 With ISC DHCP, add a subnet parameter to the LAN DHCP server to include an ISC DHCP config file. configure show service dhcp-server shared-network-name NAME subnet SUBNET set service dhcp-server shared-network-name NAME subnet SUBNET subnet-parameters \"include &quot;/config/scripts/ipxe.conf&quot;;\" commit-confirm Switch to root (i.e. sudo -i ) and write the ISC DHCP config /config/scripts/ipxe.conf . iPXE client machines will chainload to matchbox.example.com , while non-iPXE clients will chainload to undionly.kpxe (requires TFTP). allow bootp; allow booting; next-server ADD_ROUTER_IP_HERE; if exists user-class and option user-class = \"iPXE\" { filename \"http://matchbox.example.com/boot.ipxe\"; } else { filename \"undionly.kpxe\"; } dnsmasq \u00b6 With dnsmasq for DHCP, add options to chainload PXE clients to iPXE undionly.kpxe (requires TFTP), tag iPXE clients, and chainload iPXE clients to matchbox.example.com . set service dns forwarding options 'dhcp-userclass=set:ipxe,iPXE' set service dns forwarding options 'pxe-service=tag:#ipxe,x86PC,PXE chainload to iPXE,undionly.kpxe' set service dns forwarding options 'pxe-service=tag:ipxe,x86PC,iPXE,http://matchbox.example.com/boot.ipxe' TFTP \u00b6 Use dnsmasq as a TFTP server to serve undionly.kpxe . Compiling from source with TLS support is strongly recommended. If you use a pre-compiled copy, you must set download_protocol = \"http\" in your cluster definition (discouraged). sudo -i mkdir /config/tftpboot && cd /config/tftpboot curl http://boot.ipxe.org/undionly.kpxe -o undionly.kpxe Add dnsmasq command line options to enable the TFTP file server. configure show service dns forwarding set service dns forwarding options enable-tftp set service dns forwarding options tftp-root=/config/tftpboot commit-confirm Routing \u00b6 Static Routes \u00b6 Add static route(s) to Kubernetes node(s) that can route to Kubernetes service IPs (default: 10.3.0.0/16). Kubernetes service IPs will become routeable on the LAN. configure show protocols static route set protocols static route 10.3.0.0/16 next-hop NODE_IP commit-confirm Note Adding multiple next-hop nodes provides equal-cost multi-path (ECMP) routing. EdgeOS v2.0+ is required. The kernel in prior versions used flow-hash to balanced packets, whereas with v2.0, round-robin sessions are used. BGP \u00b6 EdgeRouter can exchange routes with other autonomous systems, including a cluster's Calico AS. Peers will exchange podCIDR routes to make individual pods routeable on the LAN. Define the EdgeRouter AS (if undefined). configure show protocols bgp 1 set protocols bgp 1 parameters router-id ROUTER_IP Peer with node(s) in another AS (eg. Calico default 64512) set protocols bgp 1 neighbor NODE1_IP remote-as 64512 set protocols bgp 1 neighbor NODE2_IP remote-as 64512 set protocols bgp 1 neighbor NODE3_IP remote-as 64512 commit-confirm Configure Calico node(s) as to peer with the EdgeRouter. apiVersion: crd.projectcalico.org/v1 kind: BGPPeer metadata: name: NODE_NAME-to-edgerouter spec: peerIP: ROUTER_IP asNumber: 1 node: NODE_NAME Or, if every node is to be peered (i.e. full mesh), define a global BGPPeer. apiVersion: crd.projectcalico.org/v1 kind: BGPPeer metadata: name: global spec: peerIP: ROUTER_IP asNumber: 1 If Calico nodes should advertise Kubernetes Service IPs (i.e. ClusterIPs) as well, add a BGPConfiguration . apiVersion: crd.projectcalico.org/v1 kind: BGPConfiguration metadata: name: default spec: logSeverityScreen: Info nodeToNodeMeshEnabled: true serviceClusterIPs: - cidr: 10.3.0.0/16 Show a summary of peers and exchanged routes. show ip bgp summary show ip route bgp Port Forwarding \u00b6 Expose the Ingress Controller by adding port-forward rules that DNAT a port on the router's WAN interface to an internal IP and port. By convention, a public Ingress controller is assigned a fixed service IP (e.g. 10.3.0.12). configure set port-forward wan-interface eth0 set port-forward lan-interface eth1 set port-forward auto-firewall enable set port-forward hairpin-nat enable set port-forward rule 1 description 'ingress http' set port-forward rule 1 forward-to address 10.3.0.12 set port-forward rule 1 forward-to port 80 set port-forward rule 1 original-port 80 set port-forward rule 1 protocol tcp_udp set port-forward rule 2 description 'ingress https' set port-forward rule 2 forward-to address 10.3.0.12 set port-forward rule 2 forward-to port 443 set port-forward rule 2 original-port 443 set port-forward rule 2 protocol tcp_udp commit-confirm Web UI \u00b6 The web UI is often accessible from the LAN on ports 80/443 by default. Edit the ports to 8080 and 4443 to avoid a conflict. configure show service gui set service gui http-port 8080 set service gui https-port 4443 commit-confirm","title":"Hardware"},{"location":"topics/hardware/#hardware","text":"Typhoon ensures certain networking hardware integrates well with bare-metal Kubernetes.","title":"Hardware"},{"location":"topics/hardware/#ubiquiti","text":"Ubiquiti EdgeRouters and EdgeOS work well with bare-metal Kubernetes clusters. Familiarity with EdgeRouter setup and CLI usage is required.","title":"Ubiquiti"},{"location":"topics/hardware/#dhcp","text":"Assign static IPs to clients with known MAC addresses. This is called a static mapping by EdgeOS. Configure the router with the commands based on region inventory. configure show service dhcp-server shared-network set service dhcp-server shared-network-name LAN subnet SUBNET static-mapping NAME mac-address MACADDR set service dhcp-server shared-network-name LAN subnet SUBNET static-mapping NAME ip-address 10.0.0.20","title":"DHCP"},{"location":"topics/hardware/#dns","text":"Add DNS A records to static IPs as dnsmasq host-records. configure set service dns forwarding options host-record=node.example.com,10.0.0.20 Forward *.svc.cluster.local queries to the CoreDNS Kubernetes service IP to allow clients to resolve Kubernetes services. set service dns forwarding options server=/svc.cluster.local/10.3.0.10 commit-confirm Restart dnsmasq . sudo /etc/init.d/dnsmasq restart","title":"DNS"},{"location":"topics/hardware/#pxe","text":"Ubiquiti EdgeRouters can provide a PXE-enabled network boot environment for client machines.","title":"PXE"},{"location":"topics/hardware/#isc-dhcp","text":"With ISC DHCP, add a subnet parameter to the LAN DHCP server to include an ISC DHCP config file. configure show service dhcp-server shared-network-name NAME subnet SUBNET set service dhcp-server shared-network-name NAME subnet SUBNET subnet-parameters \"include &quot;/config/scripts/ipxe.conf&quot;;\" commit-confirm Switch to root (i.e. sudo -i ) and write the ISC DHCP config /config/scripts/ipxe.conf . iPXE client machines will chainload to matchbox.example.com , while non-iPXE clients will chainload to undionly.kpxe (requires TFTP). allow bootp; allow booting; next-server ADD_ROUTER_IP_HERE; if exists user-class and option user-class = \"iPXE\" { filename \"http://matchbox.example.com/boot.ipxe\"; } else { filename \"undionly.kpxe\"; }","title":"ISC DHCP"},{"location":"topics/hardware/#dnsmasq","text":"With dnsmasq for DHCP, add options to chainload PXE clients to iPXE undionly.kpxe (requires TFTP), tag iPXE clients, and chainload iPXE clients to matchbox.example.com . set service dns forwarding options 'dhcp-userclass=set:ipxe,iPXE' set service dns forwarding options 'pxe-service=tag:#ipxe,x86PC,PXE chainload to iPXE,undionly.kpxe' set service dns forwarding options 'pxe-service=tag:ipxe,x86PC,iPXE,http://matchbox.example.com/boot.ipxe'","title":"dnsmasq"},{"location":"topics/hardware/#tftp","text":"Use dnsmasq as a TFTP server to serve undionly.kpxe . Compiling from source with TLS support is strongly recommended. If you use a pre-compiled copy, you must set download_protocol = \"http\" in your cluster definition (discouraged). sudo -i mkdir /config/tftpboot && cd /config/tftpboot curl http://boot.ipxe.org/undionly.kpxe -o undionly.kpxe Add dnsmasq command line options to enable the TFTP file server. configure show service dns forwarding set service dns forwarding options enable-tftp set service dns forwarding options tftp-root=/config/tftpboot commit-confirm","title":"TFTP"},{"location":"topics/hardware/#routing","text":"","title":"Routing"},{"location":"topics/hardware/#static-routes","text":"Add static route(s) to Kubernetes node(s) that can route to Kubernetes service IPs (default: 10.3.0.0/16). Kubernetes service IPs will become routeable on the LAN. configure show protocols static route set protocols static route 10.3.0.0/16 next-hop NODE_IP commit-confirm Note Adding multiple next-hop nodes provides equal-cost multi-path (ECMP) routing. EdgeOS v2.0+ is required. The kernel in prior versions used flow-hash to balanced packets, whereas with v2.0, round-robin sessions are used.","title":"Static Routes"},{"location":"topics/hardware/#bgp","text":"EdgeRouter can exchange routes with other autonomous systems, including a cluster's Calico AS. Peers will exchange podCIDR routes to make individual pods routeable on the LAN. Define the EdgeRouter AS (if undefined). configure show protocols bgp 1 set protocols bgp 1 parameters router-id ROUTER_IP Peer with node(s) in another AS (eg. Calico default 64512) set protocols bgp 1 neighbor NODE1_IP remote-as 64512 set protocols bgp 1 neighbor NODE2_IP remote-as 64512 set protocols bgp 1 neighbor NODE3_IP remote-as 64512 commit-confirm Configure Calico node(s) as to peer with the EdgeRouter. apiVersion: crd.projectcalico.org/v1 kind: BGPPeer metadata: name: NODE_NAME-to-edgerouter spec: peerIP: ROUTER_IP asNumber: 1 node: NODE_NAME Or, if every node is to be peered (i.e. full mesh), define a global BGPPeer. apiVersion: crd.projectcalico.org/v1 kind: BGPPeer metadata: name: global spec: peerIP: ROUTER_IP asNumber: 1 If Calico nodes should advertise Kubernetes Service IPs (i.e. ClusterIPs) as well, add a BGPConfiguration . apiVersion: crd.projectcalico.org/v1 kind: BGPConfiguration metadata: name: default spec: logSeverityScreen: Info nodeToNodeMeshEnabled: true serviceClusterIPs: - cidr: 10.3.0.0/16 Show a summary of peers and exchanged routes. show ip bgp summary show ip route bgp","title":"BGP"},{"location":"topics/hardware/#port-forwarding","text":"Expose the Ingress Controller by adding port-forward rules that DNAT a port on the router's WAN interface to an internal IP and port. By convention, a public Ingress controller is assigned a fixed service IP (e.g. 10.3.0.12). configure set port-forward wan-interface eth0 set port-forward lan-interface eth1 set port-forward auto-firewall enable set port-forward hairpin-nat enable set port-forward rule 1 description 'ingress http' set port-forward rule 1 forward-to address 10.3.0.12 set port-forward rule 1 forward-to port 80 set port-forward rule 1 original-port 80 set port-forward rule 1 protocol tcp_udp set port-forward rule 2 description 'ingress https' set port-forward rule 2 forward-to address 10.3.0.12 set port-forward rule 2 forward-to port 443 set port-forward rule 2 original-port 443 set port-forward rule 2 protocol tcp_udp commit-confirm","title":"Port Forwarding"},{"location":"topics/hardware/#web-ui","text":"The web UI is often accessible from the LAN on ports 80/443 by default. Edit the ports to 8080 and 4443 to avoid a conflict. configure show service gui set service gui http-port 8080 set service gui https-port 4443 commit-confirm","title":"Web UI"},{"location":"topics/maintenance/","text":"Maintenance \u00b6 Best Practices \u00b6 Run multiple Kubernetes clusters. Run across platforms. Plan for regional and cloud outages. Require applications be platform agnostic. Moving an application between a Kubernetes AWS cluster and a Kubernetes bare-metal cluster should be normal. Strive to make single-cluster outages tolerable. Practice performing failovers. Strive to make single-cluster outages a non-event. Load balance applications between multiple clusters, automate failover behaviors, and adjust alerting behaviors. Versioning \u00b6 Typhoon provides tagged releases to allow clusters to be versioned using ordinary Terraform configs. module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.24.3\" ... } module \"mercury\" { source = \"git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes?ref=v1.24.3\" ... } Master is updated regularly, so it is recommended to pin modules to a release tag or commit hash. Pinning ensures terraform get --update only fetches the desired version. Upgrades \u00b6 Typhoon recommends upgrading clusters using a blue-green replacement strategy and migrating workloads. Launch new (candidate) clusters from tagged releases Apply workloads from existing cluster(s) Evaluate application health and performance Migrate application traffic to the new cluster Compare metrics and delete old cluster when ready Blue-green replacement reduces risk for clusters running critical applications. Candidate clusters allow baseline properties of clusters to be assessed (e.g. pod-to-pod bandwidth). Applying application workloads allows health to be assessed before being subjected to traffic (e.g. detect any changes in Kubernetes behavior between versions). Migration to the new cluster can be controlled according to requirements. Migration may mean updating DNS records to resolve the new cluster's ingress or may involve a load balancer gradually shifting traffic to the new cluster \"backend\". Retain the old cluster for a time to compare metrics or for fallback if issues arise. Blue-green replacement provides some subtler benefits as well: Encourages investment in tooling for traffic migration and failovers. When a cluster incident arises, shifting applications to a healthy cluster will be second nature. Discourages reliance on in-place opaque state. Retain confidence in your ability to create infrastructure from scratch. Allows Typhoon to make architecture changes between releases and eases the burden on Typhoon maintainers. By contrast, distros promising in-place upgrades get stuck with their mistakes or require complex and error-prone migrations. Bare-Metal \u00b6 Typhoon bare-metal clusters are provisioned by a PXE-enabled network boot environment and a Matchbox service. To upgrade, re-provision machines into a new cluster. Failover application workloads to another cluster (varies). kubectl config use-context other-context kubectl apply -f mercury -R # DNS or load balancer changes Power off bare-metal machines and set their next boot device to PXE. ipmitool -H node1.example.com -U USER -P PASS power off ipmitool -H node1.example.com -U USER -P PASS chassis bootdev pxe Delete or comment the Terraform config for the cluster. - module \"mercury\" { - source = \"git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes\" - ... -} Apply to delete old provisioning configs from Matchbox. $ terraform apply Apply complete! Resources: 0 added, 0 changed, 55 destroyed. Re-provision a new cluster by following the bare-metal tutorial . Cloud \u00b6 Create a new cluster following the tutorials. Failover application workloads to the new cluster (varies). kubectl config use-context other-context kubectl apply -f mercury -R # DNS or load balancer changes Once you're confident in the new cluster, delete the Terraform config for the old cluster. - module \"yavin\" { - source = \"git::https://github.com/poseidon/typhoon//google-cloud/flatcar-linux/kubernetes\" - ... -} Apply to delete the cluster. $ terraform apply Apply complete! Resources: 0 added, 0 changed, 55 destroyed. Alternatives \u00b6 In-place Edits \u00b6 Typhoon uses a static pod Kubernetes control plane which allows certain manifest upgrades to be performed in-place. Components like kube-apiserver , kube-controller-manager , and kube-scheduler are run as static pods. Components flannel / calico , coredns , and kube-proxy are scheduled on Kubernetes and can be edited via kubectl . In certain scenarios, in-place edits can be useful for quickly rolling out security patches (e.g. bumping coredns ) or prioritizing speed over the safety of a proper cluster re-provision and transition. Note Rarely, we may test certain security in-place edits and mention them as an option in release notes. Warning Typhoon does not support or document in-place edits as an upgrade strategy. They involve inherent risks and we choose not to make recommendations or guarentees about the safety of different in-place upgrades. Its explicitly a non-goal. Node Replacement \u00b6 Typhoon supports multi-controller clusters, so it is possible to upgrade a cluster by deleting and replacing nodes one by one. Warning Typhoon does not support or document node replacement as an upgrade strategy. It limits Typhoon's ability to make infrastructure and architectural changes between tagged releases. Upgrade terraform-provider-ct \u00b6 The terraform-provider-ct plugin parses, validates, and converts Fedora CoreOS or Flatcar Linux Configs into Ignition user-data for provisioning instances. Since Typhoon v1.12.2+, the plugin can be updated in-place so that on apply, only workers will be replaced. Update the version of the ct plugin in each Terraform working directory. Typhoon clusters managed in the working directory must be v1.12.2 or higher. provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" - version = \"0.8.0\" + version = \"0.9.0\" } ... } } Run init and plan to check that no diff is proposed for the controller nodes (a diff would destroy cluster state). terraform init terraform plan Apply the change. Worker nodes' user-data will be changed and workers will be replaced. Rollout happens slightly differently on each platform: AWS \u00b6 AWS creates a new worker ASG, then removes the old ASG. New workers join the cluster and old workers disappear. terraform apply will hang during this process. Azure \u00b6 Azure edits the worker scale set in-place instantly. Manually terminate workers to create replacement workers using the new user-data. Bare-Metal \u00b6 No action is needed. Bare-Metal machines do not re-PXE unless explicitly made to do so. DigitalOcean \u00b6 DigitalOcean destroys existing worker nodes and DNS records, then creates new workers and DNS records. DigitalOcean lacks a \"managed group\" notion. For worker droplets to join the cluster, you must taint the secret copying step to indicate it must be repeated to add the kubeconfig to new workers. # old workers destroyed, new workers created terraform apply # add kubeconfig to new workers terraform state list | grep null_resource terraform taint module.nemo.null_resource.copy-worker-secrets[N] terraform apply Expect downtime. Google Cloud \u00b6 Google Cloud creates a new worker template and edits the worker instance group instantly. Manually terminate workers and replacement workers will use the user-data. Terraform Versions \u00b6 Terraform v0.13 introduced major changes to the provider plugin system. Terraform init can automatically install both hashicorp and poseidon provider plugins, eliminating the need to manually install plugin binaries. Typhoon modules have been updated for v0.13.x. Poseidon publishes providers to the Terraform Provider Registry for usage with v0.13+. Typhoon Release Terraform version v1.21.2 - ? v0.13.x, v0.14.4+, v0.15.x, v1.0.x v1.21.1 - v1.21.1 v0.13.x, v0.14.4+, v0.15.x v1.20.2 - v1.21.0 v0.13.x, v0.14.4+ v1.20.0 - v1.20.2 v0.13.x v1.18.8 - v1.19.4 v0.12.26+, v0.13.x v1.15.0 - v1.18.8 v0.12.x v1.10.3 - v1.15.0 v0.11.x v1.9.2 - v1.10.2 v0.10.4+ or v0.11.x v1.7.3 - v1.9.1 v0.10.x v1.6.4 - v1.7.2 v0.9.x","title":"Maintenance"},{"location":"topics/maintenance/#maintenance","text":"","title":"Maintenance"},{"location":"topics/maintenance/#best-practices","text":"Run multiple Kubernetes clusters. Run across platforms. Plan for regional and cloud outages. Require applications be platform agnostic. Moving an application between a Kubernetes AWS cluster and a Kubernetes bare-metal cluster should be normal. Strive to make single-cluster outages tolerable. Practice performing failovers. Strive to make single-cluster outages a non-event. Load balance applications between multiple clusters, automate failover behaviors, and adjust alerting behaviors.","title":"Best Practices"},{"location":"topics/maintenance/#versioning","text":"Typhoon provides tagged releases to allow clusters to be versioned using ordinary Terraform configs. module \"yavin\" { source = \"git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.24.3\" ... } module \"mercury\" { source = \"git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes?ref=v1.24.3\" ... } Master is updated regularly, so it is recommended to pin modules to a release tag or commit hash. Pinning ensures terraform get --update only fetches the desired version.","title":"Versioning"},{"location":"topics/maintenance/#upgrades","text":"Typhoon recommends upgrading clusters using a blue-green replacement strategy and migrating workloads. Launch new (candidate) clusters from tagged releases Apply workloads from existing cluster(s) Evaluate application health and performance Migrate application traffic to the new cluster Compare metrics and delete old cluster when ready Blue-green replacement reduces risk for clusters running critical applications. Candidate clusters allow baseline properties of clusters to be assessed (e.g. pod-to-pod bandwidth). Applying application workloads allows health to be assessed before being subjected to traffic (e.g. detect any changes in Kubernetes behavior between versions). Migration to the new cluster can be controlled according to requirements. Migration may mean updating DNS records to resolve the new cluster's ingress or may involve a load balancer gradually shifting traffic to the new cluster \"backend\". Retain the old cluster for a time to compare metrics or for fallback if issues arise. Blue-green replacement provides some subtler benefits as well: Encourages investment in tooling for traffic migration and failovers. When a cluster incident arises, shifting applications to a healthy cluster will be second nature. Discourages reliance on in-place opaque state. Retain confidence in your ability to create infrastructure from scratch. Allows Typhoon to make architecture changes between releases and eases the burden on Typhoon maintainers. By contrast, distros promising in-place upgrades get stuck with their mistakes or require complex and error-prone migrations.","title":"Upgrades"},{"location":"topics/maintenance/#bare-metal","text":"Typhoon bare-metal clusters are provisioned by a PXE-enabled network boot environment and a Matchbox service. To upgrade, re-provision machines into a new cluster. Failover application workloads to another cluster (varies). kubectl config use-context other-context kubectl apply -f mercury -R # DNS or load balancer changes Power off bare-metal machines and set their next boot device to PXE. ipmitool -H node1.example.com -U USER -P PASS power off ipmitool -H node1.example.com -U USER -P PASS chassis bootdev pxe Delete or comment the Terraform config for the cluster. - module \"mercury\" { - source = \"git::https://github.com/poseidon/typhoon//bare-metal/flatcar-linux/kubernetes\" - ... -} Apply to delete old provisioning configs from Matchbox. $ terraform apply Apply complete! Resources: 0 added, 0 changed, 55 destroyed. Re-provision a new cluster by following the bare-metal tutorial .","title":"Bare-Metal"},{"location":"topics/maintenance/#cloud","text":"Create a new cluster following the tutorials. Failover application workloads to the new cluster (varies). kubectl config use-context other-context kubectl apply -f mercury -R # DNS or load balancer changes Once you're confident in the new cluster, delete the Terraform config for the old cluster. - module \"yavin\" { - source = \"git::https://github.com/poseidon/typhoon//google-cloud/flatcar-linux/kubernetes\" - ... -} Apply to delete the cluster. $ terraform apply Apply complete! Resources: 0 added, 0 changed, 55 destroyed.","title":"Cloud"},{"location":"topics/maintenance/#alternatives","text":"","title":"Alternatives"},{"location":"topics/maintenance/#in-place-edits","text":"Typhoon uses a static pod Kubernetes control plane which allows certain manifest upgrades to be performed in-place. Components like kube-apiserver , kube-controller-manager , and kube-scheduler are run as static pods. Components flannel / calico , coredns , and kube-proxy are scheduled on Kubernetes and can be edited via kubectl . In certain scenarios, in-place edits can be useful for quickly rolling out security patches (e.g. bumping coredns ) or prioritizing speed over the safety of a proper cluster re-provision and transition. Note Rarely, we may test certain security in-place edits and mention them as an option in release notes. Warning Typhoon does not support or document in-place edits as an upgrade strategy. They involve inherent risks and we choose not to make recommendations or guarentees about the safety of different in-place upgrades. Its explicitly a non-goal.","title":"In-place Edits"},{"location":"topics/maintenance/#node-replacement","text":"Typhoon supports multi-controller clusters, so it is possible to upgrade a cluster by deleting and replacing nodes one by one. Warning Typhoon does not support or document node replacement as an upgrade strategy. It limits Typhoon's ability to make infrastructure and architectural changes between tagged releases.","title":"Node Replacement"},{"location":"topics/maintenance/#upgrade-terraform-provider-ct","text":"The terraform-provider-ct plugin parses, validates, and converts Fedora CoreOS or Flatcar Linux Configs into Ignition user-data for provisioning instances. Since Typhoon v1.12.2+, the plugin can be updated in-place so that on apply, only workers will be replaced. Update the version of the ct plugin in each Terraform working directory. Typhoon clusters managed in the working directory must be v1.12.2 or higher. provider \"ct\" {} terraform { required_providers { ct = { source = \"poseidon/ct\" - version = \"0.8.0\" + version = \"0.9.0\" } ... } } Run init and plan to check that no diff is proposed for the controller nodes (a diff would destroy cluster state). terraform init terraform plan Apply the change. Worker nodes' user-data will be changed and workers will be replaced. Rollout happens slightly differently on each platform:","title":"Upgrade terraform-provider-ct"},{"location":"topics/maintenance/#aws","text":"AWS creates a new worker ASG, then removes the old ASG. New workers join the cluster and old workers disappear. terraform apply will hang during this process.","title":"AWS"},{"location":"topics/maintenance/#azure","text":"Azure edits the worker scale set in-place instantly. Manually terminate workers to create replacement workers using the new user-data.","title":"Azure"},{"location":"topics/maintenance/#bare-metal_1","text":"No action is needed. Bare-Metal machines do not re-PXE unless explicitly made to do so.","title":"Bare-Metal"},{"location":"topics/maintenance/#digitalocean","text":"DigitalOcean destroys existing worker nodes and DNS records, then creates new workers and DNS records. DigitalOcean lacks a \"managed group\" notion. For worker droplets to join the cluster, you must taint the secret copying step to indicate it must be repeated to add the kubeconfig to new workers. # old workers destroyed, new workers created terraform apply # add kubeconfig to new workers terraform state list | grep null_resource terraform taint module.nemo.null_resource.copy-worker-secrets[N] terraform apply Expect downtime.","title":"DigitalOcean"},{"location":"topics/maintenance/#google-cloud","text":"Google Cloud creates a new worker template and edits the worker instance group instantly. Manually terminate workers and replacement workers will use the user-data.","title":"Google Cloud"},{"location":"topics/maintenance/#terraform-versions","text":"Terraform v0.13 introduced major changes to the provider plugin system. Terraform init can automatically install both hashicorp and poseidon provider plugins, eliminating the need to manually install plugin binaries. Typhoon modules have been updated for v0.13.x. Poseidon publishes providers to the Terraform Provider Registry for usage with v0.13+. Typhoon Release Terraform version v1.21.2 - ? v0.13.x, v0.14.4+, v0.15.x, v1.0.x v1.21.1 - v1.21.1 v0.13.x, v0.14.4+, v0.15.x v1.20.2 - v1.21.0 v0.13.x, v0.14.4+ v1.20.0 - v1.20.2 v0.13.x v1.18.8 - v1.19.4 v0.12.26+, v0.13.x v1.15.0 - v1.18.8 v0.12.x v1.10.3 - v1.15.0 v0.11.x v1.9.2 - v1.10.2 v0.10.4+ or v0.11.x v1.7.3 - v1.9.1 v0.10.x v1.6.4 - v1.7.2 v0.9.x","title":"Terraform Versions"},{"location":"topics/performance/","text":"Performance \u00b6 Provision Time \u00b6 Provisioning times vary based on the operating system and platform. Sampling the time to create (apply) and destroy clusters with 1 controller and 2 workers shows (roughly) what to expect. Platform Apply Destroy AWS 5 min 3 min Azure 10 min 7 min Bare-Metal 10-15 min NA Digital Ocean 3 min 30 sec 20 sec Google Cloud 8 min 5 min Notes: SOA TTL and NXDOMAIN caching can have a large impact on provision time Platforms with auto-scaling take more time to provision (AWS, Azure, Google) Bare-metal POST times and network bandwidth will affect provision times Network Performance \u00b6 Network performance varies based on the platform and CNI plugin. iperf was used to measure the bandwidth between different hosts and different pods. Host-to-host shows typical bandwidth between host machines. Pod-to-pod shows the bandwidth between two iperf containers. Platform / Plugin Theory Host to Host Pod to Pod AWS (flannel) 5 Gb/s 4.94 Gb/s 4.89 Gb/s AWS (calico, MTU 1480) 5 Gb/s 4.94 Gb/s 4.42 Gb/s AWS (calico, MTU 8981) 5 Gb/s 4.94 Gb/s 4.90 Gb/s Azure (flannel) Varies 749 Mb/s 650 Mb/s Azure (calico) Varies 749 Mb/s 650 Mb/s Bare-Metal (flannel) 1 Gb/s 940 Mb/s 903 Mb/s Bare-Metal (calico) 1 Gb/s 940 Mb/s 931 Mb/s Digital Ocean (flannel) Varies 1.97 Gb/s 1.20 Gb/s Digital Ocean (calico) Varies 1.97 Gb/s 1.20 Gb/s Google Cloud (flannel) 2 Gb/s 1.94 Gb/s 1.76 Gb/s Google Cloud (calico) 2 Gb/s 1.94 Gb/s 1.81 Gb/s Notes: Calico, Cilium, and Flannel have comparable performance. Platform and configuration differences dominate. Azure and DigitalOcean network performance can be quite variable or depend on machine type Only certain AWS EC2 instance types allow jumbo frames. This is why the default MTU on AWS must be 1480.","title":"Performance"},{"location":"topics/performance/#performance","text":"","title":"Performance"},{"location":"topics/performance/#provision-time","text":"Provisioning times vary based on the operating system and platform. Sampling the time to create (apply) and destroy clusters with 1 controller and 2 workers shows (roughly) what to expect. Platform Apply Destroy AWS 5 min 3 min Azure 10 min 7 min Bare-Metal 10-15 min NA Digital Ocean 3 min 30 sec 20 sec Google Cloud 8 min 5 min Notes: SOA TTL and NXDOMAIN caching can have a large impact on provision time Platforms with auto-scaling take more time to provision (AWS, Azure, Google) Bare-metal POST times and network bandwidth will affect provision times","title":"Provision Time"},{"location":"topics/performance/#network-performance","text":"Network performance varies based on the platform and CNI plugin. iperf was used to measure the bandwidth between different hosts and different pods. Host-to-host shows typical bandwidth between host machines. Pod-to-pod shows the bandwidth between two iperf containers. Platform / Plugin Theory Host to Host Pod to Pod AWS (flannel) 5 Gb/s 4.94 Gb/s 4.89 Gb/s AWS (calico, MTU 1480) 5 Gb/s 4.94 Gb/s 4.42 Gb/s AWS (calico, MTU 8981) 5 Gb/s 4.94 Gb/s 4.90 Gb/s Azure (flannel) Varies 749 Mb/s 650 Mb/s Azure (calico) Varies 749 Mb/s 650 Mb/s Bare-Metal (flannel) 1 Gb/s 940 Mb/s 903 Mb/s Bare-Metal (calico) 1 Gb/s 940 Mb/s 931 Mb/s Digital Ocean (flannel) Varies 1.97 Gb/s 1.20 Gb/s Digital Ocean (calico) Varies 1.97 Gb/s 1.20 Gb/s Google Cloud (flannel) 2 Gb/s 1.94 Gb/s 1.76 Gb/s Google Cloud (calico) 2 Gb/s 1.94 Gb/s 1.81 Gb/s Notes: Calico, Cilium, and Flannel have comparable performance. Platform and configuration differences dominate. Azure and DigitalOcean network performance can be quite variable or depend on machine type Only certain AWS EC2 instance types allow jumbo frames. This is why the default MTU on AWS must be 1480.","title":"Network Performance"},{"location":"topics/security/","text":"Security \u00b6 Typhoon aims to be minimal and secure. We're running it ourselves after all. Overview \u00b6 Kubernetes etcd with peer-to-peer and client-auth TLS Kubelets TLS bootstrap certificates (72 hours) Generated TLS certificate (365 days) for admin kubeconfig NodeRestriction is enabled to limit Kubelet authorization Role-Based Access Control is enabled. Apps must define RBAC policies for API access Workloads run on worker nodes only, unless they tolerate the master taint Kubernetes Network Policy and Calico NetworkPolicy support 1 Hosts Container Linux auto-updates are enabled Hosts limit logins to SSH key-based auth (user \"core\") SELinux enforcing mode 2 Platform Cloud firewalls limit access to ssh, kube-apiserver, and ingress No cluster credentials are stored in Matchbox (used for bare-metal) No cluster credentials are stored in Digital Ocean metadata Cluster credentials are stored in AWS metadata (for ASGs) Cluster credentials are stored in Azure metadata (for scale sets) Cluster credentials are stored in Google Cloud metadata (for managed instance groups) No account credentials are available to Digital Ocean droplets No account credentials are available to AWS EC2 instances (no IAM permissions) No account credentials are available to Azure instances (no IAM permissions) No account credentials are available to Google Cloud instances (no IAM permissions) Precautions \u00b6 Typhoon limits exposure to many security threats, but it is not a silver bullet. As usual, Do not run untrusted images or accept manifests from strangers Do not give untrusted users a shell behind your firewall Define network policies for your namespaces Container Images \u00b6 Typhoon uses upstream container images (where possible) and upstream binaries. Note Kubernetes releases kubelet as a binary for distros to package, either as a DEB/RPM on traditional distros or as a container image for container-optimized operating systems. Typhoon packages the upstream Kubelet and its dependencies as a container image . Builds fetch the upstream Kubelet binary and verify its checksum. The Kubelet image is published to Quay.io and Dockerhub. quay.io/poseidon/kubelet (official) docker.io/psdn/kubelet (fallback) Two tag styles indicate the build strategy used. Typhoon internal infra publishes single and multi-arch images (e.g. v1.18.4 , v1.18.4-amd64 , v1.18.4-arm64 , v1.18.4-2-g23228e6-amd64 , v1.18.4-2-g23228e6-arm64 ) Quay automated builds publish verifiable images (e.g. build-SHA on Quay) The Typhoon-built Kubelet image is used as the official image. Automated builds provide an alternative image for those preferring to trust images built by Quay (albeit lacking multi-arch). To use the fallback registry or an alternative tag, see customization . flannel-cni \u00b6 Typhoon packages the flannel-cni container image to provide security patches. quay.io/poseidon/flannel-cni (official) Terraform Providers \u00b6 Typhoon publishes Terraform providers to the Terraform Registry, GPG signed by 0x8F515AD1602065C8. Name Source Registry ct github poseidon/ct matchbox github poseidon/matchbox Disclosures \u00b6 If you find security issues, please email security@psdn.io . If the issue lies in upstream Kubernetes, please inform upstream Kubernetes as well. Requires networking = \"calico\" . Calico is the default on all platforms (AWS, Azure, bare-metal, DigitalOcean, and Google Cloud). \u21a9 SELinux is enforcing on Fedora CoreOS, permissive on Flatcar Linux. \u21a9","title":"Security"},{"location":"topics/security/#security","text":"Typhoon aims to be minimal and secure. We're running it ourselves after all.","title":"Security"},{"location":"topics/security/#overview","text":"Kubernetes etcd with peer-to-peer and client-auth TLS Kubelets TLS bootstrap certificates (72 hours) Generated TLS certificate (365 days) for admin kubeconfig NodeRestriction is enabled to limit Kubelet authorization Role-Based Access Control is enabled. Apps must define RBAC policies for API access Workloads run on worker nodes only, unless they tolerate the master taint Kubernetes Network Policy and Calico NetworkPolicy support 1 Hosts Container Linux auto-updates are enabled Hosts limit logins to SSH key-based auth (user \"core\") SELinux enforcing mode 2 Platform Cloud firewalls limit access to ssh, kube-apiserver, and ingress No cluster credentials are stored in Matchbox (used for bare-metal) No cluster credentials are stored in Digital Ocean metadata Cluster credentials are stored in AWS metadata (for ASGs) Cluster credentials are stored in Azure metadata (for scale sets) Cluster credentials are stored in Google Cloud metadata (for managed instance groups) No account credentials are available to Digital Ocean droplets No account credentials are available to AWS EC2 instances (no IAM permissions) No account credentials are available to Azure instances (no IAM permissions) No account credentials are available to Google Cloud instances (no IAM permissions)","title":"Overview"},{"location":"topics/security/#precautions","text":"Typhoon limits exposure to many security threats, but it is not a silver bullet. As usual, Do not run untrusted images or accept manifests from strangers Do not give untrusted users a shell behind your firewall Define network policies for your namespaces","title":"Precautions"},{"location":"topics/security/#container-images","text":"Typhoon uses upstream container images (where possible) and upstream binaries. Note Kubernetes releases kubelet as a binary for distros to package, either as a DEB/RPM on traditional distros or as a container image for container-optimized operating systems. Typhoon packages the upstream Kubelet and its dependencies as a container image . Builds fetch the upstream Kubelet binary and verify its checksum. The Kubelet image is published to Quay.io and Dockerhub. quay.io/poseidon/kubelet (official) docker.io/psdn/kubelet (fallback) Two tag styles indicate the build strategy used. Typhoon internal infra publishes single and multi-arch images (e.g. v1.18.4 , v1.18.4-amd64 , v1.18.4-arm64 , v1.18.4-2-g23228e6-amd64 , v1.18.4-2-g23228e6-arm64 ) Quay automated builds publish verifiable images (e.g. build-SHA on Quay) The Typhoon-built Kubelet image is used as the official image. Automated builds provide an alternative image for those preferring to trust images built by Quay (albeit lacking multi-arch). To use the fallback registry or an alternative tag, see customization .","title":"Container Images"},{"location":"topics/security/#flannel-cni","text":"Typhoon packages the flannel-cni container image to provide security patches. quay.io/poseidon/flannel-cni (official)","title":"flannel-cni"},{"location":"topics/security/#terraform-providers","text":"Typhoon publishes Terraform providers to the Terraform Registry, GPG signed by 0x8F515AD1602065C8. Name Source Registry ct github poseidon/ct matchbox github poseidon/matchbox","title":"Terraform Providers"},{"location":"topics/security/#disclosures","text":"If you find security issues, please email security@psdn.io . If the issue lies in upstream Kubernetes, please inform upstream Kubernetes as well. Requires networking = \"calico\" . Calico is the default on all platforms (AWS, Azure, bare-metal, DigitalOcean, and Google Cloud). \u21a9 SELinux is enforcing on Fedora CoreOS, permissive on Flatcar Linux. \u21a9","title":"Disclosures"}]}