* AWS IPv4 address pricing is quite high compared to other
  clouds, and an NLB unavoidably uses at least 3 public IPv4 addresses
* Unlike Azure's nice outbound-through-load-balancer options, AWS has
  only NAT options, which are even more costly than IPv4 addresses in
  budget clusters. Another option is to simply forgo accessing nodes
  via IPv4 or outbound IPv4 internet access (tradeoff: GitHub is a
  notable website that only serves via IPv4, so cut ties)
* flannel and Cilium default to UDP 8472 for VXLAN traffic to
avoid conflicts with other VXLAN usage (e.g. Open vSwitch)
* Aligning flannel and Cilium to use the same VXLAN port makes
  firewall rules or security policies simpler across clouds (see the sketch below)
Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/403
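With a single agreed-upon port, an out-of-band firewall rule for overlay traffic only needs to cover UDP 8472 regardless of which CNI is chosen. A minimal sketch with the Terraform AWS provider (the rule name and security group reference are hypothetical, not part of Typhoon's modules):

```
resource "aws_security_group_rule" "vxlan" {
  description       = "Allow VXLAN between nodes (flannel or Cilium)"
  security_group_id = aws_security_group.nodes.id # hypothetical security group
  type              = "ingress"
  protocol          = "udp"
  from_port         = 8472
  to_port           = 8472
  self              = true
}
```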
* Explicitly load the `nf_conntrack` and `br_netfilter` kernel
modules that are needed for flannel CNI setups
* Specifically, flannel needs `br_netfilter`, and kube-proxy (used
  in flannel setups) needs `nf_conntrack`. Previously these kernel
  modules were loaded by default, but that no longer seems to be the case
* Cilium has been the default for about 3 years and is the de facto
  standard CNI choice. flannel is supported as a simple alternative
* Remove various historical options that were specific to Calico
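The CNI provider is still selected with the existing `networking` variable; a brief sketch (module source and other required variables omitted):

```
module "cluster" {
  # ...other required variables omitted
  networking = "cilium" # default; use "flannel" for the simpler alternative
}
```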
* By default, the Kubelet pulls container images one by one
  (in series), a behavior mostly related to Docker-era bugs with
  parallel image pulls. These days we use containerd, so parallel
  pulls should be fine
* Serial image pulls are undesirable because one slow registry
  or image can cause all other image pulls to wait. With parallel
  pulls, only the large images / slow registries themselves see that impact
Docs: https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/
* Change the default Pod CIDR from 10.2.0.0/16 to 10.20.0.0/14
(10.20.0.0 - 10.23.255.255) to support 1024 nodes by default
* Most CNI providers divide the Pod CIDR so that each node has
  a /24 to allocate to local pods (256 addresses). The previous `10.2.0.0/16`
  default only fits 256 /24s, so only 256 nodes were supported without
  customizing `pod_cidr`
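With a /24 per node, a /16 Pod CIDR yields 2^(24-16) = 256 node subnets, while a /14 yields 2^(24-14) = 1024, hence the new default. Clusters with conflicting address space can still override it (a sketch; other required variables omitted):

```
module "cluster" {
  # ...other required variables omitted
  pod_cidr = "10.20.0.0/14" # new default; override if this range overlaps existing networks
}
```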
* When using the Cilium component, disable bootstrapping the
kube-proxy DaemonSet. Instead, configure Cilium to provide its
kube-proxy replacement with BPF
* Update the self-managed Cilium component to use kube-proxy
replacement as well
* Use EC2 resource-based hostnames instead of IP-based hostnames. The Amazon
DNS server can resolve A and AAAA queries to IPv4 and IPv6 node addresses
* For example, nodes used to be named like `ip-10-11-12-13.us-east-1.compute.internal`,
  but going forward they use the instance ID, e.g. `i-0123456789abcdef.us-east-1.compute.internal`
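This corresponds to EC2's private DNS name options; roughly, instances are configured like the following sketch with the Terraform AWS provider (shown for illustration, not the exact module internals):

```
resource "aws_instance" "controller" {
  # ...
  private_dns_name_options {
    hostname_type                        = "resource-name"
    enable_resource_name_dns_a_record    = true
    enable_resource_name_dns_aaaa_record = true
  }
}
```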
* Tag controller node EBS volumes with a name based on the controller node name
* Set reasonable values and remove some variable clutter
* `enable_reporting` is only used with Calico and we can just default
  it to false; I doubt anyone uses Calico and cares much about reporting
  metrics to upstream Calico
* Drop support for `cluster_domain_suffix` customization and
always use `cluster.local`. Many components in the Kubernetes
ecosystem assume this default suffix and it's very rare to be
  setting a special value here these days
* Cleanup a few variables that are seldom used
* On platforms that support ARM64 instances, configure controller
and worker node host architectures separately
* For example, you can run arm64 controllers and amd64 workers (see the combined sketch below)
* Add `controller_arch` and `worker_arch` variables
* Remove `arch` variable
* Add `controller_disk_type`, `controller_disk_size`, and `controller_disk_iops`
variables
* Add `worker_disk_type`, `worker_disk_size`, and `worker_disk_iops` variables
and fix propagation to worker nodes
* Remove `disk_type`, `disk_size`, and `disk_iops` variables
* Add `controller_cpu_credits` and `worker_cpu_credits` variables to set CPU
pricing mode for burstable instance types
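Taken together, controller and worker machine properties can now be tuned independently; a hedged sketch of module usage (values are illustrative, other required variables omitted):

```
module "cluster" {
  # ...other required variables omitted
  controller_arch        = "arm64"
  worker_arch            = "amd64"

  controller_disk_type   = "gp3"
  controller_disk_size   = 30
  controller_disk_iops   = 3000

  worker_disk_type       = "gp3"
  worker_disk_size       = 30
  worker_disk_iops       = 3000

  controller_cpu_credits = "standard"
  worker_cpu_credits     = "unlimited"
}
```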
* Previously: Typhoon provisions clusters with kube-system components
like CoreDNS, kube-proxy, and a chosen CNI provider (among flannel,
Calico, or Cilium) pre-installed. This is convenient since clusters
come with "batteries included". But it also means upgrading these
components is generally done in lock-step, by upgrading to a new
Typhoon / Kubernetes release
* It can be valuable to manage these components with a separate
plan/apply process or through automations and deploy systems. For
example, this allows managing CoreDNS separately from the cluster's
lifecycle.
* These "components" will continue to be pre-installed by default,
but a new `components` variable allows them to be disabled and
managed as "addons", components you apply after cluster creation
and manage on a rolling basis. For some of these, we may provide
  Terraform modules to aid in managing these components.
```
module "cluster" {
# defaults
components = {
enable = true
coredns = {
enable = true
}
kube_proxy = {
enable = true
}
# Only the CNI set in var.networking will be installed
flannel = {
enable = true
}
calico = {
enable = true
}
cilium = {
enable = true
}
}
}
```
An earlier variable `install_container_networking = true/false` has
been removed, since it can now be achieved with this more extensible
and general components mechanism by setting the chosen networking
provider's `enable` field to `false`.
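For example, a cluster that self-manages Cilium and skips kube-proxy entirely might look like the following sketch (fields not shown are assumed to keep the defaults above):

```
module "cluster" {
  # ...other required variables omitted
  networking = "cilium"
  components = {
    enable = true
    kube_proxy = {
      enable = false # rely on Cilium's kube-proxy replacement instead
    }
    cilium = {
      enable = false # apply the Cilium manifests yourself after cluster creation
    }
  }
}
```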
* Add firewall or security rules to allow node-to-node traffic
  on ports 9962-9965 for Cilium and Hubble metrics. Cilium runs
  with host networking, so these require cloud firewall changes
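For out-of-band or custom security groups, the added rule roughly corresponds to this sketch (the rule name and security group reference are hypothetical):

```
resource "aws_security_group_rule" "cilium_metrics" {
  description       = "Allow Cilium and Hubble metrics scraping"
  security_group_id = aws_security_group.nodes.id # hypothetical security group
  type              = "ingress"
  protocol          = "tcp"
  from_port         = 9962
  to_port           = 9965
  self              = true
}
```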
* Allow for more minimal base cluster setups that manage CoreDNS or
  kube-proxy as applications, with rolling updates, or via deploy systems.
  In the case of kube-proxy, it's becoming more common to not install
  it at all and instead use Cilium's kube-proxy replacement
* Add a `components` pass-through variable to configure pre-installed
components like kube-proxy and CoreDNS. These components can be
disabled (individually or together) to allow for managing components
with separate plan/apply processes or automations
* terraform-render-bootstrap manifest assets are now structured as
manifests/{coredns,kube-proxy,network} so adapt the controller
layout scripts accordingly
* This is similar to some changes in v1.29.2 that allowed for the
container networking provider manifests to be skipped
Related: https://github.com/poseidon/typhoon/pull/1419, https://github.com/poseidon/typhoon/pull/1421
* When `true`, the chosen container `networking` provider is installed during cluster bootstrap
* Set `false` to self-manage the container networking provider. This allows flannel, Calico, or Cilium
to be managed via Terraform (like any other Kubernetes resources). Nodes will be NotReady until you
apply the self-managed container networking provider. This may become the default in the future.