mirror of
https://github.com/poseidon/typhoon
synced 2024-11-17 20:14:02 +01:00
316f06df06
* Simplify clusters to come with a single NLB * Listen for apiserver traffic on port 6443 and forward to controllers (with healthy apiserver) * Listen for ingress traffic on ports 80/443 and forward to workers (with healthy ingress controller) * Reduce cost of default clusters by 1 NLB ($18.14/month) * Keep using CNAME records to the `ingress_dns_name` NLB and the nginx-ingress addon for Ingress (up to a few million RPS) * Users with heavy traffic (many million RPS) can create their own separate NLB(s) for Ingress and use the new output worker target groups * Fix issue where additional worker pools come with an extraneous network load balancer
22 KiB
22 KiB
Typhoon
Notable changes between versions.
Latest
AWS
- Switch
kube-apiserver
port from 443 to 6443 (#248) - Combine apiserver and ingress NLBs (#249)
- Simplify clusters to come with one NLB. Reduce cost by ~$18/month per cluster.
- Users may keep using CNAME records to
ingress_dns_name
and thenginx-ingress
addon for Ingress (up to a few million RPS) - Users with heavy traffic (many million RPS) should create a separate NLB(s) for Ingress instead
- Listen for apiserver traffic on port 6443 and forward to controllers (with healthy apiserver)
- Listen for ingress traffic on ports 80/443 and forward to workers (with healthy ingress controller)
- Worker pools (advanced) no longer include an extraneous load balancer
Bare-Metal
- Switch
kube-apiserver
port from 443 to 6443 (#248)- Users who exposed kube-apiserver on a WAN via their router/load-balancer will need to adjust its configuration (e.g. DNAT 6443). Most apiservers are on a LAN (internal, VPN-only, etc) so if you didn't specially configure network gear for 443, no change is needed. (possible action required)
- Fix possible deadlock when provisioning clusters larger than 10 nodes (#244)
DigitalOcean
- Switch
kube-apiserver
port from 443 to 6443 (#248)- Update firewall rules and generated kubeconfig's
Addons
- Update CLUO from v0.6.0 to v0.7.0 (#242)
v1.10.4
- Kubernetes v1.10.4
- Update etcd from v3.3.5 to v3.3.6
- Update Calico from v3.1.2 to v3.1.3
Addons
- Update Prometheus from v2.2.1 to v2.3.1
- Add Prometheus liveness and readiness probes
- Update Grafana from 5.1.3 to 5.1.4
- Annotate Grafana service so Prometheus scrapes metrics
- Label namespaces to ease writing Network Policies
v1.10.3
- Kubernetes v1.10.3
- Add Flatcar Linux (Container Linux derivative) as an option for AWS and bare-metal (thanks @kinvolk folks)
- Allow bearer token authentication to the Kubelet (#216)
- Require Webhook authorization to the Kubelet
- Switch apiserver X509 client cert org to satisfy new authorization requirement
- Require Terraform v0.11.x and drop support for v0.10.x (migration guide)
- Update etcd from v3.3.4 to v3.3.5 (#213)
- Update Calico from v3.1.1 to v3.1.2
AWS
- Allow Flatcar Linux by setting
os_image
to flatcar-stable (default), flatcar-beta, flatcar-alpha (#211) - Replace
os_channel
variable withos_image
to align naming across clouds- Please change values stable, beta, or alpha to coreos-stable, coreos-beta, coreos-alpha (action required!)
- Allow preemptible workers via spot instances (#202)
- Add
worker_price
to allow worker spot instances. Default to empty string for the worker autoscaling group to use regular on-demand instances - Add
spot_price
to internalworkers
module for spot worker pools
- Add
Bare-Metal
- Allow Flatcar Linux by setting
os_channel
to flatcar-stable, flatcar-beta, flatcar-alpha (#220) - Replace
container_linux_channel
variable withos_channel
- Please change values stable, beta, or alpha to coreos-stable, coreos-beta, coreos-alpha (action required!)
- Replace
container_linux_version
variable withos_version
- Add
network_ip_autodetection_method
variable for Calico host IPv4 address detection- Use Calico's default "first-found" to support single NIC and bonded NIC nodes
- Allow alternative methods for multi NIC nodes, like can-reach=IP or interface=REGEX
- Deprecate
container_linux_oem
variable
DigitalOcean
- Update Fedora Atomic module to use Fedora Atomic 28 (#225)
- Fedora Atomic 27 images disappeared from DigitalOcean and forced this early update
Addons
- Fix Prometheus data directory location (#203)
- Configure Prometheus to scrape Kubelets directly with bearer token auth instead of proxying through the apiserver (#217)
- Security improvement: Drop RBAC permission from
nodes/proxy
tonodes/metrics
- Scale: Remove per-node proxied scrape load from the apiserver
- Security improvement: Drop RBAC permission from
- Update Grafana from v5.04 to v5.1.3 (#208)
- Disable Grafana Google Analytics by default (#214)
- Update nginx-ingress from 0.14.0 to 0.15.0
- Annotate nginx-ingress service so Prometheus auto-discovers and scrapes service endpoints (#222)
v1.10.2
- Kubernetes v1.10.2
- Introduce Typhoon for Fedora Atomic (#199)
- Update Calico from v3.0.4 to v3.1.1 (#197)
- Update etcd from v3.3.3 to v3.3.4
- Update kube-dns from v1.14.9 to v1.14.10
Google Cloud
- Add support for multi-controller clusters (i.e. multi-master) (#54, #190)
- Switch from Google Cloud network load balancer to a TCP proxy load balancer. Avoid a bug in Google network load balancers that limited clusters to only bootstrapping one controller node.
- Add TCP health check for apiserver pods on controllers. Replace kubelet check approximation.
Addons
- Update nginx-ingress from 0.12.0 to 0.14.0
- Update kube-state-metrics from v1.3.0 to v1.3.1
v1.10.1
- Kubernetes v1.10.1
- Enable etcd v3.3 metrics endpoint (#175)
- Use
k8s.gcr.io
instead ofgcr.io/google_containers
(#180)- Kubernetes recommends using the alias to pull from the nearest regional mirror and to abstract the backing container registry
- Update etcd from v3.3.2 to v3.3.3
- Update kube-dns from v1.14.8 to v1.14.9
- Use kubernetes-incubator/bootkube v0.12.0
Bare-Metal
- Fix need for multiple
terraform apply
runs to create a cluster with Terraform v0.11.4 (#181)- To SSH during a disk install for debugging, SSH as user "core" with port 2222
- Remove the old trick of using a user "debug" during disk install
Google Cloud
- Refactor out the
controller
internal module
Addons
- Add Prometheus discovery for etcd peers on controller nodes (#175)
- Scrape etcd v3.3
--listen-metrics-urls
for metrics - Enable etcd alerts and populate the etcd Grafana dashboard
- Scrape etcd v3.3
- Update kube-state-metrics from v1.2.0 to v1.3.0
v1.10.0
- Kubernetes v1.10.0
- Remove unused, unmaintained
pxe-worker
internal module
AWS
- Add
disk_type
optional variable for setting the EBS volume type (#176)- Change default type from
standard
togp2
. Prometheus etcd alerts are tuned for fast disks.
- Change default type from
Digital Ocean
- Ensure etcd secrets are only distributed to controller hosts, not workers.
- Remove
networking
optional variable. Only flannel works on Digital Ocean.
Google Cloud
- Add
disk_size
optional variable for setting instance disk size in GB - Add
controller_type
optional variable for setting machine type for controllers - Add
worker_type
optional variable for setting machine type for workers - Remove
machine_type
optional variable. Usecontroller_type
andworker_type
.
Addons
v1.9.6
- Kubernetes v1.9.6
- Update Calico from v3.0.3 to v3.0.4
Addons
- Update heapster from v1.5.1 to v1.5.2
v1.9.5
- Kubernetes v1.9.5
- Fix
subPath
volume mounts regression (kubernetes#61076)
- Fix
- Introduce Container Linux Config snippets on cloud platforms (#145)
- Validate and additively merge custom Container Linux Configs during
terraform plan
- Define files, systemd units, dropins, networkd configs, mounts, users, and more
- Require updating
terraform-provider-ct
plugin from v0.2.0 to v0.2.1
- Validate and additively merge custom Container Linux Configs during
- Add
node-role.kubernetes.io/controller="true"
node label to controllers (#160)
AWS
Digital Ocean
Google Cloud
- Require updating
terraform-provider-ct
plugin from v0.2.0 to v0.2.1 (action required!) - Relax
os_image
to optional. Default to "coreos-stable".
Addons
- Update nginx-ingress from 0.11.0 to 0.12.0
- Update Prometheus from 2.2.0 to 2.2.1
v1.9.4
- Kubernetes v1.9.4
- Secret, configMap, downward API, and projected volumes now read-only (breaking, kubernetes#58720)
- Regressed
subPath
volume mounts (regression, kubernetes#61076) - Mitigated
subPath
CVE-2017-1002101
- Introduce worker pools for AWS and Google Cloud for joining heterogeneous workers to existing clusters.
- Use new Network Load Balancers and cross zone load balancing on AWS
- Allow flexvolume plugins to be used on any Typhoon cluster (not just bare-metal)
- Upgrade etcd from v3.2.15 to v3.3.2
- Update Calico from v3.0.2 to v3.0.3
- Use kubernetes-incubator/bootkube v0.11.0
- Recommend updating
terraform-provider-ct
plugin from v0.2.0 to v0.2.1 (action recommended)
AWS
- Promote AWS platform to stable
- Allow groups of workers to be defined and joined to a cluster (i.e. worker pools) (#150)
- Replace the apiserver elastic load balancer with a network load balancer (#136)
- Replace the Ingress elastic load balancer with a network load balancer (#141)
- AWS NLBs can handle millions of RPS with high throughput and low latency.
- Require
terraform-provider-aws
1.7.0 or higher
- Enable NLB cross-zone load balancing (#159)
- Requests are automatically evenly distributed to targets regardless of AZ
- Require
terraform-provider-aws
1.11.0 or higher
- Add kubelet
--volume-plugin-dir
flag to allow flexvolume plugins (#142) - Fix controller and worker launch configs to ignore AMI changes (#126, #158)
Digital Ocean
- Add kubelet
--volume-plugin-dir
flag to allow flexvolume plugins (#142) - Fix to pass
ssh_fingerprints
as a list to droplets (#143)
Google Cloud
- Allow groups of workers to be defined and joined to a cluster (i.e. worker pools) (#148)
- Add kubelet
--volume-plugin-dir
flag to allow flexvolume plugins (#142) - Add
kubeconfig
variable tocontrollers
andworkers
submodules (#147) - Remove
kubeconfig_*
variables fromcontrollers
andworkers
submodules (#147) - Allow initial experimentation with accelerators (i.e. GPUs) on workers (#161) (unofficial)
- Require
terraform-provider-google
v1.6.0
- Require
Addons
- Update Prometheus from 2.1.0 to 2.2.0 (#153)
- Scrape Prometheus itself to enable alerts about Prometheus itself
- Adjust KubeletDown rule to fire when 10% of kubelets are down
- Update heapster from v1.5.0 to v1.5.1 (#131)
- Use separate service account
- Update nginx-ingress from 0.10.2 to 0.11.0
v1.9.3
- Kubernetes v1.9.3
- Network improvements and fixes (#104)
- Switch from Calico v2.6.6 to v3.0.2
- Add Calico GlobalNetworkSet CRD
- Update flannel from v0.9.0 to v0.10.0
- Use separate service account for flannel
- Update etcd from v3.2.14 to v3.2.15
Digital Ocean
- Use new Droplet types which offer more CPU/memory, at lower cost. (#105)
- A small Digital Ocean cluster costs less than $25 a month!
Addons
- Update Prometheus from v2.0.0 to v2.1.0 (#113)
- Improve alerting rules
- Relabel discovered kubelet, endpoint, service, and apiserver scrapes
- Use separate service accounts
- Update node-exporter and kube-state-metrics
- Include Grafana dashboards for Kubernetes admins (#113)
- Add grafana-watcher to load bundled upstream dashboards
- Update nginx-ingress from 0.9.0 to 0.10.2
- Update CLUO from v0.5.0 to v0.6.0
- Switch manifests to use
apps/v1
Deployments and Daemonsets (#120) - Remove Kubernetes Dashboard manifests (#121)
v1.9.2
- Kubernetes v1.9.2
- Add Terraform v0.11.x support
- Add explicit "providers" section to modules for Terraform v0.11.x
- Retain support for Terraform v0.10.4+
- Add migration guide from Terraform v0.10.x to v0.11.x (action required!)
- Update etcd from 3.2.13 to 3.2.14
- Update calico from 2.6.5 to 2.6.6
- Update kube-dns from v1.14.7 to v1.14.8
- Use separate service account for kube-dns
- Use kubernetes-incubator/bootkube v0.10.0
Bare-Metal
- Use per-node Container Linux install profiles (#97)
- Allow Container Linux channel/version to be chosen per-cluster
- Fix issue where cluster deletion could require
terraform apply
multiple times
Digital Ocean
- Relax
digitalocean
provider version constraint - Fix bug with
terraform plan
always showing a firewall diff to be applied (#3)
Addons
- Update CLUO to v0.5.0 to fix compatibility with Kubernetes 1.9 (important)
- Earlier versions can't roll out Container Linux updates on Kubernetes 1.9 nodes (cluo#163)
- Update kube-state-metrics from v1.1.0 to v1.2.0
- Fix RBAC cluster role for kube-state-metrics
v1.9.1
- Kubernetes v1.9.1
- Update kube-dns from 1.14.5 to v1.14.7
- Update etcd from 3.2.0 to 3.2.13
- Update Calico from v2.6.4 to v2.6.5
- Enable portmap to fix hostPort with Calico
- Use separate service account for controller-manager
v1.8.6
- Kubernetes v1.8.6
- Update Calico from v2.6.3 to v2.6.4
v1.8.5
- Kubernetes v1.8.5
- Recommend Container Linux images with Docker 17.09
- Container Linux stable, beta, and alpha now provide Docker 17.09 (instead of 1.12)
- Older clusters (with CLUO addon) auto-update Container Linux version to begin using Docker 17.09
- Fix race where
etcd-member.service
could fail to resolve peers (#69) - Add optional
cluster_domain_suffix
variable (#74) - Use kubernetes-incubator/bootkube v0.9.1
Bare-Metal
- Add kubelet
--volume-plugin-dir
flag to allow flexvolume providers (#61)
Addons
- Discourage deploying the Kubernetes Dashboard (security)
v1.8.4
- Kubernetes v1.8.4
- Calico related bug fixes
- Update Calico from v2.6.1 to v2.6.3
- Update flannel from v0.9.0 to v0.9.1
- Service accounts for kube-proxy and pod-checkpointer
- Use kubernetes-incubator/bootkube v0.9.0
v1.8.3
- Kubernetes v1.8.3
- Run etcd on-host, across controllers
- Promote AWS platform to beta
- Use kubernetes-incubator/bootkube v0.8.2
Google Cloud
- Add required variable
region
(e.g. "us-central1") - Reduce time to bootstrap a cluster
- Change etcd to run on-host, across controllers (etcd-member.service)
- Change controller instances to automatically span zones in the region
- Change worker managed instance group to automatically span zones in the region
- Improve internal firewall rules and use tag-based firewall policies
- Remove support for self-hosted etcd
- Remove the
zone
required variable - Remove the
controller_preemptible
optional variable
AWS
- Promote AWS platform to beta
- Reduce time to bootstrap a cluster
- Change etcd to run on-host, across controllers (etcd-member.service)
- Fix firewall rules for multi-controller kubelet scraping and node-exporter
- Remove support for self-hosted etcd
Addons
- Add Prometheus 2.0 addon with alerting rules
- Add Grafana dashboard for observing metrics
v1.8.2
- Kubernetes v1.8.2
- Fixes a memory leak in the v1.8.1 apiserver (kubernetes#53485)
- Switch to using the
gcr.io/google_containers/hyperkube
- Update flannel from v0.8.0 to v0.9.0
- Add
hairpinMode
to flannel CNI config - Add
--no-negcache
to kube-dns dnsmasq - Use kubernetes-incubator/bootkube v0.8.1
v1.8.1
- Kubernetes v1.8.1
- Use kubernetes-incubator/bootkube v0.8.0
Digital Ocean
- Run etcd cluster across controller nodes (etcd-member.service)
- Remove support for self-hosted etcd
- Reduce time to bootstrap a cluster
v1.7.7
- Kubernetes v1.7.7
- Use kubernetes-incubator/bootkube v0.7.0
- Update kube-dns to 1.14.5 to fix dnsmasq vulnerability
- Calico v2.6.1
- flannel-cni v0.3.0
- Update flannel CNI config to fix hostPort
v1.7.5
- Kubernetes v1.7.5
- Use kubernetes-incubator/bootkube v0.6.2
- Add AWS Terraform module (alpha)
- Add support for Calico networking (bare-metal, Google Cloud, AWS)
- Change networking default from "flannel" to "calico"
AWS
- Add
network_mtu
to allow CNI interface MTU customization
Bare-Metal
- Add
network_mtu
to allow CNI interface MTU customization - Remove support for
experimental_self_hosted_etcd
v1.7.3
- Kubernetes v1.7.3
- Use kubernetes-incubator/bootkube v0.6.1
Digital Ocean
- Add cloud firewall rules (requires Terraform v0.10)
- Change nodes tags from strings to DO tags
v1.7.1
- Kubernetes v1.7.1
- Use kubernetes-incubator/bootkube v0.6.0
- Add Bare-Metal Terraform module (stable)
- Add Digital Ocean Terraform module (beta)
Google Cloud
- Remove
k8s_domain_name
variable,cluster_name
+dns_zone
resolves to controllers - Rename
dns_base_zone
todns_zone
- Rename
dns_base_zone_name
todns_zone_name
v1.6.7
- Kubernetes v1.6.7
- Use kubernetes-incubator/bootkube v0.5.1
v1.6.6
- Kubernetes v1.6.6
- Use kubernetes-incubator/bootkube v0.4.5
- Disable locksmithd on hosts, in favor of CLUO.
v1.6.4
- Kubernetes v1.6.4
- Add Google Cloud Terraform module (stable)
Earlier
Earlier versions, back to v1.3.0, used different designs and mechanisms.