Typhoon
Notable changes between versions.
Latest
- Migrate from Terraform v0.11 to v0.12.x (action required!)
- Migration instructions for Terraform v0.12
- Require `terraform-provider-ct` v0.3.2+ to support Terraform v0.12 (action required)
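The version requirements in this release can be written down as pins in a root Terraform configuration. The following is a minimal sketch, assuming a cluster on AWS. The region value is a placeholder, and the exact constraint strings are one possible reading of the "v0.3.2+" and "v2.7+" minimums rather than values prescribed by these notes.

```hcl
# Sketch of version pins for Terraform v0.12 (assumed layout, e.g. providers.tf).
terraform {
  required_version = "~> 0.12.0"
}

# Third-party plugin, installed manually under ~/.terraform.d/plugins/.
provider "ct" {
  version = "0.3.2"
}

# Pin for the platform in use; AWS is shown as an example.
provider "aws" {
  version = "~> 2.7"    # any 2.x release at or above 2.7
  region  = "us-east-1" # placeholder region
}
```

Other platforms would swap the last block for their own provider and minimum version.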
AWS
- Require `terraform-provider-aws` v2.7+ to support Terraform v0.12 (action required)
Azure
- Require `terraform-provider-azurerm` v1.27+ to support Terraform v0.12 (action required)
- Avoid unneeded rotations of Regular priority virtual machine scale sets
  - Azure only allows `eviction_policy` to be set for Low priority VMs. Supporting Low priority VMs meant when Regular VMs were used, each `terraform apply` rolled workers, to set eviction_policy to null.
  - Terraform v0.12 nullable variables fix the issue so plan does not produce a diff.
Bare-Metal
- Require `terraform-provider-matchbox` v0.3.0+ to support Terraform v0.12 (action required)
DigitalOcean
- Require `terraform-provider-digitalocean` v1.3+ to support Terraform v0.12 (action required)
- Change the default `worker_type` from `s-1vcpu-1gb` to `s-1vcpu-2gb`
Google Cloud
- Require `terraform-provider-google` v2.5+ to support Terraform v0.12 (action required)
Addons
- Update Grafana from v6.2.1 to v6.2.2
- Update node-exporter from v0.18.0 to v0.18.1
v1.14.3
- Kubernetes v1.14.3
- Update CoreDNS from v1.3.1 to v1.5.0
  - Add `ready` plugin to improve readinessProbe
- Fix trailing slash in terraform-render-bootkube version (#479)
- Recommend updating `terraform-provider-ct` plugin from v0.3.1 to v0.3.2 (#487)
AWS
- Rename `worker` pool module `count` variable to `worker_count` (#485) (action required)
  - `count` will become a reserved variable name in Terraform v0.12
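For an existing worker pool definition, the rename amounts to a one-line change. A minimal sketch, assuming an AWS worker pool: the module name `tempest-worker-pool`, the `source` path, and the replica number are illustrative assumptions, and the pool's other variables are omitted.

```hcl
module "tempest-worker-pool" {
  # Assumed source path; adjust to the module and ref actually in use.
  source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes/workers?ref=v1.14.3"

  # ...other worker pool variables omitted...

  # Before v1.14.3:
  # count = 2

  # v1.14.3 and later:
  worker_count = 2
}
```

The same rename applies to the Azure and Google Cloud worker pool modules listed below.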
Azure
- Replace `azurerm_autoscale_setting` with `azurerm_monitor_autoscale_setting` (#482)
- Rename `worker` pool module `count` variable to `worker_count` (#485) (action required)
  - `count` will become a reserved variable name in Terraform v0.12
Bare-Metal
Google Cloud
- Rename `worker` pool module `count` variable to `worker_count` (#485) (action required)
  - `count` is a reserved variable in Terraform v0.12
Addons
- Update Prometheus from v2.9.2 to v2.10.0
- Update Grafana from v6.1.6 to v6.2.1
v1.14.2
- Kubernetes v1.14.2
- Update etcd from v3.3.12 to v3.3.13
- Upgrade Calico from v3.6.1 to v3.7.2
- Change flannel VXLAN port from 8472 (kernel default) to 4789 (IANA VXLAN)
AWS
- Only set internal VXLAN rules when `networking` is "flannel" (default: calico)
Azure
- Allow choosing Calico as the network provider (experimental) (#472)
  - Add a `networking` variable accepting "flannel" (default) or "calico"
  - Use VXLAN encapsulation since Azure doesn't support IPIP
DigitalOcean
- Allow choosing Calico as the network provider (experimental) (#472)
  - Add a `networking` variable accepting "flannel" (default) or "calico"
  - Use VXLAN encapsulation since DigitalOcean doesn't support IPIP
- Add explicit ordering between firewall rule creation and secure copying Kubelet credentials (#469)
- Fix race scenario if copies to nodes were before rule creation, blocking cluster creation
Addons
- Update Prometheus from v2.8.1 to v2.9.2
- Update kube-state-metrics from v1.5.0 to v1.6.0
- Update node-exporter from v0.17.0 to v0.18.0
- Update Grafana from v6.1.3 to v6.1.6
- Reduce nginx-ingress Role RBAC permissions (#458)
v1.14.1
- Kubernetes v1.14.1
Addons
- Update Grafana from v6.1.1 to v6.1.3
- Update nginx-ingress from v0.23.0 to v0.24.1
v1.14.0
- Kubernetes v1.14.0
- Update Calico from v3.6.0 to v3.6.1
- Add `enable_aggregation` option for CNCF conformance (#436)
  - Aggregation is disabled by default to retain our security stance
  - Aggregation increases the security surface area. Extensions become part of the control plane and must be scrutinized carefully and trusted. Favor leaving aggregation disabled.
AWS
- Add ability to load balance TCP applications (#443)
  - Output the network load balancer ARN as `nlb_id`
  - Accept a `worker_target_groups` (ARN) list to which worker instances should be added
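A TCP application could be exposed through the cluster's NLB roughly as follows. This is a sketch under stated assumptions, written in Terraform v0.12 syntax: the cluster module name `tempest`, the application name `some-app`, port 3333, and the `vpc_id` module output are illustrative and not confirmed by these notes. Only `nlb_id` and `worker_target_groups` come from the entries above.

```hcl
# Listen for TCP traffic on the cluster's NLB and forward it to a custom target group.
resource "aws_lb_listener" "some-app" {
  load_balancer_arn = module.tempest.nlb_id
  protocol          = "TCP"
  port              = 3333

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.some-app.arn
  }
}

# Target group that worker instances will be registered with.
resource "aws_lb_target_group" "some-app" {
  name        = "some-app"
  vpc_id      = module.tempest.vpc_id # assumed output name
  target_type = "instance"
  protocol    = "TCP"
  port        = 3333

  health_check {
    protocol = "TCP"
    port     = 3333
  }
}

module "tempest" {
  # ...source and other cluster variables omitted...

  # Worker instances are added to each listed target group (by ARN).
  worker_target_groups = [aws_lb_target_group.some-app.id]
}
```

Firewall rules for the chosen port are not shown and would need to be added separately.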
Azure
- Add ability to load balance TCP/UDP applications (#447)
  - Output the load balancer ID as `loadbalancer_id`
- Output `worker_security_group_name` and `worker_address_prefix` for extending firewall rules (#447)
DigitalOcean
- Harden internal (node-to-node) firewall rules to align with other platforms (#444)
- Add ability to load balance TCP applications (#444)
  - Output `controller_tag` and `worker_tag` for extending firewall rules (#444)
Google Cloud
- Add ability to load balance TCP/UDP applications (#442)
  - Add worker instances to a target pool, output as `worker_target_pool`
  - Health check for workers with Ingress controllers. Forward rules don't support differing internal/external ports, but some Ingress controllers support TCP/UDP proxy as a workaround
- Remove Haswell minimum CPU platform requirement (#439)
Addons
- Update Prometheus from v2.8.0 to v2.8.1
- Update Grafana from v6.0.2 to v6.1.1
- Add dashboard for pods in a workload (deployment/daemonset/statefulset) (#446)
- Add dashboard for workloads by namespace
v1.13.5
- Kubernetes v1.13.5
- Resolve in-addr.arpa reverse DNS lookups (PTR) for pod IPv4 addresses (#415)
- Reverse DNS lookups for service IPv4 addresses unchanged
- Upgrade Calico from v3.5.2 to v3.6.0 (#430)
  - Change pod IPAM from `host-local` to `calico-ipam`. `pod_cidr` is still divided into `/24` subnets per node, but managed as `ippools` and `ipamblocks`
- Recommend updating terraform-provider-ct from v0.3.0 to v0.3.1 (#434)
- Announce: Fedora Atomic modules will not be updated beyond Kubernetes v1.13.x (#437)
- Thank you Project Atomic team and users, please see the deprecation notice
AWS
- Support `terraform-provider-aws` v2.0+ (#419)
Bare-Metal
- Change the default iPXE kernel and initrd download protocol from HTTP to HTTPS (#420)
  - Require an iPXE-enabled network boot environment with support for TLS downloads. PXE clients must chainload to iPXE firmware compiled with `DOWNLOAD_PROTO_HTTPS` enabled. (action required)
  - Only affects Container Linux and Flatcar Linux install profiles that pull public images (default)
  - Add `download_protocol` variable. Since recognizing boot firmware TLS support is difficult in some environments, set the protocol to "http" for the old behavior (discouraged)
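Where the boot firmware cannot fetch over TLS, the previous behavior can be restored per cluster. A minimal sketch: the module name `mercury` and the `source` path are illustrative assumptions, and the cluster's other variables are omitted.

```hcl
module "mercury" {
  # Assumed source path; adjust to the module and ref actually in use.
  source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.13.5"

  # ...other cluster variables omitted...

  # Fetch the kernel and initrd over plain HTTP (old behavior, discouraged).
  download_protocol = "http"
}
```

Leaving the variable unset keeps the new HTTPS default.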
DigitalOcean
- Fix kubelet hostname-override to set node metadata InternalIP correctly (#424)
- Uniquely, DigitalOcean does not resolve hostnames to instance private IPs. Kubelet auto-detect mechanisms require the internal IP be set directly.
- Regressed in v1.12.3 (#337) which aimed to provide friendly hostname-based node names on DigitalOcean
Addons
- Update Prometheus from v2.7.1 to v2.8.0
- Refresh rules based on upstreams (#426)
- Define NetworkPolicy to allow only traffic from the Grafana addon
- Update Grafana from v6.0.0 to v6.0.2
- Add liveness and readiness probes
- Refresh dashboards and organize to stay below ConfigMap size limit (#426)
- Remove heapster manifests from addons (#427)
  - Heapster addon powers `kubectl top` (in early Kubernetes, running the addon was expected). Today, there are better monitoring options.
  - `kubectl top` reliance on a non-core extension means it's not in scope for minimal Kubernetes
    - Look to prior releases if you still wish to apply heapster
v1.13.4
- Kubernetes v1.13.4
- Update etcd from v3.3.11 to v3.3.12
- Update Calico from v3.5.0 to v3.5.2
- Assign priorityClassNames to critical cluster and node components (#406)
- Inform node out-of-resource eviction and scheduler preemption and ordering
- Add CoreDNS readiness probe (#410)
Bare-Metal
- Recommend updating terraform-provider-matchbox plugin from v0.2.2 to v0.2.3 (#402)
- Improve docs on using Ubiquiti EdgeOS with bare-metal clusters (#413)
Google Cloud
- Support `terraform-provider-google` v2.0+ (#407)
  - Require `terraform-provider-google` v1.19+ (action required)
- Set the minimum CPU platform to Intel Haswell (#405)
- Haswell or better is available in every zone (no price change)
- A few zones still default to Sandy/Ivy Bridge (shifts in April 2019)
Addons
- Modernize Prometheus rules and alerts (#404)
- Drop extraneous metrics (#397)
- Add `pod` name label to metrics discovered via service endpoints
- Rename `kubernetes_namespace` label to `namespace`
- Modernize Grafana and dashboards, see docs (#403, #404)
- Update nginx-ingress from v0.22.0 to v0.23.0
- Raise nginx-ingress liveness/readiness timeout to 5 seconds
- Remove nginx-ingress default-backend (#401)
Fedora Atomic
- Build Kubelet system container with buildah. The image uses the OCI format and is slightly larger.
v1.13.3
- Kubernetes v1.13.3
- Update etcd from v3.3.10 to v3.3.11
- Update CoreDNS from v1.3.0 to v1.3.1
  - Switch from the `proxy` plugin to the faster `forward` plugin for upstream resolvers
- Update Calico from v3.4.0 to v3.5.0
- Update flannel from v0.10.0 to v0.11.0
- Reduce pod eviction timeout for deleting pods on unready nodes to 1 minute
- Respond more quickly to node preemption (previously 5 minutes)
- Fix automatic worker deletion on shutdown for cloud platforms
  - Lowering Kubelet privileges in #372 dropped a needed node deletion authorization. Scale-in due to manual terraform apply (any cloud), AWS spot termination, or Azure low priority deletion left old nodes registered, requiring manual deletion (`kubectl delete node name`)
AWS
- Add `ingress_zone_id` output with the NLB DNS name's Route53 zone for use in alias records (#380)
Azure
- Fix azure provider warning, `public_ip` `allocation_method` replaces `public_ip_address_allocation`
  - Require `terraform-provider-azurerm` v1.21+ (action required)
Addons
- Update nginx-ingress from v0.21.0 to v0.22.0
- Update Prometheus from v2.6.0 to v2.7.1
- Update kube-state-metrics from v1.4.0 to v1.5.0
- Fix ClusterRole to collect and export PodDisruptionBudget metrics (#383)
- Update node-exporter from v0.15.2 to v0.17.0
- Update Grafana from v5.4.2 to v5.4.3
v1.13.2
- Kubernetes v1.13.2
- Add ServiceAccounts for `kube-apiserver` and `kube-scheduler` (#370)
- Use lower-privilege TLS client certificates for Kubelets (#372)
- Use HTTPS liveness probes for `kube-scheduler` and `kube-controller-manager` (#377)
- Update CoreDNS from v1.2.6 to v1.3.0
- Allow the `certificates.k8s.io` API to issue certificates signed by the cluster CA (#376)
  - Configure controller manager to sign CSRs that are manually approved by an administrator
AWS
- Change `controller_type` and `worker_type` default from t2.small to t3.small (#365)
  - t3.small is cheaper, provides 2 vCPU (instead of 1), and 5 Gbps of pod-to-pod bandwidth!
Bare-Metal
- Remove the `kubeconfig` output variable
Addons
- Update Prometheus from v2.5.0 to v2.6.0
v1.13.1
- Kubernetes v1.13.1
- Update Calico from v3.3.2 to v3.4.0 (#362)
- Install CNI plugins with an init container rather than a sidecar
- Improve the `calico-node` ClusterRole
- Recommend updating `terraform-provider-ct` plugin from v0.2.1 to v0.3.0 (#363)
  - Migration instructions for upgrading `terraform-provider-ct` in-place for v1.12.2+ clusters (action required)
  - Require switching from `~/.terraformrc` to the Terraform third-party plugins directory `~/.terraform.d/plugins/`
  - Require Container Linux 1688.5.3 or newer
Google Cloud
- Increase TCP proxy apiserver backend service timeout from 1 minute to 5 minutes (#361)
  - Align `port-forward` behavior closer to AWS/Azure (no timeout)
Addons
- Update Grafana from v5.4.0 to v5.4.2
v1.13.0
Addons
- Update Grafana from v5.3.4 to v5.4.0
- Disable Grafana login form, since admin user can't be disabled (#352)
- Example manifests aim to provide a read-only dashboard view
v1.12.3
- Kubernetes v1.12.3
- Add `enable_reporting` variable (default "false") to provide upstreams with usage data (#345)
- Change kube-apiserver `--kubelet-preferred-address-types` to InternalIP,ExternalIP,Hostname
- Update Calico from v3.3.0 to v3.3.1
- Disable Felix usage reporting by default (#345)
- Improve flannel manifests
- Update CoreDNS from v1.2.4 to v1.2.6
  - Enable CoreDNS `loop` and `loadbalance` plugins (#340)
- Fix pod-checkpointer log noise and checkpointable pods detection (#346)
- Use kubernetes-incubator/bootkube v0.14.0
- Recommend switching from `~/.terraformrc` to the Terraform third-party plugins directory `~/.terraform.d/plugins/`.
  - Allows pinning `terraform-provider-ct` and `terraform-provider-matchbox` versions
  - Improves safety of later plugin version migrations
Azure
- Use eviction policy `Delete` for `Low` priority virtual machine scale set workers (#343)
  - Fix issue where Azure defaults to `Deallocate` eviction policy, which required manually restarting deallocated instances. `Delete` policy aligns Azure with AWS and GCP behavior.
  - Require `terraform-provider-azurerm` v1.19+ (action required)
Bare-Metal
- Add Kubelet `/etc/iscsi` and `iscsiadm` mounts on bare-metal for iSCSI (#103)
Addons
- Update nginx-ingress from v0.20.0 to v0.21.0
- Update Prometheus from v2.4.3 to v2.5.0
- Update Grafana from v5.3.2 to v5.3.4
v1.12.2
- Kubernetes v1.12.2
- Update CoreDNS from 1.2.2 to 1.2.4
- Update Calico from v3.2.3 to v3.3.0
- Disable Kubelet read-only port (#324)
- Fix CoreDNS AntiAffinity spec to prefer spreading replicas
- Ignore controller node user-data changes (#335)
  - Once all managed clusters use v1.12.2, it is possible to update `terraform-provider-ct`
AWS
- Add `disk_iops` variable for EBS volume IOPS (#314)
Azure
- Use new `azurerm_network_interface_backend_address_pool_association` (#332)
  - Require `terraform-provider-azurerm` v1.17+ (action required)
- Add `primary` field to `ip_configuration` needed by v1.17+ (#331)
DigitalOcean
- Add AAAA DNS records resolving to worker nodes (#333)
  - Hosting IPv6 apps requires editing nginx-ingress with `hostNetwork: true`
Google Cloud
- Add an IPv6 address and IPv6 forwarding rules for load balancing IPv6 Ingress (#334)
  - Add `ingress_static_ipv6` output variable for use in AAAA DNS records
  - Allow serving IPv6 applications via Kubernetes Ingress
Addons
- Configure Heapster to scrape Kubelets with bearer token auth (#323)
- Update Grafana from v5.3.1 to v5.3.2
v1.12.1
- Kubernetes v1.12.1
- Update etcd from v3.3.9 to v3.3.10
- Update CoreDNS from 1.1.3 to 1.2.2
- Update Calico from v3.2.1 to v3.2.3
- Raise scheduler and controller-manager replicas to the larger of 2 or the number of controller nodes (#312)
- Single-controller clusters continue to run 2 replicas as before
- Raise default CoreDNS replicas to the larger of 2 or the number of controller nodes (#313)
- Add AntiAffinity preferred rule to favor spreading CoreDNS pods
- Annotate control plane and addon containers to use the Docker runtime seccomp profile (#319)
  - Override Kubernetes default behavior that starts containers with `seccomp=unconfined`
Azure
- Remove `admin_password` field (disabled) since it is now optional
  - Require `terraform-provider-azurerm` v1.16+ (action required)
Bare-Metal
- Add support for `cached_install` mode with Flatcar Linux (#315)
DigitalOcean
- Require `terraform-provider-digitalocean` v1.0+ (action required)
Addons
- Update nginx-ingress from v0.19.0 to v0.20.0
- Update Prometheus from v2.3.2 to v2.4.3
- Update Grafana from v5.2.4 to v5.3.1
v1.11.3
- Kubernetes v1.11.3
- Introduce Typhoon for Azure as alpha (#288)
- Special thanks @justaugustus for an earlier variant
- Update Calico from v3.1.3 to v3.2.1 (#278)
AWS
- Remove firewall rule allowing ICMP packets to nodes (#285)
Bare-Metal
- Remove `controller_networkds` and `worker_networkds` variables. Use Container Linux Config snippets (#277)
Google Cloud
- Fix firewall to allow etcd client port 2379 traffic between controller nodes (#287)
- kube-apiservers were only able to connect to their node's local etcd peer. While master node outages were tolerated, reaching a healthy peer took longer than necessary in some cases
- Reduce time needed to bootstrap the cluster
- Remove firewall rule allowing workers to access Nginx Ingress health check (#284)
- Nginx Ingress addon no longer uses hostNetwork, Prometheus scrapes via CNI network
Addons
- Update nginx-ingress from 0.17.1 to 0.19.0
- Update kube-state-metrics from v1.3.1 to v1.4.0
- Update Grafana from 5.2.2 to 5.2.4
v1.11.2
- Kubernetes v1.11.2
- Update etcd from v3.3.8 to v3.3.9
- Use kubernetes-incubator/bootkube v0.13.0
- Fix Fedora Atomic modules' Kubelet version (#270)
Bare-Metal
- Introduce Container Linux Config snippets on bare-metal
- Validate and additively merge custom Container Linux Configs during terraform plan
- Define files, systemd units, dropins, networkd configs, mounts, users, and more
- Require `terraform-provider-ct` plugin v0.2.1 (action required!)
Addons
- Update nginx-ingress from 0.16.2 to 0.17.1
- Add nginx-ingress manifests for bare-metal
- Update Grafana from 5.2.1 to 5.2.2
- Update heapster from v1.5.3 to v1.5.4
v1.11.1
- Kubernetes v1.11.1
Addons
- Update Prometheus from v2.3.1 to v2.3.2
Errata
- Fedora Atomic modules shipped with Kubelet v1.11.0, instead of v1.11.1. Fixed in #270.
v1.11.0
- Kubernetes v1.11.0
- Force apiserver to stop listening on `127.0.0.1:8080`
- Replace `kube-dns` with CoreDNS (#261)
  - Edit the `coredns` ConfigMap to customize
  - CoreDNS doesn't use a resizer. For large clusters, scaling may be required.
AWS
- Update from Fedora Atomic 27 to 28 (#258)
Bare-Metal
- Update from Fedora Atomic 27 to 28 (#263)
Google Cloud
- Promote Google Cloud to stable
- Update from Fedora Atomic 27 to 28 (#259)
- Remove `ingress_static_ip` module output. Use `ingress_static_ipv4`.
- Remove `controllers_ipv4_public` module output.
Addons
- Update nginx-ingress from 0.15.0 to 0.16.2
- Update Grafana from 5.1.4 to 5.2.1
- Update heapster from v1.5.2 to v1.5.3
v1.10.5
AWS
- Switch `kube-apiserver` port from 443 to 6443 (#248)
- Combine apiserver and ingress NLBs (#249)
- Reduce cost by ~$18/month per cluster. Typhoon AWS clusters now use one network load balancer.
- Ingress addon users may keep using CNAME records to the `ingress_dns_name` module output (few million RPS)
- Ingress users with heavy traffic (many million RPS) should create a separate NLB(s)
- Worker pools no longer include an extraneous load balancer. Remove worker module's `ingress_dns_name` output
- Disable detailed (paid) monitoring on worker nodes (#251)
- Favor Prometheus for cloud-agnostic metrics, aggregation, and alerting
- Add `worker_target_group_http` and `worker_target_group_https` module outputs to allow custom load balancing
- Add `target_group_http` and `target_group_https` worker module outputs to allow custom load balancing
Bare-Metal
- Switch `kube-apiserver` port from 443 to 6443 (#248)
  - Users who exposed kube-apiserver on a WAN via their router/load-balancer will need to adjust its configuration (e.g. DNAT 6443). Most apiservers are on a LAN (internal, VPN-only, etc) so if you didn't specially configure network gear for 443, no change is needed. (possible action required)
- Fix possible deadlock when provisioning clusters larger than 10 nodes (#244)
DigitalOcean
- Switch `kube-apiserver` port from 443 to 6443 (#248)
  - Update firewall rules and generated kubeconfigs
Google Cloud
- Use global HTTP and TCP proxy load balancing for Kubernetes Ingress (#252)
- Switch Ingress from regional network load balancers to global HTTP/TCP Proxy load balancing
- Reduce cost by ~$19/month per cluster. Google bills the first 5 global and regional forwarding rules separately. Typhoon clusters now use 3 global and 0 regional forwarding rules.
- Worker pools no longer include an extraneous load balancer. Remove worker module's `ingress_static_ip` output
- Allow using nginx-ingress addon on Fedora Atomic clusters (#200)
- Add `worker_instance_group` module output to allow custom global load balancing
- Add `instance_group` worker module output to allow custom global load balancing
- Deprecate `ingress_static_ip` module output. Add `ingress_static_ipv4` module output instead.
- Deprecate `controllers_ipv4_public` module output
Addons
- Update CLUO from v0.6.0 to v0.7.0 (#242)
- Update Prometheus from v2.3.0 to v2.3.1
- Update Grafana from 5.1.3 to 5.1.4
- Drop `hostNetwork` from nginx-ingress addon
  - Both flannel and Calico support host port via `portmap`
  - Allows writing NetworkPolicies that reference ingress pods in `from` or `to`. HostNetwork pods were difficult to write network policy for since they could circumvent the CNI network to communicate with pods on the same node.
v1.10.4
- Kubernetes v1.10.4
- Update etcd from v3.3.5 to v3.3.6
- Update Calico from v3.1.2 to v3.1.3
Addons
- Update Prometheus from v2.2.1 to v2.3.0
- Add Prometheus liveness and readiness probes
- Annotate Grafana service so Prometheus scrapes metrics
- Label namespaces to ease writing Network Policies
v1.10.3
- Kubernetes v1.10.3
- Add Flatcar Linux (Container Linux derivative) as an option for AWS and bare-metal (thanks @kinvolk folks)
- Allow bearer token authentication to the Kubelet (#216)
- Require Webhook authorization to the Kubelet
- Switch apiserver X509 client cert org to satisfy new authorization requirement
- Require Terraform v0.11.x and drop support for v0.10.x (migration guide)
- Update etcd from v3.3.4 to v3.3.5 (#213)
- Update Calico from v3.1.1 to v3.1.2
AWS
- Allow Flatcar Linux by setting `os_image` to flatcar-stable (default), flatcar-beta, flatcar-alpha (#211)
- Replace `os_channel` variable with `os_image` to align naming across clouds
  - Please change values stable, beta, or alpha to coreos-stable, coreos-beta, coreos-alpha (action required!)
- Allow preemptible workers via spot instances (#202)
  - Add `worker_price` to allow worker spot instances. Default to empty string for the worker autoscaling group to use regular on-demand instances
  - Add `spot_price` to internal `workers` module for spot worker pools
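Opting a cluster's workers into spot instances could look like the following. A minimal sketch: the module name `tempest` and the price are illustrative assumptions, and the reading of `worker_price` as a maximum hourly price in USD is an assumption not spelled out in these notes.

```hcl
module "tempest" {
  # ...source and other cluster variables omitted...

  # Assumed to be the maximum hourly spot price in USD, given as a string.
  # Leaving it unset (empty string default) keeps regular on-demand workers.
  worker_price = "0.0309"
}
```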
Bare-Metal
- Allow Flatcar Linux by setting `os_channel` to flatcar-stable, flatcar-beta, flatcar-alpha (#220)
- Replace `container_linux_channel` variable with `os_channel`
  - Please change values stable, beta, or alpha to coreos-stable, coreos-beta, coreos-alpha (action required!)
- Replace `container_linux_version` variable with `os_version`
- Add `network_ip_autodetection_method` variable for Calico host IPv4 address detection
  - Use Calico's default "first-found" to support single NIC and bonded NIC nodes
  - Allow alternative methods for multi NIC nodes, like can-reach=IP or interface=REGEX
- Deprecate `container_linux_oem` variable
DigitalOcean
- Update Fedora Atomic module to use Fedora Atomic 28 (#225)
- Fedora Atomic 27 images disappeared from DigitalOcean and forced this early update
Addons
- Fix Prometheus data directory location (#203)
- Configure Prometheus to scrape Kubelets directly with bearer token auth instead of proxying through the apiserver (#217)
  - Security improvement: Drop RBAC permission from `nodes/proxy` to `nodes/metrics`
  - Scale: Remove per-node proxied scrape load from the apiserver
- Update Grafana from v5.0.4 to v5.1.3 (#208)
- Disable Grafana Google Analytics by default (#214)
- Update nginx-ingress from 0.14.0 to 0.15.0
- Annotate nginx-ingress service so Prometheus auto-discovers and scrapes service endpoints (#222)
v1.10.2
- Kubernetes v1.10.2
- Introduce Typhoon for Fedora Atomic (#199)
- Update Calico from v3.0.4 to v3.1.1 (#197)
- Update etcd from v3.3.3 to v3.3.4
- Update kube-dns from v1.14.9 to v1.14.10
Google Cloud
- Add support for multi-controller clusters (i.e. multi-master) (#54, #190)
- Switch from Google Cloud network load balancer to a TCP proxy load balancer. Avoid a bug in Google network load balancers that limited clusters to only bootstrapping one controller node.
- Add TCP health check for apiserver pods on controllers. Replace kubelet check approximation.
Addons
- Update nginx-ingress from 0.12.0 to 0.14.0
- Update kube-state-metrics from v1.3.0 to v1.3.1
v1.10.1
- Kubernetes v1.10.1
- Enable etcd v3.3 metrics endpoint (#175)
- Use `k8s.gcr.io` instead of `gcr.io/google_containers` (#180)
  - Kubernetes recommends using the alias to pull from the nearest regional mirror and to abstract the backing container registry
- Update etcd from v3.3.2 to v3.3.3
- Update kube-dns from v1.14.8 to v1.14.9
- Use kubernetes-incubator/bootkube v0.12.0
Bare-Metal
- Fix need for multiple `terraform apply` runs to create a cluster with Terraform v0.11.4 (#181)
  - To SSH during a disk install for debugging, SSH as user "core" with port 2222
  - Remove the old trick of using a user "debug" during disk install
Google Cloud
- Refactor out the `controller` internal module
Addons
- Add Prometheus discovery for etcd peers on controller nodes (#175)
  - Scrape etcd v3.3 `--listen-metrics-urls` for metrics
  - Enable etcd alerts and populate the etcd Grafana dashboard
- Update kube-state-metrics from v1.2.0 to v1.3.0
v1.10.0
- Kubernetes v1.10.0
- Remove unused, unmaintained `pxe-worker` internal module
AWS
- Add `disk_type` optional variable for setting the EBS volume type (#176)
  - Change default type from `standard` to `gp2`. Prometheus etcd alerts are tuned for fast disks.
Digital Ocean
- Ensure etcd secrets are only distributed to controller hosts, not workers.
- Remove `networking` optional variable. Only flannel works on Digital Ocean.
Google Cloud
- Add `disk_size` optional variable for setting instance disk size in GB
- Add `controller_type` optional variable for setting machine type for controllers
- Add `worker_type` optional variable for setting machine type for workers
- Remove `machine_type` optional variable. Use `controller_type` and `worker_type`.
Addons
v1.9.6
- Kubernetes v1.9.6
- Update Calico from v3.0.3 to v3.0.4
Addons
- Update heapster from v1.5.1 to v1.5.2
v1.9.5
- Kubernetes v1.9.5
  - Fix `subPath` volume mounts regression (kubernetes#61076)
- Introduce Container Linux Config snippets on cloud platforms (#145)
  - Validate and additively merge custom Container Linux Configs during `terraform plan`
  - Define files, systemd units, dropins, networkd configs, mounts, users, and more
  - Require updating `terraform-provider-ct` plugin from v0.2.0 to v0.2.1
- Add `node-role.kubernetes.io/controller="true"` node label to controllers (#160)
AWS
Digital Ocean
Google Cloud
- Require updating `terraform-provider-ct` plugin from v0.2.0 to v0.2.1 (action required!)
- Relax `os_image` to optional. Default to "coreos-stable".
Addons
- Update nginx-ingress from 0.11.0 to 0.12.0
- Update Prometheus from 2.2.0 to 2.2.1
v1.9.4
- Kubernetes v1.9.4
- Secret, configMap, downward API, and projected volumes now read-only (breaking, kubernetes#58720)
- Regressed `subPath` volume mounts (regression, kubernetes#61076)
- Mitigated `subPath` CVE-2017-1002101
- Introduce worker pools for AWS and Google Cloud for joining heterogeneous workers to existing clusters.
- Use new Network Load Balancers and cross zone load balancing on AWS
- Allow flexvolume plugins to be used on any Typhoon cluster (not just bare-metal)
- Upgrade etcd from v3.2.15 to v3.3.2
- Update Calico from v3.0.2 to v3.0.3
- Use kubernetes-incubator/bootkube v0.11.0
- Recommend updating `terraform-provider-ct` plugin from v0.2.0 to v0.2.1 (action recommended)
AWS
- Promote AWS platform to stable
- Allow groups of workers to be defined and joined to a cluster (i.e. worker pools) (#150)
- Replace the apiserver elastic load balancer with a network load balancer (#136)
- Replace the Ingress elastic load balancer with a network load balancer (#141)
- AWS NLBs can handle millions of RPS with high throughput and low latency.
- Require `terraform-provider-aws` 1.7.0 or higher
- Enable NLB cross-zone load balancing (#159)
- Requests are automatically evenly distributed to targets regardless of AZ
- Require `terraform-provider-aws` 1.11.0 or higher
- Add kubelet `--volume-plugin-dir` flag to allow flexvolume plugins (#142)
- Fix controller and worker launch configs to ignore AMI changes (#126, #158)
Digital Ocean
- Add kubelet `--volume-plugin-dir` flag to allow flexvolume plugins (#142)
- Fix to pass `ssh_fingerprints` as a list to droplets (#143)
Google Cloud
- Allow groups of workers to be defined and joined to a cluster (i.e. worker pools) (#148)
- Add kubelet `--volume-plugin-dir` flag to allow flexvolume plugins (#142)
- Add `kubeconfig` variable to `controllers` and `workers` submodules (#147)
- Remove `kubeconfig_*` variables from `controllers` and `workers` submodules (#147)
- Allow initial experimentation with accelerators (i.e. GPUs) on workers (#161) (unofficial)
  - Require `terraform-provider-google` v1.6.0
Addons
- Update Prometheus from 2.1.0 to 2.2.0 (#153)
- Scrape Prometheus itself to enable alerts about Prometheus itself
- Adjust KubeletDown rule to fire when 10% of kubelets are down
- Update heapster from v1.5.0 to v1.5.1 (#131)
- Use separate service account
- Update nginx-ingress from 0.10.2 to 0.11.0
v1.9.3
- Kubernetes v1.9.3
- Network improvements and fixes (#104)
- Switch from Calico v2.6.6 to v3.0.2
- Add Calico GlobalNetworkSet CRD
- Update flannel from v0.9.0 to v0.10.0
- Use separate service account for flannel
- Update etcd from v3.2.14 to v3.2.15
Digital Ocean
- Use new Droplet types which offer more CPU/memory, at lower cost. (#105)
- A small Digital Ocean cluster costs less than $25 a month!
Addons
- Update Prometheus from v2.0.0 to v2.1.0 (#113)
- Improve alerting rules
- Relabel discovered kubelet, endpoint, service, and apiserver scrapes
- Use separate service accounts
- Update node-exporter and kube-state-metrics
- Include Grafana dashboards for Kubernetes admins (#113)
- Add grafana-watcher to load bundled upstream dashboards
- Update nginx-ingress from 0.9.0 to 0.10.2
- Update CLUO from v0.5.0 to v0.6.0
- Switch manifests to use `apps/v1` Deployments and Daemonsets (#120)
- Remove Kubernetes Dashboard manifests (#121)
v1.9.2
- Kubernetes v1.9.2
- Add Terraform v0.11.x support
- Add explicit "providers" section to modules for Terraform v0.11.x
- Retain support for Terraform v0.10.4+
- Add migration guide from Terraform v0.10.x to v0.11.x (action required!)
- Update etcd from 3.2.13 to 3.2.14
- Update calico from 2.6.5 to 2.6.6
- Update kube-dns from v1.14.7 to v1.14.8
- Use separate service account for kube-dns
- Use kubernetes-incubator/bootkube v0.10.0
Bare-Metal
- Use per-node Container Linux install profiles (#97)
- Allow Container Linux channel/version to be chosen per-cluster
- Fix issue where cluster deletion could require `terraform apply` multiple times
Digital Ocean
- Relax `digitalocean` provider version constraint
- Fix bug with `terraform plan` always showing a firewall diff to be applied (#3)
Addons
- Update CLUO to v0.5.0 to fix compatibility with Kubernetes 1.9 (important)
- Earlier versions can't roll out Container Linux updates on Kubernetes 1.9 nodes (cluo#163)
- Update kube-state-metrics from v1.1.0 to v1.2.0
- Fix RBAC cluster role for kube-state-metrics
v1.9.1
- Kubernetes v1.9.1
- Update kube-dns from 1.14.5 to v1.14.7
- Update etcd from 3.2.0 to 3.2.13
- Update Calico from v2.6.4 to v2.6.5
- Enable portmap to fix hostPort with Calico
- Use separate service account for controller-manager
v1.8.6
- Kubernetes v1.8.6
- Update Calico from v2.6.3 to v2.6.4
v1.8.5
- Kubernetes v1.8.5
- Recommend Container Linux images with Docker 17.09
- Container Linux stable, beta, and alpha now provide Docker 17.09 (instead of 1.12)
- Older clusters (with CLUO addon) auto-update Container Linux version to begin using Docker 17.09
- Fix race where `etcd-member.service` could fail to resolve peers (#69)
- Add optional `cluster_domain_suffix` variable (#74)
- Use kubernetes-incubator/bootkube v0.9.1
Bare-Metal
- Add kubelet `--volume-plugin-dir` flag to allow flexvolume providers (#61)
Addons
- Discourage deploying the Kubernetes Dashboard (security)
v1.8.4
- Kubernetes v1.8.4
- Calico related bug fixes
- Update Calico from v2.6.1 to v2.6.3
- Update flannel from v0.9.0 to v0.9.1
- Service accounts for kube-proxy and pod-checkpointer
- Use kubernetes-incubator/bootkube v0.9.0
v1.8.3
- Kubernetes v1.8.3
- Run etcd on-host, across controllers
- Promote AWS platform to beta
- Use kubernetes-incubator/bootkube v0.8.2
Google Cloud
- Add required variable `region` (e.g. "us-central1")
- Reduce time to bootstrap a cluster
- Change etcd to run on-host, across controllers (etcd-member.service)
- Change controller instances to automatically span zones in the region
- Change worker managed instance group to automatically span zones in the region
- Improve internal firewall rules and use tag-based firewall policies
- Remove support for self-hosted etcd
- Remove the `zone` required variable
- Remove the `controller_preemptible` optional variable
AWS
- Promote AWS platform to beta
- Reduce time to bootstrap a cluster
- Change etcd to run on-host, across controllers (etcd-member.service)
- Fix firewall rules for multi-controller kubelet scraping and node-exporter
- Remove support for self-hosted etcd
Addons
- Add Prometheus 2.0 addon with alerting rules
- Add Grafana dashboard for observing metrics
v1.8.2
- Kubernetes v1.8.2
- Fixes a memory leak in the v1.8.1 apiserver (kubernetes#53485)
- Switch to using the `gcr.io/google_containers/hyperkube` image
- Update flannel from v0.8.0 to v0.9.0
- Add `hairpinMode` to flannel CNI config
- Add `--no-negcache` to kube-dns dnsmasq
- Use kubernetes-incubator/bootkube v0.8.1
v1.8.1
- Kubernetes v1.8.1
- Use kubernetes-incubator/bootkube v0.8.0
Digital Ocean
- Run etcd cluster across controller nodes (etcd-member.service)
- Remove support for self-hosted etcd
- Reduce time to bootstrap a cluster
v1.7.7
- Kubernetes v1.7.7
- Use kubernetes-incubator/bootkube v0.7.0
- Update kube-dns to 1.14.5 to fix dnsmasq vulnerability
- Calico v2.6.1
- flannel-cni v0.3.0
- Update flannel CNI config to fix hostPort
v1.7.5
- Kubernetes v1.7.5
- Use kubernetes-incubator/bootkube v0.6.2
- Add AWS Terraform module (alpha)
- Add support for Calico networking (bare-metal, Google Cloud, AWS)
- Change networking default from "flannel" to "calico"
AWS
- Add `network_mtu` to allow CNI interface MTU customization
Bare-Metal
- Add `network_mtu` to allow CNI interface MTU customization
- Remove support for `experimental_self_hosted_etcd`
v1.7.3
- Kubernetes v1.7.3
- Use kubernetes-incubator/bootkube v0.6.1
Digital Ocean
- Add cloud firewall rules (requires Terraform v0.10)
- Change nodes tags from strings to DO tags
v1.7.1
- Kubernetes v1.7.1
- Use kubernetes-incubator/bootkube v0.6.0
- Add Bare-Metal Terraform module (stable)
- Add Digital Ocean Terraform module (beta)
Google Cloud
- Remove `k8s_domain_name` variable, `cluster_name` + `dns_zone` resolves to controllers
- Rename `dns_base_zone` to `dns_zone`
- Rename `dns_base_zone_name` to `dns_zone_name`
v1.6.7
- Kubernetes v1.6.7
- Use kubernetes-incubator/bootkube v0.5.1
v1.6.6
- Kubernetes v1.6.6
- Use kubernetes-incubator/bootkube v0.4.5
- Disable locksmithd on hosts, in favor of CLUO.
v1.6.4
- Kubernetes v1.6.4
- Add Google Cloud Terraform module (stable)
Earlier
Earlier versions, back to v1.3.0, used different designs and mechanisms.