1
0
Fork 0
mirror of https://github.com/poseidon/typhoon synced 2024-05-26 09:26:24 +02:00
typhoon/aws/flatcar-linux/kubernetes/workers/butane/worker.yaml

130 lines
4.5 KiB
YAML
Raw Normal View History

variant: flatcar
version: 1.0.0
systemd:
units:
- name: docker.service
enabled: true
- name: locksmithd.service
mask: true
- name: wait-for-dns.service
enabled: true
contents: |
[Unit]
Description=Wait for DNS entries
Wants=systemd-resolved.service
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/sh -c 'while ! /usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done'
[Install]
RequiredBy=kubelet.service
- name: kubelet.service
enabled: true
contents: |
[Unit]
Enable Kubelet TLS bootstrap and NodeRestriction * Enable bootstrap token authentication on kube-apiserver * Generate the bootstrap.kubernetes.io/token Secret that may be used as a bootstrap token * Generate a bootstrap kubeconfig (with a bootstrap token) to be securely distributed to nodes. Each Kubelet will use the bootstrap kubeconfig to authenticate to kube-apiserver as `system:bootstrappers` and send a node-unique CSR for kube-controller-manager to automatically approve to issue a Kubelet certificate and kubeconfig (expires in 72 hours) * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the `system:node-bootstrapper` ClusterRole * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the csr nodeclient ClusterRole * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the csr selfnodeclient ClusterRole * Enable NodeRestriction admission controller to limit the scope of Node or Pod objects a Kubelet can modify to those of the node itself * Ability for a Kubelet to delete its Node object is retained as preemptible nodes or those in auto-scaling instance groups need to be able to remove themselves on shutdown. This need continues to have precedence over any risk of a node deleting itself maliciously Security notes: 1. Issued Kubelet certificates authenticate as user `system:node:NAME` and group `system:nodes` and are limited in their authorization to perform API operations by Node authorization and NodeRestriction admission. Previously, a Kubelet's authorization was broader. This is the primary security motivation. 2. The bootstrap kubeconfig credential has the same sensitivity as the previous generated TLS client-certificate kubeconfig. It must be distributed securely to nodes. Its compromise still allows an attacker to obtain a Kubelet kubeconfig 3. Bootstrapping Kubelet kubeconfig's with a limited lifetime offers a slight security improvement. * An attacker who obtains the kubeconfig can likely obtain the bootstrap kubeconfig as well, to obtain the ability to renew their access * A compromised bootstrap kubeconfig could plausibly be handled by replacing the bootstrap token Secret, distributing the token to new nodes, and expiration. Whereas a compromised TLS-client certificate kubeconfig can't be revoked (no CRL). However, replacing a bootstrap token can be impractical in real cluster environments, so the limited lifetime is mostly a theoretical benefit. * Cluster CSR objects are visible via kubectl which is nice 4. Bootstrapping node-unique Kubelet kubeconfigs means Kubelet clients have more identity information, which can improve the utility of audits and future features Rel: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/ Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/185
2020-04-26 01:50:51 +02:00
Description=Kubelet
Requires=docker.service
After=docker.service
Requires=coreos-metadata.service
After=coreos-metadata.service
Wants=rpc-statd.service
[Service]
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.29.3
EnvironmentFile=/run/metadata/coreos
ExecStartPre=/bin/mkdir -p /etc/cni/net.d
Inline Container Linux kubelet.service, deprecate kubelet-wrapper * Change kubelet.service on Container Linux nodes to ExecStart Kubelet inline to replace the use of the host OS kubelet-wrapper script * Express rkt run flags and volume mounts in a clear, uniform way to make the Kubelet service easier to audit, manage, and understand * Eliminate reliance on a Container Linux kubelet-wrapper script * Typhoon for Fedora CoreOS developed a kubelet.service that similarly uses an inline ExecStart (except with podman instead of rkt) and a more minimal set of volume mounts. Adopt the volume improvements: * Change Kubelet /etc/kubernetes volume to read-only * Change Kubelet /etc/resolv.conf volume to read-only * Remove unneeded /var/lib/cni volume mount Background: * kubelet-wrapper was added in CoreOS around the time of Kubernetes v1.0 to simplify running a CoreOS-built hyperkube ACI image via rkt-fly. The script defaults are no longer ideal (e.g. rkt's notion of trust dates back to quay.io ACI image serving and signing, which informed the OCI standard images we use today, though they still lack rkt's signing ideas). * Shipping kubelet-wrapper was regretted at CoreOS, but remains in the distro for compatibility. The script is not updated to track hyperkube changes, but it is stable and kubelet.env overrides bridge most gaps * Typhoon Container Linux nodes have used kubelet-wrapper to rkt/rkt-fly run the Kubelet via the official k8s.gcr.io hyperkube image using overrides (new image registry, new image format, restart handling, new mounts, new entrypoint in v1.17). * Observation: Most of what it takes to run a Kubelet container is defined in Typhoon, not in kubelet-wrapper. The wrapper's value is now undermined by having to workaround its dated defaults. Typhoon may be better served defining Kubelet.service explicitly * Typhoon for Fedora CoreOS developed a kubelet.service without the use of a host OS kubelet-wrapper which is both clearer and eliminated some volume mounts
2019-12-29 20:17:26 +01:00
ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests
ExecStartPre=/bin/mkdir -p /opt/cni/bin
ExecStartPre=/bin/mkdir -p /var/lib/calico
ExecStartPre=/bin/mkdir -p /var/lib/kubelet/volumeplugins
ExecStartPre=/usr/bin/bash -c "grep 'certificate-authority-data' /etc/kubernetes/kubeconfig | awk '{print $2}' | base64 -d > /etc/kubernetes/ca.crt"
# Podman, rkt, or runc run container processes, whereas docker run
# is a client to a daemon and requires workarounds to use within a
# systemd unit. https://github.com/moby/moby/issues/6791
ExecStartPre=/usr/bin/docker run -d \
--name kubelet \
--privileged \
--pid host \
--network host \
-v /etc/cni/net.d:/etc/cni/net.d:ro \
-v /etc/kubernetes:/etc/kubernetes:ro \
-v /etc/machine-id:/etc/machine-id:ro \
-v /usr/lib/os-release:/etc/os-release:ro \
-v /lib/modules:/lib/modules:ro \
-v /run:/run \
-v /sys/fs/cgroup:/sys/fs/cgroup \
-v /var/lib/calico:/var/lib/calico:ro \
-v /var/lib/containerd:/var/lib/containerd \
-v /var/lib/kubelet:/var/lib/kubelet:rshared \
-v /var/log:/var/log \
-v /opt/cni/bin:/opt/cni/bin \
$${KUBELET_IMAGE} \
Enable Kubelet TLS bootstrap and NodeRestriction * Enable bootstrap token authentication on kube-apiserver * Generate the bootstrap.kubernetes.io/token Secret that may be used as a bootstrap token * Generate a bootstrap kubeconfig (with a bootstrap token) to be securely distributed to nodes. Each Kubelet will use the bootstrap kubeconfig to authenticate to kube-apiserver as `system:bootstrappers` and send a node-unique CSR for kube-controller-manager to automatically approve to issue a Kubelet certificate and kubeconfig (expires in 72 hours) * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the `system:node-bootstrapper` ClusterRole * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the csr nodeclient ClusterRole * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the csr selfnodeclient ClusterRole * Enable NodeRestriction admission controller to limit the scope of Node or Pod objects a Kubelet can modify to those of the node itself * Ability for a Kubelet to delete its Node object is retained as preemptible nodes or those in auto-scaling instance groups need to be able to remove themselves on shutdown. This need continues to have precedence over any risk of a node deleting itself maliciously Security notes: 1. Issued Kubelet certificates authenticate as user `system:node:NAME` and group `system:nodes` and are limited in their authorization to perform API operations by Node authorization and NodeRestriction admission. Previously, a Kubelet's authorization was broader. This is the primary security motivation. 2. The bootstrap kubeconfig credential has the same sensitivity as the previous generated TLS client-certificate kubeconfig. It must be distributed securely to nodes. Its compromise still allows an attacker to obtain a Kubelet kubeconfig 3. Bootstrapping Kubelet kubeconfig's with a limited lifetime offers a slight security improvement. * An attacker who obtains the kubeconfig can likely obtain the bootstrap kubeconfig as well, to obtain the ability to renew their access * A compromised bootstrap kubeconfig could plausibly be handled by replacing the bootstrap token Secret, distributing the token to new nodes, and expiration. Whereas a compromised TLS-client certificate kubeconfig can't be revoked (no CRL). However, replacing a bootstrap token can be impractical in real cluster environments, so the limited lifetime is mostly a theoretical benefit. * Cluster CSR objects are visible via kubectl which is nice 4. Bootstrapping node-unique Kubelet kubeconfigs means Kubelet clients have more identity information, which can improve the utility of audits and future features Rel: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/ Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/185
2020-04-26 01:50:51 +02:00
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--config=/etc/kubernetes/kubelet.yaml \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
Enable Kubelet TLS bootstrap and NodeRestriction * Enable bootstrap token authentication on kube-apiserver * Generate the bootstrap.kubernetes.io/token Secret that may be used as a bootstrap token * Generate a bootstrap kubeconfig (with a bootstrap token) to be securely distributed to nodes. Each Kubelet will use the bootstrap kubeconfig to authenticate to kube-apiserver as `system:bootstrappers` and send a node-unique CSR for kube-controller-manager to automatically approve to issue a Kubelet certificate and kubeconfig (expires in 72 hours) * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the `system:node-bootstrapper` ClusterRole * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the csr nodeclient ClusterRole * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the csr selfnodeclient ClusterRole * Enable NodeRestriction admission controller to limit the scope of Node or Pod objects a Kubelet can modify to those of the node itself * Ability for a Kubelet to delete its Node object is retained as preemptible nodes or those in auto-scaling instance groups need to be able to remove themselves on shutdown. This need continues to have precedence over any risk of a node deleting itself maliciously Security notes: 1. Issued Kubelet certificates authenticate as user `system:node:NAME` and group `system:nodes` and are limited in their authorization to perform API operations by Node authorization and NodeRestriction admission. Previously, a Kubelet's authorization was broader. This is the primary security motivation. 2. The bootstrap kubeconfig credential has the same sensitivity as the previous generated TLS client-certificate kubeconfig. It must be distributed securely to nodes. Its compromise still allows an attacker to obtain a Kubelet kubeconfig 3. Bootstrapping Kubelet kubeconfig's with a limited lifetime offers a slight security improvement. * An attacker who obtains the kubeconfig can likely obtain the bootstrap kubeconfig as well, to obtain the ability to renew their access * A compromised bootstrap kubeconfig could plausibly be handled by replacing the bootstrap token Secret, distributing the token to new nodes, and expiration. Whereas a compromised TLS-client certificate kubeconfig can't be revoked (no CRL). However, replacing a bootstrap token can be impractical in real cluster environments, so the limited lifetime is mostly a theoretical benefit. * Cluster CSR objects are visible via kubectl which is nice 4. Bootstrapping node-unique Kubelet kubeconfigs means Kubelet clients have more identity information, which can improve the utility of audits and future features Rel: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/ Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/185
2020-04-26 01:50:51 +02:00
--kubeconfig=/var/lib/kubelet/kubeconfig \
--node-labels=node.kubernetes.io/node \
%{~ for label in split(",", node_labels) ~}
--node-labels=${label} \
%{~ endfor ~}
%{~ for taint in split(",", node_taints) ~}
--register-with-taints=${taint} \
%{~ endfor ~}
--provider-id=aws:///$${COREOS_EC2_AVAILABILITY_ZONE}/$${COREOS_EC2_INSTANCE_ID}
ExecStart=docker logs -f kubelet
ExecStop=docker stop kubelet
ExecStopPost=docker rm kubelet
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
storage:
files:
- path: /etc/kubernetes/kubeconfig
mode: 0644
contents:
inline: |
${kubeconfig}
- path: /etc/kubernetes/kubelet.yaml
mode: 0644
contents:
inline: |
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
anonymous:
enabled: false
webhook:
enabled: true
x509:
clientCAFile: /etc/kubernetes/ca.crt
authorization:
mode: Webhook
cgroupDriver: systemd
clusterDNS:
- ${cluster_dns_service_ip}
clusterDomain: ${cluster_domain_suffix}
healthzPort: 0
rotateCertificates: true
shutdownGracePeriod: 45s
shutdownGracePeriodCriticalPods: 30s
staticPodPath: /etc/kubernetes/manifests
readOnlyPort: 0
resolvConf: /run/systemd/resolve/resolv.conf
volumePluginDir: /var/lib/kubelet/volumeplugins
- path: /etc/systemd/logind.conf.d/inhibitors.conf
contents:
inline: |
[Login]
InhibitDelayMaxSec=45s
- path: /etc/sysctl.d/max-user-watches.conf
mode: 0644
contents:
inline: |
fs.inotify.max_user_watches=16184
passwd:
users:
- name: core
ssh_authorized_keys:
- "${ssh_authorized_key}"