This was apparently hosted on the long-gone "apollo" server[1], and when
archweb was migrated to a dedicated cloud VM, it was changed to a
redirect to the main site (archlinux.org)[2][3].
It may have made sense at the time, but now, four years later, there is
no reason to keep this around.
I guess dev.archlinux.org was something similar to what pkgbuild.com is
today ("Public HTML server" for staff), but only for developers.
[1] f6c3af0e ("Merge branch 'apollo_decomission' into 'master'")
[2] 824fb084 ("tf-stage1/archlinux: Change DNS records for the archweb migration and also increase the machine size")
[3] 9800d023 ("roles/archweb: Create domain redirects for the domains that point to specific archweb sub urls.")
Somehow these changes were not applied right away, even though the role
reloads the Prometheus config.
Fixes: 10475a62 ("prometheus: Alert if a build hosts is OOM for 12h")
Signed-off-by: Christian Heusel <christian@heusel.eu>
There is not much value in knowing when one of our build hosts has run
out of memory, as all of them have plenty of swap available.
Additionally, these rules trigger quite often, even for short spikes.
Signed-off-by: Christian Heusel <christian@heusel.eu>
As per my announcement to arch-devops[1] and staff, this adds a Mumble
server for Arch Linux.
The password for the special root user SuperAdmin is automatically
generated on first launch and printed to the logs. I went ahead and
added it to the vault. It should not usually be required to log in as
SuperAdmin, though, as long as there are user admins around.
This uses certbot for local certificates.
[1] https://lists.archlinux.org/archives/list/arch-devops@lists.archlinux.org/thread/AHAOSTGFJTLQDSXLWFORDKGR6RDVHYEI/
It failed to reboot during the last upgrade procedure. Upon logging into
the Equinix Metal console, we discovered that we lack access to all 4 of
the servers sponsored by Equinix Metal. They are under the CNCF account,
and it's not possible to transfer them to our organization.
Equinix Metal is being sunset, and the remaining 3 servers will also go
away on June 30th 2026. We can keep them until then, or until they fail
to boot like seoul.mirror.pkgbuild.com.
Check the HTTPS DNS records of the following Geo domains:
- geo.mirror.pkgbuild.com
- riscv.mirror.pkgbuild.com
Ensure they return: "1 . alpn=h2,h3 ipv4hint=... ipv6hint=..."
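A quick way to check the expected format locally, sketched as a small
shell helper (the live records can be fetched with e.g.
`dig +short geo.mirror.pkgbuild.com HTTPS`; the hint addresses below are
placeholders, not live data):

```shell
#!/bin/sh
# Check that an HTTPS RR value advertises both HTTP/2 and HTTP/3 via ALPN.
check_record() {
    case "$1" in
        *alpn=h2,h3*) echo "OK" ;;
        *)            echo "MISSING alpn=h2,h3" ;;
    esac
}

# Example record value with placeholder hint addresses:
check_record '1 . alpn=h2,h3 ipv4hint=203.0.113.10 ipv6hint=2001:db8::10'
```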
Ref #606
It seems to have broken with the release of filesystem 2021.12.07, which
incorporates this upstream change[1] in [2]. Please also see the
upstream issue[3].
I'm not sure why we used ansible_fqdn in the first place as
inventory_hostname should be preferred (as we define it ourselves).
[1] ce266330fc
[2] fc84245e3e
[3] https://github.com/systemd/systemd/issues/20358
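A minimal sketch of the swap in an Ansible task (task and file names are
hypothetical):

```yaml
# Hypothetical task: use the inventory name we define ourselves instead
# of the fact gathered from the host.
- name: Template a config with the host's canonical name
  ansible.builtin.template:
    src: example.conf.j2
    dest: /etc/example.conf
  vars:
    host_name: "{{ inventory_hostname }}"  # previously "{{ ansible_fqdn }}"
```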
This may be interesting for our mirror administrators and mirror owners.
I tried backfilling the data, but was unsuccessful, due to a bug[1]. We
may try again if/when the bug is fixed.
[1] https://github.com/prometheus/prometheus/issues/13747
The HTTP code must be 2xx for probe_success to indicate that the probe
succeeded; if not, an alert will be sent.
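For reference, the blackbox exporter's http prober restricts probes to
2xx status codes when `valid_status_codes` is left empty; a sketch of
such a module (the module name is illustrative):

```yaml
modules:
  http_2xx:
    prober: http
    http:
      valid_status_codes: []  # defaults to 2xx
```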
Fixes: 653f8011 ("Add GitLab Pages for alpm-types[1]")
This alert only triggers for america.mirror.pkgbuild.com. Ideally, we
should be able to raise the threshold for high-bandwidth boxes, but I
don't see a straightforward way to implement that, so disable it for
now.
WireGuard was set up to provide an internal network with
confidentiality, authenticity, and integrity[1]. This migrates the
remaining Prometheus exporters to use the internal WireGuard network.
[1] 664deb67 ("WireGuard all hosts")
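As an illustration, a scrape job reaching an exporter via its
WireGuard-internal address might look like this (job name, address, and
port are placeholders):

```yaml
scrape_configs:
  - job_name: node
    static_configs:
      - targets:
          - "10.0.0.2:9100"  # exporter reached over the WireGuard network
```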
Fix #384
We want non-DevOps staff to be able to deploy project documentation
(e.g. repod) with GitLab Pages, and a separate domain was considered the
only sensible solution due to security issues[1].
[1] https://github.blog/2013-04-09-yummy-cookies-across-domains/
roles/prometheus/defaults/main.yml used to include a comment with the
commands used to generate the list of HTTPS endpoints to check. Move the
commands into a proper script and fix them to generate the correct
current list.
These are used to signal the start of a document in a stream of many
documents. As Ansible only supports one YAML document per file, this is
unnecessary. About a third of our YAML documents already lacked these
markers.
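For reference, this is the marker being removed (the key/value line is
just a placeholder):

```yaml
---  # document-start marker; only meaningful in multi-document streams
key: value
```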