
% ============================================================================ %
\nn{Introduction}
What we have commonly seen in the wild in recent years may be called
"professionalising", by which I mean that what was typically available 10
years ago cannot match practically anything that can be made ready in minutes
to cause real harm today.
% ============================================================================ %
\part{Theoretical part}
While denial of service can be caused in a multitude of different ways and can
impact any part of the stack, we are predominantly going to look at the ways
pertaining to internet networks.
\n{1}{Definition}
First we shall define what a denial of service (\it{DoS}) attack \it{is},
which can be achieved by looking at what it does.
A DoS attack is an action that harms a \it{service} in such a way that it can
no longer serve legitimate requests as a result of being occupied by bogus or
excessive requests from the attacker.
A DDoS is a DoS that is distributed among many participating devices (and
operators).
The participating devices are generally also victims themselves: most of these
attacks are performed using open DNS resolvers, home routers left to rot by
their vendors, misconfigured web services or IoT devices as involuntary
participants. All one has to do is open Shodan, look for specific open ports
(ports of protocols with a good reflection ratio, such as DNS, CLDAP or SSDP),
start probing and then reap the easy harvest. A quick search for devices
running with port 123 (NTP) open is certain to return a mind-blowing number
\cite{ShodanNTPd}.
\n{1}{Context}
In the past decade alone we have witnessed many large DoS/DDoS attacks, some of
them against critical infrastructure services such as cloud hosting, DNS, git
hosting services or CCTV cameras. All of these attacks weaponised poorly
managed endpoints, unpatched IoT devices or
malware-infected-hosts-turned-botnet-zombies. The intensity and frequency have
also been increasing sharply, with the latest attacks passing the Tbps
threshold (Akamai mitigated a 1.44 Tbps DDoS in 2020
\cite{akamai2020ddosretrospect}), data from Cisco noting an overall
\textbf{776\%} growth in attacks between 100 Gbps and 400 Gbps from 2018 to 2019,
and predictions for the total number of DDoS attacks to double from 7.9 million
in 2018 to 15.4 million by 2023 \cite{cisco2020report}. The question is: why?
The motives will probably more often than not stay a mystery; however, the
proliferation of DDoS-for-hire websites \cite{Santanna2018BooterLG}, even on
the \emph{clearnet}, points us to a plausible answer.
Somebody is making money selling abusive services that are used for putting
competitors out of business or for plain extortion. According to Akamai,
extortion attacks have seen a widespread return, with a new wave launching in
mid-August 2020 \cite{akamai2021ddos}.
Akamai went on to note that DDoS attackers are expanding their reach across
geographies and industries, with the number of targeted entities being 57\%
higher than the year before.
\n{1}{Attack methods}
There are generally several different ways to categorise a method of
attack:
\begin{description}
\item[By the layer at which the attack is performed:]\
\begin{itemize}
\item link layer
\item internet layer
\item transport layer
\item application layer
\end{itemize}
\end{description}
\begin{description}
\item[By the nature of their distribution:]\
\begin{description}
\item[distributed] the effort is collectively advanced by a group of
devices
\begin{enumerate}
\item deliberate
\begin{enumerate}
\item remotely coordinated devices (IRC C\&C) - so-called \it{voluntary botnets}
\item individuals each operating their own computer, performing a premeditated
operation in a synchronized manner
\end{enumerate}
\item involuntary - hijacked devices
\end{enumerate}
\item[not distributed] there is a single source of badness
\end{description}
\end{description}
\begin{description}
\item [By the kind of remoteness necessary to successfully execute the
attack:]\
\begin{description}
\item[close-proximity] (physical engagement, i.e. sabotage) requires physical
presence in/near e.g. a datacenter, networking equipment (cutting cables,
playing a pyro)
\item[local network access] such as over a WiFi access point or on LAN
\item[remote] such as over the internet
\end{description}
\end{description}
\begin{description}
\item[By specific features:]\
\begin{itemize}
\item IP fragmentation
\item SYN flood - a rapid sequence of TCP protocol SYN messages
\item volumetric DDoS attack
\item amplification attack (also called "reflection attack")
\begin{itemize}
\item memcached exploit (1:51200)
\item DNS (\textasciitilde1:50), with the formula \cite{akamaidnsampl} \[R = \frac{\text{answer size}}{\text{query size}}\]
\item SNMP
\item NTP
\end{itemize}
\item exploits
\begin{itemize}
\item 0days
\item simply running unpatched versions of software
\end{itemize}
\item physical network destruction/crippling
\end{itemize}
\end{description}
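The reflection ratio from the list above lends itself to a short calculation. The sketch below assumes hypothetical packet sizes chosen only to illustrate the cited \textasciitilde1:50 DNS ratio; real values vary per query type and server configuration.

```python
# Bandwidth amplification factor R = answer size / query size.
# The packet sizes below are illustrative assumptions, not measurements.

def amplification_factor(query_bytes: int, answer_bytes: int) -> float:
    """Return the reflection ratio R for one request/response pair."""
    return answer_bytes / query_bytes

# A ~60-byte DNS query answered with a ~3000-byte response:
r = amplification_factor(60, 3000)
print(round(r))  # 50 -- roughly the ~1:50 DNS ratio cited above
```

The same function applied to the memcached figures discussed later (a single-byte payload answered with tens of kilobytes) yields ratios in the tens of thousands.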
\n{2}{IP fragmentation}
This is the type of attack whereby an attacker sends a fragmented payload (TCP,
UDP or even ICMP) that the target is supposed to reassemble at the destination,
in doing which its system resources (CPU and mainly memory) quickly get
depleted, ultimately crashing the host.
It is usually necessary for IP datagrams (packets) to be fragmented in order to
be transmitted over the network: if a packet being sent is larger than the
maximum transmission unit (MTU) of a link on the path to the receiving side
(e.g. a server), it has to be fragmented to be transmitted completely.
ICMP and UDP fragmentation attacks usually involve packets larger than the MTU
- a simple attempt to overwhelm a receiver that is unable to reassemble such
packets, ideally (for the attacker) accompanied by a buffer overflow that can
be exploited further. Fragmenting TCP segments, on the other hand, targets the
TCP reassembly mechanism. Reasonably recent Linux kernels implement protection
against this \cite{linuxretransmission}.
In either case, this is a network layer attack, since it targets the way the Internet Protocol requires data to be transmitted and processed.
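As a back-of-the-envelope illustration of the reassembly burden, the number of fragments a datagram splits into can be sketched as follows. The 20-byte IPv4 header (no options) and the sizes used are illustrative assumptions.

```python
import math

# Rough sketch of IPv4 fragmentation: fragment payloads must be
# multiples of 8 bytes except for the last fragment; a 20-byte header
# (no IP options) is assumed for simplicity.

def fragment_count(payload: int, mtu: int, ip_header: int = 20) -> int:
    per_fragment = (mtu - ip_header) // 8 * 8  # usable payload, 8-byte aligned
    return math.ceil(payload / per_fragment)

# A 65,500-byte payload sent over a path with a 1500-byte MTU:
print(fragment_count(65500, 1500))  # 45
```

Each of those fragments has to be held in memory until the whole datagram can be reassembled, which is exactly the resource the attacker is after.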
\n{2}{SYN flood}\label{synfloodattack}
To establish a TCP connection, a \emph{three-way handshake} must be
performed.\\
That is the opening sequence of a TCP connection that any two machines -
let us call them TCP A and TCP B - perform, whereby TCP A, wanting to talk,
sends a \emph{segment} with the SYN control flag set, TCP B (assuming it is
also willing to communicate) responds with a segment with the SYN and ACK
control flags set and, finally, TCP A answers with a final ACK \cite{rfc793tcp}.
Using \texttt{tcpdump} we can capture an outgoing SYN packet on interface
\texttt{enp0s31f6}.
\begin{verbatim}
# tcpdump -Q out -n -N -c 1 -v -i enp0s31f6 "tcp[tcpflags] == tcp-syn"
\end{verbatim}
A malicious actor is able to misuse the handshake mechanism by posing as a
legitimate \emph{client} (or rather many legitimate clients) and sending a
large number of SYN segments to a \emph{server} willing to establish a
connection (in the \it{LISTEN} state). The server replies with a [SYN, ACK],
which is a combined acknowledgement of the client's request \it{and} a
synchronization request of its own. The client responds with an ACK and the
connection then reaches the \it{ESTABLISHED} state.
There is a state in which a handshake is in progress but the connection has not
yet been ESTABLISHED; such connections are referred to as embryonic
(half-formed) sessions. That is precisely what happens when an attacker sends
many SYNs but stops there, leaving the connections hanging.
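The exhaustion of the server's capacity for embryonic sessions can be modelled with a toy sketch. Real kernels use a SYN backlog with timeouts and defences such as SYN cookies; the fixed capacity, class name and addresses here are illustrative assumptions only.

```python
# Toy model of a listener's half-open (embryonic) connection queue.

class SynBacklog:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.half_open = set()  # sent SYN, got [SYN, ACK], no final ACK yet

    def on_syn(self, client) -> bool:
        """Return True if the embryonic session could be queued."""
        if len(self.half_open) >= self.capacity:
            return False        # backlog full: further SYNs are dropped
        self.half_open.add(client)
        return True

    def on_ack(self, client):
        self.half_open.discard(client)  # handshake completed -> ESTABLISHED

backlog = SynBacklog(capacity=128)
# an attacker sends SYNs from 128 spoofed sources and never ACKs:
for i in range(128):
    backlog.on_syn(("10.0.0.%d" % i, 40000 + i))
print(backlog.on_syn(("192.0.2.7", 51234)))  # False: a real client is refused
```

The model makes the core problem visible: since the attacker never completes the handshake, legitimate clients compete for slots that are never freed until a timeout fires.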
One particularly sly method, aimed at causing as much network congestion near
or at the victim as possible, is setting the source IP address to a private
address (these are unroutable, or rather \it{should not be routed} over the
public internet) or to an address from deallocated space. For the sake of the
argument, suppose it is an address from deallocated space: the server responds
with a [SYN, ACK] and, since no response comes from that address (no response
\it{can} come, because nobody is using it), TCP simply assumes the packets were
lost on the way and attempts \it{retransmission} \cite{rfc6298}.
Obviously, this cannot yield a successful result, so in the end the server has
only added to the already congested network.
The current recommended practice as per RFC 3704 is to enable strict mode where
possible to prevent IP spoofing in DDoS attacks; if asymmetric or otherwise
complicated routing is used, loose mode is recommended
\cite{rfc3704multihomed}.
That way the spoofed traffic never leaves the source network (the
responsibility of the transit provider/ISP) and does not aggregate on a single
host's interface. For this to become a reality, the adoption rate of these
recommendations would need to increase substantially.
As is true for anything, if countermeasures are set up improperly, legitimate traffic could end up being blocked as a result.
\n{2}{Amplified Reflection Attack}
As the name suggests, this type of attack is based on two concepts:
amplification and reflection. The amplification part pertains to the fact that
certain protocols answer even a relatively small query with a sizable response.
The reflection part usually takes advantage of session-less protocols.
One such protocol is UDP, session-less meaning that hosts are not required to
first establish a \it{session} to communicate: a response is simply sent back
to the address the packet arrived from (the source address).
If a malicious player is not interested in communication but only wants to
cause harm, however, the packet's source address does not have to - and, from
the attacker's point of view, \it{must not} - correspond to the address of
their own machine.
Overwriting fields of the packet header (where the information important to
routing resides) is trivial, and there is nothing easier than supplying a UDP
request with a bogus or, more commonly, the victim's IP address as the source
address instead of the one that would be there by default.
The response is then returned \it{back} - not to the actual sender, but simply
according to the source address.\\
Since UDP has no concept of a \it{connection} nor any verification mechanism,
the response arrives at the door of a victim that never asked for it - in the
worst case, an unsolicited pile of them.
This is one reason the three-way handshake is used in TCP, as it reduces the
possibility of such forged connections.
The goal of the attacker is then clear: elicit the largest possible response
and have it delivered to the victim (in the server's good faith, even).
Spoofing the source address also serves to evade detection, as a blocking or
rate-limiting mechanism at the destination would likely identify an
above-threshold number of requests coming from a single IP and ban it, thus
decreasing the impact of the attack when the intent was to achieve congestion
at the designated destination - the victim.
A perfect example of how bad this can get is unpatched or misconfigured
\texttt{memcached} software, which is very commonly used as e.g. a database
caching system and has an option to listen on a UDP port.
Cloudflare say they have witnessed amplification factors of up to 51200
\cite{cfmemcached}.
As has already been mentioned in ~\ref{synfloodattack}, this entire suite of
issues could be, if not entirely prevented, then largely mitigated if the very
sound recommendations of RFC 3704 gained greater adoption among ISPs.
Until then, brace yourselves for the next assault.
\n{2}{Slowloris}\label{slowloris}
The principle of this attack is to first open as many connections as possible,
aiming to fill the capacity of the server, and then keep them open for as long
as possible by sending periodic keep-alive packets.\\
This attack works at the application layer, but the principle can easily be
reapplied elsewhere.
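What makes a slowloris request "slow" can be shown without any networking at all: the request headers are simply never terminated, so the server keeps the connection (and a worker or file descriptor) waiting. The sketch below only builds the byte strings involved; the host name is a placeholder and the keep-alive header name is the conventional one used by public slowloris implementations.

```python
# Conceptual sketch of slowloris request construction (no sockets).

def partial_request(host: str) -> bytes:
    # Note: no final "\r\n\r\n" -- the request is deliberately unfinished,
    # so the server keeps waiting for the rest of the headers.
    return (f"GET / HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"User-Agent: probe\r\n").encode()

def keep_alive_header() -> bytes:
    # Sent periodically on each open connection to reset the server's
    # header-read timeout, keeping the connection alive indefinitely.
    return b"X-a: 1\r\n"

req = partial_request("test.invalid")
print(req.endswith(b"\r\n\r\n"))  # False: the request never completes
```

An actual attack repeats this over hundreds of sockets; the rate-limiting and timeout countermeasures discussed later target exactly this pattern.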
\n{2}{BGP hijacking}
BGP is an inter-Autonomous System routing protocol whose primary function is
to exchange network reachability information with other BGP systems.
Furthermore, this network reachability information "includes information on the
list of Autonomous Systems (ASes) that reachability information traverses.
This information is sufficient for constructing a graph of AS connectivity for
this reachability, from which routing loops may be pruned and, at the AS level,
some policy decisions may be enforced" \cite{rfc4271bgp4}.
BGP hijacking, in some places spoken of as prefix hijacking, route hijacking or
IP hijacking, is the result of intentional or unintentional misbehavior in
which a malicious or misconfigured BGP router originates a route to an IP
prefix it does not own; Zhang et al. find it is becoming an increasingly
serious security problem in the Internet \cite{Zhang2007PracticalDA}.
\n{2}{Low-rate DoS on BGP}
As shown by Zhang et al. in their "Low-Rate TCP-Targeted DoS Attack Disrupts
Internet Routing" paper, BGP itself is prone to a variation of slowloris due to
the fact that it runs over TCP for reliability. Importantly, this is a
low-bandwidth attack and a more difficult one to detect because of that.
Beyond the attack's ability to further slow down the already slow BGP
convergence process during route changes, it can cause a BGP session reset. For
the BGP session to be reset, the induced congestion by attack traffic needs to
last sufficiently long to cause the BGP Hold Timer to expire \cite{Zhang2007LowRateTD}.
On top of all that, this attack is especially insidious in that it can be
launched remotely from end hosts, without access to routers or the ability to
send traffic directly to them.
\n{1}{Attack tools}
Believe it or not, there actually exists a DDoS attack tools topic on GitHub:
\url{https://github.com/topics/ddos-attack-tools?o=desc\&s=stars}.
\n{2}{HOIC}
The High Orbit Ion Cannon, affectionately referred to as 'HOIC', is LOIC's
HTTP-flooding successor, a \emph{free software}\footnotemark{} tool which
enables one to stress-test the robustness of their infrastructure by applying
enormous pressure on the designated target in the form of a high number of
requests. It operates over HTTP, and users are able to send 'GET' or 'POST'
requests to as many as 256 sites simultaneously.
While it is relatively easily defeated by a WAF (see \ref{waf}), the ability to
target many sites at once makes it possible for users to coordinate an attack,
consequently making detection and mitigation efforts more difficult.
\footnotetext{free as both in freedom and free beer}
\n{2}{slowloris.py}
\texttt{slowloris.py} is a Python script available
from~\url{github.com/gkbrk/slowloris} that is able to perform a slowloris
attack. It seeks to exhaust the file descriptors needed for opening new
connections on the server and then keep the connections open for as long as it
can.\\
Legitimate requests cannot be served as a result, since there is no way for the
server to facilitate them until the resources bound by bogus requests are
freed, i.e. until the attack ceases.
\n{2}{iperf3}
A massive-load-producing tool that sends a flood of packets of a protocol of choice towards the target.
\n{2}{ddosim}
A DDoS simulator with the following methods of flooding:
\begin{itemize}
\item TCP
\item UDP
\item HTTP
\end{itemize}
\n{2}{Metasploit Framework}
Metasploit is a penetration testing framework with an open source community
version and a commercial version available. It enables security researchers to
automate workflows of probing vulnerable services or devices via the use of
so-called modules - smaller programs with definable inputs that perform
predefined actions. Modules are often community-contributed and one can even
write a module oneself. SYN-flooding functionality has been implemented in
\texttt{aux/synflood}, an auxiliary module. Auxiliary modules do not execute
payloads; they perform arbitrary actions that may not be related to
exploitation, such as scanning, fuzzing and denial of service attacks
\cite{metasploit}.
\n{2}{Web browser}
Depending on our point of view (or, more fittingly, our scaling capabilities),
sometimes all that is needed to cause a denial of service is a crowd of people
behind web browsers.\\
Numerous requests quickly overload a small server, eventually causing it to
respond so slowly that the impact is indistinguishable from a DoS attack.
That is because in principle this \it{is} practically the same thing as the
attacks discussed above; the only difference is the malicious intent,
imperceptible to a machine.
\n{1}{Mitigation methods}
Drastic times require drastic measures, and since the DDoS attacks coming at us
practically every other month classify as \it{drastic} quite easily, we are
forced to act accordingly \cite{akamai2021ddos}.
Still, it is more reasonable to prepare than to improvise, therefore the
following write-up mentions commonly used mitigation methods at different
levels, from a hobbyist server to an e-commerce service to an ISP. The list is
not exhaustive and, if reading this at a later date, one should of course
always cross-check with the current best practices.
\n{2}{Blackhole routing (black-holing, null routing)}
Black-holing is a technique that instructs routers that traffic for a specific
prefix is to be routed to the null interface, i.e. dropped; it is used to cut
attack traffic before it reaches the destination AS.\\
Assuming the router is properly configured to direct RFC 1918-destined traffic
to a null interface, traffic destined for the attacked network gets dropped,
making the attacked network unreachable to the attacker and everyone else.
As a matter of fact, we thereby conclude the DoS for the attacker
ourselves \cite{rfc1918}\cite{rfc3882}.
In case of a DDoS, the traffic is likely to come from all over the world
\cite{akamai2020ddosretrospect}.
The idea here is to announce to our upstream (ingress provider) - which,
critically, must support RTBH (remotely-triggered black hole) signalling - that
we do not want any traffic for the victim IP anymore. They then propagate the
announcement further, and in no time we stop seeing malicious traffic coming to
the victim IP in our network.
In fact, we do not see \it{any} traffic coming to the victim anymore, because
we have just broadcast a message that we do not wish to receive it; for the
entire time we announce it, the victim host stays unreachable.
We should make sure to announce the smallest possible prefix to minimise the
collateral damage. Generally, a /21 or /22 prefix is assigned to an AS (the
average prefix length per AS being 22.7866 as of 11 May 2021
\cite{prefixavgsize}); announcing a black hole for such a large space would
likely cause more damage than the attack itself.
To reduce BGP overhead, prefixes are usually announced aggregated, with the
exception of "a~situation", such as when we wish to stop receiving traffic for
a single IP address. The smallest commonly accepted prefix size tends to be /24
(which is still a lot), with the average prefix length of updates being 23.11
\cite{prefixavgupdatedsize}; however, some upstream providers might even
support a /32 in case of emergency, effectively dropping traffic only for the
victim.
When an attack hits, all we have to do is:
\begin{enumerate}
\item deaggregate prefixes
\item withdraw hit prefixes.
\end{enumerate}
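The deaggregate-and-withdraw step can be sketched with the standard library's \texttt{ipaddress} module. The prefixes and the attacked host below are illustrative (documentation-range) values, and the actual announcement/withdrawal would of course happen in the BGP speaker, not in Python.

```python
import ipaddress

# The aggregate we normally announce, and the host under attack:
aggregate = ipaddress.ip_network("198.51.100.0/22")
victim = ipaddress.ip_address("198.51.100.77")

# 1. deaggregate the /22 into its four announceable /24 more-specifics
more_specifics = list(aggregate.subnets(new_prefix=24))

# 2. identify the /24 containing the victim -- the prefix to withdraw
hit = [net for net in more_specifics if victim in net]

# With upstream support for /32 blackholes, only the victim is dropped:
blackhole = ipaddress.ip_network(f"{victim}/32")
print(hit[0], blackhole)  # 198.51.100.0/24 198.51.100.77/32
```

Keeping the other three /24s announced is what limits the collateral damage discussed above.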
In case our upstream provider does not support RTBH and we cannot afford to
lose them (e.g. they are the only one around), we can still make use of Team
Cymru's BGP-based solution that distributes routes to participating networks
using only vetted information about current and ongoing unwanted traffic - the
\b{Unwanted Traffic Removal Service} (UTRS). It is a free community service,
currently only available to operators who have an existing ASN assigned and
publicly announce one or more netblocks with their own originating ASN into the
public Internet BGP routing tables.
If only there were a way to shut down the bad traffic but keep the good one
flowing\footnotemark!
\footnotetext{other than scrubbing}
Behold: this is precisely what \it{selective black-holing} is. Some upstream
providers define multiple blackhole communities, each followed by a predefined
action on the upstream, and one is able to announce to these communities as
needed.
Suppose we announced to a community that would in response announce the
blackhole to internet exchanges in, say, North America and Asia but allow
traffic coming from Europe - a perfect example of selective black-holing.
This causes two things to happen. First, our customer's IP is still reachable
from our local area (e.g. Europe), and since our fictitious customer mostly
serves European customers, that is fine. Second, outside of the predefined
radius (Europe in this exercise), any traffic destined for our network (of
which the victim IP is a part) is immediately dropped at the remote IXPs, long
before it ever comes anywhere near our geographical area, let alone our
network.
I believe this approach is superior to indiscriminate black-holing and, given
it is reasonably automated and quick to respond, in combination with other
mitigation methods it can provide viable protection for the network.
\n{2}{Sinkholing}
Moving on, this method works by diverting only malicious traffic away from its
target, usually using a predefined list of IP addresses known to be part of
malicious activities to identify DDoS traffic. Collateral damage is lesser than
with black-holing, but since botnet IPs can also be used by legitimate users,
the method remains prone to false positives. Additionally, sinkholing as such
is ineffective against IP spoofing, which is a common feature of network layer
attacks.
\n{2}{Scrubbing}
An improvement on arbitrary full-blown sinkholing: during the scrubbing
process, all ingress traffic is routed through a security service, which can be
operated in-house or even outsourced. Malicious network packets are identified
based on their header content, size, type, point of origin, etc., using
heuristics or just simple rules. The challenge is to perform scrubbing at
inline rate without impacting legitimate users.
If outsourced, the scrubbing service has the bandwidth capacity (either
on-demand or permanent) to take the hit that we do not have. There are at least
two ways to go about this - the BGP way and the DNS way; we will cover the BGP
one. Once an attack is identified, we stop announcing the prefix that is
currently being hit and contact our scrubbing provider (usually
automatically/programmatically) so that they start announcing the subject
prefix and receiving all its traffic (including the attack traffic); the
scrubbing service then does the cleaning and sends the clean traffic back to us
\cite{akamaiddosdefence}.
When performing the scrubbing in-house, we have to clean the traffic on our own
appliance, which has to have sufficient bandwidth (usually on par with upstream).
A poor man's scrubber:
\begin{itemize}
\item hardware-accelerated ACLs on switches,
\item switches doing simple filtering at \it{inline rate} (ASICs),
\item can be effective when the attack protocol is easily distinguishable from
real traffic, e.g. when hit by an NTP/DNS/SNMP/SSDP amplification attack.
\end{itemize}
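The "simple filtering" rule such a scrubber applies can be sketched as a source-port ACL. The port set and the packet representation below are illustrative assumptions, and (as noted elsewhere in this text) such a blunt rule also drops legitimate responses from those ports.

```python
# Drop UDP packets whose *source* port belongs to a protocol commonly
# abused for amplification. The port list is an illustrative subset:
# DNS, NTP, SNMP, SSDP, CLDAP, memcached.
AMPLIFICATION_SRC_PORTS = {53, 123, 161, 1900, 389, 11211}

def verdict(proto: str, src_port: int) -> str:
    if proto == "udp" and src_port in AMPLIFICATION_SRC_PORTS:
        return "drop"
    return "accept"

print(verdict("udp", 123))  # drop -- looks like NTP reflection traffic
print(verdict("tcp", 443))  # accept
```

In hardware this is exactly the kind of match-and-drop rule an ASIC can evaluate at inline rate.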
We should be performing network analysis, and once higher rates of packets with
source ports of protocols known to be misused for DoS/DDoS (such as 123 or 53)
start arriving at our network, we should start signalling to our upstream
providers, since they can probably handle it better than we can and have as
much interest in doing so as we do (or should).
One thing we should do, no matter whether we are currently suffering an attack
(and scrubbing it ourselves), is to rule out, drop and never receive traffic
appearing to come from \it{our own network}, since such traffic could not exist
naturally and is obviously spoofed.
Team Cymru has a long tradition of maintaining bogon lists, called the
\textbf{Bogon Reference}. Bogon prefixes are routes that should never appear in
the Internet routing table; a packet with an address from a bogon range should
not be routed over the public Internet. These ranges are commonly found as the
source addresses in DoS/DDoS attacks.\\
Bogons comprise netblocks that have not been allocated to a regional internet
registry (RIR) by the Internet Assigned Numbers Authority (IANA), as well as
Martian packets (private and reserved addresses defined by RFC 1918, RFC 5735
and RFC 6598 \cite{rfc1918}, \cite{rfc5735}, \cite{rfc6598}).
To get help with bogon ingress and egress filtering, we should set up automated
retrieval of the updated and curated bogon lists that Team Cymru provides via
HTTP, BGP, RIRs and DNS \cite{teamcymru}.
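A bogon check itself is a simple membership test, sketched below with a few of the RFC 1918 / RFC 5735 / RFC 6598 ranges hard-coded. A production filter should of course pull the full, current list from the Bogon Reference rather than hard-code a toy subset like this one.

```python
import ipaddress

# Illustrative subset of bogon ranges (NOT a complete list):
BOGONS = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918
    "0.0.0.0/8", "127.0.0.0/8",                       # RFC 5735
    "100.64.0.0/10",                                  # RFC 6598 (CGN)
)]

def is_bogon(src: str) -> bool:
    """True if the source address falls inside a known bogon range."""
    addr = ipaddress.ip_address(src)
    return any(addr in net for net in BOGONS)

print(is_bogon("192.168.1.5"))   # True: drop at the network edge
print(is_bogon("198.51.100.7"))  # False (not in this toy subset)
```

On real equipment the same logic is expressed as ingress ACLs or uRPF-style filters fed by the automatically fetched list.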
In case we have our own ASN, are connected directly at an IXP, have no RTBH
support upstream and basically have no other choice, we just need to find out
who is sending the malicious traffic, drop that session and keep receiving
traffic from the other peers.
Note also that floods of small (e.g. 64-byte) packets yield lower throughput
but high CPU utilization, as the per-packet processing cost dominates.
\n{2}{IP masking}\label{ipmasking}
This technique is widely used (e.g. by Cloudflare's flagship service), relying
solely on not divulging sensitive information - in this case the server IP - to
attackers, and on the capacity of the \it{fronting} service to withstand the
attack thanks to having access to more bandwidth than the attacker can produce.
All traffic - including potentially harmful traffic - flows through what is
basically a giant proxy. However, before declaring it a net win for us, it is
important to acknowledge that it also comes with heavy privacy implications, as
some other service now performs TLS termination on our behalf and \textbf{sees
everything} (that was encrypted only \emph{in transit} and is not additionally
encrypted) that \emph{anyone} sends us, before finally forwarding it to us.
\n{2}{WAF}\label{waf}
A WAF - \it{Web Application Firewall} - is an appliance used to protect (as the
name suggests) web applications. In this day and age this is especially
necessary, and it enables system administrators to craft protection logic in
one place and shield potentially vulnerable applications. This method works at
the application layer of the OSI model and is commonly deployed as part of a
web proxy or as a module of one, which means network layer attacks cannot be
handled this way. While its protection is not negligible, it is, as always,
crucial not to make assumptions and to know exactly what \it{layer} of
protection using a WAF brings.
Generally, or at least as per current best practices, applications are not
deployed with ports exposed directly to the Internet. The sane approach of
having access to resources \it{proxied} yields multiple possibilities in terms
of authentication/authorization and protection scenarios, and also several ways
to use the available resources more effectively. For one, where any web content
\it{caching} is required, it is easily achieved with a \it{caching} proxy
server, which commonly also enables specifying custom access policies.
There are also hosted (cloud) WAF offerings, however, they come with exactly
the same privacy implications as IP masking solutions (see \ref{ipmasking}).
\n{2}{Rate-limiting}
As a general precaution, it is sane to limit the number of connections a client
is able to make in a predefined amount of time (based on the requirements of
the service). The same applies to a limit on how many connections a client can
have open simultaneously, which can even prevent Slowloris (see
\ref{slowloris}).
Rate-limiting is usually set either on a proxy or a WAF, but some form of
rate-limiting can even be built into an application.
A well-known pluggable rate-limiting solution that can be used with SSHd, HTTP
servers or a multitude of other endpoints is \texttt{Fail2Ban}.
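The core idea behind most such limiters is a token bucket: each client earns tokens at a steady rate up to a burst ceiling, and each request spends one. The sketch below is a minimal illustration of the principle, not any particular proxy's implementation; the rate and burst values are arbitrary.

```python
import time

# Minimal per-client token-bucket rate limiter (illustrative sketch).
class TokenBucket:
    def __init__(self, rate: float, burst: int):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = float(burst), time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at the burst size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # over the limit: reject or delay the request

bucket = TokenBucket(rate=5, burst=10)  # 5 requests/s, bursts of up to 10
results = [bucket.allow() for _ in range(12)]
print(results.count(True))  # 10 -- the burst is exhausted, the rest are refused
```

In a real deployment there is one bucket per client key (IP address, API token, etc.), and the refusal typically maps to an HTTP 429 response.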
\n{2}{Decreased-TIME\_WAIT connection closing}
This can help withstand a situation where the conntrack table fills up and the
server refuses to accept any new connections. There is absolutely no reason to
keep connections in the conntrack table long after they become inactive. The
Linux kernel's Netfilter actually has a scrubbing mechanism that is supposed to
rid the conntrack table of timed-out entries, except practice shows they can
linger for much longer than necessary.
When dealing with massive amounts of traffic, it is very reasonable not only to
increase the size of the conntrack table (a memory trade-off), which is the
generally recommended solution, but also to decrease the TIME\_WAIT timeout to
force-evict connections that have stopped sending data.
It is also an easy way to mitigate slowloris (see \ref{slowloris}).
More on the workings of conntrack in \ref{netfilter}.
Nginx is a widely used proxy that uses two FDs (file descriptors) for each
connection. The limit on maximum open FDs can indeed be increased easily;
however, we might still just be delaying the inevitable (FD exhaustion) and
inefficiently wasting precious compute resources needed when an attack comes.
If Nginx is unable to allocate the FDs necessary to track a connection, the
connection attempt fails. By resetting connections that have timed out, we
prevent such a situation from occurring easily. In Nginx this is set with a
single line: \texttt{reset\_timedout\_connection on;}
\n{1}{Mitigation tools}
No tools are going to remedy a bad design decision, and that applies equally to
physical and to internet engineering.
\n{2}{Firewall}
No matter the specific implementation, it is presumably safe to say that
any firewall is better than no firewall.
There are two main types of firewalls:
\begin{itemize}
\item software,
\item appliance (hardware-accelerated).
\end{itemize}
A software firewall is just another program running on the operating system,
apart from the fact that it typically runs with system-level privileges, and it
can be run on a general-purpose computer. In fact, most consumer-grade
operating systems nowadays incorporate and by default enable a firewall
solution.
In contrast, an appliance firewall is a dedicated piece of hardware
purpose-built specifically for the sole role of acting as a firewall. It
typically runs a custom and very minimal operating system with no userspace
programs, since it is vendored to run as an appliance.
\n{3}{Software firewall}
Solutions available as software firewalls are typically specific to a given
operating system.
Usually, there exist several tools that enable communication with the core
implementing the logic, commonly by way of embedding deeply in the networking
stack of the OS or utilizing kernel APIs. In Linux distributions, the Linux
kernel is the one that sees it all: each packet arriving at the network
interface is inspected by the kernel, and a decision is made regarding it.
Historically, \texttt{ipchains} and later \texttt{iptables} used to be the
de facto standard; however, a more recent successor emerged quite some time ago
and is replacing the former two (in modern distributions it has replaced them,
although backward compatibility has been preserved) - the \texttt{nftables}
tool.
\n{3}{Netfilter}\label{netfilter}
The Linux kernel subsystem named \texttt{Netfilter} is part of the internet
protocol stack of the kernel and is responsible for packet manipulation and
filtering \cite{Boye2012NetfilterCT}. The frontend tools of its packet
filtering and classification rules framework, \texttt{iptables} as well as the
newer \texttt{nftables}, can be interacted with via a shell utility, and since
they also expose APIs of their own, it is common for them to have graphical
frontends as an additional convenience, most notably \texttt{firewalld}, which
can be used in conjunction with both of them.
Although newer versions of the Linux kernel support both \texttt{iptables} and
\texttt{nftables} just the same, only one of them should be used at a time.
Which one is active can be changed arbitrarily at runtime (a reboot is not
necessary), since both are userspace tools that interact with the kernel using
\it{loadable kernel modules}.
The part of the \texttt{Netfilter} framework responsible for connection
tracking is fittingly named Conntrack. A connection, or a \it{flow}, is
defined by a unique combination of source address, destination address, source
port, destination port, and the transport protocol used [refneeded flow].
Conntrack keeps track of the flows in a special fixed-size
(tunable\footnotemark) in-kernel hash table structure with an upper limit on
the number of entries.
\footnotetext{via \texttt{net.netfilter.nf\_conntrack\_buckets}}
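On a running system, the current occupancy and limits of the conntrack table
can be inspected and tuned via \texttt{sysctl}; a minimal sketch follows
(exact defaults and example values vary between kernel versions and memory
sizes):

\begin{verbatim}
# current number of tracked flows vs. the upper limit
sysctl net.netfilter.nf_conntrack_count
sysctl net.netfilter.nf_conntrack_max
# size of the hash table itself
sysctl net.netfilter.nf_conntrack_buckets
# raise the limits on a busy router (illustrative values)
sysctl -w net.netfilter.nf_conntrack_max=262144
sysctl -w net.netfilter.nf_conntrack_buckets=65536
\end{verbatim}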
On Linux devices functioning as routers, a common issue is the depletion
of space in the conntrack table. Once the maximum number of connections is
reached, Linux simply logs an error message "\texttt{nf\_conntrack: table full,
dropping packet}" to the kernel log and "all further new connection requests
are dropped until the table is below the maximum limit again."
\cite{Westphal2017CT}. That, Westphal further notes, is indeed very
unfortunate, especially in DoS scenarios.
Unless the router also functions as a NAT, this can be remedied in two ways:
decreasing the timeout after which an established connection is evicted, and
decreasing the timeout for which an inactive connection lingers in the
TIME\_WAIT state before it is evicted from the conntrack table. By default,
the established-connection timeout is several days long, which leaves the
router vulnerable to state-exhaustion floods.
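Assuming the usual \texttt{nf\_conntrack} sysctls (default values differ
across kernel versions, and the exact numbers below are only illustrative),
shortening these timeouts could look as follows:

\begin{verbatim}
# established TCP flows are kept for 5 days (432000 s) by default
sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=7200
# evict TIME_WAIT and idle UDP entries sooner
sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
sysctl -w net.netfilter.nf_conntrack_udp_timeout=10
\end{verbatim}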
Netfilter comes to the rescue again: conntrack treats entries that have not
(yet) seen two-way communication specially – they can be evicted early if
the connection tracking table is full. In case insertion of a new entry fails
because the table is full, "...the kernel searches the next
8 adjacent buckets of the hash slot where the new connection
was supposed to be inserted at for an entry that has not seen a
reply. If one is found, it is discarded and the new connection
entry is allocated." \cite{Westphal2017CT}. Randomised source addresses in TCP
SYN floods thus become a non-issue: most entries can be evicted early, since
the TCP connection tracker sets the "assured" flag only once the three-way
handshake has completed.
In the case of UDP, the assured flag is set once a packet arrives
after the connection has already seen at least one packet in
the reply direction; plain request/response traffic therefore never gets the
assured bit set and can be evicted early at any time.
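The assured flag can be observed directly with the \texttt{conntrack}
userspace utility (shipped in the \texttt{conntrack-tools} package):

\begin{verbatim}
# list tracked flows; fully established ones carry the [ASSURED] tag
conntrack -L | grep ASSURED | head
# show only the current entry count
conntrack -C
\end{verbatim}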
\n{2}{FastNetMon DDoS Mitigation toolkit}
Originally created by Pavel Odintsov, this program can serve as a helper on
top of analysis and metric collection tools, evaluating the collected data and
triggering configurable mitigation reactions \cite{fastnetmonorig},
\cite{fastnetmonfork}, \cite{fastnetmonng}.
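For illustration, a fragment of a \texttt{fastnetmon} configuration enabling
threshold-based detection and automatic bans might look roughly like the
following (option names are taken from the community edition's sample
\texttt{/etc/fastnetmon.conf} and should be verified against the installed
version):

\begin{verbatim}
# per-host detection thresholds
ban_for_pps = on
threshold_pps = 20000
ban_for_bandwidth = on
threshold_mbps = 1000
# act on detected attacks
enable_ban = on
ban_time = 1900
\end{verbatim}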
% \begin{Shaded}
% \begin{Highlighting}[]
% \CommentTok{// Add host to blackhole}
% \DataTypeTok{bool}\NormalTok{ add\_to\_blackhole(TemplateKeyType client\_id, }\DataTypeTok{attack\_details\_t}\NormalTok{ current\_attack) \{}
% \BuiltInTok{std::}\NormalTok{lock\_guard\textless{}}\BuiltInTok{std::}\NormalTok{mutex\textgreater{} lock\_guard(structure\_mutex);}
%
% \NormalTok{ ban\_list\_storage[client\_id] = current\_attack;}
% \ControlFlowTok{return} \KeywordTok{true}\NormalTok{;}
% \NormalTok{ \}}
% \end{Highlighting}
% \end{Shaded}
% ============================================================================ %
\part{Practical part}
\n{1}{Infrastructure description}
% TODO: broader infrastructure description here.
The testing was performed in a virtual lab comprising five virtual machines
(VMs) running on a KVM-enabled Fedora 34 host. Since the expectation was to
frequently tweak various system settings of the guests (VMs) as part of the
verification process, we decided to take the \emph{infrastructure as code}
approach. Every piece of infrastructure - down to details such as how many
virtual CPUs are allocated to a host, the disk size and the filesystem -
is declared as code, can be versioned and used to provision resources.
The industry-standard tool \texttt{Terraform} was chosen due to its broad
support of infrastructure and providers, great documentation, large user base
and the tool being open source.
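As an illustration, declaring one of the guests with the community
\texttt{dmacvicar/libvirt} Terraform provider could look roughly like the
sketch below; the image path, pool and network names are placeholders specific
to our lab:

\begin{verbatim}
resource "libvirt_volume" "victim_disk" {
  name   = "h_victim.qcow2"
  pool   = "default"
  source = "images/CentOS-8-GenericCloud.qcow2"
  format = "qcow2"
}

resource "libvirt_domain" "victim" {
  name   = "h_victim"
  vcpu   = 1
  memory = 768
  disk { volume_id = libvirt_volume.victim_disk.id }
  network_interface { network_name = "inner" }
}
\end{verbatim}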
For bootstrapping, \texttt{cloud-init} has been used, mainly because it
integrates with Terraform quite smoothly, works on many Linux distributions
and allows us to pre-configure things such as copying over SSH public keys (so
that a secure connection can be established right after the first boot),
setting the VM hostname, locale and timezone, adding users and groups,
installing packages, running commands and even creating arbitrary files, such
as program configurations.
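A minimal \texttt{cloud-init} user-data sketch covering the items above
(the user name, key and package are placeholders, not the exact values used in
the lab):

\begin{verbatim}
#cloud-config
hostname: h-victim
timezone: Europe/Prague
users:
  - name: admin
    groups: wheel
    ssh_authorized_keys:
      - ssh-ed25519 AAAA...placeholder
packages:
  - nginx
runcmd:
  - systemctl enable --now nginx
\end{verbatim}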
The disk sizes of the VMs were determined by the size of their base image.
The VM naming convention is specified as follows: a prefix \texttt{r\_} for
routers and \texttt{h\_} for other hosts, in our case the attacker, victim and
defender machines.
\n{2}{VM specifications}
\tab{VM specifications}{tab:vmspecifications}{0.75}{ |c||rrrrc| }{
\hline
\bf{VM name} & \bf{vCPU(s)} & \bf{RAM} & \bf{disk space} & \bf{net ifaces} &
\bf{operating system} \\
\hline\hline
r\_upstream & 1 & 768MB & 4.3GB & {outer,DMZ} & Fedora 33 \\
\hline
r\_edge& 1 & 768MB & 4.3GB & {DMZ,inner} & Fedora 33 \\
\hline
h\_victim & 1 & 768MB & 11GB & {inner} & CentOS 8 \\
\hline
h\_attacker & 1 & 1GB & 5.37GB & {outer} & Fedora 34 \\
\hline
h\_defender & 1 & 1GB & 5.37GB & {DMZ} & Fedora 34 \\
\hline
}
The edge router (ours) and the upstream router (our transit provider's) are
each part of a different \it{AS} (autonomous system). They are directly
connected and communicate using BGP, i.e.\ the two routers are BGP peers.
We assume our upstream provider supports RTBH signalling.
In this scenario the attacker is directly connected to the upstream router,
and while in reality there would probably be a greater distance between us and
them, this is fine for our simulation purposes, since malicious traffic will be
cut off before it reaches us.
If our upstream provider did not support RTBH signalling, we could still
resort to a scrubbing service in case of an attack, but it is preferable to
pick a provider that has RTBH capabilities.
\begin{itemize}
\item FENIX in the Czech Republic - trusted peers
\item Linux router SPAN/mirror port configuration for packet capture by
fastnetmon (the only mode that makes nDPI integration possible)
\item Linux NetFlow/sFlow export configuration for fastnetmon - slow
\item all of the above set up with Ansible roles
\item BIRD for BGP, static routes for a poor man's router
\end{itemize}
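A minimal sketch of the edge router's BIRD 2 configuration announcing a victim
/32 tagged with the RFC 7999 BLACKHOLE community (65535, 666) to the upstream
peer; all ASNs and addresses are placeholders, not the lab's actual values:

\begin{verbatim}
protocol static blackholes {
    ipv4;
    route 198.51.100.23/32 blackhole;
}

protocol bgp upstream {
    local as 64512;
    neighbor 192.0.2.1 as 64511;
    ipv4 {
        export filter {
            if proto = "blackholes" then {
                bgp_community.add((65535, 666));
                accept;
            }
            accept;
        };
    };
}
\end{verbatim}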
\n{1}{Infrastructure set-up}
\nns{approach 0}
\begin{itemize}
\item KVM
\item \texttt{terraform} with \texttt{libvirt} provider
\item \texttt{cloud-init}
\item \texttt{ansible}
\end{itemize}
\nns{approach 1}
\begin{itemize}
\item KVM
\item \texttt{terraform}
\item \texttt{ignition}
\item Fedora CoreOS
\end{itemize}
VMs required:
\begin{itemize}
\item victim
\item router - inner
\item router - edge
\item attacker
\item defence machine
\end{itemize}
See tab.~\ref{tab:vmspecifications} for details.
Simulating multiple physical devices performing different roles
(routing, attacking, playing victim) in our attack-defence/mitigation
scenario has been achieved by creating a test-lab virtual
infrastructure.\\
The tried-and-true, state-of-the-art, Linux kernel-native virtualization
solution - the KVM technology - has been chosen to tackle the task of running
the VMs for us.
Testing has been performed on my personal laptop - a Dell Latitude 5480
machine equipped with a ULV dual-core Intel Core i5-6300U processor (running
with \texttt{mitigations=off}), 24GB (8+16) of RAM and a 512GB SATA (TLC) SSD.
The host operating system was \texttt{Fedora\ 34}. Both \texttt{updates} and
\texttt{updates-testing} repositories have been enabled, which allowed us to
use the latest stable Linux kernel Fedora had to offer without too much of a
hassle - as of the time of writing, version \texttt{5.11.20}.
The file system in use on the host was Btrfs on top of LVM (LUKS+LVM, to be
precise), and a Btrfs subvolume has been created specifically for the
libvirt storage pool. Since all of the system images for our VMs have been
downloaded in the QCOW2 format, the Copy-on-Write (CoW) feature of Btrfs has
been turned off for the subvolume in question, as recommended in the Arch wiki
[refneeded archwiki btrfs cow] for improved storage performance (and decreased
flash wear).
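Disabling CoW boils down to setting the \texttt{No\_COW} attribute on the
subvolume's directory; note that it only takes effect for files created
afterwards (the pool path below is an assumption, the default libvirt
location):

\begin{verbatim}
# applies to newly created files only
chattr +C /var/lib/libvirt/images
# verify: the directory should now show the 'C' attribute
lsattr -d /var/lib/libvirt/images
\end{verbatim}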
Notably, the system has also been using the \texttt{nftables} backend of
\texttt{firewalld}, for which, luckily, \texttt{libvirt} was already
prepared.
\n{1}{Mitigation tools set-up}
An open-source DDoS mitigation toolkit named \texttt{fastnetmon} was
picked to serve as the attack detection tool. It supports analysing traffic
from multiple different exporter types, including NetFlow (v5 and v9), sFlow
and port mirrors.
BGP black-holing upsides:
\begin{itemize}
\item a quick and surgically precise measure
\end{itemize}
BGP black-holing weak spots:
\begin{itemize}
\item requires our BGP peers to be available and willing to cooperate and to
accept advertisements of smaller subnets
\end{itemize}
\n{1}{Attack tools set-up}
When considering how to simulate an attack locally, we were not primarily
looking for a tool that would enable a decentralised (the first "D" of DDoS)
attack; instead, the objective was mainly to congest the weakest link, which
happens to live inside our network (which is why we are concerned in the
first place).
\n{2}{iperf3}
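For pure volumetric congestion, an \texttt{iperf3} run flooding the victim
with UDP at an unrestricted rate could look like the sketch below (the address
is a placeholder, not the lab's actual victim IP):

\begin{verbatim}
# on the victim
iperf3 -s
# on the attacker: UDP, unrestricted bandwidth (-b 0),
# 4 parallel streams, for 60 seconds
iperf3 -c 10.0.30.10 -u -b 0 -P 4 -t 60
\end{verbatim}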
\n{2}{slowlorispy}
\n{2}{Metasploit \texttt{aux/synflood} module}
\n{1}{Performing an attack}
% TODO: describe running the attack tools.
\n{1}{Monitoring}
\begin{itemize}
\item fastnetmon metrics
\item packet capture on the router interfaces
\item Netflow signalling
\end{itemize}
% ============================================================================ %
\nn{Conclusion}
% ============================================================================ %