tex: add stuff on browsers
also mention Gemini (https://gemini.circumlunar.space/)
parent
dd8eb1d3c5
commit
f7db0cb375
@ -13,11 +13,15 @@ PID & Process ID \\
|
|||||||
Cgroup & Control group \\
|
Cgroup & Control group \\
|
||||||
|
|
||||||
TLS & Transport Layer Security \\
|
TLS & Transport Layer Security \\
|
||||||
|
TCP & Transmission Control Protocol \\
|
||||||
SSH & Secure Shell \\
|
SSH & Secure Shell \\
|
||||||
|
DNS & Domain Name System \\
|
||||||
|
ISP & Internet Service Provider \\
|
||||||
GPG & GNU Privacy Guard \\
|
GPG & GNU Privacy Guard \\
|
||||||
GNU & GNU's Not Unix! \\
|
GNU & GNU's Not Unix! \\
|
||||||
CSS & Cascading Style Sheets \\
|
CSS & Cascading Style Sheets \\
|
||||||
API & Application Programming Interface \\
|
API & Application Programming Interface \\
|
||||||
|
CLI & Command Line Interface \\
|
||||||
SCM & Source Code Management \\
|
SCM & Source Code Management \\
|
||||||
HIBP & Have I Been Pwned \\
|
HIBP & Have I Been Pwned \\
|
||||||
|
|
||||||
|
@ -162,7 +162,7 @@
|
|||||||
title = {A simple, modern and secure encryption tool (and Go library) with small explicit keys, no config options, and UNIX-style composability.},
|
title = {A simple, modern and secure encryption tool (and Go library) with small explicit keys, no config options, and UNIX-style composability.},
|
||||||
author = {Filippo Sotille and Ben Cox and age contributors},
|
author = {Filippo Sotille and Ben Cox and age contributors},
|
||||||
year = 2021,
|
year = 2021,
|
||||||
note={{Available from: \url{https://github.com/FiloSottile/age}. [viewed 2023-05-17]}}
|
note={{Available from: \url{https://github.com/FiloSottile/age}. [viewed 2023-05-23]}}
|
||||||
}
|
}
|
||||||
|
|
||||||
@misc{x25519rfc7748,
|
@misc{x25519rfc7748,
|
||||||
@ -186,7 +186,49 @@
|
|||||||
publisher = "GitHub",
|
publisher = "GitHub",
|
||||||
howpublished = {[online]},
|
howpublished = {[online]},
|
||||||
year = "2007",
|
year = "2007",
|
||||||
note={{Available from: \url{https://github.com/504ensicsLabs/LiME}. [viewed 2023-05-17]}}
|
note={{Available from: \url{https://github.com/504ensicsLabs/LiME}. [viewed 2023-05-23]}},
|
||||||
|
}
|
||||||
|
|
||||||
|
@misc{wwwf,
|
||||||
|
howpublished = {[online]},
|
||||||
|
title = {History of the Web},
|
||||||
|
author = {{World Wide Web Foundation}},
|
||||||
|
year = 2021,
|
||||||
|
note={{Available from: \url{https://webfoundation.org/about/vision/history-of-the-web/}. [viewed 2023-05-23]}}
|
||||||
|
}
|
||||||
|
|
||||||
|
@misc{ddvweb,
|
||||||
|
howpublished = {[online]},
|
||||||
|
title = {What is this Gemini thing anyway, and why am I excited about it?},
|
||||||
|
author = {{Drew DeVault}},
|
||||||
|
year = 2020,
|
||||||
|
month = nov,
|
||||||
|
note={{Available from: \url{https://drewdevault.com/2020/11/01/What-is-Gemini-anyway.html}. [viewed 2023-05-23]}}
|
||||||
|
}
|
||||||
|
|
||||||
|
@misc{gemini,
|
||||||
|
howpublished = {[online]},
|
||||||
|
title = {Project Gemini},
|
||||||
|
author = {Solderpunk and Sean Conner and {{The Gemini Contributors}}},
|
||||||
|
year = 2019,
|
||||||
|
note={{Available from: \url{https://gemini.circumlunar.space/} and over Gemini from: \url{gemini://gemini.circumlunar.space/} [viewed 2023-05-23]}}
|
||||||
|
}
|
||||||
|
|
||||||
|
@misc{geminispec,
|
||||||
|
howpublished = {[online]},
|
||||||
|
title = {Speculative Specification},
|
||||||
|
author = {Solderpunk and Sean Conner and {{The Gemini Contributors}}},
|
||||||
|
year = 2019,
|
||||||
|
note={{Available from: \url{https://gemini.circumlunar.space/docs/specification.gmi} and over Gemini from: \url{gemini://gemini.circumlunar.space/docs/specification.gmi} [viewed 2023-05-23]}}
|
||||||
|
}
|
||||||
|
|
||||||
|
@misc{chromiumrootdns,
|
||||||
|
howpublished = {[online]},
|
||||||
|
title = {This well-intentioned Chrome feature is causing serious problems},
|
||||||
|
author = {Anthony Spadafora},
|
||||||
|
year = 2020,
|
||||||
|
month = aug,
|
||||||
|
note={{Available from: \url{https://www.techradar.com/news/this-well-intentioned-chrome-feature-is-causing-serious-problems} [viewed 2023-05-23]}}
|
||||||
}
|
}
|
||||||
|
|
||||||
% =========================================================================== %
|
% =========================================================================== %
|
||||||
|
158
tex/text.tex
@ -120,31 +120,169 @@ Entropy, dictionaries, multiple factors.
|
|||||||
\n{1}{Web security}\label{sec:websecurity}
|
\n{1}{Web security}\label{sec:websecurity}
|
||||||
|
|
||||||
The internet, being the vast space of intertwined concepts and ideas, is a
|
The internet, being the vast space of intertwined concepts and ideas, is a
|
||||||
superset of the Web, which is the part of the internet that is discussed in the
|
superset of the Web, since not everything that is available on the internet can be
|
||||||
next section.
|
described as web \emph{resources}. The Web is nevertheless precisely the part
of the internet that the next sections discuss, covering what browsers are,
what they do and how they relate to web security.
|
||||||
|
|
||||||
|
|
||||||
\n{2}{Browsers}\label{sec:browsers}
|
\n{2}{Browsers}\label{sec:browsers}
|
||||||
|
|
||||||
The following subsection covers what browsers are, what they do and how they
|
|
||||||
relate to web security.
|
|
||||||
|
|
||||||
TODO: describe how browsers find out where the web page lives, get a webpage,
|
TODO: describe how browsers find out where the web page lives, get a webpage,
|
||||||
parse it, parse stylesheets, run scripts, apply SAMEORIGIN restrictions etc.
|
parse it, parse stylesheets, run scripts, apply SAMEORIGIN restrictions etc.
|
||||||
|
|
||||||
TODO: (privileged process running untrusted code on user's computer), history,
|
TODO: (privileged process running untrusted code on user's computer), history,
|
||||||
present, security focus of the development teams, user facing signalling
|
present, security focus of the development teams, user facing signalling
|
||||||
(padlock colours, scary warnings).
|
(padlock colours, scary warnings).
|
||||||
|
|
||||||
|
Browsers, or more precisely \emph{web browsers}, a name that gives away their
specialisation, are programs intended for \emph{browsing} of \emph{the web}.
In more technical terms, browsers are programs that facilitate (directly or via
intermediary tools) domain name lookups, connecting to web servers, optionally
establishing a secure connection, requesting the web page in question,
determining its \emph{security policy}, resolving what accompanying resources
the web page specifies and, depending on the applicable security policy,
requesting those from their respective origins, applying stylesheets and
running scripts. Constructing a program that speaks many protocols and securely
runs untrusted code from the internet is no easy task.
|
||||||
|
|
||||||
|
\n{3}{Complexity}
|
||||||
|
|
||||||
|
Browsers these days are also quite ubiquitous programs running on
\emph{billions} of consumer-grade mobile devices (which are also notorious for
bad update hygiene) or desktop devices all over the world. Regular users
usually expect them to work flawlessly with a multitude of network conditions,
network scenarios (café WiFi, cellular data in a remote location, home
broadband that is DNS-poisoned by the ISP), differently tuned (or commonly
misconfigured) web servers, a combination of modern and \emph{legacy}
encryption schemes and different levels of conformance to web standards from
both web server and website developers. Of course, if a website is broken,
users tend to blame the browser. Browsers are also expected to detect whether
\emph{captive portals} (a type of access control that usually tries to force
the user through a webpage with terms of use) are active and to offer
redirects. All of this amounts to immense complexity, and the combination of
ubiquity and great exposure this type of software gets is, in the author's
opinion, the cause behind the staggering number of vulnerabilities found,
reported and fixed in browsers every year.
|
||||||
|
|
||||||
|
\n{3}{Standardisation}
|
||||||
|
|
||||||
|
Over the years, a consortium of parties interested in promoting and developing
the web (also due to its potential as a digital marketplace, i.e.\ financial
incentives) and browser vendors (of which the most neutral participant is
perhaps \emph{Mozilla}, with Chrome being run by Google, Edge by Microsoft and
Safari/WebKit by Apple) has produced a great volume of web standards. These are
relatively frequently updated or deprecated and replaced by revised or new
ones, which turns browser maintenance into essentially a cat-and-mouse game.
|
||||||
|
|
||||||
|
It is the web's extensibility that enabled this build-up and, ironically, it
has been proclaimed by some to be its greatest asset. It has also been
criticised~\cite{ddvweb} in the past, and the frustration with the status quo
of web standards has relatively recently even prompted a group of people to
create ``\textit{a new application-level internet protocol for the distribution
of arbitrary files, with some special consideration for serving a lightweight
hypertext format which facilitates linking between files}'':
Gemini~\cite{gemini}\cite{geminispec}, which in the words of its authors can be
thought of as ``\textit{the web, stripped right back to its essence}'' or as
``\textit{Gopher, souped up and modernised just a little}'', depending upon the
reader's perspective, noting that the latter view is probably more accurate.
|
||||||
|
|
||||||
|
\n{3}{HTTP}
|
||||||
|
|
||||||
|
Originally, HTTP was also designed just for fetching hypertext
|
||||||
|
\emph{resources}, but it has evolved since then, particularly due to its
|
||||||
|
extensibility, to allow for fetching of all sorts of web resources a modern
|
||||||
|
website of today provides, such as scripts or images, or even to \emph{post}
|
||||||
|
content back to servers.
|
||||||
|
|
||||||
|
HTTP relies on TCP (Transmission Control Protocol), which is one of the
|
||||||
|
\emph{reliable} (mandated by HTTP) protocols used to send data across
|
||||||
|
contemporary IP (Internet Protocol) networks, to deliver the data it requests
|
||||||
|
or sends. When Tim Berners-Lee invented the World Wide Web (WWW) in 1989 while
|
||||||
|
working at CERN (The European Organization for Nuclear Research) with a rather
|
||||||
|
noble intent as a ``\emph{wide-area hypermedia information retrieval initiative
|
||||||
|
to give universal access to a large universe of documents}''~\cite{wwwf}, he
|
||||||
|
also invented the HyperText Markup Language (HTML) to serve as a formatting
|
||||||
|
method for these new hypermedia documents. The first website was written
|
||||||
|
roughly the same way as today's websites are, using HTML, although the markup
|
||||||
|
language has changed since, with the current version being HTML5.
|
||||||
|
|
||||||
|
It has been mentioned that the client \textbf{requests} a \textbf{resource} and
|
||||||
|
receives a \textbf{response}, so those terms should probably be defined.
|
||||||
|
|
||||||
|
A request is what the client sends to the server. A resource is what it
|
||||||
|
requests and a response is the answer provided by the server.
|
||||||
|
|
||||||
|
HTTP follows a classic client-server model whereby it is \textbf{always} the
|
||||||
|
client that initiates the request.
|
||||||
|
|
||||||
|
A web page is, to be blunt, a chunk of \emph{hypertext}. To display a web page,
a browser first needs to send a request to fetch the HTML representing the
page, which is then parsed and additional requests for sub-resources are made.
If a page defines layout information in the form of CSS, that is parsed as
well.
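
The parsing step described above can be illustrated with a minimal sketch that
uses only the Python standard library. It is not how any real browser engine
works; the names \texttt{SubresourceCollector}, \texttt{list\_subresources}
and the sample page are assumptions made up for this illustration, and only
scripts, images and stylesheets are considered.

```python
from html.parser import HTMLParser

# For each tag of interest, the attribute that points at a sub-resource.
SUBRESOURCE_ATTRS = {"script": "src", "img": "src", "link": "href"}


class SubresourceCollector(HTMLParser):
    """Collects URLs of scripts, images and stylesheets found in HTML."""

    def __init__(self):
        super().__init__()
        self.found = []  # (tag, url) pairs in document order

    def handle_starttag(self, tag, attrs):
        wanted = SUBRESOURCE_ATTRS.get(tag)
        if wanted is None:
            return
        attributes = dict(attrs)
        # A <link> only counts as a sub-resource here if it is a stylesheet.
        if tag == "link" and attributes.get("rel") != "stylesheet":
            return
        url = attributes.get(wanted)
        if url:
            self.found.append((tag, url))


def list_subresources(html):
    """Return the (tag, url) pairs a browser would have to request next."""
    collector = SubresourceCollector()
    collector.feed(html)
    return collector.found


# Hypothetical page used only for demonstration.
PAGE = """<html><head>
<link rel="stylesheet" href="/style.css">
<script src="/app.js"></script>
</head><body><img src="/logo.png"><a href="/about">About</a></body></html>"""

print(list_subresources(PAGE))
```

Note that the plain hyperlink (\texttt{<a href>}) is deliberately not
collected, since it is only followed when the user activates it.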
|
||||||
|
|
||||||
|
A web page needs to be present on the local computer \emph{before} it can be
parsed by the browser, and since websites are usually still served by programs
called \emph{web servers}, as in the \emph{early days}, that presents the
problem of how to tell the browser where the resource should be pulled from. In
today's browsers, the issue is solved (short of the CLI) by the \emph{address
bar}, a place into which the user types what they wish the browser to fetch
for them.
|
||||||
|
|
||||||
|
The formal name for what the user types there is a \emph{Uniform Resource
Locator}, or URL, and it contains the scheme (or the protocol, such as
\texttt{http://}), the host address or a domain name, an optional (TCP) port
number and a path to the resource.
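
As a small illustration of these components, the following sketch splits a URL
using \texttt{urlsplit} from the Python standard library. The helper name
\texttt{describe\_url} and the example address are assumptions chosen for this
example only.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the components discussed in the text."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,  # the protocol, e.g. "http"
        "host": parts.hostname,  # a domain name or an IP address
        "port": parts.port,      # None when the URL does not specify one
        "path": parts.path,
    }


# Hypothetical address used only for demonstration.
print(describe_url("http://example.org:8080/index.html"))
```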
|
||||||
|
|
||||||
|
Since a TCP connection needs to be established first, to connect to a server
whose URL only contains a domain name, the browser needs to perform a domain
name \emph{lookup} using system facilities. A couple of notorious Chromium
versions additionally sent some unrelated queries which (given the number of
Chromium-based derivatives in use) ended up placing unnecessary load directly
on the root DNS servers~\cite{chromiumrootdns}.
|
||||||
|
|
||||||
|
If a raw IP address+port combination is used, the browser attempts to connect
to it directly and requests the user-requested page, by default using the
\texttt{GET} \emph{method}. The \emph{well-known} HTTP port 80 is assumed
unless another port is explicitly specified, so the port can be omitted
regardless of whether the host is a domain name or an IP address.
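
The decision sketched above (is a lookup needed, and which port to use) could
look roughly as follows. This is a simplified illustration, not a description
of any particular browser; the function name \texttt{connection\_target} and
the \texttt{DEFAULT\_PORTS} table are assumptions, and the entry for
\texttt{https} (port 443) goes beyond what the text above mentions.

```python
import ipaddress
from urllib.parse import urlsplit

# Well-known ports assumed when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_target(url):
    """Return (host, port, needs_lookup) for the given URL."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    host = parts.hostname
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    try:
        # A literal IPv4/IPv6 address can be connected to directly.
        ipaddress.ip_address(host)
        needs_lookup = False
    except ValueError:
        # Anything else is treated as a domain name that must be resolved.
        needs_lookup = True
    return host, port, needs_lookup


print(connection_target("http://192.0.2.10/"))
print(connection_target("http://example.org:8080/index.html"))
```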
|
||||||
|
|
||||||
|
The method is a way for the user-agent to define what operation it wants to
|
||||||
|
perform. \texttt{GET} is used for fetching resources while \texttt{POST} is
|
||||||
|
used to send data to the server, such as to post the values of an HTML form.
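
To make the difference between the two methods more tangible, the sketch below
assembles the text of a simple HTTP/1.1 request by hand. It is meant purely as
an illustration under simplifying assumptions: the helper
\texttt{build\_request}, the host and the form field are made up, and a real
user agent sends many more headers.

```python
from urllib.parse import urlencode


def build_request(method, host, path, form=None):
    """Return the text of a minimal HTTP/1.1 request."""
    # Form values travel in the body, encoded as key=value pairs.
    body = urlencode(form) if form else ""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append("Content-Type: application/x-www-form-urlencoded")
        lines.append(f"Content-Length: {len(body.encode('ascii'))}")
    # An empty line separates the headers from the (optional) body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


# Fetching a resource versus posting the values of an HTML form.
print(build_request("GET", "example.org", "/"))
print(build_request("POST", "example.org", "/search", {"q": "web security"}))
```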
|
||||||
|
|
||||||
|
A server response comprises a \textbf{status code}, a status message, HTTP
\textbf{headers} and an optional \textbf{body} containing the content. The
status code indicates whether the original request was successful or not, and
the browser is generally there to interpret these status codes for the user.
There are enough status codes to be confused by the sheer number of them but
luckily, there is a method to the madness and they can be divided into
groups/classes:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item 1xx: Informational responses
|
||||||
|
\item 2xx: Successful responses
|
||||||
|
\item 3xx: Redirection responses
|
||||||
|
\item 4xx: Client error responses
|
||||||
|
\item 5xx: Server error responses
|
||||||
|
\end{itemize}
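
The grouping listed above depends only on the first digit of the status code,
which a short sketch can demonstrate. The names \texttt{STATUS\_CLASSES} and
\texttt{status\_class} are assumptions introduced for this example.

```python
# The class of a status code is determined by its first digit.
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code):
    """Map a three-digit HTTP status code to its class."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


for code in (200, 301, 404, 503):
    print(code, status_class(code))
```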
|
||||||
|
|
||||||
|
In case the \emph{user agent} (a web \emph{client}) such as a browser receives
|
||||||
|
a response with content, it has to parse it.
|
||||||
|
|
||||||
|
A header is additional information sent by both the server and the client.
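
How a user agent might take a response apart into the status code, status
message, headers and body can be sketched as follows. This is a deliberately
naive illustration: the function \texttt{parse\_response} and the sample
response are assumptions, and the sketch ignores repeated headers, chunked
transfer encoding and other details a real client has to handle.

```python
def parse_response(raw):
    """Split raw HTTP/1.x response bytes into (code, message, headers, body)."""
    # The first empty line separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # Status line layout: "HTTP/1.1 404 Not Found".
    fields = status_line.split(" ", 2)
    code = int(fields[1])
    message = fields[2] if len(fields) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so they are normalised here.
        headers[name.strip().lower()] = value.strip()
    return code, message, headers, body


# Hypothetical response used only for demonstration.
RAW = (
    b"HTTP/1.1 404 Not Found\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 9\r\n"
    b"\r\n"
    b"Not here\n"
)

print(parse_response(RAW))
```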
|
||||||
|
|
||||||
|
|
||||||
\n{2}{Cross-site scripting}\label{sec:xss}
|
\n{2}{Cross-site scripting}\label{sec:xss}
|
||||||
|
|
||||||
\n{2}{Content Security Policy}\label{sec:csp}
|
\n{2}{Content Security Policy}\label{sec:csp}
|
||||||
|
|
||||||
Content Security Policy has been an important addition to the arsenal of
|
Content Security Policy has been an important addition to the arsenal of
|
||||||
website administrators, even though not everybody has necessarily taken notice
|
website operators, even though not everybody has necessarily been utilising it
|
||||||
or even utilised it properly. To understand what guarantees it provides and
|
properly or even taken notice. To understand what guarantees it provides and
|
||||||
what kind of protections it employs, it is first necessary to grok how websites
|
what kind of protections it employs, it is first necessary to grok how websites
|
||||||
are parsed and displayed, which has been discussed in depth in
|
are parsed and displayed, which has been discussed in depth in previous
|
||||||
Section~\ref{sec:browsers}.
|
sections.
|
||||||
|
|
||||||
|
|
||||||
\n{1}{Sandboxing}\label{sec:sandboxing}
|
\n{1}{Sandboxing}\label{sec:sandboxing}
|
||||||
\n{2}{User isolation}
|
\n{2}{User isolation}
|
||||||
|