
tex: add batch 1

leo 2023-05-19 23:55:38 +02:00
parent 46008320d5
commit ef8673e9a2
Signed by: wanderer
SSH Key Fingerprint: SHA256:Dp8+iwKHSlrMEHzE3bJnPng70I7LEsa3IJXRH/U+idQ
4 changed files with 271 additions and 41 deletions

Binary file not shown.


@ -8,14 +8,17 @@
SHA & Secure Hash Algorithm \\
AES & Advanced Encryption Standard \\
ID & identity \\
PID & process ID \\
Cgroup & control group \\
ID & Identity \\
PID & Process ID \\
Cgroup & Control group \\
TLS & Transport Layer Security \\
SSH & Secure Shell \\
GPG & GNU Privacy Guard \\
GNU & GNU's Not Unix! \\
CSS & Cascading Style Sheets \\
API & Application Programming Interface \\
SCM & Source Code Management \\
HIBP & Have I Been Pwned \\
TOML & Tom's Obvious Minimal Language \\
@ -25,6 +28,8 @@ INI & Initialization file \\
OWASP & Open Web Application Security Project \\
NIST & National Institute of Standards and Technology \\
SEO & Search-Engine Optimisation \\
\end{tabular}
% =========================================================================== %


@ -148,4 +148,13 @@
note={{Available from: \url{https://man7.org/linux/man-pages/man7/namespaces.7.html}. [viewed 2023-05-17]}}
}
@misc{agwagitssh,
howpublished = {[online]},
title = {It's Now Possible To Sign Arbitrary Data With Your SSH Keys},
author = {Andrew Ayer},
year = 2021,
month = nov,
note={{Available from: \url{https://www.agwa.name/blog/post/ssh_signatures}. [viewed 2023-05-17]}}
}
% =========================================================================== %


@ -31,13 +31,13 @@ Linux kernel~\cite{linux}.
\n{2}{GNU/Linux}
When talking about an operating system, the term ``GNU/Linux'' as defined by
the Free Software Foundation~\cite{fsfgnulinux} is used. While it is longer and
arguably a little bit cumbersome, the author aligns with the opinion that this
term more correctly describes its actual target. Being aware there are many
people that conflate the complete operating system with its (be it core)
component, the kernel, the author is taking care to distinguish the two,
although writing from experience, colloquially, this probably brings more
As far as a Linux-based operating system is concerned, the term ``GNU/Linux''
as defined by the Free Software Foundation~\cite{fsfgnulinux} is used. While it
is longer and arguably a little cumbersome, the author aligns with the opinion
that this term more correctly describes its actual target. Being aware that
many people conflate the complete operating system with its (albeit core)
component, the kernel, the author takes care to distinguish the two, although,
speaking from experience, in colloquial use this tends to bring more
confusion and a lengthy explanation is usually required.
@ -71,7 +71,7 @@ Pre-requisites necessary for following up.
Explanation. What are hash functions
\n{3}{Uses and \textit{mis}uses}
The good the bad and the ugly of hash usage (including or in some cases
The good, the bad and the ugly of hash usage (including or in some cases
excluding salting, weak hashes, split hashes (Microsoft)).
\n{3}{Threats to hashes}
@ -93,7 +93,7 @@ Generally.
\n{3}{Arbitrary length requirements (min/max)}
\n{3}{Arbitrary complexity requirements}
\n{3}{Restricting special characters}
Service providers have too often been found to forbid the use of so called
Service providers have too often been found forbidding the use of so-called
\textit{special characters} in passwords for as long as passwords have been
used to protect privileged access. Ways of achieving the same may vary but the
intent stays the same: prevent users from inputting characters into the system,
@ -105,12 +105,14 @@ Entropy, dictionaries, multiple factors.
\n{1}{Web security}\label{sec:websecurity}
The internet being the vast space of intertwined concepts and ideas, is a
The internet, being the vast space of intertwined concepts and ideas, is a
superset of the Web, which is the part of the internet zoomed in on in this
section.
\n{2}{Browsers}\label{sec:browsers}
What they are, what do they do, how do they relate into the security aspect
What they are, what they do, how they relate to the security aspect
(privileged process running untrusted code on user's computer), history,
present, security focus of the dev teams, user facing signalling (padlock
colours, scary warnings).
@ -118,6 +120,7 @@ colours, scary warnings).
TODO: describe how browsers find out where the web page lives, get a webpage,
parse it, parse stylesheets, run scripts, apply SAMEORIGIN restrictions etc.
\n{2}{Cross-site scripting}\label{sec:xss}
\n{2}{Content Security Policy}\label{sec:csp}
@ -173,12 +176,15 @@ following are certain to make the count:
\textbf{Disclaimer:} the author is not affiliated in any way with any of the
projects described on this page.
The \textit{Password Compromise Monitoring Tool} (\texttt{pcmt}) program has been
developed using and utilising a great deal of open-source software in the
process, either directly or as an outstanding work tool, and the author would
like to take this opportunity to recognise that fact.
The \textit{Password Compromise Monitoring Tool} (\texttt{pcmt}) program has
been developed using a great deal of free (as in freedom) and open-source
software, either directly or as an outstanding work tool, and the author would
like to take this opportunity to recognise that fact.
In particular, the author acknowledges that this work would not be the same
without:
In particular, this work would not be the same without:
\begin{itemize}
\item vim (\url{https://www.vim.org/})
\item Arch Linux (\url{https://archlinux.org/})
@ -192,13 +198,159 @@ In particular, this work would not be the same without:
All of the code written has been typed into VIM (\texttt{9.0}), the shell used
to run the commands was ZSH, both running in the author's terminal emulator of
choice - \texttt{kitty} on a (at the time of writing) 8 month installation of
\textit{Arch Linux (by the way)} using a \texttt{6.3.1-wanderer-zfs-xanmod1}
variant of the Linux kernel.
choice - \texttt{kitty} on a \raisebox{.8ex}{\texttildelow}8 month (at the time
of writing) installation of \textit{Arch Linux (by the way)} using a
\texttt{6.3.1-wanderer-zfs-xanmod1} variant of the Linux kernel.
\n{1}{Development}
The source code of the project has been versioned since the start using the
popular and industry-standard git (\url{https://git-scm.com}) source code
management (SCM) tool. Commits were made frequently and, if at all possible,
for small and self-contained changes of code, trying to follow sane commit
message \emph{hygiene}, i.e.\ striving for meaningful and well-formatted commit
messages. The name of the default branch is \texttt{development}, since that is
what the author likes to choose for new projects that are not yet stable (it is
in fact the default in the author's \texttt{.gitconfig}).
\n{2}{Commit signing}
Since git allows cryptographically \emph{signing} all commits, it would be
unwise not to take advantage of this. For the longest time, GPG was the only
method available for signing commits in git; however, that is no longer the
case~\cite{agwagitssh}. These days, it is also possible to both sign and
verify one's git commits (and tags!) using SSH keys, namely those produced by
OpenSSH (the same ones that can be used to log in to remote systems). The
author has, of course, not reused the key pair that is used to connect to
machines for signing commits. A different \texttt{Ed25519} elliptic curve key
pair has been used specifically for signing. The public component of this key
is enclosed with this thesis as an attachment for future reference.
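
To illustrate the primitive underlying such signatures, the following is a
minimal Go sketch that signs a message with an \texttt{Ed25519} key and then
verifies it. It is not how git or OpenSSH implement signing: the real SSH
signature format wraps the signed data in additional structure, and both the
helper name \texttt{signAndVerify} and the throwaway key pair are assumptions
made purely for illustration.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// signAndVerify generates a throwaway Ed25519 key pair (an assumption of this
// sketch; a real setup would load an existing key), signs msg with the private
// key and reports whether that signature verifies against candidate.
func signAndVerify(msg, candidate []byte) (bool, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return false, err
	}

	// The signature is detached: it travels alongside the data it covers.
	sig := ed25519.Sign(priv, msg)

	return ed25519.Verify(pub, candidate, sig), nil
}

func main() {
	commit := []byte("tex: add batch 1")

	ok, err := signAndVerify(commit, commit)
	fmt.Println("untouched data verifies:", ok, err)

	ok, err = signAndVerify(commit, []byte("tex: add batch 2"))
	fmt.Println("tampered data verifies:", ok, err)
}
```

The sketch only shows that a signature made over one byte sequence does not
verify against a different one, which is the property commit signing relies on.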
The validity of a signature on a particular commit can be viewed with git using
the following commands (the \% sign denotes the shell prompt):
\begin{figure}[h]
\centering
\begin{varwidth}{\linewidth}
\begin{verbatim}
% cd <cloned project dir>
% git show --show-signature <commit>
% # alternatively:
% git verify-commit <commit>
\end{verbatim}
\end{varwidth}
\caption{Verifying signature of a git commit}
\label{fig:gitverif}
\end{figure}
There is one caveat to this, though: git first needs some additional
configuration for the code in Figure~\ref{fig:gitverif} to work as one would
expect. Namely, the public key used to verify the signature needs to be
stored in git's ``allowed signers file'', git needs to be told where that
file is using the configuration value \texttt{gpg.ssh.allowedsignersfile}, and
finally the configuration value of the \texttt{gpg.format} field needs to be
set to \texttt{ssh}.
Since git allows the configuration values to be local to each repository, all
of the above can be taken care of by running the following commands from
inside of the cloned repository:
\begin{figure}[h]
\centering
\begin{varwidth}{\linewidth}
\scriptsize
\begin{verbatim}
% # set the signature format for the local repository.
% git config --local gpg.format ssh
% # save the public key.
% cat >./tmp/.allowed_signers \
<<<'leo ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIKwshTdBgLzwY4d8N7VainZCngH88OwvPGhZ6bm87rBO'
% # set the allowed signers file path for the local repository.
% git config --local gpg.ssh.allowedsignersfile ./tmp/.allowed_signers
\end{verbatim}
\end{varwidth}
\caption{Prepare allowed signers file and signature format for git}
\label{fig:gitsshprep}
\end{figure}
After the code in Figure~\ref{fig:gitsshprep} is run, everything from
Figure~\ref{fig:gitverif} should remain applicable for the lifetime of the
repository or until git changes its implementation of signature verification.
For future reference, git version \texttt{2.40.1} has been used.
\n{2}{Continuous Integration}
To increase both the author's and the public's confidence in the atomic changes
made over time, an attempt was made to thoroughly \emph{integrate} them using a
continuous integration (CI) service that has been plugged into the main source
code repository since the early stages of development. This, of course, was
again self-hosted, including the workers. The tool of choice there was Drone
(\url{https://drone.io}) and the ``docker'' runner (which in fact runs any OCI
container) was used to run the builds.
The way this runner works is that it creates an ephemeral container for every
pipeline step and executes the given \emph{commands} inside of it. At the end
of each step the container is discarded, while the repository, which is mounted
into each container at \texttt{/drone/src}, is persisted between steps. This
allows it to be cloned from \emph{origin} only once, at the start of the
pipeline, and then shared by all of the following steps, saving bandwidth, time
and disk writes.
The entire configuration used to run the pipelines can be found in a file named
\texttt{.drone.yml} at the root of the main source code repository. The
workflow consists of three pipelines, which are run in parallel. Two main
pipelines are defined to build the binary and run tests on \texttt{x86\_64}
GNU/Linux targets, one for each of Arch and Alpine (version 3.17).
These two pipelines are identical apart from OS-specific bits such as
installing a certain package, etc.
For the record, other OS-architecture combinations were not tested.
A third pipeline was defined to build a popular static analysis tool called
\texttt{golangci-lint} from sources and then analyse the project's codebase
using the freshly built binary. \texttt{golangci-lint} is a sort of
meta-linter, bundling a staggering number of linters (a linter is a tool that
performs static code analysis and can raise awareness of programming errors,
flag potentially buggy code constructs, or \emph{mere} stylistic errors). If
this step succeeds, a handful of code analysis services get pinged in the next
steps to take notice of the changes to the project's source code and update
their metrics. Details can be found in the main Drone configuration file
\texttt{.drone.yml}, and the configuration of \texttt{golangci-lint} can be
found in the root of the repository in the file named \texttt{.golangci.yml}.
The median build time as of writing was 1 minute, which includes running all
three pipelines, and that is acceptable.
\obr{Drone CI median
build}{fig:drone-median-build}{.77}{graphics/drone-median-build}
\n{2}{Source code repositories}\label{sec:repos}
All of the pertaining source code was published in repositories on a publicly
available git server operated by the author. The reasoning \emph{pro}
self-hosting is that it is the preferred way of guaranteeing autonomy over
one's source code, as opposed to large silos owned by big corporations with a
track record of arguably not always deciding with their users' best interests
in mind, acting on impulse or under public pressure (potentially at least
temporarily disrupting their users' operations), and binding their users to
lengthy \emph{terms of service} that \emph{can change at any time}. Granted,
decentralisation can take a toll on the discoverability of the project, but
that is not of concern here.
The git repository containing source code of the \texttt{pcmt} project:\\
\url{https://git.dotya.ml/mirre-mt/pcmt.git}.
The git repository hosting the \texttt{pcmt} configuration schema:\\
\url{https://git.dotya.ml/mirre-mt/pcmt-config-schema.git}.
The repository containing the \LaTeX{} source code of this thesis:\\
\url{https://git.dotya.ml/mirre-mt/masters-thesis.git}.
\n{2}{Toolchain}
Throughout the creation of this work, the \emph{current} version of the Go
@ -234,7 +386,8 @@ errors are values.
The appeal for the author comes from a number of language features, such as
built-in support for concurrency, testing, sane \emph{zero} values, lack of
pointer arithmetic, inheritance and implicit type conversions, easy-to-read
syntax, producing a statically linked binary by default, etc.
syntax, producing a statically linked binary by default, etc. On top of that,
the language has got a cute mascot.
Due to the foresight of the Go Authors regarding \emph{the
formatting question} (i.e.\ where to put the braces, tabs vs.\ spaces, etc.),
@ -277,12 +430,79 @@ usage of custom types (that are, of course merely combinations of the primitive
types that the language provides, such as \emph{Bool}, \emph{Natural} or
\emph{List}, to name just a few), so it was not exceedingly hard to start
designing a custom configuration \emph{schema} for the program.
Dhall not being a Turing-complete language also guarantees that evaluation
\emph{always} terminates eventually, which is a good attribute to possess as a
configuration language.
\n{3}{Schema}
The configuration schema was at first developed as part of the main project's
repository, before it was determined that both the development and overall
clarity would benefit if the schema lived in its own repository (see
Section~\ref{sec:repos} for details).
\begin{figure}[h]
\begin{varwidth}{\linewidth}
\scriptsize
\begin{verbatim}
let Schema =
{ Type =
{ Host : Text
, Port : Natural
, HTTP :
{ Domain : Text
, Secure : Bool
, AutoTLS : Bool
, TLSKeyPath : Text
, TLSCertKeyPath : Text
, HSTSMaxAge : Natural
, ContentSecurityPolicy : Text
, RateLimit : Natural
, Gzip : Natural
, Timeout : Natural
}
, Mailer :
{ Enabled : Bool
, Protocol : Text
, SMTPAddr : Text
, SMTPPort : Natural
, ForceTrustServerCert : Bool
, EnableHELO : Bool
, HELOHostname : Text
, Auth : Text
, From : Text
, User : Text
, Password : Text
, SubjectPrefix : Text
, SendPlainText : Bool
}
, LiveMode : Bool
, DevelMode : Bool
, AppPath : Text
, Session :
{ CookieName : Text
, CookieAuthSecret : Text
, CookieEncrSecret : Text
, MaxAge : Natural
}
, Logger : { JSON : Bool, Fmt : Optional Text }
, Init : { CreateAdmin : Bool, AdminPassword : Text }
, Registration : { Allowed : Bool }
}
, default = {=}
}
in Schema
\end{verbatim}
\end{varwidth}
\caption{Dhall configuration schema version 0.0.1-rc.1}
\label{fig:dhallschema}
\end{figure}
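
As an illustration of how a schema of this shape could be consumed on the
program side, the following Go sketch mirrors a subset of the fields from
Figure~\ref{fig:dhallschema}. The type names, the choice of \texttt{uint} for
\emph{Natural}, the pointer used for \emph{Optional Text}, the helper
\texttt{fmtOr} and the sample values are all assumptions of this sketch; it is
not the program's actual configuration type, and decoding from Dhall is
deliberately left out.

```go
package main

import "fmt"

// Session mirrors the Session record of the schema.
type Session struct {
	CookieName       string
	CookieAuthSecret string
	CookieEncrSecret string
	MaxAge           uint // Dhall's Natural is non-negative.
}

// Logger mirrors the Logger record; Optional Text is modelled as a pointer,
// so that an absent value (nil) differs from an empty string.
type Logger struct {
	JSON bool
	Fmt  *string
}

// Config is a hypothetical, deliberately partial counterpart of the schema.
type Config struct {
	Host      string
	Port      uint
	LiveMode  bool
	DevelMode bool
	Session   Session
	Logger    Logger
}

// fmtOr returns the optional logger format, or def when it is absent.
func fmtOr(l Logger, def string) string {
	if l.Fmt == nil {
		return def
	}
	return *l.Fmt
}

func main() {
	// Sample values are illustrative only.
	cfg := Config{Host: "localhost", Port: 3000}
	fmt.Println(cfg.Host, cfg.Port, fmtOr(cfg.Logger, "unset"))
}
```

Since the field names in the schema are already capitalised, they map directly
onto exported Go struct fields in this sketch.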
\n{3}{Safety considerations}
Having a programmable configuration language that understands functions and
allows importing not only arbitrary text from random internet URLs, but also
importing and \emph{evaluating} (i.e.\ running) potentially untrusted code, it
@ -298,15 +518,16 @@ shortcomings of Dhall, namely long start-up with \emph{cold cache}, which can
generally be observed in the scenario of running the program in a
\emph{container}.
The way that Dhall works is it resolves every expression down to a combination
of its most basic types (eliminating all abstraction and indirection) in the
process called \textbf{normalisation}~\cite{dhallnorm} and then saves this
result in the hosts cache. The \texttt{dhall-haskell} binary attempts to
resolve the variable \texttt{XDG\_CACHE\_HOME} (have a look at \emph{XDG Base
Directory Spec}~\cite{xdgbasedirspec} for details) to decide \emph{where} the
results of the normalisation will be written for repeated use. Do note that
this behaviour has been observed on a GNU/Linux host and the author has not
verified this behaviour on a non-GNU/Linux host.
When Dhall performs an evaluation, it resolves every expression down to a
combination of its most basic types (eliminating all abstraction and
indirection) in a process called \textbf{normalisation}~\cite{dhallnorm} and
then saves this result in the host's cache. The \texttt{dhall-haskell} binary
attempts to resolve the variable \texttt{XDG\_CACHE\_HOME} (have a look at
\emph{XDG Base Directory Spec}~\cite{xdgbasedirspec} for details) to decide
\emph{where} the results of the normalisation will be written for repeated use.
Do note that this behaviour has been observed on a GNU/Linux host and the
author has not verified it on a non-GNU/Linux host.
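
The lookup rule from the XDG Base Directory Spec can be sketched in Go as
follows. This is not the code of \texttt{dhall-haskell}; the function name
\texttt{cacheBase} and the injected lookup function are assumptions of this
sketch, and it resolves only the base directory, not any subdirectory a
particular tool might create inside of it.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// cacheBase returns the base cache directory according to the XDG Base
// Directory Spec: a non-empty, absolute $XDG_CACHE_HOME wins, otherwise the
// default $HOME/.cache is used. The environment lookup is passed in so that
// the logic can be exercised without touching the real environment.
func cacheBase(getenv func(string) string) string {
	if dir := getenv("XDG_CACHE_HOME"); dir != "" && filepath.IsAbs(dir) {
		return dir
	}

	// Unset, empty or relative values are ignored, as the spec prescribes.
	return filepath.Join(getenv("HOME"), ".cache")
}

func main() {
	fmt.Println(cacheBase(os.Getenv))
}
```

Knowing which directory is resolved makes it possible to, for instance, point
the variable at a location that outlives a single container run.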
If normalisation is performed inside an ephemeral container (as opposed to, for
instance, an interactive desktop session), the results effectively get lost on
@ -406,11 +627,6 @@ access to the database.
\n{1}{Implementation}
\n{2}{Compromise checking}
Periodicity, alerting mechanisms (email, Telegram..)
\\
The above will be scrapped as there will be no way for the application to
access the user data since it will be encrypted by a key, passphrase to which
only the user knows.
\n{3}{Have I Been Pwned? Integration}
TODO
@ -546,9 +762,9 @@ namespaces} (on GNU/Linux) would influence the process (given that the
need to account.
Equally, if the application is running inside the container, the operator needs
to make sure the database is either running in a network that is also directly
attached to the container or that there is a mechanism in place that routes the
requests for the database hostname to the destination.
to make sure that the database is either running in a network that is also
directly attached to the container or that there is a mechanism in place that
routes the requests for the database hostname to the destination.
One such mechanism is container name based routing inside \emph{pods}
(Podman/Kubernetes), where the resolution of container names is the