% =========================================================================== %
\part{Practical part}

\n{1}{Introduction}

A part of the task of this thesis was to build an actual application, which was
named Password Compromise Monitoring Tool, or \texttt{pcmt} for short.
Therefore, the development process, the general tools and practices as well as
the specific outcome are all described in the following sections. A whole
section is dedicated to application architecture, whereby relevant engineering
choices are justified and the motives preceding the decisions are explained. This
part then flows into recommendations for a more production-oriented deployment
and concludes by describing the validation methods chosen and used to ensure
correctness and stability of the program.

\n{2}{Kudos}

The program that has been developed as part of this thesis made use of a
great deal of free (as in \textit{freedom}) and open-source software in the
process, either directly or as an outstanding work tool, and the author would
like to take this opportunity to recognise that fact\footnotemark{}.

\footnotetext{\textbf{Disclaimer:} the author is not affiliated with any of the
projects mentioned on this page.}

In particular, the author acknowledges that this work would not be the same
without:

\begin{itemize}
  \item vim (\url{https://www.vim.org/})
  \item Arch Linux (\url{https://archlinux.org/})
  \item ZSH (\url{https://www.zsh.org/})
  \item kitty (\url{https://sw.kovidgoyal.net/kitty/})
  \item Nix (\url{https://nixos.org/explore.html})
  \item pre-commit (\url{https://pre-commit.com/})
  \item Podman (\url{https://podman.io/})
  \item Go (\url{https://go.dev/})
\end{itemize}

All the code was typed into vim, the shell used was ZSH, and the terminal
emulator of choice was \texttt{kitty}. The development machines ran a recent
installation of \textit{Arch Linux}\footnotemark{} and Fedora 38, both using a
\texttt{6.\{2,3,4\}.x} XanMod variant of the Linux kernel.

\footnotetext{(by the way) \url{https://i.redd.it/mfrfqy66ey311.jpg}.}

\n{1}{Development}

The source code of the project has been versioned since the start, using the
popular and industry-standard git (\url{https://git-scm.com}) source code
management (SCM) tool. Commits were made frequently and, if at all possible,
consisted of small and self-contained changes of code, trying to follow sane
commit message \emph{hygiene}, i.e.\ striving for meaningful and well-formatted
commit messages. The name of the default branch is \texttt{development}, since
that is what the author likes to choose for new projects that are not yet
stable (it is in fact the default in the author's \texttt{.gitconfig}).

\n{2}{Commit signing}

Since git allows cryptographically \emph{signing} all commits, it would be
unwise not to take advantage of this. For the longest time, GPG was the only
method available for signing commits in git; however, that is no longer
the case~\cite{agwagitssh}. These days, it is also possible to both sign and
verify one's git commits (and tags!) using SSH keys, namely those produced by
OpenSSH, which \emph{can} be the same ones that are used to log in to remote
systems. The author has, of course, not reused the key pairs that are used
to connect to machines for signing commits. Separate \texttt{Ed25519}
elliptic curve key pairs were used specifically for signing. The public
components of these keys are enclosed in this thesis as
Appendix~\ref{appendix:signingkeys} for future reference.

The validity of a signature on a particular commit can be viewed with git using
the following commands (the \% sign denotes the shell prompt):

\vspace{\parskip}
\begin{lstlisting}[language=bash, caption={Verifying the signature of a git commit},
  label=gitverif, basicstyle=\linespread{0.9}\small\ttfamily,
  backgroundcolor=\color{lstbg}]
% cd <cloned project dir>
% git show --show-signature <commit>
% # alternatively:
% git verify-commit <commit>
\end{lstlisting}
\vspace*{-\baselineskip}

There is one caveat, though: git first needs some additional
configuration for the code in Listing~\ref{gitverif} to work as one would
expect. Namely, the public key used to verify the signature needs to be
stored in git's ``allowed signers file'', git needs to be told where that
file is located using the configuration value
\texttt{gpg.ssh.allowedsignersfile}, and finally the \texttt{gpg.format}
configuration value needs to be set to \texttt{ssh}. Luckily, because
git also allows configuration values to be local to each repository, all
of the mentioned issues can be solved by running the following commands from
inside the cloned repository:

\vspace{\parskip}
\begin{lstlisting}[language=bash, caption={Prepare allowed signers file and signature format for git},
  label=gitsshprep, basicstyle=\linespread{0.9}\small\ttfamily,
  backgroundcolor=\color{lstbg}]
% # set the signature format for the local repository.
% git config --local gpg.format ssh
% # save the public keys, one "<principal> <public key>" entry per line.
% cat > ./.tmp-allowed_signers \
<<<'surtur <insert literal surtur pubkey>
leo <insert literal leo pubkey>'
% # set the allowed signers file path for the local repository
% # (git config takes the key and the value as separate arguments).
% git config --local gpg.ssh.allowedsignersfile ./.tmp-allowed_signers
\end{lstlisting}
\vspace*{-\baselineskip}

After the code in Listing~\ref{gitsshprep} is run, everything from
Listing~\ref{gitverif} should remain applicable for the lifetime of the
repository or until git changes its implementation of signature verification.
The git \texttt{user.name} that can be seen on the commits in the
\textbf{Author} field is named after the machine that was used to develop the
program, since the author uses different signing keys on each machine. That way
the committer machine can be determined post-hoc.

For future reference, git has been used in the version \texttt{git version
2.4\{0,1,2\}.x}.

\n{2}{Continuous Integration}

To increase both the author's and the public's confidence in the atomic changes
made over time, an attempt was made to thoroughly \emph{integrate} them using a
continuous integration (CI) service that was plugged into the main source code
repository since the early stages of development. This, of course, was again
self-hosted, including the workers. The tool of choice there was Drone
(\url{https://drone.io}) and the ``docker'' runner (in fact it runs any OCI
container) was used to run the builds.

The way this runner works is that it creates an ephemeral container for every
pipeline step and executes the given \emph{commands} inside of it. At the end of
each step, the container is discarded while the repository clone, which is
mounted into each container's \texttt{/drone/src}, is persisted between steps,
allowing it to be cloned from \emph{origin} only at the start of the pipeline
and then shared by all the following steps, saving bandwidth, time and disk
writes.

The entire configuration used to run the pipelines can be found in a file named
\texttt{.drone.yml} at the root of the main source code repository. The
workflow consists of four pipelines, which are run in parallel. Two main
pipelines are defined to build the frontend assets and the \texttt{pcmt} binary
and to run tests on \texttt{x86\_64} GNU/Linux targets, one for each of Alpine
(version 3.1\{7,8\}) and Arch. These two pipelines are identical apart from
OS-specific bits such as installing a certain package, etc. For the record,
other OS-architecture combinations were not tested.

A third pipeline contains instructions to build a popular static analysis tool
called \texttt{golangci-lint} from sources and then perform the analysis of the
project's codebase using the freshly built binary. The tool is a sort of
meta-linter, bundling a staggering number of linters (a linter is a tool that
performs static code analysis and can raise awareness of programming errors,
flag potentially buggy code constructs, or \emph{mere} stylistic errors). If
the result of this step is successful, a handful of code analysis services get
pinged in the next steps to take notice of the changes to the project's source
code and update their metrics. Details can be found in the main Drone
configuration file \texttt{.drone.yml}, and the configuration for the
\texttt{golangci-lint} tool itself (such as what linters are enabled/disabled
and their configurations) can be found in the root of the repository in the
file named \texttt{.golangci.yml}.

The fourth pipeline focuses on linting the \texttt{Containerfile}, building
the container and pushing it to a public container registry, although the
latter action is only performed on feature branches and on \emph{pull request}
or \emph{tag} events.

\obr{Drone CI median build
time}{fig:drone-median-build}{.84}{graphics/drone-median-build}

The median build time as of writing was 1 minute, which includes running all
four pipelines, and that is acceptable. Build times might of course vary
depending on the hardware; for reference, these builds were run on a machine
equipped with a Zen 3 Ryzen 5 5600 CPU at nominal clock speeds, DDR4 RAM @ 3200
MHz, a couple of PCIe Gen 4 NVMe drives in a mirrored setup (using ZFS) and
a 600 Mbps downlink, software-wise running Arch with an author-flavoured XanMod
kernel version 6.\{2,3,4\}.x.

\n{2}{Source code repositories}\label{sec:repos}

The git repository containing source code of the \texttt{pcmt} project:\\
\url{https://git.dotya.ml/mirre-mt/pcmt.git}.
The git repository hosting the \texttt{pcmt} configuration schema:\\
\url{https://git.dotya.ml/mirre-mt/pcmt-config-schema.git}.
The repository containing the \LaTeX{} source code of this thesis:\\
\url{https://git.dotya.ml/mirre-mt/masters-thesis.git}.

All the pertaining source code was published in repositories on a publicly
available git server operated by the author. The reasoning \emph{pro}
self-hosting is that it is the preferred way of guaranteeing autonomy over
one's source code, as opposed to large silos owned by big corporations with a
track record of arguably not always deciding with users' best interests in mind
(although recourse has been observed~\cite{ytdl}). When these providers act on
impulse or under public pressure, they can potentially (at least temporarily)
disrupt the operations of their users. Thus, users are not only bound by
lengthy \emph{terms of service} that \emph{are subject to change at
any given moment}, but are also exposed to outside factors beyond their
control. Granted, decentralisation can take a toll on the discoverability of
the project, but that is only a concern if rapid market penetration is a goal,
not when aiming for an organically grown community.

\n{2}{Toolchain}

Throughout the creation of this work, the \emph{then-current} version of the Go
programming language was used, i.e.\ \texttt{go1.20}.

To read more on why Go was chosen in particular, see
Appendix~\ref{appendix:whygo}. Equally, Nix and Nix-based tools such as
\texttt{devenv} have also aided heavily during development; more on those is
written in Appendix~\ref{appendix:whynix}.

\tab{Tool/Library-Usage Matrix}{tab:toolchain}{1.0}{ll}{
  \textbf{Tool/Library} & \textbf{Usage} \\
  Go programming language & program core \\
  Dhall configuration language & program configuration \\
  Echo & HTTP handlers, controllers \\
  ent & ORM using graph-based modelling \\
  pq & Pure-Go Postgres drivers \\
  bluemonday & sanitising HTML \\
  TailwindCSS & utility-first approach to Cascading Style Sheets \\
  PostgreSQL & persistent data storage \\
}

Table~\ref{tab:depsversionmx} contains the names and versions of the most
important libraries and supporting software that were used to build the
application.

\tab{Dependency-Version Matrix}{tab:depsversionmx}{1.0}{ll}{
  \textbf{Name} & \textbf{Version} \\
  \texttt{echo} (\url{https://echo.labstack.com/}) & 4.11.1 \\
  \texttt{go-dhall} (\url{https://github.com/philandstuff/dhall-golang}) & 6.0.2 \\
  \texttt{ent} (\url{https://entgo.io/}) & 0.12.3 \\
  \texttt{pq} (\url{https://github.com/lib/pq/}) & 1.10.9 \\
  \texttt{bluemonday} (\url{https://github.com/microcosm-cc/bluemonday}) & 1.0.25 \\
  \texttt{tailwindcss} (\url{https://tailwindcss.com/}) & 3.3.0 \\
  \texttt{PostgreSQL} (\url{https://www.postgresql.org/}) & 15.3 \\
}

Additionally, the dependency-version mapping for the Go program can be inferred
from looking at the \texttt{go.mod}'s first \textit{require} block at any point
in time. The same can be achieved for the \emph{frontend} by glancing at the
\texttt{package-lock.json} file.

\n{1}{Application architecture}

The application is written in Go and uses \textit{Go modules}. The full name of
the module is \texttt{git.dotya.ml/mirre-mt/pcmt}.

\obr{Application class diagram}{fig:classdiagram}{.79}{graphics/pcmt-class-diagram.pdf}

\n{2}{Package structure}

The source code of the module is organised into smaller, self-contained Go
\emph{packages} appropriately along a couple of domains: logging, core
application, web routers, configuration and settings, etc. In Go, packages are
delimited by folder structure: each folder can be a package.

Generally speaking, the program aggregates decision points into central places,
such as \texttt{run.go}, which then imports child packages that facilitate each
of the tasks of loading the configuration, connecting to the database and
running migrations, consolidating flag, environment variable and
configuration-based values into a canonical \emph{settings} \texttt{struct},
setting up web routes, authenticating requests, or handling \texttt{signals}
and performing graceful shutdowns.

\n{3}{Internal package}

The \texttt{internal} package was not used as of writing, but the author plans
to eventually migrate the \emph{internal} logic of the program into the
\texttt{internal} package to prevent accidental imports.

\n{2}{Logging}

The program uses \emph{dependency injection} to share a single logger instance
(the same technique is also used to share the database client). This logger is
then passed around as a pointer, so that the underlying data is shared and any
modification takes effect for all consumers at once. As a rule of thumb
throughout the application, every larger \texttt{struct} that needs to be
passed around is passed around as a pointer.

An experimental (note: not anymore, with \texttt{go1.21} it was brought into
Go's \textit{stdlib}) library for \textit{structured} logging, \texttt{slog},
was used to facilitate every logging need that the program might have. It
supports both JSON and plain-text logging, which was made configurable by the
program. Either a configuration file value or an environment variable can be
used to set this.

There are four log levels available by default (\texttt{DEBUG}, \texttt{INFO},
\texttt{WARNING}, \texttt{ERROR}) and the pertinent library functions are
parametric. The first parameter of type \texttt{string} is the main message,
which is supplied as a \emph{value} to the \emph{key} appropriately named
`\texttt{msg}', a feature of structured loggers which can later be used for
filtering. Any other parameters need to be supplied in pairs, serving as key
and value, respectively.

This main \texttt{slog} interface has been extended in package
\texttt{slogging} to also provide the formatting functionality of the
\texttt{fmt} standard library package. This was achieved by directly embedding
\texttt{slog.Logger} in a custom \texttt{struct} type named \texttt{Slogger}
and implementing the additional methods on the custom type. The new type that
embeds the original \texttt{slog.Logger} gets to keep its methods thanks to the
composition nature of Go. Thus, common formatting directives like the one seen
in Listing~\ref{goFmtExpression} are now supported with the custom logger, in
addition to anything the base \texttt{slog.Logger} offers.

\vspace{\parskip}
\begin{lstlisting}[language=Go, caption={Example formatting expression supplied
  to the logger}, label=goFmtExpression, basicstyle=\linespread{0.9}\small\ttfamily,
  backgroundcolor=\color{lstbg},
  otherkeywords={\%s, \%q, \%v},
]
slogger.Debugf("operation %q for user %q completed at %s", op, usr.ID, time.Now())
\end{lstlisting}

Furthermore, functionality was added to support changing the log level at
runtime, which is a convenient feature in certain situations.
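
The embedding approach and the runtime-adjustable log level described above can
be illustrated with a minimal sketch. It assumes Go 1.21 or newer, where
\texttt{slog} is available in the standard library as \texttt{log/slog}, and it
embeds a pointer to the logger, since that is what \texttt{slog.New} returns.
The constructor \texttt{newSlogger} and the use of \texttt{slog.LevelVar} for
switching levels are illustrative assumptions, not necessarily the way
\texttt{pcmt} implements this.

\vspace{\parskip}
\begin{lstlisting}[language=Go, caption={Sketch: extending slog with a formatting
  method and a runtime-adjustable level}, label=sketchSlogger,
  basicstyle=\linespread{0.9}\small\ttfamily,
  backgroundcolor=\color{lstbg}]
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"log/slog"
)

// Slogger embeds *slog.Logger, so all of its methods are promoted.
type Slogger struct {
	*slog.Logger
}

// Debugf formats the message fmt-style and logs it at the DEBUG level.
func (l *Slogger) Debugf(format string, args ...any) {
	// Skip the formatting work entirely when DEBUG is disabled.
	if !l.Enabled(context.Background(), slog.LevelDebug) {
		return
	}
	l.Debug(fmt.Sprintf(format, args...))
}

// newSlogger (hypothetical helper) builds a JSON or plain-text logger
// writing to w; the level can be changed later through the LevelVar.
func newSlogger(w io.Writer, json bool, level *slog.LevelVar) *Slogger {
	opts := &slog.HandlerOptions{Level: level}
	var h slog.Handler = slog.NewTextHandler(w, opts)
	if json {
		h = slog.NewJSONHandler(w, opts)
	}
	return &Slogger{Logger: slog.New(h)}
}

func main() {
	var buf bytes.Buffer
	level := new(slog.LevelVar) // the zero value corresponds to INFO
	l := newSlogger(&buf, true, level)

	l.Debugf("dropped: %d", 1) // not emitted, the level is still INFO
	level.Set(slog.LevelDebug) // raise the verbosity at runtime
	l.Debugf("operation %q for user %q completed", "search", "42")
	l.Info("promoted method", "key", "value") // inherited from slog.Logger

	fmt.Print(buf.String())
}
\end{lstlisting}

Because the handler only holds a reference to the \texttt{slog.LevelVar}, every
consumer sharing the logger pointer observes the new level immediately.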

\n{2}{Authentication}

The authentication logic is relatively simple and its core has mostly been
isolated into a custom \emph{middleware}. User passwords are hashed using a
secure KDF before ever being sent to the database. The KDF of choice is
\texttt{bcrypt} (with a sane \emph{Cost} of 10), which automatically includes
a \emph{salt} for the password and provides constant-time hash
comparisons. The author plans to add support for the more modern
\texttt{scrypt} and the state-of-the-art PHC (Password Hashing Competition)
winner algorithm \texttt{Argon2}
(\url{https://github.com/P-H-C/phc-winner-argon2}) for flexibility.

\n{2}{SQLi prevention}

No raw SQL queries are directly used to access the database, thus decreasing
the likelihood of SQL injection attacks. Instead, parametric queries are
constructed in code using a graph-like API of the \texttt{ent} library, which
is covered in depth in Section~\ref{sec:dbschema}.

\n{2}{Configurability}

Virtually any important value in the program has been made into a configuration
value, so that the operator can customise the experience as needed. An attempt
was made to choose sane configuration defaults, which resulted in the
configuration file essentially only needing to contain secrets, unless there is
a need to override the defaults. It is not entirely a \emph{zero-config}
situation, rather a \emph{minimal-config} one. An example can be seen in
Section~\ref{sec:configuration}.

Certain options deemed important enough (this was largely subjective) were
additionally made into command-line \emph{flags}, using the standard library
package \texttt{flag}. Users wishing to display all available options can
invoke the program with the \texttt{-help} flag, a courtesy of the mentioned
\texttt{flag} package.

\vspace*{-\baselineskip}

\paragraph{\texttt{-host <hostname/IP>} (string)}{Takes one argument and specifies
the hostname, or the address to listen on.}

\vspace*{-\baselineskip}

\paragraph{\texttt{-port <port number>} (int)}{This flag takes one integer
argument and specifies the port to listen on. The argument is validated at
program start-up and the program has a fallback built in for the case that
the supplied value is bogus, such as a string or a number outside the allowed
TCP port range 1--65535.}

\vspace*{-\baselineskip}

\paragraph{\texttt{-printMigration}}{A boolean option that, if set, makes the
program print any \textbf{upcoming} database migrations (based on the current
state of the database) and exit. The connection string environment variable
still needs to be set in order to be able to connect to the database and perform
the schema \emph{diff}. This option is mainly useful during debugging.}

\vspace*{-\baselineskip}

\paragraph{\texttt{-devel}}{This flag instructs the program to enter
\textit{devel mode}, in which all templates are re-parsed and re-executed upon
each request, and the default log verbosity is changed to level
\texttt{DEBUG}. Should not be used in production.}

\vspace*{-\baselineskip}

\paragraph{\texttt{-import <path/to/file>} (string)}{This option tells the program
to perform an import of local breach data into the program's main database.
Obviously, the database connection string environment variable also needs to
be present for this. The option takes one argument, which is the path to a file
formatted according to the \texttt{ImportSchema} (consult
Listing~\ref{breachImportSchema}). The program prints the result of the import
operation, indicating success or failure, and exits.}

\vspace*{-\baselineskip}

\paragraph{\texttt{-version}}{As could probably be inferred from its name, this
flag makes the program print its own version (that has been embedded into
the binary at build time) and exit. A release binary would print something
akin to a \emph{semantic versioning}-compliant git tag string, while a
development binary might simply print the truncated commit ID (consult
\texttt{Containerfile} and \texttt{justfile}) of the sources used to build it.}
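
How such flags could be declared and validated with the standard library
\texttt{flag} package is sketched below. The option struct, the helper names
\texttt{parseFlags} and \texttt{validPort}, the default host and the fallback
port 3000 are all hypothetical; only a subset of the flags listed above is
shown. Note that with an integer flag, a non-numeric value is already rejected
by the \texttt{flag} package during parsing, so the fallback in this sketch only
covers out-of-range numbers.

\vspace{\parskip}
\begin{lstlisting}[language=Go, caption={Sketch: declaring flags and validating
  the port with a fallback}, label=sketchFlags,
  basicstyle=\linespread{0.9}\small\ttfamily,
  backgroundcolor=\color{lstbg}]
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// defaultPort is a hypothetical fallback used for bogus port values.
const defaultPort = 3000

type options struct {
	host    string
	port    int
	devel   bool
	version bool
}

// validPort returns p if it lies in the TCP port range 1-65535,
// otherwise it falls back to defaultPort.
func validPort(p int) int {
	if p < 1 || p > 65535 {
		return defaultPort
	}
	return p
}

// parseFlags parses args into options; usage and error messages are
// written to errOut.
func parseFlags(args []string, errOut io.Writer) (options, error) {
	var o options
	fs := flag.NewFlagSet("pcmt", flag.ContinueOnError)
	fs.SetOutput(errOut)
	fs.StringVar(&o.host, "host", "localhost", "hostname or IP to listen on")
	fs.IntVar(&o.port, "port", defaultPort, "port to listen on")
	fs.BoolVar(&o.devel, "devel", false, "enter devel mode")
	fs.BoolVar(&o.version, "version", false, "print version and exit")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	o.port = validPort(o.port)
	return o, nil
}

func main() {
	o, err := parseFlags(os.Args[1:], os.Stderr)
	if err != nil {
		// Also reached for -help, after the usage text has been printed.
		os.Exit(2)
	}
	fmt.Printf("listening on %s:%d (devel: %t)\n", o.host, o.port, o.devel)
}
\end{lstlisting}

Using a dedicated \texttt{flag.FlagSet} instead of the package-level functions
keeps the parsing logic free of global state, which makes it straightforward to
exercise from tests.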

\n{2}{Embedded assets}

An important thing to mention is embedded assets and templates. Go has multiple
mechanisms to natively embed arbitrary files directly into the binary during
the regular build process. \texttt{embed.FS} from the standard library
\texttt{embed} package was used to bundle all template files and web assets,
such as images, logos and stylesheets at the module level. These are then
passed around the program as needed, such as to the \texttt{handlers} package.

There is also a toggle in the application configuration (\texttt{LiveMode}),
which instructs the program at start-up to either rely entirely on embedded
assets, or pull live template and asset files from the filesystem. The former
option makes the application more portable as it is wholly self-contained, while
the latter allows for flexibility and customisation not only during
development. Where the program looks for assets and templates in \emph{live
mode} is determined by two further configuration options: \texttt{assetsPath}
and \texttt{templatePath}.
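
The toggle between embedded and live files can be sketched using the common
\texttt{fs.FS} interface, which both \texttt{embed.FS} and a directory opened
with \texttt{os.DirFS} satisfy. This is a simplified illustration: the helper
\texttt{pickFS} is hypothetical, and an in-memory \texttt{fstest.MapFS} stands
in for a real \texttt{embed.FS}, because a \texttt{//go:embed} directive needs
actual files to be present at build time.

\vspace{\parskip}
\begin{lstlisting}[language=Go, caption={Sketch: choosing between embedded and
  live assets behind a common interface}, label=sketchAssetFS,
  basicstyle=\linespread{0.9}\small\ttfamily,
  backgroundcolor=\color{lstbg}]
package main

import (
	"fmt"
	"io/fs"
	"os"
	"testing/fstest"
)

// pickFS returns the directory tree at livePath when liveMode is set,
// and the embedded file system otherwise.
func pickFS(liveMode bool, embedded fs.FS, livePath string) fs.FS {
	if liveMode {
		return os.DirFS(livePath)
	}
	return embedded
}

func main() {
	// Stand-in for an embed.FS populated via a //go:embed directive.
	embedded := fstest.MapFS{
		"templates/home.tmpl": {Data: []byte("<h1>home</h1>")},
	}

	// With live mode off, files are served from the embedded tree.
	fsys := pickFS(false, embedded, ".")
	b, err := fs.ReadFile(fsys, "templates/home.tmpl")
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Printf("%s\n", b)
}
\end{lstlisting}

Consumers such as the template loader then only depend on \texttt{fs.FS} and do
not need to know which of the two sources is in use.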

\n{2}{Composability}

The core templating functionality was provided by the \texttt{html/template} Go
standard library package. Echo's \texttt{Renderer} interface has been
implemented, so that template rendering could be performed directly using
Echo's built-in facilities in a more ergonomic manner using \texttt{return
c.Render(http.StatusOK, "home.tmpl", data)}, where \texttt{data} stands for the
values handed to the template.

\vspace{\parskip}
\begin{lstlisting}[
  caption={Conditionally enabling functionality inside a Go template based on user access level},
  label=tmplConditionals, basicstyle=\linespread{0.9}\small\ttfamily,
  backgroundcolor=\color{lstbg},
  morekeywords={if,and,end},
]
{{ if and .User .User.IsLoggedIn .User.IsAdmin }}
...
{{ end }}
\end{lstlisting}

Templates used for rendering of the web pages were created in a composable
manner, split into smaller, reusable parts, such as \texttt{footer.tmpl} and
\texttt{head.tmpl}. Those could then be included e.g.\ using \texttt{\{\{
template "footer.tmpl" \}\}}. Specific functionality is conditionally
executed based on the determined level of access of the user, see
Listing~\ref{tmplConditionals} for reference.
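
A self-contained sketch of this composition, using only \texttt{html/template},
follows. The template contents, the \texttt{User} type and the helpers
\texttt{newTemplates} and \texttt{render} are made up for illustration; in
\texttt{pcmt} the templates are loaded from files and rendered through Echo.
The sketch assumes Go 1.18 or newer, where \texttt{and} stops evaluating its
arguments at the first false one.

\vspace{\parskip}
\begin{lstlisting}[language=Go, caption={Sketch: composing templates and
  gating markup on the user's access level}, label=sketchTemplates,
  basicstyle=\linespread{0.9}\small\ttfamily,
  backgroundcolor=\color{lstbg}]
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// User is a hypothetical view model handed to the templates.
type User struct {
	IsLoggedIn bool
	IsAdmin    bool
}

const footerTmpl = `<footer>pcmt</footer>`

const homeTmpl = `<main>{{ if and .User .User.IsLoggedIn .User.IsAdmin }}` +
	`<a href="/manage">Manage</a>{{ end }}</main>{{ template "footer.tmpl" }}`

// newTemplates parses both templates into one associated set, so that
// home.tmpl can include footer.tmpl by name.
func newTemplates() (*template.Template, error) {
	t, err := template.New("footer.tmpl").Parse(footerTmpl)
	if err != nil {
		return nil, err
	}
	if _, err := t.New("home.tmpl").Parse(homeTmpl); err != nil {
		return nil, err
	}
	return t, nil
}

// render executes the named template for the given user.
func render(t *template.Template, name string, u *User) (string, error) {
	var sb strings.Builder
	err := t.ExecuteTemplate(&sb, name, map[string]any{"User": u})
	return sb.String(), err
}

func main() {
	t, err := newTemplates()
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	users := []*User{{IsLoggedIn: true, IsAdmin: true}, {IsLoggedIn: true}}
	for _, u := range users {
		out, err := render(t, "home.tmpl", u)
		if err != nil {
			fmt.Println("render failed:", err)
			return
		}
		fmt.Println(out) // the link only appears for the administrator
	}
}
\end{lstlisting}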

A popular HTML sanitiser \texttt{bluemonday} has been employed to aid with
battling XSS. The program first runs every template through the sanitiser
before rendering it, so that any user-controlled inputs are handled safely.

A dynamic web application should include a CSP configuration. The program
therefore has the ability to calculate the hashes (SHA256/SHA384) of its assets
(scripts, images) on the fly and it is able to use them inside the templates.
This unlocks potentially using third-party assets without opening up CSP with
directives like \texttt{script-src 'unsafe-hashes'}. It also means that there
is no need to maintain a set of customised \texttt{head} templates with
pre-computed hashes next to script sources, since the application can perform
the necessary calculations in the user's stead.
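
Computing such hashes requires nothing beyond the standard library, as the
following sketch shows. The function names and the sample asset are
hypothetical; the output format (algorithm prefix followed by the
base64-encoded digest) is the one used by CSP hash sources and by the
\texttt{integrity} attribute of Subresource Integrity.

\vspace{\parskip}
\begin{lstlisting}[language=Go, caption={Sketch: computing SHA-256 and SHA-384
  hash sources for an asset}, label=sketchCSPHashes,
  basicstyle=\linespread{0.9}\small\ttfamily,
  backgroundcolor=\color{lstbg}]
package main

import (
	"crypto/sha256"
	"crypto/sha512"
	"encoding/base64"
	"fmt"
)

// sha256Source returns the "sha256-<base64 digest>" form of the asset.
func sha256Source(asset []byte) string {
	sum := sha256.Sum256(asset)
	return "sha256-" + base64.StdEncoding.EncodeToString(sum[:])
}

// sha384Source returns the "sha384-<base64 digest>" form of the asset.
func sha384Source(asset []byte) string {
	sum := sha512.Sum384(asset)
	return "sha384-" + base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	asset := []byte("body { margin: 0; }") // stand-in for a bundled file

	// Usable as an integrity attribute value...
	fmt.Printf("integrity=%q\n", sha384Source(asset))
	// ...or quoted as a hash source inside a CSP directive.
	fmt.Printf("script-src '%s'\n", sha256Source(asset))
}
\end{lstlisting}

Exposing such helpers to the templates (for instance through a
\texttt{template.FuncMap}) is one way the values could be made available at
render time.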

\n{2}{Server-side rendering}

The application constructs the web pages \emph{entirely} on the server side,
and it runs without a single line of JavaScript, of which the author is
especially proud. This improves load times, decreases the attack surface,
increases maintainability and reduces the cognitive load that comes with
JavaScript. Of course, that requires extensive usage of
non-semantic \texttt{POST} requests in web forms even for data \emph{updates}
(where HTTP \texttt{PUT}s should be used) and the accompanying frequent
full-page refreshes, but that still is not enough to warrant the use of
JavaScript.

\n{2}{Frontend}

Frontend-wise, Tailwind was used for CSS. It promotes the usage
of flexible \emph{utility-first} classes in the HTML markup instead of
separating out styles from content. Understandably, this is somewhat of a
preference issue and the author does not hold hard opinions in either
direction; however, it has to be noted that this approach empirically allows
for rather quick UI prototyping. Tailwind was chosen for having reasonably
detailed documentation and offering built-in support for dark/light mode, and
partially also because it \emph{looks} nice.

The Go templates containing the CSS classes need to be parsed by Tailwind in
order to produce the final stylesheet that can be bundled with the application.
The upstream provides an original CLI tool (\texttt{tailwindcss}), which can be
used exactly for that action. Simple and accessible layouts were preferred
overall; a page was split into multiple ones rather than being allowed to
become convoluted. Data-backed efforts were made to create reasonably
contrasting pages.

\n{3}{Frontend experiments}

As an aside, the author has briefly experimented with WebAssembly to provide
client-side dynamic functionality for this project, but has ultimately scrapped
it in favour of the entirely server-side rendered approach. It is possible that
it would get revisited in the future if necessary. Even from the short
experiments it was obvious how much faster WebAssembly was when compared to
JavaScript.

% \newpage
\n{2}{User isolation}

\obr{Application use case diagram}{fig:usecasediagram}{.9}{graphics/pcmt-use-case.pdf}

Users are only allowed into specific parts of the application based on the role
they currently possess (Role-based Access Control).

While this short list might get amended in the future, initially only two basic
roles were envisioned:

\begin{itemize}
  \item Administrator
  \item User
\end{itemize}

It is paramount that the program protects itself from insider threats as
well, and therefore each role is only able to perform actions that it is
explicitly assigned. While there definitely is a certain overlap between the
capabilities of the two outlined roles, each also possesses unique features
that the other one does not.

For instance, the administrator role is not able to perform breach data
searches directly; for that, a separate \emph{user} account has to be created.
Similarly, a regular user is not able to manage breach lists and other users,
because that is a privileged operation.

In-application administrators are not able to view (any) sensitive user data
and should therefore only be able to perform the following actions:

\begin{itemize}
  \item Create user accounts
  \item View user listing
  \item View user details
  \item Change user details, including administrative status
  \item Delete user accounts
  \item Refresh breach data from online sources
\end{itemize}

Let us consider a case when a user performs an operation on their own account.
While demoting from administrator to a regular user should be permitted,
promoting self to be an administrator would constitute a \emph{privilege
escalation} and likely be a precursor to at least a \emph{denial of service} of
sorts, as there would be nothing preventing the newly-\emph{admined} user from
disabling the accounts of all other administrators.
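
The deny-by-default nature of these rules can be sketched as a small
permission table. All type, constant and function names below are hypothetical
and merely mirror the roles and actions listed above; the actual checks in
\texttt{pcmt} may be structured differently.

\vspace{\parskip}
\begin{lstlisting}[language=Go, caption={Sketch: deny-by-default role checks
  that rule out self-promotion}, label=sketchRBAC,
  basicstyle=\linespread{0.9}\small\ttfamily,
  backgroundcolor=\color{lstbg}]
package main

import (
	"errors"
	"fmt"
)

// Role is the access level of an account.
type Role int

const (
	RoleUser Role = iota
	RoleAdmin
)

// Action names an operation guarded by the access check.
type Action string

const (
	SearchBreaches  Action = "search-breaches"
	CreateUser      Action = "create-user"
	ListUsers       Action = "list-users"
	ViewUser        Action = "view-user"
	UpdateUser      Action = "update-user"
	DeleteUser      Action = "delete-user"
	RefreshBreaches Action = "refresh-breaches"
)

// permissions lists what each role may do; anything absent is denied.
var permissions = map[Role]map[Action]bool{
	RoleUser: {SearchBreaches: true},
	RoleAdmin: {
		CreateUser: true, ListUsers: true, ViewUser: true,
		UpdateUser: true, DeleteUser: true, RefreshBreaches: true,
	},
}

var ErrForbidden = errors.New("action not permitted for this role")

// authorize returns nil only if the role was explicitly granted the action.
func authorize(r Role, a Action) error {
	if permissions[r][a] {
		return nil
	}
	return ErrForbidden
}

type Account struct {
	ID   int
	Role Role
}

// setRole changes target's role on behalf of actor. A regular user lacks
// UpdateUser and therefore cannot promote anyone, including themselves.
func setRole(actor Account, target *Account, newRole Role) error {
	if err := authorize(actor.Role, UpdateUser); err != nil {
		return err
	}
	target.Role = newRole
	return nil
}

func main() {
	alice := Account{ID: 1, Role: RoleAdmin}
	bob := Account{ID: 2, Role: RoleUser}

	fmt.Println(authorize(alice.Role, SearchBreaches)) // admins cannot search
	fmt.Println(setRole(bob, &bob, RoleAdmin))         // self-promotion denied
	fmt.Println(setRole(alice, &alice, RoleUser))      // self-demotion allowed
}
\end{lstlisting}

Keeping the table explicit means that a newly introduced action is unavailable
to every role until it is deliberately granted.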

\n{2}{Zero trust principle}

\textit{Confidentiality, i.e.\ not trusting the provider}

There is no way for the application (and consequently, the in-application
administrator) to read the user's data (such as saved search queries). This is
possible by virtue of encrypting the pertinent data before saving them in the
database using the state-of-the-art \texttt{age} tool (backed by
X25519)~\cite{age},~\cite{x25519rfc7748}. The \texttt{age} \emph{identity}
itself is in turn encrypted by a passphrase that only the user controls. Of
course, the user-supplied password is run through a password-based key
derivation function (\texttt{argon2}, version \emph{id} with the officially
recommended configuration parameters) before it is used to encrypt the
\emph{age} key.

The \texttt{age} identity is only generated once the user changes their
password for the first time. This is an attempt to prevent scenarios like the
in-application administrator with physical access to the database being able to
both \textbf{recover} the key from the database and \textbf{decrypt} it, given
that they already know the user password (because they set it when they created
the user), which would subsequently give them unbounded access to any future
encrypted data, as long as they would be able to maintain their database
access. This is why generating the \texttt{age} identity is bound to the first
password change.

Of course, the supposed evil administrator could simply perform the password
change themselves! However, the user would at least be able to find those
changes in the activity logs and know to \emph{not} use the application under
such circumstances. But given the scenario of a total database compromise, the
author finds that all hope is \emph{already} lost at that point. At least when
the database is dumped, it should only contain non-sensitive, functional
information in plain text; everything else should be encrypted.

Consequently, both the application operators and the in-application
administrators should ideally never be able to learn the details of what the
user is tracking/searching for, the same being by extension applicable even to
potential attackers with direct access to the database. Thus, the author
maintains that every scenario that could potentially lead to a data breach
(apart from a compromised actual user password) would have to entail some form
of operating memory acquisition on the machine hosting the application, for
instance using \texttt{LiME}~\cite{lime}, or perhaps directly on the
\emph{hypervisor}, if considering a virtualised (``cloud'') environment.

\n{1}{Implementation}
|
|
|
|
\n{2}{Dhall Configuration Schema}\label{sec:configuration}
|
|
|
|
The configuration schema was at first being developed as part of the main
|
|
project's repository, before it was determined that both the development and
|
|
overall clarity would benefit from the schema living in its own repository (see
|
|
Section~\ref{sec:repos} for details). This enabled the schema to be
|
|
independently developed and versioned, and only be pulled into the main
|
|
application whenever it was determined to be ready.
|
|
|
|
|
|
% \vspace{\parskip}
|
|
\smallskip
|
|
% \vspace{\baselineskip}
|
|
\begin{lstlisting}[language=Haskell, caption={Dhall configuration schema version 0.0.1-rc.2},
|
|
label=dhallschema, basicstyle=\linespread{0.9}\footnotesize\ttfamily,
|
|
backgroundcolor=\color{lstbg},
|
|
morekeywords={Text, Natural, Optional, Type}
|
|
]
|
|
let Schema =
|
|
{ Type =
|
|
{ Host : Text
|
|
, Port : Natural
|
|
, HTTP :
|
|
{ Domain : Text
|
|
, Secure : Bool
|
|
, AutoTLS : Bool
|
|
, TLSKeyPath : Text
|
|
, TLSCertKeyPath : Text
|
|
, HSTSMaxAge : Natural
|
|
, ContentSecurityPolicy : Text
|
|
, RateLimit : Natural
|
|
, Gzip : Natural
|
|
, Timeout : Natural
|
|
}
|
|
, Mailer :
|
|
{ Enabled : Bool
|
|
, Protocol : Text
|
|
, SMTPAddr : Text
|
|
, SMTPPort : Natural
|
|
, ForceTrustServerCert : Bool
|
|
, EnableHELO : Bool
|
|
, HELOHostname : Text
|
|
, Auth : Text
|
|
, From : Text
|
|
, User : Text
|
|
, Password : Text
|
|
, SubjectPrefix : Text
|
|
, SendPlainText : Bool
|
|
}
|
|
, LiveMode : Bool
|
|
, DevelMode : Bool
|
|
, AppPath : Text
|
|
, Session :
|
|
{ CookieName : Text
|
|
, CookieAuthSecret : Text
|
|
, CookieEncrSecret : Text
|
|
, MaxAge : Natural
|
|
}
|
|
, Logger : { JSON : Bool, Fmt : Optional Text }
|
|
, Init : { CreateAdmin : Bool, AdminPassword : Text }
|
|
, Registration : { Allowed : Bool }
|
|
}
|
|
}
|
|
\end{lstlisting}
|
|
\vspace*{-\baselineskip}
|
|
|
|
Full schema with type annotations can be seen in Listing~\ref{dhallschema}.
|
|
|
|
\newpage
|
|
|
|
The \texttt{let} statement declares a variable called \texttt{Schema} and
|
|
assigns to it the result of the expression on the right side of the equals
|
|
sign, which has for practical reasons been trimmed and is displayed without the
|
|
\emph{default} block. The default block is instead shown in its own
|
|
Listing~\ref{dhallschemadefaults}.
|
|
|
|
The main configuration comprises both raw attributes and child records, which
|
|
allow for grouping of related functionality. For instance, configuration
|
|
settings pertaining to the mail server setup are grouped in a record named
|
|
\textbf{Mailer}. Its attribute \textbf{Enabled} is annotated as \textbf{Bool},
|
|
which was deemed appropriate for an on-off switch-like functionality, with the
|
|
only permissible values being either \emph{True} or \emph{False}.
|
|
|
|
Do note that in Dhall $true \neq True$, since internally \textbf{True} is a
\texttt{Bool} constant built directly into Dhall (see ``The Prelude'' for
reference), while \textbf{true} is evaluated as an \emph{unbound} variable,
that is, a variable that is \emph{not} defined in the current
\emph{scope}~\cite{dhallprelude}.
|
|
|
|
Another one of Dhall's specialties is that `$==$' and `$!=$' (in)equality
|
|
operators \textbf{only} work on values of type \texttt{Bool}, which for example
|
|
means that variables of type \texttt{Natural} (\texttt{uint}) or \texttt{Text}
|
|
(\texttt{string}) cannot be compared directly as is the case in other
|
|
languages. That leaves the comparison work either to a higher-level language
(such as Go) or, as promoted by the Dhall authors for cases when the value
matters, to \emph{enums}, i.e.\ a custom \emph{named} type is derived from a
primitive type and \emph{that} is compared instead.
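
To illustrate the first option, the following is a minimal sketch of how such
comparisons could be carried out in Go once the configuration has been loaded.
The \texttt{Mailer} struct merely mirrors a few fields of the schema, and both
the \texttt{validateMailer} function and the set of accepted protocol names
are assumptions made for the sake of the example, not the program's actual
code.

\begin{lstlisting}[language=Go, caption={Sketch: comparing configuration values in Go},
	label=goConfigValidationSketch, basicstyle=\linespread{0.9}\footnotesize\ttfamily,
	backgroundcolor=\color{lstbg},
]
package main

import (
	"errors"
	"fmt"
)

// Mailer mirrors a few fields of the Mailer record from the schema.
type Mailer struct {
	Enabled  bool
	Protocol string
	SMTPPort uint
}

// validateMailer performs the Text/Natural comparisons that cannot be
// expressed using Dhall's == and != operators.
// NOTE: the list of accepted protocols is an assumption of this sketch.
func validateMailer(m Mailer) error {
	if !m.Enabled {
		return nil // nothing to check when the mailer is switched off
	}
	switch m.Protocol {
	case "smtp", "smtps", "starttls":
	default:
		return fmt.Errorf("unsupported mailer protocol: %q", m.Protocol)
	}
	if m.SMTPPort == 0 || m.SMTPPort > 65535 {
		return errors.New("SMTP port out of range")
	}
	return nil
}

func main() {
	fmt.Println(validateMailer(Mailer{Enabled: true, Protocol: "smtps", SMTPPort: 465}))
	fmt.Println(validateMailer(Mailer{Enabled: true, Protocol: "pigeon", SMTPPort: 465}))
}
\end{lstlisting}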
|
|
|
|
\newpage
|
|
% \vspace{\parskip}
|
|
\begin{lstlisting}[language=Haskell, caption={Dhall configuration defaults for
|
|
schema version 0.0.1-rc.2},
|
|
label=dhallschemadefaults, basicstyle=\linespread{0.9}\footnotesize\ttfamily,
|
|
backgroundcolor=\color{lstbg},
|
|
]
|
|
, default =
|
|
-- | have sane defaults.
|
|
{ Host = ""
|
|
, Port = 3000
|
|
, HTTP =
|
|
{ Domain = ""
|
|
, Secure = False
|
|
, AutoTLS = False
|
|
, TLSKeyPath = ""
|
|
, TLSCertKeyPath = ""
|
|
, HSTSMaxAge = 0
|
|
, ContentSecurityPolicy = ""
|
|
, RateLimit = 0
|
|
, Gzip = 0
|
|
, Timeout = 0
|
|
}
|
|
, Mailer =
|
|
{ Enabled = False
|
|
, Protocol = "smtps"
|
|
, SMTPAddr = ""
|
|
, SMTPPort = 465
|
|
, ForceTrustServerCert = False
|
|
, EnableHELO = False
|
|
, HELOHostname = ""
|
|
, Auth = ""
|
|
, From = ""
|
|
, User = ""
|
|
, Password = ""
|
|
, SubjectPrefix = "pcmt - "
|
|
, SendPlainText = True
|
|
}
|
|
, LiveMode =
|
|
-- | LiveMode controls whether the application looks for
|
|
-- | directories "assets" and "templates" on the filesystem or
|
|
-- | in its bundled Embed.FS.
|
|
False
|
|
, DevelMode = False
|
|
, AppPath =
|
|
-- | AppPath specifies where the program looks for "assets" and
|
|
-- | "templates" in case LiveMode is True.
|
|
"."
|
|
, Session =
|
|
{ CookieName = "pcmt_session"
|
|
, CookieAuthSecret = ""
|
|
, CookieEncrSecret = ""
|
|
, MaxAge = 3600
|
|
}
|
|
, Logger = { JSON = True, Fmt = None Text }
|
|
, Init =
|
|
{ CreateAdmin =
|
|
-- | if this is True, attempt to create a user with admin
|
|
-- | privileges with the password specified below
|
|
False
|
|
, AdminPassword =
|
|
-- | used for the first admin, forced change on first login.
|
|
"50ce50fd0e4f5894d74c4caecb450b00c594681d9397de98ffc0c76af5cff5953eb795f7"
|
|
}
|
|
, Registration.Allowed = True
|
|
}
|
|
}
|
|
|
|
in Schema
|
|
\end{lstlisting}
|
|
\vspace*{-\baselineskip}
|
|
\vspace*{-\baselineskip}
|
|
\vspace*{-\baselineskip}
|
|
\n{2}{Data integrity and authenticity}
|
|
|
|
The user can interact with the application via a web client, such as a browser,
|
|
and is required to authenticate for all sensitive operations. To not only know
|
|
\emph{who} the user is but also make sure they are \emph{permitted} to perform
|
|
the action they are attempting, the program employs an \emph{authorisation}
|
|
mechanism in the form of sessions. These are on the client side represented by
|
|
cryptographically signed and encrypted (using 256-bit AES) HTTP cookies. That
|
|
lays foundations for a few things: the data saved into the cookies can be
|
|
regarded as private because short of future \emph{quantum computers} only the
|
|
program itself can decrypt and access the data, and the data can be trusted
|
|
since it is both signed using the key that only the program controls and
|
|
\emph{encrypted} with \emph{another} key that equally only the program holds.
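
The principle of using two separate keys can be sketched with nothing but the
Go standard library. The example below is an illustration only and does not
claim to be the mechanism used by the program: it assumes AES-256 in CTR mode
for encryption and HMAC-SHA256 for the signature (an
\emph{encrypt-then-MAC} construction), and the function names \texttt{seal}
and \texttt{open} are hypothetical.

\begin{lstlisting}[language=Go, caption={Sketch: encrypting and signing a cookie value with two keys},
	label=goCookieSketch, basicstyle=\linespread{0.9}\footnotesize\ttfamily,
	backgroundcolor=\color{lstbg},
]
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

// seal encrypts value using AES-256 in CTR mode and appends an HMAC-SHA256
// tag computed over the IV and the ciphertext (encrypt-then-MAC).
func seal(encKey, authKey, value []byte) ([]byte, error) {
	block, err := aes.NewCipher(encKey) // a 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	out := make([]byte, aes.BlockSize+len(value))
	iv := out[:aes.BlockSize]
	if _, err := rand.Read(iv); err != nil {
		return nil, err
	}
	cipher.NewCTR(block, iv).XORKeyStream(out[aes.BlockSize:], value)
	mac := hmac.New(sha256.New, authKey)
	mac.Write(out)
	return mac.Sum(out), nil // IV || ciphertext || tag
}

// open verifies the tag first and only then decrypts the payload.
func open(encKey, authKey, sealed []byte) ([]byte, error) {
	if len(sealed) < aes.BlockSize+sha256.Size {
		return nil, errors.New("cookie too short")
	}
	split := len(sealed) - sha256.Size
	body, tag := sealed[:split], sealed[split:]
	mac := hmac.New(sha256.New, authKey)
	mac.Write(body)
	if !hmac.Equal(tag, mac.Sum(nil)) {
		return nil, errors.New("cookie signature mismatch")
	}
	block, err := aes.NewCipher(encKey)
	if err != nil {
		return nil, err
	}
	plain := make([]byte, len(body)-aes.BlockSize)
	cipher.NewCTR(block, body[:aes.BlockSize]).XORKeyStream(plain, body[aes.BlockSize:])
	return plain, nil
}

func main() {
	encKey, authKey := make([]byte, 32), make([]byte, 32)
	if _, err := rand.Read(encKey); err != nil {
		panic(err)
	}
	if _, err := rand.Read(authKey); err != nil {
		panic(err)
	}
	sealed, err := seal(encKey, authKey, []byte("user=dude"))
	if err != nil {
		panic(err)
	}
	plain, err := open(encKey, authKey, sealed)
	fmt.Println(string(plain), err)

	sealed[len(sealed)-1] ^= 1 // simulate tampering on the client side
	_, err = open(encKey, authKey, sealed)
	fmt.Println(err)
}
\end{lstlisting}

Verifying the signature before decrypting means that a cookie altered on the
client side is rejected without the ciphertext ever being processed.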
|
|
|
|
The cookie data is only ever written \emph{or} read at the server side,
|
|
solidifying the author's decision to let it be encrypted, as there is no point
|
|
in not encrypting it for some perceived client-side simplification. Users
|
|
navigating the website send their session cookie (if it exists) with
|
|
\textbf{every request} to the server, which subsequently verifies the integrity
|
|
of the data and in case it is valid, determines the existence and potential
|
|
amount of user privilege that should be granted. Public endpoints do not
|
|
mandate the presence of a valid session by definition, while at protected
|
|
endpoints the user is authenticated at every request. When a session expires or
|
|
if there is no session to begin with, the user is shown a \emph{Not found}
error message or an \emph{Unauthorised} error message, or is redirected to
\texttt{/signin}. This behaviour is not uniform and depends on the endpoint
and/or the resource requested.
|
|
|
|
Another aspect that contributes to data integrity from \emph{another} point of
|
|
view is utilising database \emph{transactions} for bundling together multiple
|
|
database operations that collectively change the \emph{state}. Using the
|
|
transactional jargon, the data is only \emph{committed} if each individual
|
|
change was successful. In case of any errors, the database is instructed to
|
|
perform an atomic \emph{rollback}, which brings it back to a state before the
|
|
changes were ever attempted.
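
The commit-or-rollback pattern can be sketched in Go as follows. The
\texttt{Tx} interface only lists the two methods the sketch relies on (both
\texttt{*sql.Tx} from the standard library and the transaction type generated
by \texttt{ent} provide them), while the \texttt{withTx} helper and the
\texttt{fakeTx} stand-in are hypothetical and serve demonstration purposes
only.

\begin{lstlisting}[language=Go, caption={Sketch: committing only when every change succeeded},
	label=goTxSketch, basicstyle=\linespread{0.9}\footnotesize\ttfamily,
	backgroundcolor=\color{lstbg},
]
package main

import (
	"errors"
	"fmt"
)

// Tx is the minimal transaction surface this sketch relies on.
type Tx interface {
	Commit() error
	Rollback() error
}

// withTx runs fn and commits only if fn succeeded; otherwise it rolls back.
func withTx(tx Tx, fn func() error) error {
	if err := fn(); err != nil {
		if rerr := tx.Rollback(); rerr != nil {
			return fmt.Errorf("%w (rollback also failed: %v)", err, rerr)
		}
		return err
	}
	return tx.Commit()
}

// fakeTx is a stand-in for a real transaction; it records the outcome.
type fakeTx struct{ state string }

func (f *fakeTx) Commit() error   { f.state = "committed"; return nil }
func (f *fakeTx) Rollback() error { f.state = "rolled back"; return nil }

func main() {
	ok := &fakeTx{}
	_ = withTx(ok, func() error { return nil })
	fmt.Println(ok.state)

	bad := &fakeTx{}
	err := withTx(bad, func() error { return errors.New("insert failed") })
	fmt.Println(bad.state, err)
}
\end{lstlisting}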
|
|
|
|
The author has additionally considered the thought of utilising an embedded
|
|
immutable database like immudb (\url{https://immudb.io}) for record keeping
|
|
(verifiably storing data change history) and additional data integrity checks,
|
|
e.g.\ for tamper protection purposes and similar; however, that work remains
|
|
yet to be materialised.
|
|
|
|
|
|
\n{2}{Database schema}\label{sec:dbschema}
|
|
|
|
The database schema is not being created by manually typing out SQL statements.
|
|
Instead, an Object-relational Mapping (ORM) tool named \texttt{ent} is used,
|
|
which allows defining the table schema and relations entirely in Go. The upside
|
|
of this approach is that the \emph{entity} types are natively understood by
|
|
code editors, and they also get type-checked by the compiler for correctness,
|
|
preventing all sorts of headaches and potential bugs.
|
|
|
|
Since \texttt{ent} encourages the usage of \emph{declarative migrations} at
|
|
early stages of the project, it is not required for the database schema to
|
|
exist on application start-up in form of raw SQL (or HCL). Instead,
|
|
\texttt{ent} only requires a valid connection string providing reasonably
|
|
privileged access to the database and it handles the database configuration by
|
|
auto-generating SQL with the help of the companion embedded library
|
|
\texttt{Atlas} (\url{https://atlasgo.io/}). The upstream project (\texttt{ent})
|
|
encourages moving to otherwise more traditional \emph{versioned migrations} for
|
|
more mature projects, so that is on the roadmap for later.
|
|
|
|
The best part about using \texttt{ent} is that there is no need to define
|
|
supplemental methods on the models, as with \texttt{ent} these are meant to be
|
|
\emph{code generated} (in the older sense of word, not with Large Language
|
|
Models) into existence. Code generation creates files with actual Go models
|
|
based on the types of the attributes in the database schema model, and the
|
|
respective relations are transformed into methods on the receiver or functions
|
|
taking object attributes as arguments.
|
|
|
|
For instance, if the model's attribute is a string value \texttt{Email},
\texttt{ent} can be used to generate code that contains predicate functions for
the user entity like the following:
|
|
|
|
\begin{itemize}
|
|
\item \texttt{EmailIn(emails ...string)}
|
|
\item \texttt{EmailEQ(email string)}
|
|
\item \texttt{EmailNEQ(email string)}
|
|
\item \texttt{EmailHasSuffix(suffix string)}
|
|
\end{itemize}
|
|
|
|
These functions can further be imported into other packages, which makes
working with the database a breeze.
|
|
|
|
All the database \emph{entity} IDs were declared as type \texttt{UUID}
|
|
(\emph{universally unique ID, theoretically across space and time}), contrary
|
|
to the more traditional \emph{integer} IDs.
|
|
|
|
Support for \texttt{UUID}s was provided natively by the supported databases and
|
|
in Go via a popular and vetted open-source library
|
|
(\url{github.com/google/uuid}). Among the upsides of using \texttt{UUID}s over
|
|
integer IDs is that there is no need to manually increment the ID. But more
|
|
importantly, there is also the fact that compared to 32-bit\footnotemark{}
|
|
signed integers, the \texttt{UUID} is a somewhat randomly generated 16-byte
(128-bit) array, reducing the chances of a collision.
|
|
|
|
Barring higher chances of preventing conflicts during imports of foreign
|
|
databases, this design decision might not provide any advantage for the current
|
|
system \emph{at the moment}. It could, however, hold importance in the future,
|
|
should the database ever be deployed in a replicated, high-availability (HA)
|
|
manner with more than one concurrent \emph{writer} (replicated application
|
|
instances).
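
To make the size and structure of such an identifier more tangible, the
following sketch generates a random (version 4) \texttt{UUID} using only the
Go standard library. It is an illustration and not the code used by the
program, which relies on the library mentioned above; the function name
\texttt{newUUIDv4} is hypothetical.

\begin{lstlisting}[language=Go, caption={Sketch: generating a random 128-bit UUID},
	label=goUUIDSketch, basicstyle=\linespread{0.9}\footnotesize\ttfamily,
	backgroundcolor=\color{lstbg},
]
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUIDv4 returns a random (version 4) UUID in its canonical text form.
func newUUIDv4() (string, error) {
	var u [16]byte // 16 bytes, i.e. 128 bits in total
	if _, err := rand.Read(u[:]); err != nil {
		return "", err
	}
	u[6] = (u[6] & 0x0f) | 0x40 // set the version field to 4
	u[8] = (u[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		u[0:4], u[4:6], u[6:8], u[8:10], u[10:16]), nil
}

func main() {
	id, err := newUUIDv4()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
\end{lstlisting}

Six of the 128 bits are fixed by the version and variant fields, which leaves
122 bits of randomness and hints at why the identifier is only
\emph{somewhat} random.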
|
|
|
|
\footnotetext{In Go, integer size is architecture dependent, see
|
|
\url{https://go.dev/ref/spec#Numeric_types}.}
|
|
|
|
The relations between entities as modelled with \texttt{ent} can be imagined as
|
|
the edges connecting the nodes of a directed \emph{graph}, with the nodes
|
|
representing the entities. This conceptualisation lends itself to a more
|
|
human-friendly querying language, where the directionality can be expressed
|
|
with words describing ownership, like so:
|
|
|
|
\vspace{\parskip}
|
|
\begin{lstlisting}[caption={Ent graph query},
|
|
label=entQuery,
|
|
backgroundcolor=\color{lstbg},
|
|
language=Go,
|
|
]
|
|
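// client is an *ent.Client; HasLocalBreaches stands in for a generated
// edge predicate, the exact name of which depends on the schema.
one, err := client.User.Query().
	Where(
		user.HasLocalBreaches(),
	).
	Only(ctx)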
|
|
\end{lstlisting}
|
|
|
|
|
|
|
|
\n{1}{Deployment}
|
|
|
|
A deployment setup as suggested in Section~\ref{sec:deploymentRecommendations}
|
|
is already \emph{partially} covered by the multi-stage \texttt{Containerfile}
|
|
that is available in the main sources. Once built, the resulting container
|
|
image only contains a handful of things it absolutely needs:
|
|
|
|
\begin{itemize}
|
|
\item a self-contained statically linked copy of the program
|
|
\item a default configuration file and corresponding Dhall expressions cached
|
|
at build time
|
|
\item a recent CA certs bundle
|
|
\end{itemize}
|
|
|
|
Since the program also needs a database for proper functioning, an example
|
|
scenario includes the application container being run in a Podman \textbf{pod}
|
|
(as in a pea pod or pod of whales) together with the database. That results in
|
|
not having to expose the database to the entire host or out of the pod at all;
it is only available over the pod's \texttt{localhost}. Hopefully it goes without
|
|
saying that the default values of any configuration secrets should be
|
|
substituted by the application operator with new, securely generated ones
|
|
(read: using \texttt{openssl rand} or \texttt{pwgen}).
|
|
|
|
|
|
\n{2}{Rootless Podman}
|
|
|
|
Assuming rootless Podman is set up and the \texttt{just} tool is installed on the
|
|
host, the application could be deployed by following a series of relatively
|
|
simple steps:
|
|
|
|
\begin{itemize}
|
|
\item build (or pull) the application container image
|
|
\item create a pod with user namespacing, exposing the application port
|
|
\item run the database container inside the pod
|
|
\item run the application inside the pod
|
|
\end{itemize}
|
|
|
|
In concrete terms, it would resemble something along the lines of
|
|
Listing~\ref{podmanDeployment}. Do note that all the commands are executed
|
|
under the unprivileged \texttt{user@containerHost} that is running rootless
|
|
Podman, i.e.\ it has \texttt{UID}/\texttt{GID} mapping entries in
|
|
\texttt{/etc/subuid} and \texttt{/etc/subgid} files \textbf{prior} to running any
|
|
Podman commands.
|
|
|
|
% \newpage
|
|
\begin{lstlisting}[language=bash, caption={Example application deployment using
|
|
rootless Podman},
|
|
label=podmanDeployment, basicstyle=\linespread{0.9}\small\ttfamily,
|
|
backgroundcolor=\color{lstbg}, commentstyle=\color{gray},
|
|
morekeywords={mkdir,podman,just},
|
|
]
|
|
# From inside the project folder, build the image locally using kaniko.
|
|
just kaniko
|
|
|
|
# Create a pod, limit the amount of memory/CPU available to its containers.
|
|
podman pod create --replace --name pcmt \
|
|
--memory=100m --cpus=2 \
|
|
--userns=keep-id -p5005:3000
|
|
|
|
# Create the database folder and run the database in the pod.
|
|
mkdir -pv ./tmp/db
|
|
podman run --pod pcmt --replace -d --name "pcmt-pg" --rm \
|
|
-e POSTGRES_INITDB_ARGS="--auth-host=scram-sha-256 \
|
|
--auth-local=scram-sha-256" \
|
|
-e POSTGRES_PASSWORD=postgres \
|
|
-v $PWD/tmp/db:/var/lib/postgresql/data:Z \
|
|
--health-cmd "sh -c 'pg_isready -U postgres -d postgres'" \
|
|
--health-on-failure kill \
|
|
--health-retries 3 \
|
|
--health-interval 10s \
|
|
--health-timeout 1s \
|
|
--health-start-period=5s \
|
|
docker.io/library/postgres:15.2-alpine3.17
|
|
|
|
# Run the application itself in the pod.
|
|
podman run --pod pcmt --replace --name pcmt-og -d --rm \
|
|
-e PCMT_LIVE=False \
|
|
-e PCMT_DBTYPE="postgres" \
|
|
-e PCMT_CONNSTRING="host=pcmt-pg port=5432 sslmode=disable \
|
|
user=postgres dbname=postgres password=postgres" \
|
|
-v $PWD/config.dhall:/config.dhall:Z,ro \
|
|
docker.io/immawanderer/mt-pcmt:testbuild -config /config.dhall
|
|
\end{lstlisting}
|
|
% \vspace*{-\baselineskip}
|
|
|
|
To summarise Listing~\ref{podmanDeployment}, first the application container is
|
|
built from inside the project folder using \texttt{kaniko}. The container
|
|
image could alternatively be pulled from the container repository, but it makes
|
|
more sense showing the image being built from sources with the listing
|
|
depicting a \texttt{:testbuild} tag being used.
|
|
|
|
Next, a \emph{pod} is created and given a name, setting the port binding for
|
|
the application. Then, the database container is started inside the pod,
|
|
configured with a healthchecking mechanism.
|
|
|
|
As a final step, the application container itself is run inside the pod. The
|
|
application configuration named \texttt{config.dhall} located in \texttt{\$PWD}
|
|
is mounted as a volume into container's \texttt{/config.dhall}, providing the
|
|
application with its configuration. The container image does contain a
default configuration for reference; however, running the container without
additionally providing the necessary secrets would fail.
|
|
|
|
\n{3}{Sanity checks}
|
|
|
|
Also do note that the application connects to the database using its
|
|
\emph{container} name, i.e.\ not the IP address. This is possible thanks to
|
|
Podman setting up DNS resolution inside pods using default networks in such a
|
|
way that all containers in the pod can reach each other using their (container)
|
|
names.
|
|
|
|
Interestingly, connecting via \texttt{localhost} from containers inside the pod
|
|
would also work. Inside the pod, any container in the pod can reach any other
|
|
container in the same pod via \emph{pod's} own \texttt{localhost}, thanks to a
|
|
shared network namespace~\cite{podmanNet}.
|
|
|
|
In fact, \emph{pinging} (sending ICMP packets using the \texttt{ping} command)
|
|
the database and application containers from an ad-hoc Alpine Linux container
|
|
that just joined the pod temporarily yields:
|
|
|
|
\vspace{\parskip}
|
|
\begin{lstlisting}[language=bash, caption={Pinging pod containers using their
|
|
names}, label=podmanPing, basicstyle=\linespread{0.9}\small\ttfamily,
|
|
backgroundcolor=\color{lstbg},
|
|
morekeywords={podman,ping}
|
|
]
|
|
user@containerHost % podman run --rm -it \
|
|
--user=0 \
|
|
--pod=pcmt \
|
|
docker.io/library/alpine:3.18
|
|
/ % ping -c2 pcmt-og
|
|
PING pcmt-og (127.0.0.1): 56 data bytes
|
|
64 bytes from 127.0.0.1: seq=0 ttl=42 time=0.072 ms
|
|
64 bytes from 127.0.0.1: seq=1 ttl=42 time=0.118 ms
|
|
|
|
--- pcmt-og ping statistics ---
|
|
2 packets transmitted, 2 packets received, 0% packet loss
|
|
round-trip min/avg/max = 0.072/0.095/0.118 ms
|
|
/ % ping -c2 pcmt-pg
|
|
PING pcmt-pg (127.0.0.1): 56 data bytes
|
|
64 bytes from 127.0.0.1: seq=0 ttl=42 time=0.045 ms
|
|
64 bytes from 127.0.0.1: seq=1 ttl=42 time=0.077 ms
|
|
|
|
--- pcmt-pg ping statistics ---
|
|
2 packets transmitted, 2 packets received, 0% packet loss
|
|
round-trip min/avg/max = 0.045/0.061/0.077 ms
|
|
/ %
|
|
\end{lstlisting}
|
|
|
|
Were the application deployed in a traditional manner instead of using Podman,
the use of FQDNs or IP addresses would probably be necessary, as there would be
no magic resolution of container names happening transparently in the
background.
|
|
|
|
\n{3}{Database isolation from the host}
|
|
|
|
A keen observer has undoubtedly noticed that the pod constructed in
|
|
Listing~\ref{podmanDeployment} only created a binding for the port used by
the application (\texttt{5005/tcp} on the host). The Postgres default port
|
|
\texttt{5432/tcp} is not among pod's port bindings, as can be seen in the pod
|
|
creation command in the said listing. This can also easily be verified using
|
|
the command in Listing~\ref{podmanPortBindings}:
|
|
|
|
\begin{lstlisting}[language=bash, caption={Podman pod port binding inspection},
|
|
label=podmanPortBindings, basicstyle=\linespread{0.9}\small\ttfamily,
|
|
backgroundcolor=\color{lstbg},
|
|
morekeywords={podman},
|
|
]
|
|
user@containerHost % podman pod inspect pcmt \
|
|
--format="Port bindings: {{.InfraConfig.PortBindings}}\n\
|
|
Host network: {{.InfraConfig.HostNetwork}}"
|
|
Port bindings: map[3000/tcp:[{ 5005}]]
|
|
Host network: false
|
|
\end{lstlisting}
|
|
\vspace*{-\baselineskip}
|
|
|
|
To be absolutely sure that the database is available only internally in the pod
|
|
(unless, of course, there is another process listening on the subject port),
|
|
and that connecting to the database from outside the pod (i.e.\ from the
|
|
container host) really \emph{does} fail, the following commands can be issued:
|
|
|
|
\begin{lstlisting}[language=bash, caption={In-pod database is unreachable from
|
|
the host}, breaklines=true, label=podDbUnreachable,
|
|
basicstyle=\linespread{0.9}\small\ttfamily,
|
|
backgroundcolor=\color{lstbg},
|
|
]
|
|
user@containerHost % curl localhost:5432
|
|
--> curl: (7) Failed to connect to localhost port 5432 after 0 ms: Couldn't connect to server
|
|
\end{lstlisting}
|
|
\vspace*{-\baselineskip}
|
|
|
|
The error in Listing~\ref{podDbUnreachable} is indeed expected, as it is the
|
|
result of the database port not being exposed from the pod.
|
|
|
|
Of course, since a volume (essentially a bind mount) from the host is used, the
|
|
actual data is still accessible on the host, both to privileged users and the
|
|
user running the pod. On the host with SELinux support, the \texttt{:Z} volume
|
|
addendum at least ensures that the content of the volume is directly
|
|
inaccessible to other containers, including the application container running
|
|
inside the same pod, via SELinux labelling.
|
|
|
|
\n{3}{Health checks}
|
|
|
|
Running the containers with health checks can be counted among the few crucial
settings. That way the container runtime can periodically \emph{check} that the
application running inside the container is behaving correctly, and it can be
instructed on what action to take should the health of the application be
evaluated as unsatisfactory. Furthermore, different sets of health checking
commands can be passed to Podman for start-up and runtime.
|
|
|
|
|
|
\n{2}{Reverse proxy configuration}
|
|
|
|
If the application is deployed behind a reverse proxy, such as NGINX, the
|
|
configuration snippet in Listing~\ref{nginxSnip} might apply. Do note how the
|
|
named upstream server \texttt{pcmt} references the port that was exposed from
|
|
the pod created in Listing~\ref{podmanDeployment}.
|
|
|
|
\begin{lstlisting}[caption={Example reverse proxy configuration snippet},
|
|
breaklines=true, label=nginxSnip, basicstyle=\linespread{0.9}\scriptsize\ttfamily,
|
|
backgroundcolor=\color{lstbg},
|
|
morekeywords={upstream,server,return,listen,server_name,add_header,access_log,error_log,location,proxy_pass,proxy_set_header,allow,include,more_set_headers,ssl_buffer_size,ssl_dhparam,ssl_certificate,ssl_certificate_key,http2},
|
|
]
|
|
upstream pcmt {
|
|
server 127.0.0.1:5005;
|
|
}
|
|
server {
|
|
return 301 https://<pcmt domain>$request_uri;
|
|
listen 80;
|
|
listen [::]:80;
|
|
server_name <pcmt domain> www.<pcmt domain>;
|
|
return 404;
|
|
add_header Referrer-Policy "no-referrer, origin-when-cross-origin";
|
|
}
|
|
server {
|
|
server_name <pcmt domain>;
|
|
access_log /var/log/nginx/<pcmt domain>.access.log;
|
|
error_log /var/log/nginx/<pcmt domain>.error.log;
|
|
location / {
|
|
proxy_pass http://pcmt;
|
|
proxy_set_header X-Forwarded-Host $host;
|
|
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
|
}
|
|
location /robots.txt {
|
|
allow all;
|
|
add_header Content-Type "text/plain; charset=utf-8";
|
|
add_header X-Robots-Tag "all, noarchive, notranslate";
|
|
return 200 "User-agent: *\nDisallow: /";
|
|
}
|
|
include sec-headers.conf;
|
|
|
|
add_header X-Real-IP $remote_addr;
|
|
add_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
|
add_header X-Forwarded-Proto $scheme;
|
|
more_set_headers 'Early-Data: $ssl_early_data';
|
|
|
|
listen [::]:443 ssl http2;
|
|
listen 443 ssl http2;
|
|
ssl_certificate /etc/letsencrypt/live/<pcmt domain>/fullchain.pem;
|
|
ssl_certificate_key /etc/letsencrypt/live/<pcmt domain>/privkey.pem;
|
|
include /etc/letsencrypt/options-ssl-nginx.conf;
|
|
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
|
|
|
|
# reduce TTFB
|
|
ssl_buffer_size 4k;
|
|
}
|
|
\end{lstlisting}
|
|
\vspace*{-\baselineskip}
|
|
|
|
The snippet describes how traffic arriving at port \texttt{80/tcp} (IPv4 or
|
|
IPv6) that matches the domain name(s) \texttt{\{www.,\}<pcmt domain>}
|
|
(\texttt{<pcmt domain>} being the domain name that the program was configured
|
|
with, including appropriate DNS records) gets 301-redirected to the same
|
|
location (\texttt{\$request\_uri}), only over \texttt{HTTPS}. If the server
|
|
name does not match, a 404 is returned instead. In the main location block, all
|
|
traffic except for \texttt{/robots.txt} is forwarded to the named backend, with
|
|
headers added on top by the proxy in order to label the incoming requests as
|
|
\emph{not} originating at the proxy. The \emph{robots} route is treated
|
|
specially, immediately returning a directive that disallows crawling of any
|
|
resource on the site by any crawler. The proxy is also instructed to log access
and error events to specific log files, load the domain's TLS certificates
(obtained out of band), reduce the \texttt{ssl\_buffer\_size} and, finally,
listen on port \texttt{443/tcp} (dual stack).
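
On the application side, the forwarded headers can be consumed roughly as
sketched below. This is a hedged illustration and not the program's actual
code: the \texttt{clientIP} helper and the notion of a single trusted proxy
address are assumptions, and the address under which the proxy appears to the
application depends on the network setup of the pod.

\begin{lstlisting}[language=Go, caption={Sketch: reading the client address set by the reverse proxy},
	label=goForwardedForSketch, basicstyle=\linespread{0.9}\footnotesize\ttfamily,
	backgroundcolor=\color{lstbg},
]
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"strings"
)

// clientIP returns the last X-Forwarded-For entry (the one appended by the
// proxy) if the request arrived from the trusted proxy, and the address of
// the direct peer otherwise.
func clientIP(r *http.Request, trustedProxy string) string {
	peer, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		peer = r.RemoteAddr
	}
	if peer != trustedProxy {
		return peer // never trust forwarding headers sent by other peers
	}
	if xff := r.Header.Get("X-Forwarded-For"); xff != "" {
		parts := strings.Split(xff, ",")
		return strings.TrimSpace(parts[len(parts)-1])
	}
	return peer
}

func main() {
	r := httptest.NewRequest(http.MethodGet, "/", nil)
	r.RemoteAddr = "127.0.0.1:40000" // assumed address of the proxy
	r.Header.Set("X-Forwarded-For", "203.0.113.7")
	fmt.Println(clientIP(r, "127.0.0.1"))
}
\end{lstlisting}

Taking the last entry rather than the first one is a deliberate choice in this
sketch, as any preceding entries may have been supplied by the client itself.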
|
|
|
|
|
|
\n{1}{Validation}
|
|
|
|
\n{2}{Unit tests}
|
|
|
|
Unit testing is a hot topic for many people and the author does not count
|
|
himself to be a staunch supporter of either extreme. The ``no unit tests''
|
|
opinion seems to discount any benefit there is to unit testing, while a
|
|
``TDD-only''\footnotemark{} approach can be a little too much for some people's
|
|
taste. The author tends to prefer a \emph{middle ground} approach in this
particular case, i.e.\ writing enough tests where meaningful, but not
necessarily testing everything or writing tests prior to business logic code.
Arguably, following the practice of TDD should result in writing \emph{better
designed} code, particularly because the shape and function of the code need to
be thought about in advance, as the code is tested for before it is even
written, but it adds a slight inconvenience to what is otherwise a
straightforward process.
|
|
|
|
Thanks to Go's built-in support for testing via its \texttt{testing} package
|
|
and the tooling in the \texttt{go} tool, writing tests is relatively simple. Go
|
|
looks for files in the form \texttt{<filename>\_test.go} in the present working
|
|
directory but can be instructed to look for test files in packages recursively
|
|
found on any path using the ellipsis, like so: \texttt{go test
|
|
./path/to/package/\ldots}, which then \emph{runs} all the tests found, and
|
|
reports some statistics, such as the time it took to run the test or whether it
|
|
succeeded or failed. To be precise, the test files also need to contain test
|
|
functions, which are functions with the signature \texttt{func TestWhatever(t
|
|
*testing.T)\{\}} and where the function prefix ``Test'' is just as important as
|
|
the signature. Without it, the function is not considered to be a testing
|
|
function despite having the required signature and is therefore \emph{not}
|
|
executed during testing.
|
|
|
|
This test lookup behaviour, however, also has a neat side effect: all the test
|
|
files can be kept side by side with their regular source counterparts; there is
no need to segregate them into a specially blessed \texttt{tests} folder or
similar, which in the author's opinion improves readability. As a failsafe, in
case no actual tests are found, the current behaviour of the tool is to print a
note informing the developer that no tests were found, which is handy to learn
if it was not intended/expected. When compiling regular source code, the Go files
|
|
with \texttt{\_test} in the name are simply ignored by the build tool.
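
A minimal, self-contained test file following these conventions could look
like the sketch below. The function under test,
\texttt{normaliseUsername}, is hypothetical and only exists to give the test
something to verify; normally it would live in a regular source file next to
the test. Saved under a name ending in \texttt{\_test.go}, the file can be run
with \texttt{go test}.

\begin{lstlisting}[language=Go, caption={Sketch: a table-driven unit test},
	label=goUnitTestSketch, basicstyle=\linespread{0.9}\footnotesize\ttfamily,
	backgroundcolor=\color{lstbg},
]
package main

import (
	"strings"
	"testing"
)

// normaliseUsername is a hypothetical function under test.
func normaliseUsername(s string) string {
	return strings.ToLower(strings.TrimSpace(s))
}

// TestNormaliseUsername is discovered by "go test" thanks to the "Test"
// prefix and the *testing.T parameter.
func TestNormaliseUsername(t *testing.T) {
	cases := []struct{ in, want string }{
		{"dude", "dude"},
		{"  Dude ", "dude"},
		{"", ""},
	}
	for _, c := range cases {
		if got := normaliseUsername(c.in); got != c.want {
			t.Errorf("normaliseUsername(%q) = %q, want %q", c.in, got, c.want)
		}
	}
}
\end{lstlisting}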
|
|
|
|
\footnotetext{TDD, or Test Driven Development, is a development methodology
|
|
whereby tests are written \emph{first}, then a complementary piece of code
|
|
that is supposed to be tested is added, just enough to get past the compile
|
|
  errors and to see the test \emph{fail}, and only then is the code finally
  changed to make the test \emph{pass}. The code can then be fearlessly
|
|
extended because the test is the safety net catching the programmer when the
|
|
mind slips and alters the originally intended behaviour of the code.}
|
|
|
|
|
|
\n{2}{Integration tests}
|
|
|
|
Integration with external software, namely the database in the case of this
program, is tested using the same mechanism that was mentioned in the
previous section: Go's \texttt{testing} package. These tests verify that the
|
|
code changes can still perform the same actions with the external software that
|
|
were possible before the change and are run before every commit locally and
|
|
then after pushing to remote in the CI.
|
|
|
|
\n{3}{func TestUserExists(t *testing.T)}
|
|
|
|
The integration test shown in Listing~\ref{integrationtest} is prefaced
at line 13 by the declaration of a helper function \texttt{getCtx() context.Context},
|
|
which takes no arguments and returns a new \texttt{context.Context} initialised
|
|
with the value of the global logger. As previously mentioned, that is how the
|
|
logger gets injected into the user module functions. The actual test function
|
|
with the signature \texttt{TestUserExists(t *testing.T)} defines a database
|
|
connection string at line 20 and attempts to open a connection to the database
at line 21.
|
|
The database in use here is SQLite3 running in memory mode, meaning no file is
|
|
actually written to disk during this process. Since the testing data is not
|
|
needed after the test, this is desirable. Next, a defer statement calls the
|
|
\texttt{Close()} method on the database object, which is the Go idiomatic way
|
|
of closing files and network connections (which are also an abstraction over
|
|
files on UNIX-like operating systems such as GNU/Linux). Regardless of where it
is declared, the call in the \emph{defer} statement is only executed once the
surrounding function returns, which makes sure that no file descriptors (FDs)
are leaked and the file is properly closed at that point.
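
The ordering can be demonstrated with a short sketch. The \texttt{resource}
type below is a hypothetical stand-in for a file or a database handle and is
not part of the program.

\begin{lstlisting}[language=Go, caption={Sketch: a deferred call runs when the function returns},
	label=goDeferSketch, basicstyle=\linespread{0.9}\footnotesize\ttfamily,
	backgroundcolor=\color{lstbg},
]
package main

import "fmt"

// resource is a hypothetical stand-in for a file or a database handle.
type resource struct{ log *[]string }

// Close records that the resource has been released.
func (r resource) Close() { *r.log = append(*r.log, "closed") }

// useResource defers Close right after acquiring the resource, yet Close
// only runs once the rest of the function body has finished.
func useResource(log *[]string) {
	r := resource{log: log}
	defer r.Close()
	*log = append(*log, "working")
}

func main() {
	var log []string
	useResource(&log)
	fmt.Println(log) // prints [working closed]
}
\end{lstlisting}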
|
|
|
|
In the next step, at line 24, the creation of the database schema is attempted, handling
|
|
the potential error in a Go idiomatic way, which uses the return value from the
|
|
function in an assignment to a variable declared in the \texttt{if} statement,
|
|
and checks whether the \texttt{err} was \texttt{nil} or not. In case the
|
|
\texttt{err} was not \texttt{nil}, i.e.\ \emph{there was an error in the callee
|
|
function}, the condition evaluates to \texttt{true}, which is followed by
|
|
entering the inner block. Inside it, the error is announced to the user (likely
|
|
a developer running the test in this case) and the testing object's
|
|
\texttt{FailNow()} method is called. That marks the test function as having
|
|
failed, and thus stops its execution. In this case, that is the desired
|
|
outcome, since if the database schema creation call fails, there really is no
|
|
point in continuing the testing of user creation.
|
|
\\
|
|
Conversely, if the schema \emph{does} get created without an error, the code
|
|
continues to declare a few variables (lines 30-32): \texttt{username},
|
|
\texttt{email} and \texttt{ctx}, where the context injected with the logger is
|
|
saved. Two of them are subsequently (line 33) passed into the
|
|
\texttt{UsernameExists} function, \texttt{ctx} being the first argument and the
|
|
database pointer and \texttt{username} following, while the \texttt{email}
|
|
variable is only used at a later stage (line 46). The point of declaring them
|
|
together is to give a sense of relatedness. The error value returned from this
|
|
function is again checked (line 33) and if everything goes well, the
|
|
\texttt{usernameFound} boolean value is checked next at line 38.
|
|
|
|
\smallskip
|
|
\smallskip
|
|
\begin{lstlisting}[language=Go, caption={User existence integration test},
	label=integrationtest,basicstyle=\linespread{0.9}\scriptsize\ttfamily,
	backgroundcolor=\color{lstbg},
	numbers=left,
	numberstyle=\linespread{0.9}\scriptsize\ttfamily,
	frame=l,
	framesep=18.5pt,
	framerule=0.1pt,
	xleftmargin=18.7pt,
	otherkeywords={\%s, \%q, \%v},
]
// modules/user/user_test.go
package user

import (
	"context"
	"testing"

	"git.dotya.ml/mirre-mt/pcmt/ent/enttest"
	"git.dotya.ml/mirre-mt/pcmt/slogging"
	_ "github.com/xiaoqidun/entps"
)

func getCtx() context.Context {
	l := slogging.Init(false)
	ctx := context.WithValue(context.Background(), CtxKey{}, l)
	return ctx
}

func TestUserExists(t *testing.T) {
	connstr := "file:ent_tests?mode=memory&_fk=1"
	db := enttest.Open(t, "sqlite3", connstr)
	defer db.Close()

	if err := db.Schema.Create(context.Background()); err != nil {
		t.Errorf("failed to create schema resources: %v", err)
		t.FailNow()
	}

	username := "dude"
	email := "dude@b.cc"
	ctx := getCtx()

	usernameFound, err := UsernameExists(ctx, db, username)
	if err != nil {
		t.Errorf("error checking for username {%s} existence: %q", username, err)
	}

	if usernameFound {
		t.Errorf("unexpected: user{%s} should not have been found", username)
	}

	if _, err := EmailExists(ctx, db, email); err != nil {
		t.Errorf("unexpected: user email '%s' should not have been found", email)
	}

	usr, err := CreateUser(ctx, db, email, username, "so strong")
	if err != nil {
		t.Errorf("failed to create user, error: %q", err)
		t.FailNow()
	} else if usr == nil {
		t.Error("got nil usr back")
		t.FailNow()
	}

	if usr.Username != username {
		t.Errorf("got back wrong username, want: %s, got: %s",
			username, usr.Username,
		)
	} // ...more checks...
}
\end{lstlisting}

Since the database has just been created, it should contain no users, which is
what the \texttt{if} statement at line 38 verifies. The same check is then
performed using an email (line 42), for which the lookup is likewise expected
to find nothing.

The final statements of the described test attempt to create a user by calling
the function \texttt{CreateUser(...)} at line 46, whose return values are again
checked for both error and \emph{nillability}, respectively. The test continues
with more checks similar to what has been described so far, but the rest was
omitted for brevity.

As was just demonstrated in the test, a neat thing about error handling in Go
is that it allows for very easy checking of all code paths, not just the
\emph{happy path} where there are no issues. The recommended approach of
immediately and explicitly handling (or deciding to ignore) the error is, in
the author's view, superior to wrapping hundreds of lines in \texttt{try}
blocks and then \emph{catching} (or not) \emph{all the} exceptions, as is the
practice in some other languages.


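To make the point about code paths concrete, the pattern can be reduced to a
small, self-contained sketch. It is not taken from \texttt{pcmt}:
\texttt{findUser}, \texttt{describe} and \texttt{errNotFound} are hypothetical
names, and a plain map stands in for the database.

\begin{lstlisting}[language=Go, caption={Sketch: handling every error path explicitly}]
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a sentinel error callers can test for with errors.Is.
var errNotFound = errors.New("user not found")

// findUser returns the email registered for username, or an error.
func findUser(db map[string]string, username string) (string, error) {
	if username == "" {
		return "", errors.New("empty username")
	}
	email, ok := db[username]
	if !ok {
		// Wrap the sentinel so that context is added but the cause is kept.
		return "", fmt.Errorf("looking up %q: %w", username, errNotFound)
	}
	return email, nil
}

// describe handles each outcome of findUser right where it occurs.
func describe(db map[string]string, username string) string {
	email, err := findUser(db, username)
	switch {
	case errors.Is(err, errNotFound):
		return "not found"
	case err != nil:
		return "error: " + err.Error()
	}
	return "found " + email
}

func main() {
	db := map[string]string{"dude": "dude@b.cc"}
	for _, name := range []string{"dude", "nobody", ""} {
		fmt.Println(describe(db, name))
	}
}
\end{lstlisting}

Each of the three outcomes (success, an expected \emph{not found} condition and
an unexpected failure) has its own visible branch, so a test can exercise all
of them by varying only the input.
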
\n{2}{Test environment}

The application has been deployed in a test environment on the author's modest
Virtual Private Server (VPS) at \texttt{https://testpcmt.dotya.ml}, protected
by a short-lived, \emph{Let's Encrypt}-issued TLS certificate using the ECDSA
\texttt{secp384r1} curve, and configured with a strict CSP. It is a test
instance, therefore limits (and rate-limits) to prevent abuse might be imposed.
\\
The test environment makes the program available over both modern IPv6 and
legacy IPv4 protocols, to maximise accessibility. Redirects were set up from
plain HTTP to HTTPS, as well as from the \texttt{www} to the non-\texttt{www}
domain. The subject domain configuration is hardened by setting the
\texttt{CAA} record, which limits the certificate authorities (CAs) that are
able to issue TLS certificates for it (and have them trusted by validating
clients). Additionally, \textit{HTTP Strict Transport Security} (HSTS) was
enabled for the main domain (\texttt{dotya.ml}) including the subdomains quite
some time ago (consult the preload lists in Firefox/Chrome), which mandates
that clients speaking HTTP only ever connect to it (and the subdomains) using
TLS.

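What such a redirect and an HSTS header amount to at the HTTP level can be
sketched using only Go's standard library. This is an illustration under
assumptions, not the configuration of the test instance: the function names,
the \texttt{max-age} value, the \texttt{example.org} host and the choice to
express this logic in Go (rather than, for example, in a reverse proxy) are all
hypothetical.

\begin{lstlisting}[language=Go, caption={Sketch: HTTP to HTTPS redirect and an HSTS header}]
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// redirectToHTTPS answers plain-HTTP requests with a permanent redirect to
// the HTTPS variant of the same URL, dropping a leading "www." on the way.
func redirectToHTTPS(w http.ResponseWriter, r *http.Request) {
	host := strings.TrimPrefix(r.Host, "www.")
	target := "https://" + host + r.URL.RequestURI()
	http.Redirect(w, r, target, http.StatusMovedPermanently)
}

// withHSTS wraps a handler that is served over TLS so that every response
// carries the HSTS header. The max-age of one year is an assumed value.
func withHSTS(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Strict-Transport-Security",
			"max-age=31536000; includeSubDomains; preload")
		next.ServeHTTP(w, r)
	})
}

// location sends a recorded request through redirectToHTTPS and returns
// the redirect target.
func location(url string) string {
	req := httptest.NewRequest(http.MethodGet, url, nil)
	rec := httptest.NewRecorder()
	redirectToHTTPS(rec, req)
	return rec.Header().Get("Location")
}

func main() {
	fmt.Println(location("http://www.example.org/signin?x=1"))

	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "https://example.org/", nil)
	withHSTS(ok).ServeHTTP(rec, req)
	fmt.Println(rec.Header().Get("Strict-Transport-Security"))
}
\end{lstlisting}

Browsers only honour the HSTS header when it arrives over a secure connection,
which is why the sketch attaches it in a separate wrapper meant for the
TLS-served handler instead of in the redirecting one.
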
The whole deployment has been orchestrated using an Ansible\footnotemark{}
playbook created for this occasion, focusing on idempotence with the aim of
reliably automating the deployment process. As a side effect, the deployment is
now documented reasonably well in code, which is available at
\url{https://git.dotya.ml/mirre-mt/ansible-pcmt.git}.

\footnotetext{A NixOps approach was considered; however, Ansible was deemed
more suitable since the existing host runs Arch.}


\n{3}{Deployment validation}

% TODO: show the results of testing the app in prod using:
% \url{https://testssl.sh/} and
% \url{https://gtmetrix.com/reports/testpcmt.dotya.ml/}.

The deployed application has been validated using the \textit{Security Headers}
tool (see \url{https://securityheaders.com/?q=https%3A%2F%2Ftestpcmt.dotya.ml}),
the results of which can be seen in Figure~\ref{fig:secheaders}.

It shows that the application sets the \texttt{Cross-Origin-Opener-Policy}
header to \texttt{same-origin}, which isolates the browsing context exclusively
to \textit{same-origin} documents, preventing \textit{cross-origin} documents
from loading in the same browsing context.

\obr{Security Headers scan}{fig:secheaders}{.89}{graphics/screen-securityHeaders}

Furthermore, a \texttt{Content-Security-Policy} of
\texttt{upgrade-insecure-requests; default-src 'none'; manifest-src 'self';
font-src 'self'; img-src 'self' https://*; script-src 'self'; style-src 'self';
object-src 'self'; form-action 'self'; frame-ancestors 'self'; base-uri 'self'}
is set by the program using a header.
This policy essentially pronounces the application (whatever domain it happens
to be hosted at, hence \texttt{'self'}) as the only \textit{permissible} source
of any scripts, styles and frames, and the only destination of web forms. One
exception is the \texttt{img-src 'self' https://*} directive, which more
leniently also permits images from any \textit{secure} source. Apart from
images, this measure ensures that no unvetted content is ever loaded from
elsewhere.

The \texttt{Referrer-Policy} header setting of \texttt{no-referrer,
strict-origin-when-cross-origin} ensures that user tracking is reduced. The
last value a browser understands takes precedence, so browsers supporting
\texttt{strict-origin-when-cross-origin} send at most the origin, never the
full URL, in the \texttt{Referer} header when the user navigates away from the
site or otherwise sends requests outside the application, and nothing at all
when the destination is not served over HTTPS. Browsers that do not support
that value fall back to \texttt{no-referrer} and omit the header entirely. The
\texttt{Permissions-Policy} set to
\texttt{geolocation=(), midi=(), sync-xhr=(), microphone=(), camera=(),
gyroscope=(), magnetometer=(), fullscreen=(self), payment=()} declares that the
application is, for instance, never going to request access to payment
information, user microphone or camera devices, or geolocation.

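How the program attaches these headers is not shown in this section. As a rough
sketch under that caveat, a standard-library middleware setting the header
values quoted above could look as follows. The names \texttt{policies},
\texttt{securityHeaders} and \texttt{headerOf} are hypothetical, and the CSP
and \texttt{Permissions-Policy} strings are abridged for readability.

\begin{lstlisting}[language=Go, caption={Sketch: middleware attaching security headers}]
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// policies maps header names to (abridged) values discussed in the text.
var policies = map[string]string{
	"Cross-Origin-Opener-Policy": "same-origin",
	"Content-Security-Policy": "default-src 'none'; script-src 'self'; " +
		"img-src 'self' https://*",
	"Referrer-Policy":    "no-referrer, strict-origin-when-cross-origin",
	"Permissions-Policy": "geolocation=(), microphone=(), camera=(), payment=()",
}

// securityHeaders sets every configured header before calling next.
func securityHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for name, value := range policies {
			w.Header().Set(name, value)
		}
		next.ServeHTTP(w, r)
	})
}

// headerOf runs one recorded request through the middleware and returns
// the value of the named response header.
func headerOf(name string) string {
	app := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	securityHeaders(app).ServeHTTP(rec, req)
	return rec.Header().Get(name)
}

func main() {
	fmt.Println(headerOf("Referrer-Policy"))
}
\end{lstlisting}

Setting the headers in one wrapper around the whole application means that no
individual page handler can forget them.
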
\texttt{gobuster} was used in fuzzing mode to aid in uncovering potential
application misconfigurations. The wordlists used include:

\begin{itemize}
	\item Anton Lopanitsyn's \texttt{fuzz.txt} (\url{https://github.com/Bo0oM/fuzz.txt/tree/master})
	\item Daniel Miessler's \texttt{SecLists} (\url{https://github.com/danielmiessler/SecLists})
	\item Sam's \texttt{samlists} (\url{https://github.com/the-xentropy/samlists})
\end{itemize}

Many requests yielded 404s for non-existent pages, or possibly for pages
requiring authentication (\emph{NotFound} is used so as not to disclose the
page's existence). The program initially also issued quite a few 503s as a
result of rate-limiting, until \texttt{gobuster} was tamed using the
\texttt{--delay} parameter. Anti-CSRF measures employed by the program caused
most of the requests to yield 400s (missing CSRF token), or 403s with a CSRF
token.
% A Burp test would perhaps be more telling.

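The practice of answering with \emph{NotFound} instead of revealing that a page
exists but requires authentication can be sketched as follows. This is an
illustrative standard-library example and not the program's actual middleware:
the \texttt{requireAuth} wrapper, the \texttt{session} cookie name and the
\texttt{/manage} path are assumptions made for the example.

\begin{lstlisting}[language=Go, caption={Sketch: hiding protected pages behind a 404}]
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requireAuth hides the wrapped handler from unauthenticated clients by
// replying 404 instead of 401 or 403. Checking only that a cookie named
// "session" is present is a deliberate simplification.
func requireAuth(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if _, err := r.Cookie("session"); err != nil {
			http.NotFound(w, r)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// status returns the response code a client with or without the cookie
// receives for the protected page.
func status(withCookie bool) int {
	page := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "user management")
	})
	req := httptest.NewRequest(http.MethodGet, "/manage", nil)
	if withCookie {
		req.AddCookie(&http.Cookie{Name: "session", Value: "opaque"})
	}
	rec := httptest.NewRecorder()
	requireAuth(page).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(status(false), status(true)) // prints "404 200"
}
\end{lstlisting}

From the outside, a protected page and a page that does not exist are then
indistinguishable by status code, which is consistent with the 404s observed
during the fuzzing run.
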
The deployed application was scanned with Qualys' \textit{SSL Labs} scanner
and the results can be seen in Figure~\ref{fig:ssllabs}, confirming that HSTS
(including subdomains) is deployed, the server runs TLS 1.3, and the DNS
Certificate Authority Authorisation (CAA) is configured for the domain, with
the overall grade being A+.

\obr{Qualys SSL Labs scan}{fig:ssllabs}{.75}{graphics/screen-sslLabs}



\n{1}{Application screenshots}

Figure~\ref{fig:homepage} depicts the initial page that a logged-out user is
greeted with when they load the application.

\obr{Homepage}{fig:homepage}{.84}{graphics/screen-homepage}

Figure~\ref{fig:signup} shows the registration page with input fields turned
green after basic validation. Visiting this page with registration disabled in
settings would yield a 404.

\obr{Registration page}{fig:signup}{.65}{graphics/screen-signup}

\newpage

\obr{Registration page email
error}{fig:signupEmailError}{.54}{graphics/screen-signup-emailError}

A sign-up form error telling the user to provide a valid email address is shown
in Figure~\ref{fig:signupEmailError}.


\obr{Sign-in page}{fig:signin}{.55}{graphics/screen-signin}

Figure~\ref{fig:signin} depicts a sign-in form similar to the sign-up one.

\obr{Short password error on
sign-in}{fig:signinShortPasswd}{.55}{graphics/screen-signin-shortPasswordError}

An error in Figure~\ref{fig:signinShortPasswd} prompts the user to lengthen the
content of the password field from 3 to at least 20 characters.

\newpage

\obr{Admin homepage}
{fig:adminHome}{.25}
{graphics/screen-adminHome}

Figure~\ref{fig:adminHome} displays a simple greeting and a logout button.

\obr{User management screen}
{fig:adminUserManagement}{.85}
{graphics/screen-adminUserManagement}

Figure~\ref{fig:adminUserManagement} shows the user management screen, which
provides links for viewing a user's details page and for creating a new user.

% \obr{User creation screen}
% {fig:adminUserCreate}{.35}
% {graphics/screen-adminUserCreate}

\obr{User creation: `username not unique' error}
{fig:adminUserCreateErrorUsernameNotUnique}{.65}
{graphics/screen-adminUserCreateErrorUsernameNotUnique}

The user creation form can be seen in
Figure~\ref{fig:adminUserCreateErrorUsernameNotUnique}. Both regular and admin
level users can be created here. In this case, an error is shown, telling the
user there is an issue with username uniqueness. The user experience of this
process could in the future be improved by using a bit of JavaScript (or
WebAssembly) to check the uniqueness of the username on each \emph{key-up}
event.

\newpage

\obr{Creation of user `demo'}
{fig:adminUserCreateDemo}{.75}
{graphics/screen-adminUserCreateDemo}

The user management screen is again shown in
Figure~\ref{fig:adminUserCreateDemo} after user `demo' was created. An
informative \emph{flash} message is printed near the top of the page
immediately after the action and not shown on subsequent page loads.


\obr{User details screen}
{fig:adminUserDetail}{.65}
{graphics/screen-adminUserDetail}

The user details page is depicted in Figure~\ref{fig:adminUserDetail}. The
interface presents key information about the user such as ID, username and
admin status. Additionally, it provides a link back to the previous page and
two buttons: one for editing the user and one for user deletion.

\obr{User edit screen}
{fig:adminUserEdit}{.65}
{graphics/screen-adminUserEdit}

Figure~\ref{fig:adminUserEdit} shows the form for user editing. At the bottom,
there is an `Update' button for submitting the form and a couple of checkboxes
for toggling the `admin' and `active' state of the user. Above those, there are
input fields for `username', `email', `password' and the confirmation of the
password.

\newpage

\obr{User deletion confirmation}
{fig:adminUserDeleteConfirm}{.55}
{graphics/screen-adminUserDeleteConfirmation}

When attempting to delete a user, the administrator is presented with the
screen shown in Figure~\ref{fig:adminUserDeleteConfirm}, which asks them
whether they are absolutely sure they want to perform an action with permanent
consequences. The `Confirm permanent deletion' button is highlighted in intense
red colour, while the `Cancel' button is displayed in a light blue tone. There
are two additional links: the `All users' one that points to the user
management page, and the `Back to detail' one that simply brings the
administrator one step back to the user details page.

\obr{User deletion post-hoc}
{fig:adminUserDeletePostHoc}{.55}
{graphics/screen-adminUserDemoDeletion}

After successful user deletion, the administrator is redirected back to the
user management page and a flash message confirming the deletion is printed
near the top of the page, as shown in Figure~\ref{fig:adminUserDeletePostHoc}.

\obr{Log-out message}
{fig:logout}{.20}
{graphics/screen-logout}

Figure~\ref{fig:logout} shows the message printed to users on logout.

\newpage

\obr{Manage API keys}
{fig:manageAPIKeys}{.65}
{graphics/screen-manageAPIKeys}

Figure~\ref{fig:manageAPIKeys} shows a page that allows administrators to
manage instance-wide API keys for external services, such as \emph{Have I Been
Pwned?} or \emph{DeHashed.com}. Do note that these keys are never distributed
to clients in any way and are only ever used by the application itself to make
the requests on \emph{behalf} of the users.

\obr{Import of locally available breach data from the CLI}
{fig:localImport}{.99}
{graphics/screen-localImport}

Figure~\ref{fig:localImport} depicts how formatted breach data can be imported
into the program's database using the CLI.




% =========================================================================== %