extend, reword practical part
parent fb5c7d0bcd
commit 65b7cb9b89
@@ -3,14 +3,15 @@

 \n{1}{Introduction}

-A part of the task of this thesis was to build an actual Password Compromise
-Monitoring Tool. Therefore, the development process, the tools and practices
-used generally, and with more specificity the outcome are all described in the
-following sections. A whole section is dedicated to application architecture,
-whereby relevant engineering choices are justified and motifs preceding the
-decisions are explained. This part then flows into recommendations for more of
-a production deployment and concludes by describing the validation methods
-chosen and used to ensure correctness and stability of the program.
+A part of the task of this thesis was to build an actual application, which was
+named Password Compromise Monitoring Tool, or \texttt{pcmt} for short.
+Therefore, the development process, the general tools and practices as well as
+the specific outcome are all described in the following sections. A whole
+section is dedicated to application architecture, whereby relevant engineering
+choices are justified and motives preceding the decisions are explained. This
+part then flows into recommendations for more of a production deployment and
+concludes by describing the validation methods chosen and used to ensure
+correctness and stability of the program.


 \n{2}{Kudos}
@@ -18,7 +19,10 @@ chosen and used to ensure correctness and stability of the program.
 The program that has been developed as part of this thesis used and utilised a
 great deal of free (as in \textit{freedom}) and open-source software in the
 process, either directly or as an outstanding work tool, and the author would
-like to take this opportunity to recognise that fact\footnotemark.
+like to take this opportunity to recognise that fact\footnotemark{}.
+
+\footnotetext{\textbf{Disclaimer:} the author is not affiliated with any of the
+projects mentioned on this page.}

 In particular, the author acknowledges that this work would not be the same
 without:
@@ -34,14 +38,12 @@ without:
 \item Go (\url{https://go.dev/})
 \end{itemize}

-All of the code written has been typed into VIM (\texttt{9.0}), the shell used
-to run the commands was ZSH, both running in the author's terminal emulator of
-choice, \texttt{kitty}. The development machines ran a recent installation of
-\textit{Arch Linux (by the way)} and Fedora 38, both using a
+All the code was typed into VIM, the shell used was ZSH, and the terminal
+emulator of choice was \texttt{kitty}. The development machines ran a recent
+installation of \textit{Arch Linux}\footnotemark{} and Fedora 38, both using a
 \texttt{6.\{2,3,4\}.x} XanMod variant of the Linux kernel.

-\footnotetext{\textbf{Disclaimer:} the author is not affiliated with any of the
-projects mentioned on this page.}
+\footnotetext{(by the way) \url{https://i.redd.it/mfrfqy66ey311.jpg}.}


 \n{1}{Development}
@@ -49,11 +51,11 @@ projects mentioned on this page.}
 The source code of the project was being versioned since the start, using the
 popular and industry-standard git (\url{https://git-scm.com}) source code
 management (SCM) tool. Commits were made frequently and, if at all possible,
-for small and self-contained changes of code, trying to follow sane commit
-message \emph{hygiene}, i.e.\ striving for meaningful and well-formatted commit
-messages. The name of the default branch is \texttt{development}, since that is
-what the author likes to choose for new projects that are not yet stable (it is
-in fact the default in author's \texttt{.gitconfig}).
+consist of small and self-contained changes of code, trying to follow sane
+commit message \emph{hygiene}, i.e.\ striving for meaningful and well-formatted
+commit messages. The name of the default branch is \texttt{development}, since
+that is what the author likes to choose for new projects that are not yet
+stable (it is in fact the default in author's \texttt{.gitconfig}).


 \n{2}{Commit signing}
@@ -134,33 +136,33 @@ container) was used to run the builds.

 The way this runner works is that it creates an ephemeral container for every
 pipeline step and executes given \emph{commands} inside of it. At the end of
-each step the container is discarded, while the repository, which is mounted
-into each container's \texttt{/drone/src} is persisted between steps, allowing
-it to be cloned only from \emph{origin} only at the start of the pipeline and
-then shared for all the following steps, saving bandwidth, time and disk
+each step, the container is discarded while the repository clone, which is
+mounted into each container's \texttt{/drone/src}, is persisted between steps,
+allowing it to be cloned from \emph{origin} only at the start of the pipeline
+and then shared for all the following steps, saving bandwidth, time and disk
 writes.

 The entire configuration used to run the pipelines can be found in a file named
 \texttt{.drone.yml} at the root of the main source code repository. The
 workflow consists of four pipelines, which are run in parallel. Two main
 pipelines are defined to build the frontend assets, the \texttt{pcmt} binary
-and run tests on \texttt{x86\_64} GNU/Linux targets, one for each of Arch and
-Alpine (version 3.1\{7,8\}). These two pipelines are identical apart from
+and run tests on \texttt{x86\_64} GNU/Linux targets, one for each of Alpine
+(version 3.1\{7,8\}) and Arch. These two pipelines are identical apart from
 OS-specific bits such as installing a certain package, etc. For the record,
 other OS-architecture combinations were not tested.

 A third pipeline contains instructions to build a popular static analysis tool
-called \texttt{golangci-lint}, which is sort of a meta-linter, bundling a
-staggering amount of linters (linter is a tool that performs static code
+called \texttt{golangci-lint}, which is a sort of meta-linter, bundling a
+staggering number of linters (linter is a tool that performs static code
 analysis and can raise awareness of programming errors, flag potentially buggy
-code constructs, or \emph{mere} stylistic errors) - from sources and then
+code constructs, or \emph{mere} stylistic errors), from sources and then
 perform the analysis of project's codebase using the freshly built binary. If
 the result of this step is successful, a handful of code analysis services get
 pinged in the next steps to take notice of the changes to project's source code
-and update their metrics, details can be found in the main Drone configuration
+and update their metrics. Details can be found in the main Drone configuration
 file \texttt{.drone.yml} and the configuration for the \texttt{golangci-lint}
-tool itself (such as what linters are enabled/disabled and with what settings)
-can be found in the root of the repository in the file named
+tool itself (such as what linters are enabled/disabled and their
+configurations) can be found in the root of the repository in the file named
 \texttt{.golangci.yml}.

 The fourth pipeline focuses on linting the \texttt{Containerfile} and building
@@ -287,9 +289,10 @@ passed around as a pointer.

 An experimental (note: not anymore, with \texttt{go1.21} it was brought into
 Go's \textit{stdlib}) library for \textit{structured} logging \texttt{slog} was
-used to facilitate every logging need the program might have. It supports both
-JSON and plain-text logging, which was made configurable by the program. Either
-a configuration file value or an environment variable can be used to set this.
+used to facilitate every logging need that the program might have. It supports
+both JSON and plain-text logging, which was made configurable by the program.
+Either a configuration file value or an environment variable can be used to set
+this.

 There are four log levels available by default (\texttt{DEBUG}, \texttt{INFO},
 \texttt{WARNING}, \texttt{ERROR}) and the pertinent library functions are
@@ -373,10 +376,10 @@ TCP range $1-65535$.}

 \vspace*{-\baselineskip}

-\paragraph{\texttt{-printMigration}}{A boolean option that if set, makes the
-program print any \textbf{upcoming} database migrations (based on the current
-state of the database) and exit. The connection string environment variable
-still needs to be set in order to be able connect to the database and perform
+\paragraph{\texttt{-printMigration}}{A boolean option that, if set, makes the
+program print any \textbf{upcoming} database migrations (based on the current
+state of the database) and exit. The connection string environment variable
+still needs to be set in order to be able to connect to the database and perform
 the schema \emph{diff}. This option is mainly useful during debugging.}

 \vspace*{-\baselineskip}
@@ -410,7 +413,7 @@ development binary might simply print the truncated commit ID (consult

 An important thing to mention is embedded assets and templates. Go has multiple
 mechanisms to natively embed arbitrary files directly into the binary during
-the regular build process. \texttt{embed.FS} from the standard library'
+the regular build process. \texttt{embed.FS} from the standard library
 \texttt{embed} package was used to bundle all template files and web assets,
 such as images, logos and stylesheets at the module level. These are then
 passed around the program as needed, such as to the \texttt{handlers} package.
@@ -481,7 +484,7 @@ JavaScript.

 \n{2}{Frontend}

-Frontend-side, the application Tailwind was used for CSS. It promotes the usage
+Frontend-wise, Tailwind was used for the application's CSS. It promotes the usage
 of flexible \emph{utility-first} classes in the HTML markup instead of
 separating out styles from content. Understandably, this is somewhat of a
 preference issue and the author does not hold hard opinions in either
@@ -491,7 +494,7 @@ detailed documentation and offering built-in support for dark/light mode, and
 partially also because it \emph{looks} nice.

 The Go templates containing the CSS classes need to be parsed by Tailwind in
-order t produce the final stylesheet that can be bundled with the application.
+order to produce the final stylesheet that can be bundled with the application.
 The upstream provides an original CLI tool (\texttt{tailwindcss}), which can be
 used exactly for that action. Simple and accessible layouts were overall
 preferred, a single page was rather split into multiple when becoming
@@ -503,9 +506,9 @@ pages.
 As an aside, the author has briefly experimented with WebAssembly to provide
 client-side dynamic functionality for this project, but has ultimately scrapped
 it in favour of the entirely server-side rendered approach. It is possible that
-it would get revisited in the future if necessary, and performance mattered.
-Even from the short experiments it was obvious how much faster WebAssembly was
-when compared to JavaScript.
+it would get revisited in the future if necessary. Even from the short
+experiments it was obvious how much faster WebAssembly was when compared to
+JavaScript.


 % \newpage
@@ -525,15 +528,15 @@ roles were envisioned:
 \end{itemize}

 It is paramount that the program protects itself from the insider threats as
-well and therefore each role is only able to perform actions that it is
-explicitly assigned. While there definitely is certain overlap between the
+well, and therefore each role is only able to perform actions that it is
+explicitly assigned. While there definitely is a certain overlap between the
 capabilities of the two outlined roles, each also possesses unique features
 that the other one does not.

-For instance, the administrator role is not able to perform searches on the
-breach data directly, for that a separate \emph{user} account has to be
-devised. Similarly, a regular user is not able to manage breach lists and other
-users, because that is a privileged operation.
+For instance, the administrator role is not able to perform breach data
+searches directly, for that a separate \emph{user} account has to be devised.
+Similarly, a regular user is not able to manage breach lists and other users,
+because that is a privileged operation.

 In-application administrators are not able to view (any) sensitive user data
 and should therefore only be able to perform the following actions:
@@ -572,12 +575,12 @@ configuration parameters) before letting it encrypt the \emph{age} key.
 The \texttt{age} identity is only generated once the user changes their
 password for the first time, in an attempt to prevent scenarios like the
 in-application administrator with access to physical database being able to
-both \textbf{recover} the key from the database and \textbf{decrypt} it given
+both \textbf{recover} the key from the database and \textbf{decrypt} it, given
 that they already know the user password (because they set it when they created
 the user), which would subsequently give them unbounded access to any future
 encrypted data, as long as they would be able to maintain their database
-access. This is why generating the \texttt{age} identity is are bound to the
-first password change.
+access. This is why generating the \texttt{age} identity is bound to the first
+password change.

 Of course, the supposed evil administrator could simply perform the password
 change themselves! However, the user would at least be able to find those
@@ -603,8 +606,8 @@ instance using \texttt{LiME}~\cite{lime}, or perhaps directly the
 \n{2}{Dhall Configuration Schema}\label{sec:configuration}

 The configuration schema was at first being developed as part of the main
-project's repository, before it was determined that it would benefit both the
-development and overall clarity if the schema lived in its own repository (see
+project's repository, before it was determined that both the development and
+overall clarity would benefit from the schema living in its own repository (see
 Section~\ref{sec:repos} for details). This enabled the schema to be
 independently developed and versioned, and only be pulled into the main
 application whenever it was determined to be ready.
@@ -876,13 +879,27 @@ manner with more than one concurrent \emph{writer} (replicated application
 instances).

 \footnotetext{In Go, integer size is architecture dependent, see
-\url{https://go.dev/ref/spec#Numeric_types}}
+\url{https://go.dev/ref/spec#Numeric_types}.}

 The relations between entities as modelled with \texttt{ent} can be imagined as
 the edges connecting the nodes of a directed \emph{graph}, with the nodes
-representing the entities. This conceptualisation lends itself for a more
+representing the entities. This conceptualisation lends itself to a more
 human-friendly querying language, where the directionality can be expressed
-with words describing ownership
+with words describing ownership, like so:
+
+\vspace{\parskip}
+\begin{lstlisting}[caption={Ent graph query},
+label=entQuery,
+backgroundcolor=\color{lstbg},
+language=Go,
+]
+one, err := users.Query.
+	Where(
+		LocalBreach.
+			Has(Field_xyz)
+	).
+	Only(ctx)
+\end{lstlisting}
 
 
 
@@ -1179,12 +1196,13 @@ himself to be a staunch supporter of neither extreme. The ``no unit tests''
 opinion seems to discount any benefit there is to unit testing, while a
 ``TDD-only''\footnotemark{} approach can be a little too much for some people's
 taste. The author tends to prefer a \emph{middle ground} approach in this
-particular case, i.e. writing enough tests where meaningful but not necessarily
-testing everything or writing tests prior to business logic code. Arguably,
-following the practice of TDD should result in writing a \emph{better designed}
-code, particularly because there needs to be a prior thought about the shape
-and function of the code, as it is tested for before it is even written, but it
-adds a slight inconvenience to what is otherwise a straightforward process.
+particular case, i.e. writing enough tests where meaningful, but not
+necessarily testing everything or writing tests prior to business logic code.
+Arguably, following the practice of TDD should result in writing a \emph{better
+designed} code, particularly because there needs to be a prior thought about
+the shape and function of the code, as it is tested for before being even
+written, but it adds a slight inconvenience to what is otherwise a
+straightforward process.

 Thanks to Go's built in support for testing via its \texttt{testing} package
 and the tooling in the \texttt{go} tool, writing tests is relatively simple. Go
@@ -1200,7 +1218,7 @@ the signature. Without it, the function is not considered to be a testing
 function despite having the required signature and is therefore \emph{not}
 executed during testing.

-This test lookup behaviour; however, also has a neat side effect: all the test
+This test lookup behaviour, however, also has a neat side effect: all the test
 files can be kept side-by-side their regular source counterparts, there is no
 need to segregate them into a specially blessed \texttt{tests} folder or
 similar, which in author's opinion improves readability. As a failsafe, in case
@@ -1231,20 +1249,20 @@ then after pushing to remote in the CI.

 In the integration test shown in Listing~\ref{integrationtest}, it is prefaced
 at line 10 by declaring a helper function \texttt{getCtx() context.Context},
-which takes no arguments and returns a new\\ \texttt{context.Context}
-initialised with the value of the global logger. As previously mentioned, that
-is how the logger gets injected into the user module functions. The actual test
-function with the signature \texttt{TestUserExists(t *testing.T)} defines a
-database connection string at line 21 and attempts to open a connection to the
-database. The database in use here is SQLite3 running in memory mode, meaning
-no file is actually written to disk during this process. Since the testing data
-is not needed after the test, this is desirable. Next, a defer statement calls
-the \texttt{Close()} method on the database object, which is the Go idiomatic
-way of closing files and network connections (which are also an abstraction
-over files on UNIX-like operating systems such as GNU/Linux). Contrary to where
-it is declared, the \emph{defer} statement is only called after all the
-statements in the surrounding function, which makes sure no file descriptors
-(FDs) are leaked and the file is properly closed when the function returns.
+which takes no arguments and returns a new \texttt{context.Context} initialised
+with the value of the global logger. As previously mentioned, that is how the
+logger gets injected into the user module functions. The actual test function
+with the signature \texttt{TestUserExists(t *testing.T)} defines a database
+connection string at line 21 and attempts to open a connection to the database.
+The database in use here is SQLite3 running in memory mode, meaning no file is
+actually written to disk during this process. Since the testing data is not
+needed after the test, this is desirable. Next, a defer statement calls the
+\texttt{Close()} method on the database object, which is the Go idiomatic way
+of closing files and network connections (which are also an abstraction over
+files on UNIX-like operating systems such as GNU/Linux). Contrary to where it
+is declared, the \emph{defer} statement is only called after all the statements
+in the surrounding function, which makes sure no file descriptors (FDs) are
+leaked and the file is properly closed when the function returns.

 In the next step at line 25 a database schema creation is attempted, handling
 the potential error in a Go idiomatic way, which uses the return value from the
@@ -1371,7 +1389,7 @@ The application has been deployed in a test environment on author's modest
 Virtual Private Server (VPS) at \texttt{https://testpcmt.dotya.ml}, protected
 by \emph{Let's Encrypt}\allowbreak issued, short-lived, ECDSA
 \texttt{secp384r1} curve TLS certificate, and configured with strict CSP. It is
-a test instance; therefore limits (and rate-limits) to prevent abuse might be
+a test instance, therefore limits (and rate-limits) to prevent abuse might be
 imposed.
 \\
 The test environment makes the program available over both modern IPv6 and
@@ -1427,7 +1445,7 @@ ensures that no unvetted content is ever loaded from elsewhere.
 The \texttt{Referrer-Policy} header setting of \texttt{no-referrer,
 strict-origin-when-cross-origin} ensures that user tracking is reduced, since
 no referrer is included (the \texttt{Referer} header is omitted) when the user
-navigatse away from the site or somehow send requests outside the application
+navigates away from the site or somehow sends requests outside the application
 using other means. The \texttt{Permissions-Policy} set to
 \texttt{geolocation=(), midi=(), sync-xhr=(), microphone=(), camera=(),
 gyroscope=(), magnetometer=(), fullscreen=(self), payment=()} declares that the