\documentclass[10pt]{article}
%\usepackage[czech]{babel}
\usepackage[sfdefault]{noto}
\usepackage{caption}
\usepackage{subcaption}
\usepackage{graphicx}
\usepackage{listings}
\usepackage[T1]{fontenc}
\usepackage[paperwidth=21cm, paperheight=29.7cm]{geometry}
\geometry{paperwidth=21cm, paperheight=29.7cm, tmargin=2.5cm, bmargin=2.5cm,
lmargin=2.5cm, rmargin=2.5cm}
\usepackage{enumitem}
\usepackage{placeins}
\usepackage[colorlinks=true]{hyperref}
\usepackage{color}
\definecolor{lightgray}{rgb}{0.88, 0.88, 0.88}
\lstset{
backgroundcolor=\color{lightgray},
escapeinside={\%*}{*)},
basicstyle=\ttfamily,
keepspaces=true,
breaklines=true,
postbreak=\mbox{\textcolor{red}{$\hookrightarrow$}\space},
}
\begin{document}
\begin{minipage}[c]{.5\textwidth}
\vspace{0pt}
\hspace{-17pt}
\raggedright
NAME
\\
\hspace{-17pt}
GROUP
\end{minipage}
\begin{minipage}[c]{.46\textwidth}
\vspace{0pt}
\raggedleft
TIK 2019, VTI, S2
\\\today
\end{minipage}
\\
\rule{\textwidth}{1pt}
\\[5pt]
\Large{\textbf{Cryptography tools}}
\normalsize
\vspace*{-5pt}
\section{Monoalphabetic substitution cipher}
A substitution cipher is a system in which units of unencrypted text (known as plaintext) are replaced according to a fixed set of rules. The units may be individual letters, pairs or triplets of letters, or any mixture of these. The process produces encrypted text known as ciphertext, which the recipient decodes by applying the inverse substitution. The encoding rule must be known to both parties.
\subsection{Simple substitution}
This method replaces individual letters either with different symbols or with a rearranged image of the same alphabet, such as the shifted alphabet of the Caesar cipher or the reversed alphabet of Atbash. This work applies the principle in a simple tool for creating and decrypting Caesar ciphertexts.
\subsection{Cipher security}
Although the number of possible keys for a general monoalphabetic substitution is large ($26! \approx 2^{88.4}$, i.e. about $88$ bits), the cipher is not strong. An easy and efficient way to break it is to analyze the letter frequency distribution of the ciphertext. If the plaintext language is known, the observed frequencies can be matched to the typical letter frequencies of that language, and the partial mapping can then be extended progressively until whole words are decoded. Some words can also be recognized by their characteristic letter patterns. Some resistance can be achieved by contriving the plaintext to have a flat frequency distribution, but this usually only means that a larger sample of ciphertext is needed. In English, about 50 letters of ciphertext are typically needed to reliably break a simple substitution cipher.
With the Caesar cipher, which has only 25 non-trivial shifts, multiple encryptions do not increase security, since applying ROT$y$ to a ROT$x$ ciphertext only produces a ROT$((x+y) \bmod 26)$ ciphertext.
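The composition property can be checked with a short Python sketch. The helper \texttt{caesar\_shift} is a hypothetical illustration and is not taken from \texttt{monoalpha\_tool.py}; it assumes the input consists only of the uppercase letters A--Z.
\begin{lstlisting}
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text, shift):
    """Shift every letter of an uppercase A-Z string by `shift` places."""
    return "".join(ALPHABET[(ALPHABET.index(c) + shift) % 26] for c in text)


# Encrypting with ROT5 and then ROT8 is the same as a single ROT13.
twice = caesar_shift(caesar_shift("HELLOWORLD", 5), 8)
once = caesar_shift("HELLOWORLD", 13)
print(twice == once)
\end{lstlisting}
Running the sketch prints \texttt{True}: the two shifts collapse into a single one, so the attacker still faces only one unknown shift.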
\vspace*{-5pt}
\section{The tool}
The purpose of this work was to produce a cryptographic tool for encoding and decoding text using a monoalphabetic cipher. The resulting tool is attached as \texttt{monoalpha\_tool.py}, together with a Tk GUI variant named \texttt{monoalpha\_tool\_gui.py}. This section describes both. The tool works reliably with English-language text; no other language has been tested. The description assumes a Unix-like environment with Python 3.x installed.
\subsection{\texttt{monoalpha\_tool.py}}
The basic program takes one positional parameter \texttt{"text"} and several optional flags that specify the intended action. By default it encrypts the \texttt{"text"} parameter using ROT13 and prints the resulting ciphertext to stdout.
An example execution could be:
\begin{lstlisting}
$ monoalpha_tool.py "hello world"
URYYBJBEYQ
$
\end{lstlisting}
Note that the output is converted to uppercase and the whitespace is removed. This is done to obscure word boundaries, which makes attacks based on word patterns somewhat harder.
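A minimal Python sketch of this normalization followed by ROT13 is given here for illustration. The function names are hypothetical, and dropping every character other than A--Z is an assumption, since only the removal of whitespace is documented.
\begin{lstlisting}
import codecs


def normalize(text):
    """Uppercase the text and keep only the letters A-Z (assumed behaviour)."""
    return "".join(c for c in text.upper() if "A" <= c <= "Z")


def rot13(text):
    """Apply ROT13 using the codec shipped with the standard library."""
    return codecs.encode(text, "rot_13")


print(rot13(normalize("hello world")))
\end{lstlisting}
For the input used in the example above, the sketch prints \texttt{URYYBJBEYQ}.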
\subsubsection{Parameters}
The optional arguments are summarized by the program's help output:
\begin{lstlisting}
$ monoalpha_tool.py --help
usage: monoalpha_tool.py [-h] [-s SHIFT] [-k KEY] [-d] [-f] text

Caesar cipher encoding/decoding tool.

positional arguments:
  text                  Text to be encoded or decoded

optional arguments:
  -h, --help            show this help message and exit
  -s SHIFT, --shift SHIFT
                        Use custom alphabet shift (default is rot13)
  -k KEY, --key KEY     Use custom alphabet, input as "A:N, B:O, ..." with
                        substitution for each character
  -d, --decrypt         Set mode to decode
  -f, --force           Try to guess original encryption
$
\end{lstlisting}
Most of these parameters are self-explanatory.
\\\noindent The \texttt{-k} flag accepts a custom substitution alphabet in the form of a dictionary. This option is only available for encryption. The dictionary should contain a substitute for each letter that occurs in the plaintext. The letters may be entered in either upper or lower case.
For example:
\begin{lstlisting}
$ monoalpha_tool.py -k "H:Q, E:K, L:T, O:S" hello
QKTTS
$
\end{lstlisting}
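The key format can be handled along the following lines. This is only a sketch: \texttt{parse\_key} and \texttt{substitute} are hypothetical names, the actual parsing in \texttt{monoalpha\_tool.py} may differ, and raising \texttt{KeyError} for a letter missing from the key is an assumption.
\begin{lstlisting}
def parse_key(key):
    """Turn 'H:Q, E:K' into {'H': 'Q', 'E': 'K'}, ignoring letter case."""
    mapping = {}
    for pair in key.split(","):
        plain, cipher = pair.strip().upper().split(":")
        mapping[plain.strip()] = cipher.strip()
    return mapping


def substitute(text, mapping):
    """Replace each letter via the mapping; a missing letter raises KeyError."""
    letters = [c for c in text.upper() if c.isalpha()]
    return "".join(mapping[c] for c in letters)


table = parse_key("H:Q, E:K, L:T, O:S")
print(substitute("hello", table))
\end{lstlisting}
With the key from the example above, the sketch prints \texttt{QKTTS}.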
When the \texttt{-d} flag is set, the program treats the \texttt{"text"} input as ciphertext. The flag can be combined with \texttt{-s} if the encoding shift is known, or with \texttt{-f} to perform a simple frequency analysis and guess the encoding shift. For example:
\begin{lstlisting}
$ monoalpha_tool.py -d -f ZQZMTZHKDMZZQZMTIVODJIZQZMTOMDWZOCJPBCODORJPGYZIYDIVWDOHJMZYZXZIORVT
Trying to decrypt using frequency analysis. This method works best on large ciphertexts.
Text possibly encrypted using rot21
EVERYEMPIREEVERYNATIONEVERYTRIBETHOUGHTITWOULDENDINABITMOREDECENTWAY
$
\end{lstlisting}
With the \texttt{-d} flag on its own, the program performs a ROT13 decryption of the supplied ciphertext.
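The interface shown in the help output above could be declared with the standard \texttt{argparse} module roughly as follows. This is a sketch and not the tool's actual source: the integer type and the default of 13 for \texttt{-s} are assumptions inferred from the help text, and \texttt{build\_parser} is a hypothetical name.
\begin{lstlisting}
import argparse


def build_parser():
    """Declare the flags listed in the help output (assumed types/defaults)."""
    parser = argparse.ArgumentParser(
        prog="monoalpha_tool.py",
        description="Caesar cipher encoding/decoding tool.",
    )
    parser.add_argument("text", help="Text to be encoded or decoded")
    parser.add_argument("-s", "--shift", type=int, default=13,
                        help="Use custom alphabet shift (default is rot13)")
    parser.add_argument("-k", "--key",
                        help='Use custom alphabet, input as "A:N, B:O, ..." '
                             "with substitution for each character")
    parser.add_argument("-d", "--decrypt", action="store_true",
                        help="Set mode to decode")
    parser.add_argument("-f", "--force", action="store_true",
                        help="Try to guess original encryption")
    return parser


# Parse an explicit argument list instead of the real command line.
args = build_parser().parse_args(["-d", "-s", "21", "ZQZMT"])
print(args.decrypt, args.shift, args.text)
\end{lstlisting}
The sketch prints \texttt{True 21 ZQZMT}. With no flags given, \texttt{shift} falls back to 13, which corresponds to the ROT13 default described above.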
\subsubsection{Frequency analysis}
Setting both the \texttt{-d} and \texttt{-f} flags runs a frequency analysis attack on the supplied ciphertext. The attack relies on ``e'' being the most common letter in English texts. Since its frequency is considerably higher than that of the other frequent letters, it is the only letter checked. The most frequent ciphertext letter is taken as the likely substitute for ``e'', and the encoding shift is calculated as the distance between the two. This method is quite crude and only becomes reliable for ciphertexts of at least a paragraph in length.
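The attack can be sketched in a few lines of Python. The names \texttt{guess\_shift} and \texttt{decrypt} are hypothetical and the code is not taken from the tool; it assumes an uppercase A--Z ciphertext and reuses the ciphertext from the example above.
\begin{lstlisting}
from collections import Counter

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def guess_shift(ciphertext):
    """Assume the most common ciphertext letter stands for E."""
    most_common_letter, _ = Counter(ciphertext).most_common(1)[0]
    return (ALPHABET.index(most_common_letter) - ALPHABET.index("E")) % 26


def decrypt(ciphertext, shift):
    """Undo a Caesar shift on an uppercase A-Z string."""
    return "".join(ALPHABET[(ALPHABET.index(c) - shift) % 26] for c in ciphertext)


ciphertext = "ZQZMTZHKDMZZQZMTIVODJIZQZMTOMDWZOCJPBCODORJPGYZIYDIVWDOHJMZYZXZIORVT"
shift = guess_shift(ciphertext)
print(shift, decrypt(ciphertext, shift))
\end{lstlisting}
Here the most frequent ciphertext letter is Z, which lies 21 places after E, so the sketch recovers the shift 21 and the same plaintext as the example above. On a short text whose most common letter is not ``e'' the guess would be wrong, which is why the method needs longer ciphertexts.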
\subsection{\texttt{monoalpha\_tool\_gui.py}}
This file provides a Tk GUI for the program. The functionality remains the same, but some users may prefer buttons to command line arguments. The program comprises two screens, one for each mode, and a button switches between them at any time. The top part of the window contains the plaintext/ciphertext entry field; the middle part provides the encryption/decryption options, such as the alphabet shift or a custom dictionary. As before, the letters of a custom dictionary may be entered in either case. The resulting ciphertext/plaintext is displayed in a text field at the bottom of the window. The program starts in encryption mode. It is best explained by the following screenshots.
\newpage
\begin{figure}[!h]
\centering
\includegraphics[scale=0.5]{02shiftenc.jpg}
\caption{Encoding text by selecting an alphabet shift}
\vspace{20pt}
\includegraphics[scale=0.5]{03dictenc.jpg}
\caption{Encoding text using a custom dictionary}
\end{figure}
\newpage
\begin{figure}[!h]
\centering
\includegraphics[scale=0.5]{05shiftdec.jpg}
\caption{Decoding ciphertext with known encryption shift}
\vspace{20pt}
\includegraphics[scale=0.5]{06guessdec.jpg}
\caption{Decoding ciphertext by frequency analysis}
\end{figure}
\FloatBarrier
\newpage
\section{Sources and links}
\begin{itemize}
\item \href{https://en.wikipedia.org/wiki/Substitution_cipher}{Wikipedia article on substitution ciphers}
\item \href{https://en.wikipedia.org/wiki/Caesar_cipher}{Wikipedia article on the Caesar cipher}
\item \href{https://git.dotya.ml/2EEEB/VTI_aux/src/branch/master/code/crypto}{Git repository containing the described tools}
\end{itemize}
\end{document}