mirror of
https://github.com/BLAKE3-team/BLAKE3
synced 2024-05-18 08:06:07 +02:00

The official C implementation of BLAKE3.

# Example

An example program that hashes bytes from standard input and prints the
result:

```c
#include "blake3.h"
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
  // Initialize the hasher.
  blake3_hasher hasher;
  blake3_hasher_init(&hasher);

  // Read input bytes from stdin until end of file, stopping on read errors.
  unsigned char buf[65536];
  while (1) {
    ssize_t n = read(STDIN_FILENO, buf, sizeof(buf));
    if (n > 0) {
      blake3_hasher_update(&hasher, buf, (size_t)n);
    } else if (n == 0) {
      break; // end of file
    } else {
      fprintf(stderr, "read failed: %s\n", strerror(errno));
      return 1;
    }
  }

  // Finalize the hash. BLAKE3_OUT_LEN is the default output length, 32 bytes.
  uint8_t output[BLAKE3_OUT_LEN];
  blake3_hasher_finalize(&hasher, output, BLAKE3_OUT_LEN);

  // Print the hash as hexadecimal.
  for (size_t i = 0; i < BLAKE3_OUT_LEN; i++) {
    printf("%02x", output[i]);
  }
  printf("\n");
  return 0;
}
```

If you save the example code above as `example.c`, and you're on x86\_64
with a Unix-like OS, you can compile a working binary like this:

```bash
gcc -O3 -o example example.c blake3.c blake3_dispatch.c blake3_portable.c \
    blake3_sse41_x86-64_unix.S blake3_avx2_x86-64_unix.S blake3_avx512_x86-64_unix.S
```

# API

## The Struct

```c
typedef struct {
  // private fields
} blake3_hasher;
```

An incremental BLAKE3 hashing state, which can accept any number of
updates. This implementation doesn't allocate any heap memory, but
`sizeof(blake3_hasher)` itself is relatively large, currently 1912 bytes
on x86-64. This size can be reduced by restricting the maximum input
length, as described in Section 5.4 of [the BLAKE3
spec](https://github.com/BLAKE3-team/BLAKE3-specs/blob/master/blake3.pdf),
but this implementation doesn't currently support that strategy.

## Common API Functions

```c
void blake3_hasher_init(
  blake3_hasher *self);
```

Initialize a `blake3_hasher` in the default hashing mode.

```c
void blake3_hasher_update(
  blake3_hasher *self,
  const void *input,
  size_t input_len);
```

Add input to the hasher. This can be called any number of times.

```c
void blake3_hasher_finalize(
  const blake3_hasher *self,
  uint8_t *out,
  size_t out_len);
```

Finalize the hasher and emit an output of any length. This doesn't
modify the hasher itself, and it's possible to finalize again after
adding more input. The constant `BLAKE3_OUT_LEN` provides the default
output length, 32 bytes.

## Less Common API Functions

```c
void blake3_hasher_init_keyed(
  blake3_hasher *self,
  const uint8_t key[BLAKE3_KEY_LEN]);
```

Initialize a `blake3_hasher` in the keyed hashing mode. The key must be
exactly 32 bytes.

```c
void blake3_hasher_init_derive_key_raw(
  blake3_hasher *self,
  const void *context,
  size_t context_len);
```

Initialize a `blake3_hasher` in the key derivation mode. Key material
is to be given as input after initialization, using
`blake3_hasher_update`. The key derivation `context`
should follow the __Key Derivation Context Guidelines__
described below. `context_len` indicates the size of `context` in bytes.

```c
void blake3_hasher_init_derive_key(
  blake3_hasher *self,
  const char *context);
```

Similar to `blake3_hasher_init_derive_key_raw`, except it takes the key
derivation `context` as a null-terminated C string.

This function is offered as a convenience. It is recommended to use this
function when passing a literal, hardcoded C string as the parameter.

Notice that, unlike `blake3_hasher_init_derive_key_raw`, this function
cannot accept `context`s containing the byte `0x00` except as the
terminating byte. For this reason, `blake3_hasher_init_derive_key_raw` is
preferred in more general contexts, such as when implementing bindings to
this C library.

```c
void blake3_hasher_finalize_seek(
  const blake3_hasher *self,
  uint64_t seek,
  uint8_t *out,
  size_t out_len);
```

The same as `blake3_hasher_finalize`, but with an additional `seek`
parameter for the starting byte position in the output stream. To
efficiently stream a large output without allocating memory, call this
function in a loop, incrementing `seek` by the output length each time.

## Key Derivation Context Guidelines

The key derivation context should uniquely describe the
application, place and purpose of the derivation.

The context should be **statically known,
hardcoded, globally unique, and application-specific**.

The context should not depend on any dynamic input such as salts,
nonces, or identifiers read from a database at runtime.

A good format for the context string is:

```
[application] [commit timestamp] [purpose]
```

For example:

```
example.com 2019-12-25 16:18:03 session tokens v1
```

It's recommended that the context string consists of ASCII bytes
containing only alphanumeric characters, whitespace and punctuation.
However, any bytes are acceptable as long as they satisfy the
static constraints described above.

# Building

This implementation is just C and assembly files. It doesn't include a
public-facing build system. (The `Makefile` in this directory is only
for testing.) Instead, the intention is that you can include these files
in whatever build system you're already using. This section describes
the commands your build system should execute, or which you can execute
by hand. Note that these steps may change in future versions.

## x86

Dynamic dispatch is enabled by default on x86. The implementation will
query the CPU at runtime to detect SIMD support, and it will use the
widest instruction set available. By default, `blake3_dispatch.c`
expects to be linked with code for four different instruction sets:
portable C, SSE4.1, AVX2, and AVX-512.

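
To illustrate the idea of runtime detection (not the actual code in `blake3_dispatch.c`), the sketch below picks the widest of the four options using the GCC/Clang builtin `__builtin_cpu_supports`. The function name `widest_simd` is hypothetical, and on non-x86 targets or other compilers it simply falls back to the portable choice.

```c
// Illustrative only: a simplified picture of "query the CPU, pick the widest
// instruction set". This is NOT the detection logic used by BLAKE3 itself.
const char *widest_simd(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  // GCC and Clang expose CPUID-based feature checks through these builtins.
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512vl")) {
    return "AVX-512";
  }
  if (__builtin_cpu_supports("avx2")) {
    return "AVX2";
  }
  if (__builtin_cpu_supports("sse4.1")) {
    return "SSE4.1";
  }
#endif
  // No SIMD support detected, or not an x86 target with these builtins.
  return "portable";
}
```
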
For each of the x86 SIMD instruction sets, two versions are available,
one in assembly (with three flavors: Unix, Windows MSVC, and Windows
GNU) and one using C intrinsics. The assembly versions are generally
preferred: they perform better, they perform more consistently across
different compilers, and they build more quickly. On the other hand, the
assembly versions are x86\_64-only, and you need to select the right
flavor for your target platform.

Here's an example of building a shared library on x86\_64 Linux using
the assembly implementations:

```bash
gcc -shared -O3 -o libblake3.so blake3.c blake3_dispatch.c blake3_portable.c \
    blake3_sse41_x86-64_unix.S blake3_avx2_x86-64_unix.S blake3_avx512_x86-64_unix.S
```

When building the intrinsics-based implementations, you need to build
each implementation separately, with the corresponding instruction set
explicitly enabled in the compiler. Here's the same shared library using
the intrinsics-based implementations:

```bash
gcc -c -fPIC -O3 -msse4.1 blake3_sse41.c -o blake3_sse41.o
gcc -c -fPIC -O3 -mavx2 blake3_avx2.c -o blake3_avx2.o
gcc -c -fPIC -O3 -mavx512f -mavx512vl blake3_avx512.c -o blake3_avx512.o
gcc -shared -O3 -o libblake3.so blake3.c blake3_dispatch.c blake3_portable.c \
    blake3_avx2.o blake3_avx512.o blake3_sse41.o
```

Note that building `blake3_avx512.c` requires both `-mavx512f` and
`-mavx512vl` under GCC and Clang, as shown above. Under MSVC, the single
`/arch:AVX512` flag is sufficient. The MSVC equivalent of `-mavx2` is
`/arch:AVX2`. MSVC enables SSE4.1 by default, and it doesn't have a
corresponding flag.

If you want to omit SIMD code on x86, you need to explicitly disable
each instruction set. Here's an example of building a shared library on
x86 with only portable code:

```bash
gcc -shared -O3 -o libblake3.so -DBLAKE3_NO_SSE41 -DBLAKE3_NO_AVX2 -DBLAKE3_NO_AVX512 \
    blake3.c blake3_dispatch.c blake3_portable.c
```

## ARM NEON

The NEON implementation is not enabled by default on ARM, since not all
ARM targets support it. To enable it, set `BLAKE3_USE_NEON=1`. Here's an
example of building a shared library on ARM Linux with NEON support:

```bash
gcc -shared -O3 -o libblake3.so -DBLAKE3_USE_NEON blake3.c blake3_dispatch.c \
    blake3_portable.c blake3_neon.c
```

Note that on some targets (ARMv7 in particular), extra flags may be
required to activate NEON support in the compiler. If you see an error
like...

```
/usr/lib/gcc/armv7l-unknown-linux-gnueabihf/9.2.0/include/arm_neon.h:635:1: error: inlining failed
in call to always_inline ‘vaddq_u32’: target specific option mismatch
```

...then you may need to add something like `-mfpu=neon-vfpv4
-mfloat-abi=hard`.

## Other Platforms

The portable implementation should work on most other architectures. For
example:

```bash
gcc -shared -O3 -o libblake3.so blake3.c blake3_dispatch.c blake3_portable.c
```

# Differences from the Rust Implementation

The single-threaded Rust and C implementations use the same algorithms,
and their performance is the same if you use the assembly
implementations or if you compile the intrinsics-based implementations
with Clang. (Both Clang and rustc are LLVM-based.)

The C implementation doesn't currently include any multithreading
optimizations. OpenMP support or similar might be added in the future.