1
0
Fork 0
mirror of https://github.com/BLAKE3-team/BLAKE3 synced 2024-05-09 03:26:16 +02:00
Commit Graph

371 Commits

Author SHA1 Message Date
Jack O'Connor 35aa4259bd version 0.3.7
Changes since 0.3.6:
- BUGFIX: The C implementation was incorrect on big endian systems for
  inputs longer than 1024 bytes. This bug affected all previous versions
  of the C implementation. Little endian platforms like x86 were
  unaffected. The Rust implementation was also unaffected.
  @jakub-zwolakowski and @pascal-cuoq from TrustInSoft reported this
  bug: https://github.com/BLAKE3-team/BLAKE3/pull/118
- BUGFIX: The C build on x86-64 was producing binaries with an
  executable stack. @tristanheaven reported this bug:
  https://github.com/BLAKE3-team/BLAKE3/issues/109
- @mkrupcale added optimized implementations for SSE2. This improves
  performance on older x86 processors that don't support SSE4.1.
- The C implementation now exposes the
  `blake3_hasher_init_derive_key_raw` function, to make it easier to
  implement language bindings. Added by @k0001.
2020-10-01 10:00:06 -04:00
Jack O'Connor 3d212291b9 add cross_test.sh for the C bindings
This will let us add big endian testing to CI for our C code. (We were
already doing it for our Rust code.)

This is adapted from test_vectors/cross_test.sh. It works around the
limitation that the `cross` tool can't reach parent directories. It's an
unfortunate hack, but at least it's only for testing. It might've been
less hacky to use symlinks for this somehow, but I worry that would
break things on Windows, and I don't want to have to add workarounds for
my workarounds.
2020-09-29 16:48:18 -04:00
Jack O'Connor 0b13637ae3 fix a couple of big-endianness mistakes in blake3.c
Kudos to @pascal-cuoq and @jakub-zwolakowski from TrustInSoft for
catching these bugs.

Original report: https://github.com/BLAKE3-team/BLAKE3/pull/118
2020-09-29 16:09:28 -04:00
Jack O'Connor 3817999f17 fix the short_test_cases loop in the C bindings tests 2020-09-29 11:06:32 -04:00
Jack O'Connor 5bdfd07666 update the blake3_c_rust_bindings test cases also 2020-09-29 10:59:56 -04:00
Jack O'Connor ae3e8e6b3a add more test cases at shorter input lengths 2020-09-29 10:51:49 -04:00
Jack O'Connor b54f8ff5ba tweak the readme description of the benchmark chart 2020-09-24 14:12:14 -04:00
Jack O'Connor d2a23f5330 add a docs.rs badge 2020-09-15 11:49:09 -04:00
Jack O'Connor e70bc965e3 use an absolute url for https://github.com/BLAKE3-team/BLAKE3/blob/master/b3sum/what_does_check_do.md 2020-09-14 11:17:39 -04:00
Jack O'Connor 6785d7bc0c remove an outdated section of the b3sum readme 2020-09-14 11:05:15 -04:00
Jack O'Connor a01fd16011 add some horizontal rules to the C readme 2020-09-10 17:38:35 -04:00
Jack O'Connor ac1da75bb9 add a test for blake3_hasher_init_derive_key_raw 2020-09-10 16:52:14 -04:00
Jack O'Connor 44fd9efbc2 C readme edits 2020-09-10 16:40:25 -04:00
Jack O'Connor 004b39a350 cargo fmt 2020-09-10 15:55:02 -04:00
Jack O'Connor 27b7f610e0
Merge pull request #114 from k0001/no-cstr
C: Add blake3_hasher_init_derive_key_len
2020-09-10 14:54:15 -05:00
Jack O'Connor cc04130eaa cover the no_sse2 flags in CI testing 2020-09-02 12:23:49 -04:00
Jack O'Connor 38bf1cf3a0 s/multi-threading/multithreading/ 2020-09-01 09:52:36 -04:00
Jack O'Connor 9829abee72 mention @mkrupcale's SSE2 implementation in the readme 2020-09-01 09:47:24 -04:00
Renzo Carbonara b205e0efa1 C: rename blake3_hasher_init_derive_key_raw and documentation 2020-09-01 13:20:16 +03:00
Jack O'Connor 5b22bf57c8 add i586-unknown-linux-musl as a test target
Samuel noticed that rustc seems to assume (incorrectly?) that all i686
targets support SSE2, but it doesn't make that assumption for i586.
2020-08-31 18:25:38 -04:00
Jack O'Connor 3c1db55529 add the dynamic check for SSE2 support
It will be very rare that this actually executes, but we should include
it for completeness.
2020-08-31 18:25:38 -04:00
Jack O'Connor a79fec7e39 fix a build break on x86 targets without guaranteed SSE2 support
This is quite hard to trigger, because SSE2 has been guaranteed for a
long time. But you could trigger it this way:

    rustup target add i686-unknown-linux-musl
    RUSTFLAGS="-C target-cpu=i386" cargo build --target i686-unknown-linux-musl

Note a relevant gotcha though: The `cross` tool will not forward
environment variables like RUSTFLAGS to the container by default, so if
you're testing with `cross` you'll need to use the `rustc` command to
explicitly pass the flag, as I've done here in ci.yml. (Or you could
create a `Cross.toml` file, but I don't want to commit one of those if I
can avoid it.)
2020-08-31 18:25:38 -04:00
Samuel Neves 8610ebda6a add sse2 tests and benchmarks 2020-08-31 19:12:01 +01:00
Samuel Neves bf705f2d54 remove avoidable spill 2020-08-31 19:11:58 +01:00
Samuel Neves 3340e32c7f
Merge pull request #110 from mkrupcale/sse2
Add SSE2 implementations
2020-08-31 18:56:55 +01:00
Matthew Krupcale be2da69b6b C: asm: simplify pblendw emulation
Use statically calculated ~mask. This reduces the number of moves and registers necessary at the expense of an extra memory load. This is probably a good trade-off since we are not bound by memory uops in this loop.
2020-08-31 12:12:42 -04:00
Nikolai Vazquez 324090b2c3 Implement `fmt::Debug` using builders
This enables pretty printing via `{:#?}`. The normal style for `{:?}` is
kept exactly the same.
2020-08-31 12:04:40 -04:00
Matthew Krupcale 47e415c7f1 C: asm: simplify pinsrd emulation
Use punpckl{,q}dq instead of pinsrw.
2020-08-31 00:21:47 -04:00
Matthew Krupcale c592e9a3f6 C: asm: remove blendvps usage altogether
This simplifies the operation by removing the need to use blendvps at all.
2020-08-30 23:13:47 -04:00
Renzo Carbonara 31e4080aa2 C: Add blake3_hasher_init_derive_key_len
blake3_hasher_init_derive_key_len is an alternative version of
blake3_hasher_init_derive_key which takes the context and its
length as separate parameters, and not together as a C string.

The motivation for this addition is making it easier for
bindings to this C library to call this function without
having to first copy over the context bytes just to add
one 0x00 byte at the end.

Notice that contrary to blake3_hasher_init_derive_key,
blake3_hasher_init_derive_key_len allows the inclusion of a
0x00 byte in the context. Given the rules about context string
selection, this byte is unlikely to be used as part of a context
string. But if for some reason it is ever given, it will be
included in the context string and processed like any other
non-alphanumeric byte would. For compatibility with
blake3_hasher_init_derive_key, bindings should still check for
the absence of 0x00 bytes.
2020-08-30 12:27:33 +03:00
Jack O'Connor c8a5b53e1d wording tweak in the C readme 2020-08-26 16:55:39 -04:00
Matthew Krupcale c33a8462d1 Write _mm_blend_epi16 emulation without multiplication
Use _mm_and_si128 and _mm_cmpeq_epi16 rather than expensive multiplication _mm_mullo_epi16 with _mm_srai_epi16 that compiler may not be able to optimize.
2020-08-25 12:26:15 -04:00
Matthew Krupcale 90e2a924a4 Fix Windows MSVC undefined symbol errors
MSVC returns "error A2006:undefined symbol : FFFFFFFFH", so use 0FFFFFFFFH instead. Also use 0 prefix for 0H to align things.
2020-08-24 21:31:29 -04:00
Matthew Krupcale e581035bd3 Put PBLENDW masks in the RDATA section
Previously, these masks were undefined because they were outside of the RDATA section.
2020-08-24 21:26:41 -04:00
Matthew Krupcale 00849f8625 Fix Windows MSVC undefined symbol errors
MSVC returns "error A2006:undefined symbol : B1H", so use 0B1H instead.
2020-08-24 21:20:10 -04:00
Matthew Krupcale c32660099a Fix unreachable expression compiler warning
SSE2 target_feature appears to always be present for x86_64.
2020-08-24 21:09:56 -04:00
Matthew Krupcale e4681ec39e C: asm: emulate pshufb ROT8 using SSE2 instructions
Use a simple shift for the rotation.

 * c/blake3_sse2_x86-64_unix.S: emulate pshufb using SSE2 instructions for x86_64 unix
 * c/blake3_sse2_x86-64_windows_gnu.S: Likewise for x86_64 Windows GNU.
 * c/blake3_sse2_x86-64_windows_msvc.asm: Likewise for x86_64 Windows MSVC.
2020-08-24 00:57:39 -04:00
Matthew Krupcale 769c7cdc96 C: asm: emulate pshufb ROT16 using SSE2 instructions
Use two 16-bit shuffles: one for the low 64-bits and one for the high 64-bits.

 * c/blake3_sse2_x86-64_unix.S: emulate pshufb using SSE2 instructions for x86_64 unix
 * c/blake3_sse2_x86-64_windows_gnu.S: Likewise for x86_64 Windows GNU.
 * c/blake3_sse2_x86-64_windows_msvc.asm: Likewise for x86_64 Windows MSVC.
2020-08-24 00:57:39 -04:00
Matthew Krupcale 1ef915dbea C: asm: emulate pinsrd using SSE2 instructions
Use two pinsrw and a 16-bit shift to insert the 32-bit integer at the desired location.

 * c/blake3_sse2_x86-64_unix.S: emulate pinsrd using SSE2 instructions for x86_64 unix
 * c/blake3_sse2_x86-64_windows_gnu.S: Likewise for x86_64 Windows GNU.
 * c/blake3_sse2_x86-64_windows_msvc.asm: Likewise for x86_64 Windows MSVC.
2020-08-24 00:57:39 -04:00
Matthew Krupcale e632967a8d C: asm: emulate blendvps using SSE2 instructions
Blend according to (mask & b) | ((~mask) & a).

 * c/blake3_sse2_x86-64_unix.S: emulate blendvps using SSE2 instructions for x86_64 unix
 * c/blake3_sse2_x86-64_windows_gnu.S: Likewise for x86_64 Windows GNU.
 * c/blake3_sse2_x86-64_windows_msvc.asm: Likewise for x86_64 Windows MSVC.
2020-08-24 00:57:28 -04:00
Matthew Krupcale 460c9d3031 C: asm: emulate pblendw using SSE2 instructions
Use a constant mask to blend according to (mask & b) | ((~mask) & a).

 * c/blake3_sse2_x86-64_unix.S: emulate pblendw using SSE2 instructions for x86_64 unix
 * c/blake3_sse2_x86-64_windows_gnu.S: Likewise for x86_64 Windows GNU.
 * c/blake3_sse2_x86-64_windows_msvc.asm: Likewise for x86_64 Windows MSVC.
2020-08-24 00:57:09 -04:00
Matthew Krupcale a9a701c622 SSE2 intrinsic: emulate _mm_shuffle_epi8 SSSE3 intrinsic rot8 with SSE2 intrinsics
Use a simple shift version for the 8-bit rotation.

 * c/blake3_sse2.c: emulate _mm_shuffle_epi8 rot8 using SSE2 intrinsics
2020-08-24 00:56:57 -04:00
Matthew Krupcale 92c8047a15 SSE2 intrinsic: emulate _mm_shuffle_epi8 SSSE3 intrinsic rot16 with SSE2 intrinsics
Use two 16-bit shuffles: one for the low 64-bits and one for the high 64-bits.

 * c/blake3_sse2.c: emulate _mm_shuffle_epi8 rot16 using SSE2 intrinsics
2020-08-24 00:56:46 -04:00
Matthew Krupcale 40a4a2b6b0 SSE2 intrinsic: emulate _mm_blend_epi16 SSE4.1 intrinsic with SSE2 intrinsics
Use a constant mask to blend according to (mask & b) | ((~mask) & a).

 * src/rust_sse2.rs: emulate _mm_blend_epi16 using SSE2 intrinsics
 * c/blake3_sse2.c: Likewise.
2020-08-24 00:55:06 -04:00
Matthew Krupcale d91f20dd29 Start SSE2 implementation based on SSE4.1 version
Wire up basic functions and features for SSE2 support using the SSE4.1 version
as a basis without implementing the SSE2 instructions yet.

 * Cargo.toml: add no_sse2 feature
 * benches/bench.rs: wire SSE2 benchmarks
 * build.rs: add SSE2 rust intrinsics and assembly builds
 * c/Makefile.testing: add SSE2 C and assembly targets
 * c/README.md: add SSE2 to C build instructions
 * c/blake3_c_rust_bindings/build.rs: add SSE2 C rust binding builds
 * c/blake3_c_rust_bindings/src/lib.rs: add SSE2 C rust bindings
 * c/blake3_dispatch.c: add SSE2 C dispatch
 * c/blake3_impl.h: add SSE2 C function prototypes
 * c/blake3_sse2.c: add SSE2 C intrinsic file starting with SSE4.1 version
 * c/blake3_sse2_x86-64_{unix.S,windows_gnu.S,windows_msvc.asm}: add SSE2
   assembly files starting with SSE4.1 version
 * src/ffi_sse2.rs: add rust implementation using SSE2 C rust bindings
 * src/lib.rs: add SSE2 rust intrinsics and SSE2 C rust binding rust SSE2 module
   configurations
 * src/platform.rs: add SSE2 rust platform detection and dispatch
 * src/rust_sse2.rs: add SSE2 rust intrinsic file starting with SSE4.1 version
 * tools/instruction_set_support/src/main.rs: add SSE2 feature detection
2020-08-24 00:54:46 -04:00
Samuel Neves adbf07d67a Fix #109
The default executable stack setting on Linux can be fixed in two different ways:

 - By adding the `.section .note.GNU-stack,"",%progbits` special incantation
 - By passing the `--noexecstack` flag to the assembler

This patch implements both, but only one of them is strictly necessary.

I've also added some additional hardening flags to the Makefile. May not be portable.
2020-08-23 22:32:36 +01:00
Jack O'Connor 8dc30a2737 assembly authorship in the README 2020-08-19 10:52:32 -04:00
Jack O'Connor 09cc03614d the same hex example for rustdocs 2020-08-14 11:33:53 -04:00
Jack O'Connor 88094169fc tweak the readme hex example 2020-08-14 11:30:46 -04:00
Alexx Roche 7589c8c608 How to access the hex inside the Hash()
blake3::hash(b"str") -> Hash(hex)  and I want the `hex` stripped out from the Hash() as a str
2020-08-14 11:22:56 -04:00