mirror of https://github.com/BLAKE3-team/BLAKE3 synced 2024-06-01 17:46:03 +02:00
Commit Graph

269 Commits

Author SHA1 Message Date
Jack O'Connor 6fbc1a679d version 0.3.0
Changes since version 0.2.3:
- The optimized assembly implementations are now built by default. They
  perform better than the intrinsics implementations, and they compile
  much more quickly. Bringing the default behavior in line with reported
  benchmark figures should also simplify things for people running their
  own benchmarks. Previously this crate only built Rust intrinsics
  implementations by default, and the assembly implementations were
  gated by the (slightly confusingly named) "c" feature. Now the "c"
  feature is gone, and applications that need the old behavior can use
  the new "pure" feature. Mainly this will be applications that don't
  want to require a C compiler. Note that the `b3sum` crate previously
  activated the "c" feature by default, so its behavior hasn't changed.
2020-03-30 00:36:13 -04:00
Jack O'Connor 152740578e add testing-only flags to disable individual instruction sets
This lets CI test a wider range of possible SIMD support settings.
2020-03-29 23:12:47 -04:00
Jack O'Connor e06a0f255a refactor the Cargo feature set
The biggest change here is that assembly implementations are enabled by
default.

Added features:
- "pure" (Pure Rust, with no C or assembly implementations.)

Removed features:
- "c" (Now basically the default.)

Renamed features:
- "c_prefer_intrinsics" -> "prefer_intrinsics"
- "c_neon" -> "neon"

Unchanged:
- "rayon"
- "std" (Still the only feature on by default.)
2020-03-29 18:02:03 -04:00
Jack O'Connor 7caf1ad4bb version 0.2.3
Changes since version 0.2.2:
- Bug fix: Commit 13556be fixes a crash on Windows when using the SSE4.1
  assembly implementation (--features=c, set by default for b3sum). This
  is undefined behavior and therefore a potential security issue.
- b3sum now supports the --num-threads flag.
- The C API now includes a blake3_hasher_finalize_seek() function, which
  returns output from any position in the extended output stream.
- Build fix: Commit 5fad419 fixes a compiler error in the AVX-512 C
  intrinsics implementation targeting the Windows GNU ABI.
2020-03-29 01:44:00 -04:00
Jack O'Connor 5fad419a8d add a Windows GNU AVX-512 build break workaround
The break in question only repros under --release, and we didn't start
testing a release build of the prefer-intrinsics mode until just now.
2020-03-29 01:42:41 -04:00
Jack O'Connor 96c36d5df9 add more --release mode testing 2020-03-29 01:01:30 -04:00
Samuel Neves 13556be388 save missing clobbered registers on Windows 2020-03-29 05:53:37 +01:00
Jack O'Connor be4e7babee print instruction set support quietly 2020-03-28 19:51:54 -04:00
Jack O'Connor f77c8ffd7c print out instruction set support in CI 2020-03-28 19:30:28 -04:00
Jack O'Connor eb50d82f16 add release assembly tests 2020-03-28 19:15:25 -04:00
Jack O'Connor c26a37f70c C files -> C and assembly files 2020-03-25 17:25:48 -04:00
Jack O'Connor c3639b4255 c/README.md changes
The C implementation now supports output seeking. Also expand the API
section a bit, and reorganize things to put the example on top.
2020-03-25 17:11:36 -04:00
Jack O'Connor a4ceef3932 add blake3_hasher_finalize_seek to the C API 2020-03-25 17:11:36 -04:00
Jack O'Connor 4feadee6bb disable fail-fast for cross tests too 2020-03-24 16:45:33 -04:00
Jack O'Connor 9d77bd6958 correct a comment 2020-03-17 14:26:39 -04:00
Jack O'Connor c8f93a32bb add links to other implementations in the readme 2020-03-16 17:38:08 -04:00
Jack O'Connor 470d42a05a update b3sum/README.md 2020-03-16 12:26:16 -04:00
Jack O'Connor a0355ba8e0 add the --num-threads flag
As part of this change, make the rayon and memmap dependencies
mandatory. This simplifies the code a lot, and I'm not aware of any
callers who build b3sum without the default dependencies.

If --num-threads is not given, or if its value is 0, b3sum will still
respect the RAYON_NUM_THREADS environment variable.
2020-03-16 12:24:03 -04:00
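
The fallback rule described in the commit above can be sketched as a small pure function. This is an illustration only, not b3sum's actual code: the name `resolve_threads` and its signature are hypothetical, and the environment value is passed in as a parameter so the logic can be checked without touching the real environment.

```rust
/// Hypothetical helper (not from b3sum): decide how many threads to request.
/// `flag` is the parsed value of --num-threads, if the flag was given.
/// `env_value` is the contents of RAYON_NUM_THREADS, if it is set.
/// Returns None when the thread pool should choose its own default.
fn resolve_threads(flag: Option<usize>, env_value: Option<&str>) -> Option<usize> {
    match flag {
        // An explicit, non-zero flag value wins.
        Some(n) if n > 0 => Some(n),
        // A missing flag or a value of 0 defers to the environment variable.
        _ => env_value
            .and_then(|v| v.parse::<usize>().ok())
            .filter(|&n| n > 0),
    }
}
```

Under these assumptions, `resolve_threads(Some(0), Some("2"))` defers to the environment value, while `resolve_threads(Some(4), Some("2"))` keeps the explicit flag.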
Jack O'Connor d925728aed wrap --help output to 80 columns 2020-03-15 15:47:58 -04:00
Jack O'Connor 1f529a841c add an example of parsing a Hash from a hex string
Suggested by @zaynetro:
https://github.com/BLAKE3-team/BLAKE3/pull/24#issuecomment-594369061
2020-03-05 10:54:22 -05:00
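
For readers without the crate at hand, the idea of turning a 64-character hex string back into 32 hash bytes can be sketched with the standard library alone. This is not the example added by the commit above; `parse_hash_hex` is a hypothetical helper name, and the result is a plain byte array rather than the crate's `Hash` type.

```rust
/// Hypothetical helper: decode a 64-character hex string into 32 bytes.
/// Returns None if the length is wrong or any character is not a hex digit.
fn parse_hash_hex(hex: &str) -> Option<[u8; 32]> {
    let bytes = hex.as_bytes();
    if bytes.len() != 64 {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, pair) in bytes.chunks(2).enumerate() {
        // Each pair of hex digits encodes one output byte, high nibble first.
        let hi = (pair[0] as char).to_digit(16)?;
        let lo = (pair[1] as char).to_digit(16)?;
        out[i] = (hi * 16 + lo) as u8;
    }
    Some(out)
}
```

Both uppercase and lowercase digits are accepted in this sketch, because `char::to_digit(16)` handles either case.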
Jack O'Connor 48f2f745d9 clean up the C example a bit 2020-03-01 17:33:36 -05:00
Jack O'Connor 0432f9c7a3 some comment typos 2020-02-27 09:52:46 -05:00
Jack O'Connor c197a773ac version 0.2.2
Changes since 0.2.1 (and since c-0.2.0):
- Fix a performance issue when the caller makes multiple calls to
  update() with uneven lengths. (#69, reported by @willbryant.)
2020-02-25 12:15:27 -05:00
Jack O'Connor 8d84cfc0af remove a mis-optimization that hurt performance for uneven updates
If the total number of chunks hashed so far is e.g. 1, and update() is
called with e.g. 8 more chunks, we can't compress all 8 together. We
have to break the input up, to make sure that that 1 lone chunk CV gets
merged with its proper sibling, and that in general the correct layout
of the tree is preserved. What we should do is hash 1-2-4-1 chunks of
input, using increasing powers of 2 (with some cleanup at the end). What
we were doing was 2-2-2-2 chunks. This was the result of a mistaken
optimization that got us stuck with an always-odd number of chunks so
far.

Fixes https://github.com/BLAKE3-team/BLAKE3/issues/69.
2020-02-25 11:40:37 -05:00
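
The 1-2-4-1 split described in the commit above can be reproduced with a short sketch. It is a simplification that counts whole chunks only and assumes one rule: each group is the largest power of two that both fits in the remaining input and divides the number of chunks hashed so far. The function name `group_sizes` is hypothetical, and this is not the crate's actual `update()` code.

```rust
/// Hypothetical helper: split `incoming` whole chunks into power-of-two
/// groups, given that `chunks_so_far` chunks have already been hashed.
fn group_sizes(mut chunks_so_far: u64, mut incoming: u64) -> Vec<u64> {
    let mut groups = Vec::new();
    while incoming > 0 {
        // Start from the largest power of two that fits in the remaining input.
        let mut size = 1u64 << (63 - incoming.leading_zeros());
        // Shrink until the group starts on a boundary aligned to its own size,
        // so that every subtree merges with its proper sibling.
        while chunks_so_far % size != 0 {
            size /= 2;
        }
        groups.push(size);
        chunks_so_far += size;
        incoming -= size;
    }
    groups
}
```

With 1 chunk already hashed and 8 more arriving, `group_sizes(1, 8)` yields `[1, 2, 4, 1]`, matching the sequence in the commit message. Starting from an empty state, `group_sizes(0, 8)` yields a single group of 8.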
Jack O'Connor 74b5fe9054 update the red bar chart with the figure from the asm implementation 2020-02-21 18:14:31 -05:00
Jack O'Connor 9f6104c8ed add examples to the b3sum readme 2020-02-19 16:48:53 -05:00
Jack O'Connor fdd329ba57 check for AVX-512 compiler support even when using assembly 2020-02-14 11:59:59 -05:00
Jack O'Connor 865d201722 version 0.2.1
Changes since 0.2.0:
- Workarounds in the assembly implementations (enabled by the "c"
  feature), to build with older compilers.
2020-02-14 11:20:03 -05:00
Jack O'Connor fdeb3a38ee tag the first release of the C implementation, c-0.2.0
This release is motivated by a fix for a potential security
vulnerability. 421a21abd8 fixes a bug
introduced in a1c4c4efb5. A truncated
pointer register led to a segfault on x86-64 under Clang 7 and 8.
Clang 9 happens to be unaffected, but the behavior is undefined in
general. See also:
https://github.com/BLAKE3-team/BLAKE3/issues/60#issuecomment-585838317

The C implementation of BLAKE3 hasn't been formally packaged anywhere,
and most callers vendor code from master. This release tag is intended
to make the fix above more visible, to encourage callers to update their
vendored copies. We will continue to publish tags like this whenever
bugs in the C implementation are fixed, or if there are any incompatible
API changes.

Note that the issue above does not impact callers of the Rust `blake3`
crate. The affected file, `blake3_dispatch.c`, is not compiled by that
crate in any configuration. It does impact callers of the internal
`blake3_c_rust_bindings` crate, but that crate is not published on
crates.io and not intended for production use.
2020-02-13 12:30:10 -05:00
Samuel Neves 421a21abd8 Fix bug inadvertently introduced in a1c4c4efb5 2020-02-13 16:08:07 +00:00
Samuel Neves 207915a751 Work around GCC bug 85328 by forcing trivially masked stores.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85328

Fixes #58.
2020-02-13 15:22:17 +00:00
Samuel Neves fa6f14cafa Work around clang bug 36144 by replacing anonymous label numbers.
https://bugs.llvm.org/show_bug.cgi?id=36144

Fixes #60.
2020-02-13 15:22:17 +00:00
Jack O'Connor fcc14c8c1b more file renaming, use underscores more consistently 2020-02-12 18:41:41 -05:00
Erik Johansson 0281f1ae16 Rename assembly files (blake3-* -> blake3_*)
This gives the assembly files the same prefix as the intrinsics files which
simplifies building when the build system should pick between the assembly and
the intrinsics files.
2020-02-12 23:08:44 +01:00
Jack O'Connor afdaf3036b version 0.2.0
Changes since 0.1.5:
- The `c_avx512` feature has been replaced by the `c` feature. In
  addition to providing AVX-512 support, `c` also provides optimized
  assembly implementations. These assembly implementations perform
  better, perform more consistently across compilers, and compile more
  quickly. As before, `c` is off by default, but the `b3sum` binary
  crate activates it by default.
- The `rayon` feature no longer affects the entire API. Instead, it
  provides the `join::RayonJoin` type for use with
  `Hasher::update_with_join`, so that the caller can control when
  multi-threading happens. Standalone API functions like `hash` are
  always single-threaded now.
2020-02-12 14:57:57 -05:00
Jack O'Connor 724e784fe2 merge the version 0.1.5 branch
Version 0.1.5 was a backport release to mitigate
https://github.com/BLAKE3-team/BLAKE3/issues/57. This is a no-op merge
to make sure that the 0.1.5 branch shows up in `git log master`.
2020-02-12 14:57:00 -05:00
Jack O'Connor 5dea889834 add a performance note and a usage example for Hasher 2020-02-12 14:38:35 -05:00
Jack O'Connor 38a46ba8ae document optional Cargo features on docs.rs 2020-02-12 14:20:11 -05:00
Jack O'Connor 1c4d7fdd8d add test_asm to the C Makefile 2020-02-12 13:12:05 -05:00
Jack O'Connor 7ee05ba3bd document how to build the C code with assembly implementations 2020-02-12 13:04:03 -05:00
Jack O'Connor b8a1d2d982 integrate assembly implementations into blake3_c_rust_bindings 2020-02-12 10:23:17 -05:00
Jack O'Connor efbfa0463c integrate assembly implementations into the blake3 crate 2020-02-12 10:23:17 -05:00
Samuel Neves b6b3c27824 assembly implementations 2020-02-12 10:23:17 -05:00
Jack O'Connor 1c5d4eea6a test a couple more reset() cases 2020-02-12 10:22:54 -05:00
Jack O'Connor e0dc4d932e use a non-zero value for counter when testing hash_many with parents
We use a counter value that's very close to wrapping the lower word,
when we're testing the hash_many chunks case. It turns out that this is
a useful thing to do with parents too, even though parents 1) are
teeechnically supposed to always use a counter of 0, and 2) aren't going
to increment the counter at all. We caught a bug in the assembly
implementations this way (where we accidentally did increment the
counter, but only the higher word), because the equivalent test in
rust_c_bindings uses this eccentric parents counter value.
2020-02-11 23:45:41 -05:00
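
To see why a counter close to wrapping the lower word makes a good test value, consider how a 64-bit counter splits into two 32-bit words. The sketch below is illustrative only; `counter_words` and `near_wrap_counter` are hypothetical names, not functions from the test suite.

```rust
/// Hypothetical helper: split a 64-bit counter into (low, high) 32-bit words.
fn counter_words(counter: u64) -> (u32, u32) {
    (counter as u32, (counter >> 32) as u32)
}

/// Hypothetical helper: a starting counter one step below the point where
/// the low word wraps, so the very next increment must carry into the high word.
fn near_wrap_counter() -> u64 {
    (1u64 << 32) - 1
}
```

Starting from `near_wrap_counter()`, the words are `(u32::MAX, 0)`. One increment later they are `(0, 1)`. An implementation that mishandles the carry, or that touches only one of the two words, produces a visibly different pair, which a counter starting at 0 would not expose.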
Jack O'Connor ec34043b45 add cross testing on i686 to CI 2020-02-11 13:58:26 -05:00
Jack O'Connor 30671b1c05 version 0.1.5
Changes since 0.1.4:
- Remove all AVX-512 code from builds with the default feature set. This
  works around https://github.com/rust-lang/rust/issues/68905 and fixes
  the nightly build as long as the "c_avx512" feature is not activated.

This release is a backport of a single commit, e43a7d6. The master
branch contains backwards-incompatible changes (fc219f4), and the next
release of master will be version 0.2.0.

Note that the `b3sum` crate activates the "c_avx512" feature by default,
and it will continue to fail to build on nightly until the upstream bug
is fixed.
2020-02-10 16:00:37 -05:00
Jack O'Connor e43a7d68bc avoid compiling avx512_detected() when the "c_avx512" feature is disabled
https://github.com/rust-lang/rust/issues/68905 is currently causing
nightly builds to fail, unless `--no-default-features` is used. This
change means that the default build will succeed, and the failure will
only happen when the "c_avx512" feature is enabled. The `b3sum` crate will still
fail to build on nightly, because it enables that feature, but most
callers should start succeeding on nightly.
2020-02-10 15:54:52 -05:00
Jack O'Connor af2e791602 avoid compiling avx512_detected() when the "c_avx512" feature is disabled
https://github.com/rust-lang/rust/issues/68905 is currently causing
nightly builds to fail, unless `--no-default-features` is used. This
change means that the default build will succeed, and the failure will
only happen when the "c_avx512" feature is enabled. The `b3sum` crate will still
fail to build on nightly, because it enables that feature, but most
callers should start succeeding on nightly.
2020-02-10 15:25:23 -05:00
Jack O'Connor c0a43e5fb8 add the Windows GNU toolchain to CI 2020-02-07 13:46:42 -05:00