mirror of https://github.com/BLAKE3-team/BLAKE3 synced 2024-05-31 00:06:05 +02:00
Commit Graph

72 Commits

Author SHA1 Message Date
Jack O'Connor 5dea889834 add a performance note and a usage example for Hasher 2020-02-12 14:38:35 -05:00
Jack O'Connor 38a46ba8ae document optional Cargo features on docs.rs 2020-02-12 14:20:11 -05:00
Jack O'Connor efbfa0463c integrate assembly implementations into the blake3 crate 2020-02-12 10:23:17 -05:00
Jack O'Connor 1c5d4eea6a test a couple more reset() cases 2020-02-12 10:22:54 -05:00
Jack O'Connor e0dc4d932e use a non-zero value for counter when testing hash_many with parents
We use a counter value that's very close to wrapping the lower word,
when we're testing the hash_many chunks case. It turns out that this is
a useful thing to do with parents too, even though parents 1) are
teeechnically supposed to always use a counter of 0, and 2) aren't going
to increment the counter at all. We caught a bug in the assembly
implementations this way (where we accidentally did increment the
counter, but only the higher word), because the equivalent test in
rust_c_bindings uses this eccentric parents counter value.
2020-02-11 23:45:41 -05:00
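A minimal sketch of why a counter near the low-word boundary makes a good test value, assuming the 64-bit counter is handed to the compression function as two 32-bit words. The names `counter_words`, `counter_for_input`, and `NEAR_WRAP` are hypothetical and are not taken from the crate.

```rust
/// Split a 64-bit counter into (low, high) 32-bit words.
pub fn counter_words(counter: u64) -> (u32, u32) {
    (counter as u32, (counter >> 32) as u32)
}

/// Illustrative test value: one step away from wrapping the low word.
pub const NEAR_WRAP: u64 = (1 << 32) - 1;

/// Counter for the input at position `index` in a batch.
/// Chunks increment the counter per input; parents keep it unchanged.
pub fn counter_for_input(base: u64, index: u64, increment: bool) -> u64 {
    if increment {
        base.wrapping_add(index)
    } else {
        base
    }
}
```

With `NEAR_WRAP` as the base, the second chunk in a batch carries into the high word, while a parent must still see the original pair of words. A test that starts from zero would not distinguish a correct implementation from one that touches the high word by mistake.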
Jack O'Connor af2e791602 avoid compiling avx512_detected() when the "c_avx512" feature is disabled
https://github.com/rust-lang/rust/issues/68905 is currently causing
nightly builds to fail, unless `--no-default-features` is used. This
change means that the default build will succeed, and the failure will
only happen when the "c_avx512" feature is enabled. The `b3sum` crate will still
fail to build on nightly, because it enables that feature, but most
callers should start succeeding on nightly.
2020-02-10 15:25:23 -05:00
Jack O'Connor fc219f4f8d Hasher::update_with_join
This is a new interface that allows the caller to provide a
multi-threading implementation. It's defined in terms of a new `Join`
trait, for which we provide two implementations, `SerialJoin` and
`RayonJoin`. This lets the caller control when multi-threading is used,
rather than the previous all-or-nothing design of the "rayon" feature.

Although existing callers should keep working, this is a compatibility
break, because callers who were relying on automatic multi-threading
before will now be single-threaded. Thus the next release of this crate
will need to be version 0.2.

See https://github.com/BLAKE3-team/BLAKE3/issues/25 and
https://github.com/BLAKE3-team/BLAKE3/issues/54.
2020-02-06 15:07:15 -05:00
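The idea of a caller-supplied join strategy can be sketched as follows. This is an illustration under stated assumptions, not the crate's actual definitions: the trait shape is a guess modeled on the description above, `ThreadJoin` stands in for `RayonJoin` using only `std::thread::scope` (Rust 1.63 or later), and `sum_with_join` is a hypothetical divide-and-conquer function in place of the hasher.

```rust
use std::thread;

/// Hypothetical join abstraction: run two closures and return both results.
pub trait Join {
    fn join<A, B, RA, RB>(oper_a: A, oper_b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send;
}

/// Runs both closures on the calling thread, one after the other.
pub enum SerialJoin {}

impl Join for SerialJoin {
    fn join<A, B, RA, RB>(oper_a: A, oper_b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send,
    {
        (oper_a(), oper_b())
    }
}

/// Stand-in for a thread-pool join: runs the second closure on a scoped thread.
pub enum ThreadJoin {}

impl Join for ThreadJoin {
    fn join<A, B, RA, RB>(oper_a: A, oper_b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send,
    {
        thread::scope(|s| {
            let handle_b = s.spawn(oper_b);
            let ra = oper_a();
            let rb = handle_b.join().expect("worker thread panicked");
            (ra, rb)
        })
    }
}

/// Hypothetical recursive workload; the caller picks the strategy via `J`.
pub fn sum_with_join<J: Join>(data: &[u64]) -> u64 {
    if data.len() <= 16 {
        return data.iter().sum();
    }
    let (left, right) = data.split_at(data.len() / 2);
    let (a, b) = J::join(|| sum_with_join::<J>(left), || sum_with_join::<J>(right));
    a + b
}
```

A caller would write `sum_with_join::<SerialJoin>(&data)` or `sum_with_join::<ThreadJoin>(&data)`. Because the strategy is a type parameter, the choice is made per call site rather than by a crate-wide feature flag.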
Jack O'Connor 24071db346 re-export digest and crypto_mac 2020-02-04 10:02:46 -05:00
Cesar Eduardo Barros a3d42f724d Inline wrapper methods 2020-02-03 17:29:25 -05:00
Jack O'Connor 0651736ff4 make the inherent reset() method return &mut self 2020-02-03 10:21:27 -05:00
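Returning `&mut self` from `reset()` lets a reused hasher be chained in one expression. A minimal sketch with a toy FNV-1a accumulator; `ToyHasher` is hypothetical and is not the crate's `Hasher`.

```rust
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Toy incremental hasher used only to illustrate method chaining.
pub struct ToyHasher {
    state: u64,
}

impl ToyHasher {
    pub fn new() -> Self {
        Self { state: FNV_OFFSET }
    }

    /// Absorb more input; returns `&mut Self` so calls can be chained.
    pub fn update(&mut self, input: &[u8]) -> &mut Self {
        for &byte in input {
            self.state ^= u64::from(byte);
            self.state = self.state.wrapping_mul(FNV_PRIME);
        }
        self
    }

    /// Return to the initial state; also chainable.
    pub fn reset(&mut self) -> &mut Self {
        self.state = FNV_OFFSET;
        self
    }

    pub fn finalize(&self) -> u64 {
        self.state
    }
}
```

With this shape, `hasher.reset().update(b"abc").finalize()` gives the same result as hashing `b"abc"` with a fresh instance.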
Jack O'Connor 9ffe377d45 implement crypto_mac::Mac 2020-02-03 10:18:02 -05:00
Jack O'Connor bcd424cab6 mention the digest traits in the docs 2020-02-02 17:40:30 -05:00
Jack O'Connor 9bab77d2cf implement traits from the digest crate 2020-02-02 17:28:22 -05:00
Jack O'Connor e603983647 add Hasher::reset
Closes https://github.com/BLAKE3-team/BLAKE3/issues/41.
2020-02-02 16:38:29 -05:00
Jack O'Connor 92d421dea1 add a larger test case
One thing I like to test is that, if I hack simd_degree to be higher
than MAX_SIMD_DEGREE, assertions fire. This requires a test case long
enough to exceed that number of chunks.
2020-01-22 21:19:47 -05:00
Jack O'Connor 78e858d050 expand comments about lazy merging 2020-01-21 12:09:42 -05:00
Jack O'Connor ccadbad244 stack size in the optimized impl should be MAX_DEPTH + 1 2020-01-21 11:41:20 -05:00
Jack O'Connor 67262dff31 double the maximum incremental subtree size
Because compress_subtree_to_parent_node effectively cuts its input in
half, we can give it an input that's twice as big, without violating the
CV stack invariant.
2020-01-20 19:25:55 -05:00
Samuel Neves b8c33e11ef manually prefetch message blocks 2020-01-19 18:45:37 +00:00
Jack O'Connor a3147eb909 comment about parallelism 2020-01-18 14:32:52 -05:00
Jack O'Connor 84c26670bf add blake3_c_rust_bindings for testing and benchmarking 2020-01-16 16:09:42 -05:00
Cesar Eduardo Barros 9f509a8f1f Inline trivial functions
For the Read and Write traits, this also allows the compiler to see that
the return value is always Ok, allowing it to remove the Err case from
the caller as dead code.
2020-01-12 18:27:42 -05:00
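A sketch of the pattern described above, using a hypothetical `ByteCounter` type rather than the crate's `Hasher`. Both `Write` methods are trivial and always return `Ok`, so marking them `#[inline]` gives the compiler the chance to see that in the caller and drop the error path.

```rust
use std::io::{self, Write};

/// Hypothetical writer that only counts bytes and can never fail.
pub struct ByteCounter {
    pub total: u64,
}

impl Write for ByteCounter {
    #[inline]
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.total += buf.len() as u64;
        // Always Ok: after inlining, callers' Err branches are dead code.
        Ok(buf.len())
    }

    #[inline]
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

Whether the error branch is actually removed depends on the optimizer and the build profile; `#[inline]` is a hint, not a guarantee.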
Cesar Eduardo Barros 4690c5f14e Use fixed-size constant_time_eq
The generic constant_time_eq has several branches on the slice length,
which are not necessary when the slice length is known. However, the
optimizer is not allowed to look into the core of constant_time_eq, so
these branches cannot be elided.

Use instead a fixed-size variant of constant_time_eq, which has no
branches since the length is known.
2020-01-12 17:40:57 -05:00
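The shape of a fixed-size comparison can be sketched as below. This is an illustration written for this note, not the code of the `constant_time_eq` crate, and the function name is an assumption. A production version also needs to stop the optimizer from reintroducing data-dependent branches, which this sketch does not attempt.

```rust
/// Compare two 32-byte arrays without branching on their contents.
/// The length is part of the type, so no length checks are needed.
pub fn fixed_size_eq_32(a: &[u8; 32], b: &[u8; 32]) -> bool {
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate every differing bit; never exit early.
        diff |= x ^ y;
    }
    diff == 0
}
```

The only branch is the final comparison of the accumulator against zero, which does not reveal where the inputs differ.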
Jack O'Connor 793c8a2444 disambiguate the two test
We can't change the context used in test_vectors.json without breaking
people, but we can change the one in unit tests.
2020-01-11 00:23:07 -05:00
Jack O'Connor 8d3f33802d correct the comments around SIMD rotations 2020-01-10 10:29:48 -05:00
Jack O'Connor 8be609ba9d delete the previous vendored C files and repoint the Rust code 2020-01-09 09:48:52 -05:00
Jack O'Connor 442775e3ce test_msg_schedule_permutation 2020-01-09 09:21:07 -05:00
JP Aumasson ed81da9aaa code comment 2020-01-08 13:28:02 -05:00
Jack O'Connor b0d775d589 simplify the docs example 2020-01-07 15:41:35 -05:00
Jack O'Connor 80260dc763 switch to the new permutations 2020-01-05 14:57:17 -05:00
Jack O'Connor 9fe42d0702 warn not to use derive_key with passwords 2020-01-05 13:29:50 -05:00
Jack O'Connor dc324a189e add the guts module to share code with Bao 2019-12-29 11:55:19 -06:00
Jack O'Connor 2fac7447e0 make derive_key take a key of any length
The previous version of this API called for a key of exactly 256 bits.
That's good for optimal performance, but it would mean losing the
use-with-other-algorithms property for applications whose input keys are
a different size. There's no way for an abstraction over the previous
version to provide reliable domain separation for the "extract" step.
2019-12-28 17:56:29 -06:00
Jack O'Connor 021c7b66be switch to simplified rotations
This is a performance improvement on modern x86 chips (Skylake and
later), and the LLVM optimizer can convert these to AVX-512 rotations
when those are enabled.
2019-12-23 13:41:06 -06:00
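Assuming "simplified rotations" means writing each rotation as two shifts and an OR (an interpretation, not confirmed by the message above), the scalar equivalent looks like this. The function name is hypothetical; the rotation distances 16, 12, 8, and 7 are the ones used by the BLAKE3 round function.

```rust
/// Rotate right by `n` bits using only shifts and OR.
/// Valid for 0 < n < 32; a shift by 32 would overflow.
pub fn rotr_shift_or(x: u32, n: u32) -> u32 {
    debug_assert!(n > 0 && n < 32);
    (x >> n) | (x << (32 - n))
}

/// Rotation distances used by the BLAKE3 G function.
pub const ROTATIONS: [u32; 4] = [16, 12, 8, 7];
```

A compiler that recognizes this idiom can lower it to a native rotate instruction where the target provides one.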
Jack O'Connor ab88db1aed docs tweaks 2019-12-14 10:13:10 -05:00
Jack O'Connor f54c292a53 silence another warning in the --no-default-features tests 2019-12-13 13:19:44 -05:00
Jack O'Connor d963fe18f3 test release mode in CI
As part of this, get rid of the BLAKE3_FUZZ_ITERATIONS variable. I
wasn't using it anywhere, and it was leading to some compiler warnings
in --no-default-features mode.
2019-12-13 13:15:48 -05:00
Jack O'Connor 0c245f21bf fix the doc tests build 2019-12-13 13:12:06 -05:00
Jack O'Connor 04f5ccd648 expand the docs 2019-12-13 12:53:09 -05:00
Jack O'Connor a52d4daa98 update MAX_DEPTH 2019-12-12 23:40:13 -05:00
Jack O'Connor b5f1e925f7 rename "offset" to "counter" and always increment it by 1
This is simpler than sometimes incrementing by CHUNK_LEN and other times
incrementing by BLOCK_LEN.
2019-12-12 21:41:30 -05:00
Jack O'Connor a5cc3b2867 reduce the CHUNK_LEN from 2048 bytes to 1024 bytes
Smaller chunk sizes are a big benefit for parallelism at shorter input
lengths, and recent benchmarks show that this reduction has a relatively
small cost in terms of peak throughput. It's also a nice round number.
2019-12-12 20:39:00 -05:00
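The effect on short inputs can be seen by counting chunks, since each chunk is a unit that can be hashed in parallel with the others. A small sketch; the helper name `chunk_count` is hypothetical.

```rust
/// Number of chunks an input of `input_len` bytes is split into.
pub fn chunk_count(input_len: usize, chunk_len: usize) -> usize {
    if input_len == 0 {
        // The empty input is still hashed as a single (empty) chunk.
        1
    } else {
        (input_len + chunk_len - 1) / chunk_len
    }
}
```

For an 8192-byte input, `chunk_count(8192, 2048)` is 4 and `chunk_count(8192, 1024)` is 8, so an implementation that hashes 8 chunks at a time can fill all its lanes at half the input length.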
Jack O'Connor 9bf1020213 make the "c_avx512" feature a no-op on non-x86
This lets us enable it by default in b3sum.
2019-12-12 15:13:04 -05:00
Jack O'Connor 3b5664c8a5 struct OutputReader 2019-12-12 11:33:21 -05:00
Jack O'Connor 1a57232b49 delete an unused import 2019-12-11 22:32:53 -05:00
Jack O'Connor 52ea6487f8 switch to representing CVs as words for the compression function
The portable implementation was getting slowed down by converting back
and forth between words and bytes.

I made the corresponding change on the C side first (12a37be8b5),
and as part of this commit I'm re-vendoring the C code. I'm also
exposing a small FFI interface to C so that blake3_neon.c can link
against portable.rs rather than blake3_portable.c, see c_neon.rs.
2019-12-11 18:05:26 -05:00
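The conversion that the portable code was paying for repeatedly looks roughly like the following. This is a sketch with hypothetical helper names; BLAKE3 reads and writes words in little-endian order. Keeping chaining values as `[u32; 8]` means conversions like these happen only at the boundaries instead of around every compression.

```rust
/// Interpret 32 bytes as eight little-endian 32-bit words.
pub fn words_from_le_bytes_32(bytes: &[u8; 32]) -> [u32; 8] {
    let mut words = [0u32; 8];
    for (word, chunk) in words.iter_mut().zip(bytes.chunks_exact(4)) {
        *word = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
    }
    words
}

/// Serialize eight 32-bit words back to 32 little-endian bytes.
pub fn le_bytes_from_words_32(words: &[u32; 8]) -> [u8; 32] {
    let mut bytes = [0u8; 32];
    for (chunk, word) in bytes.chunks_exact_mut(4).zip(words.iter()) {
        chunk.copy_from_slice(&word.to_le_bytes());
    }
    bytes
}
```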
Jack O'Connor c81d5c2522 test against test_vectors.json in CI 2019-12-11 10:50:18 -05:00
Jack O'Connor ee0014776f silence an unreachable code warning when "c_neon" is in use 2019-12-08 21:58:32 -05:00
Jack O'Connor ae7271cc87 add benchmarks for AVX-512 and NEON 2019-12-08 21:56:10 -05:00
Jack O'Connor 1574b488f9 unify the platform-specific tests and test AVX-512 and NEON 2019-12-08 21:56:10 -05:00