mirror of https://github.com/BLAKE3-team/BLAKE3 synced 2024-06-08 03:56:06 +02:00
Commit Graph

235 Commits

Author SHA1 Message Date
Jack O'Connor afdaf3036b version 0.2.0
Changes since 0.1.5:
- The `c_avx512` feature has been replaced by the `c` feature. In
  addition to providing AVX-512 support, `c` also provides optimized
  assembly implementations. These assembly implementations perform
  better, perform more consistently across compilers, and compile more
  quickly. As before, `c` is off by default, but the `b3sum` binary
  crate activates it by default.
- The `rayon` feature no longer affects the entire API. Instead, it
  provides the `join::RayonJoin` type for use with
  `Hasher::update_with_join`, so that the caller can control when
  multi-threading happens. Standalone API functions like `hash` are
  always single-threaded now.
2020-02-12 14:57:57 -05:00
Jack O'Connor 724e784fe2 merge the version 0.1.5 branch
Version 0.1.5 was a backport release to mitigate
https://github.com/BLAKE3-team/BLAKE3/issues/57. This is a no-op merge
to make sure that the 0.1.5 branch shows up in `git log master`.
2020-02-12 14:57:00 -05:00
Jack O'Connor 5dea889834 add a performance note and a usage example for Hasher 2020-02-12 14:38:35 -05:00
Jack O'Connor 38a46ba8ae document optional Cargo features on docs.rs 2020-02-12 14:20:11 -05:00
Jack O'Connor 1c4d7fdd8d add test_asm to the C Makefile 2020-02-12 13:12:05 -05:00
Jack O'Connor 7ee05ba3bd document how to build the C code with assembly implementations 2020-02-12 13:04:03 -05:00
Jack O'Connor b8a1d2d982 integrate assembly implementations into blake3_c_rust_bindings 2020-02-12 10:23:17 -05:00
Jack O'Connor efbfa0463c integrate assembly implementations into the blake3 crate 2020-02-12 10:23:17 -05:00
Samuel Neves b6b3c27824 assembly implementations 2020-02-12 10:23:17 -05:00
Jack O'Connor 1c5d4eea6a test a couple more reset() cases 2020-02-12 10:22:54 -05:00
Jack O'Connor e0dc4d932e use a non-zero value for counter when testing hash_many with parents
We use a counter value that's very close to wrapping the lower word,
when we're testing the hash_many chunks case. It turns out that this is
a useful thing to do with parents too, even though parents 1) are
technically supposed to always use a counter of 0, and 2) aren't going
to increment the counter at all. We caught a bug in the assembly
implementations this way (where we accidentally did increment the
counter, but only the higher word), because the equivalent test in
rust_c_bindings uses this eccentric parents counter value.
2020-02-11 23:45:41 -05:00
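The reasoning in the commit above can be illustrated with a small scalar sketch. It is not the project's test code: `counter_words`, `buggy_counter_words`, and `agrees` are hypothetical names, and the "bug" is an invented stand-in for the class of error described (the high word changes even though no increment was requested). It shows why a counter of 0 hides such a bug while a counter near the 32-bit boundary exposes it.

```rust
/// Reference model: the (low, high) 32-bit words that lane `lane` should see.
/// When `increment` is false (the parents case), every lane sees `counter` unchanged.
pub fn counter_words(counter: u64, lane: u32, increment: bool) -> (u32, u32) {
    let c = if increment {
        counter.wrapping_add(lane as u64)
    } else {
        counter
    };
    (c as u32, (c >> 32) as u32)
}

/// Hypothetical buggy variant (an assumption for illustration only): the low
/// word honors `increment`, but the carry into the high word is computed from
/// the lane offset unconditionally.
pub fn buggy_counter_words(counter: u64, lane: u32, increment: bool) -> (u32, u32) {
    let low_in = counter as u32;
    let low = if increment { low_in.wrapping_add(lane) } else { low_in };
    // BUG: this ignores `increment`, so the high word can change on its own.
    let carry = low_in.wrapping_add(lane) < lane;
    (low, ((counter >> 32) as u32).wrapping_add(carry as u32))
}

/// True when both variants agree for all eight lanes, with and without increment.
pub fn agrees(counter: u64) -> bool {
    (0..8u32).all(|lane| {
        [false, true].iter().all(|&inc| {
            counter_words(counter, lane, inc) == buggy_counter_words(counter, lane, inc)
        })
    })
}
```

With a counter of 0 the two variants are indistinguishable; with a counter whose low word is about to wrap, the non-incrementing case diverges, which is the situation the eccentric test value is meant to cover.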
Jack O'Connor ec34043b45 add cross testing on i686 to CI 2020-02-11 13:58:26 -05:00
Jack O'Connor 30671b1c05 version 0.1.5
Changes since 0.1.4:
- Remove all AVX-512 code from builds with the default feature set. This
  works around https://github.com/rust-lang/rust/issues/68905 and fixes
  the nightly build as long as the "c_avx512" feature is not activated.

This release is a backport of a single commit, e43a7d6. The master
branch contains backwards-incompatible changes (fc219f4), and the next
release of master will be version 0.2.0.

Note that the `b3sum` crate activates the "c_avx512" feature by default,
and it will continue to fail to build on nightly until the upstream bug
is fixed.
2020-02-10 16:00:37 -05:00
Jack O'Connor e43a7d68bc avoid compiling avx512_detected() when the "c_avx512" feature is disabled
https://github.com/rust-lang/rust/issues/68905 is currently causing
nightly builds to fail, unless `--no-default-features` is used. This
change means that the default build will succeed, and the failure will
only happen when the "c_avx512" feature is enabled. The `b3sum` crate will still
fail to build on nightly, because it enables that feature, but most
callers should start succeeding on nightly.
2020-02-10 15:54:52 -05:00
Jack O'Connor af2e791602 avoid compiling avx512_detected() when the "c_avx512" feature is disabled
https://github.com/rust-lang/rust/issues/68905 is currently causing
nightly builds to fail, unless `--no-default-features` is used. This
change means that the default build will succeed, and the failure will
only happen when the "c_avx512" feature is enabled. The `b3sum` crate will still
fail to build on nightly, because it enables that feature, but most
callers should start succeeding on nightly.
2020-02-10 15:25:23 -05:00
Jack O'Connor c0a43e5fb8 add the Windows GNU toolchain to CI 2020-02-07 13:46:42 -05:00
Jack O'Connor ca62c4724d stop skipping all other builds when one CI build fails 2020-02-06 18:43:50 -05:00
Jack O'Connor fc219f4f8d Hasher::update_with_join
This is a new interface that allows the caller to provide a
multi-threading implementation. It's defined in terms of a new `Join`
trait, for which we provide two implementations, `SerialJoin` and
`RayonJoin`. This lets the caller control when multi-threading is used,
rather than the previous all-or-nothing design of the "rayon" feature.

Although existing callers should keep working, this is a compatibility
break, because callers who were relying on automatic multi-threading
before will now be single-threaded. Thus the next release of this crate
will need to be version 0.2.

See https://github.com/BLAKE3-team/BLAKE3/issues/25 and
https://github.com/BLAKE3-team/BLAKE3/issues/54.
2020-02-06 15:07:15 -05:00
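As a rough illustration of the design described in the commit above, here is a minimal, self-contained sketch of a caller-selected join strategy. It does not use the blake3 crate: the `Join` trait below is a local redefinition whose exact signature is an assumption, `ThreadJoin` is a hypothetical stand-in for `RayonJoin` built on `std::thread::scope`, and `sum_with_join` is an invented divide-and-conquer function standing in for the hasher's recursion.

```rust
use std::thread;

/// Local sketch of a join abstraction: run two closures and return both results.
pub trait Join {
    fn join<A, B, RA, RB>(oper_a: A, oper_b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send;
}

/// Runs both halves on the calling thread, one after the other.
pub enum SerialJoin {}

impl Join for SerialJoin {
    fn join<A, B, RA, RB>(oper_a: A, oper_b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send,
    {
        (oper_a(), oper_b())
    }
}

/// Hypothetical multi-threaded implementation (stand-in for a Rayon-based one).
pub enum ThreadJoin {}

impl Join for ThreadJoin {
    fn join<A, B, RA, RB>(oper_a: A, oper_b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send,
    {
        thread::scope(|scope| {
            // Run the right half on a scoped thread and the left half here.
            let handle_b = scope.spawn(oper_b);
            let ra = oper_a();
            let rb = handle_b.join().expect("right half panicked");
            (ra, rb)
        })
    }
}

/// Divide-and-conquer sum; the caller picks the strategy via the type parameter.
pub fn sum_with_join<J: Join>(data: &[u64]) -> u64 {
    if data.len() <= 1024 {
        return data.iter().sum();
    }
    let (left, right) = data.split_at(data.len() / 2);
    let (a, b) = J::join(|| sum_with_join::<J>(left), || sum_with_join::<J>(right));
    a + b
}
```

Calling `sum_with_join::<SerialJoin>(&data)` keeps everything on one thread, while `sum_with_join::<ThreadJoin>(&data)` opts in to parallelism at that call site only, which is the kind of control the commit message describes.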
Jack O'Connor 24071db346 re-export digest and crypto_mac 2020-02-04 10:02:46 -05:00
Jack O'Connor 0c663aa8ac add a link in the README to bar_chart.py
Closes https://github.com/BLAKE3-team/BLAKE3/issues/53.
2020-02-04 09:40:10 -05:00
Cesar Eduardo Barros a3d42f724d Inline wrapper methods 2020-02-03 17:29:25 -05:00
Jack O'Connor 0de4412884 version 0.1.4
Changes since 0.1.3:
- Hasher supports the reset() method.
- Hasher implements several traits from the `digest` and `crypto_mac`
  crates.
- Bug fixes in the C implementation for MSVC and for 32-bit x86.
2020-02-03 12:05:26 -05:00
Jack O'Connor 0651736ff4 make the inherent reset() method return &mut self 2020-02-03 10:21:27 -05:00
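A brief sketch of why returning `&mut self` from `reset()` is convenient: it lets a reused hasher be reset and fed in a single chained expression. `ToyHasher` below is a hypothetical type using FNV-1a, chosen only to keep the example self-contained; it is not BLAKE3 and not the crate's `Hasher`.

```rust
/// Toy incremental hasher (64-bit FNV-1a), used only to illustrate chaining.
pub struct ToyHasher {
    state: u64,
}

impl ToyHasher {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;

    pub fn new() -> Self {
        ToyHasher { state: Self::OFFSET_BASIS }
    }

    /// Absorb more input; returns `&mut Self` so calls can be chained.
    pub fn update(&mut self, input: &[u8]) -> &mut Self {
        for &byte in input {
            self.state ^= byte as u64;
            self.state = self.state.wrapping_mul(Self::PRIME);
        }
        self
    }

    /// Return to the initial state; also returns `&mut Self` for chaining.
    pub fn reset(&mut self) -> &mut Self {
        self.state = Self::OFFSET_BASIS;
        self
    }

    pub fn finalize(&self) -> u64 {
        self.state
    }
}
```

With this shape, `hasher.reset().update(b"abc").finalize()` reuses an existing hasher without a separate statement for the reset.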
Jack O'Connor 9ffe377d45 implement crypto_mac::Mac 2020-02-03 10:18:02 -05:00
Jack O'Connor bcd424cab6 mention the digest traits in the docs 2020-02-02 17:40:30 -05:00
Jack O'Connor 9bab77d2cf implement traits from the digest crate 2020-02-02 17:28:22 -05:00
Jack O'Connor e603983647 add Hasher::reset
Closes https://github.com/BLAKE3-team/BLAKE3/issues/41.
2020-02-02 16:38:29 -05:00
Samuel Neves a1c4c4efb5 Fix #51.
Thanks to bit4 for spotting this bug.
2020-02-02 18:47:38 +00:00
TheVice 58926046ca [MSVC] made it possible to compile with the Microsoft Visual C compiler.
[main.c] removed the include of unistd.h from the c/main.c file.
[blake3_avx2.c|blake3_avx512.c|blake3_sse41.c] resolved compile error
'C4146': unary minus operator applied to an unsigned value.
2020-01-30 16:17:46 -05:00
Jack O'Connor 3c098eecc1 formatting in c/README.md 2020-01-29 13:05:44 -05:00
Jack O'Connor af0ef07519 update the c/README.md example to hash stdin 2020-01-29 13:01:40 -05:00
Jack O'Connor 37e153cc60 add NEON support to blake3_dispatch.c
Currently this requires setting the BLAKE3_USE_NEON preprocessor flag.
In the future we may enable this automatically on AArch32/64 or include
some kind of dynamic feature detection. (Though ARM makes this harder
than x86.)

As part of this, get rid of the IS_ARM flag. It wasn't being set
properly when I tried it on a Raspberry Pi.

Closes #30.
2020-01-28 15:59:16 -05:00
Jack O'Connor d7a37fa54d clear errno before strtoull
I ran into a bug on ARM where we were getting non-zero here, from
something else that stuck around in errno.
2020-01-28 14:11:26 -05:00
Jack O'Connor 4304cd1085 one more warning 2020-01-28 13:26:37 -05:00
Jack O'Connor d980514c44 fix unused variable warning 2020-01-28 13:25:22 -05:00
Jack O'Connor 6742722898 add a note about testing in main.c 2020-01-27 16:21:34 -05:00
TheVice 8ce1cddedc [memset] removed the call to 'memset', because the memory it clears
is overwritten inside the blake3_hasher_finalize function.
2020-01-27 16:17:09 -05:00
TheVice 4730ab237e [memset] moved the call to after the check of the memory
it is applied to.
2020-01-27 16:17:09 -05:00
Jack O'Connor dec0c49576 add a note about AVX-512 flags 2020-01-27 13:10:25 -05:00
Jack O'Connor 444a338b45 remove an obsolete remark about performance 2020-01-27 13:04:36 -05:00
Jack O'Connor 5ef22de9d0 link to the C implementation from the README 2020-01-27 13:02:00 -05:00
Jack O'Connor 71e605fd5d typo 2020-01-26 16:12:10 -05:00
Jack O'Connor 1db856a3e5 expand the C README for public consumption 2020-01-26 16:07:51 -05:00
Samuel Neves 214c70d8f3 Merge pull request #40 from erijo/cpp
Add extern "C" to blake3.h
2020-01-24 00:42:41 +00:00
Erik Johansson 182aea4871 Add extern "C" to blake3.h
So that the header can be included in C++ programs without getting linker
errors.
2020-01-23 20:42:34 +01:00
Samuel Neves a830ab2661 streamline load_counters
avx2 before:

        mov     eax, esi
        neg     rax
        vmovq   xmm0, rax
        vpbroadcastq    ymm0, xmm0
        vpand   ymm0, ymm0, ymmword ptr [rip + .LCPI1_0]
        vmovq   xmm2, rdi
        vpbroadcastq    ymm1, xmm2
        vpaddq  ymm1, ymm0, ymm1
        vmovdqa ymm0, ymmword ptr [rip + .LCPI1_1] # ymm0 = [0,2,4,6,4,6,6,7]
        vpermd  ymm3, ymm0, ymm1
        mov     r8d, eax
        and     r8d, 5
        add     r8, rdi
        mov     esi, eax
        and     esi, 6
        add     rsi, rdi
        and     eax, 7
        vpshufd xmm4, xmm3, 231         # xmm4 = xmm3[3,1,2,3]
        vpinsrd xmm4, xmm4, r8d, 1
        add     rax, rdi
        vpinsrd xmm4, xmm4, esi, 2
        vpinsrd xmm4, xmm4, eax, 3
        vpshufd xmm3, xmm3, 144         # xmm3 = xmm3[0,0,1,2]
        vpinsrd xmm3, xmm3, edi, 0
        vmovdqa xmmword ptr [rdx], xmm3
        vmovdqa xmmword ptr [rdx + 16], xmm4
        vpermq  ymm3, ymm1, 144         # ymm3 = ymm1[0,0,1,2]
        vpblendd        ymm2, ymm3, ymm2, 3 # ymm2 = ymm2[0,1],ymm3[2,3,4,5,6,7]
        vpsrlq  ymm2, ymm2, 32
        vpermd  ymm2, ymm0, ymm2
        vextracti128    xmm1, ymm1, 1
        vmovq   xmm3, rax
        vmovq   xmm4, rsi
        vpunpcklqdq     xmm3, xmm4, xmm3 # xmm3 = xmm4[0],xmm3[0]
        vmovq   xmm4, r8
        vpalignr        xmm1, xmm4, xmm1, 8 # xmm1 = xmm1[8,9,10,11,12,13,14,15],xmm4[0,1,2,3,4,5,6,7]
        vinserti128     ymm1, ymm1, xmm3, 1
        vpsrlq  ymm1, ymm1, 32
        vpermd  ymm0, ymm0, ymm1

avx2 after:

        neg     esi
        vmovd   xmm0, esi
        vpbroadcastd    ymm0, xmm0
        vmovd   xmm1, edi
        vpbroadcastd    ymm1, xmm1
        vpand   ymm0, ymm0, ymmword ptr [rip + .LCPI0_0]
        vpaddd  ymm1, ymm1, ymm0
        vpbroadcastd    ymm2, dword ptr [rip + .LCPI0_1] # ymm2 = [2147483648,2147483648,2147483648,2147483648,2147483648,2147483648,2147483648,2147483648]
        vpor    ymm0, ymm0, ymm2
        vpxor   ymm2, ymm1, ymm2
        vpcmpgtd        ymm0, ymm0, ymm2
        shr     rdi, 32
        vmovd   xmm2, edi
        vpbroadcastd    ymm2, xmm2
        vpsubd  ymm0, ymm2, ymm0
2020-01-23 12:17:43 +00:00
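The "after" listing above appears to work entirely on 32-bit lanes: it adds the masked lane offsets to the low word, detects the carry with a signed compare after flipping the sign bit, and subtracts the resulting all-ones mask from the high word. The scalar Rust model below is a reading of that listing, not the project's source. The function name `load_counters` comes from the commit title, while the parameter names and the assumption that the first constant holds the offsets 0 through 7 are guesses.

```rust
/// Scalar model of an eight-lane counter setup: returns the low and high
/// 32-bit words of `counter + lane` (or of `counter` when not incrementing).
pub fn load_counters(counter: u64, increment_counter: bool) -> ([u32; 8], [u32; 8]) {
    // All-ones when incrementing, all-zeros otherwise (the `neg esi` step).
    let mask = (increment_counter as u32).wrapping_neg();
    let counter_lo = counter as u32;
    let counter_hi = (counter >> 32) as u32;

    let mut lo = [0u32; 8];
    let mut hi = [0u32; 8];
    for lane in 0..8u32 {
        // Assumed per-lane offsets 0..=7, masked off when not incrementing.
        let add = mask & lane;
        let l = counter_lo.wrapping_add(add);
        // AVX2 has only a signed 32-bit compare, so flip the sign bit of both
        // operands to get the unsigned comparison `add > l`, which is true
        // exactly when the 32-bit addition wrapped.
        let carry = ((add ^ 0x8000_0000) as i32) > ((l ^ 0x8000_0000) as i32);
        lo[lane as usize] = l;
        // The vector code subtracts the compare result (-1 on carry);
        // adding 1 here is the same thing.
        hi[lane as usize] = counter_hi.wrapping_add(carry as u32);
    }
    (lo, hi)
}
```

Comparing each lane against plain 64-bit arithmetic (`counter + lane`, split into two words) is a simple way to check the model, especially for counters whose low word is close to wrapping.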
Samuel Neves de1458c565 name collision 2020-01-23 11:51:46 +00:00
Samuel Neves 37ea737c16 more robust bit-trickery functions 2020-01-23 10:58:45 +00:00
Jack O'Connor e17c45ddd5 version 0.1.3
Changes since 0.1.2:
- All x86 implementations include _mm_prefetch optimizations. These
  improve performance for very large inputs.
- The C implementation performs parallel parent hashing, matching the
  performance of the single-threaded Rust implementation.
- b3sum supports --no-mmap. Contributed by @cesarb.
2020-01-22 21:35:24 -05:00
Jack O'Connor 163f52245d port compress_subtree_to_parent_node from Rust to C
This recursive function performs parallel parent node hashing, which is
an important optimization.
2020-01-22 21:32:39 -05:00