Changes since 1.5.4:
- `b3sum --check` now supports checkfiles with Windows-style newlines.
`b3sum` still emits Unix-style newlines, even on Windows, but
sometimes text editors or version control tools will swap them.
- The "digest" feature (deleted in v1.5.2) has been added back to the
`blake3` crate. This is for backwards compatibility only, and it's
insta-deprecated. All callers should prefer the "traits-preview"
feature.
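As a rough illustration of the newline handling described in the first item above, the sketch below accepts both `\n` and `\r\n` line endings. It assumes a simplified checkfile layout of 64 hex characters, two spaces, then a path; `parse_check_line` is a hypothetical helper written for this note, not `b3sum`'s actual parser, and it ignores any escaping the real format may use.

```rust
/// Parse one checkfile line of the assumed form "<64 hex chars>  <path>".
/// Hypothetical helper for illustration; not b3sum's real implementation.
fn parse_check_line(line: &str) -> Option<(&str, &str)> {
    // Drop a trailing '\r' left over from a Windows-style "\r\n" ending.
    let line = line.strip_suffix('\r').unwrap_or(line);
    // Assumed layout: hash, two spaces, then the path.
    let (hash, path) = line.split_once("  ")?;
    let is_hex = hash.len() == 64 && hash.bytes().all(|b| b.is_ascii_hexdigit());
    if is_hex && !path.is_empty() {
        Some((hash, path))
    } else {
        None
    }
}

fn main() {
    let hash = "a".repeat(64);
    let unix = format!("{hash}  file.txt\n");
    let windows = format!("{hash}  file.txt\r\n");
    for contents in [unix, windows] {
        // Split on '\n' only, so a preceding '\r' stays on the line and
        // has to be handled by the parser itself.
        for line in contents.split('\n').filter(|l| !l.is_empty()) {
            println!("{:?}", parse_check_line(line));
        }
    }
}
```

Stripping the `\r` per line, rather than rejecting it, lets the same checkfile pass verification whether or not a tool rewrote its line endings along the way.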
Changes since 1.5.3:
- Initial implementation of SIMD acceleration for the XOF (i.e.
blake3::Hasher::finalize_xof). This brings long output performance
into line with long input performance. Currently AVX-512-only and
Unix-only.
- Add build support for "gnullvm" targets (Clang on Windows).
- The "zeroize" feature no longer depends on proc-macros and syn.
Changes since 1.5.2:
- Revert the serialization change. It was intended to be backwards
compatible, but that didn't hold for non-self-describing serialization
formats like bincode. See #414.
Changes since 1.5.1:
- `build.rs` sets `cc::Build::emit_rerun_if_env_changed(false)` to
prevent some unnecessary rebuilds, particularly when the `PATH`
changes on Windows. See #324.
- Serializing a `Hash` produces a bytestring instead of an array in
formats that support bytestrings (like CBOR). Deserialization is
backwards-compatible with the array format.
- Cleanup and edge case fixes in the C and CMake builds.
Changes since 1.5.0:
- The Rust crate is now compatible with Miri.
- ~1% performance improvement on Arm NEON contributed by @divinity76 (#384).
- Various fixes and improvements in the CMake build.
- The MSRV of b3sum is now 1.74.1. (The MSRV of the library crate is
unchanged, 1.66.1.)
vld1q_u8 and vst1q_u8 have no alignment requirements.
This improves performance on Oracle Cloud's VM.Standard.A1.Flex by 1.15% on a 16*1024 input, from 13920 nanoseconds down to 13800 nanoseconds (approx).
Specify the language requirement as a [compile-feature] and force
compiler extensions off, so that portability problems are detected early.
Note that we do not use the `C_STANDARD` property, because it doesn't
propagate to dependent targets and would prohibit users from compiling
their code base with consistent flags / language configurations if they
were to target a newer C standard. Similarly, we do not configure
`C_STANDARD_REQUIRED`, as [compile-features] do not interact with
it--they are enforced regardless.
[compile-feature]: https://cmake.org/cmake/help/latest/manual/cmake-compile-features.7.html#compile-feature-requirements
clang-cl is LLVM's MSVC-compatible compiler frontend for the Windows ABI.
When clang-cl is in use, `CMAKE_C_COMPILER_ID` is `Clang`, even though
clang-cl takes MSVC-like command line options rather than Unix-like ones.
`if(MSVC)` is the correct predicate to check if we should pass MSVC-ish
command line options.
ARMv8 CPUs are guaranteed to support NEON instructions. However, for
32-bit ARMv8 triplets, GCC needs to be explicitly configured to enable
NEON intrinsics.
Changes since 1.4.1:
- The Rust crate's Hasher type has gained new helper methods for common
forms of IO: update_reader, update_mmap, and update_mmap_rayon. The
last of these matches the default behavior of b3sum. The mmap methods
are gated by the new "mmap" Cargo feature.
- Most of the Rust crate's public types now implement the Zeroize trait.
This is gated by the new "zeroize" Cargo feature.
- The Rust crate's Hash type now implements the serde Serialize and
Deserialize traits. This is gated by the new "serde" Cargo feature.
- The C library now uses atomics to cache detected CPU features under
most compilers other than MSVC. Previously this was a non-atomic
write, which was probably "benign" but made TSan unhappy.
- NEON support is now disabled by default on big-endian AArch64.
Previously this was a build error if the caller didn't explicitly
disable it.
If multiple threads try to compute a hash simultaneously before the library has been used for the first time,
the logic in get_cpu_features that detects CPU features will write to g_cpu_features without synchronization.
This is a race condition, and ThreadSanitizer flags it.
This change marks g_cpu_features as an atomic variable to address the race condition.
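The caching pattern described above can be sketched as follows. This is a Rust analogue written for illustration, not the C library's code: the flag values, the `UNDEFINED` sentinel, the `detect_cpu_features` stand-in, and the choice of relaxed memory ordering are all assumptions made for this example.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// Sentinel meaning "not detected yet" (hypothetical value for this sketch).
const UNDEFINED: u32 = 1 << 30;
// Hypothetical feature flag bits.
const SSE2: u32 = 1 << 0;
const AVX2: u32 = 1 << 3;

// The cache is an atomic, so concurrent reads and writes are not a data race.
static CPU_FEATURES: AtomicU32 = AtomicU32::new(UNDEFINED);

// Stand-in for real detection (e.g. CPUID); always returns the same answer.
fn detect_cpu_features() -> u32 {
    SSE2 | AVX2
}

fn get_cpu_features() -> u32 {
    let cached = CPU_FEATURES.load(Ordering::Relaxed);
    if cached != UNDEFINED {
        return cached;
    }
    // Several threads may reach this point at once. Each one computes the
    // same value, so it does not matter whose store lands last.
    let detected = detect_cpu_features();
    CPU_FEATURES.store(detected, Ordering::Relaxed);
    detected
}

fn main() {
    // Call the function from several threads before the cache is filled.
    let handles: Vec<_> = (0..4).map(|_| thread::spawn(get_cpu_features)).collect();
    for handle in handles {
        assert_eq!(handle.join().unwrap(), SSE2 | AVX2);
    }
    println!("features = {:#x}", get_cpu_features());
}
```

Relaxed ordering is used here because the cached word is the only data being shared; no other memory is published through it. Detection may still run more than once under contention, which is harmless as long as it is deterministic.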
Changes since 1.4.0:
- Improved performance in the ARM NEON implementation for both C and
Rust callers. This affects AArch64 targets by default and ARMv7
targets that explicitly enable (and support) NEON. The size of the
improvement depends on the microarchitecture, but I've benchmarked
~1.3x on a Cortex-A53 and ~1.2x on an Apple M1. Contributed by
@sdlyyxy in #319.
- The MSRV is now 1.66.1 for both the `blake3` crate and `b3sum`.
Given the myriad of `-mfpu` options for ARM [1], the inability to
portably query for CPU support, and the lack of standardized ISA names,
we have no choice but to opt out of automatically supplying NEON
compile flags. Instead we simply add the NEON optimized source file if
we detect an ISA with guaranteed NEON support (>= ARMv8) or the user
explicitly requests it (in which case they are expected to provide the
compile flags with `CMAKE_C_FLAGS` or `BLAKE3_CFLAGS_NEON`, either
through a toolchain file or command-line parameters).
[1]: https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
Changes since 1.3.3:
- The C implementation provides a `CMakeLists.txt` for callers who build
with CMake. The CMake build is not yet stable, and callers should
expect breaking changes in patch version updates. The "by hand" build
will always continue to be supported and documented.
- `b3sum` supports the `--seek` flag, to set the starting position in
the output stream.
- `b3sum --check` prints a summary of errors to stderr.
- `Hash::as_bytes` is const.
- `Hash` supports `from_bytes`, which is const.
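To show what the two `const` items above make possible, here is a minimal sketch using a stand-in type. The `Hash` struct below is defined locally for illustration and is not the `blake3` crate's type; its 32-byte size and method signatures are assumptions chosen to mirror the names in the notes.

```rust
// Stand-in for illustration only; this is NOT the blake3 crate's `Hash`.
#[derive(Debug, PartialEq, Eq)]
struct Hash([u8; 32]);

impl Hash {
    // `const fn` means this can run at compile time.
    const fn from_bytes(bytes: [u8; 32]) -> Self {
        Hash(bytes)
    }

    const fn as_bytes(&self) -> &[u8; 32] {
        &self.0
    }
}

// Because both functions are `const`, they can appear in constant items,
// for example to embed a known digest in the binary.
const EXPECTED: Hash = Hash::from_bytes([0x42; 32]);
const FIRST_BYTE: u8 = EXPECTED.as_bytes()[0];

fn main() {
    println!("first byte = {:#04x}", FIRST_BYTE);
}
```

Without `const`, the same values would have to be built at run time, for instance inside a lazily initialized static.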
The ISA names communicated by `CMAKE_SYSTEM_PROCESSOR` aren't as
standardized as one would wish. Factor the different names into lists,
allowing for simpler checks and future updates.
Add hidden options for enabling SIMD support in case ISA detection
fails. These should only be used as temporary workarounds until the
ISA name lists have been updated/fixed.