This makes them consistent with how the existing update() and
update_rayon() methods work, with the difference being that the return
type is io::Result<&mut Self> instead of just &mut Self.
New methods:
- update_reader
- update_mmap
- update_mmap_rayon
These are more discoverable, more convenient, and safer.
There are two problems I want to avoid by taking a `Path` instead of a
`File`. First, exposing `Mmap` objects to the caller is fundamentally
unsafe, and making `maybe_mmap_file` private avoids that issue. Second,
taking a `File` raises questions about whether memory mapped reads
should behave like regular file reads. (Should they respect the current
seek position? Should they update the seek position?) Taking a `Path`
from the caller and opening the `File` internally avoids these
questions.
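
To illustrate the API shape described above, here is a minimal sketch in
Rust. `ToyHasher` is a hypothetical stand-in that only sums bytes; it is
not BLAKE3, and `update_path` is an illustrative name, not one of the new
methods. It also reads the file normally instead of memory mapping it,
since the standard library has no mmap support. The point is only that a
fallible update can keep the chaining style by returning
`io::Result<&mut Self>`, and that opening the `File` internally keeps
seek-position questions out of the public API.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Toy stand-in for a hasher. It only sums bytes; it is NOT BLAKE3.
#[derive(Default)]
pub struct ToyHasher {
    sum: u64,
}

impl ToyHasher {
    /// Infallible update, returning `&mut Self` so calls can be chained.
    pub fn update(&mut self, input: &[u8]) -> &mut Self {
        for &b in input {
            self.sum = self.sum.wrapping_add(u64::from(b));
        }
        self
    }

    /// Fallible update from any reader. Same chaining shape as `update`,
    /// but wrapped in `io::Result` because reading can fail.
    pub fn update_reader(&mut self, mut reader: impl Read) -> io::Result<&mut Self> {
        let mut buf = [0u8; 8192];
        loop {
            let n = match reader.read(&mut buf) {
                Ok(0) => return Ok(self), // end of input
                Ok(n) => n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            };
            self.update(&buf[..n]);
        }
    }

    /// Hypothetical path-based update. The caller never sees the `File`,
    /// so there is no seek position to respect or update.
    pub fn update_path(&mut self, path: impl AsRef<Path>) -> io::Result<&mut Self> {
        let file = File::open(path)?;
        self.update_reader(file)
    }

    /// Returns the toy checksum accumulated so far.
    pub fn finalize(&self) -> u64 {
        self.sum
    }
}
```

With this shape, a caller can write
`hasher.update(b"header").update_reader(reader)?.update(b"trailer")` and
propagate I/O errors with `?`.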
If multiple threads try to compute a hash simultaneously before the
library has been used for the first time, the logic in get_cpu_features
that detects CPU features will write to g_cpu_features without
synchronization. This is a data race, and ThreadSanitizer flags it.
This change marks g_cpu_features as an atomic variable to address the
race.
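
The pattern can be sketched as follows. This is a Rust analogue for
illustration only, not the C change itself: the `UNDEFINED` sentinel
value, the `detect_features` placeholder, and its fixed bitmask are all
assumptions. Relaxed ordering is enough here because detection is
idempotent: threads that race will all compute and store the same value.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Sentinel meaning "detection has not run yet" (hypothetical encoding).
const UNDEFINED: u32 = 1 << 30;

/// Process-wide cache of detected features, similar in spirit to
/// `g_cpu_features`. Being atomic makes concurrent access well defined.
static CPU_FEATURES: AtomicU32 = AtomicU32::new(UNDEFINED);

/// Placeholder for real CPU detection; returns a fixed, made-up bitmask.
fn detect_features() -> u32 {
    0b0101
}

/// Returns the cached feature mask, running detection on first use.
/// Several threads may run detection at once, but every load and store
/// goes through the atomic, so there is no data race.
pub fn get_cpu_features() -> u32 {
    let cached = CPU_FEATURES.load(Ordering::Relaxed);
    if cached != UNDEFINED {
        return cached;
    }
    let detected = detect_features();
    CPU_FEATURES.store(detected, Ordering::Relaxed);
    detected
}
```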
Changes since 1.4.0:
- Improved performance in the ARM NEON implementation for both C and
Rust callers. This affects AArch64 targets by default and ARMv7
targets that explicitly enable (and support) NEON. The size of the
improvement depends on the microarchitecture, but I've benchmarked
~1.3x on a Cortex-A53 and ~1.2x on an Apple M1. Contributed by
@sdlyyxy in #319.
- The MSRV is now 1.66.1 for both the `blake3` crate and `b3sum`.
Given the myriad of `-mfpu` options for ARM [1], the inability to
portably query for CPU support, and the lack of standardized ISA names,
we have no choice but to opt out of automatically supplying NEON
compile flags. Instead we simply add the NEON optimized source file if
we detect an ISA with guaranteed NEON support (>= ARMv8) or the user
explicitly requests it (in which case they are expected to provide the
compile flags with `CMAKE_C_FLAGS` or `BLAKE3_CFLAGS_NEON`, either
through a toolchain file or command-line parameters).
[1]: https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
Changes since 1.3.3:
- The C implementation provides a `CMakeLists.txt` for callers who build
with CMake. The CMake build is not yet stable, and callers should
expect breaking changes in patch version updates. The "by hand" build
will always continue to be supported and documented.
- `b3sum` supports the `--seek` flag, to set the starting position in
the output stream.
- `b3sum --check` prints a summary of errors to stderr.
- `Hash::as_bytes` is const.
- `Hash` supports `from_bytes`, which is const.
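
As a rough illustration of what the two const items above allow, here is
a sketch with a stand-in `Hash` type. It is a toy wrapper around 32
bytes defined only for this example, not the `blake3` crate's type. Const
constructors and accessors let a hash value be built and inspected in
`const` context:

```rust
/// Toy stand-in for a 32-byte hash value (not the `blake3` crate's type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Hash([u8; 32]);

impl Hash {
    /// Wraps raw bytes. Being `const fn`, it works in constants and statics.
    pub const fn from_bytes(bytes: [u8; 32]) -> Self {
        Hash(bytes)
    }

    /// Borrows the raw bytes. Also usable in const context.
    pub const fn as_bytes(&self) -> &[u8; 32] {
        &self.0
    }
}

/// A hash constant built at compile time.
pub const ALL_ONES: Hash = Hash::from_bytes([0xff; 32]);

/// A byte read out of that constant, also at compile time.
pub const FIRST_BYTE: u8 = ALL_ONES.as_bytes()[0];
```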
The ISA names communicated by `CMAKE_SYSTEM_PROCESSOR` are not as
standardized as one would wish. Factor the different names into lists,
allowing for simpler checks and future updates.
Add hidden options for enabling SIMD support in case ISA detection
fails. These should only be used as temporary workarounds until the
ISA name lists have been updated or fixed.
For blake3 to be usable as a shared library on Windows, its public
symbols must be annotated. Use this as an opportunity to prune the
symbol table on other OSes, too.
Aggregate source files directly in the target instead of a proxy
variable.
Install CMake package config files in order to allow the project to be
found via `find_package()` by dependents.
Replace hard-coded SIMD compiler flags with configurable options. Retain
the current GCC/Clang flags as defaults for these compilers. Add default
SIMD compiler flags for MSVC.
Remove hard-coded compiler flags (including -fPIC). These are not
portable and should be set by the toolchain file or on the CLI.
- Guard ASM sources with triplet compatibility checks.
- Remove the `BLAKE3_STATIC` option in favor of [`BUILD_SHARED_LIBS`].
[`BUILD_SHARED_LIBS`]: https://cmake.org/cmake/help/v3.9/variable/BUILD_SHARED_LIBS.html
SSSE3 is indicated by bit 9 of ECX (CPUID leaf 1), not bit 0, which
indicates the presence of SSE3.
There are very few CPUs in use affected by this bug; SSE3 was part of
the Prescott new instructions, introduced in the later Pentium 4 chips,
whereas SSSE3 was introduced in Intel's Core 2 and AMD's Bulldozer. This
leaves a few Pentium 4 and Athlon 64 models that will potentially run an
illegal pshufb or pblendw.
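
The difference between the two checks can be sketched as below. The
function names and the sample ECX values are made up for illustration;
only the bit positions (CPUID leaf 1, ECX bit 0 for SSE3 and bit 9 for
SSSE3) come from the description above. The functions take the ECX value
as a plain integer so the example does not depend on the host CPU.

```rust
/// CPUID leaf 1, ECX bit 0: SSE3.
const ECX_SSE3: u32 = 1 << 0;
/// CPUID leaf 1, ECX bit 9: SSSE3.
const ECX_SSSE3: u32 = 1 << 9;

/// True if the given ECX value reports SSE3.
pub fn has_sse3(ecx: u32) -> bool {
    (ecx & ECX_SSE3) != 0
}

/// Correct check: SSSE3 is reported in bit 9.
pub fn has_ssse3(ecx: u32) -> bool {
    (ecx & ECX_SSSE3) != 0
}

/// The buggy check: tests bit 0, so it really answers "is SSE3 present?".
pub fn has_ssse3_buggy(ecx: u32) -> bool {
    (ecx & (1 << 0)) != 0
}
```

For a made-up ECX value of `0x1` (SSE3 only, as on a late Pentium 4),
`has_ssse3_buggy` returns true while `has_ssse3` returns false, which is
exactly the case where SSSE3 code would be dispatched incorrectly.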