This gives the assembly files the same prefix as the intrinsics files which
simplifies building when the build system should pick between the assembly and
the intrinsics files.
If AVX-512 is enabled, and the local C compiler doesn't support it, the
build is going to fail. However, if we check for this explicitly, we can
give a better error message.
Fixes https://github.com/BLAKE3-team/BLAKE3/issues/6.
The portable implementation was getting slowed down by converting back
and forth between words and bytes.
I made the corresponding change on the C side first
(12a37be8b5),
and as part of this commit I'm re-vendoring the C code. I'm also
exposing a small FFI interface to C so that blake3_neon.c can link
against portable.rs rather than blake3_portable.c, see c_neon.rs.