Disable auto vectorization of xxhash64, when AVX512 is present. #3819

TocarIP · 2023-11-13T23:25:49Z

AVX512 adds support for the VPMULLQ instruction, which makes it possible to vectorize XXH64_update. Here is a minimal reproducer where clang already vectorizes it with AVX512 enabled: https://godbolt.org/z/cnM3986dx . Unfortunately, on some architectures (e.g. Skylake) this instruction has a very significant latency (uops.info reports 15+ cycles), so this makes xxhash ~2x slower. A trivial benchmark shows BM_hash/16M 1.29ms ± 3% vs BM_hash/16M 2.74ms ± 4%. We already disable vectorization for xxhash32 if sse4 is present, so we should probably do the same for xxhash64 and avx512.
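For context, the loop in question looks roughly like the sketch below: four independent 64-bit accumulators, each updated with a multiply, a rotate, and another multiply per 8 input bytes. The two constants are the published XXH64 primes; the names `sketch_rotl64`, `sketch_round`, and `sketch_update` are made up for illustration, and this is not the library's actual code. Since the four lanes do not depend on each other, a compiler targeting AVX512 may pack the lane multiplies into VPMULLQ.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Published XXH64 prime constants (names here are illustrative). */
#define SKETCH_PRIME64_1 0x9E3779B185EBCA87ULL
#define SKETCH_PRIME64_2 0xC2B2AE3D27D4EB4FULL

/* Rotate left; r must be in 1..63. */
static uint64_t sketch_rotl64(uint64_t x, unsigned r)
{
    return (x << r) | (x >> (64 - r));
}

/* One XXH64-style round: two 64-bit multiplies per 8-byte input word. */
static uint64_t sketch_round(uint64_t acc, uint64_t input)
{
    acc += input * SKETCH_PRIME64_2;
    acc  = sketch_rotl64(acc, 31);
    acc *= SKETCH_PRIME64_1;
    return acc;
}

/* Consume `nstripes` 32-byte stripes, one 8-byte word per lane.
 * The four lanes are independent, which is what makes the inner
 * loop a candidate for autovectorization. */
static void sketch_update(uint64_t acc[4], const unsigned char *p, size_t nstripes)
{
    for (size_t s = 0; s < nstripes; s++) {
        for (int lane = 0; lane < 4; lane++) {
            uint64_t word;
            /* Native-endian load; the real hash reads little-endian. */
            memcpy(&word, p + 32 * s + 8 * lane, sizeof word);
            acc[lane] = sketch_round(acc[lane], word);
        }
    }
}
```

Whether the vectorized form wins depends on how expensive the 64-bit vector multiply is on the target, which is the crux of this issue.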

Cyan4973 · 2023-11-13T23:28:06Z

> We already disable vectorization for xxhash32 if sse4 is present, so we should probably do the same for xxhash64 and avx512.

Indeed,
this seems like the same issue,
and we should probably use a similar solution.

TocarIP · 2023-11-16T19:31:02Z

Someone asked me about AMD offline, so I might as well post here. On Genoa (Zen 4), AVX512 is actually ~30% faster (1135871 ns/op vs 885935 ns/op).

Cyan4973 · 2023-11-17T00:25:36Z

OK, so it makes the situation a bit less clear,
since avx512 is now sometimes beneficial, sometimes detrimental.
It probably opens the door towards offering a user choice on this topic.

However, it doesn't remove the question of "what's a good default",
and if I read the situation correctly, it seems that disabling avx512 for xxh64 remains a reasonable default at this point in time.

TocarIP · 2023-11-17T20:03:35Z

The regression on Intel is bigger than the gain on AMD, so disabling it by default makes sense. Reusing the same XXH_ENABLE_AUTOVECTORIZE macro to allow opt-in is probably the best approach.
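A rough sketch of what such an opt-in could look like, assuming a GCC/Clang-style empty inline-asm barrier is used to keep the accumulator out of vector registers. `XXH_ENABLE_AUTOVECTORIZE` is the macro named above; `SKETCH_GUARD`, `sketch_round64`, and the choice of `__AVX512F__` as the feature test are assumptions for illustration, not the actual patch.

```c
#include <stdint.h>

/* Hypothetical guard: an empty asm statement that claims to read and
 * modify `var`, so the compiler has to materialize it in a scalar
 * register at this point. It emits no instructions. */
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_GUARD(var) __asm__("" : "+r"(var))
#else
#  define SKETCH_GUARD(var) ((void)0)
#endif

/* XXH64-style round with the barrier enabled by default on AVX512
 * builds, unless the user opts back in to autovectorization. */
static uint64_t sketch_round64(uint64_t acc, uint64_t input)
{
    acc += input * 0xC2B2AE3D27D4EB4FULL;   /* published XXH64 prime 2 */
    acc  = (acc << 31) | (acc >> 33);       /* rotate left by 31 */
    acc *= 0x9E3779B185EBCA87ULL;           /* published XXH64 prime 1 */
#if defined(__AVX512F__) && !defined(XXH_ENABLE_AUTOVECTORIZE)
    /* Assumed feature test: block autovectorization on AVX512 builds. */
    SKETCH_GUARD(acc);
#endif
    return acc;
}
```

Under these assumptions, building with `-DXXH_ENABLE_AUTOVECTORIZE` would drop the barrier and let the compiler vectorize again, which is what one would presumably want on Zen 4. The barrier does not change the computed value either way.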

Cyan4973 · 2024-03-06T00:15:54Z

Cyan4973/xxHash#924

Cyan4973 added a commit that referenced this issue Nov 13, 2023

update xxhash to v0.8.2

592b1ac

List of updates : https://github.com/Cyan4973/xxHash/releases/tag/v0.8.2 This is also a preparation task before taking care of #3819

This was referenced Nov 13, 2023

update xxhash library to v0.8.2 #3820

Merged

Disable auto vectorization of xxhash64, when AVX512 is present. Cyan4973/xxHash#897

Closed

gcflymoto pushed a commit to gcflymoto/zstd that referenced this issue Dec 9, 2023

update xxhash to v0.8.2

ee519bc

List of updates : https://github.com/Cyan4973/xxHash/releases/tag/v0.8.2 This is also a preparation task before taking care of facebook#3819

Cyan4973 mentioned this issue Mar 8, 2024

prevent XXH64 from being autovectorized by XXH512 by default #3933

Merged

Cyan4973 self-assigned this Mar 8, 2024

Cyan4973 closed this as completed in #3933 Mar 12, 2024

hswong3i pushed a commit to alvistack/facebook-zstd that referenced this issue Mar 27, 2024

update xxhash to v0.8.2

1e855d5

List of updates : https://github.com/Cyan4973/xxHash/releases/tag/v0.8.2 This is also a preparation task before taking care of facebook#3819
