Add some SVE implementations (facebookresearch#3933)
Summary:
related: facebookresearch#2884

I added some SVE implementations of:

- `code_distance`
    - `distance_single_code`
    - `distance_four_codes`
- `exhaustive_L2sqr_blas_cmax_sve`
- `fvec_inner_products_ny`
- `fvec_madd`
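
To illustrate the general shape of such a kernel, here is a minimal sketch of a multiply-add loop (`c[i] = a[i] + bf * b[i]`) written with SVE intrinsics. It is not the code from this PR: the function name `fvec_madd_sketch` is hypothetical, the argument order is an assumption, and a scalar fallback is included so the snippet also builds on targets without SVE.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

// Hypothetical helper (not the PR's implementation):
// computes c[i] = a[i] + bf * b[i] for i in [0, n).
void fvec_madd_sketch(
        std::size_t n,
        const float* a,
        float bf,
        const float* b,
        float* c) {
#if defined(__ARM_FEATURE_SVE)
    // Number of 32-bit lanes in one SVE vector (hardware dependent).
    const std::uint64_t lanes = svcntw();
    const std::uint64_t end = static_cast<std::uint64_t>(n);
    for (std::uint64_t i = 0; i < end; i += lanes) {
        // Predicate that is true only for lanes with index i + k < n,
        // so the tail is handled without a separate scalar loop.
        const svbool_t pg = svwhilelt_b32_u64(i, end);
        const svfloat32_t va = svld1_f32(pg, a + i);
        const svfloat32_t vb = svld1_f32(pg, b + i);
        // va + vb * bf on the active lanes.
        const svfloat32_t vc = svmla_n_f32_x(pg, va, vb, bf);
        svst1_f32(pg, c + i, vc);
    }
#else
    // Scalar fallback for non-SVE targets.
    for (std::size_t i = 0; i < n; ++i) {
        c[i] = a[i] + bf * b[i];
    }
#endif
}
```

Because the loop is predicated, the same code works for any SVE vector length, which is why a single binary can run on both Graviton 3 and Graviton 4. The SVE path uses a fused multiply-add, so results may differ from the scalar path in the last bit for inputs that are not exactly representable.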

## Evaluation result

I evaluated search performance on the SIFT1M dataset on AWS EC2 c7g.large and r8g.large instances.
`main` is the current (2e6551f) implementation.

### c7g.large (Graviton 3)

![g3_sift1m](https://github.com/user-attachments/assets/9c03cffa-72d1-4c77-9ae8-0ec0a5f5a6a5)

![g3_ivfpq](https://github.com/user-attachments/assets/4a8dfcc8-823c-4c31-ae79-3f4af9be28c8)

On Graviton 3, `IndexIVFPQ` shows the largest improvement. In the best case (IndexIVFPQ + IndexFlatL2, M: 32), this PR is approx. 2.38-~~2.50~~**2.44**x faster than `main`.

- nprobe: 1, 0.069ms/query → 0.029ms/query
- nprobe: 4, 0.181ms/query → ~~0.074~~**0.075**ms/query
- nprobe: 16, 0.613ms/query → ~~0.245~~**0.251**ms/query
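
The quoted speedup range can be checked directly from the per-query latencies listed above, taking the corrected (bold) values. The helper names below are hypothetical and exist only for this check.

```cpp
#include <cstdio>

// Speedup = latency on `main` / latency with this PR (both in ms/query).
double speedup(double before_ms, double after_ms) {
    return before_ms / after_ms;
}

// Prints the speedup for each nprobe setting reported for Graviton 3.
void print_graviton3_speedups() {
    const int nprobe[3] = {1, 4, 16};
    const double before_ms[3] = {0.069, 0.181, 0.613};
    const double after_ms[3] = {0.029, 0.075, 0.251};
    for (int i = 0; i < 3; ++i) {
        std::printf(
                "nprobe=%d: %.2fx\n",
                nprobe[i],
                speedup(before_ms[i], after_ms[i]));
    }
}
```

The three ratios come out to roughly 2.38, 2.41, and 2.44, which matches the stated 2.38-2.44x range.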

### r8g.large (Graviton 4)

![g4_sift1m](https://github.com/user-attachments/assets/e8510163-49d2-4143-babe-d406e2e40398)

![g4_ivfpq](https://github.com/user-attachments/assets/dc9a3ae0-a6b5-4a07-9898-c6aff372025c)

On Graviton 4, the improvement is largest for `IndexIVFPQ` with small `nprobe`. In the best case (IndexIVFPQ + IndexFlatL2, M: 8, nprobe: 1), this PR is approx. 1.33x faster than `main` (0.016ms/query → 0.012ms/query).

Pull Request resolved: facebookresearch#3933

Reviewed By: mengdilin

Differential Revision: D64249808

Pulled By: asadoughi

fbshipit-source-id: 8a625f0ab37732d330192599c851f864350885c4
vorj authored and facebook-github-bot committed Oct 15, 2024
1 parent 1ab7e5c commit dce7c09
Showing 4 changed files with 1,175 additions and 0 deletions.