Retrieval Augmentation and Semantic Search at Scale – Ash Vardanian, Unum Cloud

December 22, 2023

Vector Search and Retrieval-Augmented Generation gained much attention in 2023. With thousands of open-source embedding models on HuggingFace and dozens of vector databases on GitHub, there’s a lot to explore. Yet not all of them scale well, and the high cost of AI workloads can hit hard.

This talk won’t be a tutorial. Instead, we’ll dive into technical benchmarks, spotlight the issues that block different solutions from scaling, and share lessons from multiple CLIP-like AI pre-training experiments and from serving over 10 billion vectors from a single machine.

We will cover the design decisions that went into the USearch and UForm open-source libraries and answer questions such as: What is the optimal GPU for my inference workload? Can one serve search results from SSDs instead of RAM? And which tools will let me do that?

Source: The Linux Foundation
