
Serve 100s of Fine-tuned LLMs for the Cost of Serving One with LoRAX

Predibase recently released LoRA Exchange (LoRAX), a new serving architecture that gives developers an efficient, cost-effective way to train and serve smaller, task-specific LLMs on any GPU.

Until now, fine-tuning and serving a large collection of models for production applications has required dedicated GPU resources for each deployed model, which quickly becomes cost-prohibitive.

LoRAX is a modular LLM serving architecture that allows users to dynamically serve 100+ fine-tuned models from a single GPU, effectively letting you serve them all for the price of one.
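
A quick back-of-envelope calculation shows why this works: the base model dominates GPU memory, while each LoRA adapter is tiny by comparison. The numbers below assume Llama-2-7b with rank-8 adapters on the attention query and value projections only; your adapter sizes will vary with rank and target modules.

```python
# Rough sizing: why 100 adapters fit where 100 full models would not.
# Assumes Llama-2-7b (32 layers, hidden size 4096) with rank-8 LoRA
# applied to q_proj and v_proj only; adjust for your configuration.
base_params = 7e9                         # full model: ~7B parameters
lora_params = 32 * 2 * 8 * (4096 + 4096)  # layers * matrices * r * (d_in + d_out)

print(f"base model (fp16):   {base_params * 2 / 1e9:.1f} GB")       # ~14.0 GB
print(f"one adapter (fp16):  {lora_params * 2 / 1e6:.1f} MB")       # ~8.4 MB
print(f"100 adapters (fp16): {100 * lora_params * 2 / 1e9:.2f} GB") # ~0.84 GB
```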

In this session, you will learn:

• Parameter-efficient fine-tuning with LoRA (first sketch below)
• Just-in-time dynamic loading of LoRA adapters (second sketch below)
• How to avoid out-of-memory (OOM) errors with tiered weight caching (also covered by the second sketch)
• Optimizing for high aggregate throughput (third sketch below)
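
On the first point: LoRA freezes the pretrained weight matrix W and learns a low-rank update, so a linear layer computes y = Wx + (alpha/r) * BAx, where A and B together hold far fewer parameters than W. The sketch below is a minimal, generic PyTorch illustration of that idea, not Predibase's implementation; the class name and default hyperparameters are our own.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA wrapper: y = W x + (alpha / r) * B(A(x)).
    The pretrained weight W stays frozen; only A and B are trained."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                  # freeze pretrained weights
        self.lora_A = nn.Linear(base.in_features, r, bias=False)
        self.lora_B = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)           # adapter starts as a no-op
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))

# Usage: wrap an existing projection, then train only the adapter weights.
layer = LoRALinear(nn.Linear(4096, 4096), r=8)
```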
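
On the second and third points: just-in-time loading means an adapter's weights are fetched only when a request actually targets it, and tiered caching demotes idle adapters from GPU memory to CPU RAM (with disk as the final tier) instead of letting the GPU run out of memory. The sketch below is a hypothetical illustration of that policy; the class, tier sizes, and file layout are illustrative assumptions, not LoRAX internals.

```python
from collections import OrderedDict
import torch

class TieredAdapterCache:
    """Hypothetical sketch: hot adapters live on the GPU, recently used
    ones in CPU RAM, and everything else stays on disk until requested."""

    def __init__(self, max_gpu: int = 8, max_cpu: int = 32):
        self.gpu = OrderedDict()   # adapter_id -> tensors on GPU (hottest tier)
        self.cpu = OrderedDict()   # adapter_id -> tensors in host RAM
        self.max_gpu, self.max_cpu = max_gpu, max_cpu

    def get(self, adapter_id: str) -> dict:
        if adapter_id in self.gpu:
            self.gpu.move_to_end(adapter_id)          # mark most recently used
            return self.gpu[adapter_id]
        if adapter_id in self.cpu:                    # warm hit: promote CPU -> GPU
            weights = self.cpu.pop(adapter_id)
        else:                                         # cold miss: load from disk
            weights = torch.load(f"{adapter_id}.pt", map_location="cpu")
        self._make_room()
        self.gpu[adapter_id] = {k: v.cuda() for k, v in weights.items()}
        return self.gpu[adapter_id]

    def _make_room(self):
        # Demote the least recently used GPU adapter instead of OOMing.
        if len(self.gpu) >= self.max_gpu:
            victim_id, victim = self.gpu.popitem(last=False)
            self.cpu[victim_id] = {k: v.cpu() for k, v in victim.items()}
            if len(self.cpu) > self.max_cpu:
                self.cpu.popitem(last=False)          # its disk copy remains
```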
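
And on the fourth point: aggregate throughput comes from batching requests for different adapters into a single forward pass, so the expensive base-model computation is shared across the whole batch and each adapter's low-rank correction is applied only to its own rows. The loop below is a deliberately simplified sketch of that math; a production system would fuse this into GPU kernels rather than iterate in Python.

```python
import torch

def batched_lora_forward(x, base_weight, adapters, adapter_ids):
    """Sketch of multi-adapter batching.
    x: (batch, d_in) inputs, one row per request
    base_weight: (d_out, d_in) shared frozen weight
    adapters: adapter_id -> (A, B) with A: (r, d_in), B: (d_out, r)
    adapter_ids: one adapter id per row of x
    """
    y = x @ base_weight.T                  # shared base matmul for every request
    for aid in set(adapter_ids):
        rows = [i for i, a in enumerate(adapter_ids) if a == aid]
        A, B = adapters[aid]
        y[rows] += (x[rows] @ A.T) @ B.T   # low-rank delta, only for this adapter's rows
    return y
```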

The session also includes a live demo showing how to use our new Python SDK to fine-tune and query Llama-2-7b with LoRAX through the Predibase 2-week free trial.
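
To give a flavor of what serving looks like, a deployed adapter is selected per request by passing an adapter ID alongside the prompt. The snippet below talks to a LoRAX endpoint over plain HTTP with `requests`; the URL, adapter name, and payload fields are assumptions for illustration, so follow the notebook linked below for the exact SDK calls.

```python
import requests

# Hypothetical endpoint and adapter name; substitute your own deployment.
ENDPOINT = "http://localhost:8080/generate"

resp = requests.post(
    ENDPOINT,
    json={
        "inputs": "What is parameter-efficient fine-tuning?",
        "parameters": {
            "adapter_id": "my-org/llama-2-7b-support",  # which LoRA adapter to apply
            "max_new_tokens": 128,
        },
    },
    timeout=60,
)
print(resp.json()["generated_text"])
```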

• LoRAX follow-along notebook: https://pbase.ai/loraxcolab
• Slides from the session: https://pbase.ai/loraxwebinarslides
• Fine-tune and serve LLMs for free with our trial: https://pbase.ai/getstarted


by Predibase

