Built by AI leaders from Uber, Google, Apple and Amazon. Developed and deployed with the world’s leading organizations.
Bigger Isn’t Always Better
Fine-tune smaller task-specific LLMs that outperform bloated alternatives from commercial vendors. Don’t pay for what you don’t need.
Efficient Fine-Tuning and Serving
Train and deploy task-specific open-source models in record time and under budget.
First-class fine-tuning experience
Predibase offers state-of-the-art fine-tuning techniques out of the box, such as quantization, low-rank adaptation, and memory-efficient distributed training, so your fine-tuning jobs are fast and efficient, even on commodity GPUs.
The most cost-effective serving infra
With Serverless Fine-Tuned Endpoints and token-based pricing, you can stop paying for GPU resources you don’t need. Our unique serving infrastructure, LoRAX, lets you cost-effectively serve many fine-tuned adapters on a single GPU in dedicated deployments.
Your Models, Your Property
Start owning and stop renting your LLMs. The models you build and customize on Predibase are your property, regardless of whether you use the Predibase Cloud and Serverless Fine-Tuned Endpoints or deploy inside your VPC.
The fastest way to fine-tune and deploy any open-source LLM
Fine-tune and serve any open-source LLM. Our proven, scalable infrastructure is available through either serverless fine-tuned endpoints or within your environment’s virtual private cloud.
Try Any Open-Source LLM in an Instant
Stop spending hours wrestling with complex model deployments before you’ve even started fine-tuning. Deploy and query the latest open-source pre-trained LLMs, such as Llama-2, Mistral and Zephyr, so you can test and evaluate the best base model for your use case. Scalable managed infrastructure in your VPC or the Predibase cloud lets you do this in minutes with just a few lines of code.
# Assumes the Predibase Python SDK is installed and you have an API token
from predibase import Predibase, DeploymentConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Deploy an LLM from HuggingFace
pb.deployments.create(
    name="my-llama-2-13b-deployment",
    description="Deployment of Llama-2-13B in Predibase Cloud",
    config=DeploymentConfig(
        base_model="meta-llama/Llama-2-13b",
    ),
)

# Prompt the deployed LLM
client = pb.deployments.client("my-llama-2-13b-deployment")
print(client.generate(
    "Write an algorithm in Java to reverse the words in a string.",
).generated_text)
Efficiently Fine-tune Models for Your Task
No more out-of-memory errors or costly training jobs. Fine-tune any open-source LLM on the most readily available GPUs using Predibase’s optimized training system. We automatically apply optimizations such as quantization, low-rank adaptation, and memory-efficient distributed training, and pair them with right-sized compute, so your jobs train successfully and as efficiently as possible.
# Kick off the fine-tune job and track the learning curves for your adapter in the Predibase UI
adapter = pb.finetuning.jobs.create(
    config={
        "base_model": "meta-llama/Llama-2-13b",  # specify a HuggingFace LLM to fine-tune
        "epochs": 3,
        "learning_rate": 0.0002,
    },
    dataset=my_dataset,  # Upload your dataset to Predibase beforehand.
    repo="my_adapter",
    description='Fine-tune "meta-llama/Llama-2-13b" with my dataset for my task.',
)
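To give a feel for why low-rank adaptation keeps fine-tuning cheap, here is a minimal back-of-the-envelope sketch. It is not Predibase code: the function names, the 5120 x 5120 layer shape, and the rank of 8 are illustrative assumptions. It compares the weights in one full projection matrix with the weights in the two small LoRA factors that would be trained instead.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Number of weights in a dense d_out x d_in projection matrix."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights in the two low-rank factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank


# Illustrative (assumed) dimensions: one square projection, adapter rank 8
d_model = 5120
rank = 8

full = full_params(d_model, d_model)
lora = lora_trainable_params(d_model, d_model, rank)
print(f"full: {full:,}  lora: {lora:,}  trainable share: {lora / full:.4%}")
```

Only the small factors receive gradients while the base weights stay frozen, which is what makes training on modest GPUs plausible.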
Dynamically Serve Many Fine-Tuned LLMs in One Deployment
Our serving infrastructure automatically scales up and down to meet the demands of your production environment. With our novel LoRA Exchange (LoRAX) architecture, you can dynamically serve many fine-tuned LLMs together at over 100x lower cost than dedicated deployments. Load and query them in seconds.
Read more about LoRAX.
# Prompt the fine-tuned adapter instantly, reusing the client created earlier for the deployed LLM
print(client.generate(
    "Write an algorithm in Java to reverse the words in a string.",
    adapter_id="my_adapter/3",  # Specify adapter/version for inference
).generated_text)
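The idea of serving many adapters from one deployment can be pictured with a toy layer. This is a conceptual sketch only, not how LoRAX is implemented: the class and method names are hypothetical, and the usual LoRA scaling factor is omitted for brevity. One frozen base matrix is shared by every request, and each request optionally adds the small low-rank update of the adapter it names.

```python
from typing import Dict, List, Optional, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class SharedBaseLayer:
    """One frozen base weight matrix plus a registry of per-adapter (A, B) factors."""

    def __init__(self, base_weight: Matrix) -> None:
        self.base_weight = base_weight
        self.adapters: Dict[str, Tuple[Matrix, Matrix]] = {}

    def load_adapter(self, adapter_id: str, a: Matrix, b: Matrix) -> None:
        # Adapters are tiny compared with the base weights, so many fit in memory.
        self.adapters[adapter_id] = (a, b)

    def forward(self, x: Vector, adapter_id: Optional[str] = None) -> Vector:
        y = matvec(self.base_weight, x)  # shared computation for every request
        if adapter_id is None:
            return y
        a, b = self.adapters[adapter_id]
        delta = matvec(b, matvec(a, x))  # low-rank update: B @ (A @ x)
        return [yi + di for yi, di in zip(y, delta)]


layer = SharedBaseLayer([[1.0, 0.0], [0.0, 1.0]])
layer.load_adapter("my_adapter/3", a=[[1.0, 1.0]], b=[[0.5], [0.0]])

base_out = layer.forward([2.0, 3.0])
tuned_out = layer.forward([2.0, 3.0], adapter_id="my_adapter/3")
print(base_out, tuned_out)
```

Because the base weights are loaded once and only the small factors differ per request, adding another fine-tuned model costs far less memory than another full deployment.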
Built on Proven Open-Source Technology
LoRAX
LoRAX (LoRA eXchange) enables users to serve thousands of fine-tuned LLMs on a single GPU, dramatically reducing the cost of serving without compromising on throughput or latency.
Ludwig
Ludwig is a declarative framework to develop, train, fine-tune, and deploy state-of-the-art deep learning and large language models. Ludwig puts AI in the hands of all engineers without requiring low-level code.
Horovod
Horovod is a distributed deep learning framework that scales PyTorch and TensorFlow training to hundreds of machines.
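As a rough illustration of the declarative style Ludwig is described as offering, a fine-tuning job might be expressed as a plain configuration rather than training code. The sketch below is written as a Python dictionary; the keys reflect our understanding of Ludwig's LLM configuration schema and should be checked against the Ludwig documentation, and the column names `prompt` and `response` are hypothetical.

```python
# A declarative fine-tuning configuration (a sketch, not a verified schema)
config = {
    "model_type": "llm",
    "base_model": "meta-llama/Llama-2-13b",
    "quantization": {"bits": 4},       # load base weights in 4-bit precision
    "adapter": {"type": "lora"},       # train a low-rank adapter, not the full model
    "input_features": [{"name": "prompt", "type": "text"}],     # hypothetical column
    "output_features": [{"name": "response", "type": "text"}],  # hypothetical column
    "trainer": {
        "type": "finetune",
        "epochs": 3,
        "learning_rate": 0.0002,
    },
}

# With Ludwig installed, a config like this would typically be passed to
# ludwig.api.LudwigModel(config=config) and trained on a dataset.
print(sorted(config))
```

The point of the declarative approach is that changing the base model, adapter type, or precision is a one-line edit to the configuration.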
Use Cases
Predibase lets you fine-tune any open-source LLM for your task-specific use case.