Run AI with an API


How it works

You can get started with any model with just one line of code. But as you do more complex things, you can fine-tune models or deploy your own custom code.

Run models

Our community has already published thousands of models that are ready to use in production. You can run these with one line of code.

import replicate

output = replicate.run(
  "black-forest-labs/flux-dev",
  input={
    "aspect_ratio": "1:1",
    "num_outputs": 1,
    "output_format": "jpg",
    "output_quality": 80,
    "prompt": "An astronaut riding a rainbow unicorn, cinematic, dramatic",
  }
)

print(output)
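Depending on the client version, output is a list of URLs or of file-like objects. Here is a minimal sketch of saving file-like outputs to disk, assuming each item supports .read() (as the Python client's FileOutput does); the save_outputs helper is illustrative, not part of the library:

```python
from pathlib import Path

def save_outputs(items, dest_dir, ext="jpg"):
    """Write each file-like output to dest_dir, returning the paths written."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, item in enumerate(items):
        path = dest / f"output_{i}.{ext}"
        path.write_bytes(item.read())  # file-like outputs expose .read()
        paths.append(path)
    return paths

# e.g. save_outputs(output, "outputs")
```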

Fine-tune models with your own data

You can improve models with your own data to create new models that are better suited to specific tasks.

Image models like SDXL can generate images of a particular person, object, or style.

Train a model:

training = replicate.trainings.create(
  destination="mattrothenberg/drone-art",
  version="ostris/flux-dev-lora-trainer:e440909d3512c31646ee2e0c7d6f6f4923224863a6a10c494606e79fb5844497",
  input={
    "steps": 1000,
    "input_images": "https://example.com/images.zip",
    "trigger_word": "TOK",
  },
)
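The input_images parameter expects a URL to a zip archive of your training images. A quick way to bundle a local folder is the standard library's zipfile module (hosting the archive somewhere reachable is up to you; the bundle_images helper is a sketch, not part of the replicate client):

```python
import zipfile
from pathlib import Path

def bundle_images(folder, archive="images.zip", exts=(".jpg", ".jpeg", ".png", ".webp")):
    """Zip every matching image in folder and return the archive path."""
    folder = Path(folder)
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.iterdir()):
            if path.suffix.lower() in exts:
                zf.write(path, arcname=path.name)  # flat layout inside the zip
    return archive
```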

This will result in a new model.

Then, you can run it with one line of code:

output = replicate.run(
  "mattrothenberg/drone-art:abcde1234...",
  input={"prompt": "a photo of TOK forming a rainbow in the sky"},
)

Deploy custom models

You aren’t limited to the models on Replicate: you can deploy your own custom models using Cog, our open-source tool for packaging machine learning models.

Cog takes care of generating an API server and deploying it on a big cluster in the cloud. We scale up and down to handle demand, and you only pay for the compute that you use.

First, define the environment your model runs in with cog.yaml:

build:
  gpu: true
  system_packages:
    - "libgl1-mesa-glx"
    - "libglib2.0-0"
  python_version: "3.10"
  python_packages:
    - "torch==1.13.1"
predict: "predict.py:Predictor"

Next, define how predictions are run on your model with predict.py:

from cog import BasePredictor, Input, Path
import torch

class Predictor(BasePredictor):
    def setup(self):
        """Load the model into memory to make running multiple predictions efficient"""
        self.model = torch.load("./weights.pth")

    # The arguments and types the model takes as input
    def predict(self,
        image: Path = Input(description="Grayscale input image")
    ) -> Path:
        """Run a single prediction on the model"""
        # preprocess and postprocess are your own helpers, e.g. for
        # decoding the image into a tensor and writing the result to a file
        processed_image = preprocess(image)
        output = self.model(processed_image)
        return postprocess(output)