Table of Contents
- Prerequisites
- Step 1: Install Required Packages
- Step 2: Deploy StarRocks with Docker Compose
- Step 3: Connect to StarRocks Frontend
- Step 4: Create Tables in StarRocks for Prediction Data
- Step 5: Run AI Inference Using GPU (PyTorch)
- Step 6: Ingest Inference Data into StarRocks
- Step 7: Query Results in Real-Time
- Conclusion
Building a real-time AI inference pipeline lets you analyze and react to data as soon as it’s available. With the rise of deep learning and big data, you often need to combine high-performance AI models (using GPU acceleration) with fast, scalable analytics tools.
In this guide, you’ll learn how to set up an end-to-end pipeline that uses a GPU server for running AI inference with PyTorch and StarRocks.
Prerequisites
- An Ubuntu 24.04 server with an NVIDIA GPU.
- A non-root user with sudo privileges.
- NVIDIA drivers installed on your server.
Step 1: Install Required Packages
You need several tools and dependencies to build your AI inference pipeline and run StarRocks. This step will guide you through updating your system and installing everything you need.
First, refresh the package index so you install the latest available versions.
apt update -y
Next, install the required packages.
apt install python3 python3-pip python3-venv docker-compose default-jdk -y
Step 2: Deploy StarRocks with Docker Compose
Now that you have the required tools, it’s time to deploy a StarRocks cluster using Docker Compose. This makes it easy to set up and manage both the Frontend (FE) and Backend (BE) services needed for analytics.
First, create a directory for your StarRocks project.
mkdir ~/starrocks && cd ~/starrocks
Next, create a docker-compose.yml file.
nano docker-compose.yml
Add the following configuration.
version: '3.8'

services:
  fe:
    image: starrocks/fe-ubuntu:3.2-latest
    container_name: starrocks-fe
    ports:
      - "8030:8030"   # Web UI
      - "9030:9030"   # MySQL protocol
    environment:
      - FE_SERVERS=starrocks-fe:9010
    command:
      - /opt/starrocks/fe/bin/start_fe.sh
    volumes:
      - ./fe-meta:/opt/starrocks/fe/meta
    restart: always

  be:
    image: starrocks/be-ubuntu:3.2-latest
    container_name: starrocks-be
    depends_on:
      - fe
    ports:
      - "8040:8040"   # BE port
    command:
      - /opt/starrocks/be/bin/start_be.sh
    environment:
      - FE_SERVERS=starrocks-fe:9010
    volumes:
      - ./be-storage:/opt/starrocks/be/storage
    restart: always
Explanation:
- Frontend (FE): Handles queries and coordinates the cluster, with ports 8030 (web UI) and 9030 (MySQL protocol).
- Backend (BE): Stores data and executes queries, exposed on port 8040. It depends on the FE service.
- Both use local volumes to persist metadata and storage, and will restart automatically if they fail.
Start the StarRocks cluster. If the docker compose plugin is not available on your system, run docker-compose up -d instead.
docker compose up -d
Verify that containers are running.
docker ps
You should see both starrocks-fe and starrocks-be containers in the output, indicating that your StarRocks cluster is up and running.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
97465ff8fb67 starrocks/be-ubuntu:3.2-latest "/opt/starrocks/be/b…" 31 minutes ago Up 31 minutes 0.0.0.0:8040->8040/tcp, [::]:8040->8040/tcp starrocks-be
b184563cf3b3 starrocks/fe-ubuntu:3.2-latest "/opt/starrocks/fe/b…" 31 minutes ago Up 31 minutes 0.0.0.0:8030->8030/tcp, [::]:8030->8030/tcp, 0.0.0.0:9030->9030/tcp, [::]:9030->9030/tcp starrocks-fe
Step 3: Connect to StarRocks Frontend
With StarRocks running in Docker, it’s time to connect to the Frontend (FE) service and prepare it for storing AI inference results. You’ll use the MySQL client to manage your StarRocks databases and tables, since StarRocks supports the MySQL protocol for compatibility.
1. First, install the MySQL client package.
apt install mysql-client -y
2. Next, connect to StarRocks FE using MySQL protocol.
mysql -h 127.0.0.1 -P 9030 -uroot
This command connects to the StarRocks FE running on your server at port 9030, using the default root user.
3. Before you can create and use tables, you must register at least one backend (BE) node with the FE.
mysql> ALTER SYSTEM ADD BACKEND "starrocks-be:9050";
4. Verify that the backend is registered and active.
mysql> SHOW BACKENDS\G
You should see an output showing the BE node’s status, IP address, and “Alive: true.”
*************************** 1. row ***************************
BackendId: 10006
IP: 172.18.0.3
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2025-06-28 11:06:13
LastHeartbeat: 2025-06-28 11:07:28
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 59
DataUsedCapacity: 0.000 B
AvailCapacity: 269.413 GB
TotalCapacity: 337.143 GB
UsedPct: 20.09 %
MaxDiskUsedPct: 20.09 %
ErrMsg:
Version: 3.2.16-8dea52d
Status: {"lastSuccessReportTabletsTime":"2025-06-28 11:07:14"}
DataTotalCapacity: 269.413 GB
DataUsedPct: 0.00 %
CpuCores: 4
NumRunningQueries: 0
MemUsedPct: 0.73 %
CpuUsedPct: 0.2 %
Location:
1 row in set (0.00 sec)
5. Exit from MySQL.
mysql> exit;
Step 4: Create Tables in StarRocks for Prediction Data
Now that your StarRocks cluster is up and the backend node is active, you need a place to store your AI inference results. You’ll create a database and a table designed for logging model predictions, making it easy to analyze results in real-time.
1. Connect to StarRocks FE with the MySQL client.
mysql -h 127.0.0.1 -P 9030 -uroot
2. Create a new database for your AI demo.
mysql> CREATE DATABASE ai_demo;
3. Switch to your new database.
mysql> USE ai_demo;
4. Create a predictions table to store model outputs.
mysql> CREATE TABLE predictions (
    filename VARCHAR(255),
    prediction INT,
    ts DATETIME DEFAULT CURRENT_TIMESTAMP
) ENGINE=OLAP
DUPLICATE KEY(filename)
DISTRIBUTED BY HASH(filename) BUCKETS 3
PROPERTIES (
    "replication_num" = "1"
);
5. Exit the MySQL client.
mysql> exit;
Step 5: Run AI Inference Using GPU (PyTorch)
With your StarRocks database ready, it’s time to generate some predictions using a GPU-accelerated AI model. In this step, you’ll set up a simple PyTorch pipeline to process images, run inference on your GPU, and export results for analytics.
1. Create a Python virtual environment for your project.
python3 -m venv venv
source venv/bin/activate
2. Install PyTorch and related libraries.
pip install torch torchvision pillow
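Before writing the inference script, confirm that PyTorch can see your GPU:
python3 -c "import torch; print(torch.cuda.is_available())"
This should print True. If it prints False, re-check your NVIDIA driver installation before continuing.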
3. Build your model inference script.
nano model_inference.py
Add the following code:
import torch
from torchvision.models import resnet50, ResNet50_Weights
from PIL import Image

# Load a pretrained ResNet-50 onto the GPU in evaluation mode
weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights).cuda().eval()

# Use the preprocessing pipeline that matches the pretrained weights
transform = weights.transforms()

def predict(image_path):
    """Return the predicted ImageNet class index for a single image."""
    image = Image.open(image_path).convert('RGB')
    input_tensor = transform(image).unsqueeze(0).cuda()
    with torch.no_grad():
        output = model(input_tensor)
    return output.argmax(dim=1).item()
4. Verify that the script runs and loads the model onto the GPU.
python3 model_inference.py
The script only defines the predict() helper, so it exits without output once the model has loaded successfully.
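To test inference on a single image, call predict() directly. This is a minimal check that assumes you have a sample image available; the path images/sample.jpg below is a placeholder, so point it at any JPEG on your server.
python3 -c "from model_inference import predict; print(predict('images/sample.jpg'))"
The command prints the predicted ImageNet class index for that image.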
5. Create another script to run batch inference on a folder of images.
nano batch_infer.py
Add the following code:
import os, csv
from model_inference import predict
from PIL import UnidentifiedImageError

os.makedirs("images", exist_ok=True)  # Ensure the images folder exists

with open("predictions.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["filename", "prediction"])
    # Run inference on every .jpg file in the images folder
    for file in os.listdir("images"):
        if file.endswith(".jpg"):
            path = os.path.join("images", file)
            try:
                label = predict(path)
                writer.writerow([file, label])
            except UnidentifiedImageError:
                print(f"Skipped invalid image: {file}")
This script loops through all .jpg files in the images folder, runs inference, and writes the results to predictions.csv.
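The script creates the images folder if it does not exist, but you need to place .jpg files in it yourself before running the batch job. For example (the source path below is a placeholder for wherever your test images live):
mkdir -p images
cp /path/to/your/*.jpg images/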
6. Run the batch inference.
python3 batch_infer.py
After this step, you’ll have a predictions.csv file with model outputs, ready to load into StarRocks for analytics.
Step 6: Ingest Inference Data into StarRocks
With your AI model’s predictions saved in predictions.csv, the next step is to load these results directly into your StarRocks database. StarRocks offers a simple and fast streaming API called Stream Load for ingesting batch data via HTTP.
1. Use the Stream Load API with curl.
curl --location-trusted -u root: \
-T predictions.csv \
-H "Expect: 100-continue" \
-H "Content-Type: application/octet-stream" \
-H "column_separator:," \
-H "columns: filename,prediction" \
http://localhost:8030/api/ai_demo/predictions/_stream_load
You should see a response like this:
{
"TxnId": 2,
"Label": "c62f6a53-aeff-4be2-a2b5-0d04733217ae",
"Status": "Success",
"Message": "OK",
"NumberTotalRows": 1,
"NumberLoadedRows": 1,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 21,
"LoadTimeMs": 195,
"BeginTxnTimeMs": 22,
"StreamLoadPlanTimeMs": 116,
"ReadDataTimeMs": 0,
"WriteDataTimeMs": 9,
"CommitAndPublishTimeMs": 46
}
Status: "Success" confirms your data was loaded.
Your AI inference results are now stored in StarRocks and ready for instant querying and analytics! In the next step, you’ll learn how to run real-time SQL queries on these predictions.
Step 7: Query Results in Real-Time
Once your predictions are loaded into StarRocks, you can instantly analyze and visualize the data using SQL queries. This step demonstrates how to access your AI inference results and perform real-time analytics.
1. Connect to StarRocks using the MySQL client.
mysql -h 127.0.0.1 -P 9030 -uroot
2. Switch to your AI demo database.
mysql> USE ai_demo;
3. View the latest predictions.
mysql> SELECT * FROM predictions ORDER BY ts DESC LIMIT 5;
This command fetches the five most recent inference results, letting you monitor new data as it arrives.
+----------+------------+---------------------+
| filename | prediction | ts |
+----------+------------+---------------------+
| filename | NULL | 2025-07-09 07:34:34 |
+----------+------------+---------------------+
1 row in set (0.07 sec)
4. Aggregate predictions for analytics.
mysql> SELECT prediction, COUNT(*) AS count FROM predictions GROUP BY prediction ORDER BY count DESC;
This query shows how many times each prediction label appears, great for summarizing your model’s results.
+------------+-------+
| prediction | count |
+------------+-------+
| NULL | 1 |
+------------+-------+
1 row in set (0.03 sec)
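The prediction column stores raw ImageNet class indices. To translate an index into a human-readable label, you can look it up in the metadata bundled with the torchvision weights used for inference. For example, inside your virtual environment (replace 207 with a prediction value from your table):
python3 -c "from torchvision.models import ResNet50_Weights; print(ResNet50_Weights.DEFAULT.meta['categories'][207])"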
Your pipeline is now complete, from GPU-powered inference to instant analytics with StarRocks!
Conclusion
You’ve now built a complete real-time AI inference pipeline using a GPU server and StarRocks. You started by preparing your Ubuntu 24.04 environment, then deployed StarRocks with Docker Compose. After connecting to StarRocks, you created a table for storing prediction results. Next, you used PyTorch to run image classification on your GPU, saved the predictions, and loaded them directly into StarRocks for real-time analytics.