Table of Contents
- Prerequisites
- Step 1: Install Required Packages
- Step 2: Deploy StarRocks with Docker Compose
- Step 3: Connect to StarRocks Frontend
- Step 4: Create Tables in StarRocks for Prediction Data
- Step 5: Run AI Inference Using GPU (PyTorch)
- Step 6: Ingest Inference Data into StarRocks
- Step 7: Query Results in Real-Time
- Conclusion
Building a real-time AI inference pipeline lets you analyze and react to data as soon as it’s available. With the rise of deep learning and big data, you often need to combine high-performance AI models (using GPU acceleration) with fast, scalable analytics tools.
In this guide, you’ll learn how to set up an end-to-end pipeline that uses a GPU server for running AI inference with PyTorch and StarRocks.
Prerequisites
- An Ubuntu 24.04 server with an NVIDIA GPU.
- A non-root user with sudo privileges.
- NVIDIA drivers installed on your server.
Step 1: Install Required Packages
You need several tools and dependencies to build your AI inference pipeline and run StarRocks. This step will guide you through updating your system and installing everything you need.
First, refresh the package index so you install the latest available versions.
apt update -y
Next, install the required packages.
apt install python3 python3-pip python3-venv docker-compose default-jdk -y
Step 2: Deploy StarRocks with Docker Compose
Now that you have the required tools, it’s time to deploy a StarRocks cluster using Docker Compose. This makes it easy to set up and manage both the Frontend (FE) and Backend (BE) services needed for analytics.
First, create a directory for your StarRocks project.
mkdir ~/starrocks && cd ~/starrocks
Next, create a docker-compose.yml file.
nano docker-compose.yml
Add the following configuration.
version: '3.8'

services:
  fe:
    image: starrocks/fe-ubuntu:3.2-latest
    container_name: starrocks-fe
    ports:
      - "8030:8030"   # Web UI
      - "9030:9030"   # MySQL protocol
    environment:
      - FE_SERVERS=starrocks-fe:9010
    command:
      - /opt/starrocks/fe/bin/start_fe.sh
    volumes:
      - ./fe-meta:/opt/starrocks/fe/meta
    restart: always

  be:
    image: starrocks/be-ubuntu:3.2-latest
    container_name: starrocks-be
    depends_on:
      - fe
    ports:
      - "8040:8040"   # BE port
    command:
      - /opt/starrocks/be/bin/start_be.sh
    environment:
      - FE_SERVERS=starrocks-fe:9010
    volumes:
      - ./be-storage:/opt/starrocks/be/storage
    restart: always
Explanation:
- Frontend (FE): Handles queries and coordinates the cluster, with ports 8030 (web UI) and 9030 (MySQL protocol).
- Backend (BE): Stores data and executes queries, exposed on port 8040. It depends on the FE service.
- Both use local volumes to persist metadata and storage, and will restart automatically if they fail.
Start the StarRocks cluster. If the docker compose plugin is not available on your system, run docker-compose up -d instead.
docker compose up -d
Verify that containers are running.
docker ps
You should see both starrocks-fe and starrocks-be containers in the output, indicating that your StarRocks cluster is up and running.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
97465ff8fb67 starrocks/be-ubuntu:3.2-latest "/opt/starrocks/be/b…" 31 minutes ago Up 31 minutes 0.0.0.0:8040->8040/tcp, [::]:8040->8040/tcp starrocks-be
b184563cf3b3 starrocks/fe-ubuntu:3.2-latest "/opt/starrocks/fe/b…" 31 minutes ago Up 31 minutes 0.0.0.0:8030->8030/tcp, [::]:8030->8030/tcp, 0.0.0.0:9030->9030/tcp, [::]:9030->9030/tcp starrocks-fe
Step 3: Connect to StarRocks Frontend
With StarRocks running in Docker, it’s time to connect to the Frontend (FE) service and prepare it for storing AI inference results. You’ll use the MySQL client to manage your StarRocks databases and tables, since StarRocks supports the MySQL protocol for compatibility.
1. First, install the MySQL client package.
apt install mysql-client -y
2. Next, connect to StarRocks FE using MySQL protocol.
mysql -h 127.0.0.1 -P 9030 -uroot
This command connects to the StarRocks FE running on your server at port 9030, using the default root user.
3. Before you can create and use tables, you must register at least one backend (BE) node with the FE.
mysql> ALTER SYSTEM ADD BACKEND "starrocks-be:9050";
4. Verify that the backend is registered and active.
mysql> SHOW BACKENDS\G
You should see an output showing the BE node’s status, IP address, and “Alive: true.”
*************************** 1. row ***************************
BackendId: 10006
IP: 172.18.0.3
HeartbeatPort: 9050
BePort: 9060
HttpPort: 8040
BrpcPort: 8060
LastStartTime: 2025-06-28 11:06:13
LastHeartbeat: 2025-06-28 11:07:28
Alive: true
SystemDecommissioned: false
ClusterDecommissioned: false
TabletNum: 59
DataUsedCapacity: 0.000 B
AvailCapacity: 269.413 GB
TotalCapacity: 337.143 GB
UsedPct: 20.09 %
MaxDiskUsedPct: 20.09 %
ErrMsg:
Version: 3.2.16-8dea52d
Status: {"lastSuccessReportTabletsTime":"2025-06-28 11:07:14"}
DataTotalCapacity: 269.413 GB
DataUsedPct: 0.00 %
CpuCores: 4
NumRunningQueries: 0
MemUsedPct: 0.73 %
CpuUsedPct: 0.2 %
Location:
1 row in set (0.00 sec)
5. Exit from MySQL.
mysql> exit;
Step 4: Create Tables in StarRocks for Prediction Data
Now that your StarRocks cluster is up and the backend node is active, you need a place to store your AI inference results. You’ll create a database and a table designed for logging model predictions, making it easy to analyze results in real-time.
1. Connect to StarRocks FE with the MySQL client.
mysql -h 127.0.0.1 -P 9030 -uroot
2. Create a new database for your AI demo.
mysql> CREATE DATABASE ai_demo;
3. Switch to your new database.
mysql> USE ai_demo;
4. Create a predictions table to store model outputs.
mysql> CREATE TABLE predictions (
    filename VARCHAR(255),
    prediction INT,
    ts DATETIME DEFAULT CURRENT_TIMESTAMP
) ENGINE=OLAP
DUPLICATE KEY(filename)
DISTRIBUTED BY HASH(filename) BUCKETS 3
PROPERTIES (
    "replication_num" = "1"
);
5. Exit the MySQL client.
mysql> exit;
Step 5: Run AI Inference Using GPU (PyTorch)
With your StarRocks database ready, it’s time to generate some predictions using a GPU-accelerated AI model. In this step, you’ll set up a simple PyTorch pipeline to process images, run inference on your GPU, and export results for analytics.
1. Create a Python virtual environment for your project.
python3 -m venv venv
source venv/bin/activate
2. Install PyTorch and related libraries.
pip install torch torchvision pillow
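Before writing the inference script, confirm that PyTorch can see your GPU:
python3 -c "import torch; print(torch.cuda.is_available())"
This should print True. If it prints False, re-check your NVIDIA driver installation before continuing.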
3. Build your model inference script.
nano model_inference.py
Add the following code:
import torch
from torchvision.models import resnet50, ResNet50_Weights
from PIL import Image

# Load a pretrained ResNet-50 onto the GPU in evaluation mode
weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights).cuda().eval()

# Use the preprocessing pipeline that matches the pretrained weights
transform = weights.transforms()

def predict(image_path):
    """Return the predicted ImageNet class index for a single image."""
    image = Image.open(image_path).convert('RGB')
    input_tensor = transform(image).unsqueeze(0).cuda()
    with torch.no_grad():
        output = model(input_tensor)
    return output.argmax(dim=1).item()
4. Verify that the script runs and loads the model onto the GPU.
python3 model_inference.py
The script only defines the predict() helper, so it exits without output once the model has loaded successfully.
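To test inference on a single image, call predict() directly. This is a minimal check that assumes you have a sample image available; the path images/sample.jpg below is a placeholder, so point it at any JPEG on your server.
python3 -c "from model_inference import predict; print(predict('images/sample.jpg'))"
The command prints the predicted ImageNet class index for that image.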
5. Create another script to run batch inference on a folder of images.
nano batch_infer.py
Add the following code:
import os, csv
from model_inference import predict
from PIL import UnidentifiedImageError

os.makedirs("images", exist_ok=True)  # Ensure the images folder exists

with open("predictions.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["filename", "prediction"])
    # Run inference on every .jpg file in the images folder
    for file in os.listdir("images"):
        if file.endswith(".jpg"):
            path = os.path.join("images", file)
            try:
                label = predict(path)
                writer.writerow([file, label])
            except UnidentifiedImageError:
                print(f"Skipped invalid image: {file}")
This script loops through all .jpg files in the images folder, runs inference, and writes the results to predictions.csv.
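The script creates the images folder if it does not exist, but you need to place .jpg files in it yourself before running the batch job. For example (the source path below is a placeholder for wherever your test images live):
mkdir -p images
cp /path/to/your/*.jpg images/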
6. Run the batch inference.
python3 batch_infer.py
After this step, you’ll have a predictions.csv file with model outputs, ready to load into StarRocks for analytics.
Step 6: Ingest Inference Data into StarRocks
With your AI model’s predictions saved in predictions.csv, the next step is to load these results directly into your StarRocks database. StarRocks offers a simple and fast streaming API called Stream Load for ingesting batch data via HTTP.
1. Use the Stream Load API with curl.
curl --location-trusted -u root: \
-T predictions.csv \
-H "Expect: 100-continue" \
-H "Content-Type: application/octet-stream" \
-H "column_separator:," \
-H "columns: filename,prediction" \
http://localhost:8030/api/ai_demo/predictions/_stream_load
You should see a response like this:
{
"TxnId": 2,
"Label": "c62f6a53-aeff-4be2-a2b5-0d04733217ae",
"Status": "Success",
"Message": "OK",
"NumberTotalRows": 1,
"NumberLoadedRows": 1,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 21,
"LoadTimeMs": 195,
"BeginTxnTimeMs": 22,
"StreamLoadPlanTimeMs": 116,
"ReadDataTimeMs": 0,
"WriteDataTimeMs": 9,
"CommitAndPublishTimeMs": 46
}
Status: "Success" confirms your data was loaded.
Your AI inference results are now stored in StarRocks and ready for instant querying and analytics! In the next step, you’ll learn how to run real-time SQL queries on these predictions.
Step 7: Query Results in Real-Time
Once your predictions are loaded into StarRocks, you can instantly analyze and visualize the data using SQL queries. This step demonstrates how to access your AI inference results and perform real-time analytics.
1. Connect to StarRocks using the MySQL client.
mysql -h 127.0.0.1 -P 9030 -uroot
2. Switch to your AI demo database.
mysql> USE ai_demo;
3. View the latest predictions.
mysql> SELECT * FROM predictions ORDER BY ts DESC LIMIT 5;
This command fetches the five most recent inference results, letting you monitor new data as it arrives.
+----------+------------+---------------------+
| filename | prediction | ts |
+----------+------------+---------------------+
| filename | NULL | 2025-07-09 07:34:34 |
+----------+------------+---------------------+
1 row in set (0.07 sec)
4. Aggregate predictions for analytics.
mysql> SELECT prediction, COUNT(*) AS count FROM predictions GROUP BY prediction ORDER BY count DESC;
This query shows how many times each prediction label appears, great for summarizing your model’s results.
+------------+-------+
| prediction | count |
+------------+-------+
| NULL | 1 |
+------------+-------+
1 row in set (0.03 sec)
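The prediction column stores raw ImageNet class indices. To translate an index into a human-readable label, you can look it up in the metadata bundled with the torchvision weights used for inference. For example, inside your virtual environment (replace 207 with a prediction value from your table):
python3 -c "from torchvision.models import ResNet50_Weights; print(ResNet50_Weights.DEFAULT.meta['categories'][207])"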
Your pipeline is now complete, from GPU-powered inference to instant analytics with StarRocks!
Conclusion
You’ve now built a complete real-time AI inference pipeline using a GPU server and StarRocks. You started by preparing your Ubuntu 24.04 environment, then deployed StarRocks with Docker Compose. After connecting to StarRocks, you created a table for storing prediction results. Next, you used PyTorch to run image classification on your GPU, saved the predictions, and loaded them directly into StarRocks for real-time analytics.