Weaviate is a powerful open-source vector database designed to perform semantic search using vector embeddings. When you integrate it with GPU-powered servers, embedding generation becomes significantly faster, especially for large-scale or real-time applications.
In this guide, you’ll learn how to:
- Set up Weaviate using Docker
- Leverage a GPU for fast embedding generation
- Store custom embeddings in Weaviate
- Query and manage vectorized data using Python scripts
Prerequisites
- An Ubuntu 24.04 server with an NVIDIA GPU.
- A non-root user with sudo privileges.
- NVIDIA drivers installed on your server.
- Docker and the NVIDIA Container Toolkit installed, so containers can access the GPU.
Step 1: Install Required Dependencies
First, install Python and the necessary tools.
apt install python3 python3-pip python3-venv -y
Restart Docker so it picks up the NVIDIA container runtime.
systemctl restart docker
Verify that the NVIDIA container runtime is available.
docker info | grep -i runtime
Output.
Runtimes: io.containerd.runc.v2 nvidia runc
Default Runtime: runc
Step 2: Set Up Weaviate with Docker
Configure and launch Weaviate using Docker Compose, ensuring proper port exposure for both REST and gRPC connections that will be used for vector operations.
Create a docker-compose.yml file.
nano docker-compose.yml
Add the below configuration.
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051" # <-- required for gRPC
    environment:
      ENABLE_MODULES: ''
      DEFAULT_VECTORIZER_MODULE: 'none'
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
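By default, this configuration keeps Weaviate's data inside the container, so inserted documents are lost when the container is removed. If you want the collection to survive restarts, a hedged variant adds a named volume (the volume name weaviate_data is an assumption; /var/lib/weaviate matches Weaviate's default persistence path):

```yaml
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051" # <-- required for gRPC
    environment:
      ENABLE_MODULES: ''
      DEFAULT_VECTORIZER_MODULE: 'none'
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
    volumes:
      - weaviate_data:/var/lib/weaviate

volumes:
  weaviate_data:
```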
Start the Weaviate container using the below command.
docker-compose up -d
Verify the running container.
docker ps
Output.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2166c9434c83 semitechnologies/weaviate:latest "/bin/weaviate --hos…" 18 minutes ago Up 18 minutes 0.0.0.0:8080->8080/tcp, [::]:8080->8080/tcp, 0.0.0.0:50051->50051/tcp, [::]:50051->50051/tcp root-weaviate-1
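Beyond docker ps, you can confirm that Weaviate is actually accepting requests by polling its readiness endpoint (/v1/.well-known/ready, which returns HTTP 200 once the instance is up). A small stdlib sketch, where the function names are illustrative:

```python
import time
import urllib.request
import urllib.error

def weaviate_ready(url="http://localhost:8080", timeout=2.0):
    """Return True if the Weaviate readiness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{url}/v1/.well-known/ready", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

def wait_for_weaviate(url="http://localhost:8080", attempts=10, delay=1.0):
    """Poll the readiness endpoint until it succeeds or attempts run out."""
    for _ in range(attempts):
        if weaviate_ready(url):
            return True
        time.sleep(delay)
    return False
```

This is handy in scripts that start the container and immediately begin inserting data.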
Step 3: Set Up a Python Virtual Environment
In this section, we will create an isolated Python environment to manage dependencies cleanly and verify that PyTorch can properly access your GPU hardware.
Create a Python virtual environment.
python3 -m venv venv
Activate the environment.
source venv/bin/activate
Update pip to the latest version.
pip install --upgrade pip
Install required Python packages.
pip install sentence-transformers weaviate-client torch torchvision
Verify GPU availability.
python3 -c "import torch; print(torch.cuda.is_available())"
Output.
True
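The check above assumes PyTorch can see the GPU; in your own scripts it is convenient to wrap this in a helper that degrades gracefully when torch or CUDA is unavailable (the function name pick_device is illustrative):

```python
def pick_device():
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    try:
        import torch  # imported lazily so the helper also works without torch installed
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

The same pattern appears in the embedding script in the next step.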
Step 4: Generate Embeddings Using GPU
In this section, we will create a Python script that leverages GPU acceleration to convert text into high-dimensional vectors using the sentence-transformers library.
Create generate_embeddings.py.
nano generate_embeddings.py
Add the following code.
from sentence_transformers import SentenceTransformer
import torch

# Use the GPU when available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model = SentenceTransformer('all-MiniLM-L6-v2', device=device)

def get_embedding(text):
    # Encode text into a plain Python list of floats for Weaviate
    embedding = model.encode(text, convert_to_tensor=True)
    return embedding.cpu().numpy().tolist()

if __name__ == "__main__":
    # Sanity check: all-MiniLM-L6-v2 produces 384-dimensional vectors
    print(len(get_embedding("Hello, Weaviate!")))
Run the script.
python3 generate_embeddings.py
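Weaviate's default distance metric for vector search is cosine, so raw sentence-transformer vectors work as-is; if you later switch to dot-product distance, or simply want unit-length vectors, you can L2-normalize embeddings before storing them. A minimal stdlib sketch (the helper name l2_normalize is illustrative):

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit length; leave the zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]
```

sentence-transformers can also do this directly by passing normalize_embeddings=True to model.encode.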
Step 5: Insert Data into Weaviate
In this section, we will create and populate a Weaviate collection with your text documents and their corresponding GPU-generated vector embeddings.
Create insert_to_weaviate.py.
nano insert_to_weaviate.py
Add the following code.
from weaviate.connect import ConnectionParams
from weaviate import WeaviateClient
from weaviate.collections.classes.config import DataType, Property
from generate_embeddings import get_embedding

# Connect to Weaviate over REST (8080) and gRPC (50051)
connection_params = ConnectionParams.from_url("http://localhost:8080", 50051)
client = WeaviateClient(connection_params)
client.connect()

# Define the collection name
class_name = "Document"

# Create the collection if it does not exist
if not client.collections.exists(class_name):
    client.collections.create(
        name=class_name,
        vectorizer_config=None,  # We supply our own embeddings
        properties=[
            Property(name="content", data_type=DataType.TEXT)
        ]
    )

# Sample texts to insert
texts = [
    "Weaviate is a vector database.",
    "GPU servers are great for deep learning.",
    "Embeddings help convert text into searchable vectors."
]

collection = client.collections.get(class_name)

# Insert each text together with its GPU-generated vector
for text in texts:
    vector = get_embedding(text)
    collection.data.insert(
        properties={"content": text},
        vector=vector
    )

# Close the connection properly
client.close()
Run the script.
python3 insert_to_weaviate.py
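Encoding one text at a time leaves the GPU mostly idle; model.encode also accepts a list and batches it internally, so for larger datasets it helps to chunk documents and encode each chunk in a single call. A stdlib sketch of the chunking step (chunked is an illustrative helper; the encode and insert calls are shown as comments because they need the model and a running Weaviate):

```python
def chunked(items, size):
    """Yield successive fixed-size chunks from a list."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Illustrative flow: one batched forward pass per chunk, then insert each item.
# for batch in chunked(texts, 64):
#     vectors = model.encode(batch)
#     for text, vector in zip(batch, vectors):
#         collection.data.insert(properties={"content": text}, vector=vector.tolist())
```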
Step 6: Query Data in Weaviate
In this section, we will execute vector similarity searches against your Weaviate database to retrieve relevant documents based on semantic meaning rather than exact keyword matches.
Create search_weaviate.py.
nano search_weaviate.py
Add the following code.
from weaviate.connect import ConnectionParams
from weaviate import WeaviateClient
from generate_embeddings import get_embedding

# Connect to Weaviate
connection_params = ConnectionParams.from_url("http://localhost:8080", 50051)
client = WeaviateClient(connection_params)
client.connect()

collection = client.collections.get("Document")

# Query
query = "What is Weaviate?"
query_vector = get_embedding(query)

# Perform search
results = collection.query.near_vector(
    near_vector=query_vector,
    limit=3,
    return_properties=["content"]
)

# Print results
print("\nSearch Results:")
for result in results.objects:
    print(f"- {result.properties['content']}")

client.close()
Run the script.
python3 search_weaviate.py
You will see the following output.
Search Results:
- Weaviate is a vector database.
- Embeddings help convert text into searchable vectors.
- GPU servers are great for deep learning.
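The ordering above comes from vector distance: with no extra configuration, Weaviate ranks near_vector results by cosine distance. To illustrate what is being compared, here is a minimal stdlib sketch of cosine similarity (the function name cosine_similarity is illustrative):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

For cosine, Weaviate reports the complement as the distance, i.e. distance = 1 - similarity, so smaller distances mean closer matches.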
Step 7: Manage Documents
In this section, we will learn additional operations for maintaining your vector database, including listing, updating, and removing documents as needed.
First, create a script to list all documents.
nano list_documents.py
Add the following code.
from weaviate.connect import ConnectionParams
from weaviate import WeaviateClient

connection_params = ConnectionParams.from_url("http://localhost:8080", 50051)
client = WeaviateClient(connection_params)
client.connect()

collection = client.collections.get("Document")

# Iterate over every object in the collection and print its ID and content
print("Documents in 'Document' collection:")
for obj in collection.iterator():
    print(f"- ID: {obj.uuid} | Content: {obj.properties['content']}")

client.close()
Run the script.
python3 list_documents.py
You will see all documents in the output.
Documents in 'Document' collection:
- ID: 2189ca0b-ebdb-437a-90f1-d2881bfbbbd7 | Content: Weaviate is a vector database.
- ID: 296db11b-b2bc-4554-a236-2fee50579c2d | Content: GPU servers are great for deep learning.
- ID: dde5a358-058b-46f3-9033-47967df64a01 | Content: Embeddings help convert text into searchable vectors.
Create an update_document.py script to update the existing document.
nano update_document.py
Add the below code.
from weaviate.connect import ConnectionParams
from weaviate import WeaviateClient
from generate_embeddings import get_embedding

# Connect to Weaviate
connection_params = ConnectionParams.from_url("http://localhost:8080", 50051)
client = WeaviateClient(connection_params)
client.connect()

try:
    collection = client.collections.get("Document")

    # UUID of the document to update (replace with actual ID)
    doc_id = "296db11b-b2bc-4554-a236-2fee50579c2d"

    # New content
    new_text = "Updated description about vector databases."

    # Generate new embedding
    new_vector = get_embedding(new_text)

    # Delete existing object by ID
    collection.data.delete_by_id(doc_id)

    # Re-insert object with same ID and new content + vector
    collection.data.insert(
        uuid=doc_id,
        properties={"content": new_text},
        vector=new_vector
    )

    print("✅ Document updated successfully.")
finally:
    client.close()
Replace the ID 296db11b-b2bc-4554-a236-2fee50579c2d with the actual UUID of the document you want to update, taken from the list output above.
Run the script.
python3 update_document.py
Output.
✅ Document updated successfully.
Create a delete_document.py to delete the existing document.
nano delete_document.py
Add the below code.
from weaviate.connect import ConnectionParams
from weaviate import WeaviateClient

# Connect to Weaviate
connection_params = ConnectionParams.from_url("http://localhost:8080", 50051)
client = WeaviateClient(connection_params)
client.connect()

try:
    # Get the 'Document' collection
    collection = client.collections.get("Document")

    # Replace this with the actual UUID you want to delete
    doc_id = "2189ca0b-ebdb-437a-90f1-d2881bfbbbd7"

    # Delete the object by UUID
    deleted = collection.data.delete_by_id(doc_id)

    if deleted:
        print(f"✅ Document with ID {doc_id} deleted successfully.")
    else:
        print(f"❌ Document with ID {doc_id} not found or could not be deleted.")
finally:
    client.close()
Run the script.
python3 delete_document.py
Output.
✅ Document with ID 2189ca0b-ebdb-437a-90f1-d2881bfbbbd7 deleted successfully.
Conclusion
Congratulations! You’ve successfully created a powerful vector search system that combines Weaviate’s efficient similarity search capabilities with GPU-accelerated embedding generation. This setup demonstrates how modern hardware can dramatically improve the performance of semantic search applications.