Recurrent Neural Networks (RNNs) are powerful deep learning models, particularly well-suited for sequence labeling tasks such as named entity recognition, part-of-speech tagging, or speech recognition. When combined with GPU acceleration, RNNs can process sequences much faster than on CPUs alone.
In this guide, we’ll walk through setting up an Ubuntu 24.04 GPU server for sequence labeling tasks using RNNs with Python and TensorFlow/Keras.
Prerequisites
- An Ubuntu 24.04 server with an NVIDIA GPU.
- A non-root user with sudo privileges.
- NVIDIA drivers installed.
- Compatible NVIDIA CUDA Toolkit and cuDNN library installed.
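Before you begin, you can confirm that the driver and GPU are visible by running nvidia-smi, which should list your GPU along with the driver and CUDA versions.
nvidia-smi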
Step 1: Setting Up the Environment
First, let’s install the necessary system packages and create a Python virtual environment.
1. Install required system packages.
sudo apt update
sudo apt install -y python3-pip python3-dev python3-venv build-essential libcupti-dev git wget unzip
2. Create and activate a Python virtual environment.
python3 -m venv rnn_env
source rnn_env/bin/activate
3. Upgrade pip and install the required Python packages.
pip install --upgrade pip
pip install tensorflow numpy pandas matplotlib scikit-learn seqeval
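Note: the plain tensorflow wheel relies on the system CUDA Toolkit and cuDNN from the prerequisites. If TensorFlow fails to find them in the next step, one alternative is the variant that bundles its own CUDA libraries (quote the brackets so your shell does not expand them):
pip install "tensorflow[and-cuda]"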
4. Let’s check that TensorFlow can detect your GPU.
python3 -c "import tensorflow as tf; print('Num GPUs:', len(tf.config.list_physical_devices('GPU')))"
Output.
Num GPUs: 1
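By default, TensorFlow claims most of the GPU's memory as soon as it initializes. If you share the GPU with other processes, you can optionally enable memory growth near the top of your scripts; this is a minimal sketch using the standard tf.config API.
import tensorflow as tf

# Allocate GPU memory on demand instead of reserving it all at startup
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)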
Step 2: Prepare the Project Structure
1. Create a directory structure for our sequence labeling project.
mkdir -p ~/sequence_labeling/{data,models,utils}
cd ~/sequence_labeling
2. Download the dataset. We’ll use the CoNLL-2003 dataset for this example.
wget https://data.deepai.org/conll2003.zip -P data/
3. Extract the downloaded file to the data directory.
unzip data/conll2003.zip -d data/
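Each file in the archive uses the CoNLL format: one token per line followed by its part-of-speech tag, chunk tag, and NER tag, with a blank line separating sentences (a line such as EU NNP B-NP B-ORG, where the final column is the NER tag our loader reads). You can peek at the first few lines to confirm the layout.
head -n 8 data/train.txt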
Step 3: Create Data Loading Utilities
In this section, we will create helper functions to load and preprocess the CoNLL dataset.
1. Create a data loader script to handle the dataset.
nano utils/data_loader.py
Add the following code.
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical

def load_conll_data(file_path):
    """Read a CoNLL-format file into parallel lists of token and label sequences."""
    tokens, labels = [], []
    with open(file_path, 'r', encoding='utf-8') as f:
        current_tokens, current_labels = [], []
        for line in f:
            line = line.strip()
            if not line:
                # A blank line marks the end of a sentence
                if current_tokens:
                    tokens.append(current_tokens)
                    labels.append(current_labels)
                    current_tokens, current_labels = [], []
            else:
                parts = line.split()
                current_tokens.append(parts[0])   # the token is the first column
                current_labels.append(parts[-1])  # the NER tag is the last column
    return tokens, labels

def prepare_data(train_path, test_path):
    train_tokens, train_labels = load_conll_data(train_path)
    test_tokens, test_labels = load_conll_data(test_path)

    # Create vocabulary and label mappings; reserve 0 for padding and 1 for unknown words
    word2idx = {w: i + 2 for i, w in enumerate(set(w for s in train_tokens for w in s))}
    word2idx['<PAD>'] = 0
    word2idx['<UNK>'] = 1
    label2idx = {l: i for i, l in enumerate(set(l for s in train_labels for l in s))}
    idx2label = {i: l for l, i in label2idx.items()}

    # Convert tokens and labels to indices
    train_sequences = [[word2idx.get(w, word2idx['<UNK>']) for w in s] for s in train_tokens]
    train_labels = [[label2idx[l] for l in s] for s in train_labels]
    test_sequences = [[word2idx.get(w, word2idx['<UNK>']) for w in s] for s in test_tokens]
    test_labels = [[label2idx[l] for l in s] for s in test_labels]

    # Pad all sequences to the length of the longest training sentence
    max_len = max(len(s) for s in train_sequences)
    train_sequences = pad_sequences(train_sequences, maxlen=max_len, padding='post')
    train_labels = pad_sequences(train_labels, maxlen=max_len, padding='post')
    test_sequences = pad_sequences(test_sequences, maxlen=max_len, padding='post')
    test_labels = pad_sequences(test_labels, maxlen=max_len, padding='post')

    # One-hot encode the labels
    num_classes = len(label2idx)
    train_labels = [to_categorical(i, num_classes=num_classes) for i in train_labels]
    test_labels = [to_categorical(i, num_classes=num_classes) for i in test_labels]

    return {
        'word2idx': word2idx,
        'label2idx': label2idx,
        'idx2label': idx2label,
        'train_sequences': train_sequences,
        'train_labels': train_labels,
        'test_sequences': test_sequences,
        'test_labels': test_labels,
        'max_len': max_len,
        'num_classes': num_classes,
        'vocab_size': len(word2idx)
    }
This module reads the CoNLL files, maps tokens and labels to integer indices, pads every sequence to the length of the longest training sentence, and one-hot encodes the labels.
2. The module only defines functions and is imported by the scripts in the following steps, so running it directly produces no output. You can optionally run a quick sanity check, as shown below.
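The following one-liner (a quick check; it assumes the archive extracted to data/train.txt and data/test.txt) loads the dataset and prints a few statistics.
python3 -c "from utils.data_loader import prepare_data; d = prepare_data('data/train.txt', 'data/test.txt'); print('train sentences:', len(d['train_sequences']), '| vocab size:', d['vocab_size'], '| label classes:', d['num_classes'])"
If the paths are correct, this prints the number of training sentences along with the vocabulary and label counts.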
Step 4: Build and Train the RNN Model
Here we build a BiLSTM-based sequence labeling model and train it using our processed data.
1. Create a training script.
nano train.py
Add the following training code.
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Embedding, Bidirectional, LSTM, TimeDistributed, Dense
from utils.data_loader import prepare_data
import os
import numpy as np
import pickle

def build_rnn_model(vocab_size, max_len, num_classes):
    input_layer = Input(shape=(max_len,))
    embedding = Embedding(input_dim=vocab_size, output_dim=128)(input_layer)
    lstm = Bidirectional(LSTM(units=64, return_sequences=True))(embedding)
    output = TimeDistributed(Dense(num_classes, activation='softmax'))(lstm)
    model = Model(inputs=input_layer, outputs=output)
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

def main():
    # Prepare data
    data = prepare_data('data/train.txt', 'data/test.txt')

    # Build model
    model = build_rnn_model(data['vocab_size'], data['max_len'], data['num_classes'])
    model.summary()

    # Train model
    history = model.fit(
        np.array(data['train_sequences']),
        np.array(data['train_labels']),
        validation_data=(np.array(data['test_sequences']), np.array(data['test_labels'])),
        batch_size=32,
        epochs=10
    )

    # Save model and metadata
    os.makedirs('models', exist_ok=True)
    model.save('models/rnn_sequence_labeler.h5')
    with open('models/metadata.pkl', 'wb') as f:
        pickle.dump({
            'word2idx': data['word2idx'],
            'label2idx': data['label2idx'],
            'idx2label': data['idx2label'],
            'max_len': data['max_len']
        }, f)
    print("Model training complete and saved to models/ directory")

if __name__ == "__main__":
    main()
This script builds a BiLSTM sequence labeler: the bidirectional LSTM reads each sentence in both directions, so every token's prediction can draw on both left and right context, and the TimeDistributed dense layer emits a label probability distribution for each timestep.
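One optional refinement, not included in the script above: because every sentence is padded to max_len, the model also trains and reports accuracy on padding positions. A minimal way to have Keras ignore them, assuming word index 0 is reserved for <PAD> as in our data loader, is to enable masking on the embedding layer.
# Hypothetical variant of the embedding line in build_rnn_model:
# mask_zero=True makes downstream layers and metrics skip timesteps whose input is 0 (<PAD>)
embedding = Embedding(input_dim=vocab_size, output_dim=128, mask_zero=True)(input_layer)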
2. Run the training script.
python3 train.py
Output.
Model: "functional"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ input_layer (InputLayer) │ (None, 113) │ 0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ embedding (Embedding) │ (None, 113, 128) │ 3,024,128 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ bidirectional (Bidirectional) │ (None, 113, 128) │ 98,816 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ time_distributed (TimeDistributed) │ (None, 113, 9) │ 1,161 │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
Total params: 3,124,105 (11.92 MB)
Trainable params: 3,124,105 (11.92 MB)
Non-trainable params: 0 (0.00 B)
Epoch 1/10
I0000 00:00:1744894143.190118 11843 cuda_dnn.cc:529] Loaded cuDNN version 90700
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step - accuracy: 0.6667 - loss: 1.0917 - val_accuracy: 0.2222 - val_loss: 1.1006
Epoch 2/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 62ms/step - accuracy: 0.6667 - loss: 1.0773 - val_accuracy: 0.3333 - val_loss: 1.1023
Epoch 3/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 60ms/step - accuracy: 0.6667 - loss: 1.0628 - val_accuracy: 0.2222 - val_loss: 1.1041
Epoch 4/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step - accuracy: 0.6667 - loss: 1.0481 - val_accuracy: 0.2222 - val_loss: 1.1062
Epoch 5/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 60ms/step - accuracy: 0.6667 - loss: 1.0328 - val_accuracy: 0.2222 - val_loss: 1.1084
Epoch 6/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 58ms/step - accuracy: 0.6667 - loss: 1.0168 - val_accuracy: 0.2222 - val_loss: 1.1110
Epoch 7/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 58ms/step - accuracy: 0.6667 - loss: 0.9998 - val_accuracy: 0.2222 - val_loss: 1.1139
Epoch 8/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 59ms/step - accuracy: 0.6667 - loss: 0.9818 - val_accuracy: 0.2222 - val_loss: 1.1173
Epoch 9/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 60ms/step - accuracy: 0.6667 - loss: 0.9627 - val_accuracy: 0.2222 - val_loss: 1.1212
Epoch 10/10
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 62ms/step - accuracy: 0.6667 - loss: 0.9424 - val_accuracy: 0.2222 - val_loss: 1.1257
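To confirm that training is actually running on the GPU, you can watch utilization from a second terminal while train.py runs.
watch -n 1 nvidia-smi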
Step 5: Evaluate the Model
This section assesses the performance of the trained model on the test data, using accuracy and classification reports.
1. Create an evaluation script.
nano test.py
Add the following code.
import tensorflow as tf
import numpy as np
import pickle
from sklearn.metrics import classification_report
from seqeval.metrics import classification_report as seqeval_report
from utils.data_loader import prepare_data

def load_model_and_metadata():
    model = tf.keras.models.load_model('models/rnn_sequence_labeler.h5')
    with open('models/metadata.pkl', 'rb') as f:
        metadata = pickle.load(f)
    return model, metadata

def evaluate_model():
    model, metadata = load_model_and_metadata()

    # Reload test data
    data = prepare_data('data/train.txt', 'data/test.txt')

    # Predict on test data
    y_pred = model.predict(np.array(data['test_sequences']))
    y_pred = np.argmax(y_pred, axis=-1)
    y_true = np.argmax(np.array(data['test_labels']), axis=-1)

    # Convert indices to labels for each sequence
    test_sequences = np.array(data['test_sequences'])
    true_labels, pred_labels = [], []
    for i in range(len(y_true)):
        true_seq, pred_seq = [], []
        for j in range(len(y_true[i])):
            # Skip padded positions (word index 0 is the <PAD> token)
            if test_sequences[i][j] == 0:
                continue
            true_seq.append(metadata['idx2label'].get(int(y_true[i][j]), 'O'))
            pred_seq.append(metadata['idx2label'].get(int(y_pred[i][j]), 'O'))
        true_labels.append(true_seq)
        pred_labels.append(pred_seq)

    # Print token-level classification report
    print("Token-level Classification Report:")
    flat_true = [label for seq in true_labels for label in seq]
    flat_pred = [label for seq in pred_labels for label in seq]
    print(classification_report(flat_true, flat_pred))

    # Print entity-level report (seqeval scores whole entity spans)
    print("\nSequence-level Classification Report:")
    print(seqeval_report(true_labels, pred_labels))

if __name__ == "__main__":
    evaluate_model()
This script reloads the saved model, skips padded positions, and prints two reports on the test set: a token-level report from scikit-learn and an entity-level report from seqeval.
2. Run the evaluation.
python3 test.py
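The two reports measure different things: scikit-learn scores each token independently, while seqeval only credits an entity when the whole span is correct. A small illustrative check with hypothetical labels (not from our dataset):
python3 -c "from seqeval.metrics import f1_score; print(f1_score([['B-ORG', 'I-ORG', 'O']], [['B-ORG', 'O', 'O']]))"
Two of the three token labels match, yet seqeval should report an F1 of 0.0, because the predicted ORG span covers only part of the true entity.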
Step 6: Create a Prediction Script
1. Finally, let’s create a script to make predictions on new text.
nano predict.py
Add the following code.
import tensorflow as tf
import numpy as np
import pickle
from tensorflow.keras.preprocessing.sequence import pad_sequences

def load_model_and_metadata():
    model = tf.keras.models.load_model('models/rnn_sequence_labeler.h5')
    with open('models/metadata.pkl', 'rb') as f:
        metadata = pickle.load(f)
    return model, metadata

def predict_sequence(text):
    model, metadata = load_model_and_metadata()
    tokens = text.split()
    # Map words unseen during training to the <UNK> index
    sequence = [metadata['word2idx'].get(w, metadata['word2idx']['<UNK>']) for w in tokens]
    padded = pad_sequences([sequence], maxlen=metadata['max_len'], padding='post')
    prediction = model.predict(padded)
    prediction = np.argmax(prediction, axis=-1)[0]
    # zip truncates the padded prediction back to the original token count
    return [(token, metadata['idx2label'].get(idx, 'O')) for token, idx in zip(tokens, prediction)]

if __name__ == "__main__":
    while True:
        text = input("\nEnter text to analyze (or 'quit' to exit): ")
        if text.lower() == 'quit':
            break
        results = predict_sequence(text)
        print("\nPredicted labels:")
        for token, label in results:
            print(f"{token}: {label}")
This script allows users to enter a sentence and receive predicted labels for each word using the trained RNN model.
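For scripted, non-interactive use, you can also import predict_sequence directly; this one-liner (assuming training has completed and the models/ directory exists) labels a single sentence:
python3 -c "from predict import predict_sequence; print(predict_sequence('Microsoft is based in USA'))"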
2. Test the prediction script.
python3 predict.py
You will be asked to enter the text to analyze.
Enter text to analyze (or 'quit' to exit): Microsoft is based in USA
Type “Microsoft is based in USA” and press Enter. You should see output similar to the following.
I0000 00:00:1744894809.978962 13089 cuda_dnn.cc:529] Loaded cuDNN version 90700
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 1s/step
Predicted labels:
Microsoft: B-ORG
is: O
based: O
in: O
USA: B-LOC
Conclusion
In this guide, you learned how to set up a GPU-powered RNN model for sequence labeling tasks on an Ubuntu 24.04 server, walking through environment setup, data loading, model training, evaluation, and prediction. This pipeline can be extended with techniques such as CRF layers, attention mechanisms, or transformer-based models to achieve even greater accuracy.