Troubleshooting Common Issues
This guide covers the most common issues users encounter and how to resolve them.
Memory and Performance Issues
Out of Memory Errors
Symptoms:
allocation error: CapacityOverflow
Server crashes unexpectedly
Causes:
- Hitting the --allocator-size limit
- Large batch operations
- Image processing without streaming enabled
Solutions:
- Increase allocator size:
ahnlich-db run --allocator-size 21474836480 # 20 GiB (default is 10 GiB)
ahnlich-ai run --allocator-size 21474836480
- Enable streaming for images (AI proxy):
ahnlich-ai run --enable-streaming # 10x less memory, 40% slower
- Reduce batch sizes:
# Instead of:
large_batch = [entry1, entry2, ..., entry1000]
client.set(Set(store="my_store", inputs=large_batch))
# Do this:
batch_size = 100
for i in range(0, len(large_batch), batch_size):
    batch = large_batch[i:i+batch_size]
    client.set(Set(store="my_store", inputs=batch))
- Monitor memory usage:
# Check process memory
ps aux | grep ahnlich
# Monitor with top
top -p $(pgrep ahnlich)
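The `--allocator-size` flag takes a raw byte count, which is easy to get wrong by hand. A tiny helper (illustrative only, not part of any client library) to compute the value:

```python
def gib(n):
    """Convert a GiB count to the raw byte value --allocator-size expects."""
    return n * 1024 ** 3

# gib(20) == 21474836480, the 20 GiB value shown above
```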
Slow Query Performance
Symptoms:
- Queries taking longer than expected
- High CPU usage
Diagnostic Steps:
- Enable tracing to identify bottlenecks:
ahnlich-db run --enable-tracing --otel-endpoint http://localhost:4317
View traces in Jaeger UI at http://localhost:16686
- Check store size:
INFOSERVER
- Verify algorithm choice:
- Linear algorithms (Cosine, Euclidean, DotProduct) scale linearly with data size
- Use KDTree for faster searches with large datasets:
CREATESTORE my_store DIMENSION 128 NONLINEARALGORITHMINDEX (KDTree)
Solutions:
- Use predicate indices for filtering:
# Index frequently filtered fields
CREATEPREDINDEX my_store PREDICATES (category, author)
# Then filter efficiently
GETPRED 10 IN my_store WHERE (category = science)
- Optimize batch operations:
# Batch SET operations
entries = [entry1, entry2, ..., entry100]
client.set(Set(store="my_store", inputs=entries))
- Use appropriate similarity algorithm:
- CosineSimilarity: Best for normalized vectors, direction-based similarity
- EuclideanDistance: Best for absolute distance measures
- DotProduct: Fast when vectors are pre-normalized
- KDTree: Best for high-dimensional spatial searches
- Adjust thread pool size:
ahnlich-db run --threadpool-size 32 # Default: 16
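To see why DotProduct is the cheap choice for pre-normalized vectors, here is a small illustration in plain Python (not the Ahnlich implementation): once vectors are unit length, the dot product and cosine similarity coincide, so the normalization work can be skipped.

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product divided by the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

# For unit-length vectors, the plain dot product equals cosine similarity.
a = normalize([1.0, 2.0, 3.0])
b = normalize([2.0, 1.0, 0.5])
dot = sum(x * y for x, y in zip(a, b))
```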
Connection Issues
Cannot Connect to Server
Symptoms:
connection refused
Failed to dial server
Transport issues with tonic
Diagnostic Steps:
- Check if server is running:
# Check DB
curl http://localhost:1369 # or use telnet
ps aux | grep ahnlich-db
# Check AI
curl http://localhost:1370
ps aux | grep ahnlich-ai
- Verify port availability:
# Check if port is in use
lsof -i :1369
lsof -i :1370
# Or with netstat
netstat -tuln | grep 1369
- Check firewall rules:
# Ubuntu/Debian
sudo ufw status
sudo ufw allow 1369
sudo ufw allow 1370
# CentOS/RHEL
sudo firewall-cmd --list-all
sudo firewall-cmd --add-port=1369/tcp --permanent
sudo firewall-cmd --reload
Solutions:
- Start server on all interfaces:
# Allow connections from any IP
ahnlich-db run --host 0.0.0.0 --port 1369
ahnlich-ai run --host 0.0.0.0 --port 1370
- Check host/port configuration:
# Correct
client = DbClient("http://127.0.0.1:1369")
# Wrong - missing protocol
client = DbClient("127.0.0.1:1369") # Invalid URI error
- Verify network connectivity:
# Test connectivity
ping <server-host>
telnet <server-host> 1369
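If the server is simply slow to come up, a client-side retry with exponential backoff avoids hard failures during startup. A sketch (`connect_with_retry` is an illustrative helper, not part of the Ahnlich client):

```python
import time

def connect_with_retry(connect, attempts=5, base_delay=0.5):
    """Call `connect` until it succeeds, sleeping 0.5s, 1s, 2s, ... between tries."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries; surface the real error
            time.sleep(base_delay * (2 ** attempt))

# Usage, with DbClient as in the examples above:
# client = connect_with_retry(lambda: DbClient("http://127.0.0.1:1369"))
```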
Maximum Clients Reached
Symptoms:
Max Connected Clients Reached
Connection rejected
Cause: Hit the --maximum-clients limit (default: 1000)
Solutions:
- Increase client limit:
ahnlich-db run --maximum-clients 5000
- Check current connections:
LISTCLIENTS
- Implement connection pooling:
# Reuse connections instead of creating new ones
class ClientPool:
    def __init__(self, uri, pool_size=10):
        self.pool = [DbClient(uri) for _ in range(pool_size)]
        self.index = 0

    def get_client(self):
        client = self.pool[self.index]
        self.index = (self.index + 1) % len(self.pool)
        return client
- Close idle connections:
async def cleanup():
    await client.close()
AI Proxy Cannot Connect to Database
Symptoms:
Proxy Errored with connection refused
DatabaseClientError
Diagnostic Steps:
- Verify DB is running:
ps aux | grep ahnlich-db
- Check DB host/port:
# See what DB is listening on
netstat -tuln | grep 1369
Solutions:
- Start DB before AI:
# Terminal 1
ahnlich-db run --port 1369
# Terminal 2 (wait for DB to start)
ahnlich-ai run --db-host 127.0.0.1 --db-port 1369
- Verify connection settings:
# If DB is on different host
ahnlich-ai run --db-host 192.168.1.10 --db-port 1369
# If DB uses non-default port
ahnlich-ai run --db-port 1400
- For standalone mode (no DB):
ahnlich-ai run --without-db
- Adjust connection pool:
ahnlich-ai run --db-client-pool-size 20 # Default: 10
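When scripting startup, you can wait for the DB port to accept connections before launching the AI proxy. A sketch using only the standard library (`wait_for_port` is an illustrative helper, not an Ahnlich tool):

```python
import socket
import time

def wait_for_port(host, port, timeout=15.0, interval=0.5):
    """Poll until the TCP port accepts connections; return False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False

# Wait for the DB before launching the AI proxy:
# if wait_for_port("127.0.0.1", 1369):
#     subprocess.run(["ahnlich-ai", "run", "--db-port", "1369"])
```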
Data and Store Issues
Store Not Found
Symptoms:
Store "my_store" not found
Diagnostic Steps:
- List all stores:
LISTSTORES
- Check store name spelling:
# Store names are case-sensitive
"MyStore" β "mystore"
Solutions:
- Create the store:
# DB
CREATESTORE my_store DIMENSION 128
# AI
CREATESTORE my_store QUERYMODEL all-minilm-l6-v2 INDEXMODEL all-minilm-l6-v2
- Check persistence loaded:
# If using persistence
ahnlich-db run \
--enable-persistence \
--persist-location /path/to/data.dat \
--fail-on-startup-if-persist-load-fails true # Fail loudly if load fails
- Verify correct server:
# Make sure you're connecting to the right instance
client = DbClient("http://localhost:1369") # Not a different instance
Dimension Mismatch Errors
Symptoms:
Store dimension is [128], input dimension of [256] was specified
Cause: Vector dimensions don't match store configuration.
Solutions:
- Check store dimension:
INFOSERVER
# Look at store details
- For AI stores, verify model dimensions:
| Model | Embedding Dimension |
|---|---|
| all-minilm-l6-v2 | 384 |
| all-minilm-l12-v2 | 384 |
| bge-base-en-v1.5 | 768 |
| bge-large-en-v1.5 | 1024 |
| resnet-50 | 2048 |
| clip-vit-b32-* | 512 |
- Match query and index models:
# Both must have same dimensions
CreateStore(
store="my_store",
query_model=AiModel.BGE_BASE_EN_V15, # 768-dim
index_model=AiModel.BGE_BASE_EN_V15, # 768-dim (same)
)
- Recreate store with correct dimension:
DROPSTORE my_store IFTRUE
CREATESTORE my_store DIMENSION 768
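A cheap client-side guard catches mismatches before the round trip to the server. A sketch (`check_dimension` is an illustrative helper, not part of the client library):

```python
def check_dimension(vector, expected_dim):
    """Fail client-side before the server rejects a mismatched vector."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"store expects dimension {expected_dim}, got {len(vector)}"
        )
    return vector

# e.g. a bge-base-en-v1.5 store is 768-dimensional:
# check_dimension(embedding, 768)
```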
Predicate Not Found
Symptoms:
Predicate "author" not found in store
Cause: Querying by a predicate that wasn't indexed.
Solutions:
- Create predicate index:
CREATEPREDINDEX my_store PREDICATES (author, category)
- Or include when creating store:
CREATESTORE my_store DIMENSION 128 PREDICATES (author, category)
- Verify predicates exist:
INFOSERVER
# Check store predicates
Model and AI Issues
Model Not Loading
Symptoms:
index_model or query_model not selected or loaded
Error initializing a model thread
Tokenizer for model failed to load
Diagnostic Steps:
- Check supported models:
ahnlich-ai run --supported-models all-minilm-l6-v2,resnet-50
- Verify model cache:
# Default location
ls -la ~/.ahnlich/models
# Custom location
ahnlich-ai run --model-cache-location /path/to/models
- Check disk space:
df -h ~/.ahnlich/models
- Test network connectivity:
# Models download from HuggingFace
curl https://huggingface.co
Solutions:
- Wait for initial download:
# First time loading a model downloads from HuggingFace
# This can take several minutes depending on model size
# Watch logs for progress
- Clear corrupted cache:
rm -rf ~/.ahnlich/models/model_name
# Restart server to re-download
- Increase idle time:
# Keep models loaded longer
ahnlich-ai run --ai-model-idle-time 600 # 10 minutes (default: 5 min)
- Pre-download models:
# Download models before starting server
python -c "from transformers import AutoModel; AutoModel.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')"
Token Limit Exceeded
Symptoms:
Max Token Exceeded. Model Expects [256], input type was [512]
Cause: Text input exceeds model's token limit.
Token Limits:
- all-minilm-*: 256 tokens
- bge-*: 512 tokens
- clip-vit-b32-text: 77 tokens
Solutions:
- Truncate text:
def truncate_text(text, max_length=200):
    words = text.split()
    return ' '.join(words[:max_length])

text = truncate_text(long_text)
- Split into chunks:
def chunk_text(text, chunk_size=200):
    words = text.split()
    return [' '.join(words[i:i+chunk_size])
            for i in range(0, len(words), chunk_size)]

chunks = chunk_text(long_document)
for chunk in chunks:
    client.set(Set(store="docs", inputs=[...]))
- Use model with larger limit:
# Switch from AllMiniLM (256) to BGE (512)
CreateStore(
store="my_store",
query_model=AiModel.BGE_BASE_EN_V15, # 512 tokens
index_model=AiModel.BGE_BASE_EN_V15,
)
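Before sending, you can estimate whether a text fits a model's limit. A sketch using the limits listed above; the 1.3 tokens-per-word factor is a rough heuristic, not the model's real tokenizer, so use the actual tokenizer when you need an exact count:

```python
# Limits from the list above
MODEL_TOKEN_LIMITS = {
    "all-minilm-l6-v2": 256,
    "bge-base-en-v1.5": 512,
    "clip-vit-b32-text": 77,
}

def rough_token_count(text):
    # Crude estimate: subword tokenizers emit roughly 1.3 tokens per word.
    return int(len(text.split()) * 1.3)

def fits(text, model):
    """True if the estimate is within the model's token limit."""
    return rough_token_count(text) <= MODEL_TOKEN_LIMITS[model]
```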
Image Dimension Errors
Symptoms:
Image Dimensions [(512, 512)] does not match expected [(224, 224)]
Image can't have zero dimension
Cause: Image dimensions don't match the model's expected input size (224x224 pixels).
Solutions:
- Resize images:
from PIL import Image

def prepare_image(image_path):
    img = Image.open(image_path)
    img = img.resize((224, 224))
    return img.tobytes()

image_bytes = prepare_image("photo.jpg")
- Use model preprocessing:
Set(
store="my_store",
inputs=[...],
preprocess_action=PreprocessAction.ModelPreprocessing, # Auto-resize
)
- Validate images before sending:
import io
from PIL import Image

def validate_image(image_bytes):
    img = Image.open(io.BytesIO(image_bytes))
    if img.width == 0 or img.height == 0:
        raise ValueError("Invalid image dimensions")
    return img

img = validate_image(image_bytes)
Persistence Issues
Persistence File Won't Load
Symptoms:
Failed to load persistence file
Corruption detected
Diagnostic Steps:
- Check file permissions:
ls -l /path/to/persistence.dat
- Verify file size vs allocator:
# File size
du -h persistence.dat
# Allocator must be >2x file size
Solutions:
- Increase allocator size:
# If persistence file is 5 GB, use at least 10 GB allocator
ahnlich-db run \
--enable-persistence \
--persist-location /path/to/data.dat \
--allocator-size 10737418240 # 10 GB
- Skip corrupted persistence:
ahnlich-db run \
--enable-persistence \
--persist-location /path/to/data.dat \
--fail-on-startup-if-persist-load-fails false # Continue without persistence
- Backup and delete:
# Backup
cp persistence.dat persistence.dat.backup
# Start fresh
rm persistence.dat
ahnlich-db run --enable-persistence --persist-location persistence.dat
- Check disk space:
df -h /path/to/persistence/
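The "allocator must be >2x file size" rule above can be scripted when sizing a restart. A sketch (`min_allocator_size` is an illustrative helper, not an Ahnlich tool):

```python
import os

def min_allocator_size(persist_path, factor=2):
    """Recommend an --allocator-size of at least `factor` x the persistence file size."""
    return os.path.getsize(persist_path) * factor

# A 5 GB persistence file suggests passing at least a 10 GB allocator:
# ahnlich-db run --allocator-size $(python sizing.py)
```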
Data Lost After Restart
Cause: Persistence not enabled.
Solution:
Enable persistence when starting server:
ahnlich-db run \
--enable-persistence \
--persist-location /var/lib/ahnlich/db.dat \
--persistence-interval 300000 # 5 minutes
Debugging Tips
Enable Detailed Logging
# Set log level
ahnlich-db run --log-level debug
# Or specific modules
ahnlich-db run --log-level "info,ahnlich_db=debug,hf_hub=warn"
Enable Distributed Tracing
# Start Jaeger
docker run -d \
-p 16686:16686 \
-p 4317:4317 \
jaegertracing/all-in-one:latest
# Start server with tracing
ahnlich-db run \
--enable-tracing \
--otel-endpoint http://localhost:4317
# View traces at http://localhost:16686
Use CLI for Testing
# Interactive mode
ahnlich --agent DB --host 127.0.0.1 --port 1369
# Test commands
PING
INFOSERVER
LISTSTORES
Check Server Health
# Process status
ps aux | grep ahnlich
# Resource usage
top -p $(pgrep ahnlich)
# Network connections
netstat -anp | grep ahnlich
# Open files
lsof -p $(pgrep ahnlich)
Getting More Help
Still having issues? Try these resources:
- Check Error Codes: Error Codes Reference
- Read Configuration Docs: Configuration Reference
- Enable Tracing: See detailed request flow
- Community: WhatsApp Group
- GitHub: Report Issues
When reporting issues, include:
- Error messages (full text)
- Server version
- Configuration flags used
- Steps to reproduce
- Server logs (with --log-level debug)