Troubleshooting Common Issues
This guide covers the most common issues users encounter and how to resolve them.
Memory and Performance Issues
Out of Memory Errors
Symptoms:
allocation error: CapacityOverflow
Server crashes unexpectedly
Causes:
- Hitting the --allocator-size limit
- Large batch operations
- Image processing without streaming enabled
Solutions:
- Increase allocator size:
ahnlich-db run --allocator-size 21474836480 # 20 GiB (default is 10 GiB)
ahnlich-ai run --allocator-size 21474836480
- Enable streaming for images (AI proxy):
ahnlich-ai run --enable-streaming # 10x less memory, 40% slower
- Reduce batch sizes:
# Instead of:
large_batch = [entry1, entry2, ..., entry1000]
client.set(Set(store="my_store", inputs=large_batch))
# Do this:
batch_size = 100
for i in range(0, len(large_batch), batch_size):
    batch = large_batch[i:i+batch_size]
    client.set(Set(store="my_store", inputs=batch))
- Monitor memory usage:
# Check process memory
ps aux | grep ahnlich
# Monitor with top
top -p $(pgrep ahnlich)
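The `--allocator-size` flag takes a raw byte count, which is easy to get wrong by hand. A tiny helper (illustrative only, not part of any client library) to compute the value:

```python
def gib(n):
    """Convert a GiB count to the raw byte value --allocator-size expects."""
    return n * 1024 ** 3

# gib(20) == 21474836480, the 20 GiB value shown above
```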
Slow Query Performance
Symptoms:
- Queries taking longer than expected
- High CPU usage
Diagnostic Steps:
- Enable tracing to identify bottlenecks:
ahnlich-db run --enable-tracing --otel-endpoint http://localhost:4317
View traces in Jaeger UI at http://localhost:16686
- Check store size:
INFOSERVER
- Verify algorithm choice:
- Linear algorithms (Cosine, Euclidean, DotProduct) scale linearly with data size
- Use KDTree for faster searches with large datasets:
CREATESTORE my_store DIMENSION 128 NONLINEARALGORITHMINDEX (KDTree)
Solutions:
- Use predicate indices for filtering:
# Index frequently filtered fields
CREATEPREDINDEX my_store PREDICATES (category, author)
# Then filter efficiently
GETPRED 10 IN my_store WHERE (category = science)
- Optimize batch operations:
# Batch SET operations
entries = [entry1, entry2, ..., entry100]
client.set(Set(store="my_store", inputs=entries))
- Use appropriate similarity algorithm:
- CosineSimilarity: Best for normalized vectors, direction-based similarity
- EuclideanDistance: Best for absolute distance measures
- DotProduct: Fast when vectors are pre-normalized
- KDTree: Best for high-dimensional spatial searches
- Adjust thread pool size:
ahnlich-db run --threadpool-size 32 # Default: 16
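To see why DotProduct is the cheap choice for pre-normalized vectors, here is a small illustration in plain Python (not the Ahnlich implementation): once vectors are unit length, the dot product and cosine similarity coincide, so the normalization work can be skipped.

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product divided by the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

# For unit-length vectors, the plain dot product equals cosine similarity.
a = normalize([1.0, 2.0, 3.0])
b = normalize([2.0, 1.0, 0.5])
dot = sum(x * y for x, y in zip(a, b))
```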
Connection Issues
Cannot Connect to Server
Symptoms:
connection refused
Failed to dial server
Transport issues with tonic
Diagnostic Steps:
- Check if server is running:
# Check DB
curl http://localhost:1369 # or use telnet
ps aux | grep ahnlich-db
# Check AI
curl http://localhost:1370
ps aux | grep ahnlich-ai
- Verify port availability:
# Check if port is in use
lsof -i :1369
lsof -i :1370
# Or with netstat
netstat -tuln | grep 1369
- Check firewall rules:
# Ubuntu/Debian
sudo ufw status
sudo ufw allow 1369
sudo ufw allow 1370
# CentOS/RHEL
sudo firewall-cmd --list-all
sudo firewall-cmd --add-port=1369/tcp --permanent
sudo firewall-cmd --reload
Solutions:
- Start server on all interfaces:
# Allow connections from any IP
ahnlich-db run --host 0.0.0.0 --port 1369
ahnlich-ai run --host 0.0.0.0 --port 1370
- Check host/port configuration:
# Correct
client = DbClient("http://127.0.0.1:1369")
# Wrong - missing protocol
client = DbClient("127.0.0.1:1369") # Invalid URI error
- Verify network connectivity:
# Test connectivity
ping <server-host>
telnet <server-host> 1369
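If the server is simply slow to come up, a client-side retry with exponential backoff avoids hard failures during startup. A sketch (`connect_with_retry` is an illustrative helper, not part of the Ahnlich client):

```python
import time

def connect_with_retry(connect, attempts=5, base_delay=0.5):
    """Call `connect` until it succeeds, sleeping 0.5s, 1s, 2s, ... between tries."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries; surface the real error
            time.sleep(base_delay * (2 ** attempt))

# Usage, with DbClient as in the examples above:
# client = connect_with_retry(lambda: DbClient("http://127.0.0.1:1369"))
```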
Maximum Clients Reached
Symptoms:
Max Connected Clients Reached
Connection rejected
Cause: Hit the --maximum-clients limit (default: 1000)
Solutions:
- Increase client limit:
ahnlich-db run --maximum-clients 5000
- Check current connections:
LISTCLIENTS
- Implement connection pooling:
# Reuse connections instead of creating new ones
class ClientPool:
    def __init__(self, uri, pool_size=10):
        self.pool = [DbClient(uri) for _ in range(pool_size)]
        self.index = 0

    def get_client(self):
        client = self.pool[self.index]
        self.index = (self.index + 1) % len(self.pool)
        return client
- Close idle connections:
async def cleanup():
    await client.close()
AI Proxy Cannot Connect to Database
Symptoms:
Proxy Errored with connection refused
DatabaseClientError
Diagnostic Steps:
- Verify DB is running:
ps aux | grep ahnlich-db
- Check DB host/port:
# See what DB is listening on
netstat -tuln | grep 1369
Solutions:
- Start DB before AI:
# Terminal 1
ahnlich-db run --port 1369
# Terminal 2 (wait for DB to start)
ahnlich-ai run --db-host 127.0.0.1 --db-port 1369
- Verify connection settings:
# If DB is on different host
ahnlich-ai run --db-host 192.168.1.10 --db-port 1369
# If DB uses non-default port
ahnlich-ai run --db-port 1400
- For standalone mode (no DB):
ahnlich-ai run --without-db
- Adjust connection pool:
ahnlich-ai run --db-client-pool-size 20 # Default: 10
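When scripting startup, you can wait for the DB port to accept connections before launching the AI proxy. A sketch using only the standard library (`wait_for_port` is an illustrative helper, not an Ahnlich tool):

```python
import socket
import time

def wait_for_port(host, port, timeout=15.0, interval=0.5):
    """Poll until the TCP port accepts connections; return False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False

# Wait for the DB before launching the AI proxy:
# if wait_for_port("127.0.0.1", 1369):
#     subprocess.run(["ahnlich-ai", "run", "--db-port", "1369"])
```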
Data and Store Issues
Store Not Found
Symptoms:
Store "my_store" not found
Diagnostic Steps:
- List all stores:
LISTSTORES
- Check store name spelling:
# Store names are case-sensitive
"MyStore" β "mystore"
Solutions:
- Create the store:
# DB
CREATESTORE my_store DIMENSION 128
# AI
CREATESTORE my_store QUERYMODEL all-minilm-l6-v2 INDEXMODEL all-minilm-l6-v2
- Check persistence loaded:
# If using persistence
ahnlich-db run \
--enable-persistence \
--persist-location /path/to/data.dat \
--fail-on-startup-if-persist-load-fails true # Fail loudly if load fails
- Verify correct server:
# Make sure you're connecting to the right instance
client = DbClient("http://localhost:1369") # Not a different instance
Dimension Mismatch Errors
Symptoms:
Store dimension is [128], input dimension of [256] was specified
Cause: Vector dimensions don't match store configuration.
Solutions:
- Check store dimension:
INFOSERVER
# Look at store details
- For AI stores, verify model dimensions:
| Model | Embedding Dimension |
|---|---|
| all-minilm-l6-v2 | 384 |
| all-minilm-l12-v2 | 384 |
| bge-base-en-v1.5 | 768 |
| bge-large-en-v1.5 | 1024 |
| resnet-50 | 2048 |
| clip-vit-b32-* | 512 |
- Match query and index models:
# Both must have same dimensions
CreateStore(
store="my_store",
query_model=AiModel.BGE_BASE_EN_V15, # 768-dim
index_model=AiModel.BGE_BASE_EN_V15, # 768-dim (same)
)
- Recreate store with correct dimension:
DROPSTORE my_store IFTRUE
CREATESTORE my_store DIMENSION 768
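A cheap client-side guard catches mismatches before the round trip to the server. A sketch (`check_dimension` is an illustrative helper, not part of the client library):

```python
def check_dimension(vector, expected_dim):
    """Fail client-side before the server rejects a mismatched vector."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"store expects dimension {expected_dim}, got {len(vector)}"
        )
    return vector

# e.g. a bge-base-en-v1.5 store is 768-dimensional:
# check_dimension(embedding, 768)
```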
Predicate Not Found
Symptoms:
Predicate "author" not found in store
Cause: Querying by a predicate that wasn't indexed.
Solutions:
- Create predicate index:
CREATEPREDINDEX my_store PREDICATES (author, category)
- Or include when creating store:
CREATESTORE my_store DIMENSION 128 PREDICATES (author, category)
- Verify predicates exist:
INFOSERVER
# Check store predicates
Model and AI Issues
Model Not Loading
Symptoms:
index_model or query_model not selected or loaded
Error initializing a model thread
Tokenizer for model failed to load
Diagnostic Steps:
- Check supported models:
ahnlich-ai run --supported-models all-minilm-l6-v2,resnet-50
- Verify model cache:
# Default location
ls -la ~/.ahnlich/models
# Custom location
ahnlich-ai run --model-cache-location /path/to/models
- Check disk space:
df -h ~/.ahnlich/models
- Test network connectivity:
# Models download from HuggingFace
curl https://huggingface.co
Solutions:
- Wait for initial download:
# First time loading a model downloads from HuggingFace
# This can take several minutes depending on model size
# Watch logs for progress
- Clear corrupted cache:
rm -rf ~/.ahnlich/models/model_name
# Restart server to re-download
- Increase idle time:
# Keep models loaded longer
ahnlich-ai run --ai-model-idle-time 600 # 10 minutes (default: 5 min)
- Pre-download models:
# Download models before starting server
python -c "from transformers import AutoModel; AutoModel.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')"
Token Limit Exceeded
Symptoms:
Max Token Exceeded. Model Expects [256], input type was [512]
Cause: Text input exceeds model's token limit.
Token Limits:
- all-minilm-*: 256 tokens
- bge-*: 512 tokens
- clip-vit-b32-text: 77 tokens
Solutions:
- Truncate text:
def truncate_text(text, max_length=200):
    words = text.split()
    return ' '.join(words[:max_length])

text = truncate_text(long_text)
- Split into chunks:
def chunk_text(text, chunk_size=200):
    words = text.split()
    return [' '.join(words[i:i+chunk_size])
            for i in range(0, len(words), chunk_size)]

chunks = chunk_text(long_document)
for chunk in chunks:
    client.set(Set(store="docs", inputs=[...]))
- Use model with larger limit:
# Switch from AllMiniLM (256) to BGE (512)
CreateStore(
store="my_store",
query_model=AiModel.BGE_BASE_EN_V15, # 512 tokens
index_model=AiModel.BGE_BASE_EN_V15,
)
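Before sending, you can estimate whether a text fits a model's limit. A sketch using the limits listed above; the 1.3 tokens-per-word factor is a rough heuristic, not the model's real tokenizer, so use the actual tokenizer when you need an exact count:

```python
# Limits from the list above
MODEL_TOKEN_LIMITS = {
    "all-minilm-l6-v2": 256,
    "bge-base-en-v1.5": 512,
    "clip-vit-b32-text": 77,
}

def rough_token_count(text):
    # Crude estimate: subword tokenizers emit roughly 1.3 tokens per word.
    return int(len(text.split()) * 1.3)

def fits(text, model):
    """True if the estimate is within the model's token limit."""
    return rough_token_count(text) <= MODEL_TOKEN_LIMITS[model]
```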
Image Dimension Errors
Symptoms:
Image Dimensions [(512, 512)] does not match expected [(224, 224)]
Image can't have zero dimension
Cause: Image dimensions don't match the model's expected input size (224x224 pixels).
Solutions:
- Resize images:
from PIL import Image

def prepare_image(image_path):
    img = Image.open(image_path)
    img = img.resize((224, 224))
    return img.tobytes()

image_bytes = prepare_image("photo.jpg")
- Use model preprocessing:
Set(
store="my_store",
inputs=[...],
preprocess_action=PreprocessAction.ModelPreprocessing, # Auto-resize
)
- Validate images before sending:
import io
from PIL import Image

def validate_image(image_bytes):
    img = Image.open(io.BytesIO(image_bytes))
    if img.width == 0 or img.height == 0:
        raise ValueError("Invalid image dimensions")
    return img

img = validate_image(image_bytes)
Persistence Issues
Persistence File Won't Load
Symptoms:
Failed to load persistence file
Corruption detected
Diagnostic Steps:
- Check file permissions:
ls -l /path/to/persistence.dat
- Verify file size vs allocator:
# File size
du -h persistence.dat
# Allocator must be >2x file size
Solutions:
- Increase allocator size:
# If persistence file is 5 GB, use at least 10 GB allocator
ahnlich-db run \
--enable-persistence \
--persist-location /path/to/data.dat \
--allocator-size 10737418240 # 10 GB
- Skip corrupted persistence:
ahnlich-db run \
--enable-persistence \
--persist-location /path/to/data.dat \
--fail-on-startup-if-persist-load-fails false # Continue without persistence
- Backup and delete:
# Backup
cp persistence.dat persistence.dat.backup
# Start fresh
rm persistence.dat
ahnlich-db run --enable-persistence --persist-location persistence.dat
- Check disk space:
df -h /path/to/persistence/
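The "allocator must be >2x file size" rule above can be scripted when sizing a restart. A sketch (`min_allocator_size` is an illustrative helper, not an Ahnlich tool):

```python
import os

def min_allocator_size(persist_path, factor=2):
    """Recommend an --allocator-size of at least `factor` x the persistence file size."""
    return os.path.getsize(persist_path) * factor

# A 5 GB persistence file suggests passing at least a 10 GB allocator:
# ahnlich-db run --allocator-size $(python sizing.py)
```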
Data Lost After Restart
Cause: Persistence not enabled.
Solution:
Enable persistence when starting server:
ahnlich-db run \
--enable-persistence \
--persist-location /var/lib/ahnlich/db.dat \
--persistence-interval 300000 # 5 minutes
Debugging Tips
Enable Detailed Logging
# Set log level
ahnlich-db run --log-level debug
# Or specific modules
ahnlich-db run --log-level "info,ahnlich_db=debug,hf_hub=warn"
Enable Distributed Tracing
# Start Jaeger
docker run -d \
-p 16686:16686 \
-p 4317:4317 \
jaegertracing/all-in-one:latest
# Start server with tracing
ahnlich-db run \
--enable-tracing \
--otel-endpoint http://localhost:4317
# View traces at http://localhost:16686
Use CLI for Testing
# Interactive mode
ahnlich --agent DB --host 127.0.0.1 --port 1369
# Test commands
PING
INFOSERVER
LISTSTORES
Check Server Health
# Process status
ps aux | grep ahnlich
# Resource usage
top -p $(pgrep ahnlich)
# Network connections
netstat -anp | grep ahnlich
# Open files
lsof -p $(pgrep ahnlich)
Getting More Help
Still having issues? Try these resources:
- Check Error Codes: Error Codes Reference
- Read Configuration Docs: Configuration Reference
- Enable Tracing: See detailed request flow
- Community: WhatsApp Group
- GitHub: Report Issues
When reporting issues, include:
- Error messages (full text)
- Server version
- Configuration flags used
- Steps to reproduce
- Server logs (with --log-level debug)