Advanced
Unlike Ahnlich DB, which is concerned with similarity algorithms and indexing, Ahnlich AI focuses on embedding generation. The service introduces model-aware stores, where you define the embedding models used for both data insertion (indexing) and querying. This abstraction lets developers work directly with raw inputs (text or images) while the AI proxy handles embedding generation.
Supported Models
Ahnlich AI includes several pre-trained models that can be configured depending on your workload. These cover both text embeddings and image embeddings:
| Model Name | String Name | Type | Max Input | Embedding Dim | Description |
|---|---|---|---|---|---|
| ALL_MINI_LM_L6_V2 | all-minilm-l6-v2 | Text | 256 tokens | 384 | Lightweight sentence transformer. Fast and memory-efficient, ideal for semantic similarity in applications like FAQ search or chatbots. |
| ALL_MINI_LM_L12_V2 | all-minilm-l12-v2 | Text | 256 tokens | 384 | Larger variant of MiniLM. Higher accuracy for nuanced text similarity tasks, but with increased compute requirements. |
| BGE_BASE_EN_V15 | bge-base-en-v1.5 | Text | 512 tokens | 768 | Base version of the BGE (English v1.5) model. Balanced performance and speed, suitable for production-scale applications. |
| BGE_LARGE_EN_V15 | bge-large-en-v1.5 | Text | 512 tokens | 1024 | High-accuracy embedding model for semantic search and retrieval. Best choice when precision is more important than latency. |
| JINA_CODE_V2 | jina-embeddings-v2-base-code | Text (Code) | 8192 tokens | 768 | Specialized code embedding model supporting 30+ programming languages. Optimized for code search, documentation lookup, and code-to-text retrieval. |
| RESNET50 | resnet-50 | Image | 224x224 px | 2048 | Convolutional Neural Network (CNN) for extracting embeddings from images. Useful for content-based image retrieval and clustering. |
| CLIP_VIT_B32_IMAGE | clip-vit-b32-image | Image | 224x224 px | 512 | Vision Transformer encoder from the CLIP model. Produces embeddings aligned with its paired text encoder for multimodal tasks. |
| CLIP_VIT_B32_TEXT | clip-vit-b32-text | Text | 77 tokens | 512 | Text encoder from CLIP. Designed to map textual inputs into the same space as CLIP image embeddings for text-to-image or image-to-text search. |
| BUFFALO_L | buffalo-l | Image (Face) | 640x640 px | 512 | Face detection and recognition model. Detects faces in images and generates embeddings for each detected face. Non-commercial use only. |
| SFACE_YUNET | sface-yunet | Image (Face) | 640x640 px | 128 | Lightweight face detection (YuNet) + recognition (SFace) pipeline. Apache 2.0 / MIT licensed - commercially usable. |
| CLAP_AUDIO | clap-audio | Audio | 10 sec max | 512 | Audio encoder from the CLAP model. Produces embeddings from audio inputs for audio similarity search and audio-to-text retrieval. |
| CLAP_TEXT | clap-text | Text | 512 tokens | 512 | Text encoder from the CLAP model. Maps textual descriptions into the same embedding space as CLAP audio embeddings for text-to-audio search. |
Model Constraints
Audio Models (CLAP)
| Constraint | Value | Notes |
|---|---|---|
| Max duration | 10 seconds | Longer clips will error with AudioTooLongError |
| Sample rate | 48 kHz | Audio is automatically resampled |
| Max samples | 480,000 | 48,000 Hz × 10 seconds |
| Preprocessing | Required | NoPreprocessing not supported - always use ModelPreprocessing |
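The limits in the table above can be checked client-side before sending a clip. The following is a minimal sketch, not part of the Ahnlich client: the helper names are hypothetical, and it assumes the 10-second limit applies to the clip's duration regardless of its native sample rate (since the table says audio is resampled to 48 kHz automatically).

```python
CLAP_SAMPLE_RATE_HZ = 48_000  # from the constraints table
CLAP_MAX_SECONDS = 10         # from the constraints table
CLAP_MAX_SAMPLES = CLAP_SAMPLE_RATE_HZ * CLAP_MAX_SECONDS  # 480,000


def clip_duration_seconds(num_samples: int, sample_rate_hz: int) -> float:
    """Duration of a clip at its native sample rate."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    return num_samples / sample_rate_hz


def fits_clap_limit(num_samples: int, sample_rate_hz: int) -> bool:
    """True if the clip is at most 10 seconds long (hypothetical pre-flight check)."""
    return clip_duration_seconds(num_samples, sample_rate_hz) <= CLAP_MAX_SECONDS


print(CLAP_MAX_SAMPLES)                  # 480000
print(fits_clap_limit(441_000, 44_100))  # exactly 10 s at 44.1 kHz
print(fits_clap_limit(500_000, 48_000))  # about 10.4 s at 48 kHz
```

A check like this only avoids a round trip; the server remains the authority on whether a clip triggers AudioTooLongError.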
Face Models (Buffalo_L, SFace+YuNet)
| Constraint | Value | Notes |
|---|---|---|
| Input size | 640x640 px | Images are resized internally |
| Face alignment | 112x112 px | Standard ArcFace alignment |
| Embedding mode | OneToMany | Returns one embedding per detected face |
| Preprocessing | Required | NoPreprocessing not supported |
| Query constraint | Single face | Query images must contain exactly 1 face |
Cross-Modal Compatibility
| Model Pair | Shared Dim | Use Case |
|---|---|---|
| clip-vit-b32-text + clip-vit-b32-image | 512 | Text-to-image / image-to-text search |
| clap-text + clap-audio | 512 | Text-to-audio / audio-to-text search |
Supported Input Types
| Input Type | Description |
|---|---|
| RAW_STRING | Accepts natural text (sentences, paragraphs). Transformed into embeddings via a selected text-based model. |
| IMAGE | Accepts image files as input. Converted into embeddings via a selected image-based model (e.g., ResNet or CLIP). |
| AUDIO | Accepts audio data as input. Converted into embeddings via an audio-based model (e.g., CLAP Audio). |
Example - Creating a Model-Aware Store
CREATESTORE my_store QUERYMODEL all-minilm-l6-v2 INDEXMODEL all-minilm-l6-v2
- index_model: defines how inserted data is embedded before being stored in Ahnlich DB.
- query_model: defines how queries are embedded at search time.
- Both models must output embeddings of the same dimensionality to ensure compatibility.
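The dimensionality rule can be checked before creating a store. The sketch below copies the Embedding Dim column from the Supported Models table into a plain dictionary; the function name is hypothetical and nothing here calls the Ahnlich API.

```python
# Embedding dimensions copied from the Supported Models table.
EMBEDDING_DIMS = {
    "all-minilm-l6-v2": 384,
    "all-minilm-l12-v2": 384,
    "bge-base-en-v1.5": 768,
    "bge-large-en-v1.5": 1024,
    "jina-embeddings-v2-base-code": 768,
    "resnet-50": 2048,
    "clip-vit-b32-image": 512,
    "clip-vit-b32-text": 512,
    "buffalo-l": 512,
    "sface-yunet": 128,
    "clap-audio": 512,
    "clap-text": 512,
}


def dims_match(query_model: str, index_model: str) -> bool:
    """True if both models produce embeddings of the same size."""
    return EMBEDDING_DIMS[query_model] == EMBEDDING_DIMS[index_model]


print(dims_match("clip-vit-b32-text", "clip-vit-b32-image"))  # True
print(dims_match("all-minilm-l6-v2", "bge-base-en-v1.5"))     # False
```

Matching dimensions is a necessary condition, not a sufficient one: two unrelated models can share a size (for example, 512 for both clip-vit-b32-text and clap-text) without sharing an embedding space. For meaningful results, use the same model on both sides or one of the pairs listed under Cross-Modal Compatibility.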
Choosing the Right Model
| Model | Best Use Case |
|---|---|
| MiniLM (L6/L12) | Fast, efficient semantic similarity (FAQs, chatbots). |
| BGE (Base/Large) | High semantic accuracy for production-scale applications. |
| Jina Code V2 | Code search, documentation retrieval, and semantic code similarity across 30+ languages. |
| ResNet50 | Image-to-image similarity and clustering. |
| CLIP (Text+Image) | Multimodal retrieval (text-to-image / image-to-text search). |
| Buffalo_L | Face detection and recognition in images (e.g., group photos, ID verification). |
| SFace+YuNet | Lightweight face detection and recognition (e.g., real-time face matching). |
| CLAP (Audio+Text) | Audio similarity search and text-to-audio retrieval. |
Code Search with Jina Code V2
The Jina Code V2 model is specifically designed for semantic code search across 30+ programming languages. It excels at:
- Finding code snippets by natural language description
- Searching for similar code patterns
- Linking documentation to code
- Code-to-code similarity search
Creating a Code Search Store
CREATESTORE code_repo QUERYMODEL jina-embeddings-v2-base-code INDEXMODEL jina-embeddings-v2-base-code
Indexing Code Snippets
Rust:
use std::collections::HashMap;
use ahnlich_client_rs::prelude::*;
let code_snippets = vec![
    StoreInput::RawString("fn fibonacci(n: u32) -> u32 { if n <= 1 { n } else { fibonacci(n-1) + fibonacci(n-2) } }".to_string()),
    StoreInput::RawString("def binary_search(arr, target): left, right = 0, len(arr) - 1; while left <= right: ...".to_string()),
    StoreInput::RawString("function quickSort(arr) { if (arr.length <= 1) return arr; const pivot = arr[0]; ...".to_string()),
];
// Per-snippet metadata. Note: the set call below, as written, does not pass it.
let metadata = vec![
    StoreValue::from([("language", "rust"), ("file", "algorithms.rs")]),
    StoreValue::from([("language", "python"), ("file", "search.py")]),
    StoreValue::from([("language", "javascript"), ("file", "sort.js")]),
];
client.set(
    "code_repo".to_string(),
    code_snippets,
    PreprocessAction::ModelPreprocessing,
    None,           // execution provider
    HashMap::new(), // model_params (none needed for this model)
).await?;
Python:
from ahnlich_client_py import AhnlichAIClient
from ahnlich_client_py.ai_query import Set, StoreInput
code_snippets = [
    StoreInput(raw_string="fn fibonacci(n: u32) -> u32 { if n <= 1 { n } else { fibonacci(n-1) + fibonacci(n-2) } }"),
    StoreInput(raw_string="def binary_search(arr, target): left, right = 0, len(arr) - 1; while left <= right: ..."),
    StoreInput(raw_string="function quickSort(arr) { if (arr.length <= 1) return arr; const pivot = arr[0]; ..."),
]
await client.set(
    Set(
        store="code_repo",
        inputs=code_snippets,
        preprocess_action=PreprocessAction.ModelPreprocessing,
    )
)
Searching with Natural Language
Rust:
// Search using natural language query
let query = vec![StoreInput::RawString("implement recursive fibonacci sequence".to_string())];
let results = client.get_sim_n(
    "code_repo".to_string(),
    query,
    Condition::new(NonLinearAlgorithm::CosineSimilarity),
    5, // top 5 results
    PreprocessAction::ModelPreprocessing,
    None,
    HashMap::new(),
).await?;
// The Rust fibonacci function is expected to rank highest
Python:
# Search using natural language query
from ahnlich_client_py.ai_query import GetSimN
query = [StoreInput(raw_string="implement recursive fibonacci sequence")]
results = await client.get_sim_n(
    GetSimN(
        store="code_repo",
        search_input=query,
        condition=Condition.with_algorithm(NonLinearAlgorithm.CosineSimilarity),
        closest_n=5,
        preprocess_action=PreprocessAction.ModelPreprocessing,
    )
)
Searching with Code
Rust:
// Find similar code patterns
let code_query = vec![StoreInput::RawString(
    "def fib(n): return n if n <= 1 else fib(n-1) + fib(n-2)".to_string()
)];
let results = client.get_sim_n(
    "code_repo".to_string(),
    code_query,
    Condition::new(NonLinearAlgorithm::CosineSimilarity),
    3, // top 3 results
    PreprocessAction::ModelPreprocessing,
    None,
    HashMap::new(),
).await?;
// Expected to surface the Rust fibonacci implementation even though the query is Python
Python:
# Find similar code patterns
code_query = [StoreInput(raw_string="def fib(n): return n if n <= 1 else fib(n-1) + fib(n-2)")]
results = await client.get_sim_n(
    GetSimN(
        store="code_repo",
        search_input=code_query,
        condition=Condition.with_algorithm(NonLinearAlgorithm.CosineSimilarity),
        closest_n=3,
        preprocess_action=PreprocessAction.ModelPreprocessing,
    )
)
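The searches above rank stored snippets by cosine similarity between the query embedding and each stored embedding. As a rough illustration of that ranking step only (not Ahnlich's implementation), the sketch below uses made-up 3-dimensional vectors in place of real 768-dimensional Jina embeddings; the file names and values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def top_n(query, stored, n):
    """Return the n (name, score) pairs with the highest similarity to query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Toy embeddings, invented for illustration.
stored = {
    "fibonacci.rs": [0.9, 0.1, 0.0],
    "search.py": [0.1, 0.9, 0.2],
    "sort.js": [0.0, 0.3, 0.9],
}
query = [1.0, 0.2, 0.0]

for name, score in top_n(query, stored, 2):
    print(f"{name}: {score:.3f}")
```

Because cosine similarity depends only on direction, not magnitude, a short query and a long snippet can still score highly if the model places them in a similar direction in the embedding space.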
Use Cases
- Documentation Search: Index code examples and search with natural language questions
- Code Discovery: Find similar implementations across different programming languages
- Refactoring Detection: Identify duplicate or similar code patterns
- Code Review Assistance: Find related code snippets for context
- IDE Integration: Power semantic code search in development tools
Model Parameters (model_params)
Some AI models accept optional runtime parameters via model_params, a map<string, string> field available on Set, GetSimN, and ConvertStoreInputToEmbeddings requests. These parameters let you tune model behavior at inference time without changing store configuration.
When model_params is empty (or omitted), models use their built-in defaults. Models that don't support any parameters simply ignore the field.
Supported Parameters by Model
| Model | Parameter | Type | Default | Description |
|---|---|---|---|---|
| Buffalo_L | confidence_threshold | float (0.0-1.0) | 0.5 | Minimum detection confidence for a face to be included. Higher values = fewer but more confident detections. |
| Buffalo_L | attributes | string (comma-separated) | (empty) | Optional attributes to compute. Use genderage to enable age and gender predictions. When omitted, only face embeddings and bounding boxes are computed. |
| SFace+YuNet | confidence_threshold | float (0.0-1.0) | 0.6 | Minimum detection confidence for a face to be included. Higher values = fewer but more confident detections. |
Text embedding models (MiniLM, BGE), image models (ResNet, CLIP), and audio models (CLAP) do not currently use model_params.
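Because model_params is a map<string, string>, numeric values such as the confidence threshold have to be passed as strings. The helper below is a hypothetical convenience, not part of any Ahnlich client; it sketches one way to validate the documented range and produce the string map.

```python
from typing import Dict, Optional, Sequence


def build_face_model_params(
    confidence_threshold: Optional[float] = None,
    attributes: Optional[Sequence[str]] = None,
) -> Dict[str, str]:
    """Build a string-to-string model_params map for the face models.

    Keys that are left unset are omitted, so the model falls back to its defaults.
    """
    params: Dict[str, str] = {}
    if confidence_threshold is not None:
        if not 0.0 <= confidence_threshold <= 1.0:
            raise ValueError("confidence_threshold must be between 0.0 and 1.0")
        params["confidence_threshold"] = str(confidence_threshold)
    if attributes:
        # The attributes parameter is documented as a comma-separated string.
        params["attributes"] = ",".join(attributes)
    return params


print(build_face_model_params())                            # {}
print(build_face_model_params(confidence_threshold=0.9))    # {'confidence_threshold': '0.9'}
print(build_face_model_params(attributes=["genderage"]))    # {'attributes': 'genderage'}
```

Returning an empty dict when nothing is set mirrors the documented behavior that an empty or omitted model_params means built-in defaults.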
Usage Examples
Rust - setting a high confidence threshold for face detection:
use std::collections::HashMap;
let mut model_params = HashMap::new();
model_params.insert("confidence_threshold".to_string(), "0.9".to_string());
let set_params = Set {
    store: "faces_store".to_string(),
    inputs: vec![/* ... */],
    // Face models require model preprocessing; NoPreprocessing is not supported
    preprocess_action: PreprocessAction::ModelPreprocessing as i32,
    execution_provider: None,
    model_params,
};
Python - using default parameters (empty dict):
await client.set(
    ai_query.Set(
        store="faces_store",
        inputs=[...],
        # Face models require model preprocessing; NoPreprocessing is not supported
        preprocess_action=preprocess.PreprocessAction.ModelPreprocessing,
        model_params={},  # uses model defaults
    )
)
Python - custom confidence threshold:
await client.set(
    ai_query.Set(
        store="faces_store",
        inputs=[...],
        preprocess_action=preprocess.PreprocessAction.ModelPreprocessing,
        model_params={"confidence_threshold": "0.9"},
    )
)
When to Tune model_params
- Inclusive detection (e.g., group photos where you want all faces): Use a lower threshold like 0.3
- Standard detection (balanced): Use the model default (0.5 for Buffalo_L, 0.6 for SFace+YuNet)
- Strict detection (e.g., ID verification where only clear faces matter): Use a higher threshold like 0.9
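To make the trade-off between these three settings concrete, the sketch below applies each threshold to a made-up list of detection confidences. The scores are invented for illustration, and treating the threshold as inclusive (>=) is an assumption; the real filtering happens inside the model.

```python
def faces_kept(confidences, threshold):
    """Confidences that would survive a given threshold (assumed inclusive)."""
    return [c for c in confidences if c >= threshold]


# Hypothetical detection scores for one group photo.
detections = [0.95, 0.72, 0.55, 0.41, 0.33]

for label, threshold in [
    ("inclusive", 0.3),
    ("Buffalo_L default", 0.5),
    ("strict", 0.9),
]:
    kept = faces_kept(detections, threshold)
    print(f"{label} (threshold {threshold}): {len(kept)} face(s) kept")
```

With these sample scores, the inclusive setting keeps all five detections, the Buffalo_L default keeps three, and the strict setting keeps one.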
Embedding Metadata
Starting from version 0.2.2, face detection models (Buffalo_L and SFace+YuNet) return bounding box metadata alongside embeddings. This allows you to access face location and confidence information without re-running detection.
Metadata Fields (Face Detection Models)
For each detected face, the following metadata is automatically included:
| Field | Type | Range | Description |
|---|---|---|---|
| bbox_x1 | float | 0.0-1.0 | Normalized x-coordinate of top-left corner |
| bbox_y1 | float | 0.0-1.0 | Normalized y-coordinate of top-left corner |
| bbox_x2 | float | 0.0-1.0 | Normalized x-coordinate of bottom-right corner |
| bbox_y2 | float | 0.0-1.0 | Normalized y-coordinate of bottom-right corner |
| confidence | float | 0.0-1.0 | Detection confidence score |
Buffalo_L only: the following fields are included when attributes=genderage is specified:
| Field | Type | Range | Description |
|---|---|---|---|
| gender_female_prob | float | 0.0-1.0 | Probability of female gender |
| gender_male_prob | float | 0.0-1.0 | Probability of male gender |
| age | float | 0.0-100.0 | Predicted age in years |
Coordinates are normalized to the 0-1 range, making them independent of the original image resolution. To convert to pixel coordinates, multiply by the image width/height:
pixel_x1 = bbox_x1 * image_width
pixel_y1 = bbox_y1 * image_height
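The same multiplication applies to all four corners. The sketch below assumes the metadata values have already been parsed into Python floats and stored in a plain dict; the function name is hypothetical, and rounding to whole pixels and clamping to the image bounds are choices made here, not documented behavior.

```python
def bbox_to_pixels(metadata, image_width, image_height):
    """Convert normalized bbox_* fields to integer pixel coordinates (x1, y1, x2, y2)."""

    def scale(value, size):
        # Round to the nearest pixel and clamp in case a value falls slightly outside 0.0-1.0.
        return max(0, min(size, round(value * size)))

    return (
        scale(metadata["bbox_x1"], image_width),
        scale(metadata["bbox_y1"], image_height),
        scale(metadata["bbox_x2"], image_width),
        scale(metadata["bbox_y2"], image_height),
    )


# Hypothetical face metadata on a 1920x1080 image.
face = {"bbox_x1": 0.25, "bbox_y1": 0.10, "bbox_x2": 0.50, "bbox_y2": 0.40}
print(bbox_to_pixels(face, 1920, 1080))  # (480, 108, 960, 432)
```

The resulting tuple can be passed to most image libraries' crop or rectangle-drawing functions, which typically expect (left, top, right, bottom) in pixels.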
Metadata Storage
When you insert images using face detection models:
- Embeddings are stored in Ahnlich DB as usual
- Metadata (bounding boxes, confidence) is merged into the StoreValue for each face
- Metadata is returned in GetSimN, GetPred, and ConvertStoreInputToEmbeddings responses
API Response Structure
The ConvertStoreInputToEmbeddings API returns EmbeddingWithMetadata for face models:
message EmbeddingWithMetadata {
keyval.StoreKey embedding = 1; // The face embedding vector
optional keyval.StoreValue metadata = 2; // Bounding box + confidence
}
For OneToMany models (face detection), multiple EmbeddingWithMetadata objects are returned, one per detected face.
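Since one input image can yield several embeddings, downstream code usually needs to track which image each face came from. The sketch below uses a plain dataclass as a stand-in for EmbeddingWithMetadata; the type and function names are hypothetical, and representing an image with no detected faces as an empty list is an assumption made only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class FaceResult:
    """Plain-Python stand-in for one EmbeddingWithMetadata entry (hypothetical type)."""
    embedding: List[float]
    metadata: Dict[str, float] = field(default_factory=dict)


def flatten_faces(per_image: List[List[FaceResult]]) -> Iterator[Tuple[int, int, FaceResult]]:
    """Yield (image_index, face_index, face) for every detected face."""
    for image_index, faces in enumerate(per_image):
        for face_index, face in enumerate(faces):
            yield image_index, face_index, face


# Two input images: the first has two faces, the second has none.
per_image = [
    [
        FaceResult([0.1, 0.2, 0.3, 0.4], {"confidence": 0.92}),
        FaceResult([0.5, 0.6, 0.7, 0.8], {"confidence": 0.61}),
    ],
    [],
]

for image_index, face_index, face in flatten_faces(per_image):
    print(image_index, face_index, face.metadata["confidence"])
```

Keeping the image index next to each face makes it possible to map a search hit back to the photo it was extracted from.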
Usage Examples
Rust - accessing bounding box metadata:
use std::collections::HashMap;
use ahnlich_client_rs::prelude::*;
let response = client.convert_to_embeddings(
    store_name,
    vec![StoreInput::Image(image_bytes)],
    PreprocessAction::ModelPreprocessing,
    None,
    HashMap::new(),
).await?;
// Face detection models are OneToMany, so the variant holds multiple embeddings
if let Some(Variant::Multiple(multi)) = &response.values[0].variant {
    for face in &multi.embeddings {
        if let Some(embedding) = &face.embedding {
            println!("Embedding dimensions: {}", embedding.key.len());
        }
        if let Some(metadata) = &face.metadata {
            let bbox_x1 = metadata.value.get("bbox_x1").unwrap();
            let bbox_y1 = metadata.value.get("bbox_y1").unwrap();
            let confidence = metadata.value.get("confidence").unwrap();
            println!("Face at ({}, {}) with confidence {}",
                bbox_x1, bbox_y1, confidence);
        }
    }
}
Python - accessing bounding box metadata:
from ahnlich_client_py import AhnlichAIClient
response = await client.convert_store_input_to_embeddings(
    store="faces_store",
    inputs=[image_bytes],
    preprocess_action=PreprocessAction.ModelPreprocessing,
)
# Each face has embedding + metadata
for face_data in response.values[0].multiple.embeddings:
    embedding = face_data.embedding.key  # 512-dim vector for Buffalo_L
    metadata = face_data.metadata.value
    bbox_x1 = float(metadata["bbox_x1"].value)
    bbox_y1 = float(metadata["bbox_y1"].value)
    confidence = float(metadata["confidence"].value)
    print(f"Face at ({bbox_x1}, {bbox_y1}) with confidence {confidence}")
TypeScript - accessing bounding box metadata:
import { AhnlichAIClient } from '@deven96/ahnlich-client-node';
const response = await client.convertStoreInputToEmbeddings({
  store: "faces_store",
  inputs: [{ image: imageBytes }],
  preprocessAction: PreprocessAction.MODEL_PREPROCESSING,
});
// Each detected face has embedding + metadata
for (const faceData of response.values[0].multiple.embeddings) {
  const embedding = faceData.embedding.key; // Float32Array
  const metadata = faceData.metadata.value;
  const bboxX1 = parseFloat(metadata.bbox_x1.value);
  const bboxY1 = parseFloat(metadata.bbox_y1.value);
  const confidence = parseFloat(metadata.confidence.value);
  console.log(`Face at (${bboxX1}, ${bboxY1}) with confidence ${confidence}`);
}
Gender and Age Predictions (Buffalo_L)
Buffalo_L can compute age and gender predictions for each detected face by setting attributes=genderage in model_params. This adds three additional metadata fields per face: gender_female_prob, gender_male_prob, and age.
Rust - enabling gender and age predictions:
use std::collections::HashMap;
use ahnlich_client_rs::prelude::*;
let mut model_params = HashMap::new();
model_params.insert("attributes".to_string(), "genderage".to_string());
let response = client.convert_to_embeddings(
    store_name,
    vec![StoreInput::Image(image_bytes)],
    PreprocessAction::ModelPreprocessing,
    None,
    model_params,
).await?;
// Access gender/age metadata
if let Some(Variant::Multiple(multi)) = &response.values[0].variant {
    for face in &multi.embeddings {
        if let Some(metadata) = &face.metadata {
            let female_prob = metadata.value.get("gender_female_prob").unwrap();
            let male_prob = metadata.value.get("gender_male_prob").unwrap();
            let age = metadata.value.get("age").unwrap();
            println!("Age: {}, Female: {}, Male: {}", age, female_prob, male_prob);
        }
    }
}
Python - enabling gender and age predictions:
from ahnlich_client_py import AhnlichAIClient
response = await client.convert_store_input_to_embeddings(
    store="faces_store",
    inputs=[image_bytes],
    preprocess_action=PreprocessAction.ModelPreprocessing,
    model_params={"attributes": "genderage"},
)
# Access gender/age metadata
for face_data in response.values[0].multiple.embeddings:
    metadata = face_data.metadata.value
    female_prob = float(metadata["gender_female_prob"].value)
    male_prob = float(metadata["gender_male_prob"].value)
    age = float(metadata["age"].value)
    print(f"Age: {age}, Female: {female_prob}, Male: {male_prob}")
TypeScript - enabling gender and age predictions:
import { AhnlichAIClient } from '@deven96/ahnlich-client-node';
const response = await client.convertStoreInputToEmbeddings({
  store: "faces_store",
  inputs: [{ image: imageBytes }],
  preprocessAction: PreprocessAction.MODEL_PREPROCESSING,
  modelParams: { attributes: "genderage" },
});
// Access gender/age metadata
for (const faceData of response.values[0].multiple.embeddings) {
  const metadata = faceData.metadata.value;
  const femaleProb = parseFloat(metadata.gender_female_prob.value);
  const maleProb = parseFloat(metadata.gender_male_prob.value);
  const age = parseFloat(metadata.age.value);
  console.log(`Age: ${age}, Female: ${femaleProb}, Male: ${maleProb}`);
}
Use Cases for Metadata
- Face cropping: Use bounding boxes to extract face regions from original images
- Visualization: Draw bounding boxes on images to show detected faces
- Quality filtering: Filter results by confidence score (e.g., only faces with confidence > 0.8)
- Spatial queries: Find faces in specific image regions (e.g., "faces in the top-left quadrant")
- Deduplication: Identify overlapping detections using bounding box coordinates
- Demographic analysis (Buffalo_L with attributes=genderage):
  - Age-based filtering (e.g., "find faces that appear under 18")
  - Gender distribution analysis in group photos
  - Age group clustering (children, adults, elderly)
  - Demographic insights for audience analysis
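Two of the use cases above, spatial queries and deduplication, reduce to simple geometry on the normalized bounding boxes. The sketch below is one possible approach, assuming the bbox_* values have been parsed into floats; the function names are hypothetical, intersection-over-union (IoU) is a common overlap measure chosen here rather than something Ahnlich prescribes, and testing the box center against the quadrant is likewise a design choice.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), normalized to 0.0-1.0


def box_from_metadata(metadata: Dict[str, float]) -> Box:
    """Collect the four bbox_* fields into a tuple."""
    return (metadata["bbox_x1"], metadata["bbox_y1"], metadata["bbox_x2"], metadata["bbox_y2"])


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes; 0.0 means no overlap, 1.0 means identical."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def in_top_left_quadrant(box: Box) -> bool:
    """True if the box center lies in the top-left quarter of the image."""
    center_x = (box[0] + box[2]) / 2
    center_y = (box[1] + box[3]) / 2
    return center_x < 0.5 and center_y < 0.5


first = box_from_metadata({"bbox_x1": 0.1, "bbox_y1": 0.1, "bbox_x2": 0.5, "bbox_y2": 0.5})
second = (0.3, 0.3, 0.7, 0.7)
far_away = (0.6, 0.6, 0.9, 0.9)

print(round(iou(first, second), 3))     # partial overlap, about 0.143
print(iou(first, far_away))             # 0.0, no overlap
print(in_top_left_quadrant(first))      # True
print(in_top_left_quadrant(far_away))   # False
```

For deduplication, a typical pattern is to treat two detections as the same face when their IoU exceeds a chosen cutoff and keep the one with the higher confidence; the cutoff itself is application-specific.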
Models Without Metadata
Text and image embedding models (MiniLM, BGE, ResNet, CLIP) do not return metadata. The metadata field will be None or empty for these models.