Liya Engine

RAG Configuration

Configure retrieval-augmented generation — retrieval mode, hybrid search, chunking strategy, reranking, and embedding model.

ragConfig controls how Liya Engine retrieves knowledge from your uploaded documents and injects it into the LLM context. RAG must be enabled via featureConfig.enable_rag: true.


Schema

{
  "ragConfig": {
    "enabled": true,
    "retrieval_mode": "hybrid",
    "default_top_k": 5,
    "default_min_similarity": 0.7,
    "embedding_model": "text-embedding-3-small",
    "cache_retrieval_results": true,
    "cache_ttl_seconds": 300,
    "hybrid_mode": "semantic_keyword",
    "hybrid_semantic_weight": 0.7,
    "hybrid_keyword_weight": 0.3,
    "enable_reranking": false,
    "chunking_strategy": "fixed",
    "chunking_fixed_size": 512,
    "chunking_overlap": 100
  }
}

Retrieval modes

retrieval_mode

| Value | Description |
| --- | --- |
| semantic | Pure vector similarity search — cosine distance between query and chunk embeddings |
| hybrid | Combines vector search with keyword matching (BM25) — recommended for most use cases |
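To illustrate what semantic mode scores, here is a minimal cosine-similarity sketch — not Liya Engine's actual implementation, just the standard formula the mode is based on:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors: 1.0 means
    # identical direction, 0.0 means orthogonal (unrelated).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Chunks scoring below default_min_similarity (0.7) would be dropped.
query_embedding = [0.1, 0.9, 0.2]
chunk_embedding = [0.2, 0.8, 0.1]
print(cosine_similarity(query_embedding, chunk_embedding))  # well above 0.7
```

In practice the vectors have 1536 or 3072 dimensions (see the embedding model table below), but the computation is the same.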

When retrieval_mode is hybrid, results from semantic and keyword search are fused using the configured algorithm.

hybrid_mode

| Value | Description |
| --- | --- |
| semantic_keyword | Vector search + BM25 keyword search, fused by RRF or weighted average |
| semantic_structured | Vector search + metadata filters |

Fusion weights

{
  "hybrid_semantic_weight": 0.7,
  "hybrid_keyword_weight": 0.3,
  "hybrid_fusion_algorithm": "rrf"
}
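A weighted-average fusion can be sketched as follows — a hypothetical illustration of how the two weights combine per-document scores, not Liya Engine's internal code:

```python
def weighted_fuse(semantic_scores, keyword_scores,
                  semantic_weight=0.7, keyword_weight=0.3):
    # Combine per-document scores from both retrievers. A document found
    # by only one retriever contributes 0.0 from the other.
    doc_ids = set(semantic_scores) | set(keyword_scores)
    return {
        doc_id: semantic_weight * semantic_scores.get(doc_id, 0.0)
              + keyword_weight * keyword_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }

fused = weighted_fuse({"doc-a": 1.0}, {"doc-a": 0.5, "doc-b": 1.0})
# doc-a: 0.7 * 1.0 + 0.3 * 0.5 = 0.85; doc-b: 0.3 * 1.0 = 0.30
```

This assumes both retrievers emit scores on a comparable scale — which is exactly the weakness that motivates RRF below.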

rrf (Reciprocal Rank Fusion) is recommended over weighted_average for most use cases as it is less sensitive to score scale differences between retrievers.
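Reciprocal Rank Fusion only looks at each document's rank in each result list, never its raw score. A minimal sketch of the standard algorithm (the constant k=60 is the common default from the RRF literature, not a documented Liya Engine setting):

```python
def rrf_fuse(rankings, k=60):
    # RRF: score(doc) = sum over result lists of 1 / (k + rank).
    # Because only ranks are used, score-scale differences between
    # retrievers cannot skew the fusion.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic_results = ["a", "b", "c"]
keyword_results  = ["b", "d", "a"]
rrf_fuse([semantic_results, keyword_results])
# "a" and "b" appear in both lists, so they outrank "c" and "d"
```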


Reranking

Enable reranking to improve result relevance by re-scoring a larger initial candidate set:

{
  "enable_reranking": true,
  "reranking_mode": "local",
  "reranking_initial_top_k": 20,
  "reranking_final_top_k": 5
}

| reranking_mode | Description |
| --- | --- |
| local | Embedding-based reranking within Liya Engine |
| api | External reranking service (e.g. Cohere Rerank) — requires reranking_api_endpoint and reranking_api_key |
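The two-stage flow behind these settings can be sketched as follows. search_fn and rerank_fn are hypothetical placeholders for whatever retriever and reranker are in use, not Liya Engine APIs:

```python
def retrieve_with_reranking(query, search_fn, rerank_fn,
                            initial_top_k=20, final_top_k=5):
    # Stage 1: a cheap retriever over-fetches a candidate pool
    # (reranking_initial_top_k documents).
    candidates = search_fn(query, top_k=initial_top_k)
    # Stage 2: a more expensive reranker re-scores every candidate,
    # and only the best reranking_final_top_k survive.
    scored = [(rerank_fn(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:final_top_k]]
```

The over-fetch matters: a document ranked 15th by the first stage can still end up in the final top 5 once the reranker scores it.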

Chunking

Controls how uploaded documents are split before embedding.

chunking_strategy

| Value | Description |
| --- | --- |
| fixed | Split by token count — simple and fast (default) |
| semantic | Split at topic boundaries using similarity threshold |
| sliding_window | Fixed chunks with overlap — good for dense documents |

{
  "chunking_strategy": "fixed",
  "chunking_fixed_size": 512,
  "chunking_overlap": 100,
  "chunking_min_size": 100,
  "chunking_max_size": 1000
}
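A minimal sketch of fixed-size chunking with overlap, assuming the document has already been tokenized (this is an illustration of the strategy, not Liya Engine's tokenizer or splitter):

```python
def chunk_fixed(tokens, size=512, overlap=100):
    # Each chunk shares `overlap` tokens with its predecessor, so content
    # that straddles a chunk boundary appears intact in at least one chunk.
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, len(tokens), step)]

chunks = chunk_fixed(list(range(1000)))
# With size=512 and overlap=100, chunks start at tokens 0, 412, and 824;
# the last 100 tokens of each chunk repeat at the start of the next.
```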

Embedding model

| Model | Dimensions | Notes |
| --- | --- | --- |
| text-embedding-3-small | 1536 | Default — cost-efficient, high quality |
| text-embedding-3-large | 3072 | Higher quality for complex retrieval tasks |

Changing the embedding model after documents have been ingested requires re-embedding all existing documents. Contact support before changing this on a production knowledge base.
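Re-embedding is required because vectors produced by different models (or with different dimensions) are not comparable: a 1536-dim query vector cannot be matched against a 3072-dim index. A hypothetical guard illustrating the constraint, using the dimensions from the table above:

```python
def check_embedding_compat(index_dims, model):
    # Dimensions per model, from the embedding model table.
    model_dims = {
        "text-embedding-3-small": 1536,
        "text-embedding-3-large": 3072,
    }
    if model_dims[model] != index_dims:
        raise ValueError(
            f"Index holds {index_dims}-dim vectors but {model} produces "
            f"{model_dims[model]}-dim vectors; re-embed all documents first."
        )
```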
