Liya Engine

RAG Configuration

Configure retrieval-augmented generation — retrieval mode, hybrid search, chunking strategy, reranking, and embedding model.

ragConfig controls how Liya Engine retrieves knowledge from your uploaded documents and injects it into the LLM context. RAG must be enabled via featureConfig.enable_rag: true.


Schema

{
  "ragConfig": {
    "enabled": true,
    "retrieval_mode": "hybrid",
    "default_top_k": 5,
    "default_min_similarity": 0.7,
    "embedding_model": "text-embedding-3-small",
    "cache_retrieval_results": true,
    "cache_ttl_seconds": 300,
    "hybrid_mode": "semantic_keyword",
    "hybrid_semantic_weight": 0.7,
    "hybrid_keyword_weight": 0.3,
    "enable_reranking": false,
    "chunking_strategy": "fixed",
    "chunking_fixed_size": 512,
    "chunking_overlap": 100
  }
}

Retrieval modes

retrieval_mode

| Value | Description |
| --- | --- |
| semantic | Pure vector similarity search — cosine distance between query and chunk embeddings |
| hybrid | Combines vector search with keyword matching (BM25) — recommended for most use cases |
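To illustrate what semantic mode scores, here is a minimal cosine-similarity sketch — not Liya Engine's actual implementation, just the standard formula the mode is based on:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors: 1.0 means
    # identical direction, 0.0 means orthogonal (unrelated).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Chunks scoring below default_min_similarity (0.7) would be dropped.
query_embedding = [0.1, 0.9, 0.2]
chunk_embedding = [0.2, 0.8, 0.1]
print(cosine_similarity(query_embedding, chunk_embedding))  # well above 0.7
```

In practice the vectors have 1536 or 3072 dimensions (see the embedding model table below), but the computation is the same.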

When retrieval_mode is hybrid, results from semantic and keyword search are fused using the configured algorithm.

hybrid_mode

| Value | Description |
| --- | --- |
| semantic_keyword | Vector search + BM25 keyword search, fused by RRF or weighted average |
| semantic_structured | Vector search + metadata filters |

Fusion weights

{
  "hybrid_semantic_weight": 0.7,
  "hybrid_keyword_weight": 0.3,
  "hybrid_fusion_algorithm": "rrf"
}
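A weighted-average fusion can be sketched as follows — a hypothetical illustration of how the two weights combine per-document scores, not Liya Engine's internal code:

```python
def weighted_fuse(semantic_scores, keyword_scores,
                  semantic_weight=0.7, keyword_weight=0.3):
    # Combine per-document scores from both retrievers. A document found
    # by only one retriever contributes 0.0 from the other.
    doc_ids = set(semantic_scores) | set(keyword_scores)
    return {
        doc_id: semantic_weight * semantic_scores.get(doc_id, 0.0)
              + keyword_weight * keyword_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }

fused = weighted_fuse({"doc-a": 1.0}, {"doc-a": 0.5, "doc-b": 1.0})
# doc-a: 0.7 * 1.0 + 0.3 * 0.5 = 0.85; doc-b: 0.3 * 1.0 = 0.30
```

This assumes both retrievers emit scores on a comparable scale — which is exactly the weakness that motivates RRF below.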

rrf (Reciprocal Rank Fusion) is recommended over weighted_average for most use cases as it is less sensitive to score scale differences between retrievers.
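Reciprocal Rank Fusion only looks at each document's rank in each result list, never its raw score. A minimal sketch of the standard algorithm (the constant k=60 is the common default from the RRF literature, not a documented Liya Engine setting):

```python
def rrf_fuse(rankings, k=60):
    # RRF: score(doc) = sum over result lists of 1 / (k + rank).
    # Because only ranks are used, score-scale differences between
    # retrievers cannot skew the fusion.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic_results = ["a", "b", "c"]
keyword_results  = ["b", "d", "a"]
rrf_fuse([semantic_results, keyword_results])
# "a" and "b" appear in both lists, so they outrank "c" and "d"
```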


Reranking

Enable reranking to improve result relevance by re-scoring a larger initial candidate set:

{
  "enable_reranking": true,
  "reranking_mode": "local",
  "reranking_initial_top_k": 20,
  "reranking_final_top_k": 5
}

| reranking_mode | Description |
| --- | --- |
| local | Embedding-based reranking within Liya Engine |
| api | External reranking service (e.g. Cohere Rerank) — requires reranking_api_endpoint and reranking_api_key |
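The two-stage flow behind these settings can be sketched as follows. search_fn and rerank_fn are hypothetical placeholders for whatever retriever and reranker are in use, not Liya Engine APIs:

```python
def retrieve_with_reranking(query, search_fn, rerank_fn,
                            initial_top_k=20, final_top_k=5):
    # Stage 1: a cheap retriever over-fetches a candidate pool
    # (reranking_initial_top_k documents).
    candidates = search_fn(query, top_k=initial_top_k)
    # Stage 2: a more expensive reranker re-scores every candidate,
    # and only the best reranking_final_top_k survive.
    scored = [(rerank_fn(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:final_top_k]]
```

The over-fetch matters: a document ranked 15th by the first stage can still end up in the final top 5 once the reranker scores it.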

Chunking

Controls how uploaded documents are split before embedding.

chunking_strategy

| Value | Description |
| --- | --- |
| fixed | Split by token count — simple and fast (default) |
| semantic | Split at topic boundaries using similarity threshold |
| sliding_window | Fixed chunks with overlap — good for dense documents |

{
  "chunking_strategy": "fixed",
  "chunking_fixed_size": 512,
  "chunking_overlap": 100,
  "chunking_min_size": 100,
  "chunking_max_size": 1000
}
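A minimal sketch of fixed-size chunking with overlap, assuming the document has already been tokenized (this is an illustration of the strategy, not Liya Engine's tokenizer or splitter):

```python
def chunk_fixed(tokens, size=512, overlap=100):
    # Each chunk shares `overlap` tokens with its predecessor, so content
    # that straddles a chunk boundary appears intact in at least one chunk.
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, len(tokens), step)]

chunks = chunk_fixed(list(range(1000)))
# With size=512 and overlap=100, chunks start at tokens 0, 412, and 824;
# the last 100 tokens of each chunk repeat at the start of the next.
```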

Embedding model

| Model | Dimensions | Notes |
| --- | --- | --- |
| text-embedding-3-small | 1536 | Default — cost-efficient, high quality |
| text-embedding-3-large | 3072 | Higher quality for complex retrieval tasks |

Changing the embedding model after documents have been ingested requires re-embedding all existing documents. Contact support before changing this on a production knowledge base.
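Re-embedding is required because vectors produced by different models (or with different dimensions) are not comparable: a 1536-dim query vector cannot be matched against a 3072-dim index. A hypothetical guard illustrating the constraint, using the dimensions from the table above:

```python
def check_embedding_compat(index_dims, model):
    # Dimensions per model, from the embedding model table.
    model_dims = {
        "text-embedding-3-small": 1536,
        "text-embedding-3-large": 3072,
    }
    if model_dims[model] != index_dims:
        raise ValueError(
            f"Index holds {index_dims}-dim vectors but {model} produces "
            f"{model_dims[model]}-dim vectors; re-embed all documents first."
        )
```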
