Building a RAG Pipeline
Add your knowledge base to Liya Engine and tune retrieval for your use case.
Liya Engine includes a built-in RAG (Retrieval-Augmented Generation) pipeline. When enabled, the engine retrieves relevant content from your tenant knowledge base before generating a response, grounding answers in your proprietary data.
Enable RAG
RAG is controlled by the enable_rag feature flag. Update it via PATCH /dashboard/account/config:
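A minimal request body might look like the following (the flat top-level placement of the flag is an assumption; check the account config schema for the exact payload shape):

```json
{
  "enable_rag": true
}
```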
Once enabled, all intents in your enabled domains will query the knowledge base.
Knowledge sources
Knowledge is organised by domain and source type. Built-in source types for the hiring domain:
| Source type | Description |
|---|---|
| job_descriptions | Role specifications, requirements, responsibilities |
| company_profiles | Company overviews, values, culture notes |
| resume_templates | Ideal resume formats and standards |
| interview_questions | Question banks per role or level |
| career_frameworks | Competency levels, progression paths |
| industry_standards | Skills taxonomies, certifications, benchmarks |
Upload and manage knowledge sources in the Knowledge section of the dashboard, or programmatically via the Knowledge API.
RAG configuration
Configure retrieval behaviour via ragConfig in PATCH /dashboard/account/config:
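A sketch of a ragConfig payload using the parameters documented below (the `mode` key name is an assumption; the parameter names and defaults come from the tables that follow):

```json
{
  "ragConfig": {
    "mode": "hybrid",
    "top_k": 5,
    "score_threshold": 0.72,
    "rerank": false,
    "chunk_size": 512,
    "chunk_overlap": 64,
    "chunking_strategy": "fixed"
  }
}
```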
Retrieval modes
| Mode | Description |
|---|---|
| semantic | Dense vector search only; best for conceptual queries |
| keyword | BM25 full-text search only; best for exact term matching |
| hybrid | Combines semantic and BM25 with score fusion; recommended default |
Key parameters
| Parameter | Default | Description |
|---|---|---|
| top_k | 5 | Number of chunks retrieved before reranking |
| score_threshold | 0.72 | Minimum similarity score; chunks below this are discarded |
| rerank | false | Enable cross-encoder reranking of retrieved chunks |
| rerank_top_n | 3 | Number of chunks kept after reranking |
| chunk_size | 512 | Target tokens per chunk |
| chunk_overlap | 64 | Token overlap between adjacent chunks |
| chunking_strategy | "fixed" | "fixed" or "semantic"; semantic splits on sentence boundaries |
Tuning retrieval quality
Start with hybrid + reranking
For most use cases:
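A reasonable starting configuration might be (the `mode` key name is an assumption; the other parameter names come from the tables above):

```json
{
  "ragConfig": {
    "mode": "hybrid",
    "rerank": true,
    "top_k": 8,
    "rerank_top_n": 4
  }
}
```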
Retrieve more candidates (top_k: 8), then let the reranker select the most relevant four (rerank_top_n: 4). This trades a small amount of latency for significantly better precision.
Lower the threshold for recall-heavy use cases
If retrieval is missing relevant content, lower score_threshold:
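For example (the value 0.6 is illustrative, not a recommended setting for every workload):

```json
{
  "ragConfig": {
    "score_threshold": 0.6
  }
}
```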
Use semantic chunking for long documents
For documents like job descriptions or policy manuals, semantic chunking respects natural boundaries:
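A sketch of the relevant settings (parameter names and defaults are taken from the table above):

```json
{
  "ragConfig": {
    "chunking_strategy": "semantic",
    "chunk_size": 512,
    "chunk_overlap": 64
  }
}
```

Semantic chunking keeps sentences intact, so chunk sizes are targets rather than exact token counts.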
RAG per domain
RAG is scoped to the domain of each request. A hiring intent only retrieves from hiring knowledge sources — it never queries fintech or other domain sources. This prevents cross-domain contamination.
For custom domains, RAG requires enable_rag: true in the domain's feature config and at least one knowledge source uploaded. See Custom Domain Knowledge Sources.
Disabling RAG per request
Pass enable_rag: false in the request options to bypass retrieval for a single call:
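A minimal sketch, assuming the request body carries an `options` object alongside your usual payload (the exact request shape is an assumption; check the request schema):

```json
{
  "options": {
    "enable_rag": false
  }
}
```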
This is useful when the query doesn't benefit from your knowledge base (e.g. general chat).