Leveraging Retrieval‑Augmented Generation for Persian University Knowledge Retrieval

Published in 15th IKT (accepted – oral), 2024

In this work, the authors develop a two‑stage retrieval‑augmented generation (RAG) pipeline that answers questions about university resources using locally scraped documents. Each query is first categorized to identify the most relevant subset of documents; a Persian large language model then generates an answer from that subset using a carefully engineered prompt. The paper introduces UniversityQuestionBench (UQB), a benchmark derived from questions frequently asked by students across disciplines, and evaluates the RAG system on three metrics: faithfulness, answer relevance, and context relevance. Experiments show that adding the retrieval steps significantly improves the precision and contextual relevance of generated answers compared with baseline models.
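The two-stage flow described above (categorize the query, retrieve within that category, then build a grounded prompt) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy corpus, the token-overlap scoring, the prompt wording, and all function names are assumptions made for the example, and the text is in English rather than Persian for readability. The abstract does not specify the classifier, retriever, or model used, so the final generation call is left out.

```python
from collections import Counter

# Hypothetical toy corpus (not from the paper): category -> document snippets.
CORPUS = {
    "library": [
        "The central library is open from 8 to 20 on weekdays.",
        "Students may borrow up to five books for two weeks.",
    ],
    "dormitory": [
        "Dormitory applications open at the start of each semester.",
        "Dormitory fees are paid through the student portal.",
    ],
}


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (word.strip(".,?!").lower() for word in text.split())
    return [t for t in tokens if t]


def overlap(query, text):
    """Count tokens shared by query and text (a stand-in relevance score)."""
    shared = Counter(tokenize(query)) & Counter(tokenize(text))
    return sum(shared.values())


def categorize(query):
    """Stage 1: pick the category whose documents best match the query."""
    return max(CORPUS, key=lambda cat: overlap(query, " ".join(CORPUS[cat])))


def retrieve(query, category, k=1):
    """Stage 2: rank documents inside the chosen category only."""
    ranked = sorted(CORPUS[category], key=lambda d: overlap(query, d), reverse=True)
    return ranked[:k]


def build_prompt(query, contexts):
    """Assemble a grounded prompt; the wording here is illustrative."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}\n"
        "Answer:"
    )


def run_pipeline(query):
    """Return the category, retrieved contexts, and prompt for a query."""
    category = categorize(query)
    contexts = retrieve(query, category)
    return category, contexts, build_prompt(query, contexts)


if __name__ == "__main__":
    category, contexts, prompt = run_pipeline("How many books can students borrow?")
    print(category)
    print(prompt)
```

In a real system the overlap score would likely be replaced by a trained classifier and an embedding-based retriever, and the prompt would be passed to the language model; restricting retrieval to one category is what keeps the second stage small.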

PDF


Arshia Hemmat
