RAG for Implicit Relation Detection Between Texts (M.Sc. Thesis)
Lab: Ubiquitous Knowledge Processing Lab (UKP), TU Darmstadt
Period: November 2024 – May 2025 (extended as Student Research Assistant)
Degree: M.Sc. Computer Science, TU Darmstadt
Overview
This thesis automates the detection of implicit links between texts using a Retrieval-Augmented Generation (RAG) approach. It targets non-obvious, semantically implicit relationships between documents, a task where traditional retrieval methods fall short.
Contributions
- Designed and implemented a scalable and robust benchmark with two end tasks, classification and information retrieval, that evaluate a model’s ability to detect implicit relationships in text
- Trained and evaluated models using the F1000RD dataset to assess the effectiveness of RAG in implicit relationship detection
- Developed an unsupervised learning approach based on the RAG architecture to improve model performance in identifying implicit links
- Extended the RAG architecture with heterogeneous retriever and reader variants and designed multiple joint training strategies (including staged training of the reranker, retriever, and reader), improving overall training stability
- Compared the approach comprehensively against baselines and analyzed how key factors affect implicit link detection
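
To illustrate how a benchmark with a classification task and an information retrieval task might be scored, here is a minimal Python sketch. The metric choices (binary F1, recall@k, reciprocal rank) and all function names are assumptions made for illustration; the summary above does not state which metrics the thesis benchmark actually uses.

```python
from typing import Sequence, Set


def f1_score(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Binary F1 for a classification task (assumed: 1 = texts are implicitly linked)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k retrieved items."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none was retrieved."""
    for rank, item_id in enumerate(ranked, start=1):
        if item_id in relevant:
            return 1.0 / rank
    return 0.0


if __name__ == "__main__":
    # Hypothetical toy inputs, not data from the thesis.
    print(f1_score([1, 0, 1, 1], [1, 1, 0, 1]))
    print(recall_at_k(["a", "b", "c", "d"], {"b", "d"}, k=2))
    print(reciprocal_rank(["a", "b", "c"], {"b"}))
```

Scoring the two tasks separately keeps retriever quality (ranking metrics) distinct from reader quality (classification metrics), which is one plausible way to attribute gains when the components are trained in stages.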
