A retrieval-augmented-generation pipeline to help users query system-provided documentation
Speakers: Tzolkin GARDUNO ALVARADO & Gunnar Wolf
Track: Artificial intelligence & Debian
Type: Academic paper
Room: Petit amphi
Time: Jul 15 (Tue): 09:30
Duration: 0:20
The increasing integration of AI into computing workflows demands a re-evaluation of traditional operating system design. In environments like Debian, users are often faced with a vast ecosystem of command-line tools, each accompanied by extensive manual pages (man pages) detailing usage, flags, and parameters. While comprehensive, these documents are frequently dense, verbose, and not well-suited for rapid onboarding or targeted queries. We propose a retrieval-augmented generation (RAG) pipeline to bridge this gap, enabling natural language interaction with system documentation. By combining tokenization, embedding, and dense retrieval with a language generation model, our system allows users to query tool usage in plain language and receive concise, contextually relevant responses. This approach streamlines tool discovery and comprehension, and represents a step toward more intelligent, user-aware operating systems.
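As a rough illustration of the pipeline outlined in the abstract (tokenization, embedding, dense retrieval, then generation), the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the hashed bag-of-words `embed` function is a toy stand-in for a learned embedding model, the `CHUNKS` entries are hand-written paraphrases rather than real man page text, and `build_prompt` only assembles the text that a language generation model would receive, without calling any model.

```python
import hashlib
import math
import re

DIM = 1024  # size of the toy embedding vector (illustrative assumption)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy stand-in for a learned embedding model.

    Hashes each token into one of DIM buckets (bag of words) and
    L2-normalises the result so a dot product equals cosine similarity.
    """
    vec = [0.0] * DIM
    for tok in tokenize(text):
        idx = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity of two already-normalised vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query, chunks, k=2):
    """Return the k chunks whose embeddings are closest to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:k]


def build_prompt(query, hits):
    """Assemble the text that would be handed to a generation model."""
    context = "\n".join(f"[{h['source']}] {h['text']}" for h in hits)
    return (
        "Answer using only the context below.\n\n"
        f"{context}\n\nQuestion: {query}\nAnswer:"
    )


# Hypothetical, hand-written chunks standing in for indexed man page sections.
CHUNKS = [
    {"source": "ls(1)", "text": "ls lists directory contents. -a do not ignore entries starting with a dot."},
    {"source": "grep(1)", "text": "grep searches for patterns in each file. -i ignore case distinctions in patterns."},
    {"source": "tar(1)", "text": "tar is an archiving utility. -x extract files from an archive."},
]

if __name__ == "__main__":
    question = "how do I extract files from an archive"
    hits = retrieve(question, CHUNKS, k=1)
    print(build_prompt(question, hits))
```

In a real deployment, the chunks would presumably come from the man pages installed on the system, `embed` would be replaced by a trained sentence-embedding model, and the assembled prompt would be passed to the language model that writes the final answer.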