ATRIUM
Audio Processing Pipeline · EU Project (September 2024 – December 2025)
ATRIUM is a European project in which CLARIN leads Work Package 6 on Service Interoperability and Integration. I work on the project as an ICT Developer, building an automated audio processing pipeline.
What I work on
- Developing an automated multilingual audio processing pipeline for long-form interview conversations, enabling structured analysis at scale.
- Integrating speaker diarization and LLM-based summarisation, and delivering the structured outputs via APIs for downstream analytics.
- Applying context-aware chunking techniques to reliably summarise long conversations that exceed native LLM context limits.
- Improving the automated speech transcription workflow in the CLARIN transcription portal at BAS, making state-of-the-art ASR more accessible to humanities researchers.
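As a rough illustration of the chunking idea mentioned above, here is a minimal Python sketch, not the project's actual implementation. It assumes the transcript is already diarized into speaker turns, never splits a turn, and carries the last turn of each chunk into the next one so the model keeps some conversational context. The names `Turn`, `count_tokens`, `chunk_turns` and `overlap_turns` are hypothetical, and the word-count tokenizer is a crude stand-in for whatever tokenizer the target LLM uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    """One diarized speaker turn (hypothetical structure)."""
    speaker: str
    text: str


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate the token count.
    # A real pipeline would use the tokenizer of the target model.
    return len(text.split())


def chunk_turns(turns: List[Turn], max_tokens: int,
                overlap_turns: int = 1) -> List[List[Turn]]:
    """Group turns into chunks of roughly max_tokens, never splitting a turn.

    The budget is soft: a single turn longer than max_tokens, or the
    carried-over turns plus one new turn, can exceed it.
    """
    chunks: List[List[Turn]] = []
    current: List[Turn] = []
    current_tokens = 0

    for turn in turns:
        n = count_tokens(turn.text)
        if current and current_tokens + n > max_tokens:
            chunks.append(current)
            # Carry the last few turns over as context for the next chunk.
            current = current[-overlap_turns:] if overlap_turns > 0 else []
            current_tokens = sum(count_tokens(t.text) for t in current)
        current.append(turn)
        current_tokens += n

    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be summarised separately and the partial summaries combined in a final pass; that summarisation step depends on the LLM in use and is left out here.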
Duration: September 2024 – December 2025
Transcription portal: speechandtech.eu/transcription-portal
Project website: ru.nl/onderzoek/onderzoeksprojecten/atrium