ATRIUM

Audio Processing Pipeline · EU Project (September 2024 – December 2025)

ATRIUM is a European project in which CLARIN leads Work Package 6 on Service Interoperability and Integration. Within the project I work as an ICT Developer, building an automated audio processing pipeline.

What I work on

  • Developing an automated multilingual audio processing pipeline for long-form interview conversations, enabling structured analysis at scale.
  • Integrating speaker diarization and LLM-based summarisation, delivering structured outputs via APIs for downstream analytics and decision-making.
  • Applying context-aware chunking techniques to reliably summarise long conversations that exceed native LLM context limits.
  • Improving the automated speech transcription workflow in the CLARIN transcription portal at BAS, making state-of-the-art ASR more accessible to humanities researchers.
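
As a rough illustration of the chunking idea mentioned above (a minimal sketch, not the project's actual code), the example below groups diarized speaker turns into chunks that fit a word budget, breaks only at turn boundaries, and repeats the last turn of each chunk at the start of the next so context carries over. The names chunk_turns, summarise_long, and render, the word-count budget, and the summarise callable are all assumptions made for illustration; a real pipeline would likely count tokens with the model's tokenizer and call an actual LLM.

```python
from typing import Callable, List, Tuple

# A diarized turn: (speaker label, utterance text). Hypothetical format.
Turn = Tuple[str, str]


def chunk_turns(turns: List[Turn], max_words: int = 400,
                overlap_turns: int = 1) -> List[List[Turn]]:
    """Group whole speaker turns into chunks of roughly max_words words.

    Chunks break only at turn boundaries. The last overlap_turns turns of
    a chunk are repeated at the start of the next one. A single turn longer
    than max_words is kept whole, so a chunk can exceed the budget.
    """
    chunks: List[List[Turn]] = []
    current: List[Turn] = []
    count = 0
    for turn in turns:
        n = len(turn[1].split())
        if current and count + n > max_words:
            chunks.append(current)
            # Carry the tail of the finished chunk over as context.
            current = current[-overlap_turns:] if overlap_turns > 0 else []
            count = sum(len(t[1].split()) for t in current)
        current.append(turn)
        count += n
    if current:
        chunks.append(current)
    return chunks


def render(chunk: List[Turn]) -> str:
    """Format a chunk as 'SPEAKER: text' lines for the summariser."""
    return "\n".join(f"{speaker}: {text}" for speaker, text in chunk)


def summarise_long(turns: List[Turn], summarise: Callable[[str], str],
                   max_words: int = 400) -> str:
    """Summarise each chunk, then summarise the joined partial summaries."""
    partials = [summarise(render(c)) for c in chunk_turns(turns, max_words)]
    if len(partials) == 1:
        return partials[0]
    return summarise("\n".join(partials))
```

Here summarise stands in for whatever function sends text to the language model and returns its summary. Splitting at turn boundaries keeps each speaker's utterance intact, and the second summarisation pass merges the per-chunk summaries into one overview.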

Duration: September 2024 – December 2025
Transcription portal: speechandtech.eu/transcription-portal
Project website: ru.nl/onderzoek/onderzoeksprojecten/atrium