Unlocking Audio 2 - Connecting with Listeners
MEMORIES: Acquiring, enriching, preserving and accessing audio-visual assets
The RSR case study
Jean-François Cosandier (chief officer), Iris Buunk (research assistant)
17 March 2009, British Library, London
The RSR audio archives: current situation
• 100’000 hours of audio archives
• Analogue tapes (85’000 h.)
• Direct cut discs (80’000 discs)
• Audio files (15’000 h.)
• Since 1935
• Interviews, music, programs, news
• 20’000 hours digitized (23% of the entire collection)
Problems and limitations
The RSR audio archives have to face:
• huge collections of audio documents
• metadata of varying quality (or no metadata at all)
• selection at input that is often difficult or impossible
• a growing demand for online consultation (in a very short time)
• a continuous digitization process
The RSR challenges
• Surviving the degradation of the carriers
• Surviving the evolution of representation formats
• Surviving changes in industrial policies
• High-quality digitization
• Rich metadata
• Rich search possibilities
• Fast access to the content
• Digital content in a digital production network
• Semantic audio search
• End-user accessibility
The RSR requests
Main focus: radio interviews
• Identification of typical audio components: voices, noise, music
• Identification of natural segments in the audio document (jingles, announcements, speech, music pieces)
• Detection of single-speaker versus multi-speaker segments
• Speaker voice recognition: the speech of a given speaker is compared with voice samples in a database
• Speech to text: extraction of the textual content of the spoken segments, with the text stored as metadata
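The speaker voice recognition request above (comparing a speaker's speech with voice samples in a database) can be illustrated with a minimal sketch. It assumes each voice sample has already been reduced to a fixed-length feature vector and uses cosine similarity with a hypothetical threshold. The names `identify_speaker`, `voice_db` and the value `0.8` are illustrative assumptions, not part of the MEMORIES tools.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(segment_vector, voice_db, threshold=0.8):
    """Compare one segment against every stored sample.

    voice_db maps a speaker name to a list of sample vectors.
    Returns (name, score), or (None, score) when the best match
    is below the threshold (assumed value, to be tuned).
    """
    best_name, best_score = None, -1.0
    for name, samples in voice_db.items():
        for sample in samples:
            score = cosine_similarity(segment_vector, sample)
            if score > best_score:
                best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Toy database with made-up 3-dimensional vectors (hypothetical data).
voice_db = {
    "speaker_a": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "speaker_b": [[0.1, 0.9, 0.2]],
}

known = identify_speaker([0.85, 0.15, 0.05], voice_db)
unknown = identify_speaker([0.0, 0.0, 1.0], voice_db)
```

In this toy setup `known` resolves to a database speaker, while `unknown` falls below the threshold and is left unidentified, which is the case where an archivist would have to label the voice by hand.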
MEMORIES contribution for users
• The Clip Manager
  • Segmentation
  • Speaker identification
  • Integration of metadata
  • Presentation of the speech-to-text result
  • AXE generation
• The Speech-to-text tool
• The Single Sensor Source Separation tool
• The Asset Management tool (FilmLibrary)
• The Advanced Search tool