Replicating Linguistic Resources
B2SAFE: MPI-TLA CLARIN Center
Willem Elbers (MPI-TLA)
2nd EUDAT Conference
Date: 29 October 2013
The Language Archive
• Data on languages:
  • about 60 terabytes of well-described resources
  • about 20,000 hours of digitized audio/video recordings
  • about 73,000 metadata-described sessions
  • about 4.5 million annotated segments
  • data on more than 200 languages
  • among these, data from about 60 DOBES teams
  • acquisition, speech, multimodal, multilingual, language and cognition, brain imaging, ethnological and other data
• Mission:
  • Maintain access to all stored resources for the current generation of researchers, language communities and the interested public.
  • Preserve the valuable cultural heritage for current and future generations.
B2SAFE
• Goals
  • Replication of data
    • B2SAFE!
  • Replication of services
    • RZG providing Language Archive Technology services at the replica side
• B2SAFE Community extensions:
  • Replication based on the logical structure defined in the IMDI/CMDI metadata
  • Integrated with the underlying SAM-FS
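The idea of replicating along the logical structure defined in the metadata can be illustrated with a minimal sketch. It assumes that each metadata record lists its resource files and its child records. `MetadataNode`, `collect_replication_set` and the file paths are hypothetical names made up for illustration; they are not part of B2SAFE, IMDI or CMDI, and the sketch does not show how the archive actually performs replication.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataNode:
    """Hypothetical stand-in for one metadata record (corpus or session)."""
    path: str                                       # location of the metadata file
    resources: list = field(default_factory=list)   # resource files it describes
    children: list = field(default_factory=list)    # child MetadataNode objects


def collect_replication_set(root):
    """Walk the metadata tree depth-first and list every path once.

    Each metadata file is followed by the resources it describes, so a
    replica receives the description together with the data.
    """
    seen = set()
    ordered = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.path in seen:
            continue  # a record linked from several parents is listed once
        seen.add(node.path)
        ordered.append(node.path)
        for resource in node.resources:
            if resource not in seen:
                seen.add(resource)
                ordered.append(resource)
        # Reverse so that children are visited in their listed order.
        stack.extend(reversed(node.children))
    return ordered


# Illustrative data only: one corpus record linking the same session twice.
session = MetadataNode(
    "corpus/session1.imdi",
    resources=["corpus/media/rec1.wav", "corpus/annot/rec1.eaf"],
)
corpus = MetadataNode("corpus/corpus.imdi", children=[session, session])
replication_set = collect_replication_set(corpus)
```

Here `replication_set` holds the corpus record, the session record and the two resource files, each exactly once, in the order a replication job could process them.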
[Figure: replication overview. Labels: approx. 80 TB; EUDAT; > 3 TB; > 3 TB]
Summary
“Cultural Heritage Data replicated for the future”
• Data replication running in production
• LAT software stack running at RZG (beta)
• Replication of authorization records running (beta)
Questions
Thank you for your attention