Max-Planck-Institute for Psycholinguistics Research and Technical Facilities
Research: Structure and Tasks

Chart: Directorate; FC Donders Center for Neuro-imaging; Interfacultaire Werkgroup Taal & Spraak; Acquisition Group; Comprehension Group; Production Group; Language & Cognition; Neurocognition; Technical Group

• Acquisition Group: principles underlying the acquisition of languages by adults and children
• Comprehension Group: principles that let us understand what people are saying
• Production Group: principles that allow us to form ideas into utterances
• Language & Cognition: the underlying relation between language and thought
• Neurocognition Group: functional architecture of the brain; processes observed with MRI, MEG, and EEG
• IWTS: group established at the KUN with a complementary task
• FCDC: brand-new center for neuroimaging, in collaboration with the MPI
• TG: support for all technical and methodological aspects; development of tools, setups, and methods

The institute is strongly oriented toward experimental and observational work; it follows a data-driven approach (similar to physics).
Technical Group: Structure and Tasks

TG: Support 62%, Development 32%, Overhead 6%; external money
Figures shown on the chart, in source order: 0.8, 1.7, 1.0, 1.1, 2.3, 2.1, 0.6, 2.3, 0.6, 3.2, 1.6, 1.7, 1.3, 0.3

Units and their tasks:
• Information Services: Linguistics DB, Equipment DB, MEID Intranet, Picture DB, PrePrint Server, Web-Site, scientific DBs
• Experiment Services: in house setups, out of house, eye tracker labs, gesture lab, ERP labs, child labs, MRI exp, ext collaborators, exp devices
• Electronic Services: av maintenance, electronic boxes, fieldwork support, PC hardware, PC setup, mechanical work
• AV Services: video lab, observation labs, av copying, av editing, helpdesk
• Desktop Services: helpdesk, SW support, PC images, printer support, guest support, test new tech
• Server Network: NT server, Unix server, storage sys, backup, email, web, network HW, network services
• Digital Media: digi setups, digi SW, conversion, cutting, editing, standards
• Corpus Manag: workflow, metadata org, archive org, scripts, conversion, copying, DVD burning
• Scripts Programs: scientific scripts, scientific programs
• EUDICO Tool Set: multimedia annotation, multimedia visualization, multimedia search, UNICODE
• Browsable Corpus: metadata standard, MD editor, MD browser, conversion, DC mapping
• NESU 2: NESU 2 builder, NESU 2 runner, graphics, X technology, NESU HW
• Animations Artwork: 3D animations, life-like creatures, posters, graphic effects, photos
• Overhead: administration, organization, training, management, BAR member, EC councils, intern. org.
Technical Group: MEID Intranet

MEID (Max-Planck-Institute's Electronic Information Desk) is the central source for various sorts of information:
• Calendar information
• Room reservation
• Access to archives (Preprint Server, …)
• Electronic journals and scientific info DB
• Absence information
• Various forms to request technical services
• Research information (experiment schedule, Picture DB, Experiment Design DB, …)
• Technical information of all sorts

MEID is a highly automated, interactive information system; behind the user interface elements are universal databases such as the personnel database.
Technical Group: Server, Storage, and Network Systems

Diagrams: MPI-Net 2001, MPI-Computers 2001, MPI-Storage 2001
Diagram labels: Fat Servers, Gigabit Switch, Power Clients/Small Servers, Low Power Clients, Fast Ethernet, 10 Mbit/s

Network highlights
• SURFnet to/from MPI: 155 Mbit/s
• MPI internal: Gigabit switch (12 x 1000 Mbit/s, 240 x 100 Mbit/s, n x 10 Mbit/s)
  • 10 Mbit/s used by thin clients
  • 100 Mbit/s used by power users
  • 1000 Mbit/s used by fat servers
• Network security is currently achieved mainly by port filtering on the router.

Computer highlights
• UNIX fat servers: 3 SUN E450, 1 SUN E250, 1 HP D275
  • the SUN servers are all SPARC II systems with 2 or 4 CPUs (300 MHz or 400 MHz), 1-2 GB RAM, a 1000 Mbit/s network interface, and 150-440 GB of local disk space
  • SAN solution with RAID 5 volumes (5 x 6 x 72 GB disks = 2.1 TB gross), connected to 2 SUN servers via Fibre Channel host adapters
• NT fat servers: 3 transtec 2500, 1 transtec 2600
  • the transtec servers are all Intel III systems with 2 CPUs (400 MHz), 750 MB RAM, 1000 Mbit/s network interfaces, and 100 GB of local disk space
  • SAN solution with a RAID 5 volume on one NT server (6 x 72 GB disks = 430 GB)

Storage highlights
On most UNIX fat servers, JBODs are used to configure volumes of different sizes for different kinds of data. The main categories are programs, user data, archive data, and corpora data. Two systems are part of a SAN; its other components are a SAN Fibre Channel switch and a RAID storage system. The categories of data are stored on different servers: one server is mainly used as a file server for user data and programs, the other stores archive and corpora data. A third server functions as backup master for all client systems (UNIX and NT).
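The gross capacities quoted for the two SAN configurations follow from simple multiplication. The sketch below reproduces that arithmetic in Python. The usable-capacity function is an illustration only: it assumes each 6-disk group is its own RAID 5 set that gives up one disk's worth of space to parity, which the slide does not state. The function names are hypothetical.

```python
def gross_capacity_gb(groups: int, disks_per_group: int, disk_gb: int) -> int:
    """Raw capacity in GB before any RAID overhead."""
    return groups * disks_per_group * disk_gb


def raid5_usable_gb(groups: int, disks_per_group: int, disk_gb: int) -> int:
    """Usable capacity in GB, ASSUMING every group is a separate RAID 5 set,
    so one disk per group is consumed by parity."""
    return groups * (disks_per_group - 1) * disk_gb


if __name__ == "__main__":
    # UNIX SAN: 5 x 6 x 72 GB disks
    print(gross_capacity_gb(5, 6, 72))  # 2160 GB, quoted as 2.1 TB gross
    print(raid5_usable_gb(5, 6, 72))    # 1800 GB under the stated assumption
    # NT server: 6 x 72 GB disks
    print(gross_capacity_gb(1, 6, 72))  # 432 GB, quoted as 430 GB
```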
Backup system
Tape library ETL 7/3500: 4 x DLT 7000 (35-70 GB), 95 slots, maximum total capacity 3.5 TB, robot handling mechanism with barcode reader, SCSI interface, transfer rate of 36 GB/hr. This tape library was also part of the Hierarchical Storage Management (HSM) system, which will be replaced this year by a more sophisticated system.
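The quoted transfer rate gives a rough feel for backup windows. The sketch below is a back-of-the-envelope estimate in Python. It assumes the 36 GB/hr figure is a sustained aggregate rate and ignores tape changes, compression, and incremental backups; the function name is hypothetical.

```python
def backup_hours(data_gb: float, rate_gb_per_hr: float = 36.0) -> float:
    """Hours needed to stream data_gb to tape at a constant rate (assumption)."""
    if rate_gb_per_hr <= 0:
        raise ValueError("rate must be positive")
    return data_gb / rate_gb_per_hr


if __name__ == "__main__":
    # Full copy of the 5 x 6 x 72 GB SAN disks (2160 GB gross)
    print(backup_hours(2160))  # 60.0
    # Full copy of the 6 x 72 GB NT volume (432 GB gross)
    print(backup_hours(432))   # 12.0
```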
Technical Group: Experimental Facilities

NESU (Nijmegen Experiment Setup), Version 2: universal experiment builder and runner
Screen shots: Experiment Builder, Experiment Runner, Experiment Performance Analyzer

Major NESU characteristics
• Win 2000 support
• realtime guarantees (< 1 ms)
• fast audiovisual stimuli from computer
• experiment browser
• reverse experiment designing
• graphical experiment builder (simple design by mouse)
• easy to use experiment runner
• short prototyping cycle
• hardwareless prototyping
• orthogonal design (separation of timing and structure)
• special hardware for high accuracy measurements
• easily adaptable to external equipment such as MRI
• adaptable object-oriented code (Smalltalk)
• application of DirectX technology
• fast hardware drivers
• included performance analysis

Experiment types: Stimulus-Response Experiments, Eye Tracking Experiments, Gesture Experiments, Child Experiments, Cognition Experiments

Setups
• Groups Experiment: 10 parallel subjects
• Groups Experiment: 4 parallel subjects
• 11 single subject rooms
• many portable setups (notebooks)
• Single Reflection System
• Eyeview System
• Eyelink System 1, Eyelink System 2
• Gesture Lab
• Child Exp Setup
• Child Exp Setup with EEG
• ERP Lab 1, ERP Lab 2
• out of house: MRI Setup, MEG Setup
Technical Group: Eye Tracking Labs

Eye Tracking Setup
Diagram labels: Subject Screen, Subject PC, Tracker PC, Eyelink System, NESU Exp Runner, Dedicated Network, Headband, Fixation Analysis, Statistical Analysis; data flows: Controls, Feedback, Gaze Data, Responses, Reaction Responses

Measurement Principles and Head Movement Compensation
Two miniature high-speed (250 Hz) IR cameras record pupil position and shape. Fast hardware is used for the geometrical calculation of gaze. During the calculation, head movements are compensated by recording marker positions.

Keypoints
• 2 PCs act as tracker and subject display systems, connected by Ethernet adapters
• The subject PC runs under NESU and shows stimuli to the subject; it also controls the tracker PC
• The eye tracker extension for NESU controls the flow of the experiment
• The subject PC delivers visual and auditory stimuli
• The tracker PC records both eyes (selectable) and saves the data to a binary file
• The eyes are lit by an infrared light source (LED array) and seen by CCD chip cameras
• Tracking possible at max. 250 Hz (selectable)
• Automatic correction for head movements
• Automatic detection of events (fixations, saccades, etc.)
• Tracking based on the center of a circle representing the pupil
• Measurement of pupil diameter possible
• Interactive fixation analysis (not automatic)
• Interactive association between fixations, mouse movements, and graphical objects
• Feedback experiments where screen contents are manipulated depending on gaze

Data Flow & Analysis
Figure: typical graphical representation of fixation patterns during interactive fixation analysis. In the sample experiment, mouse trails were also recorded (see black points).
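The slide mentions automatic detection of fixations but does not say how it is done. As an illustration only, the sketch below shows dispersion-threshold identification (I-DT), a widely used generic technique, applied to gaze samples at 250 Hz. It is not claimed to be the Eyelink system's or NESU's method. The thresholds (25 pixels, 100 ms) and all names are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # gaze position (x, y) in screen pixels


@dataclass
class Fixation:
    start_ms: float
    end_ms: float
    x: float  # centroid of the samples in the fixation
    y: float


def _dispersion(window: Sequence[Sample]) -> float:
    """Horizontal extent plus vertical extent of the samples in the window."""
    xs = [p[0] for p in window]
    ys = [p[1] for p in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(
    samples: Sequence[Sample],
    rate_hz: float = 250.0,            # sampling rate quoted on the slide
    max_dispersion_px: float = 25.0,   # illustrative threshold (assumption)
    min_duration_ms: float = 100.0,    # illustrative threshold (assumption)
) -> List[Fixation]:
    ms_per_sample = 1000.0 / rate_hz
    min_len = max(1, round(min_duration_ms / ms_per_sample))
    fixations: List[Fixation] = []
    start = 0
    while start + min_len <= len(samples):
        end = start + min_len  # exclusive end index of the candidate window
        if _dispersion(samples[start:end]) <= max_dispersion_px:
            # Grow the window while the samples stay within the threshold.
            while (end < len(samples)
                   and _dispersion(samples[start:end + 1]) <= max_dispersion_px):
                end += 1
            window = samples[start:end]
            cx = sum(p[0] for p in window) / len(window)
            cy = sum(p[1] for p in window) / len(window)
            fixations.append(
                Fixation(start * ms_per_sample, end * ms_per_sample, cx, cy))
            start = end
        else:
            start += 1  # slide the window forward by one sample
    return fixations


if __name__ == "__main__":
    # Synthetic data: 200 ms at one point, then 200 ms at another.
    data = [(100.0, 100.0)] * 50 + [(400.0, 300.0)] * 50
    for fix in detect_fixations(data):
        print(fix)
```

Samples that never settle for the minimum duration, such as those recorded during a saccade, fall outside every window and are skipped.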
Technical Group: Corpus Building

Workflow from audio/video recordings to the multi-media archive
Diagram elements: Field Recordings; Digitization Process (audio tape and video tape digitization, 4 video and 4 audio setups); computer temporary storage; Hierarchical Storage Management System (RAID system 3 TB, tape library 25 TB, 2nd copies); Metadata Browse & Search Universe

Roles
• User: sees only the metadata universe, i.e. operates in a conceptual browse and search domain.
• Corpus Manager: has to organize the digitization process, the corpus storage, and the MD domain.
• System Manager: is responsible for reliable storage mechanisms, enough capacity, and fast access.
Technical Group: Fieldwork & Expedition Support

Expedition Schedule and Planning
In 2001 about 25 field trips were prepared and equipped. Each field trip is entered in a planning document.

Typical Field Equipment Set
A typical equipment set for an expedition includes various power supply devices as well as recording and annotation equipment.

Equipment Database
The Technical Group has a central database which covers all equipment we have ordered and all persons at the institute. This DB is used to control the flow of equipment and the status of every unit.

Software Setup for Field Trips
The screen shot indicates the type of software installed on a field notebook. It contains digitization tools, media inspection tools, experiment tools, and tools to create metadata descriptions, notes, and annotations.

Equipment Check & Maintenance Cycle
Diagram: before-trip check & maintenance, after-trip check & maintenance, storeroom
When equipment is returned from the field, it is briefly checked for severe damage. The before-trip maintenance is done very carefully, since guarantees have to be given to the field researchers.

Miniaturization & Robustness
Miniaturization and robustness are the two major requirements for field work. Therefore the MPI is always looking for the newest technology. However, only experience in the field can tell us whether both go together.
Technical Group: Browsable Corpus

Metadata Vision MPI 98
A world-wide interconnected domain, connected by a simple URL mechanism. Sites shown: Lund, Helsinki, Lancaster, ICE, LDC, Japan, Childes, China, SIL, ???, AIATSIS, MPI Nijmegen, MPI Leipzig, Lacito, ELRA

ISLE Metadata Initiative (IMDI): open international standard
• Tools: user-friendly generation of metadata descriptions which adhere to open standards; creation of a conceptual domain
• Metadata browsers: allow users to operate in a conceptual domain including all metadata descriptions adhering to standards; immediate execution of a useful tool on the chosen set of files

Figure: typical browsable hierarchy
Technical Group: EUDICO Tool Set

Complexity of Multi-Modal Annotations
Diagram labels: time axis t; video camera 1 (t1, t1*); video camera 2 (t2, t2*); eye tracker 1 (t3, t3*); tiers such as transcription, morphology, left eye, r.h. gesture, r.h. gesture phase, r. hand; easily > 50 tiers

EUDICO Architecture Enabling Distributed Operation
Diagram labels: EUDICO, GATE, XML, Tipster, CHAT, AIF, GDB, ATLAS Appl.

Abstract Corpus Model (ACM), the Nucleus of EUDICO
The ACM was designed to represent many of the current annotation formats in order to achieve format independence. It covers time scales, independent streams, partial time alignment, a large number of tiers, hierarchies, and (labeled) references.

Tools: EUDICO Annotation Tool, EUDICO Visualization Tool, EUDICO Search Tool

EUDICO Annotation Tool
• flexible tier definition
• character selection support for Chinese
• powerful audio and video segment definition
The annotation tool combines all modern concepts of defining segments in audio and video signals, has input methods for many languages and character sets, and generates UNICODE and XML-structured files.

EUDICO Annotation Format: flexible XML-based format
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ANNOTATION_DOCUMENT >
<ANNOTATION_DOCUMENT DATE="July 17, 2001" AUTHOR="Hennie Brugman" VERSION="1.0">
  <HEADER TIME_UNITS="milliseconds" MEDIA_FILE="file:/server/media.mpg"/>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="1000"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="2000"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="3000"/>
    <TIME_SLOT TIME_SLOT_ID="ts4"/>
    <TIME_SLOT TIME_SLOT_ID="ts5" TIME_VALUE="5000"/>
    <TIME_SLOT TIME_SLOT_ID="ts6" TIME_VALUE="6000"/>
  </TIME_ORDER>
  <TIER TIER_ID="t1" LINGUISTIC_TYPE_REF="orthography" PARTICIPANT="jan" DEFAULT_LOCALE="IPA-96">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>utterance 1</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION>

Viewers: Subtitle Viewer, Grid Viewer, Time Line Viewer, Compact Viewer
Input methods are available for various character sets such as IPA.
EUDICO generates XML-structured files following the flexible EAF schema. Different user-adjustable viewers allow users to view their data in a flexible way. Other useful viewers will be added.
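To show how such a file can be consumed, the sketch below reads a fragment modeled on the excerpt with Python's standard `xml.etree.ElementTree` module and resolves each annotation's time slot references to millisecond values. The fragment is shortened, and its closing tags are added here so that it parses; they are an assumption, since the slide's excerpt is cut off. The function names are hypothetical. A slot without `TIME_VALUE` (such as `ts4`) is kept as `None`, which reflects partial time alignment.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional, Tuple

# Shortened fragment modeled on the slide; closing tags are an assumption.
SAMPLE = """<ANNOTATION_DOCUMENT DATE="July 17, 2001" AUTHOR="Hennie Brugman" VERSION="1.0">
  <HEADER TIME_UNITS="milliseconds" MEDIA_FILE="file:/server/media.mpg"/>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="1000"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="2000"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="3000"/>
    <TIME_SLOT TIME_SLOT_ID="ts4"/>
  </TIME_ORDER>
  <TIER TIER_ID="t1" LINGUISTIC_TYPE_REF="orthography" PARTICIPANT="jan" DEFAULT_LOCALE="IPA-96">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>utterance 1</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""

# (tier id, start in ms or None, end in ms or None, annotation text)
Row = Tuple[Optional[str], Optional[int], Optional[int], str]


def read_time_slots(root: ET.Element) -> Dict[Optional[str], Optional[int]]:
    """Map each TIME_SLOT_ID to its time in ms, or None if unaligned."""
    slots: Dict[Optional[str], Optional[int]] = {}
    for slot in root.iter("TIME_SLOT"):
        value = slot.get("TIME_VALUE")
        slots[slot.get("TIME_SLOT_ID")] = int(value) if value is not None else None
    return slots


def read_annotations(root: ET.Element) -> List[Row]:
    """List every alignable annotation with its resolved begin and end time."""
    slots = read_time_slots(root)
    rows: List[Row] = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            rows.append((
                tier.get("TIER_ID"),
                slots.get(ann.get("TIME_SLOT_REF1")),
                slots.get(ann.get("TIME_SLOT_REF2")),
                ann.findtext("ANNOTATION_VALUE", default=""),
            ))
    return rows


if __name__ == "__main__":
    document = ET.fromstring(SAMPLE)
    print(read_annotations(document))  # [('t1', 1000, 3000, 'utterance 1')]
```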
Technical Group: Exchanging Documents with Unicode and XML

Goals of Unicode
• No mixture of fonts and character sets anymore
• No font conversion necessary anymore
• No double usage of ordinal positions anymore
• Unifying characters as much as possible

Unicode Usage
EUDICO programs support Unicode by the use of a special editor. A wide variety of languages is supported by offering virtual keyboards. For some languages such as Chinese, lookup windows are generated on screen to offer all characters as selectable items.

Unicode Overview
Figure: Unicode Character Allocation, taken from "The Unicode Standard Version 3.0", Addison-Wesley

Conversion of MS Word documents to XML
Transcripts and other linguistic documents are often written in MS Word using idiosyncratic structures, although this format is not open and not suitable for archiving and further analysis. At MPI we developed a flexible converter which allows users to describe their file structure in simple terms (Annotation Description Language, ADL) such that XML files following the EUDICO Annotation Format (EAF) are created.

Word file
  aaaaaa aaaaaa a ‡
  bbbbbb bb bbbbbbb
  dddd ddddddddd
  a2a2a2a2a2a2a2a2 ‡ î ã Œ
  111 111 11111
  2222 222 2222
  3333 3333 33 3333

ADL file
  ADL BLOCK = WRAPED TIER transcription, BOLD TIER english, ITALIC TIER morpho +

XML file
  <adlf>
    <block>
      <sentence name="transcription">aaaaaa aaaaaa a ‡ a2a2a2a2a2a2a2a2 ‡ î ã Œ </sentence>
      <sentence name="english">bbbbbb bb bbbbbbb </sentence>
      <sentence name="morpho">dddd ddddddddd </sentence>
    </block>
    <block>
      <sentence name="transcription">111 111 11111</sentence>
      <sentence name="english">2222 222 2222</sentence>
      <sentence name="morpho">3333 3333 33 3333</sentence>
    </block>
  </adlf>
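The last step of such a conversion can be sketched in Python as follows. This is not the MPI converter. It assumes that an earlier step has already split the document into blocks and labeled each text run with its tier (for example by font style, as the ADL line suggests), and it only shows how the runs of a wrapped tier are merged and written in the `<adlf>` layout of the sample. The function and type names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Sequence, Tuple

Run = Tuple[str, str]  # (tier name, text of one run); labeling done upstream


def blocks_to_adlf(blocks: Sequence[Sequence[Run]],
                   tier_order: Sequence[str]) -> str:
    """Build an <adlf> document: one <block> per input block and one
    <sentence> per tier, joining the runs of a wrapped tier with spaces."""
    root = ET.Element("adlf")
    for runs in blocks:
        merged: Dict[str, List[str]] = {name: [] for name in tier_order}
        for tier, text in runs:
            if tier not in merged:
                raise ValueError(f"unknown tier: {tier!r}")
            merged[tier].append(text.strip())
        block_el = ET.SubElement(root, "block")
        for name in tier_order:
            if merged[name]:  # skip tiers that have no text in this block
                sentence = ET.SubElement(block_el, "sentence", name=name)
                sentence.text = " ".join(merged[name])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    tiers = ["transcription", "english", "morpho"]
    blocks = [
        [("transcription", "aaaaaa aaaaaa a"),
         ("english", "bbbbbb bb bbbbbbb"),
         ("morpho", "dddd ddddddddd"),
         ("transcription", "a2a2a2a2")],  # wrapped continuation of the first tier
        [("transcription", "111 111 11111"),
         ("english", "2222 222 2222"),
         ("morpho", "3333 3333 33 3333")],
    ]
    print(blocks_to_adlf(blocks, tiers))
```

Keeping tier detection separate from XML generation means the same output step could serve other source formats, as long as they deliver labeled runs.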