310 likes | 517 Views
Optical Music Recognition. Ichiro Fujinaga McGill University 2003. Content. Optical Music Recognition Levy Project Levy Sheet Music Collection Digital Workflow Management Gamera Guido / NoteAbility. Optical Music Recognition (OMR).
E N D
Optical Music Recognition Ichiro Fujinaga McGill University 2003
Content • Optical Music Recognition • Levy Project • Levy Sheet Music Collection • Digital Workflow Management • Gamera • Guido / NoteAbility
Optical Music Recognition (OMR) • Trainable open-source OMR system in development since 1984 • Staff recognition and removal • Run-length coding • Projections • Lyric removal / classifier • Stems and notehead removal • Music symbol classifier • Score reconstruction Demo
OMR: Classifier • Connected-component analysis • Feature extraction, e.g: • Width, height, aspect ratio • Number of holes • Central moments • k-nearest neighbor classifier • Genetic algorithm
Overall Architecture for OMR Image File Staff removal Segmentation Recognition K-NN Classifier Output Symbol Name Optimization Genetic Algorithm K-nn Classifier Knowledge Base Feature Vectors Best Weight Vector Off-line
Lester S. Levy Collection • North American sheet music (1780–1960) • Digitized 29,000 pieces • including “The Star-Spangle Banner” and “Yankee Doodle” • Database of: • text index records • images of music (8bit gray) • lyrics (first lines of verse and chorus) • color images of cover sheets (32bit)http://levysheetmusic.mse.jhu.edu
Digital Workflow Management • Reduce the manual intervention for large-scale digitization projects • Creation of data repository (text, image, sound) • Optical Music Recognition (OMR) • Gamera • XML-based metadata • composer, lyricist, arranger, performer, artist, engraver, lithographer, dedicatee, and publisher • cross-references for various forms of names, pseudonyms • authoritative versions of names and subject terms • Music and lyric search engines • Analysis toolkit
The problem • Suitable OCR for lyrics not found • Commercial OCR systems are often inadequate for non-standard documents • The market for specialized recognition of historical documents is very small • Researchers performing document recognition often “re-invent” the basic image processing wheel
The solution • Provide easy to use tools to allow domain experts (people with specialized knowledge of a collection) to create custom recognition applications • Generalize OMR for structured documents
Introducing Gamera • Framework for creation of structured document recognition system • Designed for domain experts • Image processing tools (filters, binarizations, etc.) • Document segmentation and analysis • Symbol segmentation and classification • Feature extraction and selection • Classifier selection and combiners • Syntactical and semantic analysis Generalized Algorithms and Methods for Enhancement and Restoration of Archives
Features of Gamera • Portability (Unix, Windows, Mac) • Extensibility (Python and C++ plugins) • Easy-to-use (experts and programmers) • Open source • Graphic User Interface • Interactive / Batchable (scripts)
Scripting Environment (Python) Automatic Plugin Wrapper (Boost) Architecture of Gamera Graphic User Interface (wxWindows) Plugins (Python) Plugins (C++) GAMERA Core (C++)
Example of C++ Plugin // Number of pixels in matrix #include “gamera.hh” #ifdef __area_wrap__ #define NARGS 1 #define ARG1_ONEBIT #endif using namespace Gamera; template <class T> feature_t area(T &m) { return feature_t(m.nrows() * m.ncols()); }
Example of Python Plugin // This filters a list of CC objects import gamera def filter_wide(ccs, max_width): tmp = [] for x in ccs: if x.ncols() > max_width: x.fill_matrix(0) else: tmp.append(x) return tmp
“A formal language for score-level representation” Plain text: readable, platform independent Extensible and flexible Adequate representation NoteServer: Web/Windows GUIDO/XML NoteAbility (K. Hamel) GUIDO Music Notation FormatH. Hoos, K. Renz, J. Kilian
GUIDO: An example { [ \beamsOff | \clef<"treble"> \key<"D"> f#*1/8. g*1/16 | a*1/4. d2*1/8 d*1/4. c#*1/8 | e1*1/2 _*1/4 f#*1/8. g*1/16 | c#2*1/4. b1*1/8 a*1/4. g*1/8 | | e#*1/2 f#*1/4 f#*1/8. g*1/16 | a*1/4. d2*1/8 d*1/4. c#*1/8 | e1*1/2 _*1/4 f#*1/8 g | c#2*1/4. b1*1/8 a*1/4. c#*1/8 ], …
Conclusions • Gamera allows rapid development of domain-specific document recognition applications • Domain experts can customize and control all aspects of the recognition process • Includes an easy-to-use interactive environment for experimentation • Beta version available on Linux • OS X version in preparation
Projections X-projections Y-projections back