MPEG-4 Systems and DMIF
21st-Century Promising Core Component Technology Seminar
Doug Young Suh, Ph.D., Kyung Hee University, Suh@khu.ac.kr
Outline • Overview • ISO/IEC 14496-1 MPEG-4 Systems • ISO/IEC 14496-6 DMIF
Overview • MPEG-4 Systems : interactive audio-visual scene • 14496-1 MPEG-4 Systems • 14496-2 MPEG-4 Video • 14496-3 MPEG-4 Audio
Interactive VOD Based on MPEG-4
[Figure: block diagram of an MPEG-4 server and client. Labels on the server side: Authoring Tool, BIFS Encoder, Video Encoder, Audio Encoder, MP4 File, Video ES, Audio ES, SL, FlexMux, DMIF (CallSetup, Control), TransMux, RTP/UDP/IP. Labels on the client side: BIFS Composition, Video Decoder, MP4 File, SL, FlexMux, DMIF (CallSetup, Control), TransMux, RTP/UDP/IP.]
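Concept 1 : Layered Model
[Figure: a Korean philosopher and a Nepalese philosopher communicate through three layers.]
• Philosopher layer : Korean philosopher and Nepalese philosopher, linked by the philosopher protocol
• Philosopher/interpreter interface
• Interpreter layer : Korean-English interpreter and Nepali-English interpreter, linked by the interpreter protocol
• Interpreter/communication interface
• Communication layer : Korean carrier and Nepalese carrier, linked by the communication protocol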
Concept 2: Object-oriented • Encapsulation : data, method • Inheritance • Not object-based
[Figure: class Human (Name, Age, Call()) with derived classes Employee (Salary, Fire()) and Customer (Balance, Register())]
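The class names on this slide can be sketched in Python to show encapsulation and inheritance together. The method bodies, the `employed` and `registered` attributes, and the sample values are illustrative assumptions; the slide names only the attributes and methods.

```python
class Human:
    """Base class: encapsulates data (name, age) and a method (call)."""

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def call(self):
        # Illustrative body; the slide only names the method.
        return f"Calling {self.name}"


class Employee(Human):
    """Inherits name, age, and call() from Human; adds salary and fire()."""

    def __init__(self, name, age, salary):
        super().__init__(name, age)
        self.salary = salary
        self.employed = True  # hypothetical state flag

    def fire(self):
        self.employed = False


class Customer(Human):
    """Inherits from Human; adds balance and register()."""

    def __init__(self, name, age, balance=0):
        super().__init__(name, age)
        self.balance = balance
        self.registered = False  # hypothetical state flag

    def register(self):
        self.registered = True


if __name__ == "__main__":
    worker = Employee("Kim", 30, 1000)
    print(worker.call())  # call() is inherited from Human
```

The OD descriptors later in the deck follow the same pattern: each concrete descriptor extends a common base class.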
The ISO/IEC 14496 terminal architecture
[Figure: terminal architecture showing 14496-1 Systems, 14496-2 Video, 14496-3 Audio, and 14496-6 DMIF]
Tools in Systems • Terminal model with time and buffer management • BIFS (Binary Format for Scenes) • OD (Object Descriptor) • Interface to IPMP systems • SL (Sync Layer) • FlexMux • MPEG-Java : an application engine
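14496-1 Terminology
• A scene is composed of one or more objects. Example: in a weather-forecast scene there are a person (object 1) and a background (object 2), and sound (object 3) is played.
• ES (elementary stream) : compressed media data, usually in 1:1 correspondence with an object
• AU (access unit) : usually one VOP for video and one frame (e.g. 10 ms) for audio
• CU (composition unit) : the smallest unit that can be handled independently after decoding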
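Systems Buffer Model
• DB (decoding buffer) : absorbs bitrate variation and network jitter
• CM (composition memory) : used for prediction (P-, B-VOP); absorbs differences in CU decoding time
• DB and CM together determine the initial delay
• CM should be kept as small as possible (especially on PDAs)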
Time Model
• Why it is needed
  • Lip synchronization : CTS, DTS
  • Clock recovery : e.g. broadcast, IMT-2000
• Assumptions
  • At its DTS, an AU is decoded instantaneously and removed from the DB, and the decoded CU is stored in the CM
  • A CU is composed between its own CTS and the next CTS (a CU must stay in the CM at least until the CTS of the next CU)
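The two assumptions of the time model are enough to estimate how many CUs the composition memory must hold at once. The sketch below is a simplified illustration, not a procedure from the standard: the function name, the `last_hold` parameter (how long the final CU stays after its CTS), and the sample time stamps are assumptions.

```python
def peak_cm_occupancy(units, last_hold):
    """Return the largest number of CUs held in the CM at the same time.

    units     -- list of (dts, cts) pairs, one per CU, in any order
    last_hold -- time the last composed CU remains in the CM (assumption)
    """
    by_cts = sorted(units, key=lambda u: u[1])  # composition order
    events = []
    for i, (dts, cts) in enumerate(by_cts):
        # A CU enters the CM at its DTS and leaves at the CTS of the next CU.
        if i + 1 < len(by_cts):
            leave = by_cts[i + 1][1]
        else:
            leave = cts + last_hold
        events.append((dts, 1))
        events.append((leave, -1))
    events.sort()  # at equal times, departures (-1) sort before arrivals (+1)
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


if __name__ == "__main__":
    # Decoding order I, P, B with composition order I, B, P (hypothetical values).
    reordered = [(0, 1), (1, 3), (2, 2)]
    print(peak_cm_occupancy(reordered, last_hold=1))
```

With reordering (P decoded before B but composed after it), two CUs overlap in the CM; with DTS equal to CTS for every unit, only one CU is resident at a time.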
Time Base
• STB (system time base) in the decoder system
• OTB (object time base) for media source systems
  • Video : 60 times per second
  • Audio : 44100 times per second
• Mapping OTB to STB
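Mapping an OTB time stamp onto the STB can be sketched as a unit conversion followed by an offset. This is a minimal illustration under stated assumptions: the function name, the origin parameters, and the optional `rate` drift correction are hypothetical, and a real terminal estimates the relationship from received clock references.

```python
def otb_to_stb(ts_ticks, resolution_hz, otb_origin_s=0.0, stb_origin_s=0.0, rate=1.0):
    """Convert a time stamp counted in OTB ticks to seconds on the STB.

    ts_ticks      -- time stamp in ticks of the object time base
    resolution_hz -- ticks per second of that time base (e.g. 60 or 44100)
    otb_origin_s  -- OTB time, in seconds, that corresponds to stb_origin_s
    stb_origin_s  -- STB time, in seconds, at that same instant
    rate          -- STB seconds per OTB second (1.0 means no clock drift)
    """
    otb_seconds = ts_ticks / resolution_hz
    return stb_origin_s + (otb_seconds - otb_origin_s) * rate


if __name__ == "__main__":
    # A video stamp of 120 ticks at 60 Hz and an audio stamp of 88200 ticks
    # at 44100 Hz describe the same instant, two seconds into the OTB.
    print(otb_to_stb(120, 60, stb_origin_s=10.0))
    print(otb_to_stb(88200, 44100, stb_origin_s=10.0))
```

Because both streams are converted to seconds before the offset is applied, video and audio with different tick rates land on the same STB instant.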
MP4 File • Self-contained, cf. *.asf of MS Media Player • Includes IOD, OD, BIFS, and ES
OD Framework
• Basic syntax
    abstract aligned(8) expandable(2^28-1) class BaseDescriptor : bit(8) tag=0 {
        // empty. To be filled by classes extending this class.
    }
    abstract aligned(8) expandable(2^28-1) class BaseCommand : bit(8) tag=0 {
        // empty. To be filled by classes extending this class.
    }
• IPMP : IPMP OD, IPMP ES
• Command : OD stream, OD as an ES (convey, update, and remove ODs)
• Descriptor : OD components (Object, IOD, ES, Decoder, QoS)
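The `expandable` qualifier means that each descriptor or command carries its own size after the tag, in a variable-length field limited to 2^28-1. The sketch below assumes the usual encoding for expandable classes in ISO/IEC 14496-1: up to four bytes, each holding a 1-bit continuation flag followed by 7 size bits, most significant group first. The function names are hypothetical.

```python
MAX_SIZE = (1 << 28) - 1  # largest size that fits in four 7-bit groups


def encode_size(size):
    """Encode a descriptor size as 1 to 4 bytes (continuation bit + 7 bits each)."""
    if not 0 <= size <= MAX_SIZE:
        raise ValueError("size must be in 0 .. 2^28-1")
    groups = [size & 0x7F]
    size >>= 7
    while size:
        groups.append(size & 0x7F)
        size >>= 7
    groups.reverse()  # most significant group first
    last = len(groups) - 1
    # Every byte except the last has its top bit set to signal "more follows".
    return bytes(g | 0x80 if i < last else g for i, g in enumerate(groups))


def decode_size(data, pos=0):
    """Decode a size field starting at pos; return (size, position after it)."""
    size = 0
    for n in range(4):
        byte = data[pos + n]
        size = (size << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return size, pos + n + 1
    raise ValueError("size field longer than four bytes")


if __name__ == "__main__":
    encoded = encode_size(300)
    print(encoded.hex(), decode_size(encoded))
```

A parser reads the 8-bit tag, decodes the size this way, and can then skip any descriptor whose tag it does not recognize.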
OD Stream
• Carries commands (convey, update, and remove)
• Examples
    class ObjectDescriptorUpdate extends BaseCommand : bit(8) tag=ObjectDescrUpdateTag {
        ObjectDescriptorBase OD[1 .. 255];
    }
    class ObjectDescriptorRemove extends BaseCommand : bit(8) tag=ObjectDescrRemoveTag {
        bit(10) objectDescriptorId[(sizeOfInstance*8)/10];
    }
    class ES_DescriptorRemove extends BaseCommand : bit(8) tag=ES_DescrRemoveTag {
        bit(10) objectDescriptorId;
        aligned(8) bit(16) ES_ID[1 .. 255];
    }
    class IPMP_DescriptorRemove extends BaseCommand : bit(8) tag=IPMP_DescrRemoveTag {
        bit(8) IPMP_DescriptorID[1 .. 255];
    }
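The `ObjectDescriptorRemove` payload is a run of 10-bit identifiers packed back to back, and the number of identifiers follows from the payload size as (sizeOfInstance*8)/10. A minimal sketch of packing and unpacking such a payload is shown below; the function names are hypothetical and the padding bits are assumed to be zero.

```python
def pack_od_ids(ids):
    """Pack 10-bit object descriptor IDs into a byte-aligned payload."""
    bits = 0
    for od_id in ids:
        if not 0 <= od_id < 1024:
            raise ValueError("objectDescriptorId must fit in 10 bits")
        bits = (bits << 10) | od_id
    nbits = 10 * len(ids)
    pad = (-nbits) % 8  # zero padding up to the next byte boundary (assumption)
    return (bits << pad).to_bytes((nbits + pad) // 8, "big")


def unpack_od_ids(payload):
    """Recover the IDs; the count is (sizeOfInstance*8)/10, rounded down."""
    total_bits = len(payload) * 8
    count = total_bits // 10
    value = int.from_bytes(payload, "big")
    ids = []
    for i in range(count):
        shift = total_bits - 10 * (i + 1)
        ids.append((value >> shift) & 0x3FF)
    return ids


if __name__ == "__main__":
    payload = pack_od_ids([1, 2, 1023])
    print(len(payload), unpack_od_ids(payload))
```

Since the padding is always shorter than 10 bits, the integer division recovers exactly the number of identifiers that were packed.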
Object descriptors linking scene description to elementary streams
OD Component 1 : IOD
    // An OD that carries the ES_Descriptors for BIFS and for the per-media ODs
    // Needed for call setup
    class InitialObjectDescriptor extends BaseDescriptor : bit(8) tag=InitialObjectDescrTag {
        bit(10) ObjectDescriptorID;
        bit(1) URL_Flag;
        bit(1) includeInlineProfileLevelFlag;
        const bit(4) reserved=0b1111;
        if (URL_Flag) {
            bit(8) URLlength;
            bit(8) URLstring[URLlength];
        } else {
            bit(8) ODProfileLevelIndication;
            bit(8) sceneProfileLevelIndication;
            bit(8) audioProfileLevelIndication;
            bit(8) visualProfileLevelIndication;   // e.g. Simple, Simple Scalable, Core, Main, etc.
            bit(8) graphicsProfileLevelIndication;
            ES_Descriptor ESD[1 .. 255];           // at least one is required
            OCI_Descriptor ociDescr[0 .. 255];     // optional
            IPMP_DescriptorPointer ipmpDescrPtr[0 .. 255];
        }
        ExtensionDescriptor extDescr[0 .. 255];
    }
OD Component 2 : OD
    class ObjectDescriptor extends BaseDescriptor : bit(8) tag=ObjectDescrTag {
        bit(10) ObjectDescriptorID;
        bit(1) URL_Flag;
        const bit(5) reserved=0b1111.1;
        if (URL_Flag) {
            bit(8) URLlength;
            bit(8) URLstring[URLlength];           // points to another OD
        } else {
            ES_Descriptor esDescr[1 .. 255];       // an array of ES_Descriptors; at least one is required
            OCI_Descriptor ociDescr[0 .. 255];
            IPMP_DescriptorPointer ipmpDescrPtr[0 .. 255];
        }
        ExtensionDescriptor extDescr[0 .. 255];
    }
OD Component 3 : ES_Descriptor
    class ES_Descriptor extends BaseDescriptor : bit(8) tag=ES_DescrTag {
        bit(16) ES_ID;
        bit(1) streamDependenceFlag;
        bit(1) URL_Flag;
        const bit(1) reserved=1;
        bit(5) streamPriority;
        if (streamDependenceFlag)
            bit(16) dependsOn_ES_ID;
        if (URL_Flag) {
            bit(8) URLlength;
            bit(8) URLstring[URLlength];
        }
        DecoderConfigDescriptor decConfigDescr;
        SLConfigDescriptor slConfigDescr;
        QoS_Descriptor qosDescr[0 .. 1];           // at most one, if present
        IPMPDescriptor ipmpDescrPtr[0 .. 1];
        // ... (omitted) ...
        ExtensionDescriptor extDescr[0 .. 255];
    }
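The fixed part of an `ES_Descriptor` body (after the tag and size) can be read with a few bit masks. The sketch below covers only the fields up to the optional URL and stops before the nested descriptors; the function name and the returned dictionary layout are assumptions made for illustration.

```python
import struct


def parse_es_descriptor_head(body):
    """Parse ES_ID, the flag byte, and the optional fields that follow it.

    Returns (fields, position of the first nested descriptor).
    """
    es_id, flags = struct.unpack_from(">HB", body, 0)
    pos = 3
    fields = {
        "ES_ID": es_id,
        "streamDependenceFlag": bool(flags & 0x80),  # top bit
        "URL_Flag": bool(flags & 0x40),              # second bit
        "streamPriority": flags & 0x1F,              # low 5 bits
    }
    if fields["streamDependenceFlag"]:
        (fields["dependsOn_ES_ID"],) = struct.unpack_from(">H", body, pos)
        pos += 2
    if fields["URL_Flag"]:
        length = body[pos]
        fields["URLstring"] = body[pos + 1:pos + 1 + length]
        pos += 1 + length
    return fields, pos


if __name__ == "__main__":
    # Hypothetical stream: ES_ID 3 depends on ES_ID 1, priority 5.
    sample = struct.pack(">HBH", 3, 0x80 | 0x20 | 5, 1)
    print(parse_es_descriptor_head(sample))
```

The returned position is where a full parser would continue with `DecoderConfigDescriptor` and `SLConfigDescriptor`.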
OD Component 4 : DecoderConfigDescriptor
    class DecoderConfigDescriptor extends BaseDescriptor : bit(8) tag=DecoderConfigDescrTag {
        bit(8) objectTypeIndication;               // MPEG-1,-2 video, audio, etc.
        bit(6) streamType;
        bit(1) upStream;
        const bit(1) reserved=1;
        bit(24) bufferSizeDB;
        bit(32) maxBitrate;
        bit(32) avgBitrate;
        DecoderSpecificInfo decSpecificInfo[0 .. 1];
    }
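The fixed fields of `DecoderConfigDescriptor` add up to 8 + 6 + 1 + 1 + 24 + 32 + 32 = 104 bits, or 13 bytes. A minimal parsing sketch follows; the function name is hypothetical, the sample values are made up, and the optional `DecoderSpecificInfo` that may follow is not parsed.

```python
import struct


def parse_decoder_config(body):
    """Parse the 13 fixed bytes of a DecoderConfigDescriptor body."""
    oti, packed, buf, max_br, avg_br = struct.unpack_from(">BB3sII", body, 0)
    return {
        "objectTypeIndication": oti,
        "streamType": packed >> 2,           # upper 6 bits
        "upStream": bool(packed & 0x02),     # next bit; lowest bit is reserved
        "bufferSizeDB": int.from_bytes(buf, "big"),  # 24-bit field
        "maxBitrate": max_br,
        "avgBitrate": avg_br,
    }


if __name__ == "__main__":
    # Hypothetical values: type 0x20, stream type 4, downstream,
    # 4096-byte decoding buffer, 384 kbit/s peak, 128 kbit/s average.
    sample = bytes([0x20, (4 << 2) | 0x01]) + (4096).to_bytes(3, "big")
    sample += struct.pack(">II", 384000, 128000)
    print(parse_decoder_config(sample))
```

`bufferSizeDB` is the field a terminal would use to size the decoding buffer described in the buffer model.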
Other OD Components • QoS_Descriptor : delay, loss, AU_Size, etc. • DecoderSpecificInfo • SLConfigDescriptor • ContentIdentificationDescriptor
BIFS • Binary information needed to combine, reconstruct, and present audio-visual data at the client side (not at the server side) • spatio-temporal location/scale/orientation of audio-visual objects • largely based on VRML (ISO/IEC 14772-1) • BIFS_ES, BIFS AU (BIFS-Command, BIFS-Anim), BIFS SL, BIFS time base, BIFS decoder • For interactivity, SENSOR node
Logical structure of the scene • a graph with links and nodes (refer to graph theory.)
Time
• Parameters : loop, duration, startTime, stopTime
1. Play once
2. Stop during play (loop=FALSE, startTime<stopTime<startTime+duration)
3. Repeat indefinitely (loop=TRUE, stopTime<=startTime)
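The three playback cases can be expressed as one predicate over the four parameters. The sketch below uses simplified VRML-style semantics and is an assumption-laden illustration: the function name is hypothetical, and events, time fractions, and parameter changes while the node is active are ignored.

```python
def is_active(now, start_time, stop_time, duration, loop):
    """Return True if a time-dependent node is playing at time `now`."""
    if now < start_time:
        return False  # not started yet
    if stop_time > start_time and now >= stop_time:
        return False  # stopped explicitly; stopTime <= startTime is ignored
    if not loop and now >= start_time + duration:
        return False  # played once to the end
    return True


if __name__ == "__main__":
    # Case 1: play once (stopTime <= startTime, loop=FALSE)
    print(is_active(5, 0, 0, 10, False), is_active(10, 0, 0, 10, False))
    # Case 2: stop during play (startTime < stopTime < startTime + duration)
    print(is_active(3, 0, 4, 10, False), is_active(5, 0, 4, 10, False))
    # Case 3: repeat indefinitely (loop=TRUE, stopTime <= startTime)
    print(is_active(1000, 0, 0, 10, True))
```

Only the second and third checks differ between the cases, which is why `loop` and the ordering of `startTime` and `stopTime` are enough to select the behavior.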
BIFS-Command
• Modify properties of the scene graph, its nodes, and behaviors
• Applied to conditional nodes
1. ReplaceEntireScene(new_scene_graph) // random access point
2. Insertion(nodeID,event,ROUTE)
3. Deletion(nodeID,event,ROUTE)
4. Replace(nodeID,event,ROUTE)
BIFS-Anim • Updates certain fields of nodes in the scene graph • Meshes, 2D/3D positions, rotations, scale factors, and color attributes • Separate ESs for BIFS-Command (CommandFrames) and BIFS-Anim (AnimationFrames)
CompositeTexture2D example (projected on 3D cube)
    CompositeTexture2D {
        eventIn      MFNode  addChildren
        eventIn      MFNode  removeChildren
        exposedField MFNode  children
        exposedField SFInt32 pixelWidth
        exposedField SFInt32 pixelHeight
        exposedField SFNode  background
        exposedField SFInt32 viewport
    }
Sync layer (SL) • defines a syntax for the packetization of each ES into AUs or parts of AU • SPS (SL packet stream) : the sequence of SL packets from one ES
SLConfigDescriptor in ES_Descriptor
    class SLConfigDescriptor extends BaseDescriptor : bit(8) tag=SLConfigDescrTag {
        bit(8) predefined;
        if (predefined==0) {
            bit(1) useAccessUnitStartFlag;
            bit(1) useAccessUnitEndFlag;
            bit(1) useRandomAccessPointFlag;
            bit(1) hasRandomAccessUnitsOnlyFlag;
            bit(1) usePaddingFlag;
            bit(1) useTimeStampsFlag;
            bit(1) useIdleFlag;
            bit(1) durationFlag;
            bit(32) timeStampResolution;
            bit(32) OCRResolution;
            bit(8) timeStampLength;                // must be <= 64
            bit(8) OCRLength;                      // must be <= 64
            bit(8) AU_Length;                      // must be <= 32
            bit(8) instantBitrateLength;
            bit(4) degradationPriorityLength;
            bit(5) AU_seqNumLength;                // must be <= 16
            bit(5) packetSeqNumLength;             // must be <= 16
            bit(2) reserved=0b11;
        }
        if (durationFlag) {
            bit(32) timeScale;
            bit(16) accessUnitDuration;
            bit(16) compositionUnitDuration;
        }
        if (!useTimeStampsFlag) {
            bit(timeStampLength) startDecodingTimeStamp;
            bit(timeStampLength) startCompositionTimeStamp;
        }
    }
SL Packet Header • packetSequenceNumber • degradationPriority • objectClockReference • decodingTimeStamp • compositionTimeStamp • accessUnitLength • instantBitrate
MPEG-Java • Flexible programmatic control system (not parametric) • Capability for graceful degradation under limited or time varying resources • Capability to respond to user interaction and provide enhanced multimedia functionality
MPEG-J System • Combine MPEG-media and safe executable code (Java code) • Components of MPEG-4 player • Execution and presentation resources • Decoders • Network resources • Scene graph • Downloadable decoder????
FlexMux (optional)
• Multiplexing or separate channels?
  • Multiplexing : circuit switching
  • Separate channels : packet switching
• Multiplexing : low overhead
  • RTP/UDP/IP header size (40 bytes or more) compared with an audio packet payload (20 bytes)
• Simpler than MPEG-2 TS
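The overhead argument can be checked with simple arithmetic. The sketch below assumes a 40-byte RTP/UDP/IPv4 header (12 + 8 + 20 bytes, no options or extensions) and, for the multiplexed case, a hypothetical 2-byte header per multiplexed payload; the function name and the choice of four payloads are illustrative.

```python
RTP_UDP_IP_HEADER = 12 + 8 + 20  # bytes: RTP + UDP + IPv4 without options


def overhead_ratio(payloads, per_payload_header=0):
    """Fraction of the transmitted bytes that is header rather than media data.

    payloads           -- payload sizes (bytes) carried in ONE RTP/UDP/IP packet
    per_payload_header -- extra multiplexing header bytes per payload (assumption)
    """
    media = sum(payloads)
    total = RTP_UDP_IP_HEADER + sum(p + per_payload_header for p in payloads)
    return (total - media) / total


if __name__ == "__main__":
    # One 20-byte audio payload per packet: 40 of the 60 bytes are header.
    print(overhead_ratio([20]))
    # Four 20-byte payloads multiplexed into one packet, 2 header bytes each:
    # 48 of the 128 bytes are header.
    print(overhead_ratio([20, 20, 20, 20], per_payload_header=2))
```

Under these assumptions, sending each small payload separately spends two thirds of the bandwidth on headers, while multiplexing four payloads brings the share down to 37.5%.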
[Figure: FlexMux packet structure in Simple Mode and in MuxCode Mode]
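A FlexMux stream in Simple Mode can be sketched as a sequence of small packets, each carrying one SL packet. The layout used below is an assumption to be checked against ISO/IEC 14496-1: an 8-bit index, an 8-bit payload length, then the payload, with index values below 238 selecting Simple Mode (index equals the FlexMux channel) and higher values selecting MuxCode Mode. The function names are hypothetical, and MuxCode payloads are returned unparsed.

```python
SIMPLE_MODE_LIMIT = 238  # assumed boundary between Simple and MuxCode mode


def build_simple_packet(channel, payload):
    """Build one Simple Mode packet: index, length, payload."""
    if not 0 <= channel < SIMPLE_MODE_LIMIT:
        raise ValueError("Simple Mode channel must be below 238")
    if len(payload) > 255:
        raise ValueError("payload does not fit the 8-bit length field")
    return bytes([channel, len(payload)]) + payload


def parse_packets(stream):
    """Split a FlexMux byte stream into (mode, index, payload) tuples."""
    packets = []
    pos = 0
    while pos + 2 <= len(stream):
        index, length = stream[pos], stream[pos + 1]
        payload = stream[pos + 2:pos + 2 + length]
        if len(payload) < length:
            raise ValueError("truncated FlexMux packet")
        mode = "simple" if index < SIMPLE_MODE_LIMIT else "muxcode"
        packets.append((mode, index, payload))
        pos += 2 + length
    return packets


if __name__ == "__main__":
    stream = build_simple_packet(1, b"video") + build_simple_packet(2, b"au")
    print(parse_packets(stream))
```

The two-byte header per payload is what makes this lighter than an MPEG-2 TS packet header for small audio frames.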
MP4 File format • (normally) self-contained file cf. *.asf • Protocol-unaware, media-unaware
MP4 File Usage • Interchange • Content creation : authoring • Preparation for streaming : interleaving • Local presentation : CD, DVD-ROM • Streamed presentation (not yet, in IM1)
MP4 Terminology
• atom : an ‘object’ in the object-oriented sense, e.g. ‘iods’ OD atom, ‘moov’ movie atom, ‘mdat’ media data atom, etc.
• trak : ES + [hint trak], e.g. video trak, audio trak
• hint trak : packetization information
• Container : file ‘moov’ ‘mvhd’ ‘mdhd’
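The atom structure can be explored with a short recursive walker. The sketch assumes the basic atom header of a 32-bit big-endian size (which includes the header itself) followed by a four-character type; the set of container types descended into is an illustrative assumption, and the special size values 0 and 1 (atom extends to end of file, 64-bit size) are not handled.

```python
import struct

# Atom types treated as containers in this sketch (assumption).
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl"}


def walk_atoms(data, start=0, end=None, depth=0):
    """Return a list of (depth, type, size) for the atoms in data[start:end]."""
    end = len(data) if end is None else end
    found = []
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack_from(">I4s", data, pos)
        if size < 8 or pos + size > end:
            break  # malformed or unsupported size; stop instead of guessing
        found.append((depth, kind.decode("ascii", "replace"), size))
        if kind in CONTAINERS:
            found.extend(walk_atoms(data, pos + 8, pos + size, depth + 1))
        pos += size
    return found


def make_atom(kind, payload=b""):
    """Helper for building test data: size + type + payload."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload


if __name__ == "__main__":
    movie = make_atom(b"moov", make_atom(b"mvhd", b"\x00" * 4))
    data = movie + make_atom(b"mdat", b"abc")
    for depth, kind, size in walk_atoms(data):
        print("  " * depth + kind, size)
```

Because every atom states its own size, a reader can skip atom types it does not understand, which is what lets the format stay media-unaware.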
Hint track
• Bridge between MPEG-4 and a protocol
• Each TransMux has its own hint track format (ES over TransMuxes)
    aligned(8) class HintMediaHeaderAtom extends FullAtom(‘hmhd’, version = 0, 0) {
        unsigned int(16) maxPDUsize;
        unsigned int(16) avgPDUsize;
        unsigned int(32) maxbitrate;
        unsigned int(32) avgbitrate;
        unsigned int(32) slidingavgbitrate;
    }
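The `HintMediaHeaderAtom` layout maps directly onto a fixed-size binary record. The sketch below follows the field list on this slide and assumes the usual full-atom header (32-bit size, four-character type, 8-bit version, 24-bit flags); the function names and sample values are hypothetical.

```python
import struct

# size, type, version, flags(24 bits), then the five fields from the slide
HMHD = struct.Struct(">I4sB3sHHIII")
FIELDS = ("maxPDUsize", "avgPDUsize", "maxbitrate", "avgbitrate", "slidingavgbitrate")


def build_hmhd(max_pdu, avg_pdu, max_br, avg_br, sliding_br):
    """Serialize an 'hmhd' atom with version 0 and flags 0."""
    return HMHD.pack(HMHD.size, b"hmhd", 0, b"\x00\x00\x00",
                     max_pdu, avg_pdu, max_br, avg_br, sliding_br)


def parse_hmhd(data):
    """Parse an 'hmhd' atom and return its fields as a dictionary."""
    size, kind, version, _flags, *values = HMHD.unpack_from(data, 0)
    if kind != b"hmhd" or size != HMHD.size or version != 0:
        raise ValueError("not a version-0 hmhd atom")
    return dict(zip(FIELDS, values))


if __name__ == "__main__":
    atom = build_hmhd(1500, 1000, 384000, 128000, 130000)
    print(len(atom), parse_hmhd(atom))
```

A streaming server could read these values to choose a packet size and reserve bandwidth before sending the hinted track.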
DMIF Usage
[Figure: DMIF usage between a client and a server]
DMIF Terminology
• Service : DMIF provides a service to an application (or user)
• Service session : a local association between a DMIF instance and a service
• Network session : an association between two DMIF peers
• Channel : the path over which a DMIF user sends or receives data
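DMIF Terminology
[Figure: on each side a DMIF user sits above a DMIF instance and is attached to a service through a service session; "uu" marks data exchanged between the DMIF users and "dd" marks data exchanged between the DMIF instances; the two DMIF instances are joined by a network session; channels are carried over the TransMux and the network]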
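Network service primitives
[Figure: two users communicating through the network; Request and Confirm are on the initiating user's side, Indication and Response on the peer user's side]
1. Request
2. Indication
3. Response
4. Confirm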
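The order of the four primitives can be illustrated with a toy exchange between two users. This is a generic request/confirm sketch, not DMIF code: the `Network` and `User` classes, the `handle` method, and the message strings are all hypothetical.

```python
class User:
    """A service user that answers whatever it is asked (illustrative only)."""

    def __init__(self, name):
        self.name = name

    def handle(self, request):
        return f"ack:{request}"


class Network:
    """Relays the four service primitives between two users and logs them."""

    def __init__(self):
        self.log = []

    def call(self, caller, callee, request):
        self.log.append((caller.name, "Request"))      # 1. caller -> network
        self.log.append((callee.name, "Indication"))   # 2. network -> callee
        response = callee.handle(request)
        self.log.append((callee.name, "Response"))     # 3. callee -> network
        self.log.append((caller.name, "Confirm"))      # 4. network -> caller
        return response


if __name__ == "__main__":
    net = Network()
    result = net.call(User("client"), User("server"), "attach")
    for who, primitive in net.log:
        print(who, primitive)
    print(result)
```

The same four-step pattern underlies the attach, channel, and session primitives listed for the DMIF interfaces.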
DMIF-Application Interface
• Service primitives, e.g.
    DA_ServiceAttach(IN: URL, uuDataInBuffer, uuDataInLen; OUT: response, serviceSessionId, uuDataOutBuffer, uuDataOutLen)
• Channel primitives, e.g.
    DA_ChannelDelete(IN: loop(channelHandle, reason); OUT: loop(response))
• Data primitives, e.g.
    DA_Data(IN: channelHandle, streamDataBuffer, streamDataLen)
DMIF Network Interface • Session primitives : setup and release • DN_SessionSetup(), DN_SessionRelease() • Service primitives : attach and detach • DN_ServiceAttach(), DN_ServiceDetach() • Transmux primitives : setup, release, and config • DN_TransMuxSetup(), DN_TransMuxRelease(), DN_TransMuxConfig() • Channel primitives : add and delete