The Long March (長征) to 3D Video • Leonardo Chiariglione • Speech at 3D Systems and Applications • Seoul – 2014/05/28
It has already been not a short march Analogue • Printing • Photography • Telegraphy • Telephony • Audio recording • Radio • Television • Video recording Digital • Video conference • Video telephony • Video interactive • Television • (3D TV)
The dimensions of future media • Time/space resolution • Screen content • Colour • Brightness • Scalability • 3D Video • 3D Audio • Metadata • File format • Sensors/actuators • Human interaction • Fusion of real & virtual • Detection/analysis • Linking • Energy saving • User profile
There has been progress in resolution • QSIF • SIF • Standard Definition (interlace) • High Definition (Interlaced/progressive) • 4k (Progressive) • 8k (Progressive)
The cost of being digital Video Audio
Are there limits to compression? • Input bandwidth to humans • Eyes: 2 channels of 430–790 THz • Ears: 2 channels of 20 Hz – 20 kHz • A nerve fiber connecting senses to the brain can transmit a new impulse every ~6 ms, i.e. ~167 spikes/s (1 bit ~16 spikes, so ~10 bit/s per fiber) • Eye • 1.2 M fibers transmit 10 bit/s each • An eye sends ~12 Mbit/s to the brain • Ear • 30 k fibers in the cochlear nerve • An ear sends ~300 kbit/s to the brain
Sensors-to-brain bitrates • Eye: 430–790 THz, 1.2 M nerve fibers, ~12 Mbit/s • Ear: 0.020–20 kHz, 30 k nerve fibers, ~0.3 Mbit/s
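The sensor-to-brain figures quoted in the slides can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope reproduction: the function names are hypothetical, and the inputs (about 6 ms per impulse, about 16 spikes per bit, the fiber counts, and the rounded 10 bit/s per fiber) are the slide's own numbers, not independent measurements.

```python
# Back-of-the-envelope check of the slide's sensor-to-brain bitrates.
# All constants are taken from the slide; function names are illustrative.

SPIKE_INTERVAL_S = 0.006   # a new impulse every ~6 ms
SPIKES_PER_BIT = 16        # 1 bit ~ 16 spikes


def fiber_bitrate(spike_interval_s: float = SPIKE_INTERVAL_S,
                  spikes_per_bit: int = SPIKES_PER_BIT) -> float:
    """Approximate bit/s carried by one nerve fiber."""
    spikes_per_second = 1.0 / spike_interval_s   # ~167 spikes/s
    return spikes_per_second / spikes_per_bit


def nerve_bitrate(n_fibers: int, bits_per_fiber: float) -> float:
    """Aggregate bit/s for a bundle of fibers."""
    return n_fibers * bits_per_fiber


per_fiber = fiber_bitrate()                  # roughly 10 bit/s
eye_bps = nerve_bitrate(1_200_000, 10)       # slide rounds to 10 bit/s per fiber
ear_bps = nerve_bitrate(30_000, 10)

print(f"per fiber: {per_fiber:.1f} bit/s")
print(f"eye: {eye_bps / 1e6:.0f} Mbit/s, ear: {ear_bps / 1e3:.0f} kbit/s")
```

The per-fiber result is slightly above 10 bit/s; the slide's 12 Mbit/s and 300 kbit/s follow from rounding it down to 10.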
High Dynamic Range and Wider Color Gamut • Higher Dynamic Range and Wider Color Gamut can give users a better sense of “being there”, with a viewing experience closer to real life experience • Light bulb > 10,000 nits • Surface lit in the sunlight > 100,000 nits • Night sky < 0.005 nits • Question: if dynamic ranges and volumes of the color gamut increase significantly, are existing MPEG video coding standards able to efficiently support future needs?
Wider Color Gamut • ITU-R BT.709 • ITU-R BT.2020
Dynamic Range: Examples • Bright areas can have > 10,000 cd/m² luminance • Dark areas can have < 0.01 cd/m² luminance
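To give a feel for the span between the bright and dark examples, the ratio can be expressed as a contrast ratio and in photographic stops (doublings of luminance). This is a minimal sketch: the helper name is hypothetical, the two luminance values are the slide's bounds (10,000 and 0.01 cd/m²), and expressing the range in stops is a common convention rather than something the slide states.

```python
import math


def dynamic_range(bright_cd_m2: float, dark_cd_m2: float) -> tuple[float, float]:
    """Return (contrast ratio, number of stops) between two luminances."""
    ratio = bright_cd_m2 / dark_cd_m2
    stops = math.log2(ratio)   # one stop = a factor of 2 in luminance
    return ratio, stops


# Slide's example bounds: bright areas > 10,000 cd/m2, dark areas < 0.01 cd/m2.
ratio, stops = dynamic_range(10_000, 0.01)
print(f"contrast ratio ~ {ratio:.0f}:1, about {stops:.1f} stops")
```

Because the slide gives these values as bounds (greater than, less than), the computed range is a lower bound on what such a scene contains.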
Screen Content applications • Wireless display • Companion screen • Control rooms with high resolution display wall • Digital operating room (DiOR) • Virtual desktop infrastructure (VDI) • Screen/desktop sharing and collaboration • Cloud computing and gaming • Factory automation display • Supervisory control and data acquisition (SCADA) display • Automotive/navigation display • PC over IP (PCoIP) • Ultra-thin client • Remote sensing
Where we are • January 2014: Joint Call for Proposals for Coding of Screen Content • April 2014: Proposals evaluation • Conclusion: evidence that significantly improved coding efficiency can be obtained by exploiting screen content characteristics with novel dedicated coding tools • April 2014: Standardization plan and tentative time line • First Test Model: Apr. 2014 • PDAM: Oct. 2014 • DAM: Feb. 2015 • FDAM: Oct. 2015
MPEG standards for coding multiple cameras • A long history, starting from MPEG-2 (mid 1990s) • MPEG standards (existing and under development) • Multiview coding – can only display views captured at the source • Depth-based coding – can also display limited number of additional views • Camera arrangement: cameras are assumed to be linearly arranged
Free viewpoint television (FTV)/1 • Free viewpoint television (FTV): a hypothetical 3D transmission system that enables a viewer to select arbitrary viewpoints, inside and outside a scene • FTV requires many technologies, not just from MPEG • A 3D video format supporting the generation of views not already included in the bitstream generated by the encoder would be a major enabler for FTV • Purpose of MPEG FTV exploration: to develop the know-how to enable MPEG to develop the said 3D video format
Free viewpoint television (FTV)/2 • Areas considered in the MPEG FTV exploration • Compare and evaluate the depth quality attainable for general camera arrangements • Evaluate view synthesis algorithms and improve their performance • Investigate the coding efficiency of the most promising coding technologies currently available • Investigate the influence of mis-registration on view synthesis performance • Investigate the representation capability of BIFS to clarify the elements that need to be standardized
FTV Seminar: A Viewing Revolution in the Making • Date: 2014 July 8, 14:00-18:00 • Venue: Main Hall B, Sapporo Convention Center, Sapporo, Japan • Exhibition of FTV demos: Room 101, 10:00-17:00, July 1 to 4
Parallel worlds • For centuries humans have been building two different types of worlds • Physical • Informational: Knowledge, Books, Films, Music
Immersion • A definition of immersion: a state in which connections of a human with • Physical world are severed • Informational world are activated
How far is immersion progressing? Fairly… …or too far?
Can we reconnect the two worlds? • Smartphones • Enable universal access to the Informational world while sensing also the Physical world • Enhance history and meaning of the real world with powerful digital elements • Let’s create two-way bridges • Extend reality to virtual • Add reality to virtual • Physical & Informational
Functions of an Augmented Reality browser • Retrieve scenario from the internet • Start video acquisition and track objects • Recognise objects and recover camera pose • Get streamed 3D graphics and compose new scenes • Get input from various sensors • Access interaction possibilities and objects from a remote server • Adapt to offer optimal AR experience
The AR technology chain (diagram labels) • Authoring Tools • MPEG ARAF (Augmented Reality Application Format) • ARAF Browser • User • Media Servers • Service Servers • Local Sensors & Actuators • Local Real World Environment • Remote Sensors & Actuators • Remote Real World Environment
Augmented Reality Application Format • A set of MPEG-4 scene graph nodes • Audio, image, video, graphics, programming, communication, user interactivity, animation • Map, MapMarker, Overlay, ReferenceSignal, ReferenceSignalLocation, CameraCalibration, AugmentedRegion • Connection to sensors defined in MPEG-V • Orientation, Position, Angular Velocity, Acceleration, GPS, Geomagnetic, Altitude, Local camera(s) • Compressed media • Image, (3D) sound, (3D) video, 2D/3D graphics ARAF
The whole used to be the message Classic Books: the value is in the content as a whole
Today the link adds value to the message On-line knowledge: the value is in the link
The video used to be the message Classic video content: the value is in the content as a whole
Next the link will add value to the video message New video content: the value is in the link From EU FP7 BRIDGET project
An unequal fight • Many new services, all more demanding in bandwidth • Compression improves, but cannot cope with all the demands just by itself • UHD is 4 times the uncompressed bitrate of HD, but HEVC “only” compresses twice as much as AVC • And we have HDR, WCG, SCC, FTV… • At prime time 30% of USA internet is taken by Netflix traffic • We need more tools to solve the problem
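The UHD versus HEVC comparison on this slide reduces to one division. The sketch below makes the arithmetic explicit under stated assumptions: "HD" is taken as 1920x1080 and "UHD" as 3840x2160 at the same frame rate and bit depth, and HEVC is taken to halve the bitrate of AVC at comparable quality, as the slide says. The names are illustrative only.

```python
# Relative bitrate of UHD/HEVC against HD/AVC, under the slide's figures.
HD = (1920, 1080)    # assumed HD raster
UHD = (3840, 2160)   # assumed "4k" UHD raster


def raw_bitrate_ratio(a: tuple[int, int], b: tuple[int, int]) -> float:
    """Uncompressed bitrate of format a relative to b (same fps, bit depth)."""
    return (a[0] * a[1]) / (b[0] * b[1])


HEVC_GAIN_OVER_AVC = 2.0   # slide: HEVC compresses twice as much as AVC

raw_factor = raw_bitrate_ratio(UHD, HD)
compressed_factor = raw_factor / HEVC_GAIN_OVER_AVC

print(f"raw UHD/HD: {raw_factor:.0f}x, "
      f"UHD-HEVC vs HD-AVC: {compressed_factor:.0f}x")
```

Under these assumptions a UHD stream coded with HEVC still needs roughly twice the bitrate of an HD stream coded with AVC, which is the gap the slide points to.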
The mobile industry perspective • 10 x more spectrum × 10 x better spectrum utilisation × 10 x more base stations = 1000 x more capacity
Making the network smarter • Video has the lion’s share of internet traffic, more so as we add more dimensions to the user experience • We need to cope with (human-vehicle) mobility • More and more of human life happens on the move • We need new, smarter approaches instead of just throwing more network capacity at the problem, going beyond • Digital video recording (on premises or networked) • Peer-to-Peer (P2P) Overlays • Content Distribution Networks (CDNs)
Video and Information Centric Networking • Migration path from today’s IP infrastructure to pub/sub support for ICN • Same content available at different network locations • Client, content and network mobility under energy consumption constraints • Diagram labels: Information Centric Network, IP Network • From FP7/NICT EU-JAPAN GreenICN project
Green MPEG (diagram labels) • Media • Media Pre-processor • Media Encoder • Encoded Media • Media Decoder • Presentation Subsystem • Green Metadata Generator • Green Metadata • Green Metadata Extractor • Green Feedback • Power optimization module • Power control