Video Description in More than one Language

Video Description in More than one Language • Overview • Need • Capability • Approaches • Conclusion • Disclaimer: This presentation does not contain any recommendations, assessments or positions from or by NAB

Video Description The term ‘video description’ means the insertion of audio narrated descriptions of a television program’s key visual elements into natural pauses between the program’s dialogue. (S. 3304) AKA: Descriptive Video, Visually Impaired (VI)

Video Description Insertion [Figure: timeline showing the primary audio dialog with three “no dialog” gaps, the descriptive insertions placed in those gaps, and the resulting audio with video description (complete mix). The slide also asks: voice under?]

Pause to reflect • Receiver makers refused to support the original Dolby design to save bits by enabling supplemental audio tracks • So service providers must consume bits to send everything for any audio service • Wonder if that lack of innovation is in Gary’s book… moving on…

Initial Mandates • Initial video description rules will go into effect October 2011. • Four networks (ABC, CBS, Fox, and NBC) and the top 5 national non-broadcast networks will have to provide 50 hrs./quarter with video description [1] • Broadcast stations and MVPDs with the technical capability to do so generally must pass through audio containing video descriptions. [1] For the top 25 DMAs and 50k+ subscriber systems respectively

More to Come from the FCC • Reports • Rule makings • Not later than this or sooner than that • Soonest for all DMAs: 2037 • Not tomorrow … but it is coming • English is assumed, but the Spanish-speaking population is growing … • So what can we do?

Dolby E • 8 ch. PCM stream, compressed, 3 Mb/s total, twisted pair or coax [Figure: studio/master control block diagram with these labeled blocks: digital audio interfaces, video subsystem (video, video source coding), audio subsystem (audio, audio source coding), router/MC switcher, work file, ancillary data, control data.]

Eight is enough • For one additional service (with 5.1) • Need to replicate the path to get more than one language or type of service

Uncompressed Audio Interconnects • AES-3: 2 ch / PCM stream, uncompressed, 1.92 Mb/s, twisted pair or coax (may also carry 8-channel Dolby E) • Then in HANC: 16 channels total [Figure: studio/master control block diagram with these labeled blocks: digital audio interface, video subsystem (video, video source coding & compression), audio subsystem (audio, audio source coding & compression), MC SW, work file, ancillary data, control data.]

Sixteen is enough • For a pair of 5.1 channel services, each with an associated stereo audio descriptive video mix. • For a 5.1 service and 5 stereo services • So audio in several languages with VI and HI could be supported – but only one could be 5.1
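The channel arithmetic behind "Eight is enough" and "Sixteen is enough" can be checked with a short sketch. It assumes a 5.1 service occupies six channels and a stereo service two; the function and constant names are made up for illustration and are not taken from any product or standard.

```python
# Channels occupied by each service layout (assumption: 5.1 = 6, 2.0 = 2).
CHANNELS = {"5.1": 6, "2.0": 2}

DOLBY_E_CAPACITY = 8   # one Dolby E stream, per the slides
HANC_CAPACITY = 16     # embedded audio carried in HANC, per the slides


def channels_needed(services):
    """Return the total channel count for a list of layout names."""
    return sum(CHANNELS[layout] for layout in services)


def fits(services, capacity):
    """Return True if the listed services fit in the given channel capacity."""
    return channels_needed(services) <= capacity


# One language: a 5.1 main mix plus a stereo description mix.
one_language = ["5.1", "2.0"]
# Two languages, each with 5.1 and an associated stereo description mix.
two_languages = ["5.1", "2.0", "5.1", "2.0"]
# One 5.1 service plus five stereo services.
surround_plus_five_stereo = ["5.1"] + ["2.0"] * 5

for name, plan in [
    ("one language", one_language),
    ("two languages", two_languages),
    ("5.1 + 5 stereo", surround_plus_five_stereo),
]:
    print(name, channels_needed(plan),
          "fits Dolby E:", fits(plan, DOLBY_E_CAPACITY),
          "fits HANC:", fits(plan, HANC_CAPACITY))
```

Under these assumptions the two-language plan needs a second Dolby E path but fits in a single 16-channel HANC payload.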

[Figure: DTV contribution and distribution chain with these labels: remote production & post venues, network center, local station, HD or SD, ATSC digital.] For MPEG links, audio channels can be carried as program elements with PMT-based signaling

Transmission • Digital Transmission • A large number of audio services for a single video (MPEG-2 Transport) can be signaled and sent (depending on the number of descriptors associated with each audio) • Analog Transmission • Second audio or video description <choose>

Transport Component Organization [Figure: ATSC transport diagram with these labels: Virtual Channel 1, Virtual Channel 2, MDTV, PSI(P), one video element, and four audio elements.]

Multi-Program Multiplex [Figure: Program 1 (marked OK) PES streams feed a mux: Video+PCR, Audio 1: CM eng, Audio 2: VI eng, Audio 3: CM spa, Audio 4: VI spa. Program 1, Program 2, and Program 3 feed a second mux that produces the multi-program transport stream.] • SI tables (PMT): 4 each of AC-3 and ISO-639 descriptors • PSIP tables: each event has different descriptors
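As a rough illustration of the Program 1 line-up on this slide, the sketch below models the four audio elements and lists one AC-3 descriptor and one ISO-639 descriptor per element. The class, the function, and the PID values are hypothetical; this is not a multiplexer API and it does not build real table sections.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioElement:
    pid: int           # hypothetical PID, chosen only for illustration
    service_type: str  # "CM" (complete main) or "VI" (visually impaired)
    language: str      # ISO 639 three-letter language code


# The four audio elements named on the slide.
PROGRAM_1_AUDIO = [
    AudioElement(0x31, "CM", "eng"),
    AudioElement(0x32, "VI", "eng"),
    AudioElement(0x33, "CM", "spa"),
    AudioElement(0x34, "VI", "spa"),
]


def pmt_descriptors(elements):
    """Return (pid, descriptor name, fields) entries, two per audio element."""
    entries = []
    for element in elements:
        entries.append((element.pid, "AC-3",
                        {"service_type": element.service_type,
                         "language": element.language}))
        entries.append((element.pid, "ISO-639",
                        {"language": element.language}))
    return entries


for pid, name, fields in pmt_descriptors(PROGRAM_1_AUDIO):
    print(hex(pid), name, fields)
```

The result has four descriptors of each kind, matching the "4 each" note on the slide.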

Electra 8000 – Hardware overview [Figure: chassis block diagram. Four encoders, each with SDI / HDSDI input (embedded audio), video encoding, and 3 x 2.0 audio encoding; four optional audio cards, each 5.1 + 2.0 or 3 x 2.0; CPC control card with management port and PSIP input; StatMux engine; midplane; ASI output card with dual ASI output.]

Encoder Outputs • Optional audio cards: each can encode, or transcode from Dolby E, one 5.1 + one 2.0 Dolby Digital • Video inputs • One configuration would be to provide one 5.1 in English with the Descriptive Video on a Dolby E path, and the Spanish 5.1 with Descriptive Video in Spanish on another Dolby E path. • Note this configuration supports more than my example case. • The maximum audio + video shown is: four video programs (HD, SD or mixed), four 5.1 surround channels, sixteen 2.0 stereo channels • Based on slide from

Announcement Paradigms • This Program has English, Spanish, with Video Description in both Languages • Separate virtual channels • English • English Description for the Blind • English for the Hearing Impaired • Spanish • Spanish Description for the Blind • Spanish for the Hearing Impaired But Cable may have to do something like this if delivery to NTSC sets is required

Terrestrial Emission Overview (signaling and announcement) • Events & tracks: Event 1 carries CM (5.1, eng); Event 2 carries CM (5.1, eng) + VI (2, eng) + CM (5.1, spa) + VI (2, spa) • During Event 1: EIT 0 (partial) lists Event 1 – AC-3 descriptor with one audio service, and Event 2 – AC-3 descriptor with four audio services; PMT – one ISO-639 descriptor • During Event 2: EIT 0 (partial) lists Event 2 – AC-3 descriptor with four audio services; PMT – four ISO-639 descriptors (one per program element) • ATSC Transport: PSIP & PSI
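One way to read this slide is that the announcement (EIT) and the signaling (PMT) should agree for the event that is on air. The sketch below compares the two as plain sets of (service type, language) pairs. The data layout and the function name are assumptions made for illustration; real tables would have to be parsed from the transport stream first.

```python
# Tracks announced for Event 2, as listed on the slide.
EVENT_2_ANNOUNCED = {
    ("CM", "eng"),
    ("VI", "eng"),
    ("CM", "spa"),
    ("VI", "spa"),
}


def compare_signaling(announced, signaled):
    """Return (missing_from_pmt, missing_from_eit) as sorted lists."""
    announced = set(announced)
    signaled = set(signaled)
    missing_from_pmt = sorted(announced - signaled)
    missing_from_eit = sorted(signaled - announced)
    return missing_from_pmt, missing_from_eit


# Hypothetical fault: the PMT still carries the Event 1 line-up
# (one English complete main) after Event 2 has started.
stale_pmt = {("CM", "eng")}
missing_from_pmt, missing_from_eit = compare_signaling(EVENT_2_ANNOUNCED, stale_pmt)
print("announced but not signaled:", missing_from_pmt)
print("signaled but not announced:", missing_from_eit)
```

A check like this only flags a mismatch; it says nothing about which table is the correct one.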

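DTV Receiver [Figure: receiver block diagram with these labeled blocks and signals: RF, tuner & VSB demodulator, transport demultiplex, video decoding, audio decoding, display processor, video and audio outputs, PSIP data, program guide database, and the control inputs channel select, audio select, and program select from user.]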

CEA-CEB-21 • Recommended Practice for Selection and Presentation of DTV Audio • In progress since July 2008, but almost done

Key Issues • User set up and control • Explicit Language selection • Explicit VI and HI selection • Differences between stream construction (Off-air and Cable)

Key Recommendations • Receivers should gather user preferences and allow them to be changed later • Receivers should read the tables and descriptors and use the contents • Receivers should automatically select best fit to preferences when more than one stream is present

Key Recommendations • Receivers should consider the following items when providing for user selection of their preferred audio stream: • Stream type (CM, VI, or HI) as signaled by the bsmod field in the AC-3_audio_stream_descriptor(). • The language field encoded in the AC-3_audio_stream_descriptor(). • The component_name_descriptor() to provide supplemental audio stream information to users, if needed.
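A receiver that follows these recommendations might pick a stream roughly as sketched below. The scoring weights (a language match counts more than a service type match) and all names are assumptions for illustration, not the CEB-21 algorithm. The bsmod values used are the AC-3 codes for CM (0), VI (2), and HI (3); the PIDs are hypothetical.

```python
from dataclasses import dataclass

# AC-3 bsmod codes for the three service types discussed on the slide.
BSMOD_TO_TYPE = {0: "CM", 2: "VI", 3: "HI"}


@dataclass(frozen=True)
class AudioStream:
    pid: int        # hypothetical PID
    bsmod: int      # from the AC-3 audio stream descriptor
    language: str   # ISO 639 code from the descriptor's language field


@dataclass(frozen=True)
class Preferences:
    language: str
    service_type: str = "CM"


def score(stream, prefs):
    """Assumed weighting: language match = 2, service type match = 1."""
    points = 0
    if stream.language == prefs.language:
        points += 2
    if BSMOD_TO_TYPE.get(stream.bsmod) == prefs.service_type:
        points += 1
    return points


def select_stream(streams, prefs):
    """Return the best-fit stream, or None if no stream is present.

    Ties go to the stream listed first, because max() keeps the first
    maximal item it encounters.
    """
    if not streams:
        return None
    return max(streams, key=lambda stream: score(stream, prefs))


streams = [
    AudioStream(0x31, 0, "eng"),  # CM eng
    AudioStream(0x32, 2, "eng"),  # VI eng
    AudioStream(0x33, 0, "spa"),  # CM spa
    AudioStream(0x34, 2, "spa"),  # VI spa
]
choice = select_stream(streams, Preferences(language="spa", service_type="VI"))
print(hex(choice.pid))
```

Whether language or service type should win when both cannot be satisfied is a policy choice; the weights here are one possible answer.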

Conclusion • Multiple language, multiple community service audio tracks are part of your future (unless English is declared to be the <ONE> Official Language for the United States of America) • Force fitting to the 2-audio mold is problematic • When breaking the mold, plan ahead

Credits ATSC CEA Mike Dolan Graham Jones
