SVC (Scalable Video Coding) & JSVM (Joint Scalable Video Model)

Kyumin Jeong • kmjeong@adams.kw.ac.kr • Computer Communications Lab. • October 1, 2007

What is SVC?
• Scalable Video Coding (SVC)
  • Project of the Joint Video Team (JVT), which is formed by
    • ISO/IEC Moving Picture Experts Group (MPEG)
    • ITU-T Video Coding Experts Group (VCEG)
  • Specified as an amendment of the H.264/MPEG-4 AVC standard
• One coding for multiple uses
  • Universal Multimedia Access paradigm
  • A video coding method in which a single video stream can adaptively serve diverse transmission networks and diverse receiving terminals
  • One source, use Anytime, Anywhere, Any device


Scalability

JSVM
• JSVM (Joint Scalable Video Model) software
  • Reference software for the Scalable Video Coding (SVC) project
  • Written in C++
  • Provided as source code
  • Can be obtained via a CVS server
• Preparations for using JSVM (Windows environment)
  • Python
    • Required in order to use WinCVS; Tcl/Tk is included with Python
    • http://www.python.org/
  • WinCVS
    • http://www.wincvs.org
    • Fix for a problem that occurs when starting WinCVS (link below)
      • WinCvs could not find Python 2.1 installed

Accessing the latest JSVM Software
• CVS (Concurrent Versions System)
  • Needed to obtain the latest JSVM software
  • The SVC project is still in progress => JSVM keeps being updated
• Settings for connecting to the CVS server

Accessing the latest JSVM Software
• Using a command line CVS client
  • Select Admin - Command Line, then enter the commands below in order
    1. cvs -d :pserver:jvtuser:jvt.Amd.2@garcon.ient.rwth-aachen.de:/cvs/jvt login
    2. cvs -d :pserver:jvtuser@garcon.ient.rwth-aachen.de:/cvs/jvt checkout jsvm
  • Result after execution: C:\JSVM is created
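The two commands above can also be assembled programmatically, which avoids the typographic dash that slide software tends to substitute for the plain hyphen in `-d`. The following is a minimal sketch; the function name `cvs_commands` is an illustrative assumption, and the code only builds and prints the argument lists without contacting the server.

```python
# Sketch only: builds the two CVS command lines from the slide as argument lists.
# The function name is hypothetical; nothing is executed against the server.
HOST = "garcon.ient.rwth-aachen.de"
REPO = "/cvs/jvt"


def cvs_commands(user="jvtuser", password="jvt.Amd.2", module="jsvm"):
    """Return [login_argv, checkout_argv] using plain ASCII hyphens."""
    login_root = f":pserver:{user}:{password}@{HOST}:{REPO}"
    checkout_root = f":pserver:{user}@{HOST}:{REPO}"
    return [
        ["cvs", "-d", login_root, "login"],
        ["cvs", "-d", checkout_root, "checkout", module],
    ]


if __name__ == "__main__":
    for argv in cvs_commands():
        print(" ".join(argv))
```

Each list could be passed to `subprocess.run` on a machine where a `cvs` client is installed.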

Structure of the CVS repository for the JSVM Software

Building the JSVM software
• Win32 platform with Microsoft Visual Studio
  • C:\jsvm\JSVM\H264Extension\build\windows
    • Microsoft Visual Studio .NET 2003 (VC7) solution
      • H264AVCVideoEncDec.sln
    • Microsoft Visual Studio 2005 (VC8) solution
      • H264AVCVideoEncDec_vc8.sln
    • Microsoft Visual Studio 6 workspace
      • H264AVCVideoEncDec.dsw
  • Build all project files by selecting Build→Batch Build

Building the JSVM software
• Result
  • Binaries and libraries with a “d” before the dot in the file name were built in Debug mode
  • Those without a “d” before the dot were built in Release mode
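The naming rule can be checked mechanically when sorting build outputs. This is a small sketch under the assumption that no release file name happens to end in "d" on its own; the helper `build_mode` is hypothetical and not part of JSVM.

```python
from pathlib import PureWindowsPath


def build_mode(filename):
    """Classify a build output by the slide's rule: a 'd' right before the dot means Debug."""
    stem = PureWindowsPath(filename).stem  # file name without the extension
    return "Debug" if stem.endswith("d") else "Release"


if __name__ == "__main__":
    for name in ("H264AVCEncoderLibTestStatic.exe", "H264AVCEncoderLibTestStaticd.exe"):
        print(name, "->", build_mode(name))
```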

Information on binaries and libraries

Usage and Configuration of the JSVM Software

Test files
• URL
  • ftp://ftp.tnt.uni-hannover.de/pub/svc/testsequences/
• Available sequences
  • City, Crew, Football, Foreman, Bus, Ice, Harbour, Mobile, Soccer
• Available formats
  • 4CIF, CIF, and QCIF at 60, 30, 15, and 7.5 frames/s
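Because the test sequences are raw YUV files without a header, the frame size has to be known in order to interpret them. The sketch below assumes 8-bit planar YUV 4:2:0 (one full-resolution luma plane plus two chroma planes at a quarter of the resolution) and the usual 4CIF/CIF/QCIF dimensions; the helper names are illustrative.

```python
# Assumption: 8-bit planar YUV 4:2:0, i.e. 1.5 bytes per luma sample position.
RESOLUTIONS = {"4CIF": (704, 576), "CIF": (352, 288), "QCIF": (176, 144)}


def frame_bytes(width, height):
    """Bytes per frame: width*height luma samples plus two quarter-size chroma planes."""
    return width * height * 3 // 2


def frame_count(file_size, width, height):
    """Number of whole frames in a raw file of file_size bytes."""
    size = frame_bytes(width, height)
    if file_size % size != 0:
        raise ValueError("file size is not a multiple of the frame size")
    return file_size // size


if __name__ == "__main__":
    for name, (w, h) in RESOLUTIONS.items():
        print(name, w, h, frame_bytes(w, h))
```

A file size that is not a multiple of the frame size is a quick hint that the wrong resolution was assumed.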

Test encoding and decoding
• Test procedure
  • Obtain the JSVM software and the test content
  • Build the JSVM software
  • Edit the configuration files with reference to the manual
    • It is convenient to create a test directory and collect the executables, test content, and cfg files there (but watch out for versioning)
  • Run DownConvertStatic.exe
    • If necessary, down-convert the test content to the required format
  • Run H264AVCEncoderLibTestStatic.exe
    • For multi-layer encoding, a file such as test.264 is generated
  • Run BitStreamExtractorStatic.exe
    • Extract the bit-stream of the desired layer
  • Run H264AVCDecoderLibTestStatic.exe
    • Decode the encoded media data of each layer into YUV format
  • Run PSNRStatic.exe
    • Measure the PSNR
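The procedure can be written down as an ordered list of tool invocations. The sketch below is a dry run that only builds the argument lists, using the argument order from the usage strings shown on the tool slides. All file names (CIF30.yuv, sub.264, dec.yuv) and the layer id are hypothetical placeholders.

```python
# Illustrative dry run of the test procedure; nothing is executed.
# File names and the layer id are hypothetical placeholders.
def pipeline(layer_id=0):
    """Return the five tool invocations in the order given on the slide."""
    return [
        # 1. Down-convert CIF 30 Hz to QCIF 15 Hz (method 0, one temporal stage)
        ["DownConvertStatic", "352", "288", "CIF30.yuv",
         "176", "144", "QCIF15.yuv", "0", "1"],
        # 2. Encode using the main configuration file
        ["H264AVCEncoderLibTestStatic", "-pf", "encoder.cfg"],
        # 3. Extract the desired scalable layer
        ["BitStreamExtractorStatic", "test.264", "sub.264", "-sl", str(layer_id)],
        # 4. Decode the extracted stream to YUV
        ["H264AVCDecoderLibTestStatic", "sub.264", "dec.yuv"],
        # 5. Measure PSNR against the down-converted original
        ["PSNRStatic", "176", "144", "QCIF15.yuv", "dec.yuv"],
    ]


if __name__ == "__main__":
    for step, argv in enumerate(pipeline(), start=1):
        print(step, " ".join(argv))
```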

Resampler – DownConvertStatic.exe
• Spatial/temporal resampling
Usage
  DownConvertStatic <win> <hin> <in> <wout> <hout> <out> [<method> [<t> [<skip> [<frms>]]]] [[-crop <args>] [-phase <args>]]
Examples
  # Down-sampling a 4CIF 60Hz sequence to a QCIF 15Hz sequence using the dyadic method
  DownConvertStatic 704 576 4CIF60.yuv 176 144 QCIF15.yuv 1 2
  # Resampling a CIF 30Hz sequence to a 528x432 15Hz sequence using the normative upsampler
  DownConvertStatic 352 288 CIF30.yuv 528 432 528x432_15.yuv 0 1
method : rescaling methods (default: 0)
  0: normative upsampling
     non-normative downsampling (JVT-R006)
  1: dyadic upsampling (AVC 6-tap (1/2 pel) on odd luma samples)
     dyadic downsampling (MPEG-4 downsampling filter)
  2: crop only
  3: upsampling (Three-lobed Lanczos-windowed sinc)
  4: upsampling (JVT-O041: AVC 6-tap (1/2 pel) + bilinear 1/4 pel)
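In both examples the `<t>` argument matches the number of times the frame rate is halved (60 Hz to 15 Hz with t = 2, 30 Hz to 15 Hz with t = 1). The sketch below derives `<t>` from the two frame rates under that assumption and assembles the command line; the helper names are illustrative and not part of JSVM.

```python
def temporal_stages(fps_in, fps_out):
    """Number of halvings needed to get from fps_in to fps_out (assumes dyadic steps)."""
    stages = 0
    rate = float(fps_in)
    while rate > fps_out:
        rate /= 2
        stages += 1
    if rate != fps_out:
        raise ValueError("output rate is not the input rate divided by a power of two")
    return stages


def downconvert_argv(win, hin, src, wout, hout, dst, method, fps_in, fps_out):
    """Build the DownConvertStatic argument list in the order of the usage string."""
    t = temporal_stages(fps_in, fps_out)
    return ["DownConvertStatic", str(win), str(hin), src,
            str(wout), str(hout), dst, str(method), str(t)]


if __name__ == "__main__":
    print(" ".join(downconvert_argv(704, 576, "4CIF60.yuv",
                                    176, 144, "QCIF15.yuv", 1, 60, 15)))
```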

Resampler – DownConvertStatic.exe
• Cropping window
  • Generates the output sequence from a cropped area of the input
• Cropping parameters
  -crop crop_type crop_file
  • crop_type
    • 0 : one cropping window for the entire sequence
    • 1 : one cropping window for each picture of the sequence
  • crop_file
    • Name of the file containing the cropping parameters
      <x_orig> <y_orig> <crop_width> <crop_height> // parameters for frame 0
      <x_orig> <y_orig> <crop_width> <crop_height> // parameters for frame 1
      <x_orig> <y_orig> <crop_width> <crop_height> // parameters for frame 2
      <x_orig> <y_orig> <crop_width> <crop_height> // parameters for frame 3
      ...
• Example
  DownConvertStatic.exe 1280 720 720p50.yuv 720 576 SD25.yuv 0 -crop 0 crop.txt 1
  content of crop.txt: 190 0 900 720
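The values in crop.txt are consistent with a centered window that has the same aspect ratio as the 720x576 output: 720 * (720/576) = 900 and (1280 - 900) / 2 = 190. Whether the example was actually derived this way is an assumption. The sketch below computes such a window; the function name is hypothetical and any alignment constraints the tool may impose are not checked.

```python
from fractions import Fraction


def centered_crop(win, hin, wout, hout):
    """Largest centered window of the input with the output's aspect ratio.

    Returns (x_orig, y_orig, crop_width, crop_height).
    """
    target = Fraction(wout, hout)
    if Fraction(win, hin) >= target:
        # Input is wider than the target: keep full height, trim the sides.
        crop_h = hin
        crop_w = int(hin * target)
    else:
        # Input is taller than the target: keep full width, trim top and bottom.
        crop_w = win
        crop_h = int(win / target)
    return (win - crop_w) // 2, (hin - crop_h) // 2, crop_w, crop_h


if __name__ == "__main__":
    print(" ".join(str(v) for v in centered_crop(1280, 720, 720, 576)))
```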

Resampler – DownConvertStatic.exe
• Chroma phase shift management
  • -phase option
    • in_uv_ph_x, in_uv_ph_y, out_uv_ph_x and out_uv_ph_y
    • Specify the horizontal and vertical phase shift of the chroma components relative to the luma component
      • in_uv_ph_x, in_uv_ph_y : chroma phase shift of the input sequence
      • out_uv_ph_x, out_uv_ph_y : chroma phase shift of the output sequence
    • Value range of -1 to 1
• Resampling mode
  • -resample_mode option
    • Allows resampling of both progressive and interlaced material
    • Value range of 0 to 5

Encoder - H264AVCEncoderLibTestStatic
• Generates AVC or SVC bit-streams
• Using the encoder
  • Rate-distortion efficient coding is not guaranteed
    • To obtain optimized encoding results, the encoder configuration has to be specified carefully
  • The encoder does not provide rate control
    • The bit-rate has to be controlled by selecting appropriate quantization parameters
• Single-layer coding mode and scalable coding mode
  • A single-layer bit-stream can also be generated in the scalable coding mode
  • The encoding mode is specified by the parameter AVCMode
Usage
  H264AVCEncoderLibTestStatic.exe -pf <mcfg> [command line options]
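Since the bit-rate is steered through the quantization parameter, it helps to recall how QP maps to the quantizer step size in H.264/AVC: the step size doubles for every increase of QP by 6. The sketch below reproduces that mapping for QP 0 to 51; the function name is illustrative, and the effect on the actual bit-rate depends on the content, so it has to be found by trial encoding.

```python
# Quantizer step sizes for QP 0..5 in H.264/AVC; the pattern doubles every 6 QP steps.
_BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qstep(qp):
    """Quantizer step size for an H.264/AVC QP in the range 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return _BASE_QSTEP[qp % 6] * (2 ** (qp // 6))


if __name__ == "__main__":
    for qp in (20, 26, 32, 38):
        print(qp, qstep(qp))
```

A larger step size means coarser quantization, so raising QP lowers both the bit-rate and the PSNR.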

Encoder - H264AVCEncoderLibTestStatic • Single-layer coding mode • AVCMode = 1 • AVC compatible bit-stream is generated • Encoder configuration file in single-layer coding mode

Encoder - H264AVCEncoderLibTestStatic
• Scalable coding mode
  • AVCMode = 0
  • One or more layer configuration files have to be specified inside the main configuration file
• Command line options
  -bf (bitstream)  The parameter bitstream specifies the filename for the bit-stream to be generated.
  -frms (frames)   The parameter frames specifies the number of frames of the input sequence to be encoded.
  -pf (config)     The parameter config specifies the name of the config file to be used.
  -h               Prints out a brief help on using the encoder.
C:\jsvm\JSVM0-config-samples\Contrib-examples\JVT-O018\Munich-Test-Points\cfg_ags

Other JSVM Software
• Decoder - H264AVCDecoderLibTestStatic.exe
Usage
  H264AVCDecoderLibTestStatic <str> <rec> [-ec <ec>] [<maxPOCDiff>]
  str: bit-stream file (input)
  rec: reconstructed video sequence (output)
  ec: error concealment method (1-3), 0 means no error concealment
  maxPOCDiff: maximum difference of POC values of successive output frames
• SVC to AVC Bit-stream Rewriter - AvcRewriterStatic.exe
  • Converts an SVC bit-stream to an AVC bit-stream
Usage
  AvcRewriterStatic <svcstr> <avcstr>
  svcstr: bit-stream file (input)
  avcstr: rewritten AVC bit-stream file (output)

Other JSVM Software
• Bit-stream extractor - BitStreamExtractorStatic.exe
  • Extracts sub-streams of an AVC or SVC stream
    • Streams with a reduced spatial and/or temporal resolution and/or a reduced bit-rate
Usage
  BitStreamExtractorStatic [-pt trace] <in> [<out> [[-e] [-ql | -qlord]] | [-sl] | [[-l] [-t] [-f]] | [-b] | [-et] ]
Extraction of a scalable layer
  BitStreamExtractorStatic input.svc output.svc -sl 7
Extraction of a scalable sub-stream using the general option -e
  BitStreamExtractorStatic input.svc output.svc -e 176x144@15:600
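The argument of `-e` in the example packs several values into one token. The sketch below splits it, reading `176x144@15:600` as width x height @ frame rate : bit-rate in kbit/s. That reading of the fields is an assumption based on the example, and the parser is illustrative, not the extractor's own.

```python
import re
from typing import NamedTuple


class ExtractionPoint(NamedTuple):
    width: int
    height: int
    fps: float
    kbps: float  # assumed to be the target bit-rate in kbit/s


_PATTERN = re.compile(r"^(\d+)x(\d+)@(\d+(?:\.\d+)?):(\d+(?:\.\d+)?)$")


def parse_extraction_point(text):
    """Split a token such as '176x144@15:600' into its four fields."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not of the form WxH@fps:rate: {text!r}")
    width, height, fps, kbps = match.groups()
    return ExtractionPoint(int(width), int(height), float(fps), float(kbps))


if __name__ == "__main__":
    print(parse_extraction_point("176x144@15:600"))
```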

Other JSVM Software
• Quality level assigner – QualityLevelAssignerStatic.exe
  • Embeds information about quality layers inside a bit-stream
  • The quality layer information can be employed by the extraction process
    • To optimize the rate-distortion efficiency of the extracted sub-stream
Usage
  QualityLevelAssignerStatic -in Input -org L Original [-org L Original] [-out Output [-sei] | -wp DatFile] [-dep | -ind] [-mlql]
  or
  QualityLevelAssignerStatic -in Input -out Output -rp DatFile [-sei]
  -in Input        - input bit-stream
  -out Output      - output bit-stream with determined quality layer id's
  -org L Original  - original image sequence for layer L
  -wp DatFile      - data file for storing rate and distortion values
  -rp DatFile      - data file with previously computed rate and distortion values
  -sei             - provide quality layer info using SEI messages
  -dep             - determine only dependent distortions (speed-up by factor of 2, slight coding eff. losses)
  -ind             - determine only independent distortions (speed-up by factor of 2, slight coding eff. losses)
  -mlql            - determine Multi Layer Quality Layer Ids

Other JSVM Software
• MCTF pre-processing tool – MCTFPreProcessorStatic.exe
  • Pre-filters image sequences
Usage
  MCTFPreProcessor -w Width -h Height -f frms -i Input -o Output [-gop GOPSize] [-qp QP]
  -w Width      - frame width in luma samples (multiple of 16)
  -h Height     - frame height in luma samples (multiple of 16)
  -gop GOPSize  - GOP size for MCTF (2,4,8,16,32,64, default: 16)
  -qp QP        - QP for motion estimation and mode decision (>0, default: 26)
• PSNR tool – PSNRStatic.exe
  • Measures the peak signal-to-noise ratio (PSNR) between two sequences
  • Calculates the bit-rate
Usage
  PSNRStatic <w> <h> <org> <rec> [<t> [<skip> [<strm> <fps>]]]
  - rec: reconstructed file
  - t: number of temporal downsampling stages (default: 0)
  - strm: coded stream
  - fps: frames per second
Example
  PSNRStatic 176 144 org.yuv rec.yuv 0 0 str.svc 15 2>PSNR.txt
  type PSNR.txt
  128,00 32,23 38,79 39.02
  Output columns: (1) bit-rate in kbit/s, (2) Y-PSNR in dB – luminance component, (3) U-PSNR in dB – chrominance component U or Cb, (4) V-PSNR in dB – chrominance component V or Cr
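For reference, the PSNR of one 8-bit plane follows from the mean squared error as 10 * log10(255^2 / MSE). The sketch below implements this textbook definition for two equally sized sample sequences; how PSNRStatic averages over the frames of a sequence is not stated on the slide, so this is not a reimplementation of the tool.

```python
import math


def psnr(original, reconstructed, peak=255):
    """PSNR in dB between two equally long sequences of sample values."""
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    sse = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    if sse == 0:
        return math.inf  # identical planes
    mse = sse / len(original)
    return 10 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    org = bytes([100, 110, 120, 130])
    rec = bytes([101, 109, 121, 129])
    print(round(psnr(org, rec), 2))
```

`bytes` objects work directly as input because iterating over them yields integers, which is convenient for planes read from a raw YUV file.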

Other JSVM Software
• Fixed QP encoder - FixedQPEncoderStatic.exe
  • Finds the basis quantization parameter that meets
    • A rate or a quality constraint for single-layer coding
    • The rate or quality constraints for the layers of a scalable stream
Usage
  FixedQPEncoderStatic <rc_cfg>
  rc_cfg : the rate control parameter file configuring the fixed QP encoder
• SIP Analyser tool – SIPAnalyser.exe
  • Makes the selective inter-layer prediction decision
Usage
  SIPAnalyser.exe <sip_cfg> [FileLabel]
  sip_cfg : the SIP parameter file
  FileLabel : a suffix to the output filename

Use Examples as a Brief Tutorial - Single Layer Coding

Original Sequences Generation
• Goal
  • Generate spatio-temporally downsampled versions of the YUV sequences for 4CIF and CIF
• The original CIF resolution sequences
  • BUS_352x288_30_orig_01.yuv
  • FOREMAN_352x288_30_orig_01.yuv
  • FOOTBALL_352x288_30_orig_01.yuv
  • MOBILE_352x288_30.yuv
• URL : ftp.tnt.uni-hannover.de/pub/svc/testsequences/

Original Sequences Generation
• How to generate downsampled versions
  • Test file : SOCCER_352x288_30_orig_02.yuv
  • Renamed to : SOCCER_352x288_30.yuv
  • Down-sample the original CIF 30Hz sequence to a QCIF 15Hz sequence for the CIF scenario
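The sequence files follow a NAME_WIDTHxHEIGHT_RATE naming pattern, so the resolution and frame rate can be read off the file name, and the name of a downsampled version can be derived from it. The sketch below assumes that pattern holds for all files used; the helper names are hypothetical.

```python
import re

# Assumed naming pattern: NAME_WxH_FPS[_anything].yuv
_NAME = re.compile(
    r"^(?P<name>[A-Za-z0-9]+)_(?P<w>\d+)x(?P<h>\d+)_(?P<fps>\d+(?:\.\d+)?)(?:_.*)?\.yuv$"
)


def parse_sequence_name(filename):
    """Return (name, width, height, fps) parsed from a sequence file name."""
    match = _NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return (match["name"], int(match["w"]), int(match["h"]), float(match["fps"]))


def sequence_name(name, width, height, fps):
    """Build a file name in the same pattern, e.g. SOCCER_176x144_15.yuv."""
    return f"{name}_{width}x{height}_{fps:g}.yuv"


if __name__ == "__main__":
    name, w, h, fps = parse_sequence_name("SOCCER_352x288_30.yuv")
    # Halve each dimension and the frame rate: CIF 30 Hz -> QCIF 15 Hz
    print(sequence_name(name, w // 2, h // 2, fps / 2))
```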

Single Layer Coding
• Single layer coding
  • The encoded bit-stream does not provide several spatial resolutions or several bit-rates for a specific spatio-temporal resolution
  • The generated bit-stream therefore does not contain any spatial or CGS/MGS enhancement layer
• Preparation for encoding “SOCCER_176x144_15.yuv”
  • Copy encoder.cfg and layer0.cfg from the folder below to C:\jsvm\bin
    • C:\jsvm\JSVM\H264Extension\data
  • Edit encoder.cfg and layer0.cfg

Single Layer Coding
• Edit encoder.cfg
  • Change each of the options shown below
# JSVM Main Configuration File
OutputFile         test.264     # Bitstream file
FrameRate          30.0         # Maximum frame rate [Hz]
FramesToBeEncoded  150          # Number of frames (at input frame rate)
GOPSize            16           # GOP Size (at maximum frame rate)
BaseLayerMode      2            # Base layer mode (0: AVC w larger DPB,
                                # 1:AVC compatible, 2:AVC w subseq SEI)
SearchMode         4            # Search mode (0:BlockSearch, 4:FastSearch)
SearchRange        32           # Search range (Full Pel)
NumLayers          1            # Number of layers
LayerCfg           layer0.cfg   # Layer configuration file
• Edit layer0.cfg
# JSVM Layer Configuration File
InputFile          SOCCER_176x144_15.yuv  # Input file
SourceWidth        176          # Input frame width
SourceHeight       144          # Input frame height
FrameRateIn        15           # Input frame rate [Hz]
FrameRateOut       15           # Output frame rate [Hz]
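Both files use a simple "Name Value # comment" layout, which makes it easy to check the edited values with a few lines of script before starting a long encode. The sketch below reads that layout under the assumption that `#` always starts a comment and that each parameter occurs once (a multi-layer main file repeats LayerCfg, which this simple dictionary would overwrite). It is an illustrative reader, not the encoder's own parser.

```python
def parse_cfg(text):
    """Parse 'Name Value  # comment' lines into a dict of strings."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue  # blank line or comment-only line
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            params[parts[0]] = parts[1].strip()
    return params


LAYER0 = """\
# JSVM Layer Configuration File
InputFile     SOCCER_176x144_15.yuv  # Input file
SourceWidth   176                    # Input frame width
SourceHeight  144                    # Input frame height
FrameRateIn   15                     # Input frame rate [Hz]
FrameRateOut  15                     # Output frame rate [Hz]
"""

if __name__ == "__main__":
    cfg = parse_cfg(LAYER0)
    print(cfg["InputFile"], cfg["SourceWidth"], cfg["SourceHeight"])
```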

Single Layer Coding • Encoder call for single-layer coding • Result : test.264 and rec_layer0.yuv

Single Layer Coding • Extracting a temporal sub-sequence

Single Layer Coding • Decoding a temporal sub-sequence

Single Layer Coding • Use of PSNR tool
