Speeding up VirtualDub Presented by: Shmuel Habari Advisor: Zvika Guz Software Systems Lab Technion
What is VirtualDub? • VirtualDub is an incredibly popular open source video processing tool. • It can merge videos, cut scenes, add subtitles and apply a wide variety of filters. It also supports third-party video compression (e.g. DivX). • VirtualDub is continually refined, expanded and adapted by its original creator, Avery Lee. http://www.virtualdub.org/
VirtualDub’s Benchmark • The benchmark uses the Resize filter, an often-used, CPU-heavy filter. • The input is a vibrantly colored animation video, so that any flaw in the output is visible. • The output video was produced without audio filtering and without third-party compression utilities.
VTune Performance Analyzer • Analyzing the benchmark using VTune: • First step - VDFastMemcpyPartialMMX2
Fast Memory Copy • This function copies large quantities of data from a source memory address to a destination memory address. • Each loop iteration loads 64 bytes into registers and then stores them to the destination address. • The loop then advances to the next 64 bytes and continues until all the data has been copied. • In the observed runs, the function was called to copy 2048 bytes each time.
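The loop described on this slide can be sketched in portable C++ as follows. This is only an illustration, not VirtualDub's MMX assembly: the function name `copy_in_64_byte_blocks` and the `regs` staging buffer are hypothetical. The sketch also assumes the length is a whole number of 64-byte blocks, as with the observed 2048-byte calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the MMX routine: copies `bytes` bytes in
// 64-byte blocks, staging each block in a small local buffer (playing the
// role of the registers) before storing it to the destination.
void copy_in_64_byte_blocks(void* dst, const void* src, std::size_t bytes) {
    constexpr std::size_t kBlock = 64;
    // Assumption of this sketch: only whole 64-byte blocks are handled.
    assert(bytes % kBlock == 0);

    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    std::uint64_t regs[8];  // 8 x 8 bytes = one 64-byte block

    for (std::size_t off = 0; off < bytes; off += kBlock) {
        std::memcpy(regs, s + off, kBlock);  // load: memory -> "registers"
        std::memcpy(d + off, regs, kBlock);  // store: "registers" -> memory
    }
}
```

A call such as `copy_in_64_byte_blocks(dst, src, 2048)` would run the loop body 32 times, once per 64-byte block.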
Clockticks Samples • VTune again showed that, predictably, most of the clockticks were spent reading from memory.
Dummy Loop • Given that, the solution was to fill the cache before beginning to copy the data. • I added a dummy loop, a.k.a. @mainloop, that reads 1024 bytes ahead before blastloop runs. • Once the cached data has been consumed, and if the end of the source data has not been reached, another 1024 bytes are read. • Using the dummy loop, a speedup of 4.21% was gained.
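A minimal C++ sketch of the read-ahead idea follows, assuming a 64-byte cache line. It is not the actual @mainloop/blastloop assembly: the names `copy_with_read_ahead`, `kCacheLine` and `kChunk` are hypothetical. The dummy loop touches one byte per cache line of the next 1024-byte chunk so that the copy that follows is more likely to hit the cache.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t kCacheLine = 64;  // assumption about the target CPU
constexpr std::size_t kChunk = 1024;    // read-ahead distance from the slide

// Hypothetical sketch: warm the cache for each chunk, then copy that chunk.
void copy_with_read_ahead(void* dst, const void* src, std::size_t bytes) {
    assert(dst != nullptr && src != nullptr);
    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    // volatile keeps the compiler from discarding the otherwise unused reads.
    volatile unsigned char sink = 0;

    for (std::size_t base = 0; base < bytes; base += kChunk) {
        const std::size_t n = std::min(kChunk, bytes - base);

        // Dummy loop: read one byte per cache line to pull the chunk into cache.
        for (std::size_t off = 0; off < n; off += kCacheLine) {
            sink = s[base + off];
        }
        // Copy loop: the chunk just touched is now copied to the destination.
        std::memcpy(d + base, s + base, n);
    }
    (void)sink;
}
```

Whether this helps depends on the CPU and its hardware prefetcher. The 4.21% figure above is the measurement for the original assembly change, not for this sketch.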
Threads • As stated before, VirtualDub is a project in active development. • The original creator had access to code optimization tools, VTune included, which allowed him to improve the code himself and remove many pitfalls and errors common to non-optimized code. • VirtualDub also proved to be multithreaded, to a point:
Threads • The 1st thread is the processing thread. The 2nd thread is the audio thread; since we specifically disabled the audio, it showed almost no activity. • Therefore, in theory, multithreading the processing thread was still possible.
Threads • At first I had high hopes for multithreading VirtualDub. Studying the code, I concluded that it processes the video frame by frame, and scans each frame line by line. • The two approaches I decided to try were: • Processing two frames in parallel • Cutting a frame in half, and processing the top and bottom in parallel.
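The second approach, splitting a frame into a top and a bottom half, can be sketched with `std::thread` as below. Everything here is hypothetical and much simpler than VirtualDub's filter pipeline: the `Frame` struct, the `process_rows` placeholder filter (which merely inverts pixels) and `process_frame_in_halves` are illustrative names. The sketch works only because each output row depends on nothing but the read-only input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical frame type: 8-bit grayscale, stored row by row.
struct Frame {
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> pixels;
};

// Placeholder per-row filter (inverts pixels); stands in for real filter work.
void process_rows(const Frame& in, Frame& out,
                  std::size_t row_begin, std::size_t row_end) {
    assert(row_begin <= row_end && row_end <= in.height);
    for (std::size_t y = row_begin; y < row_end; ++y) {
        for (std::size_t x = 0; x < in.width; ++x) {
            const std::size_t i = y * in.width + x;
            out.pixels[i] = static_cast<std::uint8_t>(255 - in.pixels[i]);
        }
    }
}

// Processes the top half on a worker thread and the bottom half on the caller.
Frame process_frame_in_halves(const Frame& in) {
    Frame out{in.width, in.height, std::vector<std::uint8_t>(in.pixels.size())};
    const std::size_t mid = in.height / 2;

    std::thread top(process_rows, std::cref(in), std::ref(out),
                    std::size_t{0}, mid);
    process_rows(in, out, mid, in.height);
    top.join();  // both halves are complete once the worker has joined
    return out;
}
```

The two threads write to disjoint row ranges of `out`, so no locking is needed. A filter that keeps state in shared member variables would not split this cleanly.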
Threads • However, all my attempts at multithreading VirtualDub’s processing failed. • At first I believed I had run into global variables being accessed; they turned out to be private member variables of a much higher-level class. • Attempts to duplicate that class in order to split the workload failed.
Threads • Lastly, I turned to OpenMP, hoping to use its built-in ability to give each thread its own copy of the variables. • VirtualDub’s complexity made it impossible for me to convert it to the Intel Compiler: every change resulted in a staggering number of errors, each requiring many small code changes, and still more that could not be solved. • Limiting the use of the Intel Compiler to only the necessary projects did not show an improvement.
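For reference, the OpenMP capability mentioned above is the `private` data-sharing clause, which gives each thread its own copy of a listed variable. The sketch below shows it on a toy loop. The function `scale_rows` and the variable `scratch` are hypothetical and unrelated to VirtualDub's code. OpenMP must be enabled at compile time (for example `-fopenmp` on GCC or Clang); otherwise the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: each loop iteration uses a scratch variable that
// would be shared, and therefore racy, without the private clause.
std::vector<int> scale_rows(const std::vector<int>& rows, int factor) {
    assert(factor != 0);
    std::vector<int> out(rows.size());
    const long n = static_cast<long>(rows.size());
    int scratch = 0;

    // private(scratch): every thread gets its own uninitialized copy,
    // so it must be written before it is read inside the loop.
    #pragma omp parallel for private(scratch)
    for (long i = 0; i < n; ++i) {
        scratch = rows[static_cast<std::size_t>(i)] * factor;
        out[static_cast<std::size_t>(i)] = scratch + 1;
    }
    return out;
}
```

This works for variables local to the loop's scope. Member variables buried in a higher-level class, as described on the previous slide, cannot simply be listed in a `private` clause, which is one plausible reason the approach was harder than hoped.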
Conclusion • A lot of time and effort was put into this project. • To my dismay, this is not evident as a percentage of speedup, but rather as error messages and various versions of the code, each a bit closer to a working version, but never quite there. • The bottom line is that, despite the promise VirtualDub initially showed, too much optimization had already been done on it, leaving it already optimized and too big and intricate for me to optimize further.