1. Computing Architectures for Virtual Reality Multiprocessor Servers / Graphic Supercomputers vs. PC Clusters Architecture
2. Introduction VR requires:
fast graphics and haptics refresh rates → graphic pipelines
low latencies → interactivity
3. Graphics Rendering Pipeline Three functional stages
Application stage (SW)
Geometry stage (HW)
Rasterizer stage (HW)
4. Computing Architectures (1) Single Host Multiprocessor Server
Massively parallel architecture
Multiprocessor → interprocessor communication → shared memory pool
Multipipe graphics → parallel rendering → bus-based fast communication
5. Computing Architectures (2) Distributed Cluster
off-the-shelf hardware
interconnecting network
scalable
6. Distributed VR system architecture problems No shared memory pool → low latency network
Independent graphic accelerator cards → video signal synchronization
If tiled image rendering → composition
7. Single Host Multiprocessor Multipipe Servers SGI InfiniteReality
massively parallel architecture
bus-based broadcast communication to distribute primitives
Graphics subsystem:
Geometry engine,
Raster Manager,
Display generator
8. SGI InfiniteReality (1)
9. SGI InfiniteReality (2)
10. SGI Performer (1) High performance 3D rendering toolkit
Use of multiple CPUs,
Use of multiple graphics pipelines
11. SGI Performer (2) - Multiprocessing Each stage of the graphics pipeline process can then run as a separate process on a separate CPU
APP
CULL
DRAW
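The APP / CULL / DRAW split above can be illustrated with a minimal sketch. It is not Performer code: Performer runs the stages as separate processes on separate CPUs, while this sketch uses Python threads and queues as a stand-in, and the names `run_pipeline`, `app`, `cull` and `draw` as well as the "even ids are visible" test are assumptions made for illustration only.

```python
import queue
import threading


def run_pipeline(num_frames):
    """Run APP, CULL and DRAW as three concurrent stages linked by queues."""
    app_to_cull = queue.Queue(maxsize=1)
    cull_to_draw = queue.Queue(maxsize=1)
    drawn = []

    def app():
        for frame in range(num_frames):
            # APP: update the simulation; the "scene" is just a list of object ids.
            scene = {"frame": frame, "objects": list(range(frame, frame + 4))}
            app_to_cull.put(scene)
        app_to_cull.put(None)  # sentinel: no more frames

    def cull():
        while (scene := app_to_cull.get()) is not None:
            # CULL: keep only "visible" objects (even ids stand in for a frustum test).
            visible = [o for o in scene["objects"] if o % 2 == 0]
            cull_to_draw.put({"frame": scene["frame"], "visible": visible})
        cull_to_draw.put(None)  # forward the sentinel

    def draw():
        while (item := cull_to_draw.get()) is not None:
            # DRAW: would issue graphics commands; here we only record the result.
            drawn.append((item["frame"], item["visible"]))

    threads = [threading.Thread(target=stage) for stage in (app, cull, draw)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return drawn
```

Because each queue holds one item, APP can already prepare frame n+1 while CULL works on frame n and DRAW on frame n-1, which is the pipelining effect the slide describes.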
12. SGI Performer (3) - Multichannels Each rendering pipeline can render multiple channels – multiple video outputs
13. SGI Performer (4) - Multipipes Multiple displays
synchronized with genlock
14. SGI Performer (5) - Hyperpipe Temporal Decomposition
To use with DPLEX ring or chain
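Temporal decomposition means that whole frames, not parts of a frame, are spread over the pipes. A minimal sketch, assuming a simple round-robin assignment (the function name `assign_frames` is hypothetical and not part of Performer):

```python
def assign_frames(num_frames, num_pipes):
    """Assign frame i to pipe i mod num_pipes (round-robin temporal decomposition)."""
    schedule = {pipe: [] for pipe in range(num_pipes)}
    for frame in range(num_frames):
        schedule[frame % num_pipes].append(frame)
    return schedule
```

With N pipes each pipe has roughly N frame times to finish its frame, so throughput rises while the latency of a single frame does not shrink.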
15. SGI Performer (6) - Frame Synchronization pfSync synchronizes the graphics pipeline to the frame rate
Handling of DRAW time overruns is specified by the phase control → scene management: LOD, culling, …
16. PC Cluster Architecture (1) Each node must have access to the same entire data set
real time visualization and interactivity → network latency
the seamless, synchronized graphic display (image reassembly)
17. PC Cluster Architecture (2) 3 levels of synchronization:
video signal synchronization → genlock
dynamic data synchronization → network
frame completion synchronization → swapbuffers barrier
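The third level, the swapbuffers barrier, can be sketched as follows. This is an illustrative model only: cluster nodes are simulated by threads in one process, `threading.Barrier` stands in for a network or hardware barrier, and `render_cluster` is a hypothetical name.

```python
import threading


def render_cluster(num_nodes, num_frames):
    """Simulate nodes that render independently but swap buffers together."""
    barrier = threading.Barrier(num_nodes)
    lock = threading.Lock()
    log = []

    def node(node_id):
        for frame in range(num_frames):
            # Render into the back buffer (simulated by a log entry).
            with lock:
                log.append(("rendered", frame, node_id))
            # Swap barrier: nobody swaps until every node has finished rendering.
            barrier.wait()
            with lock:
                log.append(("swapped", frame, node_id))

    threads = [threading.Thread(target=node, args=(i,)) for i in range(num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

In the returned log, every "rendered" entry of a frame precedes the first "swapped" entry of that frame, so the slowest node determines when all displays change.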
18. PC Cluster Architecture (3) Display scenarios
19. PC Cluster Architecture (4) 3DLabs
20. PC Cluster Architecture - Networks (1)
Hardware Solutions
Gigabit Ethernet (1 Gigabit/s, half duplex)
Myrinet (2 + 2 Gigabit/s full duplex)
ServerNet II (Compaq), ...
Software Interfaces
TCP / IP
PVM, MPI, ...
21. PC Cluster Architecture - Networks (2) Myrinet
massively parallel processors (MPP) communication technology → specialized communication channels, cut-through switches, host interfaces → "OS bypass" for low-latency communication
22. PC Cluster Architecture - Synchronization (1) The following is required to provide a seamless image:
each channel must render the same data
pixel rates must be identical
the displays must start new images at the same time
swapping of their buffers during the same blanking period
23. PC Cluster Architecture - Synchronization (2) Video Signal Synchronization
Genlock: pixel-level synchronization of all graphic pipelines is ensured via an (external) sync signal → most precise way
Framelock: synchronizes once per frame at the end of the blanking period
24. PC Cluster Architecture - Synchronization (3) Dynamic Data Synchronization
2 types of changing data
control information: direction of view
changing / dynamic data set information: model movement
3 approaches for distribution:
distribute stimuli
calculate resulting data centrally and distribute
calculate end graphics data centrally and distribute
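The first approach, distributing stimuli, relies on every node applying the same ordered events to its own copy of the state. A minimal sketch under that assumption; the event kinds ("turn", "move"), the state fields and the function names are invented for illustration:

```python
def apply_event(state, event):
    """Return a new state after applying one input event deterministically."""
    kind, value = event
    new_state = dict(state)
    if kind == "turn":
        # control information: direction of view, in degrees
        new_state["heading"] = (state["heading"] + value) % 360
    elif kind == "move":
        # dynamic data set information: model movement
        new_state["position"] = state["position"] + value
    return new_state


def replicate(events, num_nodes):
    """Broadcast the same ordered event stream to every node's local state."""
    states = [{"heading": 0, "position": 0} for _ in range(num_nodes)]
    for event in events:
        states = [apply_event(s, event) for s in states]
    return states
```

Only the small events cross the network, which keeps bandwidth low, but all nodes must process them in the same order and deterministically or their states diverge.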
25. PC Cluster Architecture - Synchronization (4) Frame Completion Synchronization: nodes have to wait until all are ready to swap buffers → swap barrier synchronization
Multiview:
26. PC Cluster Architecture - Synchronization (5) Net Juggler and SoftGenLock
based on an "input event level" parallelization → no high-bandwidth network necessary
Synchronization:
Real time Linux
Fast sync network: PAPERS (parallel port, 4 µs)
27. PC Cluster Architecture - Composition (1) Display Reassembly in Hardware
Lightning-2
connects to graphic accelerators via DVI
any pixel data generated from any node to be dynamically mapped to any location on any display
Sepia 2
reads back the framebuffer and distributes it over a fast network (ServerNet II)
28. PC Cluster Architecture - Composition (2) Lightning-2
Pixel Mapping → strip header
Frame transfer protocol → RS232 back connect for sync
Image composition
Tiled images
Colour keying
Depth compositing
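Two of the composition modes listed above can be sketched per pixel. This is a software illustration only, not how Lightning-2 implements them in hardware; images are nested lists, smaller depth is assumed to mean closer to the viewer, and the function names are hypothetical.

```python
def depth_composite(color_a, depth_a, color_b, depth_b):
    """Per pixel, keep the colour whose depth value is smaller (ties go to image A)."""
    return [
        [ca if da <= db else cb for ca, da, cb, db in zip(row_ca, row_da, row_cb, row_db)]
        for row_ca, row_da, row_cb, row_db in zip(color_a, depth_a, color_b, depth_b)
    ]


def colour_key(overlay, background, key):
    """Per pixel, show the background wherever the overlay has the key colour."""
    return [
        [bg if ov == key else ov for ov, bg in zip(row_ov, row_bg)]
        for row_ov, row_bg in zip(overlay, background)
    ]
```

Depth compositing needs each node to deliver a depth value with every pixel, whereas colour keying works on colour data alone.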
29. PC Cluster Architecture - Composition (3) Lightning-2 Architecture
30. Single Multipipe Graphics Accelerator (1) Wildcat III 6210 / Wildcat II 5110
31. Single Multipipe Graphics Accelerator (2)
32. Single Multipipe Graphics Accelerator (3) ParaScale Architecture
33. Commercial Cluster Solutions SGI Graphics Cluster™
ImageSync
DataSync
34. Commercial Cluster Solutions Evans & Sutherland – SimFUSION
35. Commercial Cluster Solutions AEC – ArsBox
nodes synchronized via parallel port
Red Hat 7.1
SGI Performer
100Mbit Ethernet
36. Conclusion Graphic Supercomputers
Massively parallel structure
Expensive
Established
PC clusters
Off-the-shelf hardware
Distributed → synchronization