This thesis proposal discusses the motivation, goals, and example applications of an architecture that aims to provide fault tolerance, scalability, and dynamic replay service in a distributed messaging system. The proposal also includes a literature survey, research issues and tasks, milestones, typical scenarios, tests, and contributions.
Scaling and Fault Tolerance for Distributed Messages in a Service and Streaming Architecture Thesis Proposal Hasan Bulut hbulut@cs.indiana.edu
Outline • Motivation • Goals of the Architecture & Example Applications • Literature Survey • Research Issues and Tasks • Milestones • Typical Scenarios • Tests • Contributions • Summary
Motivation • Collaboration systems enable people to collaborate with each other. However, these systems still have various open research issues, including: • A more fault tolerant system • A distributed and replicated archiving system • An architecture or framework to cope with network failures • A mechanism to recover from failures while a session is being recorded • Playback is available only after the session is over • A playback mechanism for live sessions
Motivation • An architecture or framework to recover late or broken clients • Late clients will miss parts of the session that have already passed • Extending services to unicast clients • What happens if the multicast feature is disabled on the network? • Support for heterogeneous clients • Support for videoconferencing clients (e.g. H.323 clients) and streaming clients (e.g. RealOne player) • Support for desktops and mobile devices such as cellular phones.
Goals of the Architecture • A service oriented architecture • Provide RTSP (Real Time Streaming Protocol) semantics • Compatible with Web Services standards and technologies • Persistent and fault tolerant architecture • A distributed and replicated archiving system in a messaging system environment • Dynamic replay service: ability to switch among distributed replay services in case of node failures • Scalable architecture • Allow a large number of clients to connect to the system. • Allow heterogeneous (different types of) clients to connect to the system
Goals of the Architecture • Provide a flexible and extendable framework for new services • Allow instant replay of streams. With this feature, it would be possible to annotate streams • Improve Quality of Service (QoS) • Time ordering of events • Maintaining the time spacing between consecutive events • Enable late and broken clients to receive the past events (streams) • A generic architecture that can work with any collaboration tool, such as audio/video, whiteboard, text chat etc.
Example Applications • Consider a late client joining a live audio/video session. This client has three options: • Ignore the missed part of the stream. • Play the missed part in a faster mode until he/she catches up with the live stream. • Play the stream from the beginning and follow the live session from behind. • The stream is not necessarily a video stream. It can consist of events from shared displays/applications such as whiteboards or from other collaboration tools. • A client can play a 2-hour long archived stream in 30 min (scaling a 2-hour stream to a 30-min stream).
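The playback options above reduce to simple arithmetic. The following is a minimal sketch, assuming a constant playback speed and a live stream that keeps advancing in real time; the function names are illustrative and do not come from the proposal.

```python
def speed_factor(stream_seconds: float, target_seconds: float) -> float:
    """Playback speed needed to fit a stream into a target duration."""
    if target_seconds <= 0:
        raise ValueError("target duration must be positive")
    return stream_seconds / target_seconds


def catch_up_seconds(missed_seconds: float, speed: float) -> float:
    """Wall-clock time for a late client to reach the live position.

    While the client plays at `speed`, the live stream advances at 1x, so
    the gap closes at (speed - 1) stream-seconds per wall-clock second.
    """
    if speed <= 1.0:
        raise ValueError("speed must exceed 1.0 to catch up with a live stream")
    return missed_seconds / (speed - 1.0)


# A 2-hour archived stream played in 30 minutes needs 4x playback.
print(speed_factor(2 * 3600, 30 * 60))        # 4.0
# A client that joined 10 minutes late and plays at 1.5x
# catches up after 20 minutes of wall-clock time.
print(catch_up_seconds(10 * 60, 1.5) / 60)    # 20.0
```

The third option (following the live session from behind) corresponds to a speed of exactly 1.0, in which case the gap never closes, which is why the sketch rejects it for catch-up.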
Literature Survey • Collaboration systems • Access Grid, InSORS, VRVS, Web based collaboration tools (WebEx, Centra) • Archiving and replay services used in collaboration systems • Voyager, IG Recorder • Streaming media standards • SMIL, RTSP (RFC 2326), RTP/RDT, data types such as H.261, H.263, MPEG-4, RealMedia • XGSP – XML Based General Session Protocol; GlobalMMCS • NaradaBrokering - Distributed messaging infrastructure
Collaboration Systems • Access Grid (AG) • Uses Internet2 multicast for audio/video transmission. • Voyager: Open source archiving tool used to record audio/video streams in MBONE sessions. • InSORS: Can be viewed as a commercial version of AG. • IG Recorder • Similar to Voyager, it records audio/video streams as well as other data streams (e.g. PowerPoint slides) in AG sessions. • VRVS • Provides some integration of different A/V endpoints. • No information is available about its archiving system. • WebEx / Centra: Web based collaboration systems. • Recording and playback are done in the traditional way; the session is recorded in local storage.
Streaming Media Standards • RTSP – Real Time Streaming Protocol • NOT a transport protocol. • VCR-like control protocol over media. • Stateful server-client communication. • RTSP States [Figure: state diagram with the states Init, Ready, and Playing / Recording, and transitions labeled SETUP, PLAY / RECORD, PAUSE, and TEARDOWN.]
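The stateful behaviour named on this slide can be sketched as a small transition table. This is a simplified illustration that models only the states and methods named in the diagram (Init, Ready, Playing, Recording; SETUP, PLAY, RECORD, PAUSE, TEARDOWN); RFC 2326 defines the complete state machine, and the class name `RtspSession` is a placeholder, not part of any RTSP library.

```python
# Keys are (current_state, method); values are the next state.
TRANSITIONS = {
    ("Init", "SETUP"): "Ready",
    ("Ready", "PLAY"): "Playing",
    ("Ready", "RECORD"): "Recording",
    ("Playing", "PAUSE"): "Ready",
    ("Recording", "PAUSE"): "Ready",
    ("Ready", "TEARDOWN"): "Init",
    ("Playing", "TEARDOWN"): "Init",
    ("Recording", "TEARDOWN"): "Init",
}


class RtspSession:
    """Tracks the server-side state of one (simplified) RTSP session."""

    def __init__(self) -> None:
        self.state = "Init"

    def handle(self, method: str) -> str:
        """Apply a method; reject it if it is not valid in the current state."""
        key = (self.state, method)
        if key not in TRANSITIONS:
            raise ValueError(f"{method} is not valid in state {self.state}")
        self.state = TRANSITIONS[key]
        return self.state


session = RtspSession()
for method in ("SETUP", "PLAY", "PAUSE", "PLAY", "TEARDOWN"):
    print(method, "->", session.handle(method))
```

Because the server must remember this state between requests, RTSP differs from a stateless request/response protocol, which is the point of the "stateful" bullet.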
Streaming Media Standards • SMIL - Synchronized Multimedia Integration Language • “An XML-based language that allows authors to write interactive multimedia presentations” • Multiple streams can be presented in a synchronized timeline. • Real Time Transport Protocol – RTP • Usually used in conjunction with RTCP. • An RTSP server can deliver media data using RTP • RealNetworks’ Data Transport – RDT • RealNetworks’ proprietary standard to deliver media. • Can be used over UDP or TCP • Data types • H.261, H.263, JPEG, etc. (mostly used in VC systems) • RealMedia, MPEG, etc. (mostly used in RTSP streaming clients)
Streaming Servers • Streaming servers are implementations of RTSP; the level of RTSP support varies between them. • Helix Streaming Server • Streaming server from RealNetworks • Open source version has limited capability. Formats: RealMedia, MP3 • Commercial version provides live archiving to local storage (as media files). Formats: RealMedia, MP3, MPEG-4, QT and WM • Darwin Streaming Server • Open source streaming server from Apple. • Supports QT format. • Archives the session to local storage (as media files)
XML Based General Session Protocol (XGSP) • XGSP is a conference control framework. • The goal of XGSP is to integrate heterogeneous systems into one collaboration system. • Includes three components: user session management, application session management and floor control. • SIP is a non-XML text-based signaling protocol for Internet conferencing, telephony and instant messaging • GlobalMMCS: A prototype system to verify and refine the XGSP conference control framework. • An XGSP media server • H.323, SIP gateways and Real Servers for A/V clients • XGSP A/V Session Server • The web server
NaradaBrokering (NB) • Virtualizes communication transport and endpoints • UDP, TCP, Multicast, SSL ... • Based on a distributed network of cooperating broker nodes (brokers support a software overlay network). • Efficiently routes (content or endpoint-based) information from producers to consumers of content. • Subscriptions can be based on SQL, regular expressions and XPath queries. • Has been deployed and tested in the context of multimedia conferencing and Grid applications. • Introduces delays on the order of one to two milliseconds at each broker
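To illustrate what a regular-expression subscription means in a publish/subscribe system, here is a toy in-process broker. It is an assumption-laden sketch only: `ToyBroker`, its methods, and the topic names are invented for illustration and are not the NaradaBrokering API, and a real broker network routes across nodes rather than within one process.

```python
import re
from typing import Callable, List, Tuple

Handler = Callable[[str, bytes], None]


class ToyBroker:
    """In-process stand-in for a broker; NOT the NaradaBrokering API."""

    def __init__(self) -> None:
        self._subs: List[Tuple[re.Pattern, Handler]] = []

    def subscribe(self, topic_pattern: str, handler: Handler) -> None:
        """Register a handler for every topic matching the regular expression."""
        self._subs.append((re.compile(topic_pattern), handler))

    def publish(self, topic: str, payload: bytes) -> int:
        """Deliver to each subscriber whose pattern matches the whole topic.

        Returns the number of deliveries made.
        """
        delivered = 0
        for pattern, handler in self._subs:
            if pattern.fullmatch(topic):
                handler(topic, payload)
                delivered += 1
        return delivered


received = []
broker = ToyBroker()
# One subscription covers both the audio and the video topic of a session.
broker.subscribe(r"session42/(audio|video)", lambda t, p: received.append(t))
broker.publish("session42/video", b"rtp-packet")  # matches the pattern
broker.publish("session42/chat", b"hello")        # no matching subscriber
print(received)
```

An archiving service could use a single pattern like this to record every media topic of a session without knowing the individual topic names in advance.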
Research Issues • We need to research which capabilities/services must exist in a messaging system to achieve a higher quality of service (QoS) for the archiving and replay service • Effect on stream quality of: • Timestamping events using NTP to achieve synchronization among streams • Time ordering of events using a buffering service • Time-spaced release of events using a time differential service • A metadata management service for archiving and replay • How to build a session catalog that describes the streams in the session • How to manage messaging system topics for RTSP sessions • How to expose this service as a web service
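Timestamping events with NTP relies on estimating the offset between a local clock and a reference clock. The sketch below uses the standard four-timestamp offset and round-trip delay formulas from the NTP specification; the function names and the sample values are illustrative assumptions, not measurements or code from this work.

```python
from typing import Tuple


def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> Tuple[float, float]:
    """Estimate clock offset and round-trip delay from one NTP exchange.

    t1: client send time (client clock)
    t2: server receive time (server clock)
    t3: server send time (server clock)
    t4: client receive time (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def timestamp_event(local_clock: float, offset: float) -> float:
    """Stamp an event with local time corrected by the estimated offset."""
    return local_clock + offset


# Hypothetical exchange: the client clock is 0.5 s behind the server and
# the network takes 0.25 s in each direction.
offset, delay = ntp_offset_and_delay(t1=100.0, t2=100.75, t3=100.75, t4=100.5)
print(offset, delay)                   # 0.5 0.5
print(timestamp_event(100.5, offset))  # 101.0
```

If every event producer corrects its timestamps this way, events from different machines can be compared on a common timeline, which is the precondition for synchronizing multiple streams.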
Research Issues • Improving fault tolerance of the system • Redundancy in archiving/replay services • How to provide continuity of the stream in case of a replay service node crash • How the replay service can leverage fault tolerance • Scalable replay service • How many requests a replay service can support • Load balancing among replay services • Effect of network threshold • Supporting different types of clients with different capabilities • Other research issues • Systematic applications of major and minor event concepts in event driven systems • How to expose RTSP semantics as a web service • Synchronization of replaying multiple streams
Research Tasks • RTSP semantics support in XGSP (service oriented architecture) • How RTSP clients can join XGSP sessions • An RTSP to XGSP signaling gateway • How XGSP will support RTSP clients • RTSP semantics support in NB (messaging system) • How to support active replay (play, pause, rewind, forward, absolute positioning, etc.) for both live and archived streams • Instant replay • How to support and provide seeking capability in live streams • Current RTSP servers do not support rewind in live streams • Changes to the NB archive and replay service to support RTSP semantics • Do we need extensions to RTSP?
Milestones I • NB Time Service • An implementation of Network Time Protocol (RFC 1305) • Entities generating events in the system should use the Time Service to timestamp the events. • NB Buffering Service • The goal is to time-order events. • Delay introduced by the buffering service can vary based on its parameter values. • Time Differential Service • Releases events while preserving the time spacing between them. • Streaming Gateway • Transcodes audio/video streams into RealMedia format. • Targets both desktop PCs and cellular phones • Stream conversion is a CPU-intensive task
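The Buffering Service and Time Differential Service described above can be sketched together. This is a minimal illustration under stated assumptions: the buffer is bounded by an event count (`capacity`), which is a stand-in for whatever parameters the actual service uses, and `ReorderBuffer` and `release_delays` are hypothetical names rather than NB classes.

```python
import heapq
import itertools
from typing import List, Tuple


class ReorderBuffer:
    """Holds up to `capacity` events and releases them in timestamp order."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._heap: List[Tuple[float, int, bytes]] = []
        self._seq = itertools.count()  # tie-breaker for equal timestamps

    def push(self, timestamp: float, payload: bytes) -> List[Tuple[float, bytes]]:
        """Add an event; return the events pushed out because the buffer is full."""
        heapq.heappush(self._heap, (timestamp, next(self._seq), payload))
        released = []
        while len(self._heap) > self.capacity:
            ts, _, data = heapq.heappop(self._heap)
            released.append((ts, data))
        return released

    def flush(self) -> List[Tuple[float, bytes]]:
        """Release everything still buffered, earliest timestamp first."""
        out = []
        while self._heap:
            ts, _, data = heapq.heappop(self._heap)
            out.append((ts, data))
        return out


def release_delays(timestamps: List[float], speed: float = 1.0) -> List[float]:
    """Seconds to wait before releasing each event so that the original
    spacing between consecutive events is preserved (divided by `speed`)."""
    if not timestamps:
        return []
    delays = [0.0]
    for prev, cur in zip(timestamps, timestamps[1:]):
        delays.append((cur - prev) / speed)
    return delays


buf = ReorderBuffer(capacity=2)
ordered = []
for ts in [1.0, 3.0, 2.0, 4.0]:          # event 2.0 arrives out of order
    ordered += buf.push(ts, b"")
ordered += buf.flush()
print([ts for ts, _ in ordered])          # [1.0, 2.0, 3.0, 4.0]
print(release_delays([0.0, 0.5, 1.25]))   # [0.0, 0.5, 0.75]
```

A larger buffer tolerates more reordering but holds each event longer, which is the trade-off behind the variable delay noted for the buffering service. Passing `speed=2.0` to `release_delays` halves every gap, which is one way faster-than-real-time replay could be expressed.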
Milestones II • NB Replay Service • Should provide an API to support RTSP semantics. • RTSP Media/Topic Manager • Binds RTSP sessions to their related NB topics. • XGSP Archive Manager • Provides RTSP RECORD semantics to start archiving of topics. • Session Metadata Service • Metadata service for the archiving system. • RTSP Server / Proxy • Ability to dynamically locate replay and archiving services. • Ability to switch between replicas. • We will apply these services to the e-sports project
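Switching between replicas while keeping the stream continuous can be sketched as follows. This is a simplified illustration that assumes every replica stores the same events under the same sequence numbers; `ReplicaUnavailable`, `replay_with_failover`, and `make_replica` are hypothetical names, and failure detection in a real deployment would be more involved than catching an exception.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple


class ReplicaUnavailable(Exception):
    """Raised by a (hypothetical) replay replica that cannot serve a request."""


Fetch = Callable[[int], bytes]  # maps an event sequence number to its payload


def replay_with_failover(
    replicas: Dict[str, Fetch], start: int, end: int
) -> Iterator[Tuple[int, str, bytes]]:
    """Yield (seq, replica_name, payload) for seq in [start, end).

    On failure, the next replica is tried at the SAME sequence number,
    so the client sees no gap and no duplicate events.
    """
    names = list(replicas)
    if not names:
        raise ValueError("at least one replica is required")
    current = 0
    for seq in range(start, end):
        tried = 0
        while True:
            name = names[current]
            try:
                payload = replicas[name](seq)
                break
            except ReplicaUnavailable:
                tried += 1
                if tried == len(names):
                    raise  # every replica failed for this event
                current = (current + 1) % len(names)
        yield seq, name, payload


def make_replica(fail_from: Optional[int]) -> Fetch:
    """Build a fake replica that starts failing at sequence `fail_from`."""
    def fetch(seq: int) -> bytes:
        if fail_from is not None and seq >= fail_from:
            raise ReplicaUnavailable(seq)
        return f"event-{seq}".encode()
    return fetch


replicas = {"replay-A": make_replica(fail_from=2), "replay-B": make_replica(None)}
served = [(seq, name) for seq, name, _ in replay_with_failover(replicas, 0, 4)]
print(served)
```

In this run, replica A serves events 0 and 1, fails at event 2, and replica B takes over from event 2 onward. The time spent detecting the failure and switching is the recovery latency that the proposal plans to measure.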
Typical Scenario for Live Streaming and Recording • 1: XGSP client sends and receives RTP packets. • 2, 3: Archiving service subscribes to the topic and records the sessions on different storages. • 4: RTSP client communicates with RTSP server/proxy and establishes an RTSP session. • 5: RTSP client receives the stream from the topic. [Figure: Producer (XGSP Client (MBONE tools, ...), …), NB with topic X, two Replay/Archiving Services each with NB Stable Storage, RTSP Server/Proxy, and RTSP Client, joined by links numbered 1 to 5. Legend: two way NB link; one way NB link that carries stream; local storage access; communication channel; topic.]
Typical Scenario for Live Streaming and Recording (with stream conversion) • 1: XGSP client sends and receives RTP packets. • 2: Streaming Gateway (SG) subscribes to the stream topic and receives the stream. • 3: SG publishes the stream over an NB link. • 4, 5: Archiving service subscribes to the topic and records the sessions on different storages. • 6: RTSP client communicates with RTSP server/proxy and establishes an RTSP session. • 7: RTSP client receives the stream from the topic. [Figure: Producer (XGSP Client (MBONE tools, ...), …), Streaming Gateway, NB with topics marked X, two Replay/Archiving Services each with NB Stable Storage, RTSP Server/Proxy, and RTSP Client, joined by links numbered 1 to 7. Legend: two way NB link; one way NB link that carries stream; local storage access; communication channel; topic.]
Typical Scenario for Playback • 1: RTSP client communicates with RTSP server/proxy and establishes an RTSP session. • 2: Stream is published by the replay service. • 3: Alternate stream to 2. • 4: RTSP client receives the stream from the topic. [Figure: two Replay/Archiving Services each with NB Stable Storage, NB with topic X, RTSP Server/Proxy, and RTSP Client, joined by links numbered 1 to 4. Legend: one way NB link; one way NB link that carries stream; communication channel; topic.]
Typical Scenario for Instant Replay • 1: XGSP client sends and receives RTP packets. • 2: Archiving service subscribes to the topic and records the sessions. • 3: RTSP client communicates with RTSP server/proxy and establishes an RTSP session. • 4: RTSP client receives the stream from the topic. • 5: RTSP client communicates with RTSP server/proxy for instant replay. • 6: Replay service publishes the archived stream to a topic. • 7: RTSP client receives the archived stream. [Figure: Producer (XGSP Client (MBONE tools, ...), …), NB with topics marked X, Replay/Archiving Service with NB Stable Storage, RTSP Server/Proxy, and RTSP Client, joined by links numbered 1 to 7. Legend: two way NB link; one way NB link that carries stream; local storage access; communication channel; topic.]
Tests • NB Time Service tests on several machines. • Time differential service performance test. • Measuring the number of clients that can be supported by a single replay service and storage. • Measuring client scalability • Measuring latency of recovery from failures • How long does it take to dynamically switch between replay services when the node that provides the replay service fails? • How long does it take for an archiving node to recover the missed events?
Contributions of this Thesis • Combines the benefits of RTSP with a distributed messaging system and a service oriented architecture for archiving and replay in a large, geographically distributed network • A fault tolerant architecture for collaboration systems • Enables late and broken clients to receive missed streams • An architecture for instant replay of live streams • A scalable replay architecture that benefits from the advantages of service oriented architecture and messaging systems • Support for heterogeneous clients
Summary • This thesis addresses the following open research issues in collaboration systems: • A framework for fault tolerance • Support for late or broken clients in live sessions. • Distributed archiving/replay system • Support for different clients: Research extension of architectures to support different clients with different capabilities, e.g. cellular phone clients. • Client scalability: Research extension of architectures to support as many clients as possible. Centralized servers support a limited number of clients • An instant replay mechanism for live streams.