Network Programming Mastering Complexity with ACE & Patterns Presentation

C++ Network Programming: Mastering Complexity with ACE & Patterns • Dr. Douglas C. Schmidt, Professor of EECS, Vanderbilt University, Nashville, Tennessee • d.schmidt@vanderbilt.edu • www.cs.wustl.edu/~schmidt/tutorials-ace.html

Motivation: Challenges of Networked Applications Complexities in networked applications • Accidental Complexities • Low-level APIs • Poor debugging tools • Algorithmic decomposition • Continuous re-invention/discovery of core concepts & components • Inherent Complexities • Latency • Reliability • Load balancing • Causal ordering • Scheduling & synchronization • Deadlock • Observation • Building robust, efficient, & extensible concurrent & networked applications is hard • e.g., we must address many complex topics that are less problematic for non-concurrent, stand-alone applications

Presentation Outline • Presentation Organization • Background • Concurrent & network challenges & solution approaches • Patterns & wrapper facades in ACE + applications • Cover OO techniques & language features that enhance software quality • Patterns, which embody reusable software architectures & designs • ACE wrapper facades, which encapsulate OS concurrency & network programming APIs • OO language features, e.g., classes, dynamic binding & inheritance, parameterized types

The Layered Architecture of ACE www.cs.wustl.edu/~schmidt/ACE.html • Features • Open-source • 200,000+ lines of C++ • 40+ person-years of effort • Ported to many OS platforms • Large open-source user community • www.cs.wustl.edu/~schmidt/ACE-users.html • Commercial support by Riverace • www.riverace.com/

Sidebar: Platforms Supported by ACE • ACE runs on a wide range of operating systems, including: • PCs, e.g., Windows (all 32/64-bit versions), WinCE; Redhat, Debian, and SuSE Linux; & Macintosh OS X; • Most versions of UNIX, e.g., SunOS 4.x and Solaris, SGI IRIX, HP-UX, Digital UNIX (Compaq Tru64), AIX, DG/UX, SCO OpenServer, UnixWare, NetBSD, & FreeBSD; • Real-time operating systems, e.g., VxWorks, OS/9, Chorus, LynxOS, Pharlap TNT, QNX Neutrino and RTP, RTEMS, & pSoS; • Large enterprise systems, e.g., OpenVMS, MVS OpenEdition, Tandem NonStop-UX, & Cray UNICOS • ACE can be used with all of the major C++ compilers on these platforms • The ACE Web site at http://www.cs.wustl.edu/~schmidt/ACE.html contains a complete, up-to-date list of platforms, along with instructions for downloading & building ACE

Key Capabilities Provided by ACE • Event Handling & IPC • Service Access & Control • Synchronization • Concurrency

The Pattern Language for ACE • Pattern Benefits • Preserve crucial design information used by applications & middleware frameworks & components • Facilitate reuse of proven software designs & architectures • Guide design choices for application developers

The Frameworks in ACE

Networked Logging Service Example • Key Participants • Client application processes • Generate log records • Server logging daemon • Receive, process, & store log records • The logging server example in C++NPv2 is more sophisticated than the one in C++NPv1 • There’s an extra daemon involved • C++ code for all logging service examples is in • ACE_ROOT/examples/C++NPv1/ • ACE_ROOT/examples/C++NPv2/

Patterns in the Networked Logging Service • Leader/Followers • Monitor Object • Active Object • Half-Sync/Half-Async • Reactor • Pipes & Filters • Acceptor-Connector • Component Configurator • Proactor • Wrapper Facade • Thread-safe Interface • Strategized Locking • Scoped Locking

ACE Basics: Logging • ACE’s logging facility is usually the best choice for diagnostics • Can customize logging sinks • Filterable logging severities • Portable printf()-like format directives (thread/process ID, date/time, types) • Serializes output across multiple threads • ACE propagates settings to threads created via ACE • Can log across a network • ACE_Log_Msg class; use the thread-specific singleton most of the time, via the ACE_LOG_MSG macro • Macros encapsulate most usage. Most common: • ACE_DEBUG ((severity, format [, args…])); • ACE_ERROR[_RETURN] ((severity, format [,args…])[, return-value]); • See ACE Programmer’s Guide (APG) tables 3.1 (severities), 3.2 (directives), 3.3 (macros)

ACE Logging Usage • The ACE logging API is similar to printf(), e.g.: • ACE_ERROR ((LM_ERROR, "(%t) fork failed")); • generates: • Oct 31 14:50:13 1992@ics.uci.edu@2766@LM_ERROR@client::(4) fork failed • and • ACE_DEBUG ((LM_DEBUG, "(%t) sending to server %s", host)); • generates: • Oct 31 14:50:28 1992@ics.uci.edu@1832@LM_DEBUG@drwho::(6) sending to server tango

Logging Severities • You can control which severities are seen at run time • Two masks determine whether a message is displayed: • Process-wide mask (defaults to all severities enabled) • Per-thread mask (defaults to all severities disabled) • If logged severity is enabled in either mask, message is displayed • Set process/instance mask with: • ACE_Log_Msg::priority_mask (u_long mask, MASK_TYPE which); • MASK_TYPE is ACE_Log_Msg::PROCESS or ACE_Log_Msg::THREAD. • Since default is to enable all severities process-wide, all severities are logged in all threads unless you change it • Per-thread mask initializer can be adjusted (default is all severities disabled): • ACE_Log_Msg::disable_debug_messages (); • ACE_Log_Msg::enable_debug_messages(); • Any set of severities can be specified (OR’d together) • Note that these methods set and clear a (set of) bits instead of replacing the mask, as priority_mask() does
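
The "either mask" rule above can be modeled in a few lines of plain C++. This is only a sketch of the decision logic, not ACE's implementation: the `MaskModel` type, its methods, and the `SEV_*` bit values are hypothetical names invented for illustration, and the per-thread mask is simplified to a single field.

```cpp
#include <cstdint>

// Hypothetical severity bits (assumption: not ACE's actual LM_* values).
enum Severity : std::uint32_t {
  SEV_DEBUG = 1u << 0,
  SEV_INFO  = 1u << 1,
  SEV_ERROR = 1u << 2,
  SEV_ALL   = SEV_DEBUG | SEV_INFO | SEV_ERROR
};

// Hypothetical model of the two masks described on the slide.
struct MaskModel {
  std::uint32_t process_mask = SEV_ALL;  // default: all severities enabled
  std::uint32_t thread_mask  = 0;        // default: all severities disabled

  // Replaces the whole process-wide mask, as priority_mask() is described to do.
  void set_process_mask(std::uint32_t mask) { process_mask = mask; }

  // Set or clear individual bits instead of replacing the mask.
  void enable_in_thread(std::uint32_t bits)  { thread_mask |= bits; }
  void disable_in_thread(std::uint32_t bits) { thread_mask &= ~bits; }

  // A message is displayed if its severity is enabled in either mask.
  bool is_displayed(Severity s) const {
    return ((process_mask | thread_mask) & s) != 0;
  }
};
```

With the defaults, `is_displayed(SEV_DEBUG)` is true in every thread; after `set_process_mask(0)` only threads that call `enable_in_thread(SEV_DEBUG)` display debug messages.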

Logging Severities Example • To allow threads to decide their own logging, the desired severities must be: • Disabled at process level & enabled in the thread(s) to display them. • e.g.,
    ACE_LOG_MSG->priority_mask (0, ACE_Log_Msg::PROCESS);
    ACE_Log_Msg::enable_debug_messages ();
    ACE_Thread_Manager::instance ()->spawn (service);
    ACE_Log_Msg::disable_debug_messages ();
    ACE_Thread_Manager::instance ()->spawn_n (3, worker);
• LM_DEBUG severity (only) logged in service thread • LM_DEBUG severity (and all others) not logged in worker threads • Note that enable_debug_messages() & disable_debug_messages() are static methods

Redirect Logging to a File • Default logging sink is stderr. Redirect to a file by setting the OSTREAM flag and assigning a stream. Can set the flag in two ways: • ACE_Log_Msg::open (const ACE_TCHAR *prog_name, u_long options_flags = ACE_Log_Msg::STDERR, const ACE_TCHAR *logger_key = 0); • ACE_Log_Msg::set_flags (u_long flags); • Assign a stream: • ACE_Log_Msg::msg_ostream (ACE_OSTREAM_TYPE *); (optional 2nd arg tells ACE_Log_Msg to delete the ostream) • ACE_OSTREAM_TYPE is ofstream where supported, else FILE* • To also stop output to stderr, use open() without the STDERR flag, or ACE_Log_Msg::clr_flags (STDERR)

Redirect Logging to Syslog • Redirected log output to ACE_Log_Msg::SYSLOG goes to: • Windows NT4 and up: system’s Event Log • UNIX/Linux: syslog facility (uses LOG_USER syslog facility) • Can’t set this with set_flags/clr_flags; must open. For example: • ACE_LOG_MSG->open (argv[0], ACE_Log_Msg::SYSLOG, ACE_TEXT ("syslogTest")); • Windows: 3rd arg, if supplied, replaces 1st as program name in event log • To turn it off, call open() again with different flag(s). This seems odd, but you’re effectively resetting the logging… think of it as reopen().

Logging Callbacks • Logging callbacks are useful for adding special processing or filtering to log output • Derive a class from ACE_Log_Msg_Callback & reimplement: • virtual void log (ACE_Log_Record &log_record); • Use ACE_Log_Msg::msg_callback() to register callback • Also call ACE_Log_Msg::set_flags() to add ACE_Log_Msg::MSG_CALLBACK flag • Beware… • Callback registration is specific to each ACE_Log_Msg instance • Callbacks are not inherited when new threads are created

Useful Logging Flags • There are some other ACE_Log_Msg flags that add useful functionality to ACE’s logging: • VERBOSE: Prepends program name, timestamp, host name, process ID, and message priority to each message • VERBOSE_LITE: Prepends timestamp and message priority to each message (this is what the ACE test suite uses) • SILENT: Don’t display any messages of any severity • LOGGER: Write messages to the local client logger daemon

Tracing • ACE’s tracing facility logs function/method entry & exit • Uses logging with severity LM_TRACE, so output can be selectively disabled • Just put the ACE_TRACE macro in the function:
    #include "ace/Log_Msg.h"
    void foo (void) {
      ACE_TRACE ("foo");
      // … do stuff
    }
• Says:
    (1024) Calling foo in file ‘test.cpp’ on line 8
    (1024) Leaving foo
• Clever indenting by call depth makes output easier to read • Huge amount of output, so tracing is no-op’d out by default; rebuild with config.h having: #define ACE_NTRACE 0
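
The entry/exit logging and depth-based indenting described above are commonly built on a scoped (RAII) object whose constructor logs entry and whose destructor logs exit. The sketch below illustrates that idea in standard C++ only; `ScopeTrace`, `inner()`, `outer()`, and `trace_demo()` are hypothetical names, and this is not how ACE_TRACE itself is implemented.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical scoped tracer: logs on construction and destruction.
class ScopeTrace {
public:
  ScopeTrace(std::ostream &out, const std::string &name)
    : out_(out), name_(name) {
    out_ << indent() << "Calling " << name_ << '\n';
    ++depth_;  // nested scopes are indented one level deeper
  }
  ~ScopeTrace() {
    --depth_;
    out_ << indent() << "Leaving " << name_ << '\n';
  }
  ScopeTrace(const ScopeTrace &) = delete;
  ScopeTrace &operator=(const ScopeTrace &) = delete;

private:
  static std::string indent() {
    return std::string(static_cast<std::size_t>(depth_) * 2, ' ');
  }
  std::ostream &out_;
  std::string name_;
  static thread_local int depth_;  // per-thread call depth
};
thread_local int ScopeTrace::depth_ = 0;

void inner(std::ostream &out) { ScopeTrace trace(out, "inner"); }
void outer(std::ostream &out) { ScopeTrace trace(out, "outer"); inner(out); }

// Runs the nested calls and returns the captured trace text.
std::string trace_demo() {
  std::ostringstream out;
  outer(out);
  return out.str();
}
```

Because the destructor runs on every exit path, the "Leaving" line is emitted even when a function returns early or throws.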

Networked Logging Service Example • Key Participants • Client application processes • Generate log records • Server logging daemon • Receive, process, & store log records • We’ll develop an architecture similar to ACE’s, but not the same implementation. • C++ code for all logging service examples is in • ACE_ROOT/examples/C++NPv1/ • ACE_ROOT/examples/C++NPv2/

Network Daemon Design Dimensions • Communication dimensions address the rules, form, & level of abstraction that networked applications use to interact • Concurrency dimensions address the policies & mechanisms governing the proper use of processes & threads to represent multiple service instances, as well as how each service instance may use multiple threads internally • Service dimensions address key properties of a networked application service, such as the duration & structure of each service instance • Configuration dimensions address how networked services are identified & the time at which they are bound together to form complete applications

Communication Design Dimensions • Communication is fundamental to networked application design • The next three slides present a domain analysis of communication design dimensions, which address the rules, form, and levels of abstraction that networked applications use to interact with each other • We cover the following communication design dimensions: • Connectionless versus connection-oriented protocols • Synchronous versus asynchronous message exchange • Message-passing versus shared memory

Connectionless vs. Connection-oriented Protocols • A protocol is a set of rules that specify how control & data information is exchanged between communicating entities • (Figure: 3-way handshake in TCP/IP, with the Connector & Acceptor exchanging SYN, SYN/ACK, & ACK) • Connection-oriented applications must address two additional design issues: • Data framing strategies, e.g., bytestream vs. message-oriented • Connection multiplexing (muxing) strategies, e.g., multiplexed vs. nonmultiplexed

Alternative Connection Muxing Strategies • In multiplexed connections all client requests emanating from threads in a single process pass through one TCP connection to a server process • Pros: Conserves OS communication resources, such as socket handles and connection control blocks • Cons: harder to program, less efficient, & less deterministic • In nonmultiplexed connections each client uses a different connection to communicate with a peer service • Pros: Finer control of communication priorities & low synchronization overhead since additional locks aren't needed • Cons: use more OS resources, & therefore may not scale well in certain environments

Sync vs. Async Message Exchange • Asynchronous request/response protocols stream requests from client to server without waiting for responses synchronously • Multiple client requests can be transmitted before any responses arrive from a server • These protocols therefore often require a strategy for detecting lost or failed requests & resending them later • Synchronous request/response protocols are the simplest form to implement • Requests & responses are exchanged in a lock-step sequence. • Each request must receive a response synchronously before the next is sent

Message Passing vs. Shared Memory • Message passing exchanges data explicitly via the IPC mechanisms • Application developers generally define the protocol for exchanging the data, e.g.: • Format & content of the data • Number of possible participants in each exchange (e.g., point-to-point (unicast), multicast, or broadcast) • How participants begin, conduct, & end a message-passing session • Shared memory allows multiple processes on the same or different hosts to access & exchange data as though it were local to the address space of each process • Applications using native OS shared memory mechanisms must define how to locate & map the shared memory region(s) & the data structures that are placed in shared memory

Sidebar: C++ Objects & Shared Memory • Allocating a C++ object in shared memory:
    void *obj_buf = … // Get a pointer to location in shared memory
    ABC *abc = new (obj_buf) ABC; // Use C++ placement new operator
• General responsibilities using placement new operator • Pointer passed to placement new operator must point to a memory region that is big enough & is aligned properly for the object type being created • The placed object must be destroyed by explicitly calling the destructor • Pitfalls initializing C++ objects with virtual functions in shared memory • The shared memory region may reside at a different virtual memory location in each process that maps the shared memory • The C++ compiler/linker need not locate the vtable at the same address in different processes that use the shared memory • ACE wrapper façade classes that can be initialized in shared memory must therefore be concrete data types • i.e., classes with only non-virtual methods
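
The placement-new responsibilities listed above can be tried out without real shared memory. In this minimal sketch a statically allocated, suitably aligned byte buffer stands in for the mapped region (an assumption made so the example stays self-contained); `Counter`, `region`, and `placement_demo()` are hypothetical names.

```cpp
#include <new>
#include <type_traits>

// Hypothetical concrete type: only non-virtual methods, so no vtable pointer
// is stored inside the object.
struct Counter {
  int value;
  explicit Counter(int v) : value(v) {}
  void increment() { ++value; }
};
static_assert(!std::is_polymorphic<Counter>::value,
              "objects placed in shared memory should not have virtual functions");

// Stand-in for a mapped shared memory region: big enough and properly aligned.
alignas(Counter) static unsigned char region[sizeof(Counter)];

int placement_demo() {
  void *obj_buf = region;                  // pointer to the "shared" location
  Counter *c = new (obj_buf) Counter(41);  // construct in place, no allocation
  c->increment();
  int result = c->value;
  c->~Counter();                           // placed objects are destroyed explicitly
  return result;                           // 42
}
```

Note that `delete c` would be wrong here: the storage was never obtained from operator new, so only the destructor is invoked.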

Overview of the Socket API (1/2) • Sockets are the most common network programming API available on operating system platforms • Originally developed in BSD Unix as a C language API to the TCP/IP protocol suite • The Socket API has approximately two dozen functions classified in five categories • A socket is a handle created by the OS that associates it with an end point of a communication channel • A socket can be bound to a local or remote address • In Unix, socket handles & I/O handles can be used interchangeably in most cases, but this is not the case for Windows

Overview of the Socket API (2/2) • Local context management • Connection establishment & termination • Data transfer mechanisms • Options management • Network addressing

Taxonomy of Socket Dimensions The Socket API can be decomposed into the following dimensions: • Type of communication service • e.g., streams versus datagrams versus connected datagrams • Communication & connection role • e.g., clients often initiate connections actively, whereas servers often accept them passively • Communication domain • e.g., local host only versus local or remote host

Limitations with the Socket APIs (1/2) • Poorly structured, non-uniform, & non-portable • API is linear rather than hierarchical • i.e., the API is not structured according to the different phases of connection lifecycle management and the roles played by the participants • No consistency among the names • Non-portable & error-prone • Function names: read() & write() used for any I/O handle on Unix but Windows needs ReadFile() & WriteFile() • Function semantics: different behavior of same function on different OS, e.g., accept() can take a NULL client address parameter on Unix/Windows, but will crash on some operating systems, such as VxWorks • Socket handle representations: different platforms represent sockets differently, e.g., Unix uses integers (int) whereas Windows uses pointer-sized SOCKET handles • Header files: Different platforms use different names for header files for the socket API

Limitations with the Socket APIs (2/2) • Lack of type safety • I/O handles are not amenable to strong type checking at compile time • e.g., no type distinction between a socket used for passive listening & a socket used for data transfer • Steep learning curve due to complex semantics • Multiple protocol families & address families • Options for infrequently used features such as broadcasting, async I/O, non blocking I/O, urgent data delivery • Communication optimizations such as scatter-read & gather-write • Different communication and connection roles, such as active & passive connection establishment, & data transfer • Too many low-level details • Forgetting to use the network byte order before data transfer • Possibility of missing a function, such as listen() • Possibility of mismatch between protocol & address families • Forgetting to initialize underlying C structures e.g., sockaddr • Using a wrong socket for a given role

Example of Socket API Limitations (1/3)
    #include <sys/types.h>
    #include <sys/socket.h>
    const int PORT_NUM = 10000;
    int echo_server ()
    {
      struct sockaddr_in addr;
      int addr_len;
      char buf[BUFSIZ];
      int n_handle;
      // Create the local endpoint.
• Possible differences in header file names • Forgot to initialize addr_len to sizeof (sockaddr_in) • Use of non-portable handle type

Example of Socket API Limitations (2/3)
      int s_handle = socket (PF_UNIX, SOCK_DGRAM, 0);
      if (s_handle == -1) return -1;
      // Set up address information where server listens.
      addr.sin_family = AF_INET;
      addr.sin_port = PORT_NUM;
      addr.sin_addr.s_addr = INADDR_ANY;
      if (bind (s_handle, (struct sockaddr *) &addr,
                sizeof addr) == -1)
        return -1;
• Use of non-portable return value • Protocol and address family mismatch • Wrong byte order • Unused structure members not zeroed out • Missed call to listen()
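
Two of the mistakes flagged on this slide, the wrong byte order and the unzeroed structure members, can be avoided with a small helper. The following is a sketch for POSIX platforms only; `make_listen_addr()` is a hypothetical name, and the header names are exactly the kind of platform-specific detail the earlier slides warn about.

```cpp
#include <arpa/inet.h>   // htons(), htonl() on POSIX systems
#include <netinet/in.h>  // sockaddr_in, INADDR_ANY
#include <sys/socket.h>  // AF_INET
#include <cstring>

// Hypothetical helper: build an IPv4 "listen on any interface" address.
sockaddr_in make_listen_addr(unsigned short port) {
  sockaddr_in addr;
  std::memset(&addr, 0, sizeof addr);        // zero unused structure members
  addr.sin_family = AF_INET;                 // must match a PF_INET socket
  addr.sin_port = htons(port);               // host to network byte order
  addr.sin_addr.s_addr = htonl(INADDR_ANY);  // also in network byte order
  return addr;
}
```

A matching socket would be created with `socket (PF_INET, SOCK_STREAM, 0)`, and `listen()` would be called after `bind()` before any call to `accept()`.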

Example of Socket API Limitations (3/3)
      // Create a new communication endpoint.
      if (n_handle = accept (s_handle, (struct sockaddr *) &addr,
                             &addr_len) != -1) {
        int n;
        while ((n = read (s_handle, buf, sizeof buf)) > 0)
          write (n_handle, buf, n);
        close (n_handle);
      }
      return 0;
    }
• SOCK_DGRAM handle illegal here • Reading from wrong handle • No guarantee that "n" bytes will be written
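
The last annotation, that write() may transfer fewer than "n" bytes, is usually handled by looping until the whole buffer has been sent. Below is a minimal POSIX sketch of such a loop; `write_n()` is a hypothetical helper written for this illustration, not a call taken from the slides.

```cpp
#include <sys/types.h>
#include <unistd.h>
#include <cerrno>
#include <cstddef>

// Hypothetical helper: keep writing until len bytes are sent or an error occurs.
// Returns the number of bytes written, or -1 on error.
ssize_t write_n(int fd, const void *buf, std::size_t len) {
  const char *p = static_cast<const char *>(buf);
  std::size_t sent = 0;
  while (sent < len) {
    ssize_t n = ::write(fd, p + sent, len - sent);
    if (n == -1) {
      if (errno == EINTR) continue;  // interrupted by a signal: retry
      return -1;
    }
    if (n == 0) break;               // no progress: stop rather than spin
    sent += static_cast<std::size_t>(n);
  }
  return static_cast<ssize_t>(sent);
}
```

In the echo loop, `write_n (n_handle, buf, n)` would replace the single `write()` call, and the read would use `n_handle` rather than `s_handle`.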

ACE Socket Wrapper Façade Classes • ACE defines a set of C++ classes that address the limitations with the Socket API • Enhance type-safety • Ensure portability • Simplify common use cases • Building blocks for higher-level abstractions These classes are designed in accordance with the Wrapper Facade design pattern

The Wrapper Façade Pattern (1/2) • Context • Networked applications must manage a variety of OS services, including processes, threads, socket connections, virtual memory, & files • OS platforms provide low-level APIs written in C to access these services • Problem • The diversity of hardware & operating systems makes it hard to build portable & robust networked application software • Programming directly to low-level OS APIs is tedious, error-prone, & non-portable • (Figure: applications running on Solaris, VxWorks, Win2K, Linux, & LynxOS)

The Wrapper Façade Pattern (2/2) • Solution • Apply the Wrapper Facade design pattern (P2) to avoid accessing low-level operating system APIs directly • This pattern encapsulates data & functions provided by existing non-OO APIs within more concise, robust, portable, maintainable, & cohesive OO class interfaces • (Figure: the Application calls methods method1() … methodN() on the Wrapper Facade, which holds data & calls API FunctionA(), FunctionB(), & FunctionC(); a sequence diagram shows Application → Wrapper Facade method() → functionA(), functionB())
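
As a small illustration of the pattern, the sketch below wraps the POSIX pipe(), read(), write(), and close() functions in one cohesive class. `Pipe_Channel` is a hypothetical class invented for this example (it is not part of ACE), and it assumes a POSIX platform.

```cpp
#include <sys/types.h>
#include <unistd.h>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical wrapper facade over a POSIX pipe.
class Pipe_Channel {
public:
  Pipe_Channel() {
    if (::pipe(fds_) == -1) throw std::runtime_error("pipe() failed");
  }
  ~Pipe_Channel() {  // handles are released automatically
    ::close(fds_[0]);
    ::close(fds_[1]);
  }
  Pipe_Channel(const Pipe_Channel &) = delete;
  Pipe_Channel &operator=(const Pipe_Channel &) = delete;

  // Returns the number of bytes written, or -1 on error.
  ssize_t send(const std::string &msg) {
    return ::write(fds_[1], msg.data(), msg.size());
  }

  // Reads up to max_len bytes; returns an empty string on error or EOF.
  std::string recv(std::size_t max_len) {
    std::string buf(max_len, '\0');
    ssize_t n = ::read(fds_[0], &buf[0], max_len);
    buf.resize(n > 0 ? static_cast<std::size_t>(n) : 0);
    return buf;
  }

private:
  int fds_[2];  // the raw handles stay hidden from callers
};
```

Callers write `Pipe_Channel ch; ch.send ("ping"); ch.recv (16);` and never touch the raw handles, so they cannot read from the write end or forget to close either handle.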

ACE Socket Wrapper Façades Taxonomy • The structure of the ACE Socket wrapper facades reflects the domain of networked IPC properties • The ACE Socket wrapper façade classes provide the following capabilities: • ACE_SOCK_* classes encapsulate Internet-domain Socket API functionality • ACE_LSOCK_* classes encapsulate UNIX-domain Socket API functionality • ACE also has wrapper facades for datagrams • e.g., unicast, multicast, broadcast

Roles in the ACE Socket Wrapper Facade • The active connection role (ACE_SOCK_Connector) is played by a peer application that initiates a connection to a remote peer • The passive connection role (ACE_SOCK_Acceptor) is played by a peer application that accepts a connection from a remote peer • The communication role (ACE_SOCK_Stream) is played by both peer applications to exchange data after they are connected

ACE Socket Addressing Classes (1/2) • Motivation • Network addressing is a trouble spot in the Socket API • To minimize the complexity of these low-level details, ACE defines a hierarchy of classes that provide a uniform interface for all ACE network addressing objects

ACE Socket Addressing Classes (2/2) • Class Capabilities • The ACE_Addr class is the root of the ACE network addressing hierarchy • The ACE_INET_Addr class represents TCP/IP & UDP/IP addressing information • This class eliminates many subtle sources of accidental complexity

ACE I/O Handle Classes (1/2) • Motivation • Low-level C I/O handle types are tedious, error-prone, & non-portable • Even the ACE_HANDLE typedef is still not sufficiently object-oriented & typesafe
    int buggy_echo_server (u_short port_num)
    {
      sockaddr_in s_addr;
      int acceptor = socket (PF_UNIX, SOCK_DGRAM, 0);
      s_addr.sin_family = AF_INET;
      s_addr.sin_port = port_num;
      s_addr.sin_addr.s_addr = INADDR_ANY;
      bind (acceptor, (sockaddr *) &s_addr, sizeof s_addr);
      int handle = accept (acceptor, 0, 0);
      for (;;) {
        char buf[BUFSIZ];
        ssize_t n = read (acceptor, buf, sizeof buf);
        if (n <= 0) break;
        write (handle, buf, n);
      }
    }
• int is not portable to Windows • Reading from wrong handle
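
One way to see why distinct classes help: if the listening handle and the data handle have different C++ types, the "reading from wrong handle" mistake no longer compiles. The sketch below uses two hypothetical wrapper types, `Listen_Handle` and `Data_Handle`, plus a hypothetical `handle_for_read()` function; it only models the type-checking idea and is far simpler than the ACE classes.

```cpp
#include <type_traits>

// Hypothetical wrapper for a passive-mode (listening) handle.
class Listen_Handle {
public:
  explicit Listen_Handle(int fd) : fd_(fd) {}
  int get() const { return fd_; }
private:
  int fd_;
};

// Hypothetical wrapper for a connected, data-mode handle.
class Data_Handle {
public:
  explicit Data_Handle(int fd) : fd_(fd) {}
  int get() const { return fd_; }
private:
  int fd_;
};

// Only a Data_Handle is accepted, so passing a Listen_Handle or a bare int
// is rejected by the compiler instead of failing at run time.
int handle_for_read(const Data_Handle &h) { return h.get(); }

static_assert(!std::is_convertible<Listen_Handle, Data_Handle>::value,
              "a listening handle cannot be used where a data handle is required");
static_assert(!std::is_convertible<int, Data_Handle>::value,
              "a raw int cannot silently become a data handle");
```

With these types, the equivalent of `read (acceptor, …)` in the buggy server would be a compile-time error.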

ACE I/O Handle Classes (2/2) • Class Capabilities • ACE_IPC_SAP is the root of the ACE hierarchy of IPC wrapper facades • It provides basic I/O handle manipulation capabilities to other ACE IPC wrapper facades • ACE_SOCK is the root of the ACE Socket wrapper facades & it provides methods to • Create & destroy socket handles • Obtain the network addresses of local & remote peers • Set/get socket options, such as socket queue sizes • Enable broadcast/multicast communication • Disable Nagle's algorithm

The ACE_SOCK_Connector Class • Motivation • There is a confusing asymmetry in the Socket API between (1) connection roles & (2) socket modes • e.g., an application may accidentally call recv() or send() on a data-mode socket handle before it's connected • This problem can't be detected until run time since C socket handles are weakly-typed
    int buggy_echo_client (u_short port_num, const char *s)
    {
      int handle = socket (PF_UNIX, SOCK_DGRAM, 0);
      write (handle, s, strlen (s) + 1);
      sockaddr_in s_addr;
      memset (&s_addr, 0, sizeof s_addr);
      s_addr.sin_family = AF_INET;
      s_addr.sin_port = htons (port_num);
      connect (handle, (sockaddr *) &s_addr, sizeof s_addr);
    }
• Operations called in wrong order

The ACE_SOCK_Connector Class • Class Capabilities • ACE_SOCK_Connector is a factory that establishes a new endpoint of communication actively & provides capabilities to • Initiate a connection with a peer acceptor & then initialize an ACE_SOCK_Stream object after the connection is established • Initiate connections in a blocking, nonblocking, or timed manner • Use C++ traits to support generic programming techniques that enable wholesale replacement of IPC functionality

Sidebar: Traits for ACE Wrapper Facades (1/2) • ACE uses the C++ generic programming idiom to define & combine a set of characteristics to alter the behavior of a template class • In C++, the typedef & typename language features are used to define a trait • A trait provides a convenient way to associate related types, values, & functions with a template parameter type without requiring that they be defined as members of the type • Traits are used extensively in the C++ Standard Template Library (STL)

Sidebar: Traits for ACE Wrapper Facades (2/2) • ACE Socket wrapper facades use traits to define the following associations • PEER_ADDR – this trait defines the ACE_INET_Addr class associated with the ACE Socket Wrapper Façade • PEER_STREAM – this trait defines the ACE_SOCK_Stream data transfer class associated with the ACE_SOCK_Acceptor & ACE_SOCK_Connector factories
    class ACE_TLI_Connector {
    public:
      typedef ACE_INET_Addr PEER_ADDR;
      typedef ACE_TLI_Stream PEER_STREAM;
      // ...
    };
    class ACE_SOCK_Connector {
    public:
      typedef ACE_INET_Addr PEER_ADDR;
      typedef ACE_SOCK_Stream PEER_STREAM;
      // ...
    };
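
To show how such traits enable "wholesale replacement" of IPC mechanisms, the sketch below writes one function template that works with any connector exposing PEER_ADDR and PEER_STREAM. All of the types here (`Inet_Addr`, `Sock_Stream`, `Tli_Stream`, `Sock_Connector`, `Tli_Connector`) are simplified, hypothetical stand-ins that do no real networking; only the PEER_ADDR and PEER_STREAM trait names come from the slide.

```cpp
#include <string>

// Hypothetical stand-ins for the address and stream classes.
struct Inet_Addr { std::string host; unsigned short port; };
struct Sock_Stream { std::string transport() const { return "socket"; } };
struct Tli_Stream  { std::string transport() const { return "tli"; } };

// Each connector publishes its associated types as traits.
struct Sock_Connector {
  typedef Inet_Addr PEER_ADDR;
  typedef Sock_Stream PEER_STREAM;
  int connect(PEER_STREAM &, const PEER_ADDR &) { return 0; }  // pretend success
};
struct Tli_Connector {
  typedef Inet_Addr PEER_ADDR;
  typedef Tli_Stream PEER_STREAM;
  int connect(PEER_STREAM &, const PEER_ADDR &) { return 0; }  // pretend success
};

// Generic code names only the traits, never a concrete IPC class.
template <class CONNECTOR>
std::string connect_and_describe(const std::string &host, unsigned short port) {
  typename CONNECTOR::PEER_ADDR addr = {host, port};
  typename CONNECTOR::PEER_STREAM stream;
  CONNECTOR connector;
  if (connector.connect(stream, addr) == -1) return "failed";
  return stream.transport();
}
```

Switching from sockets to TLI is then a one-word change at the call site: `connect_and_describe<Tli_Connector> ("localhost", 80)` instead of `connect_and_describe<Sock_Connector> ("localhost", 80)`.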

Using the ACE_SOCK_Connector (1/3) • This example shows how the ACE_SOCK_Connector can be used to connect a client application to a Web server
    int main (int argc, char *argv[])
    {
      const char *pathname =
        argc > 1 ? argv[1] : "/index.html";
      const char *server_hostname =
        argc > 2 ? argv[2] : "www.dre.vanderbilt.edu";
      typedef ACE_SOCK_Connector CONNECTOR;
      CONNECTOR connector;
      CONNECTOR::PEER_STREAM peer;
      CONNECTOR::PEER_ADDR peer_addr;
      if (peer_addr.set (80, server_hostname) == -1)
        return 1;
      else if (connector.connect (peer, peer_addr) == -1)
        return 1;
• Instantiate the connector, data transfer, & address objects • Block until connection established or connection request failure

Using the ACE_SOCK_Connector (2/3) • Perform a non-blocking connect • If connection not established, do other work & try again without blocking
    // Designate a nonblocking connect.
    if (connector.connect (peer, peer_addr, &ACE_Time_Value::zero) == -1) {
      if (errno == EWOULDBLOCK) {
        // Do some other work ...
        // Now, try to complete connection establishment,
        // but don't block if it isn't complete yet.
        if (connector.complete (peer, 0, &ACE_Time_Value::zero) == -1)
• Perform a timed connect, e.g., 10 seconds in this case
    // Designate a timed connect.
    ACE_Time_Value timeout (10); // 10 second timeout.
    if (connector.connect (peer, peer_addr, &timeout) == -1) {
      if (errno == ETIME) {
        // Timeout, do something else
