230 likes | 383 Views
Stream Socket Programming. Idioms and pitfalls. Stream Socket Characteristics. Transmissions across a stream socket are considered to be a continuous stream of bytes. Any other structure must be created by the applications participating in the communication.
E N D
Stream Socket Programming Idioms and pitfalls
Stream Socket Characteristics • Transmissions across a stream socket are considered to be a continuous stream of bytes. • Any other structure must be created by the applications participating in the communication. • "Message" boundaries are not guaranteed to be preserved. • Most applications DO want to communicate in terms of a series of separate messages.
Application Protocols • Rules the processes involved in your application use to communicate. • Includes: • Allowed message types (formats) • Rules about when each message type can be sent. • Programming language structures don't always map exactly onto your message formats
Banking Example • Report total deposit amount (in pennies), total withdrawal amount (in pennies), number of deposits, number of withdrawals. • Design decision #1: encoding scheme • Character strings of digits • Binary integer values
Character Encoding • Advantages • No limit on size of values that can be encoded • No byte ordering issues • Disadvantages • Inefficient • Easy to get buffer size wrong or waste space • Must be careful about delimiters
Binary numeric encoding • Advantages • Uses fewer bits to represent a given value • Fields in message are a fixed number of bytes • Disadvantages • Byte ordering is significant – must use hton/ntoh • Building structs to represent messages isn't always straightforward (alignment issues)
Alignment rules • Compilers lay out structs to maximize alignment. • Fields are allocated in the order they are declared • A data value is aligned if its address is a multiple of its size (e.g. 32 bit ints – 4 bytes – at addresses divisible by 4) • A struct is aligned if its address is a multiple of the size of the largest data type it contains. • Unnamed padding bytes are added to keep struct members aligned.
Example struct mixedData { char Data1; short Data2; int Data3; char Data4; }; Total data bytes = 1 + 2 + 4 + 1 = 8
Example struct MixedData /* after compilation */ { char Data1; char Padding0[1]; /* So following 'short' is on a 2 byte boundary */ short Data2; int Data3; char Data4; char Padding1[3]; }; Total bytes = 1 + 1 + 2 + 4 + 1 + 3 = 12 Size is a multiple of sizeof(int)
To avoid padding: struct mixedData2 { int Data3; short Data2; char Data1; char Data4; }; Data items declared in order of decreasing size. Assuming actual space needed is a multiple of size of largest data item, no padding needed. sizeof(mixedData2) = 8
Bank Example struct bankMsg { int depositAmt; short depositCnt; int withdrawAmt; short withdrawCnt; }; • Total data size is 4 + 2 + 4 + 2 bytes = 12 bytes • On RHEL 5, using gcc, sizeof(bankMsg) = 16. Why?
No padding needed: struct bankMsg { int depositAmt; int withdrawAmt; short depositCnt; short withdrawCnt; }; You can also add the padding to your definition so it is explicit. (Recall sockaddr_in)
Parsing received messages • If the fields are fixed size, we can just send and receive the associated struct: struct bankMsg msg; void *buffer = (void *) &msg; msg.depositAmt = 2324234; msg.withdrawAmt = 2232344; msg.depositCnt = 50; msg.withdrawCnt = 42; send(s, &msg, sizeof(bankMsg), 0);
Parsing received messages • In the receiving process: struct bankMsg msg; void *buffer = (void *) &msg; int rbytes, rv; ... for (rbytes = 0; rbytes < sizeof(msg); rbytes += rv){ if ((rv = recv(s, buffer + rbytes, sizeof(msg) – rbytes, 0)) < 0) /* handle error */ } /* Fix byte order! */ msg.depositAmt = ntohl(msg.depositAmt); ...
Portability/Compatibility • The C/C++ standards don't specify alignment rules – left up to compiler implementations • Say you don't pay attention to padding when you define message format structs in your code. What can go wrong?
Delimited char strings • Sending multiple messages consisting of variable-length delimited character strings is tricky. • Doing a receive may give you bytes from more than one message. • Depending on how you have structured your parsing routines, it may be complicated to track which bytes of a receive buffer you have parsed.
Delimited char strings • One solution: for each delimited string you expect, receive one byte at a time until you receive the delimiter. • Leaves subsequent strings waiting in the received stream for subsequent calls to recv(). • Downside: slower than multi-byte receives, but if you don't know how many characters to expect, you're kind of stuck.
Working with core dumps • When a program crashes, it can create a file containing the image of the address space of the process at the time of the crash – a "core dump" • Debuggers can examine these files and show you precisely where the error occurred. Good for tracking down segmentation faults.
Requirements • You must compile your program with –g to preserve symbol table information for the debugger • You need to be allowed to create core files in your account. • Use the ulimit –c command to check. If return is 0, no core files will be created. • Change with "ulimit –c unlimited" • Caution: you can only do this once per login session, you can't switch back and forth.
Using core dumps • If a core dump is created, you will see "(core dumped)" in the error message • Linux creates a file called core.n, where n is a unique number • To examine the core file, use gdb (ddd also works): gdb executable-namecorefile-name
What you see • gdb reports where the error occurred, e.g. #0 0x08048370 in a (p=0x0) at test_core.c:11 • int y = *p; • a is the method name • p is the variable that caused the problem • test_core is the name of the executable, 11 is the line number • you can use gdb to examine variable values, etc.
Caveats • Core files are big, and because of the Linux naming convention, you will create a separate one every time a program crashes. • Pay attention to creation dates, and make sure you're examining the latest core dump. • Periodically delete core files. Make core.* one of the things the "clean" target in your makefile cleans up • Don't set ulimit to "unlimited" unless you need to examine a core dump.
A reasonable tutorial • http://www.network-theory.co.uk/articles/gccdebug.html