Perl and UNIX Network Programming
Naoya Ito
naoya at hatena.ne.jp
Why network programming now?
• httpd is boring
• Some recent web applications have special networking features.
• Comet
• Socket API of ActionScript 3
• mini servers for development, like Catalyst's server.pl
Agenda
• UNIX network programming basics with Perl
• I/O multiplexing
• Perl libraries for modern network programming
BSD Socket API with C
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <strings.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int listenfd, connfd;
        struct sockaddr_in servaddr;
        char buf[1024];
        ssize_t n;

        listenfd = socket(AF_INET, SOCK_STREAM, 0);

        bzero(&servaddr, sizeof(servaddr));
        servaddr.sin_family = AF_INET;
        servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
        servaddr.sin_port = htons(9999);

        bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr));
        listen(listenfd, 5);

        for (;;) {
            connfd = accept(listenfd, NULL, NULL);
            /* echo exactly the bytes read; buf is not NUL-terminated,
               so strlen(buf) would be wrong here */
            while ((n = read(connfd, buf, sizeof(buf))) > 0) {
                write(connfd, buf, n);
            }
            close(connfd);
        }
    }
BSD Socket API
• socket()
• struct sockaddr_in
• bind()
• listen()
• accept()
• read() / write()
• close()
Perl Network Programming
• TMTOWTDI
• less code
• CPAN
• performance is good enough
  • right design >> ... >> language advantage
BSD Socket API with Perl
    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use Socket;

    socket LISTEN_SOCK, AF_INET, SOCK_STREAM, scalar getprotobyname('tcp')
        or die "socket: $!";
    bind LISTEN_SOCK, pack_sockaddr_in(9999, INADDR_ANY)
        or die "bind: $!";
    listen LISTEN_SOCK, SOMAXCONN
        or die "listen: $!";

    while (1) {
        accept CONN_SOCK, LISTEN_SOCK;
        # echo until the client closes the connection
        while (sysread(CONN_SOCK, my $buffer, 1024)) {
            syswrite CONN_SOCK, $buffer;
        }
        close CONN_SOCK;
    }
use IO::Socket
    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use IO::Socket;

    my $server = IO::Socket::INET->new(
        Listen    => 20,
        LocalPort => 9999,
        Reuse     => 1,
    ) or die $!;

    while (1) {
        my $client = $server->accept;
        while ($client->sysread(my $buffer, 1024)) {
            $client->syswrite($buffer);
        }
        $client->close;
    }

    $server->close;
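To try the echo servers above by hand, a small client is useful. The following is only a sketch and not part of the original slides: `echo_roundtrip` is a hypothetical helper name, and it assumes the message is short enough to come back in a single sysread.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper (not from the slides): send one message to an echo
# server and return whatever comes back in the first read.
sub echo_roundtrip {
    my ($host, $port, $message) = @_;

    my $socket = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
    ) or die "connect to $host:$port failed: $@";

    $socket->syswrite($message);
    my $bytes = $socket->sysread(my $reply, 1024);   # blocks until data or EOF
    $socket->close;

    return defined $bytes ? $reply : undef;
}
```

With one of the servers listening on port 9999, `print echo_roundtrip('127.0.0.1', 9999, "hello\n");` should print the line back.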
blocking on Network I/O
    while (1) {
        my $client = $server->accept;
        while ($client->sysread(my $buffer, 1024)) { # block
            $client->syswrite($buffer);
        }
        $client->close;
    }
[Diagram: the server is blocked in read(2) on client #1, so it can't accept(2) client #2, which waits in the listen queue.]
busy loop / blocking
    % ps -e -o stat,pid,wchan=WIDE-WCHAN-COLUMN,time,comm

Busy loop: while (1) { $i++ }
    STAT   PID WIDE-WCHAN-COLUMN     TIME COMMAND
    R+   18684 -                 00:00:38 perl

Blocking: while (1) { STDIN->getline }
    STAT   PID WIDE-WCHAN-COLUMN     TIME COMMAND
    S+    8671 read_chan         00:00:00 perl
Linux internals
[Diagram, top to bottom: process (buffer, fread(), TASK_RUNNING) → libc.so (buffer) → read(2) system call → vfs → ext3 → device driver → Hardware (HDD). At the system call the process switches to Kernel-Mode and the user process goes to sleep (TASK_UNINTERRUPTIBLE); a hardware interrupt arrives in Kernel-Mode when the device is done.]
ref: 『Linux カーネル2.6解読室』 p.32
Again: blocking
    while (1) {
        my $client = $server->accept;
        while ($client->sysread(my $buffer, 1024)) { # block
            $client->syswrite($buffer);
        }
        $client->close;
    }
We need parallel processing
• fork()
• threads
• Signal-driven I/O
• I/O Multiplexing
• Asynchronous I/O
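As a point of comparison for the first option in the list, here is a minimal sketch of a fork-per-connection echo server. It is not from the original slides: `serve_with_fork` is a hypothetical name, and the `$max_clients` limit exists only so the example terminates (a real server would loop forever and reap children from a SIGCHLD handler).

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: accept $max_clients connections and hand each one
# to its own child process, so a slow client no longer blocks the others.
sub serve_with_fork {
    my ($server, $max_clients) = @_;

    for (1 .. $max_clients) {
        my $client = $server->accept or die "accept: $!";

        my $pid = fork;
        die "fork: $!" unless defined $pid;

        if ($pid == 0) {
            # child: echo until the client closes, then exit
            $server->close;
            while ($client->sysread(my $buffer, 1024)) {
                $client->syswrite($buffer);
            }
            exit 0;
        }

        # parent: the child owns this connection now
        $client->close;
    }

    1 while waitpid(-1, 0) > 0;   # reap all children before returning
}
```

Each connection costs a whole process here, which is the resource overhead that I/O multiplexing avoids.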
I/O Multiplexing
• Parallel I/O in a single thread, watching I/O events on file descriptors
• uses fewer resources than fork/threads
• select(2) / poll(2)
  • wait for a number of file descriptors to change status.
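select(2)
[Diagram: select(2) watches the listening socket, accepted connection #1, and accepted connection #2 on behalf of the caller.]
1. ready! (the listening socket signals select(2))
2. select(2) to caller: now the listening socket is ready to accept a new connection.
3. caller: ok, I'll try to accept()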
select(2) on Perl
• select(@args)
  • the form with 4 arguments, not the 1-argument select that sets the default output handle
  • difficult interface
• IO::Select
  • OO interface to select(2)
  • easy interface
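To make the difference concrete, the sketch below asks the same question ("which handles are readable?") both ways. The helper names `readable_with_select` and `readable_with_io_select` are hypothetical and only serve the illustration.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: which of @handles are readable within $timeout
# seconds? Uses the built-in four-argument select directly.
sub readable_with_select {
    my ($timeout, @handles) = @_;

    # Build the bit vector: one bit per file descriptor number.
    my $rin = '';
    vec($rin, fileno($_), 1) = 1 for @handles;

    # select modifies its arguments in place, so pass a copy.
    my $nfound = select(my $rout = $rin, undef, undef, $timeout);
    return () unless defined $nfound && $nfound > 0;

    # Scan the result vector to find out which bits are still set.
    return grep { vec($rout, fileno($_), 1) } @handles;
}

# The same question asked through IO::Select.
sub readable_with_io_select {
    my ($timeout, @handles) = @_;
    return IO::Select->new(@handles)->can_read($timeout);
}
```

The bit-vector bookkeeping in the first helper is what IO::Select hides behind `add` and `can_read`.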
IO::Select SYNOPSIS
    use IO::Select;

    $s = IO::Select->new();
    $s->add(\*STDIN);
    $s->add($some_handle);
    @ready = $s->can_read($timeout); # block
use IO::Select
    my $listen_socket = IO::Socket::INET->new(...) or die $@;
    my $select = IO::Select->new or die $!;
    $select->add($listen_socket);

    while (1) {
        my @ready = $select->can_read; # block
        for my $handle (@ready) {
            if ($handle == $listen_socket) {
                # new connection: start watching it too
                my $connection = $listen_socket->accept;
                $select->add($connection);
            }
            else {
                my $bytes = $handle->sysread(my $buffer, 1024);
                if ($bytes) {
                    $handle->syswrite($buffer);
                }
                else {
                    # EOF (0) or error (undef): stop watching and close
                    $select->remove($handle);
                    $handle->close;
                }
            }
        }
    }
And more things we must think about...
• syswrite() can block, too
  • use non-blocking sockets
• Line-based I/O
• select(2) disadvantages
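The first two points can be sketched in isolation before looking at a full server. This is an illustration under assumptions, not code from the slides: `flush_some` and `take_lines` are hypothetical helper names, and the handle passed to `flush_some` is assumed to already be in non-blocking mode (`$handle->blocking(0)`).

```perl
use strict;
use warnings;
use IO::Socket;                        # also loads IO::Handle (syswrite, blocking)
use Errno qw(EAGAIN EWOULDBLOCK);

# Hypothetical helper: write as much of $$buffer_ref as the non-blocking
# handle accepts right now and leave the rest for a later attempt.
# Returns true if the handle is still usable, false on a real error.
sub flush_some {
    my ($handle, $buffer_ref) = @_;

    while (length $$buffer_ref) {
        my $written = $handle->syswrite($$buffer_ref);
        if (!defined $written) {
            # kernel send buffer is full: try again when select says writable
            return 1 if $! == EAGAIN || $! == EWOULDBLOCK;
            return 0;
        }
        substr($$buffer_ref, 0, $written) = '';   # drop what was sent
    }
    return 1;
}

# Hypothetical helper: remove and return the complete lines in
# $$buffer_ref; a trailing partial line stays in the buffer.
sub take_lines {
    my ($buffer_ref) = @_;
    my @lines;
    push @lines, $1 while $$buffer_ref =~ s/\A([^\n]*\n)//;
    return @lines;
}
```

A server would append incoming data to a per-client input buffer, call `take_lines` on it, and call `flush_some` on the per-client output buffer whenever the socket is reported writable.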
non-blocking socket + Line-based I/O
    use POSIX;
    use IO::Socket;
    use IO::Select;
    use Tie::RefHash;

    my $server = IO::Socket::INET->new(...);
    $server->blocking(0);

    my (%inbuffer, %outbuffer, %ready);
    tie %ready, "Tie::RefHash";

    my $select = IO::Select->new($server);

    while (1) {
        foreach my $client ( $select->can_read(1) ) {
            handle_read($client);
        }

        foreach my $client ( keys %ready ) {
            foreach my $request ( @{ $ready{$client} } ) {
                $outbuffer{$client} .= $request;
            }
            delete $ready{$client};
        }

        foreach my $client ( $select->can_write(1) ) {
            handle_write($client);
        }
    }

    sub handle_error {
        my $client = shift;
        delete $inbuffer{$client};
        delete $outbuffer{$client};
        delete $ready{$client};
        $select->remove($client);
        close $client;
    }

    sub handle_read {
        my $client = shift;

        if ($client == $server) {
            my $new_client = $server->accept();
            $new_client->blocking(0);
            $select->add($new_client);
            return;
        }

        my $data = "";
        my $rv = $client->recv($data, POSIX::BUFSIZ, 0);
        unless (defined($rv) and length($data)) {
            handle_error($client);
            return;
        }

        $inbuffer{$client} .= $data;
        while ( $inbuffer{$client} =~ s/(.*\n)// ) {
            push @{$ready{$client}}, $1;
        }
    }

    sub handle_write {
        my $client = shift;
        return unless exists $outbuffer{$client};

        my $rv = $client->send($outbuffer{$client}, 0);
        unless (defined $rv) {
            warn "I was told I could write, but I can't.\n";
            return;
        }

        if ($rv == length( $outbuffer{$client}) or $! == POSIX::EWOULDBLOCK) {
            substr( $outbuffer{$client}, 0, $rv ) = "";
            delete $outbuffer{$client} unless length $outbuffer{$client};
            return;
        }

        handle_error($client);
    }
• oops
select(2) disadvantages
• FD_SETSIZE limitation
  • not good for C10K
• Inefficient processing
  • the list of fds is copied to the kernel on every call
  • you must scan the list of fds in user-land
select(2) internals
[Diagram: on each select(2) call the process copies its whole list of fds to the kernel; when an I/O event occurs the list is copied back and the process checks every fd with FD_ISSET.]
ref: http://osdn.jp/event/kernel2003/pdf/C06.pdf
Modern UNIX APIs
• epoll
  • Linux 2.6
• kqueue
  • BSD
• /dev/poll
  • Solaris
epoll(4)
• better than select(2), poll(2)
• no limit on the number of fds
• O(1) scalability
  • no need to copy the list of fds on every call
  • epoll_wait(2) returns only the fds that have new events
epoll internals
[Diagram: the process calls epoll_create() once and registers each fd with epoll_ctl(ADD); the fd table lives in the kernel, and epoll_wait() is told about I/O events.]
ref: http://osdn.jp/event/kernel2003/pdf/C06.pdf
epoll on Perl
• Sys::Syscall
  • epoll
  • sendfile
• IO::Epoll
  • use IO::Epoll qw/:compat/
Libraries for Perl Network Programming
• TMTOWTDI
• POE
• Event::Lib
• Danga::Socket
• Event
• Stem
• Coro
...
They provide:
• Event-based programming for parallel processing
• system call abstraction
  • select(2) / poll(2) / epoll / kqueue(2) / devpoll
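To show what "event-based programming" plus "system call abstraction" amounts to, here is a toy event loop built on IO::Select. It is only a sketch: `TinyLoop` and its methods are hypothetical names invented for this illustration, not the API of POE, Event::Lib, or any other library on the list, and a real library would swap select(2) for epoll or kqueue behind the same interface.

```perl
use strict;
use warnings;
use IO::Select;

{
    # Hypothetical toy class, for illustration only.
    package TinyLoop;

    sub new {
        my ($class) = @_;
        return bless { select => IO::Select->new, on_read => {} }, $class;
    }

    # Register a callback to run whenever $handle becomes readable.
    sub watch_read {
        my ($self, $handle, $callback) = @_;
        $self->{select}->add($handle);
        $self->{on_read}{ fileno($handle) } = $callback;
    }

    # Stop watching $handle (call this before closing it).
    sub unwatch {
        my ($self, $handle) = @_;
        delete $self->{on_read}{ fileno($handle) };
        $self->{select}->remove($handle);
    }

    # Dispatch events until no handles are left to watch.
    sub run {
        my ($self) = @_;
        while ($self->{select}->count) {
            for my $handle ($self->{select}->can_read) {
                my $fd = fileno($handle);
                next unless defined $fd;          # closed by an earlier callback
                my $callback = $self->{on_read}{$fd} or next;
                $callback->($self, $handle);
            }
        }
    }
}
```

Application code only registers callbacks; the blocking wait and the readiness scan live in one place, which is the part the libraries replace with a faster system call.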
POE
• "POE is a framework for cooperative, event driven multitasking in Perl."
• POE has many "components" on CPAN
• I'm lovin' it :)
Hello, POE
    use strict;
    use warnings;
    use POE qw/Sugar::Args/;

    POE::Session->create(
        inline_states => {
            _start => sub {
                my $poe = sweet_args;
                $poe->kernel->yield('hello'); # async / FIFO
            },
            hello => sub {
                STDOUT->print("Hello, POE!");
            },
        },
    );

    POE::Kernel->run;
Watching handles in Event loop
    POE::Session->create(
        inline_states => {
            _start => sub {
                my $poe = sweet_args;
                $poe->kernel->yield('readline');
            },
            readline => sub {
                my $poe = sweet_args;
                STDOUT->syswrite("input> ");
                # call 'handle_input' when STDIN becomes readable
                $poe->kernel->select_read(\*STDIN, 'handle_input');
            },
            handle_input => sub {
                my $poe = sweet_args;
                my $stdin = $poe->args->[0];
                STDOUT->syswrite(sprintf "Hello, %s", $stdin->getline);
                $poe->kernel->yield('readline');
            }
        },
    );
Results
    % perl hello_poe2.pl
    input> naoya
    Hello, naoya
    input> hatena
    Hello, hatena
    input> foo bar
    Hello, foo bar
    input>
Results of strace
    % strace -etrace=select,read,write -p `pgrep perl`
    Process 8671 attached - interrupt to quit
    select(8, [0], [], [], {3570, 620000}) = 1 (in [0], left {3566, 500000})
    read(0, "naoya\n", 4096) = 6
    write(1, "Hello, naoya\n", 13) = 13
    select(8, [0], [], [], {0, 0}) = 0 (Timeout)
    write(1, "input> ", 7) = 7
    select(8, [0], [], [], {3600, 0}) = 1 (in [0], left {3595, 410000})
    read(0, "hatena\n", 4096) = 7
    write(1, "Hello, hatena\n", 14) = 14
    select(8, [0], [], [], {0, 0}) = 0 (Timeout)
    write(1, "input> ", 7) = 7
    select(8, [0], [], [], {3600, 0}) = 1 (in [0], left {3598, 860000})
    read(0, "foobar\n", 4096) = 7
    write(1, "Hello, foobar\n", 14) = 14
    select(8, [0], [], [], {0, 0}) = 0 (Timeout)
    write(1, "input> ", 7) = 7
    select(8, [0], [], [], {3600, 0}
use POE::Wheel::ReadLine
    POE::Session->create(
        inline_states => {
            ...
            readline => sub {
                my $poe = sweet_args;
                $poe->heap->{wheel} = POE::Wheel::ReadLine->new(
                    InputEvent => 'handle_input',
                );
                $poe->heap->{wheel}->get('input> ');
            },
            handle_input => sub {
                my $poe = sweet_args;
                $poe->heap->{wheel}->put(sprintf "Hello, %s", $poe->args->[0]);
                $poe->heap->{wheel}->get('input> ');
            }
        },
    );
    ...
Parallel echo server using POE
    use POE qw/Sugar::Args Wheel::SocketFactory Wheel::ReadWrite/;

    POE::Session->create(
        inline_states => {
            _start => \&server_start,
        },
        package_states => [
            main => [qw/
                accept_new_client
                accept_failed
                client_input
            /],
        ]
    );

    POE::Kernel->run;

    sub server_start {
        my $poe = sweet_args;
        $poe->heap->{listener} = POE::Wheel::SocketFactory->new(
            BindPort     => 9999,
            Reuse        => 'on',
            SuccessEvent => 'accept_new_client',
            FailureEvent => 'accept_failed',
        );
    }

    sub accept_new_client {
        my $poe = sweet_args;
        my $wheel = POE::Wheel::ReadWrite->new(
            Handle     => $poe->args->[0],
            InputEvent => 'client_input',
        );
        # keep a reference, or the wheel is destroyed when this sub returns
        $poe->heap->{wheel}->{$wheel->ID} = $wheel;
    }

    sub client_input {
        my $poe = sweet_args;
        my $line = $poe->args->[0];
        my $wheel_id = $poe->args->[1];
        $poe->heap->{wheel}->{$wheel_id}->put($line);
    }

    sub accept_failed {}
Again, Parallel echo server using POE
    use POE qw/Sugar::Args Component::Server::TCP/;

    POE::Component::Server::TCP->new(
        Port        => 9999,
        ClientInput => sub {
            my $poe = sweet_args;
            my $input = $poe->args->[0];
            $poe->heap->{client}->put($input);
        },
    );

    POE::Kernel->run();
POE has many components on CPAN
• PoCo::IRC
• PoCo::Client::HTTP
• PoCo::Server::HTTP
• PoCo::EasyDBI
• PoCo::Cron
• PoCo::Client::MSN
• PoCo::Client::Linger
...
using POE with epoll
• just use POE::Loop::Epoll
    use POE qw/Loop::Epoll/;
Event::Lib
• libevent(3) wrapper
  • libevent is used by memcached
  • libevent provides:
    • event-based programming
    • devpoll, kqueue, epoll, select, poll abstraction
• Similar to Event.pm
• Simple
echo server using Event::Lib
    my $server = IO::Socket::INET->new(...) or die $!;
    $server->blocking(0);

    event_new($server, EV_READ|EV_PERSIST, \&event_accepted)->add;
    event_mainloop;

    sub event_accepted {
        my $event = shift;
        my $server = $event->fh;
        my $client = $server->accept;
        $client->blocking(0);
        event_new($client, EV_READ|EV_PERSIST, \&event_client_input)->add;
    }

    sub event_client_input {
        my $event = shift;
        my $client = $event->fh;
        $client->sysread(my $buffer, 1024);
        event_new($client, EV_WRITE, \&event_client_output, $buffer)->add;
    }

    sub event_client_output {
        ...
    }
Result of strace on Linux 2.6
    epoll_wait(4, {{EPOLLIN, {u32=135917448, u64=135917448}}}, 1023, 5000) = 1
    gettimeofday({1167127923, 189763}, NULL) = 0
    read(7, "gho\r\n", 1024) = 5
    epoll_ctl(4, EPOLL_CTL_MOD, 7, {EPOLLIN|EPOLLOUT, {u32=135917448, u64=135917448}}) = 0
Danga::Socket
• by Brad Fitzpatrick - welcome to Japan :)
• It also provides event-driven programming and epoll abstraction
• used by Perlbal and MogileFS
Summary
• For network programming, you need some knowledge of the OS, especially process scheduling, I/O, and the implementation of TCP/IP.
• Use modern libraries/frameworks to keep your code simple.
• Perl has many good libraries for UNIX network programming.