Perl and UNIX Network Programming Naoya Ito naoya at hatena.ne.jp
Why network programming now? • httpd is boring • Some recent web applications have special networking features • Comet • Socket API of ActionScript 3 • mini servers for development, like Catalyst's server.pl
Agenda • UNIX network programming basics with Perl • I/O multiplexing • Perl libraries for modern network programming
BSD Socket API with C

    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>

    int main(void)
    {
        int listenfd, connfd;
        ssize_t n;
        struct sockaddr_in servaddr;
        char buf[1024];

        listenfd = socket(AF_INET, SOCK_STREAM, 0);

        memset(&servaddr, 0, sizeof(servaddr));
        servaddr.sin_family      = AF_INET;
        servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
        servaddr.sin_port        = htons(9999);

        bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr));
        listen(listenfd, 5);

        for (;;) {
            connfd = accept(listenfd, NULL, NULL);
            /* echo exactly the bytes read; buf is not NUL-terminated */
            while ((n = read(connfd, buf, sizeof(buf))) > 0) {
                write(connfd, buf, n);
            }
            close(connfd);
        }
    }
BSD Socket API • socket() • struct sockaddr_in • bind() • listen() • accept() • read() / write() • close()
Perl Network Programming • TMTOWTDI • less code • CPAN • performance is good enough • right design >> ... >> language advantage
BSD Socket API with Perl

    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use Socket;

    socket LISTEN_SOCK, AF_INET, SOCK_STREAM, scalar getprotobyname('tcp')
        or die "socket: $!";
    bind LISTEN_SOCK, pack_sockaddr_in(9999, INADDR_ANY)
        or die "bind: $!";
    listen LISTEN_SOCK, SOMAXCONN
        or die "listen: $!";

    while (1) {
        accept CONN_SOCK, LISTEN_SOCK;
        while (sysread(CONN_SOCK, my $buffer, 1024)) {
            syswrite CONN_SOCK, $buffer;
        }
        close CONN_SOCK;
    }
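The Perl counterpart of struct sockaddr_in is the packed string built by pack_sockaddr_in. A minimal sketch of the round trip, assuming 127.0.0.1:9999 as an arbitrary example endpoint (it is not taken from the slides beyond the port number):

```perl
use strict;
use warnings;
use Socket qw(pack_sockaddr_in unpack_sockaddr_in inet_aton inet_ntoa);

# Build the packed address structure that bind() and connect() expect.
# 127.0.0.1 is an assumed example address; 9999 is the port used on the slides.
my $packed = pack_sockaddr_in(9999, inet_aton('127.0.0.1'));

# Unpack it again to see what was stored.
my ($port, $ip) = unpack_sockaddr_in($packed);
printf "%s:%d\n", inet_ntoa($ip), $port;
```

The same unpacking works on the value returned by accept or getpeername, which is one way to log the address of a connecting client.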
use IO::Socket

    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use IO::Socket;

    my $server = IO::Socket::INET->new(
        Listen    => 20,
        LocalPort => 9999,
        Reuse     => 1,
    ) or die $@;    # IO::Socket::INET reports constructor errors in $@

    while (1) {
        my $client = $server->accept;
        while ($client->sysread(my $buffer, 1024)) {
            $client->syswrite($buffer);
        }
        $client->close;
    }

    $server->close;
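To watch one accept / read / write round trip without starting a separate client, both ends can live in one script. This is only a sketch: the loopback address, the ephemeral port (LocalPort => 0) and the message are assumptions made for the demonstration.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Listen on an ephemeral loopback port (LocalPort => 0 lets the kernel pick one).
my $server = IO::Socket::INET->new(
    Listen    => 5,
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
) or die "listen: $@";

# connect() completes against the listen queue, before accept() is called.
my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $server->sockport,
    Proto    => 'tcp',
) or die "connect: $@";

$client->syswrite("hello\n");

# Server side: take the connection off the queue and echo one read back.
my $conn = $server->accept or die "accept: $!";
$conn->sysread(my $buffer, 1024);
$conn->syswrite($buffer);

# Client side: read the echoed bytes.
$client->sysread(my $reply, 1024);
print $reply;

$_->close for $conn, $client, $server;
```

The connect succeeding before accept is the listen queue at work; the server only dequeues the connection when it calls accept.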
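blocking on Network I/O

    while (1) {
        my $client = $server->accept;
        while ($client->sysread(my $buffer, 1024)) { # block
            $client->syswrite($buffer);
        }
        $client->close;
    }

(Figure: client #1 and client #2 reach the server through the listen queue. While the server is blocked in read(2) on one connection, it cannot call accept(2) for the other: "I can't do".)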
busy loop / blocking

    % ps -e -o stat,pid,wchan=WIDE-WCHAN-COLUMN,time,comm

    while (1) { $i++ }

    STAT   PID WIDE-WCHAN-COLUMN     TIME COMMAND
    R+   18684 -                 00:00:38 perl

    while (1) { STDIN->getline }

    STAT   PID WIDE-WCHAN-COLUMN     TIME COMMAND
    S+    8671 read_chan         00:00:00 perl
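Linux internals (Figure: a process calls fread(), which fills the libc.so buffer through the read(2) system call. The system call switches to kernel mode and goes through the VFS, ext3 and the device driver down to the hardware (HDD). The user process goes to sleep, from TASK_RUNNING to TASK_UNINTERRUPTIBLE, and the hardware answers with a hardware interrupt.) ref: 『Linux カーネル2.6解読室』 p.32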
Again: blocking

    while (1) {
        my $client = $server->accept;
        while ($client->sysread(my $buffer, 1024)) { # block
            $client->syswrite($buffer);
        }
        $client->close;
    }
We need parallel processing • fork() • threads • Signal I/O • I/O Multiplexing • Asynchronous I/O
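The first option in this list, fork(), can be sketched as one child process per accepted connection. The script below plays both server and clients so that it terminates on its own; the loopback address, the ephemeral port and the "ping" messages are assumptions for the demonstration only.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use POSIX ();

my $server = IO::Socket::INET->new(
    Listen => 5, LocalAddr => '127.0.0.1', LocalPort => 0, Proto => 'tcp',
) or die "listen: $@";

# Two clients connect up front; both wait in the listen queue.
my @clients;
for (1 .. 2) {
    push @clients, IO::Socket::INET->new(
        PeerAddr => '127.0.0.1', PeerPort => $server->sockport, Proto => 'tcp',
    ) || die "connect: $@";
}

my @pids;
for (1 .. scalar @clients) {
    my $conn = $server->accept or die "accept: $!";
    my $pid  = fork;
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: serve this one connection; blocking here stalls nobody else.
        $server->close;
        while ($conn->sysread(my $buffer, 1024)) {
            $conn->syswrite($buffer);
        }
        POSIX::_exit(0);
    }
    # Parent: the child owns the connection now.
    $conn->close;
    push @pids, $pid;
}

# Talk to the second client first; a serial server would make it wait
# until the first client had disconnected.
my @replies;
for my $i (reverse 0 .. $#clients) {
    $clients[$i]->syswrite("ping $i\n");
    $clients[$i]->sysread(my $reply, 1024);
    push @replies, $reply;
}
print @replies;

# Half-close each client so the children see EOF, then reap them.
$_->shutdown(1) for @clients;
waitpid($_, 0) for @pids;
```

The cost is one process per connection, which is the resource usage that I/O multiplexing tries to avoid.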
I/O Multiplexing • Parallel I/O in a single thread, watching for I/O events on file descriptors • uses fewer resources than fork/threads • select(2) / poll(2) • wait for a number of file descriptors to change status.
select(2) (Figure: the caller hands the listening socket and accepted connections #1 and #2 to select(2). 1. select(2): "ready!" 2. Now the listening socket is ready to accept a new connection. 3. Caller: "ok, I'll try to accept()".)
select(2) in Perl • select(@args) • the form with 4 arguments, not the one with 1 • difficult interface • IO::Select • OO interface to select(2) • easy interface
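To see why the 4-argument select is called a difficult interface, here is a minimal sketch on a pipe (the pipe and the 1-second timeout are assumptions for the demonstration). The caller builds a bitmask with vec, one bit per file descriptor number, and scans the returned mask itself.

```perl
use strict;
use warnings;

pipe(my $reader, my $writer) or die "pipe: $!";
syswrite($writer, "x");

# Build the read bitmask: one bit per file descriptor number.
my $rin = '';
vec($rin, fileno($reader), 1) = 1;

# select RBITS, WBITS, EBITS, TIMEOUT. The masks are modified in place,
# so a copy is passed and the original is kept for the next call.
my $rout;
my ($nfound) = select($rout = $rin, undef, undef, 1.0);

# The caller has to scan the returned mask to find the ready descriptors.
my $ready = vec($rout, fileno($reader), 1);
print "nfound=$nfound ready=$ready\n";
```

IO::Select hides the vec bookkeeping and returns the ready handles directly.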
IO::Select SYNOPSIS

    use IO::Select;

    $s = IO::Select->new();

    $s->add(\*STDIN);
    $s->add($some_handle);

    @ready = $s->can_read($timeout); # block
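A small runnable variation of the synopsis, assuming two pipes stand in for real sockets (the handle names are hypothetical). Only the pipe that has data pending comes back from can_read.

```perl
use strict;
use warnings;
use IO::Select;

# Two pipes stand in for two client connections.
pipe(my $quiet_r, my $quiet_w) or die "pipe: $!";
pipe(my $busy_r,  my $busy_w)  or die "pipe: $!";

# Only the second pipe has data waiting.
syswrite($busy_w, "data\n");

my $select = IO::Select->new($quiet_r, $busy_r);

# Returns the handles that can be read without blocking (1 second timeout).
my @ready = $select->can_read(1);

my $received = '';
for my $fh (@ready) {
    sysread($fh, my $buf, 1024);
    $received .= $buf;
}
print "ready handles: ", scalar(@ready), ", received: $received";
```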
use IO::Select

    my $listen_socket = IO::Socket::INET->new(...) or die $@;
    my $select = IO::Select->new or die $!;
    $select->add($listen_socket);

    while (1) {
        my @ready = $select->can_read; # block
        for my $handle (@ready) {
            if ($handle == $listen_socket) {
                # the listening socket is readable: a new connection is waiting
                my $connection = $listen_socket->accept;
                $select->add($connection);
            }
            else {
                my $bytes = $handle->sysread(my $buffer, 1024);
                if ($bytes) {
                    $handle->syswrite($buffer);
                }
                else {
                    # EOF (0) or error (undef): stop watching this client
                    $select->remove($handle);
                    $handle->close;
                }
            }
        }
    }
And more things we must think about... • syswrite() can block, too • use non-blocking sockets • Line-based I/O • select(2) disadvantages
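What a non-blocking socket changes can be shown on a socketpair (an assumption for the demonstration; the handle names are hypothetical). A read with nothing pending returns undef immediately and sets $! to EAGAIN or EWOULDBLOCK instead of putting the process to sleep.

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;
use Errno qw(EAGAIN EWOULDBLOCK);

socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

# Switch one end to non-blocking mode.
$left->blocking(0);

# Nothing has been written yet: sysread returns undef at once.
my $n = sysread($left, my $buf, 1024);
my $would_block = !defined($n) && ($! == EAGAIN || $! == EWOULDBLOCK);
print "would block\n" if $would_block;

# Once data is pending, the same call succeeds.
syswrite($right, "ok\n");
$n = sysread($left, $buf, 1024);
print "read $n bytes: $buf";
```

The event loop then has to treat "would block" as "try again after the next readiness notification", not as a failure.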
non-blocking socket + Line-based I/O

    use POSIX;
    use IO::Socket;
    use IO::Select;
    use Tie::RefHash;

    my $server = IO::Socket::INET->new(...);
    $server->blocking(0);

    my (%inbuffer, %outbuffer, %ready);
    tie %ready, "Tie::RefHash";

    my $select = IO::Select->new($server);

    while (1) {
        foreach my $client ( $select->can_read(1) ) {
            handle_read($client);
        }

        # echo: move every complete line into the client's output buffer
        foreach my $client ( keys %ready ) {
            foreach my $request ( @{ $ready{$client} } ) {
                $outbuffer{$client} .= $request;
            }
            delete $ready{$client};
        }

        foreach my $client ( $select->can_write(1) ) {
            handle_write($client);
        }
    }

    sub handle_error {
        my $client = shift;
        delete $inbuffer{$client};
        delete $outbuffer{$client};
        delete $ready{$client};
        $select->remove($client);
        close $client;
    }

    sub handle_read {
        my $client = shift;

        if ($client == $server) {
            my $new_client = $server->accept();
            $new_client->blocking(0);
            $select->add($new_client);
            return;
        }

        my $data = "";
        my $rv = $client->recv($data, POSIX::BUFSIZ, 0);
        unless (defined($rv) and length($data)) {
            handle_error($client);
            return;
        }

        # keep partial input buffered until a full line has arrived
        $inbuffer{$client} .= $data;
        while ( $inbuffer{$client} =~ s/(.*\n)// ) {
            push @{$ready{$client}}, $1;
        }
    }

    sub handle_write {
        my $client = shift;
        return unless exists $outbuffer{$client};

        my $rv = $client->send($outbuffer{$client}, 0);
        unless (defined $rv) {
            warn "I was told I could write, but I can't.\n";
            return;
        }

        if ($rv == length( $outbuffer{$client} ) or $! == POSIX::EWOULDBLOCK) {
            # drop what was sent; keep the rest for the next round
            substr( $outbuffer{$client}, 0, $rv ) = "";
            delete $outbuffer{$client} unless length $outbuffer{$client};
            return;
        }

        handle_error($client);
    }

• oops
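The line-based part of the code above can be looked at in isolation. The helper name take_lines and the sample chunks are hypothetical; the sketch only shows the buffering idea: append whatever arrived, hand back complete lines, keep the unfinished tail for the next read.

```perl
use strict;
use warnings;

# Hypothetical helper: remove and return every complete line from the
# buffer, leaving any unfinished tail in place.
sub take_lines {
    my ($buffer_ref) = @_;
    my @lines;
    while ($$buffer_ref =~ s/\A([^\n]*\n)//) {
        push @lines, $1;
    }
    return @lines;
}

# Data arrives in arbitrary chunks that do not respect line boundaries.
my $inbuffer = '';
my @got;
for my $chunk ("GET / HT", "TP/1.0\nHost:", " example\n\npar") {
    $inbuffer .= $chunk;
    push @got, take_lines(\$inbuffer);
}

print scalar(@got), " complete lines, ", length($inbuffer), " bytes still buffered\n";
```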
select(2) disadvantages • FD_SETSIZE limitation • not good for C10K • Inefficient processing • the list of fds is copied to the kernel on every call • You must scan the list of fds in user land
select(2) internals (Figure: on each select(2) call the process's list of fds is copied to the kernel. On an I/O event the list is copied back, and the process checks each fd with FD_ISSET.) ref: http://osdn.jp/event/kernel2003/pdf/C06.pdf
Modern UNIX APIs • epoll • Linux 2.6 • kqueue • BSD • /dev/poll • Solaris
epoll(4) • better than select(2), poll(2) • no limit on the number of fds • O(1) scalability • no need to copy the list of fds on every call • epoll_wait(2) returns only the fds that have new events
epoll internals (Figure: the process calls epoll_create(), then registers each fd once with epoll_ctl(ADD) in an fd table held by the kernel. On an I/O event, epoll_wait() returns to the process.) ref: http://osdn.jp/event/kernel2003/pdf/C06.pdf
epoll on perl • Sys::Syscall • epoll • sendfile • IO::Epoll • use IO::Epoll qw/:compat/
Libraries for Perl Network Programming • TMTOWTDI • POE • Event::Lib • Danga::Socket • Event • Stem • Coro ...
They provide: • Event-based programming for parallel processing • system call abstraction • select(2) / poll(2) / epoll / kqueue(2) / devpoll
POE • "POE is a framework for cooperative, event driven multitasking in Perl. " • POE has many "components" on CPAN • I'm lovin' it :)
Hello, POE

    use strict;
    use warnings;
    use POE qw/Sugar::Args/;

    POE::Session->create(
        inline_states => {
            _start => sub {
                my $poe = sweet_args;
                $poe->kernel->yield('hello'); # async / FIFO
            },
            hello => sub {
                STDOUT->print("Hello, POE!");
            },
        },
    );

    POE::Kernel->run;
Watching handles in Event loop

    POE::Session->create(
        inline_states => {
            _start => sub {
                my $poe = sweet_args;
                $poe->kernel->yield('readline');
            },
            readline => sub {
                my $poe = sweet_args;
                STDOUT->syswrite("input> ");
                $poe->kernel->select_read(\*STDIN, 'handle_input');
            },
            handle_input => sub {
                my $poe = sweet_args;
                my $stdin = $poe->args->[0];
                STDOUT->syswrite(sprintf "Hello, %s", $stdin->getline);
                $poe->kernel->yield('readline');
            },
        },
    );
Results

    % perl hello_poe2.pl
    input> naoya
    Hello, naoya
    input> hatena
    Hello, hatena
    input> foo bar
    Hello, foo bar
    input>
Results of strace

    % strace -etrace=select,read,write -p `pgrep perl`
    Process 8671 attached - interrupt to quit
    select(8, [0], [], [], {3570, 620000}) = 1 (in [0], left {3566, 500000})
    read(0, "naoya\n", 4096) = 6
    write(1, "Hello, naoya\n", 13) = 13
    select(8, [0], [], [], {0, 0}) = 0 (Timeout)
    write(1, "input> ", 7) = 7
    select(8, [0], [], [], {3600, 0}) = 1 (in [0], left {3595, 410000})
    read(0, "hatena\n", 4096) = 7
    write(1, "Hello, hatena\n", 14) = 14
    select(8, [0], [], [], {0, 0}) = 0 (Timeout)
    write(1, "input> ", 7) = 7
    select(8, [0], [], [], {3600, 0}) = 1 (in [0], left {3598, 860000})
    read(0, "foobar\n", 4096) = 7
    write(1, "Hello, foobar\n", 14) = 14
    select(8, [0], [], [], {0, 0}) = 0 (Timeout)
    write(1, "input> ", 7) = 7
    select(8, [0], [], [], {3600, 0}
use POE::Wheel::ReadLine

    POE::Session->create(
        inline_states => {
            ...
            readline => sub {
                my $poe = sweet_args;
                $poe->heap->{wheel} = POE::Wheel::ReadLine->new(
                    InputEvent => 'handle_input',
                );
                $poe->heap->{wheel}->get('input> ');
            },
            handle_input => sub {
                my $poe = sweet_args;
                $poe->heap->{wheel}->put(sprintf "Hello, %s", $poe->args->[0]);
                $poe->heap->{wheel}->get('input> ');
            },
        },
    );
    ...
Parallel echo server using POE

    use strict;
    use warnings;
    use POE qw/Sugar::Args Wheel::SocketFactory Wheel::ReadWrite/;

    POE::Session->create(
        inline_states => {
            _start => \&server_start,
        },
        package_states => [
            main => [qw/
                accept_new_client
                accept_failed
                client_input
            /],
        ],
    );

    POE::Kernel->run;

    sub server_start {
        my $poe = sweet_args;
        $poe->heap->{listener} = POE::Wheel::SocketFactory->new(
            BindPort     => 9999,
            Reuse        => 'on',
            SuccessEvent => 'accept_new_client',
            FailureEvent => 'accept_failed',
        );
    }

    sub accept_new_client {
        my $poe = sweet_args;
        my $wheel = POE::Wheel::ReadWrite->new(
            Handle     => $poe->args->[0],
            InputEvent => 'client_input',
        );
        # keep a reference, or the wheel is destroyed when this sub returns
        $poe->heap->{wheel}->{$wheel->ID} = $wheel;
    }

    sub client_input {
        my $poe = sweet_args;
        my $line     = $poe->args->[0];
        my $wheel_id = $poe->args->[1];
        $poe->heap->{wheel}->{$wheel_id}->put($line);
    }

    sub accept_failed {}
Again, Parallel echo server using POE

    use POE qw/Sugar::Args Component::Server::TCP/;

    POE::Component::Server::TCP->new(
        Port        => 9999,
        ClientInput => sub {
            my $poe   = sweet_args;
            my $input = $poe->args->[0];
            $poe->heap->{client}->put($input);
        },
    );

    POE::Kernel->run();
POE has many components on CPAN • PoCo::IRC • PoCo::Client::HTTP • PoCo::Server::HTTP • PoCo::EasyDBI • PoCo::Cron • PoCo::Client::MSN • PoCo::Client::Linger ...
using POE with epoll • just use POE::Loop::Epoll • use POE qw/Loop::Epoll/;
Event::Lib • libevent(3) wrapper • libevent is used by memcached • libevent provides: • event-based programming • devpoll, kqueue, epoll, select, poll abstraction • Similar to Event.pm • Simple
echo server using Event::Lib

    use IO::Socket::INET;
    use Event::Lib;

    my $server = IO::Socket::INET->new(...) or die $@;
    $server->blocking(0);

    event_new($server, EV_READ|EV_PERSIST, \&event_accepted)->add;
    event_mainloop;

    sub event_accepted {
        my $event = shift;
        my $server = $event->fh;
        my $client = $server->accept;
        $client->blocking(0);
        event_new($client, EV_READ|EV_PERSIST, \&event_client_input)->add;
    }

    sub event_client_input {
        my $event = shift;
        my $client = $event->fh;
        $client->sysread(my $buffer, 1024);
        event_new($client, EV_WRITE, \&event_client_output, $buffer)->add;
    }

    sub event_client_output {
        ...
    }
Result of strace on Linux 2.6

    epoll_wait(4, {{EPOLLIN, {u32=135917448, u64=135917448}}}, 1023, 5000) = 1
    gettimeofday({1167127923, 189763}, NULL) = 0
    read(7, "gho\r\n", 1024) = 5
    epoll_ctl(4, EPOLL_CTL_MOD, 7, {EPOLLIN|EPOLLOUT, {u32=135917448, u64=135917448}}) = 0
Danga::Socket • by Brad Fitzpatrick - welcome to Japan :) • It also provides event-driven programming and epoll abstraction • used by Perlbal and MogileFS
Summary • Network programming needs a little knowledge about the OS, especially process scheduling, I/O and the implementation of TCP/IP. • Use modern libraries/frameworks to keep your code simple. • Perl has many good libraries for UNIX Network Programming.