ecs150 Spring 2006 : Operating System #7: mbuf (Chapter 11). Dr. S. Felix Wu Computer Science Department University of California, Davis http://www.cs.ucdavis.edu/~wu/ sfelixwu@gmail.com. IPC. Uniform communication for distributed processes “socket”: network programming
ecs150 Spring 2006: Operating System #7: mbuf (Chapter 11) • Dr. S. Felix Wu, Computer Science Department, University of California, Davis • http://www.cs.ucdavis.edu/~wu/ • sfelixwu@gmail.com ecs150, spring 2006
IPC • Uniform communication for distributed processes • “socket”: network programming • operating system kernel issues • Semaphores, message queues, and shared memory for local processes
Socket: an IPC Abstraction Layer
Mbufs: Memory Buffers • The main data structure for network processing in the kernel • Why can’t we directly use the kernel memory management facilities, such as kernel malloc (power-of-2 allocation), pages, or VM objects?
“Packet” • Ethernet or 802.11 header • IP header • IPsec header • Transport headers (TCP/UDP/…) • SSL header • Others?
Properties: Network Packet Processing • Variable sizes • Prepend or remove headers • Fragment/divide or defragment/combine • Can we avoid copying as much as possible? • Queueing • Parallel processing for high speed • E.g., Juniper routers run FreeBSD
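The "prepend or remove without copying" requirement can be illustrated with a toy buffer that keeps its payload at the tail of a fixed block, so headers are added or stripped by moving a data pointer. This is only a sketch: `struct toybuf`, `TOYBUF_SIZE`, and the `toybuf_*` functions are hypothetical names for illustration, not kernel APIs.

```c
#include <stddef.h>
#include <string.h>

#define TOYBUF_SIZE 256            /* hypothetical fixed buffer size */

/* Toy stand-in for an mbuf: valid bytes live somewhere inside storage[]. */
struct toybuf {
    unsigned char *data;           /* start of valid bytes */
    size_t len;                    /* number of valid bytes */
    unsigned char storage[TOYBUF_SIZE];
};

/* Place the payload at the tail so headers can be prepended later. */
int toybuf_init(struct toybuf *b, const void *payload, size_t n)
{
    if (n > TOYBUF_SIZE)
        return -1;
    b->data = b->storage + TOYBUF_SIZE - n;
    b->len = n;
    memcpy(b->data, payload, n);
    return 0;
}

/* Prepend a header by moving the data pointer back; the payload stays put. */
int toybuf_prepend(struct toybuf *b, const void *hdr, size_t n)
{
    if ((size_t)(b->data - b->storage) < n)
        return -1;                 /* not enough headroom */
    b->data -= n;
    b->len += n;
    memcpy(b->data, hdr, n);
    return 0;
}

/* Strip a leading header by moving the data pointer forward. */
int toybuf_strip(struct toybuf *b, size_t n)
{
    if (n > b->len)
        return -1;
    b->data += n;
    b->len -= n;
    return 0;
}
```

Only the header bytes are written on a prepend; the payload is never moved, which is the property the slide asks for.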
Source files: sys/mbuf.h, kern/kern_mbuf.c, kern/uipc_mbuf.c, kern/uipc_mbuf2.c • [Figure: mbuf layout, 256 bytes in total, with header fields labeled 4, 4, and 2 bytes]
[Figure: mbuf header. m_next links mbufs of the same packet; m_nextpkt links to the next packet. Flags shown: M_EXT, M_PKTHDR, M_EOR, M_BCAST, M_MCAST]
#define M_EXT 0x0001
#define M_PKTHDR 0x0002
#define M_EOR 0x0004
#define M_RDONLY 0x0008
#define M_PROTO1 0x0010
#define M_PROTO2 0x0020
#define M_PROTO3 0x0040
#define M_PROTO4 0x0080
#define M_PROTO5 0x0100
#define M_SKIP_FIREWALL 0x4000
#define M_FREELIST 0x8000
#define M_BCAST 0x0200
#define M_MCAST 0x0400
#define M_FRAG 0x0800
#define M_FIRSTFRAG 0x1000
#define M_LASTFRAG 0x2000
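Each flag above occupies a distinct bit, so several can be combined in one flag word and tested with bitwise operations. A minimal sketch, assuming the flag word is a plain `int`; the helper names `is_bcast_or_mcast` and `mark_pkthdr` are hypothetical.

```c
/* Flag values copied from the list above. */
#define M_EXT     0x0001
#define M_PKTHDR  0x0002
#define M_BCAST   0x0200
#define M_MCAST   0x0400

/* Nonzero if the flag word marks a broadcast or multicast packet. */
int is_bcast_or_mcast(int m_flags)
{
    return (m_flags & (M_BCAST | M_MCAST)) != 0;
}

/* Returns the flag word with M_PKTHDR set; other bits are left unchanged. */
int mark_pkthdr(int m_flags)
{
    return m_flags | M_PKTHDR;
}
```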
struct mbuf {
    struct m_hdr m_hdr;
    union {
        struct {
            struct pkthdr MH_pkthdr;
            union {
                struct m_ext MH_ext;
                char MH_databuf[MHLEN];
            } MH_dat;
        } MH;
        char M_databuf[MLEN];
    } M_dat;
};
[Figure: structure layout annotated “24 bytes”]
IPsec_IN_DONE, IPsec_OUT_DONE, IPsec_IN_CRYPTO_DONE, IPsec_OUT_CRYPTO_DONE
mbuf size • Current: 256 bytes • Old: 128 bytes (shown in the following slides)
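How much data fits inside one mbuf follows from the total size minus the headers. The sketch below mirrors the nested-union layout in user space. The `_sketch` structures and their field lists are simplified assumptions (for example, `m_ext` is omitted), so the resulting numbers depend on the platform and are not the kernel's actual values.

```c
#include <stddef.h>

#define MSIZE 256                      /* current mbuf size from the slide */

struct mbuf_sketch;                    /* forward declaration */

/* Simplified header; the field list is an assumption for illustration. */
struct m_hdr_sketch {
    struct mbuf_sketch *mh_next;       /* next mbuf of the same packet */
    struct mbuf_sketch *mh_nextpkt;    /* next packet in the queue */
    char               *mh_data;       /* start of valid data */
    int                 mh_len;        /* bytes of data in this mbuf */
    int                 mh_flags;
    short               mh_type;
};

/* Simplified packet header, present only in the first mbuf of a packet. */
struct pkthdr_sketch {
    void *rcvif;                       /* receiving interface */
    int   len;                         /* total packet length */
};

/* Data space without and with a packet header. */
#define MLEN  (MSIZE - sizeof(struct m_hdr_sketch))
#define MHLEN (MLEN - sizeof(struct pkthdr_sketch))

struct mbuf_sketch {
    struct m_hdr_sketch m_hdr;
    union {
        struct {
            struct pkthdr_sketch MH_pkthdr;
            char MH_databuf[MHLEN];
        } MH;
        char M_databuf[MLEN];
    } M_dat;
};

/* On common 32- and 64-bit ABIs the whole structure is exactly MSIZE bytes. */
_Static_assert(sizeof(struct mbuf_sketch) == MSIZE, "mbuf sketch must be MSIZE bytes");
```

Because both data areas share one union, a packet header costs data space rather than extra memory: `MHLEN` is smaller than `MLEN` by exactly the size of the packet header.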
A Typical UDP Packet
m_devget: When an IP packet comes in…
mtod & dtom • mtod: mbuf pointer → pointer to its data region, cast to a type such as struct ip * • dtom: pointer into the data region → pointer to the enclosing mbuf • #define mtod(m, t) ((t)((m)->m_data)) • #define dtom(x) ((struct mbuf *)((uintptr_t)(x) & ~(uintptr_t)(MSIZE - 1))) • dtom relies on mbufs being MSIZE-aligned and works only for data stored inside the mbuf itself
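The address-masking trick behind dtom can be tried in user space. This is a sketch under stated assumptions: `struct toy_mbuf` is a hypothetical cut-down mbuf, and C11 `aligned_alloc` stands in for the kernel allocator to guarantee MSIZE alignment.

```c
#include <stdint.h>
#include <stdlib.h>

#define MSIZE 256

/* Hypothetical cut-down mbuf; sized to fit within MSIZE bytes. */
struct toy_mbuf {
    char *m_data;                  /* points into m_dat[] */
    int   m_len;
    char  m_dat[MSIZE - 16];
};

_Static_assert(sizeof(struct toy_mbuf) <= MSIZE, "toy mbuf must fit in MSIZE");

/* mbuf pointer to typed data pointer. */
#define mtod(m, t) ((t)((m)->m_data))

/* Data pointer back to the enclosing mbuf: clear the low address bits.
 * Valid only because every toy_mbuf starts on an MSIZE boundary. */
#define dtom(x) ((struct toy_mbuf *)((uintptr_t)(x) & ~(uintptr_t)(MSIZE - 1)))

/* Returns 0 if dtom recovers the mbuf from pointers into its data area. */
int mtod_dtom_roundtrip(void)
{
    struct toy_mbuf *m = aligned_alloc(MSIZE, MSIZE);
    if (m == NULL)
        return -1;
    m->m_data = m->m_dat + 40;     /* pretend 40 bytes of headroom */
    m->m_len = 20;

    unsigned char *p = mtod(m, unsigned char *);
    int ok = (dtom(p) == m) && (dtom(p + m->m_len - 1) == m);

    free(m);
    return ok ? 0 : 1;
}
```

The mask only works when the pointer lies inside the MSIZE-aligned block itself, which is why data held in external storage cannot be mapped back this way.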
netstat -m • Displays mbuf usage statistics
mbuf • IP input/output/forward • IPsec • IP fragmentation/defragmentation • Device IP Socket
Memory Management for IPC • Why do we need something like MBUF?
I/O Architecture • [Figure: CPU and memory connected over the data and I/O buses and the control bus to a device controller (with an internal buffer) and its I/O device; an IRQ line carries interrupts. Operations shown: initialization, input, output, configuration, interrupt]
Direct Memory Access • Used to avoid programmed I/O for large data movement • Requires DMA controller • Bypasses CPU to transfer data directly between I/O device and memory
DMA Requests • Disk address to start copying • Destination memory address • Number of bytes to copy
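The three fields of a DMA request map naturally onto a small descriptor structure. The sketch below simulates the transfer with `memcpy` over an in-memory array standing in for the disk. `struct dma_request` and `dma_simulate` are hypothetical names, and a real controller is programmed through device registers rather than a function call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The three pieces of information a DMA request carries. */
struct dma_request {
    uint64_t disk_addr;   /* offset on the device to start copying from */
    void    *mem_addr;    /* destination memory address */
    size_t   nbytes;      /* number of bytes to copy */
};

/* Simulated controller: copies from the fake disk without "CPU" involvement.
 * Returns 0 on success, -1 if the request runs past the end of the disk. */
int dma_simulate(const unsigned char *disk, size_t disk_size,
                 const struct dma_request *req)
{
    if (req->disk_addr > disk_size ||
        req->nbytes > disk_size - req->disk_addr)
        return -1;
    memcpy(req->mem_addr, disk + req->disk_addr, req->nbytes);
    return 0;
}
```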
Is DMA a good idea? • CPUs are now much faster • Controllers/devices have larger internal buffers • DMA might be much slower than the CPU • Controllers are becoming more and more intelligent • USB doesn’t have DMA.
Network Processor
File System Mounting • A file system must be mounted before it can be accessed. • An unmounted file system is mounted at a mount point.
Mount Point
[Figure: logical disks. fs0 (/dev/hd0a) holds the root tree /, with usr, sys, dev, etc, bin; fs1 (/dev/hd0e) holds a tree with local, adm, home, lib, bin] • mount -t ufs /dev/hd0e /usr • mount -t nfs 152.1.23.12:/export/cdrom /mnt/cdrom
Distributed FS • Distributed File System • NFS (Network File System) • AFS (Andrew File System) • Coda
Distributed FS • [Figure: two hosts, ftp.cs.ucdavis.edu (fs0: /dev/hd0a) and Server.yahoo.com (fs0: /dev/hd0e), with directory trees / (usr, sys, dev, etc, bin) and / (local, adm, home, lib, bin)]
Distributed File System • Transparency and Location Independence • Reliability and Crash Recovery • Scalability and Efficiency • Correctness and Consistency • Security and Safety
Correctness • One-copy Unix semantics • Every modification to every byte of a file has to be immediately and permanently visible to every client. • Conceptually, file system access is sequential • Makes sense in a local file system • Single processor versus shared memory • Is this necessary?
DFS Architecture • Server: stores the distributed/shared files and provides an access interface for the clients • Client: consumes the files and runs applications in a distributed environment • [Figure: applications call open, close, read, write, opendir, stat, readdir]
NFS (Sun, 1985) • Based on RPC (Remote Procedure Call) and XDR (External Data Representation) • Server maintains no state • a READ on the server opens, seeks, reads, and closes • a WRITE is similar, but the buffer is flushed to disk before closing • Server crash: client keeps retrying until the server reboots; no loss • Client crash: client must rebuild its own state; no effect on server
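The stateless READ described above can be sketched with ordinary POSIX calls: each request names the file, offset, and count, so the server keeps nothing between calls. `stateless_read` is a hypothetical handler, and identifying the file by path is a simplification assumed here for illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Serve one READ request: open, seek, read, close. No state survives the
 * call, so a server reboot between two requests is invisible to the client.
 * Returns the number of bytes read, or -1 on error. */
ssize_t stateless_read(const char *path, off_t offset, void *buf, size_t count)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = -1;
    if (lseek(fd, offset, SEEK_SET) != (off_t)-1)
        n = read(fd, buf, count);

    close(fd);
    return n;
}
```

Because the request is self-describing, repeating it after a timeout returns the same data, which is what lets the client simply retry until the server is back.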
RPC - XDR • RPC: standard protocol for calling procedures on another machine • The procedure call is packaged with authorization and admin info • XDR: standard format for data, because computer manufacturers cannot agree on byte ordering.
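The byte-ordering problem can be made concrete with a 32-bit integer. XDR encodes it as four bytes, most significant byte first, regardless of the host's native order. The functions below are a hand-written sketch with hypothetical names, not the XDR library routines.

```c
#include <stdint.h>

/* Encode a 32-bit unsigned integer in XDR order (big-endian, 4 bytes). */
void sketch_xdr_put_u32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Decode 4 bytes in XDR order back into a host integer. */
uint32_t sketch_xdr_get_u32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

Shifting instead of copying raw memory makes the encoding independent of whether the host is little- or big-endian.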
[Figure: rpcgen takes the data structure definitions and the RPC program description and generates RPC client.c, RPC.h, and RPC server.c]