Tony Kombol NETWORKED FILE SYSTEMS
Networked File Systems
• Allows a server to act as persistent storage for one or more clients connected over a network
• Usually presented in the same manner as a local disk
• First developed in the 1970s
• Network File System (NFS)
  • Created in 1985 by Sun Microsystems
  • First widely deployed networked file system
NFS – A Brief History
• NFSv1, ca. 1985
  • Internal release only
• NFSv2, RFC 1094, ca. 1989
  • UDP only
  • Completely stateless
  • Locking, quotas, etc. outside protocol
    • Handled by extra RPC daemons
NFS – A Brief History
• NFSv3, RFC 1813, ca. 1995
  • 64-bit file sizes and offsets
  • Asynchronous writes
  • File attributes included with other responses
  • TCP support
NFS – A Brief History
• NFSv4
  • RFC 3010, ca. 2000
  • RFC 3530, ca. 2003
  • Protocol development handed to IETF
  • Performance improvements
  • Mandated security (Kerberos)
  • Stateful protocol
NFS – Extensions
• Network Lock Manager
  • Supports System V style file locking APIs
• Remote Quota Reporting
  • Allows users to view their storage quotas
NFS – Sun RPC
• Allows execution of functions on one machine from another
• Written in 1984 by Sun
• Renamed to Open Network Computing Remote Procedure Call (ONC RPC)
• RPCs register with an RPC ‘port mapper’
• Most UNIX variants have implementations
• Windows Services for UNIX has an implementation
NFS – Quirks
• Mapping users for access control not provided by NFS
• Central user management recommended
  • Network Information Service (NIS)
    • Previously called Yellow Pages
    • Designed by Sun Microsystems
    • Created in conjunction with NFS
  • LDAP + Kerberos is a modern alternative
NFS – Quirks
• Design requires trusted clients (other computers)
  • Read/write access traditionally given to IP addresses
  • Up to client to honor permissions and enforce access control
• RPC and the port mapper are notoriously hard to secure
  • Designed to execute functions on the remote server
  • Hard to firewall
    • An RPC is registered with the port mapper and assigned a random port
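The firewall problem can be illustrated with a toy model of the port mapper. This is a minimal sketch, not the real ONC RPC wire protocol: the class ToyPortMapper and its methods are hypothetical names invented for illustration. The program numbers 100003 (NFS) and 100005 (mountd) and port 111 for the port mapper are the conventional assignments. Because a client has to ask the mapper where a service lives, a firewall cannot know the service port ahead of time.

```python
import random

PORTMAP_PORT = 111  # well-known port of the port mapper itself


class ToyPortMapper:
    """Toy in-memory stand-in for an RPC port mapper (hypothetical API)."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._table = {}  # (program, version) -> port

    def register(self, program, version):
        # The service is handed an arbitrary unprivileged port not already in use.
        used = set(self._table.values())
        while True:
            port = self._rng.randint(1024, 65535)
            if port not in used:
                break
        self._table[(program, version)] = port
        return port

    def lookup(self, program, version):
        # Clients call this first; 0 means "not registered".
        return self._table.get((program, version), 0)


mapper = ToyPortMapper()
mountd_port = mapper.register(100005, 3)          # mountd registers itself
print(mapper.lookup(100005, 3) == mountd_port)    # client discovers the port
print(mapper.lookup(100003, 3))                   # NFS was never registered here
```

Only port 111 is fixed in this model; every other port can differ from one run to the next, which is why static firewall rules are awkward.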
NFS – Quirks
• NFSv4 solves most of the quirks
  • Kerberos can be used to validate identity
  • Validated identity prevents rogue clients from reading or writing data
• RPC is still annoying
Similar Protocols
• Server Message Block (SMB)
  • aka Common Internet File System (CIFS)
  • Developed by IBM
  • “Embraced and Extended” by Microsoft
  • Originally ran over NetBIOS protocol
• Apple Filing Protocol (AFP)
  • Originally ran over AppleTalk protocol
  • Mostly obsolete
Distributed File Systems
• Universal Paths
  • Path to a resource is always the same no matter where you are
• Transparent to clients
  • View is one file system
  • Physical location of data is abstracted
Distributed File Systems
• Ability to relocate volumes while online
• Replication of volumes
  • Most support read-only replicates
  • Some support read-write replicates
  • Allows load-balancing between servers
• Partial access in failure situations
  • Only volumes on a failed server unavailable
  • Some support for turning a read-only replicate into a read-write master
• Often support file caching
  • Some support offline access
Distributed File Systems – Examples
• Microsoft DFS
  • Suite of technologies
  • Uses SMB as underlying protocol
• Andrew File System (AFS)
  • Developed at Carnegie Mellon
  • Used at: NASA, JPL, CERN, Morgan Stanley
AFS – History
• 1983: Andrew Project began at Carnegie Mellon
• 1988: AFSv3
  • Installations of AFS outside Carnegie Mellon
• 1989: Transarc founded to commercialize AFS
• 1994: Transarc purchased by IBM
• 2000: IBM releases code as OpenAFS
AFS – Benefits
• AFS has many benefits over traditional networked file systems
• Much better security
  • Uses Kerberos authentication
  • Authorization handled with ACLs
    • ACLs are granted to Kerberos identities
  • No ticket, no data
  • Clients do not have to be trusted
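The "no ticket, no data" rule can be sketched as a server-side check. This is a simplified illustration under stated assumptions: the ACLS table, the cell name example.org, the principals, and the function can_access are all hypothetical, and the model ignores groups and anonymous access. The rights letters (r, l, i, d, w, k, a) follow the AFS convention.

```python
# Hypothetical model: ACLs hang off directories and name Kerberos principals.
ACLS = {
    "/afs/example.org/user/alice": {
        "alice@EXAMPLE.ORG": "rlidwka",  # full rights
        "bob@EXAMPLE.ORG": "rl",         # read and lookup only
    },
}


def can_access(principal, directory, right):
    """Return True if the authenticated principal holds `right` on `directory`."""
    if principal is None:  # no Kerberos ticket, no data
        return False
    return right in ACLS.get(directory, {}).get(principal, "")


home = "/afs/example.org/user/alice"
print(can_access("alice@EXAMPLE.ORG", home, "w"))
print(can_access("bob@EXAMPLE.ORG", home, "w"))
print(can_access(None, home, "r"))
```

The decision is made by the server from the authenticated identity, so a misbehaving client cannot grant itself access the way it can under IP-based NFS exports.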
AFS – Benefits
• Scalability
  • High client to server ratio
    • 100:1 typical, 200:1 seen in production
  • Enterprise sites routinely have > 50,000 users
• Caches data on client
• Limited load balancing via read-only replicates
• Limited fault tolerance
  • Clients have limited access if a file server fails
AFS – Caching
• Read and write operations occur in file cache
• Only changes to file are sent to server on close
• Cache consistency occurs via a callback
  • Client tells server it has cached a copy
  • Server will notify client if a cached file is modified
• Callback must be re-negotiated if a time-out or error occurred
  • Does not require re-downloading the file
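The callback mechanism can be sketched in a few lines. This is a toy model, not the AFS protocol: the Server and Client classes and their method names are hypothetical, whole file contents are sent on close instead of only the changes, and callback time-outs are not modeled.

```python
class Server:
    def __init__(self):
        self.files = {}      # path -> contents
        self.callbacks = {}  # path -> set of clients holding a cached copy

    def fetch(self, client, path):
        # Handing out data also records a callback promise for this client.
        self.callbacks.setdefault(path, set()).add(client)
        return self.files[path]

    def store(self, writer, path, data):
        self.files[path] = data
        # Break callbacks: tell every other client its cached copy is stale.
        for client in self.callbacks.pop(path, set()) - {writer}:
            client.invalidate(path)
        self.callbacks[path] = {writer}


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.fetches = 0  # number of times data was pulled from the server

    def read(self, path):
        if path not in self.cache:  # cache miss, or callback was broken
            self.cache[path] = self.server.fetch(self, path)
            self.fetches += 1
        return self.cache[path]

    def write_and_close(self, path, data):
        self.cache[path] = data
        self.server.store(self, path, data)  # data goes back on close

    def invalidate(self, path):
        self.cache.pop(path, None)


server = Server()
server.files["/afs/f"] = "v1"
a, b = Client(server), Client(server)
a.read("/afs/f")
a.read("/afs/f")                  # served from cache, no server traffic
b.write_and_close("/afs/f", "v2") # server breaks a's callback
print(a.read("/afs/f"), a.fetches)
```

As long as the callback holds, repeated reads never touch the server, which is a large part of why AFS sustains high client to server ratios.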
AFS – Volumes
• Volumes are the basic unit of AFS file space
• A volume contains files, directories and mount points for other volumes
• Top volume is root.afs
  • Mounted to /afs on clients
  • Alternate is dynamic root
    • Dynamic root populates /afs with all known cells
AFS – Volumes
• Volumes can be mounted in multiple locations
• Quotas can be assigned to volumes
• Volumes can be moved between servers
• Volumes can be moved even if they are in use
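How mount points stitch volumes into one tree can be sketched as a path walk. This is an illustrative model only: the VOLUMES table, the cell name example.org, the user volumes, and the resolve function are hypothetical; root.afs is the top volume named above, and root.cell is the conventional name for a cell's top volume.

```python
# Hypothetical volumes: each entry is either file data or a mount point.
VOLUMES = {
    "root.afs": {"example.org": ("mount", "root.cell")},
    "root.cell": {"user": ("mount", "user"), "motd": ("file", "welcome")},
    "user": {"alice": ("mount", "user.alice")},
    "user.alice": {"notes.txt": ("file", "hello")},
}


def resolve(path):
    """Walk a /afs path; return (volume holding the entry, file contents or None)."""
    parts = path.strip("/").split("/")
    if parts[0] != "afs":
        raise ValueError("not an AFS path")
    volume = "root.afs"  # the volume mounted at /afs
    for name in parts[1:]:
        kind, value = VOLUMES[volume][name]
        if kind == "mount":
            volume = value        # cross into the mounted volume
        else:
            return volume, value  # reached a file
    return volume, None           # path ended on a volume root


print(resolve("/afs/example.org/user/alice/notes.txt"))
print(resolve("/afs/example.org/motd"))
```

The client sees a single path, while the lookup silently crosses volume boundaries. Since the path names only volumes and not servers, a volume can move between servers without the path changing.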
AFS – Read-Only Replicates
• Volumes can be replicated to read-only clones
• Read-only clones can be placed on multiple servers
• Clients will choose a clone to access
• If a copy becomes unavailable, client will use a different copy
• Result is simple load balancing
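Client-side replica selection with failover might look like the sketch below. It assumes a random choice among replicas, which is a simplification (a real client may rank servers differently). The function read_from_replica, the fetch callable, and the server names fs1, fs2, fs3 are hypothetical.

```python
import random


def read_from_replica(replicas, fetch, rng=random):
    """Try read-only replicas in random order until one answers."""
    candidates = list(replicas)
    rng.shuffle(candidates)  # spreads load across the available copies
    for server in candidates:
        try:
            return server, fetch(server)
        except ConnectionError:
            continue  # this copy is unavailable, try a different one
    raise ConnectionError("no read-only replica reachable")


def fake_fetch(server):
    # Stand-in for a network read; fs1 is pretending to be down.
    if server == "fs1":
        raise ConnectionError(server)
    return "data from " + server


server, data = read_from_replica(["fs1", "fs2", "fs3"], fake_fetch)
print(server, data)
```

Because every replica is read-only and identical, the client can pick any of them without coordination, which is what makes this form of load balancing simple.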
AFS – Limitations
• Whole-file locking
  • Prevents shared databases
  • Deliberate design decision
• Directory-based ACLs
  • Cannot assign an ACL to an individual file
• Complicated to set up and administer
• No commercial backer
  • Several consulting companies sell support
Cluster File Systems
• Allow multiple clients to read/write files at the same time
• Designed for speed
• Often provide shared access to the underlying file system (block-level)
• Clients can communicate with each other
  • Lock negotiation
  • Transferring of blocks
• Many provide fault tolerance
Cluster File Systems – Examples
• Lustre
  • Cluster File Systems, Inc.
• General Parallel File System (GPFS)
  • IBM
• Global File System (GFS)
  • Red Hat
NAS or SAN? SAN is not NAS backwards!
Network Attached Storage
• Server that serves files to network attached systems
• CIFS (SMB) and NFS are two example protocols a NAS may use
• Often used to refer to ‘turn-key’ solutions
• Crude analogy: formatted hard drive
  • Accessed through network
Storage Area Network
• Designed to consolidate storage space into one cohesive unit
  • Note: can also spread data!
• Pooling storage allows better fault tolerance
  • RAID across disks and arrays
  • Multiple paths between SAN and servers
  • Secondary servers can take over for failed servers
• Analogy: unformatted hard drive
SANs
• SANs normally use block-level protocols
  • Exposed to the OS as a block device
    • Same as a physical disk
  • Only one server can read/write a target block
    • Compare to NAS
  • Multiple servers can still be allocated to a SAN
• SANs should never be directly connected to another network
  • Especially not the Internet!
  • SANs typically have their own network
  • SAN protocols are designed for speed, not security
SANs – Protocols
• Fibre Channel
  • Combination interconnect and protocol
  • Bootable
• iSCSI
  • Internet Small Computer System Interface
  • Runs over TCP/IP
  • Bootable
• AoE (ATA over Ethernet)
  • No overhead from TCP/IP
  • Not routable
    • Not a bad thing!
SANs – Common Workloads
• File Servers
• Database Servers
• Virtual machine images
• Physical machine images