Naming in Distributed System Presented by Faraz Rasheed & Uzair Ahmed faraz@oslab.khu.ac.kr RealTime & Multimedia Lab Kyung Hee University, Korea
Contents • Naming Entities • Names, Identifiers and Addresses • Name Spaces • Name Resolution • Closure Mechanism • Linking and Mounting • Implementation of Name Space • Implementation of Resolution • Conclusion
Why is naming important? • Names are used to • Share resources • Uniquely identify entities • Refer to locations, and so on… • Name resolution allows a process to access the named entity
Naming Entities • Name: a string of characters used to refer to an entity • An entity in a DS can be anything, e.g., hosts, printers, disks, files, mailboxes, web pages, etc. • Access point: a special kind of entity through which another entity is accessed • Address: the name of an access point • The access points of an entity may change
Identifiers and True Identifiers • We need a single name for an entity that is independent of the address of that entity, i.e., location independent • Identifier: a name that uniquely identifies an entity • A true identifier has three properties • It refers to at most one entity • Each entity is referred to by at most one identifier • It always refers to the same entity, i.e., it is never reused • These properties are what differentiate an identifier from an address
Name Space • Names in a DS are organized into name spaces • A name space is represented as a labeled, directed graph • Leaf node: has no outgoing edges • Directory node: has a number of labeled outgoing edges • It stores a directory table containing an entry for each outgoing edge as a pair (edge label, node identifier) • Root node: has only outgoing edges and no incoming edges • Path name: a sequence of edge labels • Absolute path: the first node in the path name is the root • Relative path: the path name starts at some other node
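The directory-table idea can be sketched in a few lines of Python. This is only an illustration under assumed data: the node identifiers (`n0` … `n5`), the labels, and the `resolve` helper are hypothetical and do not come from the slides.

```python
# Hypothetical name space: each directory node owns a directory table
# that maps an edge label to the identifier of the node the edge points to.
DIRECTORY_TABLES = {
    "n0": {"home": "n1", "keys": "n5"},    # root node
    "n1": {"faraz": "n2", "uzair": "n3"},  # directory node
    "n2": {"mbox": "n4"},                  # directory node
}
ROOT = "n0"  # n3, n4 and n5 have no table, so they are leaf nodes


def resolve(path, start=ROOT):
    """Follow the labels of a path name and return the node identifier reached.

    A path starting with "/" is absolute (resolution starts at the root);
    any other path is relative to the node given as `start`.
    """
    node = ROOT if path.startswith("/") else start
    for label in path.strip("/").split("/"):
        if not label:  # empty path such as "/"
            continue
        table = DIRECTORY_TABLES.get(node)
        if table is None:
            raise LookupError(f"{node} is a leaf node, cannot follow {label!r}")
        if label not in table:
            raise LookupError(f"no edge labelled {label!r} leaves {node}")
        node = table[label]
    return node
```

For example, `resolve("/home/faraz/mbox")` and `resolve("faraz/mbox", start="n1")` reach the same node, one through an absolute and one through a relative path name.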
Name Resolution • The process of looking up a name • Closure mechanism: knowing how and where to start name resolution • Mounting: a transparent way to resolve names across different name spaces • Mounted file system: lets a directory node store the identifier of a directory node from a different (foreign) name space • Mount point: the directory node storing the node identifier • Mounting point: the directory node in the foreign name space • Normally the mounting point is the root of the foreign name space
Mounted File System • During resolution, the mounting point is looked up and resolution proceeds by accessing its directory table • Mounting requires at least • The name of an access protocol (for communication) • The name of the server (resolved to an address) • The name of the mounting point in the foreign name space (resolved to a node identifier in the foreign NS) • Each of these names needs to be resolved • The three names can be represented as a URL, e.g., nfs://oslab.khu.ac.kr/home/faraz
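As a small illustration, the three names can be pulled out of such a URL with Python's standard `urllib.parse` module. The function name `mount_names` and the dictionary keys are made up for this sketch; each returned name would still have to be resolved separately.

```python
from urllib.parse import urlsplit


def mount_names(url):
    """Split a mount URL into the three names that mounting needs."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,       # name of the access protocol
        "server": parts.hostname,       # name of the server
        "mounting_point": parts.path,   # name of the mounting point
    }


names = mount_names("nfs://oslab.khu.ac.kr/home/faraz")
```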
Global Name Service (GNS) • Another way to merge different name spaces • Mechanism: add a new root node and make the existing root nodes its children • Problem • Existing names need to be changed, e.g., home/faraz becomes people/home/faraz • The expansion is generally hidden from the user • Has a significant performance overhead when merging 100s or 1000s of name spaces
Implementation of Name Space • For a large-scale DS, name spaces are organized hierarchically • Name spaces are partitioned into three logical layers • Global Layer: formed by the highest-level nodes • Administrational Layer: formed by directory nodes managed within a single organization • Managerial Layer: formed by nodes that may typically change regularly
Implementation of Name Resolution • Assumptions • No replication of name servers • No client-side caching • Each client has access to a local name server • Two possible implementations • Iterative Name Resolution • A server resolves the path name as far as it can and returns the intermediate result to the client, which then contacts the next name server itself • Recursive Name Resolution • Instead of returning the intermediate result to the client, a name server passes it to the next name server it finds
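The difference between the two schemes shows up in who sends the requests. The following is a toy simulation, not a real resolver: the server names, the single-label zones, and the placeholder address are all assumptions, and only request messages are logged (replies are left out).

```python
# Hypothetical name servers: each resolves one label to either the next
# name server or, at the end of the path, a placeholder address.
SERVERS = {
    "root": {"kr": "ns-kr"},
    "ns-kr": {"ac": "ns-ac"},
    "ns-ac": {"khu": "ns-khu"},
    "ns-khu": {"oslab": "addr:oslab-host"},
}


def resolve_iterative(labels):
    """The client contacts every name server itself."""
    messages = []
    server = "root"
    result = None
    for label in labels:
        messages.append(("client", server))   # request sent by the client
        result = SERVERS[server][label]        # intermediate result goes back
        server = result
    return result, messages


def resolve_recursive(labels):
    """The client contacts only the root; servers forward the request."""
    messages = [("client", "root")]
    server = "root"
    result = None
    for label in labels:
        result = SERVERS[server][label]
        if result in SERVERS:                  # another server must be asked
            messages.append((server, result))  # request sent by a server
            server = result
    return result, messages


path = ["kr", "ac", "khu", "oslab"]  # labels listed from the root downward
addr_it, log_it = resolve_iterative(path)
addr_rec, log_rec = resolve_recursive(path)
client_requests_it = sum(1 for sender, _ in log_it if sender == "client")
client_requests_rec = sum(1 for sender, _ in log_rec if sender == "client")
```

Both functions return the same address, but in the iterative case every request originates at the client, while in the recursive case the client sends a single request and the name servers do the rest of the work.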
Iterative Name Resolution • Advantage • Less burden on each name server • Disadvantage • Higher communication cost
Recursive Name Resolution • Advantages • Caching of results is more effective • Reduced communication cost • Disadvantage • Demands higher performance from each name server
Domain Name System (DNS) • An example implementation of name resolution • Primarily used for looking up host addresses and mail servers • The DNS name space is hierarchically organized as a rooted tree • A label is a case-insensitive string with a max. length of 63 characters • The max. length of a complete path name is 255 characters • The root is represented by a dot • We generally omit this dot for readability
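The length limits and the optional trailing root dot can be checked with a short helper. This is a simplified sketch: the function names are invented here, the limits are applied to the textual form of the name as stated on the slide, and labels are compared without regard to case, as DNS does.

```python
MAX_LABEL_LENGTH = 63   # per label
MAX_NAME_LENGTH = 255   # complete path name


def normalize(name):
    """Validate a DNS path name and return its labels in lower case."""
    if name.endswith("."):        # drop the dot that represents the root
        name = name[:-1]
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError("path name longer than 255 characters")
    labels = name.split(".")
    for label in labels:
        if not 1 <= len(label) <= MAX_LABEL_LENGTH:
            raise ValueError(f"invalid label length: {label!r}")
    return [label.lower() for label in labels]


def same_name(a, b):
    """Two names are equal if their labels match, ignoring case and the root dot."""
    return normalize(a) == normalize(b)
```

With this helper, `same_name("FTP.cs.vu.nl.", "ftp.cs.vu.nl")` holds, while a name containing a 64-character label is rejected.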
Naming versus Locating Entities • Entities are named for lookup and subsequent access • Human-friendly names • Identifiers • Addresses • Virtually all naming systems maintain a mapping from human-friendly names to addresses • Partitioning of the name space • Global Level • Administrational Level • Managerial Level
Naming versus Locating Entities • [Figure: name space diagrams with the node labels cs.vu.nl, ftp.cs.vu.nl, ftp.abc.cs.vu.nl and ftp.khu.ac.kr]
Naming versus Locating Entities • Possible Solutions • Record the address of the new machine • The lookup operation keeps working • Another database update is required if the address changes again • Record the name of the new machine • Less efficient • First find the name of the new machine • Then look up the address associated with that name • This adds a step to the lookup operation • For highly mobile entities, it only gets worse
Naming versus Locating Entities • Direct, single-level mapping between names and addresses. • Two-level mapping using identifiers.
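A two-level mapping can be pictured as two separate tables: a naming service from names to identifiers and a location service from identifiers to addresses. The identifier `id-42` and the addresses below are placeholders chosen for this sketch.

```python
# Naming service: human-friendly name -> identifier (rarely changes).
NAME_TO_ID = {"ftp.cs.vu.nl": "id-42"}

# Location service: identifier -> current address (changes on every move).
ID_TO_ADDRESS = {"id-42": "addr-A"}


def lookup(name):
    """Resolve a name in two steps: name -> identifier -> address."""
    return ID_TO_ADDRESS[NAME_TO_ID[name]]


def move(identifier, new_address):
    """A move touches only the location service, never the name."""
    ID_TO_ADDRESS[identifier] = new_address


before = lookup("ftp.cs.vu.nl")
move("id-42", "addr-B")
after = lookup("ftp.cs.vu.nl")
```

The name-to-identifier entry is left untouched by the move, which is the point of introducing the identifier level.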
Simple solutions: Broadcasting and multicasting • A location service accepts an identifier as input and returns the current address of the identified entity. • Simple solutions exist that work in a local-area network. • The Address Resolution Protocol (ARP) uses broadcasting to map the IP address of a machine to its data-link address. • Multicasting can be used to locate entities in point-to-point networks (such as the Internet). • Each multicast address can be associated with multiple replicated entities.
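The broadcasting idea behind ARP can be mimicked in memory: a request is delivered to every host, and only the host that owns the requested IP address answers. This is a toy model, not the ARP packet format; the `Host` class, the addresses, and the data-link addresses are made up.

```python
class Host:
    """A machine on a hypothetical local-area network."""

    def __init__(self, ip, link_address):
        self.ip = ip
        self.link_address = link_address

    def on_broadcast(self, wanted_ip):
        # Every host sees the request; only the owner of the IP replies.
        return self.link_address if self.ip == wanted_ip else None


def locate(hosts, wanted_ip):
    """Broadcast a request and return the first reply, or None."""
    for host in hosts:
        reply = host.on_broadcast(wanted_ip)
        if reply is not None:
            return reply
    return None


lan = [
    Host("192.0.2.1", "02:00:00:00:00:01"),
    Host("192.0.2.2", "02:00:00:00:00:02"),
]
found = locate(lan, "192.0.2.2")
missing = locate(lan, "192.0.2.9")
```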
Forwarding Pointers (1) • The principle of forwarding pointers using (proxy, skeleton) pairs.
Forwarding Pointers (2) • Redirecting a forwarding pointer, by storing a shortcut in a proxy.
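A rough sketch of the mechanism, with plain Python objects standing in for the (proxy, skeleton) pairs: each old location keeps a pointer to the next one, and the proxy stores a shortcut after the first invocation. Class and method names are illustrative only.

```python
class Location:
    """A place where the object lives or used to live."""

    def __init__(self, name):
        self.name = name
        self.forward = None  # forwarding pointer left behind after a move


def move(current, new_name):
    """Move the object and leave a forwarding pointer at the old location."""
    new_location = Location(new_name)
    current.forward = new_location
    return new_location


class Proxy:
    """Client-side reference that follows the chain and then shortcuts it."""

    def __init__(self, target):
        self.target = target

    def invoke(self):
        hops = 0
        location = self.target
        while location.forward is not None:  # follow the chain
            location = location.forward
            hops += 1
        self.target = location               # store the shortcut
        return location.name, hops


a = Location("A")
b = move(a, "B")
c = move(b, "C")
proxy = Proxy(a)
first = proxy.invoke()
second = proxy.invoke()
```

The first invocation follows two forwarding pointers; the second goes straight to the current location because the proxy was redirected.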
Home-Based Approaches • Example: The principle of Mobile IP. (Perkins, 1997)
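The home-based idea can be sketched as a home agent that remembers the current care-of address of a mobile host and forwards traffic to it. This is a simplified model of the principle, not the Mobile IP protocol itself; the class, its methods, and the addresses are assumptions.

```python
class HomeAgent:
    """Keeps track of where a mobile host currently is."""

    def __init__(self, home_address):
        self.home_address = home_address
        self.care_of_address = None  # set while the host is away from home

    def register(self, care_of_address):
        """Called by the mobile host after it moves to another network."""
        self.care_of_address = care_of_address

    def deregister(self):
        """Called when the host returns to its home network."""
        self.care_of_address = None

    def route(self, packet):
        """Return (destination, packet); forward if the host is away."""
        if self.care_of_address is None:
            return self.home_address, packet
        return self.care_of_address, packet


agent = HomeAgent("home-addr")
at_home = agent.route("hello")
agent.register("care-of-addr")
away = agent.route("hello")
```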
Hierarchical Approaches (1) • Hierarchical organization of a location service into domains, each having an associated directory node.
Hierarchical Approaches (2) • An example of storing information of an entity having two addresses in different leaf domains.
Hierarchical Approaches (3) • Looking up a location in a hierarchically organized location service.
Hierarchical Approaches (4) • An insert request is forwarded to the first node that knows about entity E. • A chain of forwarding pointers to the leaf node is created.
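Insert and lookup in such a hierarchy can be sketched with a small tree of directory nodes. The sketch assumes that a leaf stores addresses and every other node stores pointers to children; the `DirNode` class, the domain names, and the addresses are hypothetical.

```python
class DirNode:
    """Directory node of a domain; records map an entity to a list of
    child nodes (pointers) or, in a leaf domain, to addresses."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.records = {}


def insert(leaf, entity, address):
    """Store the address in the leaf, then forward the request upward until
    a node that already knows the entity (or the root) is reached,
    creating the chain of pointers on the way."""
    leaf.records.setdefault(entity, []).append(address)
    child, node = leaf, leaf.parent
    while node is not None:
        already_known = entity in node.records
        node.records.setdefault(entity, []).append(child)
        if already_known:
            break
        child, node = node, node.parent


def lookup(leaf, entity):
    """Go up until a node knows the entity, then follow pointers down."""
    node = leaf
    while node is not None and entity not in node.records:
        node = node.parent
    if node is None:
        return None
    target = node.records[entity][0]
    while isinstance(target, DirNode):
        target = target.records[entity][0]
    return target


root = DirNode("root")
d1, d2 = DirNode("D1", root), DirNode("D2", root)
leaf1, leaf2 = DirNode("L1", d1), DirNode("L2", d2)
insert(leaf1, "E", "addr1")
remote_lookup = lookup(leaf2, "E")   # climbs to the root, then descends
insert(leaf2, "E", "addr2")          # second address in another leaf domain
local_lookup = lookup(leaf2, "E")    # now answered in the local leaf domain
```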
Pointer Caches (1) • Caching a reference to a directory node of the lowest-level domain in which an entity will reside most of the time.
Pointer Caches (2) • A cache entry that needs to be invalidated because it returns a nonlocal address while a local address is available.
Scalability Issues • The scalability issues related to uniformly placing subnodes of a partitioned root node across the network covered by a location service.
The Problem of Unreferenced Objects • An example of a graph representing objects containing references to each other.
Reference Counting (1) • The problem of maintaining a proper reference count in the presence of unreliable communication.
Reference Counting (2) • Copying a reference to another process and incrementing the counter too late • A solution.
Advanced Reference Counting (1) • The initial assignment of weights in weighted reference counting • Weight assignment when creating a new reference.
Advanced Reference Counting (2) • Weight assignment when copying a reference.
Advanced Reference Counting (3) • Creating an indirection when the partial weight of a reference has reached 1.
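Weighted reference counting can be sketched as follows. The sketch makes its own bookkeeping assumption: the skeleton's total weight always equals its own partial weight plus the weights held by all proxies, so no remote reference is left once the two skeleton values are equal. The initial weight of 128 and the class names are illustrative, and the indirection step is not implemented; the sketch simply refuses to split a weight of 1.

```python
class Skeleton:
    """Server-side stub holding the total and the remaining partial weight."""

    def __init__(self, total=128):
        self.total = total
        self.partial = total

    def create_reference(self):
        if self.partial < 2:
            raise ValueError("partial weight exhausted")
        half = self.partial // 2
        self.partial -= half          # hand half of the weight to the proxy
        return Proxy(self, half)

    def decrement(self, weight):
        self.total -= weight          # a deleted reference returns its weight

    def unreferenced(self):
        return self.total == self.partial


class Proxy:
    """Remote reference carrying a partial weight."""

    def __init__(self, skeleton, weight):
        self.skeleton = skeleton
        self.weight = weight

    def copy(self):
        if self.weight < 2:
            # The slides solve this case with an indirection (not shown).
            raise ValueError("weight is 1, cannot be split further")
        half = self.weight // 2
        self.weight -= half           # no message to the skeleton is needed
        return Proxy(self.skeleton, half)

    def delete(self):
        self.skeleton.decrement(self.weight)
        self.weight = 0


skeleton = Skeleton()
p1 = skeleton.create_reference()      # skeleton keeps 64, p1 gets 64
p2 = p1.copy()                        # p1 keeps 32, p2 gets 32
still_referenced = not skeleton.unreferenced()
p1.delete()
p2.delete()
```

Copying a reference only splits the weight of the copied proxy, so the skeleton is contacted when a reference is created or deleted but not when it is copied.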
Advanced Reference Counting (4) • Creating and copying a remote reference in generation reference counting.
Reference Listing (1) • The skeleton keeps track of its proxies • Instead of counting them, it maintains an explicit list of references • Adding a proxy that is already in the list, or removing one that is no longer in it, has no effect • Idempotent operations • Repeatable without affecting the end result • Increment/decrement operations are clearly not idempotent
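The contrast between the two schemes shows up as soon as a message is delivered twice. In this sketch the class names and the proxy identifier are made up; a Python `set` plays the role of the reference list.

```python
class CountingSkeleton:
    """Reference counting: increment is not idempotent."""

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1


class ListingSkeleton:
    """Reference listing: adding and removing a proxy are idempotent."""

    def __init__(self):
        self.proxies = set()

    def add(self, proxy_id):
        self.proxies.add(proxy_id)

    def remove(self, proxy_id):
        self.proxies.discard(proxy_id)


counting, listing = CountingSkeleton(), ListingSkeleton()
for _ in range(2):            # the same "new reference" message arrives twice
    counting.increment()
    listing.add("proxy-1")
```

After the duplicate delivery the counter wrongly reports two references, while the list still contains exactly one proxy.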
Reference Listing (2) • Advantages • Does not require reliable communication • Duplicate messages need not be detected • Only insertions and deletions have to be acknowledged • Easier to keep the system consistent in case of process failures • Drawback • Scales badly • Solution • Leasing
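Leasing can be sketched as a reference list whose entries expire unless the proxy renews them. The class, the lease length, and the explicit `now` parameter (used instead of a real clock to keep the example deterministic) are assumptions of this sketch.

```python
class LeasedReferenceList:
    """Reference list in which every entry is valid only for a limited time."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.expiry = {}  # proxy id -> time at which its lease runs out

    def renew(self, proxy_id, now):
        """Add a proxy or extend its lease; safe to repeat."""
        self.expiry[proxy_id] = now + self.lease_seconds

    def purge(self, now):
        """Drop every proxy whose lease has run out."""
        expired = [p for p, t in self.expiry.items() if t <= now]
        for proxy_id in expired:
            del self.expiry[proxy_id]

    def live(self, now):
        self.purge(now)
        return set(self.expiry)


refs = LeasedReferenceList(lease_seconds=10)
refs.renew("proxy-1", now=0)
refs.renew("proxy-2", now=0)
refs.renew("proxy-1", now=8)   # proxy-1 renews, proxy-2 stays silent
alive = refs.live(now=12)
```

A proxy whose process has crashed simply stops renewing, so its entry disappears without any explicit removal message.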
Identifying Unreachable Entities • Trace-based garbage collection • Scalability problems • Naïve tracing • Mark-and-sweep collectors • White, grey and black marks • Drawbacks • The reachability graph needs to remain the same during both phases • No process can run while the GC is running
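A single-process version of mark-and-sweep with the three colours might look as follows. The object graph is a plain dictionary invented for this sketch; a distributed collector would additionally need the graph to stay unchanged while it runs, as noted above.

```python
def mark_and_sweep(graph, roots):
    """Return the set of objects that are unreachable from the roots.

    graph maps each object to the list of objects it references.
    White = not yet seen, grey = seen but references not yet followed,
    black = seen and all references followed.
    """
    colour = {obj: "white" for obj in graph}
    grey = []
    for root in roots:
        colour[root] = "grey"
        grey.append(root)

    # Mark phase: follow references until no grey object is left.
    while grey:
        obj = grey.pop()
        for ref in graph[obj]:
            if colour[ref] == "white":
                colour[ref] = "grey"
                grey.append(ref)
        colour[obj] = "black"

    # Sweep phase: everything still white is garbage.
    return {obj for obj, c in colour.items() if c == "white"}


objects = {
    "root": ["a"],
    "a": ["b"],
    "b": [],
    "c": ["d"],   # c and d reference each other but are unreachable
    "d": ["c"],
}
garbage = mark_and_sweep(objects, roots=["root"])
```

The cycle formed by `c` and `d` is collected, which plain reference counting would not achieve because each of the two objects still holds a reference to the other.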
Tracing in Groups (1) • Initial marking of skeletons.
Tracing in Groups (2) • After local propagation in each process.
Tracing in Groups (3) • Final marking.
Conclusion • Naming, the organization of names and name resolution are key issues in any distributed system • Locating entities is an open research issue. There are a few methods, such as forwarding pointers, hierarchical approaches, home-based approaches and pointer caches, but each has its own shortcomings • Reference counting, advanced reference counting and reference listing are a few methods that can be used to deal with unreferenced objects
All is well that ends well! • Thank you all • Questions / Comments?