What is RCU, fundamentally?
Sri Ramkrishna
Intro
• RCU stands for Read-Copy Update.
• It is a synchronization method that allows reads to occur concurrently with updates.
• It supports concurrency between a single updater and multiple readers while maintaining multiple versions of objects and ensuring that they are not freed until all pre-existing reads are complete.
• It compares favorably with locking, where every access to a critical section has to take the lock, whether it comes from a reader or a writer.
Intro
• RCU is made up of three fundamental mechanisms:
  • Publish-subscribe mechanism
  • Wait for pre-existing RCU readers to complete
  • Maintain multiple versions of RCU objects
Publish Subscribe
• Allows you to read data even while it is being updated.
• Some background:
  • We've seen that there is no guarantee that reads (or writes) are performed in program order.
  • Values could be assigned or observed in the wrong order.
• A publish mechanism forces the CPU and the compiler to complete an object's initialization before the pointer assignment that makes the object reachable. In the Linux kernel this is the rcu_assign_pointer primitive.
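The publish step can be sketched in portable C11, outside the kernel. This is a minimal userspace analogy under stated assumptions, not the kernel implementation: `gp`, `struct foo`, and `publish_foo` are hypothetical names, and a C11 release store stands in for the ordering that `rcu_assign_pointer` provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct foo { int a, b, c; };

/* Global pointer that readers subscribe to (hypothetical name). */
static _Atomic(struct foo *) gp = NULL;

/* Fill in the object first, then publish it. */
void publish_foo(struct foo *p, int a, int b, int c)
{
    assert(p != NULL);
    p->a = a;
    p->b = b;
    p->c = c;
    /* Release store: the field writes above cannot be reordered after it,
     * so a reader that sees the new pointer also sees initialized fields. */
    atomic_store_explicit(&gp, p, memory_order_release);
}
```

A plain `gp = p;` would leave both the compiler and the CPU free to make the pointer visible before the fields are written, which is the misordering the publish mechanism exists to prevent.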
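Subscribe Publish
• Ordering on the publish side isn't enough on its own. On the DEC Alpha CPU, and with compilers that perform value-speculation optimizations, a reader can fetch an object's fields before it fetches the pointer to the object.
  • A value-speculating compiler guesses the pointer's value, fetches through the guess, and then checks whether the guess was right.
• For these cases, RCU introduces the rcu_dereference primitive.
• The memory barrier it implies is only needed on DEC Alpha, but readers should still use rcu_dereference everywhere: it also constrains the compiler and documents which pointers are protected by RCU.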
Subscribe Publish
• The rcu_dereference primitive uses memory barriers and compiler directives to force the CPU and compiler to fetch values in the proper order.
• Example:
    rcu_read_lock();
    p = rcu_dereference(gp);
    if (p != NULL)
        do_something_with(p->a, p->b, p->c);
    rcu_read_unlock();
Subscribe Publish
• The rcu_read_lock and rcu_read_unlock calls mark the beginning and end of a read-side critical section.
  • Make sure that you don't sleep or block inside it.
  • In non-preemptible (non-CONFIG_PREEMPT) kernels they generate no code at all.
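The reader side can be sketched in userspace C11 as well. This is only an analogy: an acquire load stands in for `rcu_dereference` (it is stronger than the dependency ordering `rcu_dereference` needs, but simple and portable), `gp` and `read_sum` are hypothetical names, and the kernel's `rcu_read_lock`/`rcu_read_unlock` have no portable equivalent, so they appear only as comments.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct foo { int a, b, c; };

/* Pointer that an updater publishes and readers subscribe to. */
static _Atomic(struct foo *) gp = NULL;

/* Reader: fetch the pointer exactly once, then use only that snapshot.
 * Returns 1 and stores the sum in *out if an object was published,
 * otherwise returns 0 and leaves *out untouched. */
int read_sum(int *out)
{
    assert(out != NULL);
    /* rcu_read_lock() would go here in kernel code */
    struct foo *p = atomic_load_explicit(&gp, memory_order_acquire);
    int ok = 0;
    if (p != NULL) {
        *out = p->a + p->b + p->c;
        ok = 1;
    }
    /* rcu_read_unlock() would go here in kernel code */
    return ok;
}
```

Loading the pointer once into `p` matters: re-reading `gp` for each field could mix fields from two different versions of the object.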
Wait for Pre-Existing RCU Readers to Complete
• There are many ways to wait for something to finish: reference counts, reader/writer locks, and so on.
• RCU's approach has the advantage of scalability, as it doesn't have to explicitly track each of the threads.
• It relies on read-side critical sections.
  • They can contain anything as long as it doesn't block or sleep.
  • A reader that stalls inside its critical section delays the end of the wait, which holds up the updater and keeps old versions from being reclaimed.
Wait for Pre-Existing RCU Readers to Complete
• Basic operation is:
  • Make a change to an object (for example, replace an element in a linked list).
  • Wait for all pre-existing RCU read-side critical sections to complete, using the synchronize_rcu primitive.
  • Clean up: free the old object, which is no longer in use.
• In classic (non-preemptible) RCU the wait is built on context switches. A reader may not block inside its critical section, so a CPU cannot context switch until the critical section running on it has completed.
• If a context switch has happened on a CPU, any reader that was running there before the update has left its critical section and can no longer be referencing the old memory.
Wait for Pre-Existing RCU Readers to Complete
• Conceptually, the synchronize_rcu primitive is a technique to force a context switch on every CPU and wait for it to happen.
• Once every CPU has context switched, we know that no reader that started before the update is still running, and we can safely reclaim the memory.
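The context-switch argument can be modeled in a few lines of userspace C. This is a deliberately simplified toy, not how the kernel implements synchronize_rcu: the fixed CPU count, the per-CPU counters, and every `toy_` name are assumptions made for illustration, and it relies on the rule that readers never context switch inside a critical section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NCPU 4  /* hypothetical fixed CPU count */

/* One counter per CPU, bumped on every (simulated) context switch. */
static atomic_ulong toy_switches[TOY_NCPU];

struct toy_grace_period {
    unsigned long snap[TOY_NCPU];  /* counter values when the wait began */
};

void toy_context_switch(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NCPU);
    atomic_fetch_add(&toy_switches[cpu], 1);
}

/* Record where every CPU stands at the start of the wait. */
void toy_gp_start(struct toy_grace_period *gp)
{
    for (int cpu = 0; cpu < TOY_NCPU; cpu++)
        gp->snap[cpu] = atomic_load(&toy_switches[cpu]);
}

/* True once every CPU has switched at least once since toy_gp_start(). */
bool toy_gp_done(const struct toy_grace_period *gp)
{
    for (int cpu = 0; cpu < TOY_NCPU; cpu++)
        if (atomic_load(&toy_switches[cpu]) == gp->snap[cpu])
            return false;
    return true;
}
```

A blocking wait in this model would call `toy_gp_start` and then loop until `toy_gp_done` returns true. Readers pay nothing here: they never touch the counters, and only the scheduler path does.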
    struct foo {
        struct list_head list;
        int a;
        int b;
        int c;
    };
    LIST_HEAD(head);

    /* ... */

    p = search(head, key);
    if (p == NULL) {
        /* do something */
    }
    q = kmalloc(sizeof(*p), GFP_KERNEL);
    *q = *p;                                /* copy the old element */
    q->b = 2;                               /* update the copy */
    q->c = 3;
    list_replace_rcu(&p->list, &q->list);   /* publish the new version */
    synchronize_rcu();                      /* wait for pre-existing readers */
    kfree(p);                               /* reclaim the old version */
Maintaining Multiple Versions of Recently Updated Objects
• The updater makes a new copy each time it modifies an RCU-protected object.
• Readers that started before the update continue to see the old copy; readers that start afterwards see the new one.
• synchronize_rcu does not return until all pre-existing readers are done. Once it returns, no reader can still be looking at the old version, so the updater can free it.
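How two versions coexist can be shown with a small userspace sketch. It assumes a single updater, uses `malloc`/`free` in place of `kmalloc`/`kfree`, and replaces a single global pointer instead of a list element; `gp` and `update_b_c` are hypothetical names. It does not wait for readers itself, so the caller must not free the returned pointer until pre-existing readers are known to be done.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct foo { int a, b, c; };

/* Current version of the object, as seen by new readers. */
static _Atomic(struct foo *) gp = NULL;

/* Copy the current version, modify the copy, and swap it in.
 * Returns the old version, or NULL if allocation failed (in which
 * case nothing was changed). Requires that an object is already
 * published and that there is only one updater. */
struct foo *update_b_c(int b, int c)
{
    struct foo *old = atomic_load_explicit(&gp, memory_order_acquire);
    assert(old != NULL);

    struct foo *copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return NULL;

    *copy = *old;   /* copy */
    copy->b = b;    /* update the private copy */
    copy->c = c;
    /* Publish: new readers now see the copy, old readers keep old. */
    atomic_store_explicit(&gp, copy, memory_order_release);
    return old;
}
```

Between the store and the eventual free of the old object, both versions are live: a reader that fetched the pointer before the update still sees the old field values, while any later reader sees the new ones.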