LUM final presentation

LUM final presentation Chanit Giat Rachel Stahl Instructor: Artyom Borzin Summer semester 2002

PROXY CACHE ENGINE • The proxy cache engine gives hardware support to a server’s OS in order to improve its service rate, and adds security features. • The main memory of a network server is the quick storage device, where the recently accessed data is saved. • The system stores the information about all the files’ mapping in main memory and calculates the exact path to the required file if present in main memory. If not present, orders the operating system to bring it from the storage device, and supplies the path to the free memory space is supplied. • The system holds 2 main data bases: • A main memory, which holds up to 2Meg paths to the server’s memory, and their aging parameters. • A bit map table, which allows faster memory management by holding the free space image of the main memory.

SEARCH: CID=1 ASIS Site# Length Data Main functions: • Search– returns the path to the main memory, or a path to a free space in the memory. • Set attributes– sets the file’s aging attributes, as supplied by the OS. • Delete– deletes a certain path from the memory. • Count free– returns number of free path slots in the memory. • Init– initialize the machine. • (age – when number of records exceeds a specified number, the system cleans up some of them.)

Local Bus Interface Reg. file Output FIFO Data Stream controller Input FIFO Database Manager (DBM) SRAM (Bit Map) CRC unit Decoder UTCAM Previous uArchitecture

uArchitecture changes: • Doubling the front-end of the machine, including: • Input FIFO • Decoder • CRC unit • Buffering between the decoders and the DBM with a FIFO. • The search for a free index in the Bit Map is now done in parallel to the rest of the command execution.

New uarchitecture LOCAL BUS INTERFACE Double D B M FrontEnd0 DBM Fifo Decoder CRC Input FIFO Data Stream Controller F I F O FrontEnd1 Decoder CRC Input FIFO Output FIFO Reg. file

performance • 2 input FIFOs – double rate receiving data from OS. • 2 decoders – allows decoding of 2 commands in parallel. Significant for several long ‘search’ commands in a row. • DBM FIFO – separates between the decoding and execution of commands, enables them to perform in parallel. • Search for a free index now executes in parallel to other execution stages of a command. Saves ~50 clock cycles per ‘search’ command, which usually takes ~400-1000 cycles.

performance • 2 search commands each with 102 bytes of path (on which crc is working): Old architecture ~110k commands per second (718n) (2380n) (6128n) (9344n) (14574n) Second command (2380n) Start execution (6202n) Finished execution (8560n) First command (718n) Waiting (2nd command) Decoding(2nd command) Decoding(1st command) Exe(1st ) Start of simulation Decoding(1st command) Exe(1st ) Finished execution (9486n) Decoding(2nd command) Exe(2nd ) (7868n) New architecture ~190k commands per second

LUM final presentation

LUM final presentation

Presentation Transcript

Final Presentation

Final Presentation

Final Presentation

Final Presentation

Final Presentation

Final Presentation

Final Presentation

Final Presentation

Presentation Final

Final Presentation

Final Presentation

LUM final presentation

Final Presentation

Final Presentation

Final Presentation

Final Presentation

Final Presentation

Final Presentation

Final Presentation

Final Presentation

Final Presentation

Final Presentation