The Answer to Free Memory, Swap, Oracle and everything • A presentation about using memory where it’s needed most • Christo Kutrovsky, The Pythian Group • April 2007 • The 45-minute version
Who Am I? • Joined Pythian in 2003 • Became team lead for one of Pythian's service delivery teams in 2006 • Notable clients: Palm Coast Data, Freshdirect.com • Presented at Collaborate '06, '07, RMOUG • Special interest in 11g, RAC, Disk IO performance, and memory • Pythian's delegate to the 11g beta, participated at the camp level (two visits)
Who is Pythian? • Provides turnkey global data architecture and operations teams on a linear-cost-to-effort basis • Founded in 1997, headquartered in Ottawa, Canada, with offices in India and Australia • Supporting almost 100 clients worldwide and more than 600 production databases • Almost 50 production engineers engaged in client service delivery • Broad data infrastructure expertise primarily focused on Oracle, Microsoft SQL Server, and MySQL on enterprise hardware
Agenda • Types of memory • Virtual Memory areas • How do we monitor memory usage • And make sense of it • Oracle examples • Case studies
Questions • How many developers? • How many manage Linux? • How many manage Unix (AIX, Solaris)? • How many have root access? • How many have control of database memory consumption?
Terminology • What is memory • The ability of a computer system to store data
Types of Memory • Short term • RAM (memory) • Long term (“permanent”) • Disk, tape (storage)
Types of Memory - physical • CPU Registers • fastest, very limited • CPU Cache (L1/L2/L3) • some latency, LRU maintained • RAM • major latency (relatively), partially LRU • Disk • do something else while you wait
What is RAM • Faster, temporary storage • A work area • A place where you put your data while you process it
The Many caches • CPU Registers – 2 ns • CPU Cache – 8 ns (1:4) • Main Memory (RAM) – 100 ns (1:12) • Disk – Long term memory – 3’000’000 ns (1:30’000) • TAPE – even longer
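The ratios on this slide compare each tier with the one directly above it. A minimal sketch that recomputes them from the latencies quoted on the slide (the function name `step_ratios` is made up for this illustration):

```python
# Access latencies in nanoseconds, copied from the slide above.
LATENCY_NS = [
    ("CPU registers", 2),
    ("CPU cache", 8),
    ("Main memory (RAM)", 100),
    ("Disk", 3_000_000),
]


def step_ratios(tiers):
    """Return (name, slowdown versus the previous tier) for every tier after the first."""
    ratios = []
    for (_, prev_ns), (name, ns) in zip(tiers, tiers[1:]):
        ratios.append((name, ns / prev_ns))
    return ratios


for name, ratio in step_ratios(LATENCY_NS):
    print(f"{name}: about {ratio:,.1f}x slower than the tier above it")
```

The RAM step works out to 12.5, which the slide rounds to 1:12.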
CPU Cache & CPU Registers • CPU Registers – your two hands (or more) • You use them to hold the items while you work on them • CPU Cache – your desk • You use it as a quickly accessible location to store your most used items • Represents your current tasks
Main Memory - RAM • RAM – Random Access Memory • It’s like your office • Need to get up from your desk to grab items to work on • You usually grab multiple at a time to save roundtrips
Our office • Your hands (CPU Registers) – 2 seconds • “Desk” (CPU Cache) – 4 sec. • “Your office” (Main Memory, RAM) – 12 seconds • “Flying to Australia” (Disk) – 8 hours • TAPE – use a cargo ship to go
Growing your office • You always need more • Your “office” needs to handle all your active clients, or they will be unhappy • Running out of space in your office is not acceptable
The Disk – extending the memory • The Solution? • Ship some of your least needed binders to Australia • Relatively complex process • need to find the least needed binders • need to know how to return them, when they are needed
Introduction to virtual memory • Each process “sees” memory independently, as if it were alone on the system • Each process is free to use any address in the whole “user address space” • Typically: 3 GB user space, 1 GB system space (on 32-bit)
Virtual memory mapping • [Diagram: 32-bit address space, 0 GB to 4 GB, for processes P1 and P2, with a reserved virtual region for the system (kernel), mapped onto RAM; RAM is split into 4 KB chunks]
VM Management • Implemented via per process page table • Indicates: • page location (disk/memory) • page permissions (read/write/execute) • page attributes (ex. copy on write)
Virtual memory PTE table • PTE table for P1: • rw – in RAM – 0xFFA • rw – in RAM – 0xFFB • in RAM – 0xFFC – copy on write • w – unallocated • rw – on disk – SWAP • rx – on disk – FILE • unallocated • [Diagram: the entries for P1 point into RAM, FILE and SWAP]
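The per-process page table can be pictured as a lookup from virtual page number to an entry holding location, permissions and attributes. The following is only an illustrative sketch, not how any kernel implements it: the `PageTableEntry` class, the `translate` function and the reading of 0xFFA, 0xFFB and 0xFFC as physical frame numbers are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 4096  # 4 KB pages, as on the mapping slide


@dataclass
class PageTableEntry:
    location: str                # "ram", "swap", "file" or "unallocated"
    permissions: str             # subset of "rwx"
    frame: Optional[int] = None  # physical frame number, only when location == "ram"
    copy_on_write: bool = False


def translate(page_table, virtual_address):
    """Return the physical address, or raise LookupError if the page is not in RAM."""
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    entry = page_table.get(page_number)
    if entry is None or entry.location == "unallocated":
        raise LookupError("page not allocated")
    if entry.location != "ram":
        raise LookupError(f"page fault: page must be read from {entry.location}")
    return entry.frame * PAGE_SIZE + offset


# Hypothetical page table for one process, loosely following the slide.
p1 = {
    0: PageTableEntry("ram", "rw", frame=0xFFA),
    1: PageTableEntry("ram", "rw", frame=0xFFB),
    2: PageTableEntry("ram", "r", frame=0xFFC, copy_on_write=True),
    3: PageTableEntry("swap", "rw"),
    4: PageTableEntry("file", "rx"),
}

print(hex(translate(p1, 0x1010)))  # virtual page 1, offset 0x10
```

Touching an address in page 3 or 4 raises instead of returning, which stands in for the page fault that makes the OS fetch the page from swap or from the file.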
Additional benefits from VM • Protection • Features • memory mapped files • in memory file system • shared memory • shared memory – copy on write • Use more than what you have
Conceptual types of memory • Shared • initially exists on disk • file cache (Linux), buffers, system cache • initially does not exist on disk • anonymous (Linux), computed (AIX) • Private • does not exist on disk • special case: copy on write
Linux VM Components • direct “user” dependent types of memory • Buffers (shared) • Cached (shared) • Anonymous (private or shared) • Hugepages • indirect (system) managed areas • Slab – kernel structures • PageTables
VM areas with Oracle • System: SLAB, Pagetables • User: Buffers, Mapped, Anonymous (PGA, PL/SQL arrays), Cached, IPC Memory (SGA)
Monitoring • Monitoring memory with Oracle in mind
TOP • top • most commonly used tool • most commonly misinterpreted output
top – sample output
top - 22:03:11 up 3:19, 2 users, load average: 2.98, 1.22, 0.52
Tasks: 89 total, 1 running, 88 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.7% us, 0.8% sy, 0.0% ni, 0.3% id, 98.0% wa, 0.2% hi, 0.0% si
Cpu1 : 0.0% us, 0.8% sy, 0.0% ni, 97.6% id, 1.4% wa, 0.2% hi, 0.0% si
Cpu2 : 0.0% us, 0.2% sy, 0.0% ni, 99.7% id, 0.2% wa, 0.0% hi, 0.0% si
Cpu3 : 0.2% us, 0.2% sy, 0.0% ni, 33.6% id, 66.1% wa, 0.0% hi, 0.0% si
Mem: 8310308k total, 8049068k used, 261240k free, 36620k buffers
Swap: 7823644k total, 572k used, 7823072k free, 3395900k cached

  PID USER    PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
 8494 oracle  16   0 1662m 1.6g 1.5g D  2.0 19.8  0:03.15 oracletest (LOCAL=YES)
 4796 oracle  16   0 1626m 1.5g 1.5g S  1.0 19.5  0:03.91 ora_dbw1_test
 4794 oracle  15   0 1626m 1.5g 1.5g S  0.7 19.5  0:12.23 ora_dbw0_test
 4798 oracle  16   0 1626m 1.5g 1.5g S  0.7 19.5  0:03.97 ora_dbw2_test
 4800 oracle  16   0 1626m 1.5g 1.5g S  0.7 19.5  0:04.09 ora_dbw3_test
    1 root    16   0  2384  600  512 S  0.0  0.0  0:00.86 init [3]
    2 root    RT   0     0    0    0 S  0.0  0.0  0:00.00 [migration/0]
    3 root    34  19     0    0    0 S  0.0  0.0  0:00.00 [ksoftirqd/0]
Top – data comes from /proc/<pid>/status
cat /proc/10450/status
Name: oracle
State: S (sleeping)
SleepAVG: 98%
Tgid: 10450
Pid: 10450
PPid: 1
TracerPid: 0
Uid: 503 503 503 503
Gid: 503 503 503 503
FDSize: 256
Groups: 503 603
VmSize: 83424 kB
VmLck: 0 kB
VmRSS: 1484204 kB
VmData: 1612 kB
VmStk: 124 kB
VmExe: 52720 kB
VmLib: 8420 kB
…
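The status file is plain "Name: value" text, so the Vm* fields can be pulled out with a few lines of code. A minimal sketch, using a shortened copy of the sample above as input; `parse_status` is a hypothetical helper, not part of top or of any library:

```python
def parse_status(text):
    """Parse /proc/<pid>/status text into {field: value in kB} for the Vm* fields."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.startswith("Vm"):
            parts = value.split()
            if len(parts) == 2 and parts[1] == "kB":
                fields[name] = int(parts[0])
    return fields


# Shortened copy of the slide's sample output.
SAMPLE = """\
Name: oracle
State: S (sleeping)
VmSize: 83424 kB
VmLck: 0 kB
VmRSS: 1484204 kB
VmData: 1612 kB
"""

vm = parse_status(SAMPLE)
print(vm["VmRSS"], "kB resident")

# On a live Linux system (assuming /proc is mounted) the same function could be fed
# the real file, for example:
#   with open("/proc/self/status") as f:
#       print(parse_status(f.read()))
```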
top – additional columns • top can have additional columns • swap file usage • computed • code • data • THEY ARE ALL WRONG
vmstat
vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd    free   buff   cache   si   so     bi     bo    in    cs us sy  id wa
 0  0      0 3631424  11096  120204    0    0     35     31   255    20  0  0  99  0
 0  0      0 3631488  11096  120204    0    0      0      0  1014    18  0  0 100  0
 0  0      0 3631488  11096  120204    0    0      0      0  1012    16  0  0 100  0
• r – run queue – how many processes currently waiting for or running on the CPU
• b – how many processes waiting, usually waiting on IO
• swpd – swap memory usage
• free – free memory
• cache – file system cache
vmstat cont
vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd    free   buff   cache   si   so     bi     bo    in    cs us sy  id wa
 0  0      0 3631424  11096  120204    0    0     35     31   255    20  0  0  99  0
 0  0      0 3631488  11096  120204    0    0      0      0  1014    18  0  0 100  0
 0  0      0 3631488  11096  120204    0    0      0      0  1012    16  0  0 100  0
• si/so – swap in / out – in KB/sec
• bi/bo – blocks in / out (block device reads / writes) – in blocks/sec, with 1 KB blocks on current kernels
• cs – context switches per second
• us/sy/id/wa – user/system/idle/IO-wait time for CPUs, in percent
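Each vmstat data row lines up with the names in the header row, so a row can be turned into named values before reasoning about it. A minimal sketch with made-up helper names (`parse_vmstat_row`, `is_swapping`), fed with the first data row from the slide:

```python
HEADER = "r b swpd free buff cache si so bi bo in cs us sy id wa"


def parse_vmstat_row(header, row):
    """Pair one vmstat data row with the column names from the header line."""
    names = header.split()
    values = [int(v) for v in row.split()]
    if len(names) != len(values):
        raise ValueError("column count mismatch")
    return dict(zip(names, values))


def is_swapping(sample):
    """True when pages moved to or from swap during the sampling interval."""
    return sample["si"] > 0 or sample["so"] > 0


row = parse_vmstat_row(HEADER, "0 0 0 3631424 11096 120204 0 0 35 31 255 20 0 0 99 0")
print(row["free"], row["cache"], is_swapping(row))
```

Checking si/so rather than swpd is deliberate: swpd only says that something sits in swap, while non-zero si/so says swapping is happening now.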
/proc/meminfo
cat /proc/meminfo
MemTotal: 8310308 kB
MemFree: 93448 kB
Buffers: 132036 kB
Cached: 3413324 kB
SwapCached: 0 kB
Active: 1658252 kB
Inactive: 1942032 kB
HighTotal: 7470528 kB
HighFree: 8768 kB
LowTotal: 839780 kB
LowFree: 84680 kB
SwapTotal: 7823644 kB
SwapFree: 7823072 kB
Dirty: 100 kB
Writeback: 0 kB
Mapped: 82500 kB
Slab: 92028 kB
Committed_AS: 490700 kB
PageTables: 3952 kB
VmallocTotal: 106488 kB
VmallocUsed: 5964 kB
VmallocChunk: 99900 kB
HugePages_Total: 2200
HugePages_Free: 1088
Hugepagesize: 2048 kB
/proc/meminfo – 64 bit
cat /proc/meminfo
MemTotal: 8165032 kB
MemFree: 106428 kB
Buffers: 219484 kB
Cached: 2864760 kB
SwapCached: 69256 kB
Active: 1508428 kB
Inactive: 1915392 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 8165032 kB
LowFree: 106428 kB
SwapTotal: 4816888 kB
SwapFree: 4192148 kB
Dirty: 252 kB
Writeback: 0 kB
Mapped: 1350480 kB
Slab: 461584 kB
CommitLimit: 6851404 kB
Committed_AS: 4959776 kB
PageTables: 46668 kB
VmallocTotal: 536870911 kB
VmallocUsed: 2992 kB
VmallocChunk: 536867847 kB
HugePages_Total: 2000
HugePages_Free: 128
Hugepagesize: 2048 kB
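For scripted monitoring, /proc/meminfo can be read into a dictionary in one pass. This is a sketch under the assumption that every line has the "Name: number [kB]" shape shown above; `parse_meminfo` is a hypothetical helper and the sample is a subset of the 64-bit output.

```python
def parse_meminfo(text):
    """Return {field: int}. Sizes are in kB; HugePages_* fields are page counts."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


# Subset of the 64-bit sample from the slide.
SAMPLE = """\
MemTotal: 8165032 kB
MemFree: 106428 kB
Buffers: 219484 kB
Cached: 2864760 kB
SwapCached: 69256 kB
HugePages_Total: 2000
HugePages_Free: 128
Hugepagesize: 2048 kB
"""

info = parse_meminfo(SAMPLE)
free_pct = 100 * info["MemFree"] / info["MemTotal"]
print(f"MemFree is {free_pct:.1f}% of MemTotal")
```

A very low MemFree percentage like this one is normal on a busy Linux system and is not, by itself, a sign of memory pressure.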
MemTotal • Total memory visible to the OS • If it’s not what you’ve put in the machine, you probably have a bad SIMM/DIMM
MemFree • Memory that is currently unoccupied and available to use immediately • Not the maximum amount of memory available at the moment • Controlled by (Linux RH4) /proc/sys/vm/min_free_kbytes
MemFree – example
grep MemFree /proc/meminfo
MemFree: 26568 kB
echo 900000 > /proc/sys/vm/min_free_kbytes
grep MemFree /proc/meminfo
MemFree: 210056 kB
Buffers • Cache of raw disk blocks • Usually occupied with ext3 metadata • Mostly ext3 pointers (extent management) • Not the cache of actual user data • In older kernels, was controllable
Cached • File system cache • If direct IO is not used for datafiles – will have your datafiles cached • Binary (for execution) memory • includes the “oracle” binary caching • all the libraries caching • Does not mean “occupied” – usually can be released immediately • The Oracle SGA – when not using hugepages
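Because Buffers and Cached can usually be released on demand, a common rough estimate of what could be made available is MemFree + Buffers + Cached. The sketch below treats that sum as an upper bound only: as the slide notes, Cached also includes the SGA when hugepages are not used, and that part cannot simply be dropped. The function name is made up, and the values are taken from the 32-bit /proc/meminfo sample.

```python
def potentially_available_kb(meminfo):
    """Upper-bound estimate: free memory plus caches that can usually be released."""
    return meminfo["MemFree"] + meminfo["Buffers"] + meminfo["Cached"]


# Values from the 32-bit /proc/meminfo sample earlier in the deck.
sample = {"MemTotal": 8310308, "MemFree": 93448, "Buffers": 132036, "Cached": 3413324}

estimate = potentially_available_kb(sample)
print(f"MemFree alone: {sample['MemFree']} kB, upper-bound estimate: {estimate} kB")
```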
Cached – example part 1
[root@~]# cat /proc/meminfo
…
MemFree: 8232512 kB
Buffers: 9328 kB
Cached: 28372 kB
…
du -smc indx01_*
1714 indx01_01.dbf
1761 indx01_02.dbf
1722 indx01_03.dbf
5197 total
…
cat indx01_* > /dev/null
Cached – example part 2
[root@~]# vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd    free   buff   cache   si   so     bi     bo    in    cs us sy  id wa
 0  0      0 8093888  10808  163392    0    0      0      0  1012    17  0  0 100  0
 0  0      0 8093952  10808  163392    0    0      0      0  1012    16  0  0 100  0
 0  1      0 7956736  10948  300272    0    0  68602      0  1567  1126  0  2  76 22
 0  1      0 7808576  11092  448068    0    0  73992     80  1623  1210  0  2  75 23
…
 0  1      0 2847616  16104 5397616    0    0  65792      0  1542  1076  0  2  75 23
 0  0      0 2766272  16180 5479180    0    0  40698      0  1341   675  0  1  85 14
 0  0      0 2766208  16192 5479168    0    0      0    114  1033    22  0  0 100  0
cat /proc/meminfo
…
MemFree: 2766464 kB
Buffers: 16192 kB
Cached: 5479168 kB
…
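The point of this example is that reading the files moved roughly their whole size from "free" into "cache". A small worked check using the first and last vmstat rows above and the du total (the helper name `growth_mb` is made up for the illustration):

```python
def growth_mb(before_kb, after_kb):
    """Difference between two kB readings, expressed in MB."""
    return (after_kb - before_kb) / 1024


# First and last vmstat rows from the slide: cache column and free column.
cache_growth = growth_mb(163392, 5479168)
free_drop = growth_mb(2766208, 8093888)
files_mb = 5197  # "du -smc" total for the three datafiles

print(f"cache grew by {cache_growth:.0f} MB, free shrank by {free_drop:.0f} MB, "
      f"files total {files_mb} MB")
```

The three figures agree to within a few MB; the small remainder is buffers and unrelated activity.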
Cached – example #2 part 1
cat indx01_* >newfile
vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd    free   buff   cache   si   so     bi     bo    in    cs us sy  id wa
 0  0      0 2765312  17044 5479356    0    0      0      0  1012    17  0  0 100  0
 0  3      0 2405376  17428 5833612    0    0     16  36866  1324   144  1 18  76  6
 0  2      0 2143616  17688 6091532    0    0      4 111748  2000   213  0 16  50 34
…
 0  1      0   16832   6784 8198556    0    0   8556  26684  1942  1267  0  2  74 24
 1  1      0   16832   6856 8198744    0    0  12518  20720  2130  1767  0  3  74 23
…
cat /proc/meminfo
…
MemFree: 16768 kB
Buffers: 2192 kB
Cached: 8196908 kB
…
Dirty: 277468 kB
Writeback: 0 kB
…
Cached – example #2 part 2
cat /proc/meminfo
…
MemFree: 20672 kB
Buffers: 3300 kB
Cached: 8191900 kB
…
Dirty: 0 kB
Writeback: 0 kB
…
rm newfile
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd    free   buff   cache   si   so     bi     bo    in    cs us sy  id wa
 0  0      0   23296   3380 8189480    0    0      0     28  1015    18  0  0 100  0
 0  1      0 3257472   3948 4996372    0    0    284      0  1084   160  0 14  78  8
 0  1      0 3255552   5828 4996572    0    0    940      0  1247   485  0  1  75 24
 0  1      0 3253696   7616 4996344    0    0    884     96  1237   470  0  2  75 23
 0  0      0 3253440   7988 4996492    0    0    186      0  1061   112  0  0  95  4
 0  0      0 3253440   7988 4996492    0    0      0      0  1012    14  0  0 100  0
Swap • SwapTotal • SwapFree • SwapCached • written to swap, but still in memory • applies only to anonymous memory • OS will anticipate memory needs, and pre-swap inactive data, but keep it in memory • Actual swapping (memory that will need to be read from disk) = SwapTotal - SwapFree - SwapCached
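The "actual swapping" formula from this slide can be applied directly to /proc/meminfo values. A minimal sketch using the swap figures from the 64-bit sample earlier in the deck; the function name is made up for the illustration:

```python
def actual_swapped_kb(meminfo):
    """Memory that would have to be read back from disk:
    SwapTotal - SwapFree - SwapCached (all in kB)."""
    return meminfo["SwapTotal"] - meminfo["SwapFree"] - meminfo["SwapCached"]


# Swap figures from the 64-bit /proc/meminfo sample.
sample = {"SwapTotal": 4816888, "SwapFree": 4192148, "SwapCached": 69256}

used = sample["SwapTotal"] - sample["SwapFree"]
print(f"swap used: {used} kB, of which still cached in RAM: {sample['SwapCached']} kB")
print(f"actually swapped out: {actual_swapped_kb(sample)} kB")
```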
Active/Inactive • Active – recently used memory • Includes all types of memory (cached, buffers, anonymous) • OS will try to keep it in RAM • Inactive – memory that will be first reused • “free” memory • Can be used to gauge the “working set”
High/Low Total/Free • 32 bit limitations, no high memory on 64 bit • Some kernel structures cannot be allocated in “high memory” • Used to be a problem in older kernels, newer kernels protect low memory
Dirty & Writeback • Dirty – cache/buffers memory that still has to be written to disk • thresholds can be adjusted • Writeback – memory actively being written to disk • Can reach high values with async writes and a large queue
Committed_AS & Mapped • Committed_AS • Total memory requested on the system • Not used, just requested • If every process on the system were to touch and use all the memory it has requested, this is how much would be used • Mapped • memory used for in-memory mapped files • all anonymous memory • includes committed & touched memory
Committed_AS - example
cat grab.c
#include <stdlib.h>
#include <unistd.h>
int main(void) { void *p = malloc(1073741824); sleep(60); }
cat /proc/meminfo
...
MemFree: 3230592 kB
...
Committed_AS: 49972 kB
./grab
cat /proc/meminfo
...
MemFree: 3230464 kB
...
Committed_AS: 1098808 kB
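The before/after numbers show that the 1 GB request is counted in Committed_AS while MemFree barely moves, because the memory was requested but never touched. A short worked check of that arithmetic, with the values copied from the slide and a made-up helper name:

```python
REQUESTED_BYTES = 1073741824  # the malloc() size used in grab.c

before = {"MemFree": 3230592, "Committed_AS": 49972}
after = {"MemFree": 3230464, "Committed_AS": 1098808}


def delta_kb(before, after, field):
    """Change in one /proc/meminfo field between two readings, in kB."""
    return after[field] - before[field]


committed = delta_kb(before, after, "Committed_AS")
free_change = delta_kb(before, after, "MemFree")

print(f"requested:    {REQUESTED_BYTES // 1024} kB")
print(f"Committed_AS: +{committed} kB")
print(f"MemFree:      {free_change} kB")
```

Committed_AS grows by slightly more than the 1048576 kB requested, which is consistent with the process's own small footprint being committed as well.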
Slab • Slab – “in-kernel data structures cache” • similar to Oracle’s “shared_pool” • designed to prevent memory fragmentation • detailed monitoring: /proc/slabinfo, slabtop • Basically “system space”
slabtop – ordered by cache size
Active / Total Objects (% used) : 88874 / 139343 (63.8%)
Active / Total Slabs (% used) : 5839 / 5846 (99.9%)
Active / Total Caches (% used) : 90 / 132 (68.2%)
Active / Total Size (% used) : 17286.03K / 23311.27K (74.2%)
Minimum / Average / Maximum Object : 0.01K / 0.17K / 128.00K

 OBJS ACTIVE  USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
32382  24900  76%    0.27K  2313       14      9252K radix_tree_node
56925  40013  70%    0.05K   759       75      3036K buffer_head
  364    363  99%    4.00K   364        1      1456K size-4096
 2485   2471  99%    0.54K   355        7      1420K ext3_inode_cache
 2376    413  17%    0.50K   297        8      1188K size-512
  256    256 100%    3.00K   128        2      1024K biovec-(256)
 4576   4481  97%    0.15K   176       26       704K dentry_cache
10248   4548  44%    0.06K   168       61       672K size-64
 4340   1215  27%    0.12K   140       31       560K size-128
 1980    316  15%    0.25K   132       15       528K size-256
…