Lecture – Performance: Performance management on UNIX
Performance Analysis • Performance analysis involves identifying system bottlenecks • This involves a number of steps, each framed as a question • Is there a performance problem? • Is the problem CPU or I/O related?
Performance Analysis • CPU related? • What is the current load on the CPU? • What is the average load on the CPU? • I/O related? • Is it normal disk I/O? • Would more/faster disks help? • Is it paging I/O? • Would more physical memory help?
Related to a Particular User or Program? • Identify the user / program • Identify what they are doing to cause the problem • Revise their operating procedures • Consider removing them from the system
Determining CPU Usage • Determining the CPU usage is the first thing we should do • There are a number of tools to do this • vmstat gives several pieces of useful information including CPU usage • vmstat [interval] [count] • Interval is the number of seconds between reports and count is the number of reports to generate
vmstat 2 10
[rbradley@aisling]$ vmstat 2 10
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 1  0  0   5484  27240 136584 198840   0   1     5     8    8     8   4   7   4
 0  0  0   5484  27240 136584 198840   0   0     0    96  155   100   0   0 100
 0  0  0   5484  27232 136584 198844   0   0     0     0  159   112   2   0  98
 0  0  0   5484  27216 136584 198844   0   0     0     0  130    51   0   2  98
 0  0  0   5484  27216 136588 198848   0   0     0    86  157    63   0   0 100
 0  0  0   5484  27216 136588 198848   0   0     0     0  139    46   0   0 100
 0  0  0   5484  27224 136588 198836   0   0     0    30  153    47   0   0 100
 0  0  0   5484  27712 136588 198824   0   0     0     8  166   107   1   0  99
 0  0  0   5484  26876 136588 198828   0   0     0     0  139    92   6   2  91
 0  0  0   5484  26876 136592 198824   0   0     0   144  137    69   0   0 100
vmstat • The first line of output gives average values since the system was booted and should be ignored • To determine CPU usage, look at the last three columns: us, sy and id • us: % of CPU time spent on user tasks • sy: % of CPU time spent on system tasks, including I/O and general O/S functions • id: % of CPU time spent idle
Analysing vmstat output (CPU) • High CPU time or low idle time does not in itself indicate a system problem • It may simply mean that a number of batch jobs are scheduled to run at the same time and might benefit from being rearranged • To establish whether there is a genuine problem, it is necessary to monitor the system over an extended period • If average CPU usage remains high over that period, there is a problem
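The averaging described above can be sketched in a few lines of Python. This is only an illustration: the function name `cpu_averages` and the `SAMPLE` text (the first rows of the vmstat output shown earlier) are assumptions made for the example, not part of vmstat. Columns are looked up by name, because other vmstat versions print a different set of columns.

```python
from statistics import mean


def cpu_averages(vmstat_text):
    """Return the mean us/sy/id over all reports except the first one."""
    lines = [ln.split() for ln in vmstat_text.strip().splitlines()]
    # Find the row that names the columns (it contains us, sy and id).
    header_idx = next(
        i for i, cols in enumerate(lines) if {"us", "sy", "id"} <= set(cols)
    )
    header = lines[header_idx]
    reports = lines[header_idx + 1:]
    samples = reports[1:]  # the first report is the since-boot average
    if not samples:
        raise ValueError("need at least two reports to average")
    return {
        name: mean(int(row[header.index(name)]) for row in samples)
        for name in ("us", "sy", "id")
    }


# Hypothetical input: the first four reports from the slide's vmstat run.
SAMPLE = """\
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
1 0 0 5484 27240 136584 198840 0 1 5 8 8 8 4 7 4
0 0 0 5484 27240 136584 198840 0 0 0 96 155 100 0 0 100
0 0 0 5484 27232 136584 198844 0 0 0 0 159 112 2 0 98
0 0 0 5484 27216 136584 198844 0 0 0 0 130 51 0 2 98
"""

print(cpu_averages(SAMPLE))
```

For a real assessment the text would come from a much longer run, for example several hours of `vmstat 60` output saved to a file.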
Analysing vmstat output (Process States) • vmstat counts processes in three states • Waiting for run time, uninterruptible sleep, swapped out • Process Statistics: • r: Number of processes waiting for run time • b: Number of processes in uninterruptible sleep • w: Number of processes swapped out, but otherwise able to run • A persistently high r suggests a CPU bottleneck, since processes are queuing for the processor
Analysing vmstat output (Memory) • Memory Statistics • swpd: Amount of virtual memory used (KB) • free: Amount of idle memory (KB) • buff: Amount of memory used as buffers (KB) • cache: Amount of memory used as cache (KB)
Analysing vmstat output (Swap) • Swap Statistics • si: Amount of memory swapped in from disk (KB/s) • so: Amount of memory swapped out to disk (KB/s) • Swap statistics are arguably the most important to monitor, and of these the so field matters most • Sustained non-zero so values typically mean the system is short of physical memory • Because the first report averages since boot, its so value reflects swapping done before vmstat was started
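As a rough illustration of watching the so column, the sketch below summarises a list of so readings. The function name `swap_out_report` and the second list of readings are made up for the example; the first list is the so column from the vmstat run shown earlier.

```python
def swap_out_report(so_samples):
    """Summarise swap-out activity, ignoring the first (since-boot) sample."""
    live = so_samples[1:]  # drop the since-boot average
    active = [value for value in live if value > 0]
    return {
        "samples": len(live),
        "intervals_swapping": len(active),
        "peak_kb_per_s": max(live, default=0),
    }


# so column from the slide's vmstat run: no swapping during the run.
quiet = swap_out_report([1, 0, 0, 0, 0, 0, 0, 0, 0, 0])

# Hypothetical readings from a machine under memory pressure.
busy = swap_out_report([0, 0, 120, 340, 0])

print(quiet)
print(busy)
```

A single interval with swapping is rarely a concern; it is the proportion of intervals with non-zero so over a long run that matters.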
Analysing vmstat output (I/O) • I/O Statistics • bi: Blocks received from a block device (blocks/sec) • bo: Blocks sent to a block device (blocks/sec) • If there are a large number of block transfers, the problem with your system may lie here (i.e. device access is high) • A single reading, however, is only a snapshot and is not indicative of the system as a whole • All Linux blocks are 1KB except for CD-ROM blocks (2KB)
Analysing vmstat output (System) • System Statistics • in: The number of interrupts per second, including the system clock • cs: The number of context switches per second
Analysing vmstat output (CPU usage) • CPU Statistics • us: % of CPU time spent on user tasks • sy: % of CPU time spent on system tasks • id: % of CPU time spent idle
top • top is another tool for identifying problems with a Linux system • Displays a listing of the most CPU-intensive tasks on the system • Can provide an interactive interface for manipulating the processes • Default is to update every 5 seconds • top operates by examining files in the /proc pseudo file system • This pseudo file system is used as an interface to kernel data structures • See man proc for details
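To give a feel for what reading the /proc pseudo file system looks like, here is a minimal Python sketch that parses text in the format of /proc/meminfo. The function name `parse_meminfo` is an assumption for the example, and the `SAMPLE` values are illustrative figures taken from the top output in these slides rather than a real /proc/meminfo dump.

```python
def parse_meminfo(text):
    """Parse '/proc/meminfo'-style 'Key:  value kB' lines into {key: int}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only well-formed lines whose first field is a number.
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


# Illustrative sample only; a real file has many more entries.
SAMPLE = (
    "MemTotal:   513316 kB\n"
    "MemFree:    313264 kB\n"
    "SwapTotal: 1052248 kB\n"
    "SwapFree:  1043152 kB\n"
)

mem = parse_meminfo(SAMPLE)
print(mem["MemTotal"] - mem["MemFree"], "kB in use")
```

On a Linux machine the same function can be fed the live file with `parse_meminfo(open("/proc/meminfo").read())`.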
top
[rbradley@aisling rbradley]$ top
17:14:41 up 47 days, 2:27, 8 users, load average: 0.06, 0.03, 0.07
61 processes: 59 sleeping, 2 running, 0 zombie, 0 stopped
CPU states: 0.0% user 0.2% system 0.0% nice 0.0% iowait 99.8% idle
Mem: 513316k av, 200052k used, 313264k free, 0k shrd, 44976k buff
     57692k actv, 11208k in_d, 1024k in_c
Swap: 1052248k av, 9096k used, 1043152k free 34656k cached
  PID USER      PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
    1 root       15   0   108   76    56 S     0.0  0.0   0:15   0 init
    2 root       15   0     0    0     0 SW    0.0  0.0   0:00   0 keventd
    3 root       15   0     0    0     0 SW    0.0  0.0   0:01   0 kapmd
    4 root       34  19     0    0     0 SWN   0.0  0.0   0:00   0 ksoftirqd_CPU0
    9 root       15   0     0    0     0 SW    0.0  0.0   0:00   0 bdflush
  226 root       15   0     0    0     0 SW    0.0  0.0   0:00   0 kjournald
  586 root       15   0   200  160   116 S     0.0  0.0   0:08   0 syslogd
  590 root       15   0   180  168   120 S     0.0  0.0   0:03   0 klogd
  666 root       15   0   480  348   232 S     0.0  0.0   1:09   0 sshd
  719 root       15   0    52    4     0 S     0.0  0.0   0:00   0 gpm
  728 root       15   0   176  148    88 S     0.0  0.0   0:05   0 crond
  785 xfs        15   0  1836   60    32 S     0.0  0.0   0:00   0 xfs
  803 daemon     15   0   180  164   116 S     0.0  0.0   0:00   0 atd
  812 root       23   0    52    4     0 S     0.0  0.0   0:00   0 mingetty
  813 root       23   0    52    4     0 S     0.0  0.0   0:00   0 mingetty
Analysing top output • Up: The time the system has been up and the three load averages • The load averages are the average number of processes ready to run over the last 1, 5 and 15 minutes • Same as the output of uptime • Processes: The total number of processes at the time of the last update • Broken down into running, sleeping, stopped and zombie • (A zombie process is a finished process whose parent has not yet read its exit status; reading the status causes the process to be cleaned up)
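Load averages are easier to judge relative to the number of CPUs. The sketch below pulls the three figures out of an uptime-style line and scales them; the function names and the two-CPU figure are assumptions for the example, and the line itself is the one from the top output shown earlier.

```python
def parse_load_averages(uptime_line):
    """Extract the 1, 5 and 15 minute load averages from an uptime-style line."""
    tail = uptime_line.rsplit("load average:", 1)[1]
    return tuple(float(part) for part in tail.split(","))


def load_per_cpu(load_averages, cpu_count):
    """Scale load averages by the CPU count; values near or above 1.0 mean
    processes are regularly queuing for a processor."""
    return tuple(avg / cpu_count for avg in load_averages)


LINE = "17:14:41 up 47 days, 2:27, 8 users, load average: 0.06, 0.03, 0.07"
loads = parse_load_averages(LINE)
print(loads)

# Hypothetical busy two-CPU machine.
print(load_per_cpu((2.0, 1.0, 0.5), 2))
```

In a Python program running on the machine itself, `os.getloadavg()` returns the same three numbers without any parsing.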
Analysing top output • CPU States: The percentage of CPU time in user mode, system mode, niced tasks (tasks running with a positive nice value) and idle • Time spent in niced tasks will also be counted in system and user time, so the total will be more than 100% • Mem: Statistics on memory usage, including total available memory, used memory, free memory, shared memory and memory used for buffers
Analysing top output • Swap: Statistics on swap space, including total swap space and used swap space • This and the Mem section together are the same as the output of free • PID: The process ID of each task • USER: The username of the task's owner • PRI: The priority of the task • NI: The nice value of the task. Negative values are higher priority
Analysing top output • SIZE: The size of the task's code plus data plus stack space, in kilobytes • RSS: The total amount of physical memory used by the task, in kilobytes • SHARE: The amount of shared memory used by the task • STAT: The state of the task, S: sleeping, D: uninterruptible sleep, R: running, Z: zombie, T: stopped or traced • %CPU: The task's share of the CPU since the last screen update, as a percentage of total CPU time • %MEM: The task's share of physical memory • TIME: Total CPU time used by the task since it started • COMMAND: The task's command name
Using top to control processes • In addition to command-line options for controlling the appearance of top (not covered here), there are a number of commands that can be issued to top while it is running • Space: immediately updates the display • ^L: Erases and redraws the screen • k: kill a process. You will be prompted for the PID and a signal to send to the process (normally 15, SIGTERM)
Using top to control processes • i: ignore idle and zombie processes • n: change the number of processes to view • r: renice a process • P: sort tasks by CPU usage • M: sort tasks by memory usage
Renice • The renice command is used to alter the priority of running processes • The default nice value is 0 • The range in Linux is -20 to +19 • The lower the value, the higher the priority, and the larger the share of CPU time the process receives • The nice value of a process can be examined using ps -l
Renice • The owner of a process and root can change the nice value of the process using renice • The new value is inherited by child processes started afterwards • renice priority [[-p] pid ...] [[-g] pgrp ...] [[-u] user ...]
[rbradley@aisling]$ ps -l
F S   UID   PID  PPID  C PRI  NI ADDR   SZ WCHAN  TTY    TIME     CMD
0 S  1634 24496 24495  0  75   0 -    1091 wait4  pts/1  00:00:00 bash
0 R  1634 26361 24496  0  75   0 -     778 -      pts/1  00:00:00 ps
[rbradley@aisling]$ renice 5 24496
24496: old priority 0, new priority 5
[rbradley@aisling]$ ps -l
F S   UID   PID  PPID  C PRI  NI ADDR   SZ WCHAN  TTY    TIME     CMD
0 S  1634 24496 24495  0  80   5 -    1091 wait4  pts/1  00:00:00 bash
0 R  1634 26363 24496  0  80   5 -     777 -      pts/1  00:00:00 ps
Renice • Once a nice value has been increased, only the root user can reduce it again; an ordinary user cannot lower it, not even back to the default value
[rbradley@aisling]$ renice 19 24496
24496: old priority 5, new priority 19
[rbradley@aisling]$ ps -l
F S   UID   PID  PPID  C PRI  NI ADDR   SZ WCHAN  TTY    TIME     CMD
0 S  1634 24496 24495  0  94  19 -    1091 wait4  pts/1  00:00:00 bash
0 R  1634 26390 24496  0  94  19 -     778 -      pts/1  00:00:00 ps
[rbradley@aisling]$ renice 1 24496
renice: 24496: setpriority: Permission denied
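The permission rule can be written down as a small Python model. This is a sketch of the rule as stated in these slides, not of the kernel's actual check: the function name `renice_allowed` is made up, and some systems can be configured to let ordinary users lower nice values within a resource limit, which this model ignores.

```python
NICE_MIN, NICE_MAX = -20, 19  # nice value range on Linux


def renice_allowed(current, requested, is_root):
    """Return True if the slide's rule permits changing current -> requested.

    Assumption: ordinary users may only raise a nice value; root may set
    any value. Out-of-range requests are clamped to the valid range first.
    """
    requested = max(NICE_MIN, min(NICE_MAX, requested))
    return is_root or requested >= current


# The three renice calls from the sessions above, as an ordinary user.
print(renice_allowed(0, 5, is_root=False))
print(renice_allowed(5, 19, is_root=False))
print(renice_allowed(19, 1, is_root=False))
```

To actually change a nice value from Python, `os.setpriority(os.PRIO_PROCESS, pid, value)` makes the same system call that appears in the renice error message.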
How Much Swap Space? • A quick rule of thumb often used is twice the amount of physical memory • This approach is simplistic and does not scale well • A better approach: • 1. Estimate total memory requirements • 2. Add some megabytes as a spare • 3. Subtract the amount of physical memory available • If the value from step 3 is more than 3 times the available physical memory, you need more memory
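The estimate above is simple arithmetic, sketched here in Python. The function name `swap_estimate` and all the megabyte figures in the examples are hypothetical; only the steps themselves come from the slide.

```python
def swap_estimate(required_mb, spare_mb, physical_mb):
    """Estimate swap space in MB and flag when more RAM is the better fix.

    Steps: total requirement + spare - physical memory. A negative result
    means no swap is strictly needed, so it is reported as 0.
    """
    swap_mb = max(required_mb + spare_mb - physical_mb, 0)
    needs_more_ram = swap_mb > 3 * physical_mb
    return swap_mb, needs_more_ram


# Hypothetical machine with 512 MB of physical memory.
print(swap_estimate(700, 100, 512))   # modest shortfall
print(swap_estimate(2500, 100, 512))  # shortfall above 3x physical memory
print(swap_estimate(300, 50, 512))    # workload fits in physical memory
```

The third case is the situation where the calculation suggests no swap is needed at all.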
How Much Swap Space • Sometimes the above formula will show that you don’t need swap space at all • It is a good policy to create some anyway • Linux uses the swap space so that as much physical memory as possible is kept free • It swaps out pages that have not been used for a while • When memory is needed, it is available
How Much Swap Space? • If swap space is removed (using the swapoff command), the system will attempt to move any swapped pages into other swap space or physical memory • If there is not enough space elsewhere, the system may become unresponsive for a time while it moves pages around, but it will recover