The Uptime Infrastructure Monitor AIX agent collects the following performance metrics from the systems on which it is installed:
- CPU
- Memory
- Disk
- Network
- Process
- User
The AIX agent uses a number of utilities to gather these metrics, including:
- sar: collects information about system activity. This version of sar is bundled with AIX.
- mpstat: collects processor-related metrics.
- ifconfig: configures the parameters for network interfaces.
- ps: reports on the status of processes.
Each set of performance metrics is averaged over the interval at which the UIM monitoring station polls the agent, such as every 10 minutes.
Whenever the sar command uses the -f option to specify a file, that file is generated using the sadc 1 1 command. The sadc command polls the system counters at a one-second interval and writes the information that it receives to a file. The sar command then reads this file.
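The sadc-then-sar sequence described above can be sketched as follows. This is a minimal illustration, not the agent's actual implementation: the sadc path, the temporary file, and the function names are assumptions made for the example.

```python
import subprocess
import tempfile

# Assumption: sadc is installed at this path; adjust for the target system.
SADC_PATH = "/usr/lib/sa/sadc"


def sadc_command(outfile, interval=1, count=1):
    """Build the sadc invocation that samples system counters into outfile."""
    return [SADC_PATH, str(interval), str(count), outfile]


def sar_command(option, datafile):
    """Build a sar invocation that reads a sadc data file, e.g. sar -u -f FILE."""
    return ["sar", option, "-f", datafile]


def collect(option):
    """Run sadc once, then let sar report from the file that sadc wrote."""
    with tempfile.NamedTemporaryFile(suffix=".sa") as tmp:
        subprocess.run(sadc_command(tmp.name), check=True)
        result = subprocess.run(
            sar_command(option, tmp.name),
            check=True,
            capture_output=True,
            text=True,
        )
        return result.stdout
```

Calling collect("-u") on a system where both utilities are available would return the raw sar report as text, which a separate parsing step can then turn into metrics.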
CPU
The UIM agent uses the sar -u -f command to collect CPU metrics from an AIX system. The sar command compares the system counters during a one-second interval. If the system has multiple CPUs, the CPU statistics that the agent returns are an average of all the CPUs on the server.
Metric | Explanation |
---|---|
% Usr | The amount of time that the CPU spends in user mode. |
% Sys | The amount of time that the kernel spends processing system calls. |
% WIO | The amount of time that a runnable process spends waiting for a device to perform an I/O operation. |
Multi CPU Usage | Whether a system with multiple CPUs is effectively balancing tasks between CPUs, or if processes are being forced off CPUs in certain circumstances. |
Run Queue Length | The average number of services or processes waiting to be served by the CPU while the run queue is occupied. |
Run Queue Occupancy | The percentage of time that one or more services or processes are waiting to be served by the CPU. |
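
As a rough illustration of how a sar -u report could be turned into the % Usr, % Sys, and % WIO values in the table above, the sketch below parses a small sample. The sample text is illustrative only (real reports carry extra banner lines and may include more columns), and parse_sar_cpu is a hypothetical helper, not part of the agent.

```python
# Illustrative sample only; a real sar -u report has additional banner lines.
SAMPLE_SAR_U = """\
10:00:01    %usr    %sys    %wio   %idle
10:00:02      12       5       3      80
"""


def parse_sar_cpu(text):
    """Return {column: value} from the first data row after the %usr header."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    for i, fields in enumerate(rows):
        if "%usr" in fields and i + 1 < len(rows):
            header = fields[1:]          # drop the timestamp column
            values = rows[i + 1][1:]     # data row, timestamp dropped
            return {name: float(value) for name, value in zip(header, values)}
    raise ValueError("no %usr header followed by a data row was found")


cpu = parse_sar_cpu(SAMPLE_SAR_U)
busy_percent = 100.0 - cpu["%idle"]
```

Locating the header by its %usr column, rather than by a fixed line number, keeps the parser tolerant of banner lines that precede the data.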
Memory
The Uptime Infrastructure Monitor agent uses the vmstat 1 2 command to average statistics for the entire system. The agent also uses the sar utility with the following options to collect memory metrics from an AIX system:
- sar -b -f (cache metrics)
- sar -r -f (unused memory pages and disk blocks)
- sar -q -f (the average queue length while it is occupied, and the percentage of time the queue is occupied)
- sar -c -f (system calls)
The sar commands compare the system counters over a one-second interval.
Disk
The Uptime Infrastructure Monitor agent uses the following commands to collect disk statistics:
- df -k to gather capacity statistics for each file system.
- sar -d -f to output disk statistics (e.g. %busy, Read/Write/s) per disk, and compare those statistics between polling intervals. By default, the disk statistics are generated for all disks (including disks that are not active). This can be changed within the agent by setting the ACTIVEONLY flag in the perfparse.sh file to 0.
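
To make the file system capacity step concrete, the sketch below derives a used percentage per mount point from df -k style output. The sample text is illustrative, and parse_df_k is a hypothetical helper; it assumes a seven-column layout with the mount point last and no spaces in mount points.

```python
# Illustrative sample only; device names and sizes are made up.
SAMPLE_DF_K = """\
Filesystem    1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4           524288    262144   50%    10240     8% /
/dev/hd2          4194304   1048576   75%    51200     5% /usr
/proc                   -         -    -         -     -  /proc
"""


def parse_df_k(text):
    """Return {mount_point: capacity stats} for rows with numeric sizes."""
    capacity = {}
    for line in text.splitlines()[1:]:   # skip the header row
        fields = line.split()
        # Skip short rows and pseudo file systems that report "-" for sizes.
        if len(fields) < 7 or not fields[1].isdigit() or not fields[2].isdigit():
            continue
        total_kb = int(fields[1])
        free_kb = int(fields[2])
        used_pct = 100.0 * (total_kb - free_kb) / total_kb if total_kb else 0.0
        capacity[fields[6]] = {
            "total_kb": total_kb,
            "free_kb": free_kb,
            "used_pct": round(used_pct, 1),
        }
    return capacity
```

Computing the percentage from the block counts, instead of reading the %Used column, avoids depending on how the utility rounds that column.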
Network
The Uptime Infrastructure Monitor agent uses the netstat command with the following options to collect network metrics from an AIX system:
- netstat -s to combine TCP retransmits for all interfaces
- netstat -I <interface> to average statistics (e.g. kbps, errors and collisions) per interface
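
Averaging per-interface statistics between polls generally comes down to dividing the change in a cumulative counter by the elapsed time. The sketch below shows that arithmetic under stated assumptions: the counter names (ipkts, opkts, ierrs) and the rate_per_second helper are hypothetical and only stand in for whatever the agent actually records.

```python
def rate_per_second(previous, current, elapsed_seconds):
    """Average per-second rate for each counter between two snapshots."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    rates = {}
    for name, now in current.items():
        before = previous.get(name)
        if before is None or now < before:
            # New counter, or the counter was reset (for example by a reboot):
            # no meaningful rate can be computed for this interval.
            continue
        rates[name] = (now - before) / elapsed_seconds
    return rates


# Two snapshots taken 600 seconds (10 minutes) apart.
earlier = {"ipkts": 1000, "opkts": 500, "ierrs": 10}
later = {"ipkts": 1600, "opkts": 800, "ierrs": 2}
rates = rate_per_second(earlier, later, 600)
```

Skipping a counter that has gone backwards is one possible policy; it trades a missing data point for not reporting a large negative rate.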
Process and Workload
The Uptime Infrastructure Monitor agent uses the ps -eo command to collect process metrics from an AIX system. By default, the agent gathers only the top 20 processes, sorted by highest CPU usage. Workload statistics are sorted within Uptime Infrastructure Monitor's core. However, the core uses the same 20 processes that were gathered by the Process method. The following data are also gathered with the processes: the names of users, groups and processes, along with their individual statistics (e.g. memory and CPU usage). Uptime Infrastructure Monitor's core then sorts the statistics based on the graph you want to generate (e.g. user, group or process name).
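
The two stages described above (keep the top processes by CPU, then regroup the same set by user, group, or process name) can be sketched as follows. The record fields and both function names are hypothetical; this only illustrates the idea and does not reflect the product's internal code.

```python
from collections import defaultdict


def top_processes(processes, limit=20):
    """Keep the `limit` processes with the highest CPU usage."""
    return sorted(processes, key=lambda p: p["cpu"], reverse=True)[:limit]


def totals_by(processes, key):
    """Sum CPU and memory usage per user, group, or process name."""
    totals = defaultdict(lambda: {"cpu": 0.0, "mem": 0.0})
    for proc in processes:
        totals[proc[key]]["cpu"] += proc["cpu"]
        totals[proc[key]]["mem"] += proc["mem"]
    return dict(totals)


# Made-up records for illustration.
sample = [
    {"name": "cron", "user": "root", "group": "system", "cpu": 5.0, "mem": 1.0},
    {"name": "db_writer", "user": "oracle", "group": "dba", "cpu": 40.0, "mem": 10.0},
    {"name": "db_reader", "user": "oracle", "group": "dba", "cpu": 25.0, "mem": 2.5},
]
top_two = top_processes(sample, limit=2)
per_user = totals_by(top_two, "user")
```

Because the grouping runs on the already truncated list, any process outside the top set contributes nothing to the per-user or per-group totals, which matches the note that workload statistics use the same 20 processes.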
User
The Uptime Infrastructure Monitor agent uses the following utilities to collect user metrics from an AIX system:
- ps -eo
- last | head -10 (login history for the last 10 users on the system)
- who (lists who is currently logged into the system)
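
As a small illustration of what can be derived from who output, the sketch below counts login sessions per user. The sample text is made up, and sessions_per_user is a hypothetical helper that assumes the user name is the first whitespace-separated field on each line.

```python
from collections import Counter

# Illustrative sample only.
SAMPLE_WHO = """\
root        pts/0       Mar 01 09:15     (10.0.0.5)
oracle      pts/1       Mar 01 09:20     (10.0.0.7)
oracle      pts/2       Mar 01 10:02     (10.0.0.7)
"""


def sessions_per_user(text):
    """Count login sessions per user name from who-style output."""
    names = (line.split()[0] for line in text.splitlines() if line.strip())
    return dict(Counter(names))
```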