The Uptime Infrastructure Monitor (UIM) AIX agent collects the following performance metrics from the systems on which it is installed:

  • CPU
  • Memory
  • Disk
  • Network
  • Process
  • User

The AIX agent uses a number of utilities to gather these metrics, including:

sar: collects information about system activity. This version of sar is bundled with AIX. 
mpstat: collects processor-related metrics.

...

 
ifconfig: configures the parameters for network interfaces. 
ps: reports on the status of processes.

Each set of performance metrics is averaged over the interval at which the UIM monitoring station polls the agent (such as every 10 minutes).

Whenever the sar command uses the -f option to specify a file, that file is generated using the sadc 1 1 command. The sadc command polls the system counters at a one-second interval, and then writes the information that it receives to a file. The sar command then reads this file.
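The idea of comparing system counters across a one-second interval can be illustrated with a minimal Python sketch. It assumes the counters are cumulative totals captured in two snapshots; the function name, the counter names, and the values are hypothetical and are not taken from the agent or from sadc.

```python
def rates_per_second(before, after, interval_seconds=1.0):
    """Turn two cumulative counter snapshots into per-second rates.

    `before` and `after` map a counter name to its cumulative value.
    Counter names here are illustrative, not the fields sadc records.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return {
        name: (after[name] - before[name]) / interval_seconds
        for name in before
        if name in after
    }


# Hypothetical counter values taken one second apart.
first = {"page_ins": 1200, "page_outs": 300}
second = {"page_ins": 1260, "page_outs": 315}
print(rates_per_second(first, second))
```

A longer interval smooths out short bursts, while a one-second interval reflects only what happened during that second.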

CPU

The UIM agent uses the sar -u -f command to collect CPU metrics from an AIX system. The sar command compares the system counters during a one-second interval. If the system has multiple CPUs, the statistics that the agent returns are an average of all the CPUs on the server.
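As a rough illustration of averaging across CPUs, the following Python sketch takes one set of percentages per CPU and returns the mean of each field. The field names and sample values are assumptions made for the example; the agent's own parsing of sar output is not shown in this document.

```python
from statistics import mean


def average_across_cpus(per_cpu):
    """Average each CPU field over all CPUs.

    `per_cpu` is a list with one dict of percentages per CPU.
    The keys used below (usr, sys, wio) are illustrative labels.
    """
    if not per_cpu:
        raise ValueError("at least one CPU sample is required")
    fields = per_cpu[0].keys()
    return {field: mean(cpu[field] for cpu in per_cpu) for field in fields}


# Hypothetical readings for a two-CPU server.
cpu_samples = [
    {"usr": 40.0, "sys": 10.0, "wio": 2.0},
    {"usr": 20.0, "sys": 6.0, "wio": 0.0},
]
print(average_across_cpus(cpu_samples))
```

Because the result is an average, one saturated CPU can be hidden by several idle ones, which is why the Multi CPU Usage metric is reported separately.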

Metric | Explanation
% Usr | The percentage of time that the CPU spends in user mode.
% Sys | The percentage of time that the kernel spends processing system calls.
% WIO | The percentage of time that the CPU is idle while processes wait for an I/O operation to complete.
Multi CPU Usage | Whether a system with multiple CPUs is effectively balancing tasks between CPUs, or if processes are being forced off CPUs in certain circumstances.
Run Queue Length | The average number of services or processes that are waiting to be served by the CPU while the run queue is occupied.
Run Queue Occupancy | The percentage of time that one or more services or processes are waiting to be served by the CPU.
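The two run queue metrics are easy to confuse. Following the description of the sar -q option in this document (the average queue length while the queue is occupied, and the percentage of time the queue is occupied), a minimal Python sketch of the distinction might look like this. The function name and the sample values are hypothetical.

```python
def run_queue_summary(queue_lengths):
    """Return (average length while occupied, percent of samples occupied).

    `queue_lengths` holds one run queue length per sample. A sample
    counts as occupied when at least one process is waiting.
    """
    if not queue_lengths:
        raise ValueError("at least one sample is required")
    occupied = [n for n in queue_lengths if n > 0]
    occupancy_pct = 100.0 * len(occupied) / len(queue_lengths)
    avg_length = sum(occupied) / len(occupied) if occupied else 0.0
    return avg_length, occupancy_pct


# Hypothetical samples: the queue is empty in two of four samples.
queue_samples = [0, 2, 4, 0]
print(run_queue_summary(queue_samples))
```

In this example the queue is occupied half of the time, and when it is occupied it holds three waiting processes on average.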

Memory

The UIM agent uses the vmstat 1 2 command to average statistics for the entire system. The agent also uses the sar utility with the following options to collect memory metrics from an AIX system:

-b -f (cache metrics)

...

 
-r -f (unused memory pages and disk blocks)

...

 
-q -f (the average queue length while it is occupied, and the percentage of time the queue is occupied)

...

 
-c -f (system calls)

The sar commands compare the system counters over a one-second interval.

Metric | Explanation
...
Free Memory | The amount of physical memory available to the operating system, system library files, and applications.
Cache Hit Rate | How often read and write requests are satisfied from the buffer cache instead of from disk.
Page-outs/s | The rate at which pages are written to disk.
Page-ins/s | The rate at which pages are read from disk.
Page Free/s | The number of pages that are freed from memory each second.
Attaches/s | The number of pages that get attached to memory each second.
odio/s | The number of non-paging disk I/O operations that occur each second.
slots | The number of free pages on the paging spaces.
cycle/s | The number of page replacement cycles that occur each second.
fault/s | The number of page faults that occur each second.
Software Locks/s | The number of software locks that are issued each second.

Disk

The UIM agent uses the following commands to collect disk statistics:

df -k to gather capacity statistics for each file system.

...


sar -d -f to output disk statistics (such as %busy and Read/Write/s) per disk, and compare those statistics between polling intervals.

By default, the disk statistics are generated for all disks (including disks that are not active). You can change this option within the agent by setting the ACTIVEONLY flag in the perfparse.sh file to 0.

Metric | Explanation
Disk (Spindle) Name | The name of each disk on the system.
Usage (% Busy) | The percentage of time during which the disk drive is handling read or write requests.
Blocks per second | The average amount of data, in blocks, that is transferred to or from the disk each second during write or read operations.
Transfers/s | The number of read and write operations on the disk that occur each second.
Average Queued Requests | The average number of requests that are waiting in the disk queue.
Average Service Time | The average amount of time, in milliseconds, required to carry out a request.
Average Wait Time | The average amount of time, in milliseconds, that a transaction waits in a queue. The wait time is directly proportional to the length of the queue.

Network

The UIM agent uses the netstat command with the following options to collect network metrics from an AIX system:

netstat -s to combine TCP retransmits for all interfaces

...

 
netstat -I <interface> to average statistics (such as kbps, errors, and collisions) per interface.

Metric | Explanation
Receive Rate | The rate, in kilobytes per second, at which data is received over a specific network adapter.
Send Rate | The rate, in kilobytes per second, at which data is sent over a specific network adapter.
Packets Inbound Errors | The number of inbound packets that contained errors, which prevent those packets from being delivered to a higher-layer protocol.
Packets Outbound Errors | The number of outbound packets that could not be transmitted because of errors.
Collisions | The number of signals from two separate nodes on the network that have collided.
TCP Retransmits | The number of packets that were re-sent over a network interface.

Process and workload

The UIM agent uses the ps -eo command to collect process metrics from an AIX system. By default, the agent only gathers the top 20 processes and sorts them by the highest CPU usage.
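Selecting the top processes by CPU usage can be sketched in a few lines of Python. This is only an illustration: the ProcessSample record, its fields, and the sample data are hypothetical, and the sketch assumes the ps output has already been parsed into records.

```python
from typing import NamedTuple


class ProcessSample(NamedTuple):
    """One parsed process row. The field layout is an assumption."""
    pid: int
    user: str
    group: str
    name: str
    cpu_pct: float
    mem_pct: float


def top_by_cpu(processes, limit=20):
    """Return up to `limit` processes, highest CPU usage first."""
    return sorted(processes, key=lambda p: p.cpu_pct, reverse=True)[:limit]


# Hypothetical process list.
procs = [
    ProcessSample(101, "root", "system", "syncd", 1.5, 0.2),
    ProcessSample(202, "db2", "dba", "db2sysc", 12.0, 8.0),
    ProcessSample(303, "www", "staff", "httpd", 4.0, 1.0),
]
for proc in top_by_cpu(procs, limit=2):
    print(proc.name, proc.cpu_pct)
```

With a limit of 20, any process outside the top 20 by CPU usage is absent from the data that later reports are built on.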

Workload statistics are sorted within UIM's core. However, the core uses the same 20 processes that were gathered from the Process method. Data also gathered with the processes include the names of users, groups, and processes, along with their individual statistics (such as memory and CPU usage). UIM's core sorts the statistics based on the graph you want to generate (for example, user, group, or process name).
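Grouping the same process records by user, group, or process name could work along the following lines. This Python sketch is an assumption about the general approach, not the core's actual implementation; the dictionary keys and sample data are hypothetical.

```python
from collections import defaultdict


def workload_by(processes, key):
    """Sum CPU and memory usage per `key` ("user", "group", or "name").

    Returns (key value, totals) pairs, highest total CPU usage first.
    """
    totals = defaultdict(lambda: {"cpu_pct": 0.0, "mem_pct": 0.0})
    for proc in processes:
        bucket = totals[proc[key]]
        bucket["cpu_pct"] += proc["cpu_pct"]
        bucket["mem_pct"] += proc["mem_pct"]
    return sorted(
        totals.items(), key=lambda item: item[1]["cpu_pct"], reverse=True
    )


# Hypothetical process records.
workload_procs = [
    {"user": "root", "group": "system", "name": "syncd", "cpu_pct": 1.5, "mem_pct": 0.5},
    {"user": "db2", "group": "dba", "name": "db2sysc", "cpu_pct": 12.0, "mem_pct": 8.0},
    {"user": "db2", "group": "dba", "name": "db2fmp", "cpu_pct": 3.0, "mem_pct": 2.0},
]
for owner, usage in workload_by(workload_procs, "user"):
    print(owner, usage)
```

Passing "group" or "name" as the key produces the other two workload views from the same input.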

Metric | Explanation
Number of Processes | The number of processes that are currently running on a system.
Process Creation Rate | Determines whether there are runaway processes on a system or if a forking-based process (like a Web server) is spawning too many processes over a specified period of time.
Processes Running | The number of processes that are currently running.
Processes Blocked | The number of processes that are currently being blocked from running.
Processes Waiting | The number of processes that are currently waiting to run.
Workload - User | The demand that network and local services are putting on the system, based on the IDs of the users who are logged into a system.
Workload - Group | The demand that network and local services are putting on the system, based on the IDs of the user groups that are logged into a system.
Workload - Process Name | The demand that network and local services are putting on a system, based on the processes that are running.
Workload Top 10 - User | The 10 network and local services that are putting the most load on the system, based on the IDs of the users who are logged into a system.
Workload Top 10 - Group | The 10 network and local services that are putting the most load on the system, based on the IDs of the user groups that are logged into a system.
Workload Top 10 - Process Name | The 10 network and local services that are putting the most load on the system, based on the processes that are running.

User

The UIM agent uses the following utilities to collect user metrics from an AIX system:

ps -eo

...

 
last | head -10 (login history for the last 10 users on the system)
who (lists who is currently logged into the system)

Metric | Explanation
Login History | The number of times or frequency at which a user logs into a system during any 30-minute time interval.
Sessions | The number of sessions or number of distinct users who are logged into a system during any 30-minute time interval.
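The difference between the number of sessions and the number of distinct users can be shown with a short Python sketch that counts both from who-style output. It assumes the first whitespace-separated field of each line is the user name; the function name and the sample text are made up for illustration and do not come from a real system.

```python
def summarize_who(who_output):
    """Count sessions and distinct users in who-style output.

    Each non-empty line is treated as one session, and the first
    field on the line is taken to be the user name.
    """
    users = [
        line.split()[0] for line in who_output.splitlines() if line.strip()
    ]
    return {"sessions": len(users), "distinct_users": len(set(users))}


# Made-up sample resembling who output.
who_sample = """\
root        pts/0       Mar 03 09:12     (10.0.0.5)
oracle      pts/1       Mar 03 09:40     (10.0.0.7)
root        pts/2       Mar 03 10:02     (10.0.0.5)
"""
print(summarize_who(who_sample))
```

A user with several terminals open counts once as a distinct user but once per terminal as a session.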