Wednesday, October 19, 2011

Combining DB2 and OS Diagnostics Part 3

Combining DB2 and OS Diagnostics Part 2

A simple methodology on Unix systems to approach a problem scenario:


On UNIX, problem investigation often starts with the following:
· Check vmstat or iostat output for the amount of system and user CPU being consumed
versus wait and idle time. Wait time usually, but not always, means waiting on I/O. If
system CPU is equal to or greater than user CPU, this almost always indicates a problem.
· Check the top CPU-using processes by ordering ps output by descending CPU penalty (C)
and memory usage (RSS and SZ columns). For example:
ps -elf | sort +5 -rn
· Are the same processes near the top? Is the CPU penalty staying in the 60-120 range? This
would mean these processes are constantly consuming CPU cycles and need to be
investigated.

On Windows, check for the top CPU and memory users by ordering the output on those columns.
I/O is more difficult to judge without getting perfmon output.

Where my DB2 database spending time


In order to identify where DB2 is spending most of it's time.

In 9.7 new in memory monitoring functions has been introduced.
One of the most interesting and useful feature I found, now you can easily determine where DB2 is spending most of its time.

You can use this SQL, V9.7 onward.

select u.metric_name,
u.parent_metric_name,
sum(u.total_time_value),
sum(u.count)
from table(mon_get_service_subclass_details(null,null,-2)) as t,
table(mon_format_xml_times_by_row(t.details)) as u
group by metric_name, parent_metric_name


Output:



Metric_Name Parent_Metrics_Name Total_Time_Value Count
AGENT_WAIT_TIME TOTAL_WAIT_TIME 0 0
AUDIT_FILE_WRITE_WAIT_TIME TOTAL_WAIT_TIME 0 0
AUDIT_SUBSYSTEM_WAIT_TIME TOTAL_WAIT_TIME 0 0
CLIENT_IDLE_WAIT_TIME
12781606
DIAGLOG_WRITE_WAIT_TIME TOTAL_WAIT_TIME 452 211
DIRECT_READ_TIME TOTAL_WAIT_TIME 13351 6066
DIRECT_WRITE_TIME TOTAL_WAIT_TIME 635 90
FCM_MESSAGE_RECV_WAIT_TIME FCM_RECV_WAIT_TIME 65799 35803
FCM_MESSAGE_SEND_WAIT_TIME FCM_SEND_WAIT_TIME 0 0
FCM_RECV_WAIT_TIME TOTAL_WAIT_TIME 66430 35943
FCM_SEND_WAIT_TIME TOTAL_WAIT_TIME 0 0
FCM_TQ_RECV_WAIT_TIME FCM_RECV_WAIT_TIME 631 140
FCM_TQ_SEND_WAIT_TIME FCM_SEND_WAIT_TIME 0 0
IPC_RECV_WAIT_TIME TOTAL_WAIT_TIME 414 105880
IPC_SEND_WAIT_TIME TOTAL_WAIT_TIME 443 108850
LOCK_WAIT_TIME TOTAL_WAIT_TIME 0 0
LOG_BUFFER_WAIT_TIME TOTAL_WAIT_TIME 0 0
LOG_DISK_WAIT_TIME TOTAL_WAIT_TIME 48250 6832
POOL_READ_TIME TOTAL_WAIT_TIME 1398 990
POOL_WRITE_TIME TOTAL_WAIT_TIME 1233 147
TCPIP_RECV_WAIT_TIME TOTAL_WAIT_TIME 0 0
TCPIP_SEND_WAIT_TIME TOTAL_WAIT_TIME 0 0
TOTAL_COMMIT_PROC_TIME TOTAL_RQST_TIME 7344 11901
TOTAL_COMPILE_PROC_TIME TOTAL_RQST_TIME 13021 19386
TOTAL_IMPLICIT_COMPILE_PROC_TIME TOTAL_RQST_TIME 0 0
TOTAL_LOAD_PROC_TIME TOTAL_RQST_TIME 0 0
TOTAL_REORG_PROC_TIME TOTAL_RQST_TIME 0 0
TOTAL_ROLLBACK_PROC_TIME TOTAL_RQST_TIME 2485 19189
TOTAL_ROUTINE_USER_CODE_PROC_TIME TOTAL_RQST_TIME 23046 1247
TOTAL_RQST_TIME
504097 149909
TOTAL_RUNSTATS_PROC_TIME TOTAL_RQST_TIME 8968 61
TOTAL_SECTION_PROC_TIME TOTAL_RQST_TIME 21926 31140
TOTAL_SECTION_SORT_PROC_TIME TOTAL_SECTION_PROC_TIME 22 182
TOTAL_WAIT_TIME TOTAL_RQST_TIME 132649
WLM_QUEUE_TIME_TOTAL TOTAL_WAIT_TIME 0 0

Combining DB2 and OS Diagnostics Part 1

Often there is need to combine DB2 diagnostic data with OS. Here are few useful commands and utilities you can use on different platforms..



AIX HP-UX Solaris Linux
OS level /usr/sbin/oslevel /usr/bin/uname -a /usr/bin/uname -a /bin/uname -a
Software /usr/bin/lslpp -ah /usr/sbin/swlist -v /usr/bin/showrev -p /bin/rpm -qa
CPUs /usr/sbin/lsdev -C (grep proc) /usr/sbin/ioscan (grep processor) /usr/sbin/psrinfo -v /bin/dmesg (grep CPU)
Memory /usr/sbin/lsattr -El sys0 (grep realmem) /usr/samples/kernel/vmtune /usr/sbin/dmesg (grep Physical) /usr/sbin/prtconf (grep Memory) /bin/dmesg (grep Memory) /usr/bin/free
IPC resources Ipcs /usr/sbin/kmtune SAM GUI tool /usr/sbin/sysdef -i /etc/system (file) /usr/bin/ipcs -l
User limits ulimit -a /etc/security/limits (file) ulimit -a ulimit -a ulimit -a
User environment Set Set Set Set



Windows NT has a nice interface for obtaining system information:
winmsd /af


On Windows 2000® and XP®:
<OSdrv>:\Program Files\Common Files\Microsoft Shared\MsInfo\Msinfo32
/report
This command does not report on installed software. One way to obtain this information is
to export the registry (at least the HKEY_LOCAL_MACHINE\SOFTWARE branch) to a
file by using the regedit tool.


System error and message logs on UNIX®
The following table shows where the logs are located on different UNIX platforms.



AIX HP-UX Solaris Linux
Error / Message Logs /usr/bin/errpt -a /var/adm/syslog/syslog.log /usr/sbin/dmesg /var/log/messages* /var/adm/messages*