LDD Today

Performance perspectives
Rules-of-thumb for monitoring Windows NT/2000 and Domino statistics

by Harry Murray

Level: Intermediate
Works with: Domino
Updated: 04-Jun-2002


There are many books and articles available to help you tune your operating systems and Domino, and to explain how to monitor those systems to be sure your server is performing well. But if you're like many Domino and system managers, you spend most of your day solving problems, doing maintenance, planning and executing upgrades, and so on, and have little time to wade through all that great tuning and system monitoring information. You also may have more than one operating system (OS) to support, which makes it all the more difficult to become an expert and “know it all.” To complicate things further, even if you have taken the time to read the books and articles, you may still not be sure you have set up the environment optimally because of seemingly conflicting points of view and the complexity of it all.

This is the first of a multi-part series on performance monitoring and performance tuning tips. It focuses on the Windows NT and Windows 2000 operating systems; other operating systems will be covered in subsequent parts of the series. But this month, we'll attempt to extract the most important Windows NT and Windows 2000 OS statistics and resource usage bottleneck thresholds from the available written information and from our Domino performance engineers. We'll tell you only what you need to know to be sure your system is performing properly and what standard tools you can use to accomplish this. We'll also include some of the more important OS tuning tips as well as a list of references in case you want to dig deeper.

For this column, we assume that you have a basic knowledge of the OS. The performance monitoring tools that we discuss are the ones shipped with the OS.

We're hoping this series will become a simple set of rules-of-thumb that you can use as a basic guide to optimal performance from your servers. As always, we welcome your feedback on the ideas presented here. We are attempting to put a stake in the ground as to what we think is correct. We hope this stimulates discussion that will further refine these ideas. It’s possible that we've missed some important points. If your experience indicates that some of our information needs to be corrected, please let us know.

Guidelines for monitoring system performance and tuning systems
First, let's go over some general guidelines about monitoring system performance and tuning systems:
Before discussing the tools that help you in your data collection and analysis, let's look at how to enable disk and network statistics for both the Windows performance monitor and the new Domino platform statistics.

The monitoring tools
Before you start using the monitoring tools, make sure you have done the following to enable statistics collection.

Setting up Win32 systems
To set up Windows NT for performance monitoring and platform statistics collection, use:
On Windows 2000, use:
To enable network counters for Windows NT/2000:
If you need additional information regarding enabling the SNMP server, refer to your Windows NT or Windows 2000 System Administration Reference Guide.

After you make any of these changes, you must restart the system so that the settings take effect.

Using Perfmon
The Performance monitor (Perfmon) is an excellent utility for viewing and capturing a great deal of performance data. For example, it can capture Total CPU data, which is the percent of CPU used by everything that's happening on the system. It can capture the percent disk time for each disk on the system, which tells you how busy the disks are. It can also capture available memory in MB, which is the amount of physical RAM not currently in use. Applications such as Domino offer the ability to add application statistics to the mix. (To activate Domino statistics in the Windows NT/2000 Performance monitor, see the Domino Administration Help.)

To start Perfmon on the server, click Start and choose Programs - Administrative Tools - Performance Monitor. In Perfmon, you can click the plus sign (+) icon to add counter statistics. Note that in Windows 2000, you can right-click to get many additional options. Not only can you monitor the Perfmon stats in real-time, but you can log (capture and save in a database) Perfmon stats that are of interest.

The following screen shows the Performance monitor in Windows 2000. The counter represented by the white line is the server CPU counter. You can see that the server CPU usage climbed to almost 100 percent. This counter is the Total CPU for the 8-CPU server, so a value near 100 percent indicates that all 8 CPUs are almost fully busy. The blue line shows the percent disk time on a 10-disk RAID set. In Windows NT, 100 percent disk time would indicate that the disk is working at near capacity. In Windows 2000, however, the percent disk time for the 10-disk RAID set can go as high as around 1000 percent without running at maximum capacity. So the disk shown below is not near capacity.

Figure: Windows 2000 Performance Monitor
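One rough way to read the Windows 2000 figure is to divide the combined percent disk time by the number of disks in the set. The sketch below assumes that the combined value scales with the number of disks, as the 10-disk, 1000 percent example above suggests. The function name is hypothetical, and the 250 percent reading is an illustrative value, not a measurement.

```python
def pct_disk_time_per_disk(pct_disk_time, disks_in_set):
    """Approximate per-disk busy percentage for a RAID set.

    Assumption: the combined percent disk time scales linearly with the
    number of disks (10 disks -> up to about 1000 percent).
    """
    if disks_in_set < 1:
        raise ValueError("disks_in_set must be at least 1")
    return pct_disk_time / disks_in_set


# Illustrative reading of 250 percent on a 10-disk RAID set.
print(pct_disk_time_per_disk(250.0, 10))  # 25.0
```

A result well below 100 suggests the set still has headroom. Treat it as a first approximation only.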

More details on how to use Perfmon are given in the Windows NT and Windows 2000 Resource Kits Guides from Microsoft referenced at the end of this article.

Using Windows Task Manager
You can use the Windows Task Manager to see quickly how busy the CPU is and to view memory statistics. Press Ctrl-Alt-Delete and click the Task Manager button to open the Task Manager. Then click the Performance tab. Note that Task Manager does not give Disk I/O statistics.

Collecting Platform statistics using Domino
There is a Domino feature, Platform stats, that provides another way to capture OS statistics such as CPU utilization. As a Domino systems administrator, you may find it easier to collect these statistics using Domino rather than the OS tools. These Platform stats can be used in place of statistics from Perfmon.

The Platform stats are integrated into Domino Monitoring (Events4.nsf) and Reports (Statrep.nsf). You can set thresholds and alarms for server resource usages within Domino. Platform stats are available via the server console, the new Domino 6 Administration client, and the Web client. Both Domino statistics and OS statistics can be collected together both in real-time and for historical trend analysis. Platform stats are grouped into five categories: logical disk, memory, network, CPU, and miscellaneous system stats. For some stats, average, minimum, and peak values are calculated.

Platform stats were introduced in 5.0.2 for the Windows NT and Solaris operating systems and in 5.0.3 for the IBM iSeries (AS/400). In addition, Domino 6 supports Platform stats for the IBM zSeries (S/390), AIX, and Windows 2000. Note that in R5, you need to include the following NOTES.INI setting to activate Platform stats:

Platform_Statistics_Enabled=1

When you do a "show stat platform" at the Domino R5 console, many statistics are reported. The naming convention for the statistics that are reported is the same across platforms to make it easier for you to support multiple server platforms. Some of the particularly interesting reported statistics include:

Platform.LogicalDisk._Total.1._Total.1.AvgQueueLength = 0 (average disk queue length for all disks)
Platform.Memory.KBFree = 72,788 (unused available RAM memory)
Platform.System.TotalUtil = 9.6 (total percent CPU)

See the Domino R5 Platform stats sidebar for a complete listing of the reported statistics.

When you do a "show stat platform" at the Domino 6 console, over 130 statistics are reported. Some of the particularly interesting reported statistics include:

Platform.LogicalDisk.5.AvgQueueLen = 0.5 (average disk queue length for disk H:)
Platform.Memory.RAM.AvailMBytes = 2,660.5 (unused available RAM. This number needs a little explaining. To get the actual amount of physical memory used, subtract this value from the total amount of physical RAM in the system (up to 4 GB). Because this particular system had 4000 MB, the actual RAM used was 4000 - 2660.5 = 1339.5 MB.)
Platform.Network.1.CurrBandwidthMbitsPerSec = 100 (network bandwidth 100 Mbit)
Platform.Network.Total.NetworkBytesPerSec = 168,280.5 (total network bytes per second)
Platform.Network.Total.PctUtilBandwidth = 1.3 (percent network utilization)
Platform.System.PctCombinedCpuUtil = 6.2 (total percent CPU)

Notice that network statistics were added to the Domino 6 Platform stats. See the Domino 6 Platform stats sidebar for a complete listing of the reported statistics.
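If you capture the console output to a file, the statistics are easy to load into a script for trend analysis. The following is a minimal sketch, assuming each statistic appears on its own line in the "Name = value" form shown above; the real console output may contain other lines, which this sketch skips. The function name parse_platform_stats is hypothetical.

```python
def parse_platform_stats(console_text):
    """Parse 'Platform.Name = value' lines into a dict of floats.

    Assumption: one statistic per line, formatted as in the samples in
    this article. Non-numeric or unrelated lines are ignored.
    """
    stats = {}
    for line in console_text.splitlines():
        name, sep, value = line.partition("=")
        name = name.strip()
        if not sep or not name.startswith("Platform."):
            continue
        tokens = value.split()
        if not tokens:
            continue
        try:
            # Remove thousands separators such as the comma in "2,660.5".
            stats[name] = float(tokens[0].replace(",", ""))
        except ValueError:
            continue  # non-numeric statistic; skipped in this sketch
    return stats


sample = """\
Platform.Memory.RAM.AvailMBytes = 2,660.5
Platform.Network.Total.PctUtilBandwidth = 1.3
Platform.System.PctCombinedCpuUtil = 6.2
"""
stats = parse_platform_stats(sample)
print(stats["Platform.System.PctCombinedCpuUtil"])  # 6.2
```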

Using the Server Health Monitoring tool from Tivoli Software
Some of you might have seen the new Domino 6 tool, currently called Server Health Monitoring, demonstrated at Lotusphere 2002. This new tool may be the answer to most of your wishes with respect to determining whether you have a performance problem with your Domino server. Not only can it alert you to server performance problems, it can also suggest some possible solutions. It is fully integrated with Domino, so you can use it to monitor one or more servers simultaneously through the Domino 6 Administrator client. For more information, see the LDD Today article “Start using Domino 6 Server Health Monitoring now!”

Resource bottleneck threshold rules-of-thumb
This section attempts to establish rules-of-thumb to determine if important system resources are on the verge of being a bottleneck and limiting server performance.

Disk bottlenecks
To calculate thresholds for disk bottlenecks, use the Perfmon Disk Queue Length. For example (where the number of spindles equals the number of disks in the RAID set or, if it’s a single disk, equals 1):
These thresholds work for a single disk or a RAID 0 set (see the “A few tuning tips” section for an explanation of some RAID types). However, the expected performance of other RAID types would be lower. For example, a RAID 5 set gives up the equivalent of one disk to redundancy, so you might subtract 1 from the number of disks.
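The spindle adjustment can be sketched as a small calculation. The example below is illustrative only: the function names are hypothetical, dividing the queue length by the spindle count is our assumption about how to normalize the value, and only the RAID 5 adjustment described above is applied. Compare the result against whatever threshold you have adopted for your own servers.

```python
def effective_spindles(disks_in_set, raid_level=0):
    """Number of spindles to use when judging disk queue length.

    A single disk counts as 1. For RAID 5, one disk is subtracted for
    redundancy, as suggested in the text. Other RAID levels are treated
    like RAID 0 here, which is an assumption of this sketch.
    """
    if disks_in_set < 1:
        raise ValueError("disks_in_set must be at least 1")
    if raid_level == 5:
        return max(disks_in_set - 1, 1)
    return disks_in_set


def queue_per_spindle(avg_queue_length, disks_in_set, raid_level=0):
    """Average disk queue length divided by the effective spindle count."""
    return avg_queue_length / effective_spindles(disks_in_set, raid_level)


# Illustrative values: a queue length of 4.5 on a 10-disk RAID 5 set.
print(queue_per_spindle(4.5, 10, raid_level=5))  # 0.5
```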

In R5, the Domino Platform stat that gives the average disk queue length for all the disks is Platform.LogicalDisk._Total.1._Total.1.AvgQueueLength. Domino 6 has a separate statistic for each logical disk, named in the form Platform.LogicalDisk.5.AvgQueueLen. For example:

Platform.LogicalDisk.5.AvgQueueLen = 0.5 (average disk queue length for disk 5)

The Domino Platform stats can be used in place of Perfmon stats.

Note: The logical disk counters in Windows NT/2000 may be set to Off. See “The monitoring tools” section for how to turn them on.

Memory bottlenecks
To calculate thresholds for memory bottlenecks, use the Perfmon Memory Available Bytes stat. For example:
The R5 Domino Platform stat that gives available RAM memory is Platform.Memory.KBFree. In Domino 6, it's Platform.Memory.RAM.AvailMBytes.

Platform.Memory.RAM.AvailMBytes = 2,660.5 (unused available RAM memory)

The Platform stats can be used in place of Perfmon stats.

Alternatively, in a system with more than 2 GB of RAM, you can look at the Perfmon counter Process Working Set Total. When this counter approaches 2 GB, you are almost out of application memory space, because Windows limits 32-bit applications (such as Domino) to a total of 2 GB across all applications. You can also view memory conditions in the Windows Task Manager on the Performance tab under MEM Usage.

We've observed that the maximum amount of memory (RAM) used for both the OS and Domino in Windows NT/2000 is about 2.3 GBs. This is determined by subtracting the Perfmon counter Memory Available Bytes from the total RAM that the OS “sees.” You can determine the amount of memory seen by the OS by looking at the Physical Memory setting on the Performance tab of the Windows Task Manager.
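The subtraction described above can be written out as a short calculation. This is a minimal sketch with hypothetical function names; the 4000 MB and 2660.5 MB figures are the ones reported earlier in this article, and the 1500 MB working set is an illustrative value.

```python
def ram_used_mb(total_ram_mb, available_mb):
    """RAM in use = total RAM seen by the OS minus available RAM."""
    return total_ram_mb - available_mb


def working_set_headroom_mb(working_set_total_mb, limit_mb=2048):
    """Memory left before Process Working Set Total reaches the 2 GB limit."""
    return limit_mb - working_set_total_mb


# Figures from the Domino 6 example: 4000 MB total, 2660.5 MB available.
print(ram_used_mb(4000, 2660.5))  # 1339.5

# Illustrative working set of 1500 MB.
print(working_set_headroom_mb(1500))  # 548
```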

CPU bottlenecks
To calculate CPU thresholds, use the Perfmon Total CPU stat. For example:
The R5 Domino Platform stat that gives percent CPU utilization is Platform.System.TotalUtil. In Domino 6, it's Platform.System.PctCombinedCpuUtil. The Platform stats can be used in place of Perfmon stats.

It’s important to take into consideration the duration of the high CPU values. For example, a high value for a few minutes may not be a problem, but high values that last more than 5 minutes and occur often may be a concern.

Also, if the value of the Processor Queue Length counter under the Perfmon System object is equal to or greater than the number of CPUs for a long period of time (say, more than 5 minutes), then there may be a CPU bottleneck.
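The duration rule is easy to apply to logged samples. The sketch below is one possible implementation, not a Perfmon or Domino feature: the function name and the (seconds, value) sample format are assumptions, and the readings are illustrative. It reports whether a value stayed at or above a limit continuously for more than 5 minutes.

```python
def sustained_over(samples, limit, min_seconds=300):
    """Return True if value >= limit continuously for more than min_seconds.

    samples: iterable of (seconds, value) pairs in time order, for example
    Processor Queue Length readings taken from a Perfmon log.
    """
    run_start = None
    for t, value in samples:
        if value >= limit:
            if run_start is None:
                run_start = t  # start of a new high run
            if t - run_start > min_seconds:
                return True
        else:
            run_start = None  # the run was interrupted
    return False


# Illustrative data: an 8-CPU server sampled once a minute.
queue_lengths = [(i * 60, 9) for i in range(7)]  # 0 to 360 seconds
print(sustained_over(queue_lengths, limit=8))  # True
```

The same function could be used with Total CPU samples and a percentage limit of your choosing.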

Network bottlenecks
For Windows 2000, Perfmon gives the network bandwidth in bits per second and the total local network bytes per second if SNMP (Simple Network Management Protocol) is installed. For Windows NT, the bandwidth is not available but network bytes per second is available if SNMP is installed, and presumably the bandwidth of the network adapter is known by support staff. (See the article “Monitoring Network Traffic” in Windows & .Net Magazine.)

So to calculate the network thresholds:
There is no R5 Domino Platform stat that provides network utilization. For Domino 6, use Platform.Network.1.CurrBandwidthMbitsPerSec, Platform.Network.Total.NetworkBytesPerSec, and Platform.Network.Total.PctUtilBandwidth. The Platform stats can be used in place of Perfmon stats.
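Percent utilization can be cross-checked from the other two values. The sketch below assumes that 1 Mbit equals 1,000,000 bits and that utilization is simply bits per second divided by bandwidth. We have not confirmed that Domino computes it exactly this way, but the result matches the sample statistics shown earlier in this article. The function name is hypothetical.

```python
def pct_bandwidth_util(bytes_per_sec, bandwidth_mbits_per_sec):
    """Percent of network bandwidth in use.

    Assumption: 1 Mbit = 1,000,000 bits.
    """
    bits_per_sec = bytes_per_sec * 8
    return 100.0 * bits_per_sec / (bandwidth_mbits_per_sec * 1_000_000)


# Sample values: 168,280.5 bytes per second on a 100 Mbit network.
print(round(pct_bandwidth_util(168280.5, 100), 1))  # 1.3
```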

Note: A much better way to determine if there is a network bottleneck is to use network “sniffer” software or hardware to determine if the collision rate is more than 5 percent. That level of collision rate usually indicates a problem with the network or a saturated network. The collision rate is not available from within Windows NT/2000.

A few tuning tips
There are some things you can do to improve system performance. They're not magic OS tuning parameters, just good basic Windows system management practices. That doesn't negate their importance, however.

Fragmented disks
You should run a defragmenter utility often on your disks, including the OS disk. For busy disks, weekly defragmenting is recommended. If defragmenting is not done, performance will degrade significantly. A simple defragmenter is shipped with Windows 2000, but it might be better to acquire a more advanced defragmenter that can automatically run on a number of systems at periodic intervals.

Pagefile
For best performance, it is important for medium-size and large systems to have a separate pagefile disk, especially with Windows NT. In our performance testing labs, using the same hardware configurations, we have observed higher percent pagefile disk utilization in Windows NT than in Windows 2000 for a given load. At very high system utilization (with thousands of active users), we have observed Domino becoming unstable until we moved the pagefile to its own RAID set. We were then able to increase the number of active users by about 16 percent. For example, on Windows 2000, we observed that instability can occur at only 6 percent pagefile disk utilization with approximately 3,700 IMAP users using a single pagefile disk. After adding an old 10-disk RAID set for the pagefile, we were able to run at 4,400 active IMAP users.

LargeSystemCache
Windows NT and Windows 2000 have a disk-I/O cache called the LargeSystemCache. You can set it to “Maximize data throughput for file sharing,” “Maximize data throughput for network applications,” or “Minimize memory used.” In Windows NT, there is also a fourth setting to “balance file sharing and network applications.”

The default setting is to favor file sharing, which uses more memory than the network applications setting or the minimize memory setting. We have recently observed in the lab that leaving the default setting (favoring file sharing) can reduce data disk utilization by 15 percent at high usage. So we are considering changing our current recommendation, which is to set the cache to favor network applications, to recommend favoring file sharing (the default) instead. If your server memory is a bottleneck, you should set the cache to favor network applications or even, in extreme situations, to minimize memory.

To change the LargeSystemCache setting in Windows 2000, open the Control Panel, double-click the Network and Dial-up Connections icon, double-click Local Area Connection, click the Properties button, and select File And Printer Sharing For Microsoft Networks. For Windows NT, open the Control Panel, double-click the Network icon, click the Services tab, select the server icon, and then click the Properties button. You can then see the four options; select "maximum throughput for file sharing."

Optimize performance for applications or background services
You should optimize the performance for background applications and services to ensure that a foreground application doesn’t dominate the server resources. Domino is considered to be a background application by the OS. Using the server interactively is considered a foreground application and will take priority over Domino. So if you have the default settings in the OS and you are using the server interactively, you may negatively affect Domino performance.

To set the priority for background services in Windows 2000, open the Control Panel, double-click System, click the Advanced tab, click the Performance Options button, and select Background services. To set the priority in Windows NT, open the Control Panel, double-click the System icon, click the Performance tab, and then move the "Select the performance boost for the foreground application" slider to None.

File system
The NTFS file system has significant performance advantages over FAT or FAT32. Be sure that the disks are formatted with a cluster size of at least 4 KB for best performance. It’s best to format with a cluster size that is a little larger than the average file size on the disk. For example, to use a 16 KB allocation size for formatting the NTFS volumes, at the command prompt use:

format <drive>: /fs:ntfs /A:16K

Note that sizes of 512 bytes, 1024 bytes, 2048 bytes, 4096 bytes, 8192 bytes, 16KB, 32KB, or 64KB are supported by NTFS.
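The cluster size guideline can be sketched as a simple lookup. This is an illustration under stated assumptions, not a sizing tool: the function name is hypothetical, "a little larger than the average file size" is interpreted as the smallest supported size that is at least the average, and the 4 KB floor comes from the recommendation above.

```python
# Cluster sizes supported by NTFS, in bytes (from the list above).
NTFS_CLUSTER_SIZES = [512, 1024, 2048, 4096, 8192, 16384, 32768, 65536]


def suggest_cluster_size(avg_file_bytes, minimum=4096):
    """Smallest supported cluster size >= both the minimum and the average.

    Falls back to the largest supported size (64 KB) when the average
    file size exceeds it.
    """
    for size in NTFS_CLUSTER_SIZES:
        if size >= minimum and size >= avg_file_bytes:
            return size
    return NTFS_CLUSTER_SIZES[-1]


# Illustrative average file size of 12,000 bytes.
print(suggest_cluster_size(12000))  # 16384
```

A result of 16384 corresponds to the /A:16K option shown in the format command above.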

RAID sets
When setting up data disk RAID sets in Windows NT/2000, set the stripe size to be approximately equal to the average logical disk bytes per transfer measured in Perfmon for the typical workload for the server. The Domino Performance team usually sets this to 16K for their analysis. Set cache write policy to write back, and set cache read policy to read ahead.

Hardware RAID, where a special RAID controller card is connected to a set of disks, is highly recommended for your data disks. The RAID card has disk I/O performance that is many times that of the OS software RAID implementation.

RAID level | Fault tolerance provided | Random write performance | Random read performance
0 (stripe) | None | Best | Best
1 (mirror) | Capable of surviving one disk failure without data loss | Good | Good
5 (stripe / parity) | Capable of surviving one disk failure without data loss | Worst | Best

There are other combinations of RAID levels. For example, RAID-1+ (or enhanced RAID-1) uses a combination of RAID-0 and RAID-1: each of the mirror “disks” is actually a RAID-0 set. This offers the best performance of any RAID implementation. RAID-5 uses the equivalent of one disk for data redundancy and is low in cost but does not have the highest performance.

PCI buses
Distribute the network adapters and RAID controller across multiple buses, if your server has them, so that the total I/O bandwidth for each PCI bus is as equal as possible. In general, it is a good idea to put a RAID controller on a bus that does not have a network adapter.

In conclusion
The “rules-of-thumb” for performance tuning are constantly changing as the OS, applications, and our knowledge change. This article presents a snapshot of what the Domino Performance team thinks is important at this point in time. We hope it has given you some new and helpful ideas. As always, we would very much appreciate hearing from you, especially if you have drawn conclusions different from ours based on your experience. Look for rules of thumb for other operating systems in future Performance Perspectives columns.

Windows NT/2000 references

ABOUT THE AUTHOR
Harry Murray has worked on the Domino Performance team for four years. Prior to that, he worked on the Compaq Computer Application Systems Engineering Performance team.