LDD Today

Performance perspectives
iSeries performance with Domino 6, QuickPlace 3, and Sametime 3

by Dave Johnson, Don Morrison, and Joe Peterson
Recently released Lotus collaboration products offer some very impressive performance improvements. These improvements clearly translate to improved performance for end users. In many cases, these enhancements allow an existing server to handle more work without degrading response time—and perhaps even result in lower total cost of ownership (TCO) at your site.

This article discusses the performance improvements on the IBM eServer iSeries platform for Domino 6, Sametime 3, and QuickPlace 3. We show how Domino 6 compares with Release 5.0.11 and how Sametime 3 and QuickPlace 3 compare with their previous versions. We're happy to report that these new releases deliver improved CPU usage and response time and, in some cases, significantly improved scalability.

We collected the performance data presented in this article in controlled lab environments simulating large numbers of users driving workloads on the servers under test. The test environments included NotesBench workloads, a Sametime load tool, and LoadRunner scripts. We describe the environments and workloads used in each test later in this article. Note that the performance data we show does not represent maximum capacities for the systems. Also note that although we used NotesBench workloads in our tests, our results were not officially audited.

Of course, no test environment, however carefully planned, can predict exact performance numbers for every possible customer configuration. Nevertheless, we are confident our information is both useful and accurate. (We talk more on this topic later in this article.)

We assume you're an experienced iSeries system administrator familiar with Domino, Sametime, and QuickPlace.

Notes client improvements with Domino 6
Using the R5Mail workload, we compared performance using Domino 5.0.11 and Domino 6. The following table summarizes our results:

| Domino version | Number of R5Mail users | Average CPU utilization (percent) | Average response time (msec) | Average disk utilization (percent) |
|---|---|---|---|---|
| Domino 5.0.11 | 3,000 | 39.4 | 26 | 7.1 |
| Domino 6 | 3,000 | 27.6 | 18 | 5.2 |
| Domino 5.0.11 | 8,000 | 69.7 | 67 | 25.2 |
| Domino 6* | 8,000 | 46.7 | 46 | 26.1 |

* Additional memory was added for this test.

In one test, we compared R5.0.11 with Domino 6 by running 3,000 simulated users on an iSeries model i270-2253 with a two-way 450 MHz processor. This system had 8 GB of memory and 12 x 18 GB disk drives configured with RAID5. As you can see in the previous table, Domino 6 reduced CPU utilization by 30 percent and cut average response time from 26 msec to 18 msec.

In the second test, we simulated 8,000 users running on a model i810-2469 with a two-way 750 MHz processor. The system was equipped with 24 x 8.5 GB disk drives configured with RAID5. In this test, we obtained a slightly greater than 30 percent improvement in CPU utilization with a significant reduction in response time with Domino 6. For this comparison, we intentionally constrained main memory slightly, with 8 GB available for the 8,000 users. We found that we needed to add 13 percent more memory (an additional 1 GB in this case) when running Domino 6 to achieve the same paging rates, faulting rates, and average disk utilization as the Domino 5.0.11 test. Adding more memory to the Domino 5.0.11 test would likely have improved response times and disk utilization. Domino 6 includes new memory caching techniques for the Notes client to improve response time, which may require additional memory. (See the conclusion of this article for more information.)
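As a cross-check, the percentages quoted for the two R5Mail tests can be recomputed from the table values. The Python sketch below is illustrative only: the names `R5MAIL_RESULTS` and `percent_reduction` are hypothetical and are not part of NotesBench or any Lotus tool, and the memory figures (8 GB plus 1 GB) are taken from the paragraph above.

```python
# Values copied from the R5Mail results table in this article.
# Each entry: (users, metric, Domino 5.0.11 value, Domino 6 value)
R5MAIL_RESULTS = [
    (3000, "CPU utilization (percent)", 39.4, 27.6),
    (3000, "response time (msec)", 26, 18),
    (8000, "CPU utilization (percent)", 69.7, 46.7),
    (8000, "response time (msec)", 67, 46),
]


def percent_reduction(before, after):
    """Relative reduction from 'before' to 'after', as a percentage of 'before'."""
    return (before - after) / before * 100.0


for users, metric, r5, d6 in R5MAIL_RESULTS:
    reduction = percent_reduction(r5, d6)
    print(f"{users:>5} users, {metric}: {reduction:.1f} percent lower with Domino 6")

# Memory added for the 8,000-user Domino 6 run: 1 GB on top of 8 GB.
memory_increase = 1 / 8 * 100.0
print(f"Memory increase: {memory_increase:.1f} percent")
```

Run as written, the sketch reports CPU reductions of 29.9 and 33.0 percent, in line with the figures quoted above, and a memory increase of 12.5 percent, which the text rounds to 13 percent.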

Plotted graphically, our test results for R5Mail CPU usage appear as follows:

R5Mail workload on iSeries - CPU usage

And this graph shows R5Mail response time:

R5Mail workload on iSeries - response time

Tests described in this section were conducted using single Domino partitions. Similar Domino 6 improvements can be expected for environments using multiple Domino partitions.

iNotes Web Access client improvements with Domino 6
Using the R5iNotes workload to simulate iNotes Web Access users, we compared performance using Domino 5.0.11 and Domino 6. The following table and graphic display our test results:

| Domino version | Number of R5iNotes users | Average CPU utilization (percent) | Average response time (msec) | Average disk utilization (percent) |
|---|---|---|---|---|
| Domino 5.0.11 | 2,000 | 41.5 | 96 | <1 |
| Domino 6 | 2,000 | 24.0 | 64 | <1 |
| Domino 5.0.11 | 3,800 | 19.4 | 119 | <1 |
| Domino 6 | 3,800 | 11.0 | 65 | <1 |
| Domino 5.0.11 | 20,000 | 96.2 | >5000 | <1 |
| Domino 6 | 20,000 | 51.5 | 72 | <1 |

As the table shows, in each test we conducted, Domino 6 provided at least a 40 percent CPU usage improvement over R5.0.11 with significant response time reduction. Our tests used systems with abundant main storage and disk resources so that CPU was the only constraining factor. As a result, disk utilization in all tests was less than one percent.

We conducted three tests comparing Domino 5.0.11 and Domino 6 by running three different user workloads (2,000, 3,800, and 20,000 simulated users). The 2,000-user test used a model i825-2473 with six 1.1 GHz POWER4 processors, 45 GB of memory, and 60 x 18 GB disk drives configured with RAID5 in a single Domino partition. The 3,800-user test employed a single Domino partition on a model i890-0198 with thirty-two 1.3 GHz POWER4 processors. This system had 64 GB of memory and eighty-nine 18 GB disk drives configured with RAID5 protection. The 20,000-user test used 10 Domino partitions on an i890-0198 32-way system with 1.3 GHz POWER4 processors. This system was equipped with 192 GB of memory and 360 x 18 GB disk drives running with RAID5 protection.

Plotted on a graph, our R5iNotes CPU usage numbers appear as follows:

R5iNotes workload on iSeries - CPU usage

This chart displays R5iNotes response time:

R5iNotes workload on iSeries - response time

In addition to the test results shown in the previous table and graphs, we performed other tests measuring performance characteristics of Domino 6. In one set of tests called “paging curves,” we established a baseline using the R5iNotes workload. Then over the course of several hours, we gradually reduced the main storage available to the Domino server(s) and observed the effect on paging rates, faulting rates, and response times. These tests allowed us to build a performance curve showing the amount of memory available per user versus the paging rate and response time. Based on a paging curve study of the R5iNotes workload on Domino 6, we determined that (similar to the R5Mail workload) some additional memory was required to achieve the same rates as Domino 5.0.11.
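To make the "memory available per user" axis of a paging curve concrete, the following Python sketch computes that ratio for the three R5iNotes configurations described earlier in this section. It is a rough illustration under stated assumptions: it counts all installed memory (not just the portion available to Domino), treats 1 GB as 1024 MB, and uses hypothetical names (`R5INOTES_SYSTEMS`, `memory_per_user_mb`). These are the unconstrained test configurations, not points measured during the paging curve study.

```python
# Installed memory and simulated users for the three R5iNotes test systems
# described in this article. Assumptions: 1 GB = 1024 MB, and all installed
# memory is counted even though not all of it is available to Domino.
R5INOTES_SYSTEMS = [
    ("i825-2473, 1 partition", 2_000, 45),
    ("i890-0198, 1 partition", 3_800, 64),
    ("i890-0198, 10 partitions", 20_000, 192),
]


def memory_per_user_mb(memory_gb, users):
    """Installed memory divided evenly across simulated users, in MB."""
    return memory_gb * 1024 / users


for system, users, memory_gb in R5INOTES_SYSTEMS:
    per_user = memory_per_user_mb(memory_gb, users)
    print(f"{system}: {per_user:.1f} MB per user")
```

A paging curve would plot many such memory-per-user values against the paging rate and response time observed at each one. The sketch covers only the memory axis, because the measured paging data is not reproduced in this article.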

As a further demonstration of the performance benefits of Domino 6 for iNotes Web Access, an official NotesBench audit of the R5iNotes workload with 40,200 users was performed. This audit was done on a 32-way iSeries model i890 system with the same configuration as the system used for the 3,800-user tests described in this section. Prior to switching to Domino 6, the highest number of iNotes Web Access users we had been able to simulate on that system was approximately 20,000 with Domino 5.0.11.

QuickPlace 3 improvements
To compare QuickPlace 3 and 2.0.8, we created a workload on the QuickPlace server using the LoadRunner tool (available from Mercury Interactive). For our tests, we used LoadRunner scripts simulating users performing the following actions against the QuickPlace server approximately every 15 minutes:
The following table displays our results:

| QuickPlace version | Number of users | Average CPU usage (percent) | Average response time (seconds) | QuickPlace transactions |
|---|---|---|---|---|
| QuickPlace 2.0.8 | 500 | 69.1 | 8.9 | 56,342 |
| QuickPlace 3 | 500 | 36.3 | 3.8 | 65,920 |

Graphically, these numbers appear as shown in the following two illustrations. First, let's look at CPU usage:

QuickPlace on iSeries - CPU usage

And here's response time:

QuickPlace on iSeries - response time

Our test indicates QuickPlace 3 consumes 45 percent less CPU time than QuickPlace 2.0.8. This is especially impressive considering that the QuickPlace 3 server processed 17 percent more transactions than the QuickPlace 2.0.8 server. Even though we built key and think time delays into the LoadRunner scripts, QuickPlace 3's dramatic response time improvements allowed the scripts to execute more quickly and to generate more transactions per user per time interval.
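Because the QuickPlace 3 run did more work while using less CPU, it can help to normalize CPU usage by the number of transactions processed. The Python sketch below does this with the table values. It assumes both runs covered the same measurement interval, which the article implies but does not state, and the "CPU percent per 1,000 transactions" figure is an illustrative ratio rather than a measured CPU time. All names are hypothetical.

```python
# Values copied from the QuickPlace results table in this article.
QP_208 = {"cpu_percent": 69.1, "transactions": 56_342}
QP_3 = {"cpu_percent": 36.3, "transactions": 65_920}


def cpu_per_1000_transactions(result):
    """Average CPU percent divided by transactions, scaled to 1,000 transactions.

    Illustrative ratio only; assumes equal test durations for both runs.
    """
    return result["cpu_percent"] / result["transactions"] * 1000


# Raw CPU reduction and transaction gain, as quoted in the text.
cpu_reduction = (QP_208["cpu_percent"] - QP_3["cpu_percent"]) / QP_208["cpu_percent"] * 100
transaction_gain = (QP_3["transactions"] / QP_208["transactions"] - 1) * 100

# CPU cost per unit of work for QuickPlace 3, relative to QuickPlace 2.0.8.
cost_ratio = cpu_per_1000_transactions(QP_3) / cpu_per_1000_transactions(QP_208)

print(f"CPU reduction: {cpu_reduction:.1f} percent")
print(f"Transaction gain: {transaction_gain:.1f} percent")
print(f"CPU per transaction, QuickPlace 3 vs 2.0.8: {cost_ratio:.2f}x")
```

Under these assumptions, the raw CPU reduction works out to 47.5 percent and the transaction gain to 17.0 percent. Per transaction, QuickPlace 3 uses about 0.45 times the CPU of QuickPlace 2.0.8, so the per-transaction saving is larger than the raw CPU comparison suggests.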

We ran the QuickPlace tests on a model i825-2473 with six 1.1 GHz POWER4 processors equipped with 45 GB of memory and seventy-five 18 GB disk drives configured with RAID5 using a single QuickPlace server. We used a relatively slow 16 Mbps network environment. Additional testing with QuickPlace 3 on a 100 Mbps network demonstrated sub-second response times for our test workload.

For more information on QuickPlace 3 performance, see the previous Performance Perspectives column, "Comparing QuickPlace 3.0 with QuickPlace 2.0.8."

Sametime 3 improvements
We compared several types of Sametime usage to determine performance improvements provided by Sametime 3. We used internal tools to create workloads on the Sametime server being tested. These tools simulated the following Sametime workloads:

Sametime chat
Each user performs the following activities over a 10-hour period:
Sametime meeting
Each user participates in a collaborative meeting. The meeting has a moderator that shares a Freelance Graphics presentation. The moderator changes the current slide every 10 seconds. Each update results in approximately 25 KB of data sent to each meeting participant.

Sametime audio/video
Each user participates in a collaborative meeting with audio/video. The meeting has a single moderator with a camera and microphone transmitting both an audio and video stream to each meeting participant. (This workload did not include sharing a presentation as in the Sametime meeting workload.)

This table shows our results:

| Sametime version | Sametime workload | Number of Sametime users | Average CPU utilization (percent) |
|---|---|---|---|
| Sametime 2.5 | Chat | 60,000 | 30.7 |
| Sametime 3 | Chat | 60,000 | 16.8 |
| Sametime 2.5 | Meeting | 500 | 14.7 |
| Sametime 3 | Meeting | 500 | 17.8 |
| Sametime 2.5 | Audio/Video | 85 | 16.3 |
| Sametime 3 | Audio/Video | 85 | 19.5 |

First the good news: For the chat workload, Sametime 3 reduced CPU usage more than 45 percent compared to Sametime 2.5. As for the less exciting news, our simulations produced slightly increased CPU usage with Sametime 3 meeting and audio/video workloads. (Although note the dramatic improvement in Sametime 3 CPU usage on Domino 6 compared with R5.0.11, described in the next section.) The following chart shows these results graphically:

Sametime on iSeries
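When reading the Sametime table, it helps to separate the change in percentage points of CPU from the relative change. The following Python sketch computes both from the table values; the names `SAMETIME_CPU` and `cpu_change` are hypothetical and used only for illustration.

```python
# Average CPU utilization (percent) copied from the Sametime results table:
# workload -> (Sametime 2.5, Sametime 3)
SAMETIME_CPU = {
    "Chat": (30.7, 16.8),
    "Meeting": (14.7, 17.8),
    "Audio/Video": (16.3, 19.5),
}


def cpu_change(before, after):
    """Return (change in percentage points, relative change in percent)."""
    points = after - before
    relative = points / before * 100.0
    return points, relative


for workload, (st25, st3) in SAMETIME_CPU.items():
    points, relative = cpu_change(st25, st3)
    print(f"{workload}: {points:+.1f} points, {relative:+.1f} percent relative")
```

For the chat workload this gives a drop of 13.9 points (45.3 percent relative). The meeting and audio/video workloads rise by 3.1 and 3.2 points, a small absolute increase that is roughly 20 percent in relative terms.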

We performed all Sametime testing on an iSeries model i820-2438 server, a four-way system with 600 MHz processors. This server included 12 GB of memory and 45 x 9 GB disk drives configured with RAID5 protection.

Additional tests using our Sametime workloads indicated significant improvements for scalability and login rates. With the multiple partition support in Sametime 3, it is now possible to scale to much larger numbers of users than with Sametime 2.5. In addition, you can use multiple Sametime servers to provide response time improvements. For example, while testing login throughput with a single Sametime 2.5 server, we obtained a maximum rate of approximately 15 logins per second. With Sametime 3, this rate improved to approximately 21 logins per second. And with Sametime 3, you can add more Sametime servers and further boost the number of logins per second. For example, we used two Sametime servers to achieve 30 logins per second at 89 percent CPU utilization with lower login response times than with a single Sametime server performing 20 logins per second at 62 percent CPU utilization.
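The login figures above can be compared on a common footing by dividing CPU utilization by login rate. The Python sketch below is a rough illustration: it assumes both CPU utilization figures are system-wide values for the same machine, which the article does not state explicitly, and the function names are hypothetical.

```python
def percent_gain(before, after):
    """Relative increase from 'before' to 'after', as a percentage of 'before'."""
    return (after - before) / before * 100.0


def cpu_per_login(cpu_percent, logins_per_second):
    """CPU utilization (percent) consumed per login per second of throughput."""
    return cpu_percent / logins_per_second


# Maximum single-server login rates quoted in the text (logins per second).
max_rate_gain = percent_gain(15, 21)

# Assumption: both CPU figures are system-wide utilization on the same system.
one_server = cpu_per_login(62, 20)   # one Sametime 3 server, 20 logins/s
two_servers = cpu_per_login(89, 30)  # two Sametime 3 servers, 30 logins/s

print(f"Single-server maximum login rate gain: {max_rate_gain:.0f} percent")
print(f"One server:  {one_server:.2f} CPU percent per login/s")
print(f"Two servers: {two_servers:.2f} CPU percent per login/s")
```

Under that assumption, the CPU cost per login is about the same in both configurations (3.10 versus 2.97). This suggests the second server raises throughput and lowers login response time without a per-login CPU penalty.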

For additional information on performance improvements with Sametime 3, consult the Performance Perspectives column, "Sametime 3 vs 1.5.5 performance comparison."

Sametime 3 on Domino 6
Preliminary tests we've done with Sametime 3 on Domino 6 show significant performance improvements for creating and joining meetings compared to R5.0.11. The following table illustrates the improvement in the CPU utilization for the HTTP job when running a Sametime meeting workload on Domino 6 versus R5.0.11:

| Domino version | Sametime 3 workload | Average CPU utilization (percent) |
|---|---|---|
| Domino 5.0.11 | Create meetings | 61 |
| Domino 6 | Create meetings | 42 |
| Domino 5.0.11 | Join meetings | 51 |
| Domino 6 | Join meetings | 31 |

The CPU reduction of 30 percent or more for the HTTP job is produced by improvements in the Domino 6 formula (compute) engine. These improvements make computation up to twice as fast as in previous Domino releases, which benefits many areas including view refreshes, agents, and form rendering. (All are heavily used components in this workload.) Plotted on a chart, our Sametime 3 results appear as follows:

Sametime 3 on iSeries

Our Sametime meeting workload consisted of 180 meetings and 900 users. All 180 meetings were created prior to the meetings starting. Then during each of two phases, 90 meetings were started with five users in each meeting. We collected the HTTP CPU data during these activities. This data was produced on an AS/400 model 170-2388 with two 255 MHz processors, 3.5 GB of memory, and ten 8 GB disk drives configured with RAID5 protection.
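The workload totals and the HTTP CPU reductions in this section can be recomputed from the numbers given. The Python sketch below is illustrative only; the constant and function names are hypothetical, and the values come from the paragraph and table above.

```python
# Workload shape described in the text.
PHASES = 2
MEETINGS_PER_PHASE = 90
USERS_PER_MEETING = 5

total_meetings = PHASES * MEETINGS_PER_PHASE
total_users = total_meetings * USERS_PER_MEETING

# HTTP job CPU utilization (percent) from the table:
# workload -> (Domino 5.0.11, Domino 6)
HTTP_CPU = {
    "Create meetings": (61, 42),
    "Join meetings": (51, 31),
}


def percent_reduction(before, after):
    """Relative reduction from 'before' to 'after', as a percentage of 'before'."""
    return (before - after) / before * 100.0


print(f"Meetings: {total_meetings}, users: {total_users}")
for workload, (r5, d6) in HTTP_CPU.items():
    print(f"{workload}: {percent_reduction(r5, d6):.1f} percent less HTTP CPU on Domino 6")
```

The totals match the 180 meetings and 900 users stated above. The HTTP CPU reductions come to 31.1 percent for creating meetings and 39.2 percent for joining them.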

Performance in your environment: the Workload Estimator
Industry standard benchmarks such as NotesBench are helpful in evaluating performance effects of software, hardware, parameter, and configuration changes because they provide a repeatable workload environment. They can also help you make cross-platform comparisons because everyone “plays by the same rules” when executing the benchmarks. However, it's important to note that these benchmarks can only simulate user environments, not exactly match them. Therefore, you shouldn't expect any benchmark test to precisely project the number of users and/or performance results in a real life environment. (Perhaps another way of phrasing this is "your results may vary.") For example, some customers may find the mail database sizes used with the NotesBench R5Mail and R5iNotes workloads are considerably smaller than those within their own environments. In addition, our lab tests probably don't run all the same server tasks as your site does, thus our benchmarks may not be executing the same kinds of transactions as your users.

For these reasons, we recommend that you use the IBM eServer Workload Estimator (formerly called the iSeries Workload Estimator) when estimating Domino, Sametime, and QuickPlace workloads on iSeries. This is an interactive on-line application that can help you estimate various performance numbers you can expect with your site's hardware and software configurations. The Workload Estimator uses data collected through extensive testing and observation. In addition to the performance comparisons described in this article, the Workload Estimator factors in performance behaviors observed on customer systems. This lets you size for many options, including concurrency, clustering, transaction logging, virus protection, and OS/400 Logical Partitioning. The Workload Estimator provides sizing capability for all of the environments described in this article. This can help you appropriately judge how the performance enhancements we've discussed might affect your own site.

Conclusion
It's a fast-paced world and getting faster every day. To help you keep up, we've incorporated outstanding performance improvements in Domino 6, Sametime 3, and QuickPlace 3. This article has provided a brief overview of these improvements with facts and figures you can apply to your own environment to help keep your ever-demanding users happy. For example, our tests show that on the IBM eServer iSeries platform Domino 6 averages 40 percent less CPU usage for iNotes Web Access clients and 30 percent less for Notes clients (along with better response time for both) compared to R5.0.11. Sametime 3 boasts 45 percent less CPU utilization for chat users compared to Sametime 2.5 with improved login rates. And QuickPlace 3 uses 45 percent less CPU than its predecessor.

We hope you found this article helpful. Please let us know of other performance-related topics you'd like us to cover, and we'll consider them for future Performance Perspectives columns.


ABOUT THE AUTHORS
Dave Johnson is currently a member of the iSeries System Performance area with a focus on Domino performance. Dave's team is also responsible for NotesBench audits for IBM eServer iSeries.

Don Morrison joined the iSeries Domino Development team in 1999 and has worked on various assignments including Domino.Doc, Domino Web Server, and currently Sametime performance.

Joe Peterson is on the iSeries development team in Rochester, Minnesota. His focus is on performance of Lotus products.