adminfoo.net: Windows Perfmon: The Top Ten Counters

One of the things I love about Windows is Performance Monitor, a/k/a PerfMon. It's an amazing tool that goes unused far too often - and when it does get used, it is often misinterpreted. So today I'm going to take you on the nickel tour through PerfMon and the ten counters most valuable for determining overall system health and activity.

To open PerfMon, just go to the Start Menu, choose Run and type perfmon.
Bottleneck analysis
The most common use of PerfMon is to answer the burning question: why is my system running slow?

With the five performance counters listed below, you can quickly get an overall impression of how healthy a system is - and where the problems are, if they exist. The idea here is to pick counters that will be at low or zero values when the system is healthy, and at high values when something is overloaded. A 'perfectly healthy' system would show all counters flatlined at zero. (Perfection is unattainable, so you'll probably never see all of these counters flatlined at zero in real life. The CPU will almost always have a few items in queue.)

Processor utilization
- System\Processor Queue Length - number of threads queued and waiting for time on the CPU. Divide this by the number of CPUs in the system. If the answer is less than 10, the system is most likely running well.
Memory utilization
- Memory\Pages Input/Sec - The best indicator of whether you are memory-bound, this counter shows the rate at which pages are read from disk to resolve hard page faults. In other words, the number of times per second the system was forced to retrieve something from disk that should have been in RAM. Occasional spikes are fine, but this should generally flatline at zero.
Disk Utilization
- PhysicalDisk\Current Disk Queue Length\driveletter - this is probably the single most valuable counter to watch. It shows how many read or write requests are waiting to execute to the disk. For single disks, it should idle at 2-3 or lower, with occasional spikes being okay. For RAID arrays, divide by the number of active spindles in the array; again try for 2-3 or lower. Because a shortage of RAM will tend to beat on the disk, look closely at the Memory\Pages Input/Sec counter if disk queue lengths are high.
Network Utilization
- Network Interface\Output Queue Length\nic name - the number of packets in queue waiting to be sent. If there is a sustained average of more than two packets in queue, you should be looking to resolve a network bottleneck.
- Network Interface\Packets Received Errors\nic name - packet errors that kept the TCP/IP stack from delivering packets to higher layers. This value should stay low.
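The rules of thumb above boil down to a handful of comparisons. Here is a rough Python sketch of them, not anything PerfMon itself provides. The `Sample` class, its field names, and `find_bottlenecks` are made up for illustration. The memory cutoff of 2 is an assumption (the text only says the counter should sit near zero), and Packets Received Errors is left out because no number is given for it.

```python
from __future__ import annotations

from dataclasses import dataclass

# Cutoffs taken from the rules of thumb in the text, except where noted.
MAX_CPU_QUEUE_PER_CPU = 10      # "less than 10" per CPU is healthy
MAX_PAGES_INPUT_PER_SEC = 2     # ASSUMPTION: the text only says "near zero"
MAX_DISK_QUEUE_PER_SPINDLE = 3  # "2-3 or lower" per spindle
MAX_OUTPUT_QUEUE = 2            # "more than two packets" is a problem


@dataclass
class Sample:
    """Averaged counter readings for one machine (hypothetical structure)."""
    processor_queue_length: float
    cpu_count: int
    pages_input_per_sec: float
    disk_queue_length: float
    active_spindles: int
    output_queue_length: float


def find_bottlenecks(s: Sample) -> list[str]:
    """Return the subsystems whose readings exceed the rules of thumb."""
    problems = []
    # Processor: divide the queue length by the number of CPUs.
    if s.processor_queue_length / s.cpu_count >= MAX_CPU_QUEUE_PER_CPU:
        problems.append("processor")
    # Memory: sustained hard page faults mean the box is short on RAM.
    if s.pages_input_per_sec > MAX_PAGES_INPUT_PER_SEC:
        problems.append("memory")
    # Disk: divide the queue length by the number of active spindles.
    if s.disk_queue_length / s.active_spindles > MAX_DISK_QUEUE_PER_SPINDLE:
        problems.append("disk")
    # Network: packets waiting to be sent.
    if s.output_queue_length > MAX_OUTPUT_QUEUE:
        problems.append("network")
    return problems
```

Feed it averages taken over a reasonable period rather than single readings, since occasional spikes are expected on a healthy system.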

To highlight a particular counter's line on the graph, select that counter in the lower pane. Then click the lightbulb icon on the toolbar above the graph. This will make the line for that counter turn thick and white (or black on some systems - I never found out why this changes).

Pay close attention to the scale column! Perfmon attempts to automatically pick a scale that will magnify or reduce the counter enough to produce a meaningful line on the graph ... but it doesn't always get it right. As an example, Perfmon often chooses to multiply Disk Queue Length by 100. So, you might think the disk queue length is sustained at 100 (bad!) when in fact it's really at 1 (good). If you're not sure, highlight the counter in the lower pane, and watch the Last and Average values just below the graph. In the screenshot below, I modified all of the counters to a scale value of 1.0, then changed the graph's vertical axis to go from 0-10.
To change graph properties (like scale and vertical axis as discussed above), right-click the graph and choose Properties. There are a number of things to customize here ... fiddle with it until you have a graph that looks good to you.
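The scale arithmetic is easy to get backwards, so here is a small Python sketch of it. It assumes the graph simply draws each value multiplied by its scale and clips at the top of the vertical axis. The function names are hypothetical, and this is only a model of what the graph shows, not PerfMon's own code.

```python
def plotted_value(actual: float, scale: float, axis_max: float = 100.0) -> float:
    """Height at which a counter value is drawn, clipped to the top of the axis."""
    return min(actual * scale, axis_max)


def actual_value(plotted: float, scale: float) -> float:
    """Convert a height read off the graph back into the real counter value."""
    return plotted / scale


# A disk queue length of 1 drawn with a scale of 100 sits at 100 on the graph.
print(plotted_value(1, 100))   # 100
# Reading 100 off the graph with that scale really means a queue length of 1.
print(actual_value(100, 100))  # 1.0
```

When in doubt, skip the arithmetic and read the Last and Average values under the graph, which are shown unscaled.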
To get a more detailed explanation of any counter, right-click anywhere in the perfmon graph and choose Add Counters. Select the object and counter that you are curious about, and click the Explain button.
This screenshot shows a very lightly-loaded XP system, with the Memory\Pages Input/Sec counter highlighted:

All we see here is the Processor Queue Length hovering between 1 and 4, and two short spikes of Pages Input/Sec. All other counters are flatlined at zero, which is easy to check by highlighting each of them and watching the values bar underneath the graph. This is a happy system - no problems here!
But if we saw any of the above counters averaging more than 2-4 for long periods of time (except Processor Queue Length: don't worry unless it stays above 10 for long stretches), we'd be able to conclude that there was a problem with that subsystem. We could then drill down using more detailed counters to see exactly what was causing that subsystem to be overloaded. More detailed analysis is beyond the scope of this article, but if there's enough interest I could do a second article on that. Leave a comment if you're interested!
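The difference between "occasional spikes" and "averaging high for long periods" can be made concrete with a rolling average. This is a minimal Python sketch that assumes you already have a list of evenly spaced samples for one counter. The function name and the window size are illustrative choices, not anything built into PerfMon.

```python
def sustained_over(samples, threshold, window):
    """Return True if any `window` consecutive samples average above `threshold`."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(samples) < window:
        return False
    # Keep a running sum so each step only adds one sample and drops one.
    total = sum(samples[:window])
    if total / window > threshold:
        return True
    for i in range(window, len(samples)):
        total += samples[i] - samples[i - window]
        if total / window > threshold:
            return True
    return False


# Two short spikes: no four-sample stretch averages above 2.
print(sustained_over([0, 0, 6, 0, 0, 0, 5, 0], threshold=2, window=4))  # False
# A queue that stays busy: the average is above 2 right away.
print(sustained_over([3, 4, 3, 5, 4, 3], threshold=2, window=4))        # True
```

Pick a window that matches what you'd call "a long period" at your sample interval. With one sample per second, a window of 60 means a full minute above the threshold.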
General activity counters
Well, the system is healthy - and that's good ... but how hard is it working? Is the processor workin' hard, or hardly workin'? How much RAM is in use, how many bytes are being written to or read from the disk or network? The following counters are a good overview of general activity of the system.

Processor utilization
- Processor\% Processor Time\_Total - just a handy idea of how 'loaded' the CPU is at any given time. Don't confuse 100% processor utilization with a slow system though - processor queue length, mentioned above, is much better at determining this.
Memory utilization
- Process\Working Set\_Total (or per specific process) - this basically shows how much memory is in the working set, or currently allocated RAM.
- Memory\Available MBytes - amount of free RAM available to be used by new processes.
Disk Utilization
- PhysicalDisk\Disk Bytes/sec\_Total (or per disk) - shows the number of bytes per second being written to or read from the disk.
Network Utilization
- Network Interface\Bytes Total/Sec\nic name - measures the number of bytes sent or received per second.
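If you'd rather capture these numbers than watch a graph, Windows also ships a command-line tool, typeperf, that samples counters and prints them as CSV. Below is a hedged Python sketch that builds a typeperf command and parses its output. Note that typeperf writes paths as \Object(Instance)\Counter rather than the Object\Counter\Instance order used in this article. The helper names are hypothetical, and `sample_counters` only works on a Windows machine.

```python
import csv
import io
import subprocess

# The five general activity counters, written in typeperf's path syntax.
ACTIVITY_COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\Process(_Total)\Working Set",
    r"\Memory\Available MBytes",
    r"\PhysicalDisk(_Total)\Disk Bytes/sec",
    r"\Network Interface(*)\Bytes Total/sec",
]


def typeperf_command(counters, interval_s=1, count=10):
    """Build the argument list: -si is the sample interval, -sc the sample count."""
    return ["typeperf", *counters, "-si", str(interval_s), "-sc", str(count)]


def parse_typeperf_csv(text):
    """Parse typeperf's CSV output into {counter_path: [values, ...]}."""
    # Keep only rows with a timestamp column plus at least one value column.
    rows = [r for r in csv.reader(io.StringIO(text)) if len(r) > 1]
    header, data = rows[0], rows[1:]
    result = {name: [] for name in header[1:]}
    for row in data:
        for name, cell in zip(header[1:], row[1:]):
            try:
                result[name].append(float(cell))
            except ValueError:
                pass  # skip blank samples and typeperf's status lines
    return result


def sample_counters(counters=ACTIVITY_COUNTERS, interval_s=1, count=10):
    """Run typeperf (Windows only) and return the parsed samples."""
    done = subprocess.run(
        typeperf_command(counters, interval_s, count),
        capture_output=True, text=True, check=True,
    )
    return parse_typeperf_csv(done.stdout)
```

Calling `sample_counters()` on a Windows box should return one list of readings per counter, keyed by the full path typeperf reports (including the machine name).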

In the graph below, I added these five counters to my existing 'bottlenecks' graph, and changed the vertical axis to go from 0-100. I highlighted the Working Set\_Total counter, which is currently at about 123 megabytes for the system. Notice how it shows a thick line at the top of the graph - you could assume that it was pegged at 100, if you didn't read the values bar (123,052,03 divided by a million is approximately 123 megabytes).

And ... that's all for now. Hopefully this quick show-and-tell has given you enough information to use PerfMon more effectively in the future!

32 comments:

: said...; First of all, it's great to have you back. I really enjoy coming to the site. If you would be up to writing a second more detailed article on this topic, I for one would appreciate the effort. Thanks.; Saturday, April 7, 2007 at 6:34:00 AM PDT
: said...; Thanks for your time spent on writing this good piece!
Cheers from Kristian; Thursday, May 3, 2007 at 5:00:00 AM PDT
: Anonymous said...; Umm, are you okay? I keep on coming back thinking my rss reader is bad, but nope, still no updates :-(; Tuesday, June 19, 2007 at 6:44:00 PM PDT
: Anonymous said...; quxxo what happen???; Monday, July 9, 2007 at 12:55:00 AM PDT
: Anonymous said...; Good stuff. One thing this is really useful for is determining if the server can be a virtual machine. I know disk I/O will kill a VM host, but I'm not too sure on what exact numbers would constitute a good candidate for a VM.; Wednesday, February 20, 2008 at 7:55:00 PM PST
: Jimmy said...; Hi,

I'd like to also see a followup to this article. It was really good stuff. Thank you; Friday, June 20, 2008 at 5:19:00 AM PDT
: Anonymous said...; Great article. Just like Jimmy, I'd love to see a follow up article.; Tuesday, June 24, 2008 at 11:44:00 AM PDT
: Ross said...; Very helpful article, thanks a lot.; Tuesday, August 19, 2008 at 7:31:00 AM PDT
: Anonymous said...; This is the best article for perfmon I've seen yet! I would be very interested in seeing the second article. When can you have it posted??? ;-); Tuesday, August 26, 2008 at 8:34:00 AM PDT
: Anonymous said...; Great article, I'd love to see a second!; Wednesday, September 10, 2008 at 8:59:00 AM PDT
: Anonymous said...; This is very interesting. Would love to see more details into this..; Sunday, December 14, 2008 at 7:35:00 PM PST
: atroon said...; Got here via your post in Ars. Thanks much for the article; I maintain a Windows network using a finicky custom program, and this is absolute gold.; Monday, January 12, 2009 at 6:39:00 AM PST
: Anonymous said...; This is a great article. Thanks! I too would be interested in more detail.; Tuesday, February 17, 2009 at 11:41:00 AM PST
: Anonymous said...; Introducing a new public forum for Perfmon.
http://social.technet.microsoft.com/Forums/en-US/perfmon/threads/

thanks.; Friday, March 20, 2009 at 2:33:00 PM PDT
: Anonymous said...; Any idea why I see max values for something like (java)\% Processor Time at 159.688? Average is 35.668 with a stddev of 29.234 but anyone looking at the max will be like "whaa??"

Regards.; Thursday, May 21, 2009 at 4:16:00 AM PDT
: Anonymous said...; this is wicked man, thank you very much for your information.; Friday, June 5, 2009 at 11:07:00 PM PDT
: said...; thank you very informative...; Thursday, December 3, 2009 at 2:47:00 PM PST
: said...; Take a look at SmartMon (www.perfmonanalysis.com) when its time to analyze the Perfmon data that has been collected.; Thursday, February 25, 2010 at 7:17:00 PM PST
: Spring said...; I had no clue this existed thanks!; Sunday, March 28, 2010 at 1:47:00 PM PDT
: Anonymous said...; This is the best article I've read in a week of searching. Thanks!!!!; Wednesday, March 31, 2010 at 11:31:00 AM PDT
: Mayuri Mehta said...; Thanks for such a great article. It is very helpful....; Monday, April 5, 2010 at 9:43:00 PM PDT
: Anonymous said...; Thanks for the detailed info. It really helps.

Regards,
Sohail Chaudhry; Tuesday, June 8, 2010 at 12:17:00 PM PDT
: Anonymous said...; It really helps..Thank you

Would like to see more articles like this...; Wednesday, October 13, 2010 at 1:59:00 PM PDT
: Anonymous said...; A timeless classic.

Thank you.; Monday, February 7, 2011 at 9:55:00 AM PST
: Anonymous said...; Great advice - Thanks; Friday, February 25, 2011 at 11:06:00 AM PST
: said...; Still valid and still useful, thanks for taking the time to write this.

James; Friday, March 18, 2011 at 3:47:00 AM PDT
: said...; I use it every time I generate performance reports. Thank you very much for such a nice article.!!; Tuesday, May 10, 2011 at 2:29:00 AM PDT
: Anonymous said...; Great Article! Very Useful....Thank you!; Thursday, May 26, 2011 at 8:59:00 AM PDT
: Anonymous said...; Brilliant, To the point and simple to understand, just how I need it. Thanks you this was very helpful to me .; Friday, July 22, 2011 at 12:34:00 AM PDT
: said...; Excellent article.

I wonder is it useful to run more than one instances of perfmon to split up counters which are a percentage and those that are a discrete number?

This might help to make it easier to visualize the two types of value and not have large discrete values (say number MBytes of HDD available) obscure percentage values due to perfmon automatically picking the scale.; Sunday, November 20, 2011 at 8:52:00 PM PST
: said...; Wow, the Current Disk Queue Length really is the most valuable counter.

For anyone who really wants their socks blown off, download the DiskLED utility and plug in Current Disk Queue Length, and get a REAL load meter for your I/O-bound system!!! Wow!; Tuesday, November 29, 2011 at 6:44:00 PM PST
: hermes victoria said...; Great blog.I think they will get their salary depends upon their strength,experience.; Monday, July 1, 2013 at 1:57:00 AM PDT

2007/04/04
