Monitoring Windows Home Server With Munin

by Alex Kuretz on March 9, 2010 · 13 comments

in News

If you are like me, visualizing information can be much more intuitive than simply reading numbers and metrics. Windows Home Server gives us pie charts to show how our server storage is allocated, HP gives us the System Status tab to show CPU/memory/network load in real-time, and Disk Management gives us wireframes of our servers and several different types of graphs. These various tools provide us with several different views into the function of our Windows Home Server, but effective monitoring requires a different type of tool.

I’ll preface this article by saying that it may go beyond the interest or technical knowledge of many of you, or you may simply not have a Linux system on which to run the software. However, even if you don’t plan to deploy Munin, I think you’ll find the graphs and information that I share to be interesting, so read on.

In the Unix sysadmin world there are many tools available for system monitoring and alerting, one of which is called Munin. The Munin website has a great description of what exactly Munin does, so I’ll let them say it best.

Munin is a networked resource monitoring tool that can help analyze resource trends and “what just happened to kill our performance?” problems. It is designed to be very plug and play. A default installation provides a lot of graphs with almost no work.

Munin uses the popular RRDTool for storing data, is written in Perl, and can use data gathered by plugins written in any language. It operates in a master/node model where the master queries the nodes regularly (usually every 5 minutes) for data, stores it in the RRD database, and updates the graphs. The graphs display data on a day, week, month, and year basis. Unfortunately, the master does not run on Windows, which may keep many of you from implementing this. However, there is a Munin agent for Windows that can gather several useful metrics.
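To give a feel for how little a plugin needs to do, here is a minimal sketch in Python. It relies on the Munin plugin convention that a plugin prints its graph description when called with the argument "config" and prints its current values when called with no argument. The field name temp, the graph titles, and the read_example_value helper are placeholders I made up for illustration, not part of any real plugin.

```python
#!/usr/bin/env python3
"""Minimal sketch of a Munin-style plugin (illustrative, not a shipped plugin)."""
import sys


def read_example_value():
    # Hypothetical data source: a real plugin would read a sensor,
    # a log file, a performance counter, and so on.
    return 38


def plugin_output(args, read_value):
    """Return the lines a plugin prints for the given command-line args."""
    if args and args[0] == "config":
        # Describes the graph and its single data series.
        return [
            "graph_title Example temperature",
            "graph_vlabel degrees C",
            "graph_category sensors",
            "temp.label Example sensor",
        ]
    # With no argument, report the current value of each field.
    return ["temp.value {}".format(read_value())]


def main():
    print("\n".join(plugin_output(sys.argv[1:], read_example_value)))


if __name__ == "__main__":
    main()
```

The node runs each plugin when the master asks for it, so the plugin itself keeps no history; storing and graphing the values is left to the master and RRDTool.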

I use Munin to monitor the server that runs this website. It tracks many metrics such as bandwidth, disk SMART status, CPU load, web server request rates, memory usage, and email send rate; the list goes on and on. These statistics are useful for showing me when there is some extra load due to a popular article, alerting me when something bad happens, and just generally telling me the overall health of the server. For example, here is the bandwidth you used visiting this site today. This allows me to plan ahead to keep things running optimally for you all.

This information can also be useful in the context of monitoring our Windows Home Server, especially when we want to visualize behaviors. For example, many of you are aware that Drive Extender will spin up your hard drives every hour and perform a scan to make sure that all data is safely duplicated across multiple hard drives and perform various checks of the system (Windows Home Server calls this “Balancing Storage”, you’ll see the status at the bottom of your Server Console). Here’s what this process looks like to the CPU in my EX495 over the past day.

Windows Home Server CPU Load

As you can see, the load is fairly light, only consuming about half of one of the cores in my dual-core EX495. This becomes more obvious when we look at the Processor Time graph that shows the load for each individual core.

Windows Home Server EX495 Processor Time

Now here is a view of the past week, with the high periods showing where Twonky caused high CPU utilization when getting stuck indexing my media after installing the 3.0 Patch 2.

Windows Home Server High CPU Load

We’ve seen the CPU behavior, but how are the disk drives behaving during Drive Extender’s hourly scans? Below is a daily graph showing the utilization of each of the 5 logical drives (4 physical drives with 2 partitions on the system drive), plus a graph from Disk Management that shows the location of the drives in my server. C: and D: are Disk 0, lIW is Disk 1, lJl is Disk 2, and lJ6 is Disk 3.

These graphs show that the C: partition that houses the Windows Home Server operating system is under very little load, as are the very full (98% full) disk 1 (red) and disk 2 (light blue). The most heavily utilized drives during Storage Balancing are the D: partition of disk 0 (dark blue) and disk 3, both of which have free space for storing and duplicating files and are the locations where Windows Home Server is moving the most data.

Here’s a graph that I think many of you will be interested to see. It shows the operating temperature (as reported by SMART monitoring) of the disk drives in my Home Server. Note the upticks in temperature on the hourly mark when Drive Extender runs.

It should be no surprise that the stock Seagate 7200 RPM is the hottest, while the two Western Digital GP drives are significantly lower in temperature but very similar to each other. The coolest running drive is the Samsung EcoGreen, and I have to thank Nigel “Cougar” Wilks for the tip on how cool these drives run. I’ve been pleased with it for the past couple of months that it has been running in my server.

As a point of interest, here is the week view of the hard drive temps. Note the flat spot towards the middle. This is where we were on vacation for a long weekend and had the heat turned down the entire time. It shows the effect that our climate control has on the electronics in our homes.

I hope this gives you an interesting insight into what is possible with server monitoring tools. Munin is only one of many tools available, and some may be more suited to monitoring Windows systems. However, Munin is a tool that I am familiar with and already have running to monitor several other systems, so this was a quick setup that has provided a lot of useful information to me. If you are interested in setting up Munin, read the links at the beginning of the article, and feel free to ask any questions in the comments or the forums.
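If you do set up a node and want to check what it reports before the master graphs anything, you can talk to it directly. The following is a rough sketch in Python, assuming the node speaks the usual Munin node text protocol on TCP port 4949: it sends a one-line banner on connect, answers "list" with a single line of plugin names, and ends "fetch" and "config" replies with a line containing only a dot. The function names ask_node and parse_fetch are my own, and I have not verified this against every Windows agent version.

```python
import socket


def parse_fetch(response):
    """Turn the text of a 'fetch <plugin>' reply into {field: value}."""
    values = {}
    for line in response.splitlines():
        line = line.strip()
        if line == ".":  # a lone dot marks the end of the reply
            break
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition(" ")
        if name.endswith(".value"):
            field = name[: -len(".value")]
            # Munin reports "U" when a value is unknown.
            values[field] = None if value.strip() == "U" else float(value)
    return values


def ask_node(host, command, port=4949, timeout=10.0):
    """Send one command to a Munin node and return the reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        reader = sock.makefile("r", encoding="utf-8", newline="\n")
        reader.readline()  # skip the banner line (assumed to be sent on connect)
        sock.sendall((command + "\n").encode("utf-8"))
        if command == "list":
            reply = reader.readline().strip()  # single-line reply
        else:
            lines = []
            for line in reader:
                if line.strip() == ".":
                    break
                lines.append(line.rstrip("\r\n"))
            reply = "\n".join(lines)
        sock.sendall(b"quit\n")
        return reply
```

For example, ask_node("myserver", "list") should return the plugin names the node offers, and parse_fetch(ask_node("myserver", "fetch cpu")) should return that plugin's current values; "myserver" and "cpu" are placeholders for your own host and plugin names. If a plugin such as disk temperature is missing from the list, the master has nothing to graph for it.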


Article by

I'm Alex Kuretz, and I'm the founder of MediaSmartServer.net. I was the Lead Test and Integration Engineer at HP for the MediaSmart Server until April 2008 when I moved on to other opportunities outside HP. I've kept active in the Windows Home Server community, creating several add-ins and helping users make the most of their Home Servers.


{ 13 comments }

Comp1962 March 9, 2010 at 4:30 pm

Alex,

Very interesting utility. I am always monitoring my servers for one thing or another. Those who have access to my servers will often let me know if something is not running properly, and from there I try to figure out what might be the cause of the problem. This program certainly could help me identify areas of potential problems, which in turn can help you take corrective action before something bad occurs. In my field of work we call it Trending, and based on the information collected we can determine a course of action which we refer to as Planned Predictive Preventative Maintenance.

To simplify what I just said, it's like your car. Take tracking certain pieces of information like your gas mileage, which not a lot of people do but they really should. If you know what your car's average fuel economy is and then one day you notice your fuel economy is becoming worse, it's actually an indicator that something is wrong or a failure is about to occur. It could be just a change in the air pressure of your tires, or it could be something worse.

Data always tells you something, and the more information you have, the better equipped you are to address and handle a potential problem. I will have to get me a copy of Munin and check it out.

Thanks for writing about it……

RonV March 9, 2010 at 6:55 pm

I have been running Munin for a couple of years with my Linux PBX and also collecting data from the desktops with the windows client. I added the client to my WHS about a week after I built it. Not only can you use the embedded agents but also add performance monitor counters. Since my server has a dual core processor I added the CPU monitor, total processes, memory etc.

One thing that I don’t like so far is that since my hard drives are using AHCI drivers they look like SCSI to WHS, so no HD temps or SMART data.

Dandandin May 14, 2010 at 5:20 am

What did you use: munin-node-win32, snmpagent, or Microsoft SNMP?

I tried munin-node-win32, but the results are not as pretty and informative as yours.

Alex Kuretz May 14, 2010 at 10:55 am

Dandandin, I used munin-node-win32, the specific version on my server is munin-node-win32-v1.5.1942-bin. What are you seeing?

Dandandin May 15, 2010 at 8:38 am

I used that version too, but I did not manage to get it to show the HDD temperatures or the HDD activity.

Alex Kuretz May 15, 2010 at 8:45 am

I’m guessing you don’t have a MediaSmart Server? It’s up to the drive controller to show the SMART data for the drives.

Dandandin May 15, 2010 at 1:10 pm

Yes, I have a Lenovo Atom-based server.
Not only is the SMART data not shown, but neither are the disk time and processor time graphs that you show.

Augustus November 16, 2010 at 6:39 am

Does the Munin client detect the temperature of HDDs connected to a RAID controller?

Alex Kuretz November 16, 2010 at 8:11 am

That will depend on whether the RAID controller exposes that information via SMART.

Augustus November 18, 2010 at 11:55 pm

Our Windows servers (on RAID) are not showing the HDD temperature; all other graphs are pretty good and accurate. Can anything be done to show the HDD temperature on these servers?
Kindly advise.

Alex Kuretz November 19, 2010 at 7:50 am

If the controller doesn’t provide the info, there’s no way to get it.

Augustus December 27, 2010 at 12:33 am

Which version of the Munin client for Windows is good? We are using 1.5.0.1947.

Alex Kuretz December 27, 2010 at 2:09 pm

I’ve not kept up with what is different in the versions, I just checked and I’m running 1.5.1942.

