ARR Health Checks–Week 34

You can find this week’s video here.

Application Request Routing (ARR) is used as a load balancer (reverse proxy) for highly available websites. This week I cover health checks in ARR and lay out a few principles that will help you be more effective in your web farm environment.

Properly planning health checks is important with any load balancer, and ARR is no exception.

Health checks monitor the state of your servers so that a failed server is automatically taken out of rotation and then added back once it has recovered. At first glance it may seem that planning your health checks needs minimal thought, but that’s not the case. Through my own mistakes in the early days of working with web farms, I’ve come to realize that you shouldn’t test your database server, web service calls or other external dependencies with the health checks. Instead, you should test only the web server, app pool and site. This video explains why, and more.
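
To make the distinction concrete, here is a minimal sketch of a shallow health endpoint. It is written in Python purely for illustration; on an IIS web farm the equivalent would typically be a small page served by the site itself. The path `/health-test/`, the function `build_health_response`, the handler class and the port are all assumptions made for this example. What matters is what the handler leaves out: no database query and no web service call.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

HEALTH_PATH = "/health-test/"  # assumed path; use whatever your farm probes


def build_health_response(path):
    """Return (status, body) for a request path.

    Deliberately does NOT touch the database, web services, or any other
    external dependency. Reaching this code at all already shows that the
    web server, app pool, and site are answering requests.
    """
    if path == HEALTH_PATH:
        return 200, b"OK"
    return 404, b"Not Found"


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = build_health_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet on the console


def serve(host="127.0.0.1", port=8080):
    """Run the sketch server until interrupted (host and port are assumptions)."""
    HTTPServer((host, port), HealthHandler).serve_forever()
```

Calling `serve()` starts the endpoint locally. One commonly cited reason for keeping the check this shallow is that a shared dependency, such as a single database, fails for every web server at once, so a deep check could pull the whole farm out of rotation at the same moment.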

This is now the 10th week in a mini-series on web farms, and the 34th week of the entire series. You can view past and future weeks here: http://dotnetslackers.com/projects/LearnIIS7/

4 Comments

  • Hi Scott,

    I have a server farm with two servers, one of which is unavailable. My health test is configured to make a request every 30 seconds, but when I read the IIS request log I found something like this:

    2013-11-25 18:43:26 /health-test/ -www.mysite.com 200 0 0 270 90 15 -> health check 30 seconds period
    2013-11-25 18:43:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:43:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:43:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:43:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:43:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:43:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:43:55 /health-test/ -www.mysite.com 200 0 0 270 90 15
    2013-11-25 18:43:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:43:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:43:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:43:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:43:56 /health-test/ -www.mysite.com 200 0 0 270 90 0 -> health check 30 seconds period
    2013-11-25 18:44:25 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:25 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:25 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:25 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:25 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:25 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:25 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:25 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:25 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:25 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:25 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:26 /health-test/ -www.mysite.com 200 0 0 270 90 0 -> health check 30 seconds period
    2013-11-25 18:44:54 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:54 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:54 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:54 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:54 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:55 /health-test/ -www.mysite.com 200 0 0 270 90 15
    2013-11-25 18:44:55 /health-test/ -www.mysite.com 200 0 0 270 90 15
    2013-11-25 18:44:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
    2013-11-25 18:44:55 /health-test/ -www.mysite.com 200 0 0 270 90 0

    As you can see, there is one request that repeats only every 30 seconds, but I don't understand why there are other requests, many of them within the same second. Do you know how the health check works at this level?

    Thanks.

  • Hi Oscar,

    Sorry for the really long delay in replying. I'm not sure if you'll see this reply or not, but I'll reply in case you do.

    The issue that you get is an interesting one. The problem is that ARR doesn't have a dedicated worker process to do the health testing with, so it just uses ALL of the w3wp.exe worker processes. What that means is that if you have multiple app pools, or a web garden, then you'll have a pretty crazy health testing pattern.

    The only way to resolve that is to consolidate your app pools on your ARR server, or to live with the heavy testing pattern.

  • Hello Scott,

    What if the ARR health check returns "unknown"? Do you know any specific reason for that? This seems to be a mystery.

    Thanks,
    Andras

  • Hi Andras,

    Good question. Some things that you can check are:

    - try to manually test the page from your ARR server using a web browser. That should show whether it's accessible and works as you assume it should.
    - see if event viewer offers any clues.
    - review the health test settings to make sure that they are enabled and are actually checking at regular intervals.
    - check the logs on the web server to see if traffic is going to the web server.
    - try different options with the health tests, like pointing them at an unrelated site, or turning them off. See if that offers any clues to the issue.
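
The burst pattern in the log from the first comment can be summarized with a short script. This is a minimal sketch in Python, assuming each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp as in that excerpt. The names `parse_timestamp` and `group_into_bursts` and the 5-second gap threshold are hypothetical choices for this example, not anything ARR provides. If the explanation in the reply holds, the number of requests per burst would roughly correspond to the number of w3wp.exe worker processes running health tests on the ARR server.

```python
from datetime import datetime, timedelta


def parse_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' of a log line."""
    return datetime.strptime(line.strip()[:19], "%Y-%m-%d %H:%M:%S")


def group_into_bursts(lines, max_gap_seconds=5):
    """Group requests whose timestamps are close together.

    Returns a list of (first_timestamp, request_count) tuples. A new burst
    starts whenever the gap to the previous request exceeds max_gap_seconds.
    """
    bursts = []
    previous = None
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        ts = parse_timestamp(line)
        if previous is None or ts - previous > timedelta(seconds=max_gap_seconds):
            bursts.append([ts, 0])
        bursts[-1][1] += 1
        previous = ts
    return [(start, count) for start, count in bursts]


# A shortened excerpt of the log lines quoted in the first comment.
sample = """\
2013-11-25 18:43:26 /health-test/ -www.mysite.com 200 0 0 270 90 15
2013-11-25 18:43:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
2013-11-25 18:43:55 /health-test/ -www.mysite.com 200 0 0 270 90 0
2013-11-25 18:43:56 /health-test/ -www.mysite.com 200 0 0 270 90 0
2013-11-25 18:44:25 /health-test/ -www.mysite.com 200 0 0 270 90 0
""".splitlines()

for start, count in group_into_bursts(sample):
    print(start.time(), count)
```

Running the same function over a full log file makes it easier to see whether the burst size stays constant, which is what you would expect if every worker process probes on the same 30-second schedule.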
