Checking the Health of an ASP.NET Core Application

Monday, July 29, 2024

Introduction

Health checks, a way to evaluate the health status of our system, have been part of ASP.NET Core since version 2.2. In a nutshell, you register a number of health checks and run them all at some point to assess their state. If any of them returns anything other than healthy, the system as a whole is considered either degraded or unhealthy.
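As a rough sketch of that aggregation rule, the snippet below builds a HealthReport by hand from two made-up entries and prints the overall status. The entry names ("database" and "cache"), descriptions, and durations are purely illustrative, and the snippet assumes the Microsoft.Extensions.Diagnostics.HealthChecks package is referenced. In a real application the framework builds the report for you.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Two hand-made entries; names, descriptions and durations are illustrative only.
var entries = new Dictionary<string, HealthReportEntry>
{
    ["database"] = new HealthReportEntry(HealthStatus.Healthy, "All good", TimeSpan.FromMilliseconds(12), null, null),
    ["cache"] = new HealthReportEntry(HealthStatus.Degraded, "Responding slowly", TimeSpan.FromMilliseconds(480), null, null),
};

// The overall status of a report is the worst status among its entries.
var report = new HealthReport(entries, TimeSpan.FromMilliseconds(492));

Console.WriteLine(report.Status); // Degraded
```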

Health Checkers

A health checker is an implementation of IHealthCheck, registered with a unique name and, optionally, some associated tags. Microsoft provides a few checkers in the packages Microsoft.Extensions.Diagnostics.HealthChecks.ResourceUtilization (for system resources: CPU and memory usage), Microsoft.Extensions.Diagnostics.HealthChecks.EntityFrameworkCore (for EF Core), Microsoft.Extensions.Diagnostics.HealthChecks.Common, and Microsoft.Extensions.Diagnostics.Probes (for Kubernetes), as well as a framework for building more, in Microsoft.Extensions.Diagnostics.HealthChecks.

Here is an example of using the Microsoft.Extensions.Diagnostics.HealthChecks.ResourceUtilization health check to monitor resource utilization:

builder.Services.AddResourceMonitoring();
builder.Services.AddHealthChecks()
    .AddResourceUtilizationHealthCheck(o =>
    {
        o.CpuThresholds = new ResourceUsageThresholds
        {
            DegradedUtilizationPercentage = 80,
            UnhealthyUtilizationPercentage = 90,
        };
        o.MemoryThresholds = new ResourceUsageThresholds
        {
            DegradedUtilizationPercentage = 80,
            UnhealthyUtilizationPercentage = 90,
        };
        o.SamplingWindow = TimeSpan.FromSeconds(5);
    });

For those interested, please read this article.

And the Application Lifecycle (from Microsoft.Extensions.Diagnostics.HealthChecks.Common), which checks if the application has fully started or is shutting down:

builder.Services.AddHealthChecks()
    .AddApplicationLifecycleHealthCheck();

See here for more information on this checker.

Or the Manual Check, also from Microsoft.Extensions.Diagnostics.HealthChecks.Common, which lets us control the state manually:

builder.Services.AddHealthChecks()
    .AddManualHealthCheck();

...

IManualHealthCheck<MyService> healthCheck = serviceProvider.GetRequiredService<IManualHealthCheck<MyService>>();
healthCheck.ReportUnhealthy("Not so well");

Here are a few examples of custom health checkers:

public class PingHealthCheck : IHealthCheck
{
    public PingHealthCheck(string ipAddress)
    {
        ArgumentException.ThrowIfNullOrWhiteSpace(ipAddress, nameof(ipAddress));
 
        if (!System.Net.IPAddress.TryParse(ipAddress, out var ip))
        {
            throw new ArgumentException($"Invalid IP address {ipAddress}", nameof(ipAddress));
        }

        this.IPAddress = ip;
    }
 
    public IPAddress IPAddress { get; }
 
    public async Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default(CancellationToken))
    {
        using var ping = new Ping();
        var reply = await ping.SendPingAsync(IPAddress);
 
        if (reply.Status == IPStatus.Success)
        {
            return HealthCheckResult.Healthy("The IP address is reachable.");
        }
 
        return HealthCheckResult.Unhealthy("The IP address is unreachable.");
    }
}

As you can guess, it checks whether or not we have network connectivity to a certain IP address by sending a ping (ICMP echo) request.

Yet another example, this time, to see if we can access a site:

public class WebHealthCheck : IHealthCheck
{
    public WebHealthCheck(string url)
    {
        ArgumentException.ThrowIfNullOrWhiteSpace(url, nameof(url));
 
        if (!Uri.TryCreate(url, UriKind.Absolute, out var uri))
        {
            throw new ArgumentException($"Invalid URL {url}", nameof(url));
        }
 
        this.Url = uri;
    }
 
    public Uri Url { get; }
 
    public async Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default(CancellationToken))
    {
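        using var client = new HttpClient();
        // Pass the token along so that the request is cancelled together with the check.
        var response = await client.GetAsync(this.Url, cancellationToken);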
 
        if (response.StatusCode < HttpStatusCode.BadRequest)
        {
            return HealthCheckResult.Healthy("The URL is accessible.");
        }
 
        return HealthCheckResult.Unhealthy("The URL is inaccessible.");
    }
}

Yet another one, to check SQL Server connectivity:

public class SqlServerHealthCheck : IHealthCheck
{
    public SqlServerHealthCheck(string connectionString)
    {
        ArgumentException.ThrowIfNullOrWhiteSpace(connectionString, nameof(connectionString));
 
        ConnectionString = connectionString;
    }
 
    public string ConnectionString { get; }
 
    public async Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default(CancellationToken))
    {
        try
        {
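            using var con = new SqlConnection(ConnectionString);
            // The connection must be opened before a command can run on it.
            await con.OpenAsync(cancellationToken);

            using var cmd = con.CreateCommand();
            cmd.CommandText = "SELECT 1";
            await cmd.ExecuteScalarAsync(cancellationToken);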
 
            return HealthCheckResult.Healthy("Connection successful.");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("Connection failed.", ex);
        }
    }
}

There are also other open source checkers. The best known and most widely used library is AspNetCore.Diagnostics.HealthChecks, from Xabaril, which contains a large number of checkers covering common technologies such as Azure and AWS.

We register checkers using the AddHealthChecks() extension method:

builder.Services.AddHealthChecks()
    .AddCheck("SQL Server Check", new SqlServerHealthCheck(builder.Configuration.GetConnectionString("Default")))
    .AddCheck("Web Check", new WebHealthCheck("https://google.com"))
    .AddCheck("Ping Check", new PingHealthCheck("8.8.8.8"));

You can register as many as you want using the AddCheck() method, which also has an optional tags parameter:

builder.Services.AddHealthChecks()
    .AddCheck("SQL Server Check", new SqlServerHealthCheck(builder.Configuration.GetConnectionString("Default")), tags: new[]{ "db", "sql" })
    .AddCheck("Web Check", new WebHealthCheck("https://google.com"), tags: new[]{ "tcp", "http" })
    .AddCheck("Ping Check", new PingHealthCheck("8.8.8.8"), tags: new[]{ "tcp" });

When checking the health, you can look for only the checkers associated with a specific tag, as we'll see later.

For dynamic registrations, there are two methods:

AddCheck<T>(): which takes a generic argument type of some class that implements IHealthCheck and which will be built using dependency injection (DI)
AddTypeActivatedCheck<T>(): which also takes a generic argument but also some optional parameters to pass to the constructor of the health checker type

The difference between these two is that the latter can take additional constructor arguments to pass to the IHealthCheck-implementing class. Take, for example, this health checker:

public class DbContextHealthCheck<TContext> : IHealthCheck where TContext : DbContext
{
    public DbContextHealthCheck(TContext context, Func<TContext, bool> condition)
    {
        ArgumentNullException.ThrowIfNull(context, nameof(context));
        ArgumentNullException.ThrowIfNull(condition, nameof(condition));
 
        Context = context;
        Condition = condition;
    }
 
    public TContext Context { get; }
    public Func<TContext, bool> Condition { get; }
 
    public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default(CancellationToken))
    {
        try
        {
            if (Condition(Context))
            {
                return Task.FromResult(HealthCheckResult.Healthy("Query succeeded."));
            }
            else
            {
                return Task.FromResult(HealthCheckResult.Unhealthy("Query failed."));
            }
        }
        catch (Exception ex)
        {
            return Task.FromResult(HealthCheckResult.Unhealthy("Query failed.", ex));
        }
    }
}

As you can see, this uses an Entity Framework Core context (DbContext-derived class), which it gets from DI in its constructor, but also another parameter, the one that actually does the check. The way to register this would be like this (for some hypothetical BlogContext class, and checking if there are any posts, which is merely used here for demo purposes, of course):

builder.Services.AddDbContext<BlogContext>();

builder.Services.AddHealthChecks()
    .AddTypeActivatedCheck<DbContextHealthCheck<BlogContext>>("Blogs Check", (BlogContext ctx) => ctx.Posts.Any());

For the curious, there is already a Microsoft implementation of DbContextHealthCheck in Microsoft.Extensions.Diagnostics.HealthChecks.EntityFrameworkCore.
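As a minimal sketch of that built-in checker, and assuming the same hypothetical BlogContext with a Posts set used above, the registration could look roughly like the fragment below (it belongs in Program.cs, where builder is available). AddDbContextCheck comes from the Microsoft.Extensions.Diagnostics.HealthChecks.EntityFrameworkCore package; the check name and the "db" tag are arbitrary choices.

```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;

// BlogContext is the hypothetical context from the article; provider setup is omitted here too.
builder.Services.AddDbContext<BlogContext>();

builder.Services.AddHealthChecks()
    .AddDbContextCheck<BlogContext>(
        name: "Blogs Check",
        tags: new[] { "db" },
        // Optional: replaces the default connectivity test with a custom query.
        customTestQuery: (ctx, cancellationToken) => ctx.Posts.AnyAsync(cancellationToken));
```

If customTestQuery is left out, the checker only verifies that the database can be reached, which is usually enough for a health probe.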

If you want to use a lambda (possibly only for testing/debugging) to check for results you have two options:

AddCheck(): performs a synchronous check
AddAsyncCheck(): identical to the previous one, but this time, asynchronous

Here is an example:

builder.Services.AddHealthChecks()
    .AddCheck("Sync Check", () => HealthCheckResult.Healthy("All is well"))
    .AddAsyncCheck("Async Check", async () => await Task.FromResult(HealthCheckResult.Degraded("All is not well")));

Again, every overload of the AddXXCheck methods accepts, through the tags parameter, a collection of strings that can be used to group checkers together, as we saw earlier.

The IHealthCheck's CheckHealthAsync method returns an instance of HealthCheckResult, which contains a Description and a Status property holding one of the three possible statuses:

Healthy: when everything is working as expected
Degraded: when something is possibly wrong, but still functional
Unhealthy: when something is not working as expected

For Degraded and Unhealthy results, it is also possible to attach an Exception.
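To make the three statuses concrete, here is a small sketch of a checker that maps a measured latency onto them. LatencyHealthCheck, the measurement delegate, and the two thresholds are all hypothetical names and values chosen for illustration; they are not part of any library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical checker: maps a measured latency onto the three statuses.
public class LatencyHealthCheck : IHealthCheck
{
    private readonly Func<TimeSpan> _measure;
    private readonly TimeSpan _degradedAbove;
    private readonly TimeSpan _unhealthyAbove;

    public LatencyHealthCheck(Func<TimeSpan> measure, TimeSpan degradedAbove, TimeSpan unhealthyAbove)
    {
        ArgumentNullException.ThrowIfNull(measure);

        _measure = measure;
        _degradedAbove = degradedAbove;
        _unhealthyAbove = unhealthyAbove;
    }

    public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        try
        {
            var latency = _measure();
            var description = $"Latency: {latency.TotalMilliseconds} ms";

            if (latency > _unhealthyAbove)
            {
                return Task.FromResult(HealthCheckResult.Unhealthy(description));
            }

            if (latency > _degradedAbove)
            {
                return Task.FromResult(HealthCheckResult.Degraded(description));
            }

            return Task.FromResult(HealthCheckResult.Healthy(description));
        }
        catch (Exception ex)
        {
            // The exception travels with the result and ends up in the report.
            return Task.FromResult(HealthCheckResult.Unhealthy("Measurement failed.", ex));
        }
    }
}
```

It can be registered like any other instance-based checker, for example with AddCheck("Latency Check", new LatencyHealthCheck(...)).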

Health Checks on Request

So, when you want to check the health status of your system, you have two options:

Use the provided endpoint
Invoke the registered checkers explicitly

Of course, option #1 also does #2 automatically. Normally, we add middleware to the pipeline that listens on a specific URL, using the UseHealthChecks() extension method:

app.UseHealthChecks("/Health");

The path argument is required: UseHealthChecks() has no overload that falls back to a default path.

A more interesting example, passing configuration options (HealthCheckOptions), would be:

app.UseHealthChecks("/Health", new HealthCheckOptions
    {
        AllowCachingResponses = true,
        ResponseWriter = async (context, report) =>
        {
            var result = JsonSerializer.Serialize(new
            {
                report.Entries.Count,
                Unhealthy = report.Entries.Count(x => x.Value.Status == HealthStatus.Unhealthy),
                Degraded = report.Entries.Count(x => x.Value.Status == HealthStatus.Degraded),
                Status = report.Status.ToString(),
                report.TotalDuration,
                Checks = report.Entries.Select(e => new
                {
                    Check = e.Key,
                    e.Value.Description,
                    e.Value.Duration,
                    Status = e.Value.Status.ToString()
                })
            });
 
            context.Response.ContentType = MediaTypeNames.Application.Json;
            await context.Response.WriteAsync(result);
        }
    });

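AllowCachingResponses, which is false by default, controls whether the response may be cached. When it is false, the middleware adds headers (Cache-Control, Pragma, and Expires) that tell browsers and proxies not to cache the response; when it is true, those headers are left out.

As you can see, in this example, we are projecting the report into an anonymous type which holds the information we're interested in exposing. When accessing the /Health endpoint it returns this:

[Image: health report]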

Now, by default, the endpoint returns these status codes:

HTTP 200 - OK: when the status is Healthy or Degraded
HTTP 503 - Service Unavailable: when the status is Unhealthy

But we can change it using ResultStatusCodes:

app.MapHealthChecks("/Health", new HealthCheckOptions
    {
        ResultStatusCodes =
        {
            [HealthStatus.Healthy] = StatusCodes.Status200OK,
            [HealthStatus.Degraded] = StatusCodes.Status200OK,
            [HealthStatus.Unhealthy] = StatusCodes.Status503ServiceUnavailable
        },
        //rest goes here
    });

And also set a condition on which checkers to fire, using Predicate:

app.MapHealthChecks("/Health", new HealthCheckOptions
    {
        Predicate = check => check.Tags.Contains("db"),
        //rest goes here
    });

As you can probably tell, we are filtering those checkers that are registered with the tag "db".
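Building on this, one possible arrangement is to expose one endpoint per tag, so that different probes can ask different questions. The fragment below is only a sketch for Program.cs: the two paths are made up, and the "db" and "tcp" tags are the ones used in the registrations earlier.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;

// Runs only the checkers tagged "db" (hypothetical path).
app.MapHealthChecks("/Health/Db", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("db")
});

// Runs only the checkers tagged "tcp" (hypothetical path).
app.MapHealthChecks("/Health/Tcp", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("tcp")
});
```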

And, if you want a simple HTTP client for checking the status remotely, you can have something like this:

public static class ServiceCollectionExtensions
{
    public static IServiceCollection AddHealthCheckClient(this IServiceCollection services, string baseAddress)
    {
        services.AddHttpClient("HealthCheckClient", client =>
        {
            client.BaseAddress = new Uri(baseAddress);
            client.DefaultRequestHeaders.Add(HeaderNames.Accept, MediaTypeNames.Application.Json);
            client.DefaultRequestHeaders.Add(HeaderNames.CacheControl, "no-cache");
        }).AddTypedClient<IHealthCheckClient, HealthCheckClient>();

        return services;

    }
}

public interface IHealthCheckClient
{
    Task<HealthStatus> CheckHealth(CancellationToken cancellationToken = default);
}

public class HealthCheckClient : IHealthCheckClient
{
    private readonly HttpClient _httpClient;

    public HealthCheckClient(HttpClient httpClient)
    {
        _httpClient = httpClient;
    }

    public async Task<HealthStatus> CheckHealth(CancellationToken cancellationToken = default)
    {
        var response = await _httpClient.GetAsync(string.Empty, cancellationToken);
        return response.StatusCode == HttpStatusCode.OK ? HealthStatus.Healthy : HealthStatus.Unhealthy;
    }
}

The IHealthCheckClient interface abstracts away the actual implementation, so we can actually have a local one:

public class LocalHealthCheckClient : IHealthCheckClient
{
    private readonly HealthCheckService _service;

    public LocalHealthCheckClient(HealthCheckService service)
    {
        _service = service;
    }

    public async Task<HealthStatus> CheckHealth(CancellationToken cancellationToken = default)
    {
        var report = await _service.CheckHealthAsync(cancellationToken);
        return report.Status;
    }
}

Just register the health client and you're done:

builder.Services.AddHealthCheckClient("https://localhost:80/Health");
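One limitation of the HTTP client above is that Healthy and Degraded both map to HTTP 200 by default, so it cannot tell them apart. The sketch below reads the response body instead. It assumes the endpoint uses the default response writer, which writes the status name as plain text; with a custom JSON writer such as the one shown earlier, you would parse the JSON instead. HealthStatusParser and BodyAwareHealthCheckClient are hypothetical names.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public static class HealthStatusParser
{
    // Maps a plain-text body ("Healthy", "Degraded", "Unhealthy") to a HealthStatus.
    // Anything unrecognised, including null, is treated as Unhealthy.
    public static HealthStatus Parse(string? body)
    {
        return Enum.TryParse<HealthStatus>(body?.Trim(), true, out var status) && Enum.IsDefined(status)
            ? status
            : HealthStatus.Unhealthy;
    }
}

// Hypothetical client: same idea as HealthCheckClient, but it inspects the body.
public class BodyAwareHealthCheckClient
{
    private readonly HttpClient _httpClient;

    public BodyAwareHealthCheckClient(HttpClient httpClient)
    {
        _httpClient = httpClient;
    }

    public async Task<HealthStatus> CheckHealth(CancellationToken cancellationToken = default)
    {
        using var response = await _httpClient.GetAsync(string.Empty, cancellationToken);
        var body = await response.Content.ReadAsStringAsync(cancellationToken);

        return HealthStatusParser.Parse(body);
    }
}
```

The class has the same CheckHealth signature as IHealthCheckClient, so it could implement that interface and be registered in its place.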

For option #2, when we don't want to access the HTTP endpoint, we can use a local API, which we get from DI: the HealthCheckService class. It gets registered by the call to AddHealthChecks(), and here is one possible way to use it:

[HttpGet("[action]")]
public async Task<IActionResult> CheckHealth([FromServices] HealthCheckService healthService, CancellationToken cancellationToken)
{
    var report = await healthService.CheckHealthAsync(cancellationToken);
    var result = new
        {
            report.Entries.Count,
            Unhealthy = report.Entries.Count(x => x.Value.Status == HealthStatus.Unhealthy),
            Degraded = report.Entries.Count(x => x.Value.Status == HealthStatus.Degraded),
            Status = report.Status.ToString(),
            report.TotalDuration,
            Checks = report.Entries.Select(e => new
            {
                Check = e.Key,
                e.Value.Description,
                e.Value.Duration,
                Status = e.Value.Status.ToString()
            })
        };

    return Json(result);
}

As you can see, it's the same projection that we did for the HealthCheckOptions, so we can just add an extension method:

public static class HealthReportExtensions
{
    public static object ToExtended(this HealthReport report)
    {
        return new
        {
            report.Entries.Count,
            Unhealthy = report.Entries.Count(x => x.Value.Status == HealthStatus.Unhealthy),
            Degraded = report.Entries.Count(x => x.Value.Status == HealthStatus.Degraded),
            Status = report.Status.ToString(),
            report.TotalDuration,
            Checks = report.Entries.Select(e => new
            {
                Check = e.Key,
                e.Value.Description,
                e.Value.Duration,
                Status = e.Value.Status.ToString()
            })
        };
    }
}

One thing to know is that there is also an overload of CheckHealthAsync that takes a predicate for filtering the health checkers to run:

var report = await healthService.CheckHealthAsync(registration => registration.Tags.Contains("tcp"), cancellationToken);

Health Checks on a Schedule

The final option I'm going to cover is running the health checks on a schedule. ASP.NET Core health checks support this out of the box, by means of the IHealthCheckPublisher interface and a custom implementation. Here is a simple example:

builder.Services.Configure<HealthCheckPublisherOptions>(options =>
{
    options.Delay = TimeSpan.FromSeconds(10);
});

builder.Services.AddSingleton<IHealthCheckPublisher, PeriodicHealthCheckPublisher>();

HealthCheckPublisherOptions contains information that is to be used by the infrastructure like the initial Delay and the scheduling Period, and also a Predicate to set the filtering conditions for the health checkers.

As for the PeriodicHealthCheckPublisher class itself, it's very simple:

class PeriodicHealthCheckPublisher : IHealthCheckPublisher
{
    public Task PublishAsync(HealthReport report, CancellationToken cancellationToken)
    {
        var result = JsonSerializer.Serialize(new
        {
            report.Entries.Count,
            Unhealthy = report.Entries.Count(x => x.Value.Status == HealthStatus.Unhealthy),
            Degraded = report.Entries.Count(x => x.Value.Status == HealthStatus.Degraded),
            Status = report.Status.ToString(),
            report.TotalDuration,
            Checks = report.Entries.Select(e => new
            {
                Check = e.Key,
                e.Value.Description,
                e.Value.Duration,
                Status = e.Value.Status.ToString()
            })
        });

        Console.WriteLine(result);

        return Task.CompletedTask;
    }
}

The PublishAsync method is called by the infrastructure after the initial Delay and then once every Period, and we can do whatever we want with the report it receives.
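Printing every report can get noisy, so a variation worth considering is a publisher that only reacts when the overall status changes. The sketch below is one hypothetical way to do that with ILogger; the class name, the Changes counter, and the log message are illustrative choices. It relies on being registered as a singleton, as in the registration above, so that it remembers the previous status.

```csharp
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using Microsoft.Extensions.Logging;

// Hypothetical publisher: logs only when the overall status changes.
public class StatusChangeHealthCheckPublisher : IHealthCheckPublisher
{
    private readonly ILogger<StatusChangeHealthCheckPublisher> _logger;
    private HealthStatus? _lastStatus;

    public StatusChangeHealthCheckPublisher(ILogger<StatusChangeHealthCheckPublisher> logger)
    {
        _logger = logger;
    }

    // Number of status changes seen so far (exposed for illustration only).
    public int Changes { get; private set; }

    public Task PublishAsync(HealthReport report, CancellationToken cancellationToken)
    {
        if (_lastStatus != report.Status)
        {
            Changes++;

            // Names of the checkers that are not healthy in this report.
            var failing = report.Entries
                .Where(e => e.Value.Status != HealthStatus.Healthy)
                .Select(e => e.Key);

            var level = report.Status == HealthStatus.Healthy ? LogLevel.Information : LogLevel.Warning;

            _logger.Log(level, "Health status changed from {Previous} to {Current}. Not healthy: {Checks}",
                _lastStatus, report.Status, string.Join(", ", failing));

            _lastStatus = report.Status;
        }

        return Task.CompletedTask;
    }
}
```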

There is a more advanced option, which is to run individual health checkers on their own schedule. We achieve that by adding HealthCheckRegistration entries, each with its own Delay and Period, to the list of health checks:

builder.Services.AddHealthChecks()
    .Add(new HealthCheckRegistration("Ping Check", new PingHealthCheck("8.8.8.8"), HealthStatus.Unhealthy, new [] { "tcp" }) { Delay = TimeSpan.FromMinutes(1), Period = TimeSpan.FromMinutes(5) });

Security

One thing that you might have asked yourself is whether the health check endpoint is available to everyone, and whether that is a security issue in some way. With that in mind, the framework allows us to protect access to it using the out-of-the-box security middleware of ASP.NET Core.

So, if you want it to respond only to requests addressed to localhost (RequireHost):

app.MapHealthChecks("/Health")
    .RequireHost("localhost");

You can also specify a local domain, which means only requests addressed to that host name will be served (RequireHost matches the Host header of the request, not the address of the caller):

app.MapHealthChecks("/Health")
    .RequireHost("local.domain");

Or even require an authorised user altogether (RequireAuthorization):

app.MapHealthChecks("/Health")
    .RequireAuthorization();
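These conventions can also be combined. The fragment below is a sketch for Program.cs that restricts the endpoint to a dedicated port and to a named authorization policy. The port 5001 and the policy name "HealthPolicy" are hypothetical, and the fragment assumes that authentication and authorization are already configured in the application.

```csharp
using Microsoft.AspNetCore.Builder;

// Assumes authentication/authorization are set up elsewhere and that a policy
// named "HealthPolicy" (hypothetical) was registered through AddAuthorization().
app.MapHealthChecks("/Health")
    .RequireHost("*:5001")                 // any host name, but only on port 5001 (hypothetical port)
    .RequireAuthorization("HealthPolicy"); // and only for callers that satisfy the policy
```

Since the Host header is supplied by the client, it is safer to treat RequireHost as a routing convenience and rely on RequireAuthorization for actual protection.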

You may have noticed that I switched from UseHealthChecks to MapHealthChecks. Well, it turns out that MapHealthChecks returns an endpoint builder that allows composition, meaning we can chain conventions such as RequireHost and RequireAuthorization onto it, whereas UseHealthChecks does not.

Conclusion

And that was what I wanted to talk about! For more information, please have a look at https://learn.microsoft.com/en-us/aspnet/core/host-and-deploy/health-checks and https://learn.microsoft.com/en-us/dotnet/architecture/microservices/implement-resilient-applications/monitor-app-health. And don't forget to keep your comments coming!

Nice one!
PS please, fix the type in the title :)

Georgi Hadzhigeorgiev - Monday, July 29, 2024 5:44:32 PM

@Georgi: many thanks! Can you believe I hadn’t noticed? 😉

RicardoPeres - Monday, July 29, 2024 11:06:56 PM

Thanks for this article, I really appreciate the effort you put into it. For anyone else who tries playing with the AddResourceUtilizationHealthCheck code above, you will also need to add the following code:

`builder.Services.AddResourceMonitoring();`

Matthew - Thursday, August 1, 2024 9:44:09 AM

@Matthew: many thanks, updated!

RicardoPeres - Thursday, August 1, 2024 10:46:20 AM

What third party app do you use to monitor health checks from multiple instances of the same app and get the alerts when the app is unhealthy?

Robert - Monday, August 5, 2024 7:18:38 AM

@Robert: in real life/work? Mostly NewRelic.

RicardoPeres - Tuesday, August 6, 2024 9:34:27 AM

This breakdown of health checks in ASP.NET Core is really useful! Quick question!!!
what do you usually do when a service is in a "Degraded" state, like if the database is slowing down but still working? Do you log it, send alerts, or automate some kind of response? Also, how do you keep the performance hit low when running all these checks, especially if they're heavy on resources?

چاپ کاتالوگ - Wednesday, August 14, 2024 6:46:37 PM

All of the above: log, send alerts, and maybe throttle down the requests, something for an upcoming post!
The checks are either run on a schedule (every few minutes) or on demand, as I’ve shown. They really don’t take many resources (CPU, memory), so it should be ok.

RicardoPeres - Thursday, August 15, 2024 9:46:27 AM
