Azure Service Bus Subscriptions with Correlation Filters

Sunday, April 1, 2018


Azure Service Bus pub/sub is implemented using topics and subscriptions. Messages are published to topics and copied over to subscription queues with matching criteria. Criteria are declared using Rules. Each rule has a Filter. Filters help the broker decide if a message sent to a topic will be copied over to a subscription or not. Let’s dive into the world of filters to understand how they work. There are three types of filters supported by the broker:

Boolean filters
SQL filters
Correlation filters

Boolean filters

These filters (TrueFilter and FalseFilter) are not the most sophisticated. They are literally "catch-all" or "catch nothing" options. The TrueFilter is the default when nothing else is defined. It’s handy when implementing a wiretap to analyze all messages flowing through a topic.

SQL filters

Just as the name indicates, SQL filters use SQL-like expressions to define the criteria that identify which messages will be copied over to a subscription. If you’re interested in the syntax, see the documentation. One thing I’ll mention is the idea of scope, which denotes the type of property: user-defined properties are prefixed with user and system-defined properties are prefixed with sys.

With SQL filters it’s possible to create very complex rules for filtering messages out. Keep in mind that the more complex these rules are, the higher the performance toll on the broker will be, since it has to apply these rules to every message. An example of a SQL rule:

sys.Label LIKE '%bus%' OR user.tag IN ('queue', 'topic', 'subscription')
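To make the rule above concrete, here is a minimal sketch of creating a subscription with it. It assumes the older WindowsAzure.ServiceBus client (namespaces Microsoft.ServiceBus and Microsoft.ServiceBus.Messaging) and an existing topic. The class name, method name, and all three parameters are placeholders, not anything from the original post.

```csharp
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

public static class SqlFilterSetup
{
    // connectionString, topicPath and subscriptionName are placeholders
    // supplied by the caller; the topic is assumed to exist already.
    public static void CreateFilteredSubscription(
        string connectionString, string topicPath, string subscriptionName)
    {
        // The same expression as the SQL rule shown above.
        var filter = new SqlFilter(
            "sys.Label LIKE '%bus%' OR user.tag IN ('queue', 'topic', 'subscription')");

        var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);

        if (!namespaceManager.SubscriptionExists(topicPath, subscriptionName))
        {
            // Passing a filter here replaces the default catch-all rule.
            namespaceManager.CreateSubscription(topicPath, subscriptionName, filter);
        }
    }
}
```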

Correlation filters

Unlike Boolean and SQL filters, this group is used to perform matching against one or more user and system properties in a very efficient way. Paraphrasing official documentation:

The CorrelationFilter provides an efficient filter that deals with equality only. As such, the cost of evaluating the filter expression is minimal and almost immediate, without extra compute required.

The only challenge with this filter is that it’s not quite clear how to use it. There are two constructors:

Default (empty) constructor
Constructor taking a single argument

The second constructor, with a single string argument, initializes a correlation filter that uses the passed-in value as the criteria for CorrelationId. The first constructor is a mystery. Well, not quite. An empty correlation filter can be assigned other system and user properties to filter on. For system properties, ContentType, Label, MessageId, ReplyTo, ReplyToSessionId, SessionId, To, and CorrelationId can be assigned values to filter on. What about user-defined properties? That’s where it’s a little unclear if you follow the documentation, which hopefully will get updated soon.

To specify user-defined properties for a correlation filter, a property Properties of type IDictionary<string, object> is exposed. Keys for this dictionary are the user-defined properties to look up on messages. Values associated with keys are the values to correlate on. Here’s an example.

var filter = new CorrelationFilter();
filter.Label = "blah";
filter.ReplyTo = "x";
filter.Properties["prop1"] = "abc";
filter.Properties["prop2"] = "xyz";

The created filter will have the following criteria:

sys.ReplyTo = 'x' AND sys.Label = 'blah' AND prop1 = 'abc' AND prop2 = 'xyz'

And you’ve guessed it right. When correlating on multiple properties, logical AND will be used so that all properties have to have the expected values for the filter to be evaluated as truthy.
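The AND semantics can be illustrated with a small local model. This is not how the broker is implemented, and it covers only user-defined properties. It is a sketch that assumes plain dictionaries stand in for the filter criteria and the message properties, with CorrelationModel and Matches being hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CorrelationModel
{
    // Local model only: returns true when every criterion is present on the
    // message with an equal value (logical AND over equality checks).
    // Properties on the message that the criteria do not mention are ignored.
    public static bool Matches(
        IDictionary<string, object> criteria,
        IDictionary<string, object> messageProperties)
    {
        return criteria.All(criterion =>
            messageProperties.TryGetValue(criterion.Key, out var actual) &&
            object.Equals(actual, criterion.Value));
    }
}
```

With criteria prop1 = "abc" and prop2 = "xyz", a message carrying both values (plus any extra properties) matches, while a message carrying only prop1 does not.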

In case you wondered how user-defined properties are populated, here's an example:

message.Properties["prop1"] = "abc";
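Putting the pieces together, the following sketch creates a subscription with the correlation filter from above and sends a message that satisfies all four criteria. It assumes the older WindowsAzure.ServiceBus client, where the message type is BrokeredMessage. In the newer Microsoft.Azure.ServiceBus package the user-defined properties live on Message.UserProperties instead. The class name, method name, and parameters are placeholders, and the topic is assumed to exist.

```csharp
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

public static class CorrelationExample
{
    // connectionString, topicPath and subscriptionName are placeholders.
    public static void Run(
        string connectionString, string topicPath, string subscriptionName)
    {
        // Same criteria as in the filter example above.
        var filter = new CorrelationFilter { Label = "blah", ReplyTo = "x" };
        filter.Properties["prop1"] = "abc";
        filter.Properties["prop2"] = "xyz";

        var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);
        if (!namespaceManager.SubscriptionExists(topicPath, subscriptionName))
        {
            namespaceManager.CreateSubscription(topicPath, subscriptionName, filter);
        }

        // Every property named by the filter must carry the expected value.
        var message = new BrokeredMessage { Label = "blah", ReplyTo = "x" };
        message.Properties["prop1"] = "abc";
        message.Properties["prop2"] = "xyz";

        var topicClient = TopicClient.CreateFromConnectionString(connectionString, topicPath);
        topicClient.Send(message);
        topicClient.Close();
    }
}
```

Changing any one of the four values on the message would keep it out of this subscription.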

Conclusions

Filtering messages can be done in several ways. Evaluate what filter to create and don’t default to SQL filter just because it’s easier to create. If filters can be simple enough to be expressed with correlation, prefer CorrelationFilter over SqlFilter. And remember, no matter what filter is used, filters cannot evaluate message body, but you can always promote from message body to properties/headers.
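As an illustration of promoting a value from the body, the sketch below copies one field of a payload into a user-defined property before sending. It assumes the older WindowsAzure.ServiceBus client. OrderPlaced, its members, and the "Region" property name are hypothetical and exist only for this example.

```csharp
using System.Runtime.Serialization;
using Microsoft.ServiceBus.Messaging;

// Hypothetical payload type used only for illustration.
[DataContract]
public class OrderPlaced
{
    [DataMember] public string OrderId { get; set; }
    [DataMember] public string Region { get; set; }
}

public static class Promotion
{
    public static BrokeredMessage ToMessage(OrderPlaced order)
    {
        // The payload is serialized into the message body,
        // which filters cannot evaluate.
        var message = new BrokeredMessage(order);

        // Promote the field to a user-defined property so a filter can match on it.
        message.Properties["Region"] = order.Region;
        return message;
    }
}
```

A correlation filter with Properties["Region"] set to the desired value could then select these messages without the broker ever reading the body.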

10 Comments

Hi Sean,

Thanks for the great example, the official documentation goes into great detail about SQL filter syntax but doesn't explain the other two types - what is the purpose of a `FalseFilter`? The documentation mentions that a correlation filter is more efficient than a SQL filter - do you have any concrete examples of the throughput difference between a correlation filter, and the equivalent SQL filter? Also, is there any way to set up a correlation filter equivalent to the following: `NOT EXISTS(prop1) OR prop1 = 'xyz'`?

Thanks,
Andrew

Andrew Williamson - Wednesday, May 30, 2018 3:46:53 PM

@Andrew,

Good question. `TrueFilter` is handy when you want a wiretap implemented to catch all messages. `FalseFilter` could be applied to disable the wiretap. At least that's what I've seen as useful for those two.

While I don't have concrete numbers for throughput differences, I can attest to the difference it makes on performance. One of my customers had premium namespaces with significantly degraded performance because they used SQL filters that were very expensive.

Regarding `NOT EXISTS(prop1) OR prop1 = 'xyz'` - there's no way. Correlation filters are always AND-ed and can only check for equality. That's why they are so performant. If you need filtering with more involved logic, SQL filters are the way to go.

Hope that helps,
Sean

sfeldman - Wednesday, May 30, 2018 5:02:53 PM

From what I understand a correlation filter will be used when creating a subscription. Correct? So I am really confused as to why https://docs.microsoft.com/en-us/azure/azure-subscription-service-limits#service-bus-limits states that Number of correlation filters per topic can be 100,000 when the limit for subscriptions is only 2000 per namespace. Am I interpreting this wrong and if so, how can I use so many correlation filters and not max out the subscription limit?

Sean - Tuesday, June 12, 2018 5:39:04 PM

@Sean,
A single subscription could have multiple filters. Each topic can have 2K subscriptions. Those subscriptions can have up to 2K SQL filters or 100K correlation filters. The difference is due to how those filters are evaluated. It takes much less compute with correlation filters than with SQL filters, hence the difference in maximum allowed number of those. Hope that makes it clear.

sfeldman - Tuesday, June 12, 2018 10:38:07 PM

Hi - what format should a correlation filter be in when using the Service Bus Explorer tool (v4.0.109)? I'm adding a filter as sys.Label = '1234' but when messages are sent with this label they do not appear in the subscription. I've also tried:

* label='1234'
* label=1234

and nothing appears to allow the message through onto the subscription. I've not set the "action" as I'm not clear what this is for in SBE.

Can you help?

Graham - Wednesday, June 20, 2018 8:20:34 AM

@Graham,

SBE does not create Correlation filter. It only operates with SQL filters.
You will need to create a filter for your subscription using a different tool or custom code.

sfeldman - Friday, June 22, 2018 12:09:41 AM

When I create a Correlation Filter with an Int assigned to a Property, the filter is expressed as Property = 'VAL'. It's wrapping the value in quotes and all my messages go to dead letter due to conversion errors. How do I force it to evaluate as Property = VAL?

Shevek - Wednesday, September 26, 2018 7:05:41 AM

@Shevek, correlation properties are always strings.

Sean Feldman - Thursday, September 27, 2018 11:43:11 PM

Hi Sean!

Love your articles around service bus.

My question is not about correlation filters but about setting MaxDeliveryCount on a webjob subscription client. I am using ServiceBusTrigger to consume the subscription message and NamespaceManager to create the filtered subscription description, but I'm failing to set MaxDeliveryCount. It always defaults to 10. I also need to configure minBackoff and maxBackoff.

If you can direct me to any post, that would be great.

Tanvir

Tanvir Hussain - Wednesday, November 7, 2018 10:24:22 AM

Thank you @Tanvir.

ServiceBusTrigger is an Azure Service Bus binding for Functions which uses a queue/subscription you need to create upfront.
When using NamespaceManager to create an entity, you can provide entity type descriptor such as QueueDescription or SubscriptionDescription. Both have MaxDeliveryCount property. When not specified, it defaults to 10. To have a different value, you would need to assign that property.

sfeldman - Thursday, November 8, 2018 10:06:20 PM
