AtomPub, JSON, Azure and Large Datasets, Part 2

[Cross-posted from here]

Last Friday I posted some initial results from simplistic testing that compared two ways of pulling data from Azure: ATOM (via the ADO.NET Data Services client) and JSON. I was surprised at the significant difference in payload size and time to completion. A little later, Steve Marx questioned my methodology on the grounds that Azure storage doesn’t support JSON. Steve wasn’t being contrary; he was pushing me to clarify my testing methodology and wanted to keep people from trying to use a JSON interface to Azure storage when none exists. This post is a follow-up to that one. It attempts to clarify things a bit and highlights some expanded findings.

The platform I’m working against is an Azure account with a storage account hosting the data (Azure Tables) and a web role that provides multiple interaction points to the data and makes those interaction points anonymous. Essentially, this web role serves as a “proxy” to the data and reformats it as necessary. After Steve’s question last week, I got to wondering about the overhead (if any) that the web role/proxy was introducing and whether, especially in the case of the ATOM data, it was drastically affecting the results. I also wondered whether the delays I was seeing in data transmission were caused in part by having to issue 9 serial requests to retrieve all 8,100 rows that satisfied my query.
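
To make the "9 serial requests" concrete, here is a minimal Python sketch of a client paging through a result set that is capped at 1,000 rows per request. It is not the code used in these tests: `make_fake_fetcher`, `fetch_all`, and the integer continuation token are hypothetical stand-ins for the storage service and its paging mechanism.

```python
from typing import Callable, Dict, List, Optional, Tuple

PAGE_SIZE = 1000  # per-request row cap described in the post

# Hypothetical page fetcher: takes a continuation token (None for the first
# page) and returns (rows, next_token); next_token is None on the last page.
FetchPage = Callable[[Optional[int]], Tuple[List[Dict], Optional[int]]]


def make_fake_fetcher(total_rows: int) -> FetchPage:
    """Stand-in for the storage service; serves rows from memory."""
    def fetch_page(token: Optional[int]) -> Tuple[List[Dict], Optional[int]]:
        start = token or 0
        end = min(start + PAGE_SIZE, total_rows)
        rows = [{"row": i} for i in range(start, end)]
        return rows, (end if end < total_rows else None)
    return fetch_page


def fetch_all(fetch_page: FetchPage) -> Tuple[List[Dict], int]:
    """Issue serial requests until no continuation token remains."""
    rows: List[Dict] = []
    requests = 0
    token: Optional[int] = None
    while True:
        page, token = fetch_page(token)
        requests += 1
        rows.extend(page)
        if token is None:
            return rows, requests


rows, requests = fetch_all(make_fake_fetcher(8100))
print(len(rows), requests)  # 8100 rows in 9 requests
```

Each request has to wait for the previous one to return its continuation token, which is why the calls are serial rather than parallel.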

To address these issues, I made the following adjustments:

  1. Tweaked my ATOM test harness to optionally hit the storage platform directly (bypassing the proxy data service).
  2. Tweaked the data service to accept an extra query string parameter telling the proxy to make as many calls to the storage service as necessary to gather the complete result set and then return the results to the caller as a single batch. This let me eliminate the 1,000-row limit and issue only a single HTTP request from the client.
  3. Increased the test runs from 10 to 20. This is still not scientifically rigorous by any means, but it gives a somewhat better sense of the average time for each request batch.
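
For readers who want to run a similar comparison, the following is a rough Python sketch of a timing harness that averages time to completion over a fixed number of runs. It is not the harness used here. The names `time_runs` and `compare` are made up for illustration, and the two workloads are placeholders (short sleeps) standing in for the real HTTP calls.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def time_runs(action: Callable[[], object], runs: int = 20) -> List[float]:
    """Call action() `runs` times and return each elapsed time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        timings.append(time.perf_counter() - start)
    return timings


def compare(strategies: Dict[str, Callable[[], object]], runs: int = 20) -> Dict[str, float]:
    """Average time to completion for each named strategy."""
    return {name: mean(time_runs(action, runs)) for name, action in strategies.items()}


# Placeholder workloads; a real harness would issue the HTTP requests here.
averages = compare(
    {
        "paged": lambda: [time.sleep(0.001) for _ in range(9)],  # 9 serial calls
        "full": lambda: time.sleep(0.001),                       # 1 call
    },
    runs=3,
)
for name, avg in averages.items():
    print(f"{name}: {avg:.4f} s")
```

Keeping the per-run timings (rather than only the mean) makes it easy to add a spread or median later if 20 runs turn out to be noisy.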

The results I received are as follows, and they are not altogether different from what one might expect:

image

image

As you can see from the charts above, the JSON FULL option was the fastest, with an average time to completion of 14.4 seconds. Comparing that with the regular JSON approach (18.55 seconds on average), you can infer that the overhead introduced by the multiple calls is roughly 4 seconds.
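
The arithmetic behind that estimate, as a small Python sketch. The two averages come from the results above; spreading the overhead evenly across the 8 extra round trips is an assumption made purely for illustration, not something the tests measured.

```python
json_paged_avg = 18.55  # seconds, 9 serial requests (from the results above)
json_full_avg = 14.4    # seconds, 1 request (from the results above)
extra_requests = 9 - 1  # round trips saved by the FULL option

overhead = json_paged_avg - json_full_avg
# Assumption: the overhead is spread evenly across the extra requests.
per_request = overhead / extra_requests

print(f"total overhead: {overhead:.2f} s")        # total overhead: 4.15 s
print(f"per extra request: {per_request:.2f} s")  # per extra request: 0.52 s
```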

In the ATOM category, I find it interesting that ATOM Direct (going directly to the storage service) was only marginally faster (0.2 of a second on average) than the ATOM FULL approach. This would indicate that the network calls between the web role and the storage service are almost a non-factor (hinting at rather good network speeds). Remember, in the case of ATOM FULL, the web role is doing exactly what the test client does in ATOM Direct, but it additionally bundles the XML responses into a single blob (rather than 9) and then sends that back to the client.

The following chart shows the average payload per request between the test harness and Azure. ATOM FULL differs from ATOM and ATOM Direct in that the former contains all 8,100 rows, whereas the latter two represent a single batch of 1,000. It is interesting to note that the JSON representation of all 8,100 records is only slightly larger than the ATOM representation of 1,000 records (1,326,298 bytes compared to 1,118,892 bytes).

image
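
To put those two byte counts on a comparable footing, here is a back-of-the-envelope Python calculation. It assumes the ATOM payload grows linearly with row count, so the ATOM total is an estimate rather than a measured value.

```python
ATOM_BYTES_PER_1000_ROWS = 1_118_892  # average ATOM payload per request (from the post)
JSON_BYTES_ALL_ROWS = 1_326_298       # JSON payload for all rows (from the post)
TOTAL_ROWS = 8_100

atom_per_row = ATOM_BYTES_PER_1000_ROWS / 1000
json_per_row = JSON_BYTES_ALL_ROWS / TOTAL_ROWS
# Assumption: ATOM payload size scales linearly with the number of rows.
atom_total_estimate = atom_per_row * TOTAL_ROWS

print(f"ATOM: ~{atom_per_row:.0f} bytes/row, JSON: ~{json_per_row:.0f} bytes/row")
print(f"estimated ATOM total: {atom_total_estimate / 1e6:.2f} MB")
print(f"ATOM/JSON size ratio: {atom_total_estimate / JSON_BYTES_ALL_ROWS:.1f}x")
```

Under that assumption the full ATOM result set comes to roughly 9 MB, against about 1.3 MB for JSON.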

At the end of the day, none of this is too surprising. JSON is less verbose in markup than ATOM, so it is logically smaller on the wire and therefore completes sooner (although I wouldn’t have imagined a factor of 9 difference). What is interesting is that the transfer of data between the data layer and the web role is almost trivially fast (remember that 9 MB of XML moved between the layers, was reformatted as JSON, and was sent back down to the client in 14 seconds). It further makes you wonder what the performance improvement could be if Azure storage exposed a native JSON interface…

Published Thursday, August 20, 2009 2:27 PM by rgillen

Comments

# re: AtomPub, JSON, Azure and Large Datasets, Part 2

Saturday, August 22, 2009 5:36 AM by Scott Prugh

Thanks for this info. These results seem to make sense. We don't have large-scale testing experience with Azure, but we do with our own large-scale web service platform. Our services support multiple payload exposure transports: http, http+gzip, netTcp, netTcp+gzip, and we are playing with protobuf and protobuf+gzip. The difference in transfer rates between http and http+gzip is about 8:1, and between http and netTcp+gzip it is 13:1. We see the benefit not only on the wire but also in client and server throughput. I believe the latter benefit is due to the fact that the network stack has much less work to do to put the data on the wire. I am hoping that Azure storage adds native support for gzip and also for binary protocols, as the performance difference is significant. Ideally, MSFT would add support for something like Protobuf, which is very efficient and portable. Please email me if you would like to discuss more (scott prugh at csgsystems dotcom).

