October 2008 - Posts

MS BI Conference: Monday Keynote
06 October 08 08:31 PM | MikeD

Here are my notes on the Monday morning keynote:

  • About 3000 attendees at the conference, over 60 countries represented.
  • There is BI in Halo3: whenever you look at competitor stats or weapon effectiveness, this is implemented using BI tech

Madison - MS has acquired DATAllegro, a company that was accomplishing low TCO MPP (massively parallel processing) scale out of BI. Using standard enterprise servers, you can process queries on very large data warehouse databases very quickly. They demonstrated a hardware setup of a MPP cluster: one control node, 24 compute nodes, and at least as many storage nodes (ie. shared disks). They loaded 1 trillion (yes trillion) rows in the fact table, and a bunch of dimension tables, such that the data warehouse contained over 150 TeraBytes of data. Then they sliced the fact table up onto the 24 SQL instances on the compute nodes (each compute node then had 1/24 of the trillion rows) and replicated the dimension tables to all compute nodes. Using SQL 2008 (and its new star join optimization) they then issued a query on the fact table and the related dimension tables to the cluster, where the control node passed the query along to the compute nodes, they each processed it, and returned the results back to the client.

On one screen they had Reporting Services (the client app) and on another, a graphic display of the CPU and disk stats for the control node and all 24 compute nodes, each node having 8 CPUs. When the Report was being displayed, the query got processed, and you could see the CPU usage go up on many of the nodes, then disk usage on each of the nodes, then the activity would subside and the reporting view would then display the results. It was all done in under 10 seconds. It was truly impressive. Now, that was with essentially read only data, so you could probably "roll your own" MPP system, given the time and hardware. It's not a huge technical problem to scale out read-only data. If they could show the same demonstration except with a SSIS package *loading* a trillion rows into the cluster, that would have been astounding - it's a much different and more difficult problem. Still, I was impressed.

Gemini - this is "BI Self Service" - the first evidence of this is an Excel addin that the always-entertaining Donald Farmer demonstrated. He used the addin to connect to a data warehouse and in a spreadsheet showed 20 million rows. We didn't *see* all 20 million rows, but he did sort it in under a second, and then filtered it (to UK sales only, about 1.5 million rows) in under a second. That performance and capacity was on what he said was a <$1000 computer with 8 GB RAM, similar to what he purchased for home a few weeks ago.

Aside from the jaw-dropping performance, he used the addin to dynamically link the data from analysis services with another spreadsheet of user-supplied data (I think it was "industry standard salary" or something). The add-in was able to build a star-schema in the background automatically and then make it available in the views they wanted in Excel  ( a graph or something? I can't remember). So it was showing the fact that sometimes the data warehouse doesn't have all the data needed for users to make decisions, so they got the data themselves, rather than wait for IT to get it in the DW. Ok, cool. So then he published that view into Sharepoint using Excel services, and the user-supplied data went along with it. So centrally publishing that view means it can be utilized by others in the enterprise, rather than sharing via email or a file share or something.

From the IT perspective, he showed a management view (dashboard) in SharePoint showing usage stats of "Sandboxes" (the thing they are currently calling these publications) and they could see how popular this particular sandbox was, and then take steps to formalize it into the enterprise. The tantalizing link on that web page was "Convert to Performance Point" - the idea was that you could take the sandbox view and convert it into a PPS web part. That looked cool too.

So Gemini looks very interesting.

Timeframes: the next major release of SQL will be 24-36 months from release of SQL 2008, but in the meantime, there are a number of releases coming: Madison and Gemini will be coming in the first half of 2010, and CTP's will be available sometime early next year. There are some incremental releases of Analysis Services, Integration Services and Reporting Services coming - the next gen of Reporting Services in particular will become available in a Feature Pack "real soon now".

 

Filed under: ,
MS BI Conference 2008 - First Impressions
06 October 08 04:24 PM | MikeD

It has been over a year since I last blogged, but I want to restart with some posts about the BI Conference I am attending this week.

Chris and I flew to Vancouver yesterday and drove down to Seattle in a Camry Hybrid. Sitting in the lineup at the US border for an hour drained the batteries on the Camry so it had to restart the engine to recharge a couple of times, for about 10 minutes each time. Seemed odd to discharge so much battery just sitting in a lineup and moving 10 feet every five minutes. Anyway...the trip display shows that our fuel efficiency was under 8 liters/100km on the trip down. That also seems a little poor compared to my Golf TDI that gets 4.5 liters/100 km regularly.

We registered last night and wandered the Company Store for a bit - saw uber-geek stuff there and we thought of getting something for Cam, our uber-geek on the team at Imaginet. The conference package was predictable: a nice back-pack, a water bottle, a pen, a 2 GB USB stick, a SQL Server magazine, not as many sales brochures as last year, and a conference guidebook.

Last year's guide book was a small coil bound notebook with a section of blank pages at the end for taking notes. This year's edition has the same content - a description of all the sessions and keynotes and speakers, as well as sponsor ads, but it is missing the note-taking section. I really liked that section last year, so today I found myself scribbling notes on loose paper, and running out. I specifically left my (paper) notebook at home because I liked the smaller conference book instead, but now I am going back to the Company Store to buy a small notebook for the rest of the sessions.

The conference is trying to be more environmentally friendly - in the backpack was a water bottle and they encouraged you to refill that at the water stations rather than having bottled water. That's cool. For me, I would have preferred a coffee mug, since I had three cups of coffee over the day (in paper cups, and no plastic lids). In a strange twist, the breakfast and lunch dishes were on paper plates and not the real dishes like last year - one step forward, two steps back I guess. I can't figure it out.

In the main hall before the keynote address, there was an live band playing 80's hits. They were pretty good, but it seemed odd to have a bouncy energetic group on stage at 8:30 on a Monday morning, everyone was filing in and sitting down, morning coffee still just starting to kick in. The bass player was one or two steps beyond bouncy-happy. It reminded me of someone on a Japanese game show.

 

More Posts