July 2007 - Posts - Jon Galloway

July 2007 - Posts

The real reason SELECT * queries are bad: index coverage

Are SELECT * queries bad? Sure, everyone know that. But, why?

It's returning too much data, right?

That's the common answer, but I don't think it's the right one. If you're working with a reasonably normalized database, the actual network traffic difference is pretty small.

Let's take a look at a sample. The following two queries select 326 rows from the TransactionHistoryArchive table in the AdventureWorks database (which has a total of 89K rows). The first uses a SELECT * query, the second selects a specific column:

 

SELECT * FROM Production.TransactionHistoryArchive WHERE ReferenceOrderID < 100 SELECT ReferenceOrderLineID FROM Production.TransactionHistoryArchive WHERE ReferenceOrderID < 100

 

In this case, the difference in network traffic is only 15K, roughly a 10% difference (180K vs. 165K). It's worth fixing, but not a huge difference.

SELECT * makes the Table / Index Scan Monster come

Often, the bigger problem with SELECT * is the effect it will have on the execution plan. While SQL Server primarily uses indexes to look up your data, if the index contains all the columns you’re requesting it doesn’t even need to look in the table. That concept is known as index coverage. In the above example, the first query results in a Clustered Index Scan, whereas the second query uses a much more efficient Index Seek. In this case, the Index seek is one hundred times more efficient than the Clustered Index Scan.

SelectStarQueryPlan

Unless you've indexed every single column in a table (which is almost never a good idea), a SELECT * query can't take advantage of index coverage, and you're likely to get (extremely inefficient) scan operations.

If you just query the rows you'll actually be using, it's more likely they'll be covered by indexes. And I think that's the biggest performance advantage of ignoring SELECT * queries.

The Stability Aspect

SELECT * queries are also bad from an application maintenance point of view as well, since it introduces another outside variable to your code. If a column is added to a table, the results returned to your application will change in structure. Well programmed applications should be referring to columns by name and shouldn't be affected, but well programmed applications should also minimize the ways in which they are vulnerable to external changes.

Shameless Plug: I go into this (and a lot other important performance tips) in more detail in a soon-to-be-released book for SitePoint.

Posted by Jon Galloway | 9 comment(s)
Filed under: ,

[T-SQL] Getting distinct, current entries from tables with timestamp / datetime fields

It's relatively easy to store data with a time dimension, but querying it is another matter. If you select from a temporal tables (one which includes historical information indicated by a timestamp or datetime column) based on your ID, you'll get a lot of duplicate records; only one or a few of those records will be applicable to a given time or timespan.

For example, the following query (from the AdventureWorks sample database) returns 746 records:

 

SELECT ActualCost FROM Production.TransactionHistory WHERE ProductID = 784

 

A table with History in the name is expected to work that way, but I've run into plenty of tables which allow for duplicates of every column but the primary key, differentiated by a datetime column. I'm starting to use these more now, partly due to project requirements, and partly because SubSonic has built-in support for CreatedOn and ModifiedOn columns.

In the past, I've used stacked views or nested subqueries to handle this madness. It's a pain in the neck, and often leads to frustrations with the GROUP BY clause since queries with an aggregate term can't include a column in a resultset that's not in the GROUP BY clause (but adding those columns to the GROUP BY clause isn't the right solution, as it alters the rows you're returning).

The general idea is to write a query which gets the latest update date for a unique ID combination, then join against it. It works, but it's a pain in the neck, and it's error prone. Fortunately, SQL Server 2005's RANK / PARTITION features make this a lot easier.

Let's talk specifics - a query against the AdventureWorks Production.TransactionHistory table which returns the latest record Product Name and Number by Transaction Date. There are 113K rows in that table, but if we only want the latest entry for each Product we're down to 490 rows.

Unlike the traditional GROUP BY based clause, the RANK / POSITION based queries are a bit more intuitive once you've got the hang of them. The important thing is to use PARTITION in the same way you'd think of a GROUP BY - the column(s) you want unique should show up in the PARTITION clause.

One limitation on RANK queries is that you can't use the RANK value directly in a where clause. That's easy to work around by placing the RANK value in a subquery and the WHERE clause filtering on the RANK in the outer query.

UPDATE: Based on comments, my original example was a little too simple. I've added a few joins and return columns; the point is that we can continue to add other columns to the result set without having to worry about how the grouping is handled. Yes, you can do this with subqueries, but in many cases those will continue to grow more complicated as you add tables and columns; this syntax doesn't.

SELECT Product.Name, Product.ProductNumber, TransactionDate, TransactionID, ProductCategory.Name, ProductSubcategory.Name, ProductSubcategory.ModifiedDate FROM ( SELECT DISTINCT Production.TransactionHistory.ProductID, Production.TransactionHistory.TransactionDate, Production.TransactionHistory.TransactionID, RANK() OVER ( PARTITION BY Production.TransactionHistory.ProductID ORDER BY Production.TransactionHistory.TransactionDate DESC) AS DateOrder FROM Production.TransactionHistory ) CurrentTransactionHistory INNER JOIN Production.Product ON Production.Product.ProductID = CurrentTransactionHistory.ProductID INNER JOIN Production.ProductSubcategory ON Production.Product.ProductSubcategoryID = Production.ProductSubcategory.ProductSubcategoryID INNER JOIN Production.ProductCategory ON Production.ProductSubcategory.ProductCategoryID = Production.ProductCategory.ProductCategoryID WHERE DateOrder = 1 ORDER BY ProductSubcategory.ModifiedDate
Posted by Jon Galloway | with no comments
Filed under: ,

Presentation tricks - Command Windows

Time management is a big part of a technical presentation. You want your demos to go fast enough to keep your audience's attention, but you don't want to gloss over details and lose them. Here are a few tricks I've used when showing something command line operations at a DOS prompt.

1. Avoid command windows in presentations if possible

It's hard to effectively communicate by showing command-line operations. Console windows are hard to read, and it's unlikely that your audience will remember the command line switches or parameters you used. Worse, navigating the Windows folder hierarchy and making a few typing errors is a good way to lose hard won attention. Consider if it's really necessary in your specific case.

Can you explain what you'll be doing, then just execute a batch file which contains the command line operations? I know I'd rather see a PowerPoint slide with a few bullet points explaining important commands or switches than watch someone type tiny words I can't read.

2. Make sure your console window is readable

The best method (as I've said before, mostly plagiarizing Scott Hanselman) is to set up a new Windows account with big fonts. In addition to setting up Visual Studio, take a minute to customize your console window settings, making sure to select "Save properties for future windows with same title." While you're at it, why not check the command prompt settings on your working account as well?

3. Have your folder paths preset

There's no value to your audience in watching you CD to the right directory, even if you know about auto-complete directory names. Be prepared and have shortcuts set up.

There are two ways to set that up. One solution is to set up a shortcut to %COMSPEC% (the DOS variable that always points to your command executable), then set the "Start in" directory to your target directory.

Note that this console window will be displayed with the shortcut's title, so you'll probably want to set your font settings for this shortcut as well. In my testing, console window shortcuts still got the default settings, but it doesn't hurt to set them for the shortcut and be sure.

Console-Shortcut

The other way to launch a command prompt in a directory is to create a simple batch file which calls %COMSPEC% with the /K switch, which tells the prompt window to stay open after the batch file has executed. Here's what I used to launch the prompt in my SubCommander bin directory (calling CLS to clear the screen when complete):

%COMSPEC% /k "cd c:\projects\jon\subsonic\subcommander\bin\debug\" CLS

Which to use? It really doesn't matter, although the batch file approach gives you the option of performing other setup tasks, like setting environment variables.

Once you've got any console window shortcuts set up, drop them on your desktop along with shortcuts to anything else you'll want during your demo - shortcuts to solution files, slides, important directories, etc. After the demo, move these shortcuts to another folder to keep the desktop clean.

Posted by Jon Galloway | 1 comment(s)
Filed under:

SubSonic - Code Camp slides posted

Slides and sample code from my talk on SubSonic from the SoCal Code Camp a week ago are available here. The slides were pretty light since my talk was geared towards building some very simple sample pages on the fly.

The one shortcut I took was to copy web.config and DLL's from the SubSonic Central site included with the SubSonic download - I think that was a good approach, since I didn't have to bore everyone with manually typing configuration files, but I didn't have to use the demo hammerspace technique of copying a bunch of code out of some text document I'd set up beforehand. Sure, it's a cheat, but it's a cheat you can use when you get home from the talk, too.

Here's the text from the slides:


SubSonic

Using SubSonic to build ASP.NET Applications that are good, fast, and cheap
Jon Galloway
SoCal Code Camp July 2007


Overview

* What is SubSonic

* Simple – WebSite project with build provider

* More advanced – Using SubCommander to generate code

* Simple again – Using a the SubSonic Starter Site


What is SubSonic

* Lightweight data access layer (DAL) built on ActiveRecord pattern

– SQL Server 2000
– SQL Server 2005
– MySQL
– Oracle
– EntLib

* Controls built on top of SubSonic DAL

* “Sugar” – Utilities


Overview

* History

* SubSonic 1.0 – Zero Code DAL, scaffold

* SubSonic 1.0+ – SubCommander, more controls

* Subsonic 2.0 (current is 2.0.2) – Starter site with CMS and Membership, AutoScaffold, more controls


Our Pal, Northwind

(screenshot of Northwind tables I’ll be working with)


Build Provider DAL

* Overview of the concept

* Demo

(Remember to cheat by copying config and DLL’s in from SubSonic Central)
(Remember to copy connectionstring.config)


SubCommander

* Overview

* Demo

– Command Line, Toolbar
– Look at generated code
(Yes, there’s dynamic SQL in there. Here’s why it’s okay…)
– Modify database

(again, remember to copy connectionstring.config)

Sample Database

(screenshot of sample database for demo)


Starter Site

* First show in SubSonic solution

* Look at CMS

* Build some pages

– Query
– Controls
– LoadFromPost (time permitting)

Starter Site Demo Cheat Sheet

* Add connection string

<add name="CodeCamp" connectionString="Data Source=.\SQL2005;Integrated Security=True;Database=CodeCampSite"/>

* Add provider line

<add name="CodeCamp" type="SubSonic.SqlDataProvider, SubSonic"
connectionStringName="CodeCamp"
generatedNamespace="CodeCamp"
generateLazyLoad="true"
generateRelatedTablesAsProperties="true"
/>

Starter Site Cheat Sheet

* Set up AutoScaffold page

* Clear content from CMS

protected void ClearCMS_Click(object sender, EventArgs e) { CMS.PageCollection pc = new CMS.PageCollection(); pc.Load(); foreach (CMS.Page p in pc) CMS.Page.Destroy(p.PageID); }

References

http://blog.wekeroad.com

http://subsonicproject.com

http://weblogs.asp.net/jgalloway

Posted by Jon Galloway | with no comments
Filed under:

Checkbox Grids in ASP.NET

A simple checkbox grid is often the best user interface for mapping multiple selections in multiple categories:

Food Cabernet Zinfandel Pinot
Salmon
Steak
Chicken
Chocolate

You'd think this would be easy to do with a GridView and some CheckBoxField columns, but the GridView control's underlying assumption is that you'll only be editing one row at a time. Because of that, all the checkboxes are disabled by default; only the row which has been set to "Edit" mode can be checked and unchecked.

Aside - CheckBoxField only binds to bit fields; it doesn't bind to string values which are convertible to booleans. If you're binding to a field which isn't a bit, you'll need to convert to a bit. If you have a non-bit numeric field, you can likely use (CONVERT(bit,0) as Selected).

So we can't a CheckBoxField, what do we do?

I think the best solution is to ditch the CheckBoxField control for a simple CheckBox control, either in a GridView TemplateField or a Repeater with a Checkbox control in the ItemTemplate. The CheckBox has an OnCheckedChanged event which fires for each bound control which changes status (from checked to unchecked, or from unchecked to checked).

My rule of thumb with GridView vs. Repeater: use the GridView if the GridView will do everything I need to do without tweaking it.  Once I start having to have to start hacking the GridView, I switch to the Repeater. Usually once a grid needs something a little out of the ordinary, there's a good chance I'll need to make further changes to it, and customized GridViews tend to get complicated quickly. So, here's the sample code I'd use for the above grid:

<table>
<asp:Repeater ID="FoodPairings" runat="server" 
		OnItemDataBound="FoodPairings_ItemDataBound" >
  <HeaderTemplate>
    <thead>
      <tr>
        <th>Food</th>
        <th>Cabernet</th>
        <th>Zinfandel</th>
        <th>Pinot</th>
      </tr>
    </thead>
  </HeaderTemplate>
  <ItemTemplate>
    <tr>
      <td>
        <asp:Label ID="Food" runat="server" Text='<%# Eval("Food") %>' />
      </td>
      <td>
        <asp:CheckBox ID="Cabernet" OnCheckedChanged="OnCheckChangedEvent" 
			runat="server" Checked='<%# Eval("Cabernet") %>' />
      </td>
      <td>
        <asp:CheckBox ID="Zinfandel" OnCheckedChanged="OnCheckChangedEvent" 
			runat="server" Checked='<%# Eval("Zinfandel") %>' />
      </td>
      <td>
        <asp:CheckBox ID="Pinot" OnCheckedChanged="OnCheckChangedEvent" 
			runat="server" Checked='<%# Eval("Pinot") %>' />
      </td>
    </tr>
  </ItemTemplate>
</asp:Repeater>
</table>

But how do I determine the ID for the selected checkbox?

The CheckBox OnCheckedChanged event doesn't work too well when it's databound, because the OnCheckedChanged event EventArgs doesn't have an ID which corresponds to the row and column. Handling the save requires you to know both the row and column; it's easy to determine the column by the name of CheckBox control, but there's no simple way to determine the row. The OnCheckedChanged event EventArgs Sender argument passes in the Checkbox control, but the Checkbox control doesn't give you any properties which let you stash an ID field.

If we were only working with a single Checkbox per row it wouldn't be a problem, but you've got a grid of checkboxes, there's no built-in way to get the id of the selected checkbox. CheckBox doesn't have anywhere to stash a value. I thought about some goofy hack in which I hid the ID in the Text or Tooltip, but both of those values are displayed by default.

I think a better solution, though, is to add an attribute to the checkbox control. You can add an attribute to any control using the ControlAttributes.Add(name,value) syntax, and that's a handy way to associate these values.

public void OnCheckChangedEvent(object sender, EventArgs e) { CheckBox c = (CheckBox)sender; string wineID = ((Control)c).ID; int FoodID = int.Parse(c.Attributes["FoodID"]); if (c.Checked) { //Add pairing } else { //Remove pairing } } protected void FoodPairings_ItemDataBound( object sender, RepeaterItemEventArgs e) { if (e.Item.ItemType == ListItemType.Item || e.Item.ItemType == ListItemType.AlternatingItem) { DbDataRecord row = e.Item.DataItem as DbDataRecord; CheckBox checkbox; checkbox = e.Item.FindControl("Cabernet") as CheckBox; checkbox.Attributes.Add("FoodID", row["FoodID"].ToString()); checkbox = e.Item.FindControl("Zinfandel") as CheckBox; checkbox.Attributes.Add("FoodID", row["FoodID"].ToString()); checkbox = e.Item.FindControl("Pinot") as CheckBox; checkbox.Attributes.Add("FoodID", row["FoodID"].ToString()); } }

Is this the best solution?

I'm not sure. There are a few different ways to go about this. My approach worked for me, but I'm not sure it's the best. Here are pro's and con's to the approach I used:

Pro:

  • Reasonably simple
  • Fits with the general control approach you'd expect
  • Handling the OnCheckedChanged event keeps your save logic really simple, because you don't need to implement any special logic to check if the checkbox state has changed.

Cons:
The generated HTML is a little wacky. The checkbox is wrapped in a <span>:

<span FoodID="2"><input id="..." type="checkbox" name="..." checked="checked" /></span>

Here are some other ideas I considered:

  1. Use an AJAX save method which is called for each checkbox click.
  2. Hide the row id in a hidden label and get to it via the CheckBox's NamingContainer.

What do you think?

Posted by Jon Galloway | 15 comment(s)
Filed under:

Silverlight 1.1 (Alpha) cross domain webservice access makes mashups tricky

Any web mashups, by definition, require cross-domain calls. Those cross-domain calls may happen on the client (in the browser) or on the server. Regardless of the client technology (AJAX, Flash, Silverlight, etc.), cross domain calls on the client are always more complex that server-side cross-domain calls, and for good reason. It's tricky in AJAX, and it's downright difficult in Silverlight. You'll know that Silverlight development has become more widespread when you hear a lot more complaints about this problem.

I previously wrote about using a static port to eliminate this problem when you're calling back to your own server but Silverlight is detecting a cross-domain call. That's caused by the ASP.NET Development Server running the different projects on different ports (e.g. website on localhost:1234 and webservice on localhost:5678), and you can work around it by just putting the website and webservice in one project with a static port. However, there are plenty of times when you'll want to make a cross-domain call, so we've got to get this figured out.

While helping Rob Conery work through some problems connecting to Amazon Web Services this past week, I wrote up some notes on the issue.

Silverlight intentionally blocks cross domain access

Silverlight intentionally forbids cross domain access, so unless webservices.amazon.com served up the page with the Silverlight control, it’s not going to allow access. That’s a pain from a developer point of view, but from a security point of view it makes complete sense. Otherwise a the Silverlight control could be used to turn a user’s browser into a “zombie” which could roam the entire Internet and local intranet without the end user’s knowledge, doing all kinds of bad things (posting spam on forums, DDOS attacks, modify the local router via web access, etc.).

It's important to understand that this is an even worse case than a simple cross-site scripting (XSS) attack, which steals information from one site to pass to another. In addition to information security issues, a cross-domain vulnerability is like an unpatched virtual machine running on a patched host, since malicious net access code would be more like an out of control virus with full Internet access.

A simple proxy through XmlHttp isn't likely to work

XmlHttp in both Firefox and IE7 (and XmlHttpRequest in IE6) are pretty locked down by default, so it’s hard to come up with a cross browser solution that will work on a default browser configuration. For example, Firefox only allows it if the script is signed, and IE7 applies security zone policies. It's hard to find a simple cross browser solution here.

Server-side proxies are a simple solution

You can use a server side proxy, of course, which gets around all these restrictions because Silverlight is calling back to the originating domain.

The JSONP workaround

You can use JSONP to do a client side proxied call which goes through JavaScript but doesn’t use the XmlHttp stack. The basic concept is that the browser security model allows you to reference a JavaScript source from any location, so JSONP dynamically adds a script element with a src set to the cross domain URL you want to access. The information returned by that URL isn’t JavaScript, but it doesn’t matter; the JSONP code which created the script element just reads the contents and destroys the script element when it’s done. That’s easier for web services which support JSONP natively – del.icio.us, Yahoo – but it can be done against AWS via some libraries.

What I think Microsoft should do

The Silverlight team should do one or more of the following:

  1. Document this more, preferably in the Silverlight documentation on MSDN
  2. Add a client configurable whitelist of domains to which cross-domain calls are allowed
  3. Have a list of "certified" safe domains to which cross-domain calls are allowed (Microsoft, Google, Amazon, Flickr, del.icio.us, etc.).
  4. Offer a proxy service, perhaps reimbursed by the target domains

UPDATE: I heard there will likely be a solution of some kind to this problem in a forthcoming Silverlight release.

References:

A pretty good article on JSONP with a sample which hits AWS: http://www.devx.com/webdev/Article/30860/1954

A sample which hits AWS via JSONP (bonus points for using XSLT to convert AWS XML responds to JSON, which isn’t as useful for you since in Silverlight XML’s easier to read than JSON): http://www.kokogiak.com/gedankengang/2006/05/consuming-amazons-web-api-directly.html

My previous blog entries on the topic - http://weblogs.asp.net/jgalloway/archive/2007/06/14/calling-an-asmx-webservice-from-silverlight-use-a-static-port.aspx, http://weblogs.asp.net/jgalloway/archive/2007/05/08/silverlight-cross-domain-access-blocked-use-a-server-side-proxy-or-xmlhttprequest.aspx

Posted by Jon Galloway | 3 comment(s)
Filed under: , ,

Some keyboard input tricks for Silverlight 1.1 (Alpha)

Here are a few tricks I learned while doing my "hello world" maze game in Silverlight 1.1.

Silverlight doesn't fire the KeyDown event for cursor (arrow) keys

However, just about all keys will fire a KeyUp event. In my case, I was able to just handle the KeyUp event, but remember that holding a key down will trigger multiple KeyDown events and only one KeyUp event.

Notice how a lot of the games you're seeing for Silverlight use W/A/S/D instead of arrow keys? I suspect that's the reason.

The difference between Key and PlatformKey

The KeyDown and KeyUp event parameters include KeyboardEventArgs, which has properties for both PlatformKeyCode and Key. PlatformKeyCode is a platform specific key code, so by definition it can (and will) change between Windows and Mac (and potentially between versions of these operating systems). It's preferable to use Key unless you need to be you need to detect keys which aren't available on all platforms, such as the Windows start key.

Key is a platform agnostic key code and is better to use when possible. Of course, there isn't a Key value for the Windows key, so if you're using keys which will only be defined on a specific operating system you're stuck with the PlatformKeyCode value. If you do need access to PlatformKey values, see Dave Relyea's comment on this thread for info on how he determined platform key values.

The missing Key Enumeration

KeyboardEventArgs.Key value is supposed to be an Enum per the MSDN documentation, but at least in this release it's just an int. I grabbed the keys and values from the documentation and created a simple enum which should hopefully be forward compatible when that enum is (hopefully) added to agclr.dll.

enum Keys { KEYNONE = 0, BACKSPACE = 1, TAB = 2, ENTER = 3, SHIFT = 4, CTRL = 5, ALT = 6, CAPSLOCK = 7, ESCAPE = 8, SPACE = 9, PAGEUP = 10, PAGEDOWN = 11, END = 12, HOME = 13, LEFT = 14, UP = 15, RIGHT = 16, DOWN = 17, INSERT = 18, DELETE = 19, DIGIT0 = 20, DIGIT1 = 21, DIGIT2 = 22, DIGIT3 = 23, DIGIT4 = 24, DIGIT5 = 25, DIGIT6 = 26, DIGIT7 = 27, DIGIT8 = 28, DIGIT9 = 29, A = 30, B = 31, C = 32, D = 33, E = 34, F = 35, G = 36, H = 37, I = 38, J = 39, K = 40, L = 41, M = 42, N = 43, O = 44, P = 45, Q = 46, R = 47, S = 48, T = 49, U = 50, V = 51, W = 52, X = 53, Y = 54, Z = 55, F1 = 56, F2 = 57, F3 = 58, F4 = 59, F5 = 60, F6 = 61, F7 = 62, F8 = 63, F9 = 64, F10 = 65, F11 = 66, F12 = 67, NUMPAD0 = 68, NUMPAD1 = 69, NUMPAD2 = 70, NUMPAD3 = 71, NUMPAD4 = 72, NUMPAD5 = 73, NUMPAD6 = 74, NUMPAD7 = 75, NUMPAD8 = 76, NUMPAD9 = 77, MULTIPLY = 78, ADD = 79, SUBTRACT = 80, DECIMAL = 81, DIVIDE = 82, KEYUNKNOWN = 255 }

The result: a generic Keyboard Handler Event

First, I add the event handler in the Page_Load event:

public void Page_Loaded(object o, EventArgs e) { InitializeComponent(); this.KeyUp += new System.Windows.Input.KeyboardEventHandler(keyHandler); //Other stuff... }

Next, my keyHandler method has a simple switch block which allows for W/A/S/D, numeric keypad, or cursor keys. I also handle the F key, which toggles to full screen mode:

protected void keyHandler(object sender, KeyboardEventArgs args) { Keys key = (Keys)args.Key; switch (key) { case Keys.A: case Keys.NUMPAD4: case Keys.LEFT: handleLeft(); break; case Keys.W: case Keys.NUMPAD8: case Keys.UP: handleUp(); break; case Keys.D: case Keys.NUMPAD6: case Keys.RIGHT: handleRight(); break; case Keys.S: case Keys.NUMPAD2: case Keys.DOWN: handleDown(); break; case Keys.F: BrowserHost.IsFullScreen = !BrowserHost.IsFullScreen; break; default: return; } }
Posted by Jon Galloway | 7 comment(s)
Filed under: , ,
More Posts