Conquering Deep Zoom (Part 1), making tiles, and other Silverlight 2 thoughts

Wow, I can't believe three weeks have passed since my last post. This is clearly the problem with a day job that doesn't align with your interests!

Since that last post, Silverlight 2 has been released into the world, and it is good. I've been spending quality time with Deep Zoom, which is a strange name since it's more of a concept than a technology per se. MultiScaleImage is really the control we're talking about. The Deep Zoom Composer app does a nice job of letting you put together these sweet collections of images at crazy resolutions and various zoom levels, and it's all very cool. Once you get past the Hard Rock demo, you start to think a little harder about the technology and how you might use it.

I'm still not entirely convinced that a bazillion thumbnails are the best way to navigate image collections. What I do think is useful is taking those 20+ megapixel images we can shoot with quasi-affordable cameras and making them available at full resolution. Reducing them to 2.5% of their original resolution seems to me like throwing away many stories captured in the moment. So my goal was to conquer Deep Zoom and repurpose it for single-image viewing in an image library.

The requirements break down like this: create a code library that handles database storage of the tiled images (because hundreds of photos times hundreds of tiles makes using the file system a nightmare), cuts up the images you feed it, and offers an HttpHandler to serve the tiles to the MultiScaleImage control. The other side of it is the Silverlight app itself, which is relatively straightforward and uses the Deep Zoom Composer templates, plus a class to correctly handle the calls to source image tiles.

There's a lot to cover, so I won't try to do it all here. I want to start with a class that I wrote to take an image and make tiles. It uses WPF classes to accomplish this instead of System.Drawing. I don't have a good reason for this other than I wanted to see what was available.

public class TileMaker
{
    public TileMaker(BitmapImage bitmapImage, int tileSize, int quality)
    {
        _original = bitmapImage;
        _tileSize = tileSize;
        _quality = quality;
        Tiles = new List<TileContainer>();
        EnableTileContainerCollection = false;
        MaxZoom = CalculateMaxZoom();
    }

    private readonly BitmapImage _original;
    private readonly int _tileSize;
    private readonly int _quality;

    public List<TileContainer> Tiles { get; private set; }
    public int MaxZoom { get; private set; }
    public bool EnableTileContainerCollection { get; set; }
    public event TileCreatedEventHandler TileCreated;

    public void CreateTiles()
    {
        for (int z = MaxZoom; z >= 0; z--)
        {
            BitmapImage image = _original;
            if (z != MaxZoom)
            {
                int? newWidth = null;
                int? newHeight = null;
                int divFactor = MaxZoom - z;
                int divisor = (int)Math.Pow(2, divFactor);
                if (_original.PixelHeight > _original.PixelWidth)
                    newHeight = (int)Math.Ceiling((double)_original.PixelHeight / divisor);
                else
                    newWidth = (int)Math.Ceiling((double)_original.PixelWidth / divisor);
                image = _original.ToBytes().Resize(newWidth, newHeight);
            }
            int tilesHigh = (int)Math.Ceiling((double)image.PixelHeight / _tileSize);
            int tilesWide = (int)Math.Ceiling((double)image.PixelWidth / _tileSize);
            for (int x = 0; x < tilesWide; x++)
            {
                for (int y = 0; y < tilesHigh; y++)
                {
                    var tile = new TileContainer
                    {
                        TilePositionX = x,
                        TilePositionY = y,
                        Level = z
                    };
                    int xoffset = x * _tileSize;
                    int yoffset = y * _tileSize;
                    int xsize = _tileSize;
                    int ysize = _tileSize;
                    if (xoffset + _tileSize > image.PixelWidth)
                        xsize = image.PixelWidth - xoffset;
                    if (xsize < 1)
                        xsize = 1;
                    if (yoffset + _tileSize > image.PixelHeight)
                        ysize = image.PixelHeight - yoffset;
                    if (ysize < 1)
                        ysize = 1;
                    tile.ImageData = image.Crop(xoffset, yoffset, xsize, ysize).ToJpegBytes(_quality);
                    if (EnableTileContainerCollection)
                        Tiles.Add(tile);
                    OnTileCreated(new TileCreatedEventArgs(tile));
                }
            }
        }
    }

    private int CalculateMaxZoom()
    {
        return (int)Math.Ceiling(Math.Max(Math.Log(_original.PixelWidth, 2), Math.Log(_original.PixelHeight, 2)));
    }

    protected virtual void OnTileCreated(TileCreatedEventArgs e)
    {
        if (TileCreated != null)
            TileCreated(this, e);
    }
}

public delegate void TileCreatedEventHandler(object sender, TileCreatedEventArgs e);

public class TileCreatedEventArgs : EventArgs
{
    public TileCreatedEventArgs(TileContainer tileContainer)
    {
        TileContainer = tileContainer;
    }

    public TileContainer TileContainer { get; private set; }
}

This class doesn't save the tiles anywhere; it only generates them. The most important math happens first: it calculates the maximum zoom level of the original image, meaning the smallest exponent n for which 2 ^ n is at least as large as the longer side of the image at its native resolution. It uses the Log function to find this. For example, 2 to the power of 10 is 1024, enough to cover a 1000 pixel wide image, so the maximum zoom level is 10. If the image is 1025 wide, then the next one up is 2 ^ 11, for 2048, and with 256 pixel tiles the last column of tiles will be only one pixel wide. Does that make sense?
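
As a rough sketch of that calculation on its own, here is the same idea done with integer math instead of Log. The ZoomMath class and its MaxZoom method are names made up for illustration and are not part of TileMaker. Using a loop is just one way to avoid thinking about floating point rounding; the Log version above gives the same answer for the sizes discussed here.

```csharp
using System;

// Hypothetical helper, not part of the TileMaker class above.
public static class ZoomMath
{
    // Returns the smallest n such that 2^n is at least as large as the
    // longer side of the image, in pixels.
    public static int MaxZoom(int pixelWidth, int pixelHeight)
    {
        int longest = Math.Max(pixelWidth, pixelHeight);
        int level = 0;
        // 1L keeps the shift in 64-bit space so it cannot overflow.
        while ((1L << level) < longest)
            level++;
        return level;
    }
}
```

Calling ZoomMath.MaxZoom(1000, 750) gives 10, and ZoomMath.MaxZoom(1025, 100) gives 11, matching the examples above.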

Each zoom level down, you halve the image's width and height, until you get to level 0, which will always be a single 1x1 tile. The loops, which could probably use some refactoring, cut up the resized image. There are two extension methods in here as well, to crop and make bytes for a JPEG:

public static BitmapSource Crop(this BitmapSource source, int x, int y, int width, int height)
{
    CroppedBitmap crop = new CroppedBitmap(source, new System.Windows.Int32Rect(x, y, width, height));
    return crop;
}

public static byte[] ToJpegBytes(this BitmapSource image, int qualityLevel)
{
    if (image == null)
        throw new ArgumentNullException("image");
    // QualityLevel accepts values from 1 (lowest) to 100 (highest).
    var encoder = new JpegBitmapEncoder { QualityLevel = qualityLevel };
    encoder.Frames.Add(BitmapFrame.Create(image));
    using (var stream = new MemoryStream())
    {
        encoder.Save(stream);
        // ToArray copies the whole buffer regardless of the stream position.
        return stream.ToArray();
    }
}
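
Going back to the halving for a moment, here is a small sketch of how big each level ends up and how many tiles it needs. PyramidMath, LevelSize and TilesAcross are hypothetical names for illustration only. Note one assumption: this computes each axis independently, while TileMaker only computes the longer side and leaves the other to the Resize helper, so the shorter side could differ by a pixel.

```csharp
using System;

// Hypothetical helper, not part of the TileMaker class above.
public static class PyramidMath
{
    // Length of one axis at the given level: the full size divided by
    // 2^(maxZoom - level), rounded up, and never smaller than 1 pixel.
    public static int LevelSize(int fullSize, int maxZoom, int level)
    {
        long divisor = 1L << (maxZoom - level);
        return (int)Math.Max(1, (fullSize + divisor - 1) / divisor);
    }

    // Number of tiles needed to cover one axis, rounding up so a
    // partial tile at the edge still counts.
    public static int TilesAcross(int levelSize, int tileSize)
    {
        return (levelSize + tileSize - 1) / tileSize;
    }
}
```

For a 1000x750 image with a maximum zoom of 10 and 256 pixel tiles, level 10 is 1000x750 (4 tiles wide, 3 high), level 9 is 500x375 (2 by 2), level 8 is 250x188 (a single tile), and level 0 comes out to 1x1.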

Because I chose to make this event driven and not tied to a specific storage medium, I need to tie in some code to save the images.

TileMaker maker = new TileMaker(bitmapImage, 256, 60);
maker.TileCreated += maker_TileCreated;
maker.CreateTiles();

This simple code creates the TileMaker, assigns a handler to its TileCreated event, then calls the CreateTiles() method to make it actually happen. Here are two examples of what the handler might look like.

private void maker_TileCreated(object sender, TileCreatedEventArgs e)
{
    var tile = e.TileContainer;
    using (DataContext context = new DataContext())
    {
        Tile tileData = new Tile
        {
            ImageID = _imageID, Level = tile.Level,
            TilePositionX = tile.TilePositionX, TilePositionY = tile.TilePositionY,
            ImageData = new System.Data.Linq.Binary(tile.ImageData)
        };
        context.Tiles.InsertOnSubmit(tileData);
        context.SubmitChanges();
    }
}

private void maker_TileCreated(object sender, TileCreatedEventArgs e)
{
    var tile = e.TileContainer;
    string path = String.Format(@"C:\Documents and Settings\Jeff\Desktop\test\{0}_{1}_{2}.jpg", tile.Level, tile.TilePositionX, tile.TilePositionY);
    // FileMode.Create truncates any existing file, so no SetLength call is needed.
    using (FileStream stream = new FileStream(path, FileMode.Create))
    {
        stream.Write(tile.ImageData, 0, tile.ImageData.Length);
    }
}

The first assumes I have a LINQ to SQL definition, with a table called Tiles, and I'm feeding in an image ID which is part of some other code. The second one actually writes out files. I've run this code out of a Web page, which isn't ideal because it's somewhat CPU intensive. Really big, high resolution panoramas take a good 30 seconds.

Next time I'll get into how you feed the tiles from the database to an HttpHandler, and how to have the MultiScaleImage call the handler for tiles.

On a side note, I have to say that overall I'm ridiculously impressed with Silverlight 2. Sure, I wish more stuff would've made it into the distribution, but it's just stunning to me how much managed code you can now run right in the browser.

I'm not happy with the whole Blend thing. Apparently, according to a friend who was blocked from it, Blend is not available to the typical MSDN subscription levels. I'm not sure what the criteria are. Blend is a pretty neat tool, but I'm absolutely annoyed that the designer surface in Visual Studio is useless. I'd love to know why this is, when you can use it in WPF.

Debugging simply works without any intervention, which I'm very happy about. It even debugs just fine while in Firefox. Getting scriptable methods to work right was a pain (because the MSDN docs are too vague about the right way to do it), but I think I've got that licked.

I'm thinking about publishing a mini-Deep Zoom kit to facilitate some of these roll-your-own scenarios. Is there any interest in that?

EDIT: Part II is here.
