Getting absolute coordinates from a DOM element

For some reason, there is no standard API to get the pixel coordinates of a DOM element relative to the upper-left corner of the document. APIs only exist to get coordinates relative to the offset parent. Problem is, it's very important to get those coordinates for applications such as drag and drop, or whenever you need to compare coordinates of elements that may be in completely different parts of the document.

In Microsoft Ajax, we implemented such a function (Sys.UI.DomElement.getLocation) but it proved to be one of the most difficult problems we had to solve. Not so surprisingly, every single browser has its own coordinate quirks that make it almost impossible to get the right results with just capability detection. This is one of the very rare cases where we reluctantly had to use browser sniffing and implement a completely different version of the function for each browser.

We also had to implement a pretty complex test suite to verify our algorithms for thousands of combinations of block or inline elements, offsets, scroll positions, frame containment, borders, etc., and run those on each browser that we support. The test suite in itself is quite interesting: it renders an element with the constraints to test, gets its coordinates from the API, creates a top-level semi-transparent element that is absolutely positioned at those coordinates, and checks that both overlap exactly, to pixel precision. To do so, it takes a screenshot and analyses the image to find the rectangles and check their color.
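To give an idea of what the image analysis step can look like, here is a minimal sketch in JavaScript. It is not our actual tool. The function name findColorBounds and the representation of the screenshot as an array of rows of color strings are assumptions made for illustration only. It scans the image for pixels of a given color and returns the smallest rectangle that contains them all.

```javascript
// Hypothetical sketch: find the bounding rectangle of all pixels of one color.
// "pixels" is assumed to be an array of rows, each row an array of color strings.
function findColorBounds(pixels, color) {
  let bounds = null;
  for (let y = 0; y < pixels.length; y++) {
    for (let x = 0; x < pixels[y].length; x++) {
      if (pixels[y][x] !== color) continue;
      if (bounds === null) {
        // First matching pixel: rows are scanned top to bottom, so this is the top edge.
        bounds = { left: x, top: y, right: x, bottom: y };
      } else {
        bounds.left = Math.min(bounds.left, x);
        bounds.right = Math.max(bounds.right, x);
        bounds.bottom = Math.max(bounds.bottom, y);
      }
    }
  }
  // null means the color was not found anywhere in the image.
  return bounds;
}
```

Running this once for the color of the tested element and once for the color of the overlay gives two rectangles, and any difference between them is a coordinate error to log.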

The simplest runtime implementation is the IE one, thanks to a little-known API that does almost exactly what we want: getBoundingClientRect. Almost exactly, because we quickly discovered that it has a weird 2-pixel offset, except on IE6 if the HTML element has a border (which is a "feature" that was removed in IE7 and which we chose not to support). It also doesn't include the document's scroll position. Finally, if the element is in a frame, the frame border should be subtracted. That last step actually was the cause of a bug that we unfortunately discovered after we shipped 1.0, but which is now fixed in ASP.NET 3.5: the parent frame may not be in the same domain, in which case attempting to get its frame border will result in an access denied error. So you need to do this in a try-catch and just accept the bad offset in that fringe scenario.
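Here is a rough sketch of that logic, not the actual Sys.UI.DomElement.getLocation source. The function name locationFromClientRect, the scroll parameter and the getFrameBorderWidth callback are all hypothetical, and I'm assuming the 2-pixel quirk is corrected by subtracting 2 from each coordinate.

```javascript
// Assumption: the IE quirk shifts both coordinates by 2 pixels.
const IE_CLIENT_OFFSET = 2;

// element: anything exposing getBoundingClientRect()
// scroll: { left, top }, the document's scroll position
// getFrameBorderWidth: hypothetical callback returning the parent frame's
//   border width; it may throw when the parent frame is in another domain
function locationFromClientRect(element, scroll, getFrameBorderWidth) {
  const rect = element.getBoundingClientRect();
  // The client rect ignores scrolling, so add the scroll position back.
  let x = rect.left - IE_CLIENT_OFFSET + scroll.left;
  let y = rect.top - IE_CLIENT_OFFSET + scroll.top;
  if (getFrameBorderWidth) {
    try {
      const border = getFrameBorderWidth();
      x -= border;
      y -= border;
    } catch (e) {
      // Access denied on a cross-domain parent: keep the uncorrected offset.
    }
  }
  return { x: x, y: y };
}
```

In a browser, the scroll argument would typically come from document.documentElement.scrollLeft and scrollTop. Passing the frame border lookup as a callback keeps the part that may throw isolated inside the try-catch.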

In all other browsers, you currently need to recurse through the offset parents of the element and sum the offset coordinates.
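The basic walk, before any browser-specific correction, can be sketched as follows. The function name sumOffsets is mine, and this deliberately leaves out every quirk, so on its own it will give wrong results in many real layouts.

```javascript
// Naive sketch: follow the offsetParent chain up to the root and
// accumulate offsetLeft / offsetTop. No browser quirk is handled here.
function sumOffsets(element) {
  let x = 0;
  let y = 0;
  for (let node = element; node; node = node.offsetParent) {
    x += node.offsetLeft || 0;
    y += node.offsetTop || 0;
  }
  return { x: x, y: y };
}
```

I wrote it as a loop rather than a recursion, but the result is the same: the sum of the offsets of the element and of all its offset parents.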

In Safari 2, though, the body's offset gets counted twice for absolutely-positioned elements that are direct children of body. Something you don't just guess, you need the thousands of test cases I mentioned earlier to discover something like that...

In both Safari and Firefox, you must subtract from the coordinates the scroll positions of all parent nodes (and not the offset parents like before). Well, except if the element is absolutely positioned (this last restriction doesn't apply to Opera). Or if the parent is the body or html element. Confused yet? Wait, there's more.
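As a sketch of that scroll correction (again not the shipped code): the function name subtractParentScroll and the isAbsolute flag are assumptions, and I'm reading "absolutely positioned" as applying to the element being measured, with the caller working out its position style beforehand.

```javascript
// point: { x, y } as obtained from summing the offsets
// isAbsolute: hypothetical flag, true when the measured element is
//   absolutely positioned, in which case no scroll is subtracted
function subtractParentScroll(element, point, isAbsolute) {
  let x = point.x;
  let y = point.y;
  if (!isAbsolute) {
    // Walk parentNode, not offsetParent.
    for (let parent = element.parentNode; parent; parent = parent.parentNode) {
      const tag = parent.tagName ? parent.tagName.toUpperCase() : "";
      // The scroll of body and html is left alone.
      if (tag === "BODY" || tag === "HTML") continue;
      x -= parent.scrollLeft || 0;
      y -= parent.scrollTop || 0;
    }
  }
  return { x: x, y: y };
}
```

Note that the loop also reaches the document node at the top of the chain, which has no tagName and no scroll values, so it contributes nothing.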

In Firefox, non-absolutely positioned elements that are direct children of body get the body offset counted twice.

In both IE and Firefox (but you don't care for IE as it has getBoundingClientRect), the border of a table gets counted in both the border of the table and in the td's offset.

Finally, on Opera, there are scroll values on elements that are not scrolled so you need to explicitly check for the overflow mode before you subtract scroll positions. Opera also includes the scrolling into the offsets, except for positioned contents.

The worst part in all this is that we don't even know for sure that we nailed it, and we know that future browsers will require adjustments.

Speaking of which... Firefox 3 will implement getBoundingClientRect. I haven't tried their implementation yet or checked whether it has the IE quirks, but it should be a lot simpler than what we have to work around today to do the same thing, and we'll definitely have to rely on fewer undocumented quirks. By the way, if you were thinking of using the undocumented getBoxObjectFor, forget it: it was designed for XUL elements and will probably get removed from non-XUL elements in future versions.

There is a bug open against WebKit to get that in Safari but it's currently without an owner. Here's to hoping this gets into the next version. Vote for it.

Opera apparently also implements that in 9.5, referencing a W3C draft which curiously doesn't contain any references to getBoundingClientRect.

17 Comments

  • Why don't you just implement the Mootools framework - it's much more lightweight, fast, and simpler to implement. No point reinventing the wheel again!

  • Marcus: I get a lot of comments from people who are not using ASP.NET Ajax asking me why we didn't just integrate . I don't deny that it's an interesting question, but there actually are pretty good answers. First, users of ASP.NET have been asking us to do it. Second, Microsoft Ajax, while not requiring ASP.NET, has unparalleled integration with it, which our users appreciate immensely. Third, building our own enables us to control the design and steer it to promote interoperability with other products and answer specific requirements from our partners and customers. Finally, it's very difficult legally for us to integrate external products (although we've done it in the past and may do it again in the future), especially if their license is anything like GPL.
    In the end, our decision seems to have been the right one as one year after shipping, our library is already used by 73% of ASP.NET developers (i.e. our customers), whether they develop Ajax applications or not.

  • Sebastian: thanks for the comment. I completely agree with the sharing and stuff. This kind of hacking is no fun at all and I don't wish anyone to have to go through it. Hence the experience sharing and the blog post. So yes, take what you want from this post and use it if you find anything you didn't already know and feel free to share any information I may have missed and I will add it to the post and credit you for it.
    As for the code sharing, it may be a little trickier. On my side, there is no chance that the legal department here would even let me glance at something with a license that ends with GPL. The other way around might work, as we're under MS-PL:
    http://msdn.microsoft.com/en-us/asp.net/bb944808.aspx
    One thing that may be interesting to talk about would be to share our test suites. That may help achieve consistency, correctness, and would help other developers to debug their code.
    Drop me a mail at bleroy at youknowwho if you want to chat.

  • I applaud the commitment M$ has made to cross-browser compatibility and this article is a great example on the lengths you guys are going to to make our lives (the developers) easier. It is a shame that other parts of M$ are still mired in the 'One platform' mindset.

    Where's the tip jar?

    On the note of multiple toolkits performing the same tasks... I agree with everyone: It would be nice if everyone was working on the one codebase, but I also understand that M$ would also need to have (some of) its codebase dictated by commercial concerns (e.g. from ASP.NET).
    Alas, these two philosophies are on other sides of a river.

    Let's hope that one day M$ will allow 3rd parties to develop their AJAX code.
    (WTL went down a similar avenue, hopefully for the world's benefit).

  • Thomas: the way the test suite works is that we have an ASP.NET page that runs locally (as in on the same machine as the server), displays an element with the wanted constraints, calls getLocation on it, and positions another element absolutely with the coordinates we got from getLocation. Then the page's script sends a request to the server, which launches the application that takes a screenshot and analyses the image to detect differences in coordinates and log them.
    This way, the test suite runs on any browser, you just go to that page and it starts.

  • I'm with you up to the screenshot. But are you at liberty to disclose how you do the analysis of the image?

  • The other interesting thing about the suite is that it generates positioning combinations automatically. It takes around I'd say... 30 minutes if I remember correctly, to run through enough combinations in a single browser.

    Some issues it has are combinations that produce strange renderings where the positioned element isn't visible or only partially visible (although we do the best we can to minimize those by making everything big enough, it happens anyway), or isn't redrawn by the browser without user interaction.

    I could imagine coming up with a single page that had enough of the real key scenarios (eigenpositions?) that could be the GetLocation equivalent of the acid test... :) If I recall the main pain areas across all browsers were combinations of relative and absolute positioned elements, and further when those are inside scrolled elements. So much fun...

  • Dave (infinitiesloop): I like the idea of an acid test for getLocation. By the way, our simplified version of the test that runs on every checkin does include the most pathological cases and comes close to that definition. It lacks the generality of the full test but it has the advantage that because the expected coordinates for each browser are hard coded, it doesn't need the screenshot analysis so it's much easier to run.

  • Thomas: the element we got getLocation from has a border of a specific color, and what we position absolutely to check the coordinates are two squares of a different color on the top left and bottom right. The analysis program scans the image for those rectangles and squares of specific colors. Does that answer your question?

  • Not quite. I was after the 'analysis' part you mention. How is this performed? Are you using a tool like ImageMagick, or something self-written? How do you detect the rectangles and squares and the deviation from the expected values?

  • Something self-written. We wrote a .NET console app that grabs the screenshot and finds rectangles of a given color.

  • Thanks for the blog entry. I really liked reading about all those problems ... I always just did a loop through the DOM to find out the coordinates, and "corrected" that by hardcoded values (for the browser differences) ...
    But as interesting as your post was, could you explain a little bit more how that console app does the screenshots? As I saw in the comments, I'm not the only one who's interested in that ;-) ...

  • Fine, seeing all the feedback I'm getting on that, I'll find a way to publish that publicly. I'll try to see if we can come up with a simplified version that would look like ACID for getLocation. Stay tuned.
    (technically, grabbing the screenshot involves some .NET/P/Invoke wizardry, but nothing too intense, basically to get a Bitmap object from any window handle and then you're in business)

  • @Atanas: yes, we're currently doing a test pass with the IE8 beta and we know of a few problems. getLocation is one API where we have literally thousands of test cases and where new browsers are likely to break the current code because of how tied it is to browser bug workarounds.

  • jQuery does a good job at it with the offset() method. I really like the way jQuery is chainable and how properties are implemented.

    Seriously study jQuery! I've found it to be the best, easiest, and least intrusive.

  • I am sorry, I am not familiar with ASP. But I want to get absolute coordinates from a C# System.Windows.Forms.HtmlElement. How can I use Sys.UI.DomElement.getLocation in my code? I can't find the namespace "Sys".

  • @Tom: Sys is a JavaScript namespace introduced by ASP.NET Ajax. You seem to be two technologies away from where you need to look. I'd ask on a Windows Forms forum, for example http://windowsclient.net/Forums/
