Part I: What is the Document Object Model

What is a Document?

Let us define a document as a logical structure, detailing information of a certain kind. Each type of document also has its physical structure.

For example, take the following:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
	"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
	<head>
		<title>My Title</title>
	</head>
	<body>
		<p>Some Text</p>
	</body>
</html>

Physically, this is an XML document: we can see that it is built from a declaration and elements. We could delve deeper into the physical structure and decide that it is a sequence of UTF-8 characters, 8-bit bytes or even bits.
In this example, however, we take XML as the physical structure and build the logical structure on top of it.

Examining the document again, we can see that it is also a simple XHTML document, since it conforms to the XHTML 1.0 Strict standard. From this we can infer that its logical structure is that of an XHTML document. We can now refer to its elements not as simple, anonymous XML elements, but as XHTML elements.
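
The two views can be made concrete in a few lines. The sketch below is only an illustration. It uses Python's standard xml.dom.minidom as a stand-in parser (an assumption, not an API this article discusses), and it drops the DOCTYPE from the sample for brevity. The names XHTML_NS, SOURCE and is_xhtml are made up for the example. The text is first read as plain XML, and the namespace is then used to treat the same elements as XHTML.

```python
from xml.dom import minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"

# The sample document from above, minus the DOCTYPE (omitted for brevity).
SOURCE = b"""<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <title>My Title</title>
  </head>
  <body>
    <p>Some Text</p>
  </body>
</html>"""

# Physical view: the parser only knows about a declaration and elements.
document = minidom.parseString(SOURCE)
root = document.documentElement

# Logical view: the namespace tells us these are XHTML elements.
is_xhtml = root.namespaceURI == XHTML_NS and root.localName == "html"

# Once we know the logical structure, we can ask XHTML-level questions.
title = document.getElementsByTagNameNS(XHTML_NS, "title")[0]
title_text = title.firstChild.data

print(is_xhtml, title_text)
```

The parser never needed to know what a title is. That knowledge belongs to the logical structure layered on top of the XML.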

What is a Document Object Model?

Many definitions of an Object Model can be found on the web, but let us first discuss the reasons a Document Object Model is needed in the first place:

  1. Every type of document has its own rules for its structure. For instance, 'bold' and 'not bold' cannot both be applied to a single letter of a Word document, and a field in a C# class cannot have two data types.
    When working with such a structure from our code, we have to be able to apply all of those rules intuitively or even implicitly.
  2. Due to their physical structure, documents are often unreadable to humans without software that translates them into a readable form. For instance, the physical structure of a DOS executable file might contain the byte pair 0xCD and 0x21. This is not understandable unless you are specifically trained to read this structure. When read by the machine, these bytes are interpreted as the x86 assembly instruction that calls interrupt 0x21 (int 21h).
    We have to be able to translate this encoding from the document into something that can be used in code, where an object-oriented structure is usually preferable.
  3. Upgrades to the physical structure of a document type are common. Take a Microsoft Word 6.0 document, for example, and compare it to a Microsoft Word 10.0 document. Looking into the physical structure of both document types, we can see how it has changed, even though the logical structure remained backwards compatible. We must be able to access the logical structure of the file without being required to understand its physical structure, in order to maintain backwards compatibility in our third-party software.
  4. Almost all large software projects are written and maintained by several people, and often the people who maintain the software are not those who originally wrote it. Thus, we must write code that is readable, understandable and maintainable.
    One problem that might come up: suppose we write a method that writes several bytes into a file that is in use, with no explanation given. The code might work, but it is not easy to understand why it works or what it is for.
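
Point 2 can be sketched with the byte pair mentioned above. This is a minimal illustration in Python. The class SoftwareInterrupt and its methods are hypothetical names invented for the example, not part of any real library. It wraps the raw bytes 0xCD 0x21 in a small object that code can work with, and turns the object back into bytes when needed.

```python
from dataclasses import dataclass

INT_OPCODE = 0xCD  # x86 opcode for the two-byte "INT imm8" instruction


@dataclass(frozen=True)
class SoftwareInterrupt:
    """Hypothetical object-oriented view of an x86 INT instruction."""

    number: int

    @classmethod
    def from_bytes(cls, raw: bytes) -> "SoftwareInterrupt":
        # Physical structure: opcode byte followed by the interrupt number.
        if len(raw) != 2 or raw[0] != INT_OPCODE:
            raise ValueError("not a two-byte INT instruction")
        return cls(number=raw[1])

    def to_bytes(self) -> bytes:
        # Persist the object back to its physical encoding.
        return bytes([INT_OPCODE, self.number])

    def __str__(self) -> str:
        # Human-readable form, in the usual assembler notation.
        return f"int {self.number:02x}h"


call = SoftwareInterrupt.from_bytes(b"\xCD\x21")
print(call)  # int 21h
```

Code that receives a SoftwareInterrupt never has to know that the opcode happens to be 0xCD, which is the kind of separation between logical and physical structure the list argues for.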

These points show us that we must create a viable, convenient and reusable model for working with documents of a specific type. This model must also be as backwards compatible and consistent as possible.

The Document Object Model (henceforth referred to as DOM) comes into play to solve these problems by introducing a set of classes that represent all of the options the document's logical structure has to offer, thus creating an abstraction of the file's structure.
The DOM can load an existing document into an object graph that would represent it. This object graph could be created from scratch as well. The DOM also offers saving an object graph back to the file format, once we are done using it via our code.
Many DOMs offer other capabilities as well, such as creating different document types or versions from a single graph of objects. For instance, if the Microsoft Word Object Model supported persistence to Portable Document Format (PDF) documents, a Word document could be loaded into an object graph and saved as a PDF document using a few simple lines of code.
The company or organization that controls the standard often also issues a specification, and at times an implementation, of the DOM for its document type, making it more reliable. When an implementation is also distributed, the DOM becomes a shared piece of code maintained by a single company (the one that issued it), which eliminates duplication and the need to maintain different implementations.
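
The load, manipulate, save and create-from-scratch cycle described above can be sketched as follows. This is an illustration under stated assumptions. Python's standard xml.dom.minidom stands in for whichever DOM implementation you actually use, and the markup is a cut-down version of the earlier sample.

```python
from xml.dom import minidom

# Load: turn an existing document into an object graph.
loaded = minidom.parseString("<html><body><p>Some Text</p></body></html>")

# Manipulate the graph through objects instead of raw characters.
paragraph = loaded.getElementsByTagName("p")[0]
paragraph.firstChild.data = "Changed Text"

# Save: persist the graph back to the document format.
saved = loaded.documentElement.toxml()

# Create from scratch: build the same kind of graph with no source document.
built = minidom.Document()
html = built.createElement("html")
body = built.createElement("body")
para = built.createElement("p")
para.appendChild(built.createTextNode("Some Text"))
body.appendChild(para)
html.appendChild(body)
built.appendChild(html)
created = built.documentElement.toxml()

print(saved)
print(created)
```

In both cases the code only talks about elements and text nodes. The angle brackets are produced by the DOM when the graph is serialized.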

However, every good thing has its bad side. The use of DOM could at times be cumbersome and inconvenient. This happens due to poor design of the structure and/or the fact that it contains many features that are unnecessary for novice developers.
Efficiency must also be considered, since we are using a complete object graph in memory to manipulate a file, not to mention loading graphs from documents and persistence. These are at times processor-heavy and memory-heavy actions.

Examples of common Document Object Models include the XML Document Object Model specification (with its System.Xml implementation in the .NET Framework) and the Microsoft Office Document Object Models.

The following parts deal with the Code Document Object Model, better known as the Code DOM. Its function is to represent code files for languages that target the CLI.
