The Document Object Model (DOM) is the data representation of the objects that comprise the structure and content of a document on the web. In this guide, we’ll briefly introduce the DOM. We’ll look at how the DOM represents an HTML or XML document in memory and how you use APIs to create web content and applications.

The Document Object Model, or the “DOM”, is an interface to web pages. It is essentially an API to the page, allowing programs to read and manipulate the page’s content, structure, and styles. Let’s break this down.
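As a first concrete taste of "an API to the page", here is a minimal sketch using Python's standard-library `xml.dom.minidom`, which implements the DOM interfaces outside a browser. The sample markup and the helper name `read_and_update` are made up for illustration, and a browser's DOM exposes far more than this XML-oriented module does.

```python
from xml.dom.minidom import parseString

# Hypothetical sample markup; it must be well-formed XML for minidom to parse it.
SOURCE = '<html><body><h1 class="title">Hello</h1><p>First paragraph.</p></body></html>'


def read_and_update(source):
    """Read a heading's text, then change its text and class through the DOM API."""
    doc = parseString(source)
    heading = doc.getElementsByTagName("h1")[0]

    # Read content: the heading's text lives in a child text node.
    old_text = heading.firstChild.data

    # Manipulate content and an attribute through the API.
    heading.firstChild.data = "Hello, DOM"
    heading.setAttribute("class", "title highlighted")
    return old_text, heading.toxml()


old_text, updated = read_and_update(SOURCE)
print(old_text)  # Hello
print(updated)   # <h1 class="title highlighted">Hello, DOM</h1>
```

The program never edits the markup string itself; it only talks to objects that the parser built from it.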

The name “Document Object Model” was chosen because it is an “object model” in the traditional object-oriented design sense: documents are modeled using objects, and the model encompasses not only the structure of a document, but also the behavior of a document and the objects of which it is composed. In other words, the nodes of a document tree do not represent a mere data structure; they represent objects, which have functions and identity. As an object model, the Document Object Model identifies:

  • the interfaces and objects used to represent and manipulate a document
  • the semantics of these interfaces and objects – including both behavior and attributes
  • the relationships and collaborations among these interfaces and objects
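The three points above can be illustrated with a small sketch, again assuming Python's `xml.dom.minidom` as a stand-in DOM implementation. The function name `describe` and the sample list markup are hypothetical. It checks that nodes have relationships (parent, sibling), behavior (methods), and identity (two nodes with identical markup are still distinct objects).

```python
from xml.dom.minidom import Node, parseString


def describe(source):
    """Collect a few facts about the node objects built from `source`."""
    doc = parseString(source)
    root = doc.documentElement
    first, second = root.getElementsByTagName("li")
    return {
        # Interfaces: every node reports what kind of node it is.
        "root_is_element": root.nodeType == Node.ELEMENT_NODE,
        # Relationships: nodes know their parent and siblings.
        "parent_is_root": first.parentNode.isSameNode(root),
        "next_sibling_is_second": first.nextSibling.isSameNode(second),
        # Identity: identical markup still yields two distinct objects.
        "identical_markup_same_object": first.isSameNode(second),
        # Behavior: nodes carry functions, not just data.
        "has_behavior": callable(first.cloneNode),
    }


facts = describe("<ul><li>item</li><li>item</li></ul>")
for name, value in facts.items():
    print(name, value)
```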

The structure of SGML documents has traditionally been represented by an abstract data model, not by an object model. In an abstract data model, the model is centered around the data. In object oriented programming languages, the data itself is encapsulated in objects which hide the data, protecting it from direct external manipulation. The functions associated with these objects determine how the objects may be manipulated, and they are part of the object model.

In Level 1, the Document Object Model consists of two parts, DOM Core and DOM HTML. The DOM Core represents the functionality used for XML documents, and also serves as the basis for DOM HTML. All DOM implementations must support the interfaces listed as “fundamental” in the Core specification; in addition, XML implementations must support the interfaces listed as “extended” in the Core specification. The Level 1 DOM HTML specification defines additional functionality needed for HTML documents.

Where the Document Object Model came from

The Document Object Model originated as a specification to allow JavaScript scripts and Java programs to be portable among web browsers. Dynamic HTML was the immediate ancestor of the Document Object Model, and it was originally thought of largely in terms of browsers. However, when the Document Object Model Working Group was formed, it was also joined by vendors in other domains, including HTML or XML editors and document repositories. Several of these vendors had worked with SGML before XML was developed; as a result, the Document Object Model has been influenced by SGML Groves and the HyTime standard. Some of these vendors had also developed their own object models for documents in order to provide programming APIs for SGML/XML editors or document repositories, and these object models have also influenced the Document Object Model.

How a web page is built

How a browser goes from a source HTML document to displaying a styled and interactive page in the viewport is called the “Critical Rendering Path”. Although this process can be broken down into several steps, as I cover in my article on Understanding the Critical Rendering Path, these steps can be roughly grouped into two stages. The first stage involves the browser parsing the document to determine what will ultimately be rendered on the page, and the second stage involves the browser performing the render.

The result of the first stage is what is called a “render tree”. The render tree is a representation of the HTML elements that will be rendered on the page and their related styles. In order to build this tree, the browser needs two things:

  1. The CSSOM, a representation of the styles associated with elements
  2. The DOM, a representation of the elements

DOM Interfaces and DOM Implementations

The DOM specifies interfaces which may be used to manage XML or HTML documents. It is important to realize that these interfaces are an abstraction – much like “abstract base classes” in C++, they are a means of specifying a way to access and manipulate an application’s internal representation of a document. In particular, interfaces do not imply a particular concrete implementation. Each DOM application is free to maintain documents in any convenient representation, as long as the interfaces shown in this specification are supported. Some DOM implementations will be existing programs that use the DOM interfaces to access software written long before the DOM specification existed. Therefore, the DOM is designed to avoid implementation dependencies; in particular,

  1. Attributes defined in the IDL do not imply concrete objects which must have specific data members – in the language bindings, they are translated to a pair of get()/set() functions, not to a data member. (Read-only attributes have only a get() function in the language bindings.)
  2. DOM applications may provide additional interfaces and objects not found in this specification and still be considered DOM compliant.
  3. Because we specify interfaces and not the actual objects that are to be created, the DOM cannot know what constructors to call for an implementation. In general, DOM users call the createXXX() methods on the Document class to create document structures, and DOM implementations create their own internal representations of these structures in their implementations of the createXXX() functions.
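The factory-method and get/set points above can be sketched as follows, assuming Python's `xml.dom.minidom` as one possible implementation. The element names and the helper `build_document` are hypothetical. The code never calls a node constructor directly: it asks the document to create nodes, and it reads and writes attributes through accessor methods.

```python
from xml.dom.minidom import getDOMImplementation


def build_document():
    """Build a tiny document using only factory methods on the Document."""
    impl = getDOMImplementation()
    # createDocument(namespaceURI, qualifiedName, doctype)
    doc = impl.createDocument(None, "article", None)
    root = doc.documentElement

    # The createXXX() methods let the implementation pick its own node classes.
    title = doc.createElement("title")
    title.appendChild(doc.createTextNode("Intro to the DOM"))
    root.appendChild(title)

    # Attributes go through set/get accessors rather than a data member.
    root.setAttribute("lang", "en")
    return doc


doc = build_document()
print(doc.documentElement.getAttribute("lang"))  # en
print(doc.documentElement.toxml())
# <article lang="en"><title>Intro to the DOM</title></article>
```

Because callers depend only on the interfaces, the same code could in principle run against a different DOM implementation that stores documents in a completely different way.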

Limitations of Level One

The DOM Level 1 specification is intentionally limited to those methods needed to represent and manipulate document structure and content. Future Levels of the DOM specification will provide:

  1. A structure model for the internal subset and the external subset.
  2. Validation against a schema.
  3. Control for rendering documents via stylesheets.
  4. Access control.
  5. Thread-safety.

Recap

The DOM is an interface to an HTML document. It is used by browsers as a first step towards determining what to render in the viewport, and by JavaScript programs to modify the content, structure, or styling of the page.

Although similar to other forms of the source HTML document, the DOM is different in a number of ways:

  • It is always valid HTML
  • It is a living model that can be modified by JavaScript
  • It doesn’t include pseudo-elements (e.g. ::after)
  • It does include hidden elements (e.g. with display: none)
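The "living model" point can be illustrated outside a browser with a rough sketch in Python's `xml.dom.minidom`. This is an assumption-laden stand-in: minidom is an XML DOM, so it says nothing about HTML validity, pseudo-elements, or `display: none`. The helper name `mutate` and the markup are hypothetical. After a script-like change, the in-memory tree no longer matches the source text it was parsed from.

```python
from xml.dom.minidom import parseString

SOURCE = "<body><p>Original</p></body>"


def mutate(source):
    """Parse `source`, add a paragraph, and report how the tree changed."""
    doc = parseString(source)
    body = doc.documentElement

    children = body.childNodes  # the node's own child list, updated in place
    before = len(children)

    note = doc.createElement("p")
    note.appendChild(doc.createTextNode("Added by a script"))
    body.appendChild(note)

    return before, len(children), body.toxml()


before, after, serialized = mutate(SOURCE)
print(before, after)         # 1 2
print(serialized)            # <body><p>Original</p><p>Added by a script</p></body>
print(serialized == SOURCE)  # False
```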
