XML DTD Building Blocks

All XML documents are composed of elements, attributes, entities, PCDATA, and CDATA. These are the fundamental building blocks that make up XML documents.


Elements are the main building blocks of both XML and HTML documents.


<?xml version="1.0"?>
<book>
  <title>A Great Book</title>
  <author>Tom Nolan</author>
  <publisher>Tutorial Reference</publisher>
</book>

Here book, title, author, and publisher are elements.
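As a rough illustration of how a parser sees these elements, the sketch below loads a small book document with Python's standard-library xml.etree.ElementTree module and lists the element names it finds. The variable names (doc, root, element_names) are chosen for this example only.

```python
import xml.etree.ElementTree as ET

# A small document: one book element wrapping three child elements.
doc = """<?xml version="1.0"?>
<book>
  <title>A Great Book</title>
  <author>Tom Nolan</author>
  <publisher>Tutorial Reference</publisher>
</book>"""

root = ET.fromstring(doc)

# The root element's name, followed by the names of its direct children.
element_names = [root.tag] + [child.tag for child in root]
print(element_names)

# The text content of a single child element.
title_text = root.find("title").text
print(title_text)
```

Each element becomes a node with a tag name, optional text, and optional child elements.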


Learn more about Elements in the DTD Elements Section


Attributes provide extra information about elements.

Attributes are name-value pairs always placed inside the opening tag of an element.


<img src="myimage.png" />


In this example:

  • the name of the element is img;
  • the name of the attribute is src;
  • the value of the attribute is myimage.png.
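The same breakdown can be checked with a parser. This minimal sketch, again assuming Python's standard-library xml.etree.ElementTree, reads the element name and the attribute name-value pair from the img example; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Parse the single self-closing element from the example.
img = ET.fromstring('<img src="myimage.png" />')

element_name = img.tag        # the element name
attributes = dict(img.attrib) # all attributes as name-value pairs
src_value = img.get("src")    # the value of one attribute

print(element_name, attributes, src_value)
```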

Learn more about Attributes in the DTD Attributes Section


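Some characters have a special meaning in XML, like the less-than sign (<), which marks the start of an XML tag. To use such a character as literal text, it is written as an entity reference.

An entity reference is composed of three parts:

  1. An ampersand (&)
  2. An entity name
  3. A semicolon (;)

An entity is declared in a DTD with the following syntax:

<!ENTITY entity-name "entity-value">

The following entities are predefined in XML:

Entity Reference    Character
&lt;                <
&gt;                >
&amp;               &
&quot;              "
&apos;              '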


Entities are expanded when a document is parsed by an XML parser.
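To make the expansion step concrete, here is a small sketch using Python's standard-library xml.etree.ElementTree. It declares one entity in an internal DTD subset and mixes it with predefined entities. The note element and the writer entity are made up for this example.

```python
import xml.etree.ElementTree as ET

# "writer" is a hypothetical entity declared in the internal DTD subset;
# &lt;, &amp; and &gt; are predefined and need no declaration.
doc = """<!DOCTYPE note [
  <!ENTITY writer "Tom Nolan">
]>
<note>&writer; wrote: 1 &lt; 2 &amp; 3 &gt; 2</note>"""

note = ET.fromstring(doc)

# After parsing, every entity reference has been replaced by its value.
expanded_text = note.text
print(expanded_text)
```

The application never sees the references themselves, only the expanded text.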


PCDATA means parsed character data.

PCDATA is text that will be parsed by the XML parser.

Tags inside the PCDATA will be treated as markup and entities will be expanded.
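A short sketch of this behavior, assuming Python's standard-library xml.etree.ElementTree: the content of the author element below contains a tag and an entity reference, and the parser interprets both. The b element is only an illustration.

```python
import xml.etree.ElementTree as ET

# The content contains a nested tag and an entity reference.
author = ET.fromstring("<author>Tom <b>Nolan</b> &amp; friends</author>")

# The <b> tag was treated as markup, so it became a child element.
child_tags = [child.tag for child in author]

# Text before the child, text inside it, and text after it
# (with &amp; already expanded to a plain ampersand).
leading_text = author.text
child_text = author[0].text
trailing_text = author[0].tail

print(child_tags, repr(leading_text), repr(child_text), repr(trailing_text))
```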


Parsed character data must not contain literal & or < characters; write them as the entities &amp; and &lt;. The > character is usually written as &gt; as well, and it must be escaped when it follows ]].


The XML parser scans the data for entity references. Any entity reference it finds is expanded.

The following DTD declares title, author, and publisher as elements that contain PCDATA:

<!DOCTYPE book [
<!ELEMENT book (title,author,publisher)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
]>


CDATA means character data.

Tags inside the CDATA text are not treated as markup and entities will not be expanded.
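For contrast with PCDATA, the sketch below wraps text in a CDATA section and parses it with Python's standard-library xml.etree.ElementTree. The code element name is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Everything between <![CDATA[ and ]]> is passed through as plain text.
code = ET.fromstring("<code><![CDATA[<b>bold</b> &amp; more]]></code>")

# The <b> tag is not markup here, so no child element is created,
# and &amp; is left exactly as written.
literal_text = code.text
child_count = len(code)

print(repr(literal_text), child_count)
```

The tag and the entity reference both survive as literal characters in the element's text.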