`
`
`
`
`
`Merrill Communications LLC
`d/b/a Merrill Corporation
`Exhibit 1006 pt. 6
`
`
`
`
`
`■ DTD schemas
`■ Aliases
`■ Combining multiple schemas
`■ Datatypes
`
`©1998 THE XML HANDBOOK™
`
`
`
`
`■ XML markup
`■ Document type definitions
`■ Linking and addressing
`■ Style sheets
`■ XML-Data
`■ Web Interface Definition Language (WIDL)
`
`
`
`
`The Technology of XML
`
`
`
`
`Creating a document type definition
`
`■ Document type declaration
`■ Element type declarations
`■ Attribute list declarations
`
`
`
`
`Chapter 32
`
`Creating your own document type definition is like creating your own
`markup language. If you have ever chafed at the limitations of a
`language with a fixed set of element types, such as HTML, TEI or LaTeX,
`then you will embrace the opportunity to create your own language.¹
`We should note again that it is possible to keep a document type
`definition completely in your head rather than writing the declarations
`for a DTD. Sometimes DTD designers do that while they are testing out
`ideas. Usually, though, you actually commit your ideas to declarations
`so that a validating processor can help you to keep your documents
`consistent.
`Note also that, for the present, we are maintaining the distinction,
`discussed in 4.4.3, "Document type, DTD, and markup declarations", on
`page 61, between a document type, the XML markup rules for it (DTD),
`and the markup declarations that declare the DTD. Those DTD
`declarations are connected to the big kahuna of markup declarations:
`the document type declaration.
`
`1. With its own set of limitations!
`
`
`CHAPTER 32 | CREATING A DOCUMENT TYPE DEFINITION
`
`32.1 | Document type declaration
`
`A document type declaration for a particular document might say "This
`document is a concert poster." The document type definition for the
`document would say "A concert poster must have the following features."
`As an analogy: in the world of art, you can declare yourself a
`practitioner of a particular movement, or you can define the movement
`by writing its manifesto.
`The XML spec uses the abbreviation DTD to refer to document type
`definitions because we speak of them much more often than document type
`declarations. The DTD defines the allowed element types, attributes and
`entities and can express some constraints on their combination.
`A document that conforms to its DTD is said to be valid. Just as an
`English sentence can be ungrammatical, a document can fail to conform
`to its DTD and thus be invalid. That does not necessarily mean,
`however, that it ceases to be an XML document. The word valid does not
`have its usual meaning here. An artist can fail to uphold the
`principles of an artistic movement without ceasing to be an artist, and
`an XML document can violate its DTD and yet remain a well-formed XML
`document.
`As the document type declaration is optional, a well-formed XML
`document can choose not to declare conformance to any DTD at all. It
`cannot then be a valid document, because it cannot be checked for
`conformance to a DTD. It is not invalid, because it does not violate
`the constraints of a DTD.
`XML has no good word for these merely well-formed documents. Some
`people call them "well-formed", but that is insufficiently precise. If
`the document were not well-formed, it would not be XML (by definition).
`Saying that a document is well-formed does not tell us anything about
`its conformance to a DTD at all.
`For this reason, we prefer the terms used by the ISO for full-SGML:
`type-valid, meaning "valid with respect to a document type", and
`non-type-valid, the converse.
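The distinction between well-formedness and validity can be seen with a non-validating parser. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and both sample strings are illustrative and not from the book. The parser only answers the well-formedness question. It says nothing about conformance to a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML; no DTD validation is attempted."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# No document type declaration: well-formed, but neither valid nor invalid.
no_doctype = "<label><name>Rock N. Robyn</name></label>"

# Overlapping tags: not well-formed, hence not an XML document at all.
broken = "<label><name>Rock N. Robyn</label></name>"

print(is_well_formed(no_doctype))  # True
print(is_well_formed(broken))      # False
```

Checking type-validity would require a validating processor, which this sketch deliberately does not attempt.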
`Example 32-1 is an XML document containing a document type declaration
`and document type definition for mailing labels, followed by an
`instance of the document type: a single label.
`The document type declaration starts on the first line and ends with
`"]>". The DTD declarations are the lines starting with "<!ELEMENT".
`Those are element type declarations. You can also declare attributes,
`entities and notations for a DTD.
`
`
`Example 32-1. XML document with document type declaration
`<!DOCTYPE label [
`<!ELEMENT label (name, street, city, state, country, code)>
`<!ELEMENT name (#PCDATA)>
`<!ELEMENT street (#PCDATA)>
`<!ELEMENT city (#PCDATA)>
`<!ELEMENT state (#PCDATA)>
`<!ELEMENT country (#PCDATA)>
`<!ELEMENT code (#PCDATA)>
`]><label>
`<name>Rock N. Robyn</name>
`<street>Jay Bird Street</street>
`<city>Baltimore</city>
`<state>MD</state>
`<country>USA</country>
`<code>~3214</code>
`</label>
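To see which lines of such a document are DTD declarations, one can ask a parser to report them. The sketch below assumes Python's standard `xml.parsers.expat` module, which is non-validating but does report element type declarations found in the internal subset. The function name `declared_element_types` and the postal code `00000` are placeholders introduced here for illustration.

```python
import xml.parsers.expat

# The label document, with a placeholder postal code (an assumption).
LABEL_DOC = """<!DOCTYPE label [
<!ELEMENT label (name, street, city, state, country, code)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ELEMENT code (#PCDATA)>
]><label>
<name>Rock N. Robyn</name>
<street>Jay Bird Street</street>
<city>Baltimore</city>
<state>MD</state>
<country>USA</country>
<code>00000</code>
</label>"""


def declared_element_types(document):
    """Collect the names of all element types declared in the document."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    # Called once per <!ELEMENT ...> declaration with its name and model.
    parser.ElementDeclHandler = lambda name, model: names.append(name)
    parser.Parse(document, True)
    return names


print(declared_element_types(LABEL_DOC))
```

The list comes back in declaration order, starting with `label`.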
`
`Recall from 3.4, "Entities: The physical structure", on page 38 that an
`XML document can be broken up into separate objects for storage, called
`"entities".¹ The document type declaration occurs in the first (or
`only) entity to be parsed, called the "document entity".
`In Example 32-1, all of the DTD declarations that define the label DTD
`reside within the document entity. However, the DTD could have been
`partially or completely defined somewhere else. In that case, the
`document type declaration would contain a reference to another entity
`containing those declarations.
`A document type declaration with only external DTD declarations looks
`like Example 32-2.
`
`Example 32-2. Document type declaration with external DTD declarations
`<?xml version="1.0"?>
`<!DOCTYPE LABEL SYSTEM "http://www.sgmlsource.com/dtds/label.dtd">
`<LABEL>
`
`</LABEL>
`
`The keyword SYSTEM is described more completely in 33.9.1, "System
`identifiers", on page 495. For now, we will just say that it tells the
`processor to fetch some resource containing the external information.
`In this case, the external information is made up of the declarations
`that define the label DTD. They should be exactly the ones we had in
`the original label document. The big difference is that now they can be
`reused in hundreds, thousands, or even millions of label documents. Our
`simple DTD could be the basis for the largest junk mailing in history!
`
`1. Loosely, an entity is like a file.
`
`All document type declarations start with the string "<!DOCTYPE".
`Next they have the name of an element type that is defined in the DTD.
`The root element in the instance (described in 31.4, "Elements", on
`page 434) must be of the type declared in the document type
`declaration. If any of the DTD declarations are stored externally, the
`third part of the document type declaration must be either "SYSTEM" or
`"PUBLIC". We will cover "PUBLIC" later. If it is "SYSTEM", the final
`part must be a URI pointing to the external declarations. A URI is, for
`all practical purposes, a URL. URIs are discussed in 34.4, "Uniform
`Resource Identifier (URI)", on page 512.
`
`Spec. Reference 32-1. DOCTYPE declaration
`[28] doctypedecl ::= '<!DOCTYPE' S Name (S ExternalID)? S? ('['
`                     (markupdecl | PEReference | S)* ']' S?)? '>'
`[75] ExternalID  ::= 'SYSTEM' S SystemLiteral
`                     | 'PUBLIC' S PubidLiteral S SystemLiteral
`[29] markupdecl  ::= elementdecl | AttlistDecl | EntityDecl
`                     | NotationDecl | PI | Comment
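The parts of a document type declaration described above can be inspected programmatically. This is a rough sketch, assuming Python's standard `xml.parsers.expat` module; the function name `doctype_info` and the relative system identifier `label.dtd` are hypothetical. The parser is non-validating and does not fetch the external declarations, so the comparison between the declared name and the root element is done by hand.

```python
import xml.parsers.expat

# A document whose DTD declarations are entirely external (file name assumed).
LABEL_DOC = '<!DOCTYPE LABEL SYSTEM "label.dtd">\n<LABEL></LABEL>'


def doctype_info(document):
    """Return the parts of the DOCTYPE declaration plus the root element name."""
    info = {}
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["name"] = name
        info["system_id"] = system_id
        info["public_id"] = public_id
        info["has_internal_subset"] = bool(has_internal_subset)

    def on_start(name, attrs):
        # The first start-tag seen belongs to the root element.
        info.setdefault("root", name)

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return info


info = doctype_info(LABEL_DOC)
print(info["name"], info["system_id"])  # LABEL label.dtd
print(info["root"] == info["name"])     # True
```

A validating processor would perform the last comparison itself and report a validity error on a mismatch.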
`
`32.2 | Internal and external subset
`
`In Example 32-1, the DTD declarations were completely internal. They
`were inside of the document type declaration. In Example 32-2, they
`were completely external. In many cases, there will be a mix of the
`two. This section will review these options and show how most XML
`document type declarations combine an internal part, called the
`internal subset, and an external part, called the external subset.
`From now on, as we'll almost always be writing about DTD declarations,
`we'll refer to them as "the DTD". We'll resort to the finer
`distinctions only when necessary for clarity.
`We will start with another example of a DTD:
`
`
`Example 32-3. Garage sale announcement DTD.
`<!ELEMENT GARAGESALE (DATE, TIME, PLACE, NOTES)>
`<!ELEMENT DATE (#PCDATA)>
`<!ELEMENT TIME (#PCDATA)>
`<!ELEMENT PLACE (#PCDATA)>
`<!ELEMENT NOTES (#PCDATA)>
`
`These markup declarations would make up an ultra-simple DTD for garage
`sale announcements.¹ As you may have deduced, it declares five element
`types. We will get to the syntax of the declarations soon. First we
`will look at how they would be used. These could reside in a separate
`file called garage.dtd (for instance) and then every document that
`wanted to conform to them would declare its conformance using a
`document type declaration. This is shown in Example 32-4.
`
`Example 32-4. Conforming garage sale document.
`<!DOCTYPE GARAGESALE SYSTEM "garage.dtd" >
`<GARAGESALE>
`<DATE>February 29, 1998</DATE>
`<TIME>7:30 AM</TIME>
`<PLACE>249 Cedarbrae</PLACE>
`<NOTES>Lots of high-quality junk for sale!</NOTES>
`</GARAGESALE>
`
`Instead of a complete URL, we have just referred to the DTD's file name.
`Actually, this is still a URL. It is a relative URL. That means that in
`a standard Web server setup, the XML document entity and its DTD entity
`reside in the same directory. You could also refer to a full URL as we
`did in Example 32-2.
`
`Example 32-5. Specifying a full URL
`<!DOCTYPE GARAGESALE SYSTEM
`"http://www.tradestuff.com/stuff.dtd">
`
`<GARAGESALE>
`
`</GARAGESALE>
`
`1. A garage sale is where North Americans spend their hard-earned money on
`other people's junk, which they will eventually sell at their own garage sales.
`
`
`The relative URL is more convenient while you are testing because you
`do not need to have a full server installed. You can just put the two entities
`in the same directory on your hard drive. But your DTD and your instance
`can get even more cozy than sharing a directory. You can hoist your DTD
`into the same entity as the instance:
`
`Example 32-6. Bringing a DTD into the instance
`<!DOCTYPE GARAGESALE [
`<!ELEMENT GARAGESALE (DATE, PLACE, NOTES)>
`<!ELEMENT DATE (#PCDATA)>
`<!ELEMENT TIME (#PCDATA)>
`<!ELEMENT PLACE (#PCDATA)>
`<!ELEMENT NOTES (#PCDATA)>
`]>
`<GARAGESALE>
`
`</GARAGESALE>
`
`The section between the square brackets is called the internal subset of
`the document type declaration. For testing, this is very convenient! You can
`edit the instance and the DTD without moving between entities. Since
`entities usually correspond to files, this means that instead of moving
`between two files, you need only edit one.
`Although this is convenient, it is not great for reuse. The DTD is not
`available anywhere but in this file. Other documents cannot conform to
`this DTD without copying the declarations into their internal subset.
`
`Often you will combine both approaches. Some of the DTD declarations
`can go in an external entity where they can be reused, and some of them
`can go in the same entity as the instance. Often graphic entities (see
`33.6, "Unparsed entities", on page 486) would be declared in the
`internal subset because they are specific to a document. On the other
`hand, element type declarations would usually be in the external
`subset, the external part of the document type declaration:
`
`Example 32-7. Reference to an external subset
`<!DOCTYPE GARAGESALE SYSTEM "garage.dtd" [
`<!ENTITY LOGO SYSTEM "logo.gif">
`]><GARAGESALE> ... </GARAGESALE>
`
`
`The declarations in the internal subset are processed before those in
`the external subset. This gives document authors the opportunity to
`override¹ some kinds of declarations in the shared portion of the DTD.
`Note that the content of both the internal subset and the external
`subset makes up the DTD. garage.dtd may have a .dtd extension but that
`is just a convention we chose to emphasize that the file contains DTD
`declarations. It is not necessarily the full set of them. The full set
`of DTD declarations is the combination of the declarations in the
`internal and external subsets.
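The reason processing order matters is that, for entities, the first declaration a processor encounters is the binding one. The sketch below illustrates only that rule, assuming Python's standard `xml.parsers.expat` module. To stay self-contained it places both declarations in a single internal subset rather than splitting them between an internal and an external subset, which is a simplification made here. The entity name `place` and the function name `character_data` are hypothetical.

```python
import xml.parsers.expat

# Two declarations of the same entity; only the first one should take effect.
DOC = """<!DOCTYPE GARAGESALE [
<!ENTITY place "249 Cedarbrae">
<!ENTITY place "somewhere else">
]>
<GARAGESALE><PLACE>&place;</PLACE></GARAGESALE>"""


def character_data(document):
    """Return all character data of the document, with entities expanded."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return "".join(chunks)


print(character_data(DOC))  # 249 Cedarbrae
```

Because the internal subset is read first, a declaration placed there plays the role of the first declaration above and preempts a later one in the shared external subset.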
`
`Caution Many people believe that the file containing the
`external subset is "the DTD". Until it is referenced from a
`document type declaration and combined with an internal subset
`(even an empty one) it is just a file that happens to have markup
`declarations in it. It is good practice, however, when an external
`subset is used, to restrict the internal subset to declarations that
`apply only to the individual document, such as entity declarations
`for graphics.
`
`It is often very convenient to point to a particular file and refer to it as
`"the DTD" for a given document type. As long as the concepts are straight
`in your mind, it does seem a trifle simpler than saying "the file that contains
`the markup declarations that I intend to reference as the external subset of
`the document type declaration for all documents of this type".
`
`1. Actually, preempt.
`
`32.3 | Element type declarations
`
`Elements are the foundation of XML markup. Every element in a valid
`XML document must conform to an element type declared in the DTD.
`Documents with elements that do not conform could be well-formed, but
`not valid. Here is an example of an element type declaration:
`
`Example 32-8. Element type declaration.
`<!ELEMENT memo (to, from, body)>
`
`Element type declarations must start with the string "<!ELEMENT",
`followed by the name (or generic identifier) of the element type being
`declared. Finally they must have a content specification. The content
`specification above states that elements of this type must contain a to
`element followed by a from element followed in turn by a body element.
`Here is the rule from the XML grammar:
`
`Spec. Reference 32-2. Element type declaration
`[45] elementdecl ::= '<!ELEMENT' S Name S contentspec S? '>'
`
`Element type names are XML names. That means there are certain
`restrictions on the characters allowed in them. These are described in
`31.1.4, "Names and name tokens", on page 428. Each element type
`declaration must use a different name because a particular element type
`cannot be declared more than once.
`
`Caution Unique element type declaration
`Unlike attributes and entities, element types can be declared only
`once.
`
`32.4 | Element type content specification
`
`Every element type has certain allowed content. For instance a document
`type definition might allow a chapter to have a title in its content, but
`would probably not allow a footnote to have a chapter in its content
`(though XML itself would not prohibit that!).
`
`There are four kinds of content specification. These are described in
`Table 32-1.
`
`
`Table 32-1 Content specification types
`
`Content specification type   Allowed content
`
`EMPTY content                May not have content. They are typically
`                             used for their attributes.
`ANY content                  May have any content at all.
`Mixed content                May have character data or a mix of
`                             character data and sub-elements specified
`                             in mixed content specification.
`Element content              May have only sub-elements specified in
`                             element content specification.
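The four kinds of content specification in the table can be told apart mechanically. The following is a sketch, assuming Python's standard `xml.parsers.expat` module, whose element declaration callback receives the content model as a nested tuple whose first item is a type constant from `xml.parsers.expat.model`. The function name `content_kinds` and the wrapper root element `doc` are illustrative assumptions.

```python
import xml.parsers.expat

# Map expat's content-type constants to the names used in the table.
KIND_NAMES = {
    xml.parsers.expat.model.XML_CTYPE_EMPTY: "EMPTY",
    xml.parsers.expat.model.XML_CTYPE_ANY: "ANY",
    xml.parsers.expat.model.XML_CTYPE_MIXED: "mixed",
}


def content_kinds(declarations):
    """Map each declared element type name to its kind of content spec."""
    kinds = {}
    parser = xml.parsers.expat.ParserCreate()

    def on_element_decl(name, model):
        # Anything that is not EMPTY, ANY or mixed is element content.
        kinds[name] = KIND_NAMES.get(model[0], "element")

    parser.ElementDeclHandler = on_element_decl
    # Wrap the declarations in a throwaway document so they can be parsed.
    parser.Parse("<!DOCTYPE doc [" + declarations + "]><doc/>", True)
    return kinds


DECLARATIONS = """
<!ELEMENT MY-EMPTY-ELEMENT EMPTY>
<!ELEMENT LOOSEY-GOOSEY ANY>
<!ELEMENT emph (#PCDATA)>
<!ELEMENT memo (to, from, body)>
"""

kinds = content_kinds(DECLARATIONS)
print(kinds)
```

Note that `emph` is reported as mixed even though it allows character data only, which matches the XML terminology.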
`
`32.4.1 Empty content
`
`Sometimes we want an element type that can never have any content. We
`would give it a content specification of EMPTY. For instance an image
`element type like HTML's img would include a graphic from somewhere
`else. It would do this through an attribute and would not need any
`sub-elements or character data content. A cross-reference element type
`might not need content because the text for the reference might be
`generated from the target. A reference to an element type with the
`title "More about XML" might become "See More about XML on page 14".
`You can declare an element type to have empty content by using the
`EMPTY keyword as the content specification:
`
`Example 32-9. Empty element type
`<!ELEMENT MY-EMPTY-ELEMENT EMPTY>
`
`32.4.2 ANY content
`
`Occasionally, you want an element type to be able to hold any element
`or character data. You can do this if you give it a content spec of
`ANY:
`
`Example 32-10. Element type with ANY content.
`<!ELEMENT LOOSEY-GOOSEY ANY>
`
`This is rarely done. Typically we introduce element type declarations
`to express the structure of our document types. An element type that
`has an ANY content specification is completely unstructured. It can
`contain any combination of character data and sub-elements. Still, ANY
`content element types are occasionally useful, especially while a DTD
`is being developed. If you are developing a DTD for existing documents,
`then you could declare each element type to have ANY content to get the
`document to validate. Then you could try to figure out more precise
`content specifications for each element type, one at a time.
`
`32.4.3 Mixed content
`
`Element types with mixed content are allowed to hold either character
`data alone or character data with child elements interspersed. A
`paragraph is a good example of a typical mixed content element. It
`might have character data with some mixed in emphasis and quotation
`sub-elements. The simplest mixed content specifications allow data only
`and start with a left parenthesis character ("("), followed by the
`string #PCDATA and a final close parenthesis (")"):
`
`Example 32-11. Data-only mixed content.
`<!ELEMENT emph (#PCDATA)>
`<!ELEMENT foreign-language ( #PCDATA ) >
`
`You may put white space between the parenthesis and the string #PCDATA
`if you like. The declarations above create element types that cannot
`contain sub-elements. Sub-elements that are detected will be reported
`as validity errors.
`In other words, these elements do not really have "mixed" content in
`the usual sense. Like the word "valid", XML has a particular meaning
`for the word that is not very intuitive. Any content specification that
`contains #PCDATA is called mixed, whether sub-elements are allowed or
`not.
`We can easily extend the DTD to allow a mix of elements and character
`data:
`
`
`Example 32-12. Allow a mix of character data and elements
`<!ELEMENT paragraph (#PCDATA|emph)*>
`<!ELEMENT abstract (#PCDATA|emph|quot)*>
`<!ELEMENT title ( #PCDATA | foreign-language | emph )* >
`
`Note the trailing asterisks. They are required in content
`specifications that allow a mix of character data and elements. The
`reason that they are there will be clear when we study content models.
`Note also that we can put white space before and after the vertical bar
`("|") characters.
`These declarations create element types that allow a mix of character
`data and sub-elements. The element types listed after the vertical bars
`("|") are the allowed sub-elements. The following would be a valid
`title if we combine the declarations in Example 32-12 with those in
`Example 32-11:
`<title>this is a <foreign-language>tres gros</foreign-language>
`title for an <emph>XML</emph> book</title>
`The title has character data ("This is a"), a foreign-language
`sub-element, some more character data ("title for an"), an emph
`sub-element and some final character data ("book"). We could have
`reordered the emph and foreign-language elements and the character data
`however we wanted. We could also have introduced as many (or as few)
`emph and foreign-language elements as we needed.
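Because a mixed content specification constrains only which sub-element types may appear, not their order or number, checking it amounts to a set membership test. Here is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module; the function name `disallowed_children` is hypothetical, and the set of allowed names is copied by hand from the title declaration rather than read from a DTD. Only the direct children of one element are checked.

```python
import xml.etree.ElementTree as ET

# Allowed sub-elements, from ( #PCDATA | foreign-language | emph )*
TITLE_ALLOWED = {"foreign-language", "emph"}


def disallowed_children(element, allowed):
    """Return the tags of direct children that the mixed spec does not list."""
    return [child.tag for child in element if child.tag not in allowed]


good_title = ET.fromstring(
    "<title>This is a <foreign-language>tres gros</foreign-language> "
    "title for an <emph>XML</emph> book</title>"
)
bad_title = ET.fromstring("<title>A <quot>quoted</quot> title</title>")

print(disallowed_children(good_title, TITLE_ALLOWED))  # []
print(disallowed_children(bad_title, TITLE_ALLOWED))   # ['quot']
```

Character data needs no check at all here, since #PCDATA is always permitted in mixed content.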
`
`32.5 | Content models
`
`The final kind of content specification is a "children" specification.
`This type of specification says that elements of the type can contain
`only child elements in their content. You declare an element type as
`having element content by specifying a content model instead of a
`mixed content specification or one of the keywords described above.
`A content model is a pattern that you set up to declare what
`sub-element types are allowed and in what order they are allowed. A
`simple model for a memo might say that it must contain a from followed
`by a to followed by a subject followed by a paragraph. A more complex
`model for a question-and-answer might require question and answer
`elements to alternate. A model for a chapter might require a single
`title element, one or two author elements and one or more paragraphs.
`When a document is validated, the processor would check that the
`element's content matches the model.
`A simple content model could have a single sub-element type:
`<!ELEMENT WARNING (PARAGRAPH)>
`This says that a WARNING must have a single PARAGRAPH within it. As
`with mixed content specifications, you may place white space before or
`after the parentheses. We could also say that a WARNING must have a
`TITLE and then a PARAGRAPH within it:
`<!ELEMENT WARNING (TITLE, PARAGRAPH)>
`The comma (",") between the "TITLE" and "PARAGRAPH" GIs indicates that
`the "TITLE" must precede the "PARAGRAPH" in the "WARNING" element. This
`is called a sequence. Sequences can be as long as you like:
`<!ELEMENT MEMO (FROM, TO, SUBJECT, BODY)>
`You may put white space before or after the comma (",") between two
`parts of the sequence.
`Sometimes you want to have a choice rather than a sequence. For
`instance a document type might be designed such that a FIGURE could
`contain either a GRAPHIC element (inserting an external graphic) or a
`CODE element (inserting some computer code).
`<!ELEMENT FIGURE (GRAPHIC|CODE)>
`The vertical bar character ("|") indicates that the author can choose
`between the elements. You can put white space before or after the
`vertical bar. You may have as many choices as you want:
`<!ELEMENT FIGURE (CODE|TABLE|FLOW-CHART|SCREEN-SHOT)>
`You may also combine choices and sequences using parentheses. When you
`wrap parentheses around a choice or sequence, it becomes a content
`particle. Individual GIs are also content particles. You can use any
`content particle wherever you would use a GI in a content model:
`<!ELEMENT FIGURE (CAPTION,
`                  (CODE|TABLE|FLOW-CHART|SCREEN-SHOT))>
`<!ELEMENT CREATED ((AUTHOR | CO-AUTHORS), DATE)>
`The content model for FIGURE is thus made up of a sequence of two
`content particles. The first content particle is a single element type
`name. The second is a choice of several element type names. You can
`break down the content model for CREATED in the same way.
`You can make some fairly complex models this way. But when you write a
`DTD for a book, you do not know in advance how many chapters the book
`will have, nor how many paragraphs each chapter will contain. You need
`a way of saying that the part of the content specification that allows
`captions is repeatable: that you can match it many times.
`
`
`Sometimes you will also want to make an element optional. For instance,
`some figures may not have captions. You may want to say that part of
`the specification for figures is optional.
`XML allows you to specify that a content particle is optional or
`repeatable using an occurrence indicator. There are three occurrence
`indicators:
`
`Table 32-2 Occurrence indicators
`
`Indicator   Content particle is...
`
`?           Optional (0 or 1 time).
`*           Optional and repeatable (0 or more times)
`+           Required and repeatable (1 or more times)
`
`Occurrence indicators directly follow a GI, sequence or choice. The
`occurrence indicator cannot be preceded by white space.
`For instance we can make captions optional on figures:
`<!ELEMENT FIGURE (CAPTION?,
`                  (CODE|TABLE|FLOW-CHART|SCREEN-SHOT))>
`We can allow footnotes to have multiple paragraphs:
`<!ELEMENT FOOTNOTE (P+)>
`Because we used the "+" indicator, footnotes must have at least one
`paragraph. We could also have expressed this in another way:
`<!ELEMENT FOOTNOTE (P, P*)>
`This would require a leading paragraph and then 0 or more paragraphs
`following. That would achieve the same effect as requiring 1 or more
`paragraphs. The "+" operator is just a little more convenient than
`repeating the preceding content particle.
`We can combine occurrence indicators with sequences or choices:
`We can combine occurrence indicators with sequences or choices:
`<!ELEMENT QUESTION-AND-ANSWER (INTRODUCTION,
`(QUESTION, ANSWER)+,
`COPYRIGHT?)>
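Sequences, choices and occurrence indicators behave much like the operators of a regular expression, and that correspondence can be used to check an element's children against a content model. The sketch below is one way to do it using Python's standard `re` module; it is not how any particular XML processor works. The function names `model_to_regex` and `matches` are hypothetical, and the sketch handles element content models only (no #PCDATA, EMPTY or ANY).

```python
import re

# A token is an element type name or one punctuation mark of a content model.
TOKEN = re.compile(r"\s*([A-Za-z_:][A-Za-z0-9_.:-]*|[(),|?*+])")


def model_to_regex(model):
    """Translate an element content model into a compiled regular expression."""
    parts = []
    text = model.strip()
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("unexpected character at offset %d" % pos)
        token = match.group(1)
        pos = match.end()
        if token == "(":
            parts.append("(?:")      # group without capturing
        elif token == ",":
            continue                 # sequence is plain concatenation
        elif token in (")", "|", "?", "*", "+"):
            parts.append(token)      # same meaning in both notations
        else:
            # Each child name is matched together with a trailing space.
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def matches(model, children):
    """True if the list of child element names satisfies the content model."""
    subject = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(subject) is not None


FIGURE = "(CAPTION?, (CODE|TABLE|FLOW-CHART|SCREEN-SHOT))"
QA = "(INTRODUCTION, (QUESTION, ANSWER)+, COPYRIGHT?)"

print(matches(FIGURE, ["TABLE"]))                               # True
print(matches(FIGURE, ["CAPTION"]))                             # False
print(matches(QA, ["INTRODUCTION", "QUESTION", "ANSWER"]))      # True
print(matches("(P+)", ["P", "P"]), matches("(P, P*)", ["P", "P"]))  # True True
```

The last line illustrates the equivalence of `(P+)` and `(P, P*)` for one input; both also reject an empty list of children.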
`It is also possible to make all of the element types in a content model
`optional:
`<!ELEMENT IMAGE (CAPTION?)>
`This allows the IMAGE element to be empty sometimes and not other
`times. The question mark indicates that CAPTION is optional. Most
`likely these IMAGE elements would link to an external graphic through
`an attribute. The author would only provide content if he wanted to
`provide a caption.
`In the document instance, empty IMAGE elements look identical to how
`they would look if IMAGE had been declared to be always empty. There is
`no way to tell from the document instance whether they were declared as
`empty or are merely empty in a particular case.
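A small sketch of that last point, assuming Python's standard `xml.etree.ElementTree` module: once parsed, an IMAGE written as a start-tag and end-tag pair and one written as an empty-element tag are indistinguishable, and neither carries any trace of how IMAGE was declared. The attribute name `SRC`, the file name and the helper `is_empty` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def is_empty(element):
    """True if the element has no child elements and no character data."""
    return len(element) == 0 and not (element.text or "")


with_tag_pair = ET.fromstring('<IMAGE SRC="logo.gif"></IMAGE>')
with_empty_tag = ET.fromstring('<IMAGE SRC="logo.gif"/>')
with_caption = ET.fromstring(
    '<IMAGE SRC="logo.gif"><CAPTION>Our logo</CAPTION></IMAGE>'
)

print(is_empty(with_tag_pair), is_empty(with_empty_tag))  # True True
print(is_empty(with_caption))                             # False
```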
`
`32.6 | Attributes
`
`Attributes allow an author to attach extra information to the elements
`in a document. For instance a code element for computer code might have
`a lang attribute declaring the language that the code is in. On the
`other hand, you could also use a lang sub-element for the same purpose.
`It is the DTD designer's responsibility to choose a way and embody that
`in the DTD. Attributes have strengths and weaknesses that differentiate
`them from sub-elements so you can usually make the decision without too
`much difficulty.
`The largest difference between elements and attributes is that
`attributes cannot contain elements and there is no such thing as a
`"sub-attribute". Attributes are always either text strings with no
`explicit structure (at least as far as XML is concerned) or simple
`lists of strings. That means that a chapter should not be an attribute
`of a book element, because there would be no place to put the titles
`and paragraphs of the chapter. You will typically use attributes for
`small, simple, unstructured "extra" information.
`Another important difference between elements and attributes is that
`each of an element's attributes may be specified only once, and they
`may be specified in any order. This is often convenient because
`memorizing the order of things can be difficult. Elements, on the other
`hand, must occur in the order specified and may occur as many times as
`the DTD allows. Thus you must use elements for things that must be
`repeated, or must follow a certain pattern or order that you want the
`XML parser to enforce.
`These technical concerns are often enough to make the decision for you.
`But if everything else is equal, there are some usability
`considerations that can help. One rule of thumb that some people use
`(with neither perfect success nor constant abject failure) is that
`elements usually represent data that is the natural content that should
`appear in every print-out or other rendition. Most formatting systems
`print out elements by default and do not print out attributes unless
`you specifically ask for them. Attributes represent data that is of
`secondary importance and is often information about the information
`("metainformation").
`Also, attribute names usually represent properties of objects, but
`element-type names usually represent parts of objects. So given a
`person element, sub-elements might represent parts of the body and
`attributes might represent properties like weight, height, and
`accumulated karma points.
`We would advise you not to spend too much of your life trying to figure
`out exactly what qualifies as a part and what qualifies as a property.
`Experience shows that the question "what is a property?" ranks with
`"what is the good life?" and "what is art?". The technical concerns are
`usually a good indicator of the philosophical category in any event.
`
`32.6.1 Attribute-list declarations
`
`Attributes are declared for specific element types. You declare
`attributes for a particular element type using an attribute-list
`declaration. You will often see an attribute-list declaration right
`beside an element type declaration:
`<!ELEMENT PERSON (#PCDATA)>
`<!ATTLIST PERSON EMAIL CDATA #REQUIRED>
`Attribute declarations start with the string "<!ATTLIST". Immediately
`after the white space comes an element type's generic identifier. After
`that comes the attribute's name, its type and its default. In the
`example above, the attribute is named EMAIL and is valid on PERSON
`elements. Its value must be character data and it is required: there is
`no default and the author must supply a value for the attribute on
`every PERSON element.
`
`Spec. Reference 32-3. Attribute-list declarations
`[52] AttlistDecl ::= '<!ATTLIST' S Name AttDef* S? '>'
`[53] AttDef      ::= S Name S AttType S DefaultDecl
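The parts of an attribute definition (element type, attribute name, type and default) can be read back from a declaration. This is a sketch only, assuming Python's standard `xml.parsers.expat` module, whose attribute-list callback receives exactly those parts plus a required flag. The function name `declared_attributes` and the e-mail address are made up for illustration.

```python
import xml.parsers.expat

PERSON_DOC = """<!DOCTYPE PERSON [
<!ELEMENT PERSON (#PCDATA)>
<!ATTLIST PERSON EMAIL CDATA #REQUIRED>
]>
<PERSON EMAIL="robyn@example.com">Rock N. Robyn</PERSON>"""


def declared_attributes(document):
    """Return (element, attribute, type, default, required) per declaration."""
    found = []
    parser = xml.parsers.expat.ParserCreate()

    def on_attlist(element_name, attribute_name, attribute_type,
                   default, required):
        found.append((element_name, attribute_name, attribute_type,
                      default, bool(required)))

    parser.AttlistDeclHandler = on_attlist
    parser.Parse(document, True)
    return found


print(declared_attributes(PERSON_DOC))
```

For the PERSON declaration this reports a CDATA attribute named EMAIL with no default and the required flag set.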
`
`You can declare many attributes in a single attribute-list declaration.¹
`You can also have multiple attribute-list declarations for a single element
`type:
`
`1. That's why it is called a list!
`
`
`Example 32-13. Declaring multiple attributes
`<!ATTLIST PERSON EMAIL CDATA #REQUIRED
`PHONE CDATA #REQUIRED
`FAX CDATA #REQUIRED>
`
`Example 32-14. Multiple declarations for one element type
`<!ATTLIST PERSON HONORIFIC CDATA #REQUIRED>
`<!ATTLIST PERSON POSITION CDATA #REQUIRED
`ORGANIZATION CDATA #REQUIRED>
`
`This is equivalent to putting the declarations all together into a
`single attribute-list declaration.
`It is even possible to have multiple declarations for the same
`attribute of the same element type. When this occurs, the first
`declaration of the attribute is binding and the rest are ignored. This
`is analogous to the situation with entity declarations.
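The "first declaration is binding" rule can be observed through attribute defaults. The sketch below assumes Python's standard `xml.parsers.expat` module, which by default reports attributes derived from declarations alongside those written in the instance. The two competing defaults and the function name `root_attributes` are invented for the illustration.

```python
import xml.parsers.expat

# The same attribute is declared twice with different defaults.
SHIRT_DOC = """<!DOCTYPE SHIRT [
<!ATTLIST SHIRT SIZE (SMALL|MEDIUM|LARGE) "MEDIUM">
<!ATTLIST SHIRT SIZE (SMALL|MEDIUM|LARGE) "LARGE">
]>
<SHIRT/>"""


def root_attributes(document):
    """Return the attributes reported for the root element, defaults included."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(dict(attrs))
    parser.Parse(document, True)
    return seen[0]


print(root_attributes(SHIRT_DOC))  # {'SIZE': 'MEDIUM'}
```

The second declaration is silently ignored, so the default from the first one is the value supplied.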
`Note that two different element types can have attributes with the same
`name without there being a conflict. Despite the fact that these attributes
`have the same name, they are in fact different attributes. For instance a
`SHIRT element could have an attribute SIZE that exhibits values SMALL,
`MEDIUM and LARGE and a PANTS element in the same DTD could have an
`attribute also named SIZE that is a measurement in inches:
`<!-- These are -->
`<!ATTLIST SHIRT SIZE (SMALL|MEDIUM|LARGE) #REQUIRED>
`
`<!-- two different attributes -->
`<!ATTLIST PANTS SIZE NUMBER #REQUIRED>
`It is not good practice to allow attributes with the same name to have
`different semantics or allowed values in the same document. That can be
`quite confusing for authors.
`
`32.6.2 Attribute defaults
`
`Attributes can have default values. If the author does not specify an
`attribute value then the processor supplies the default value if it
`exists. A DTD designer can also choose not to supply a default.
`Specifying a default is simple. You merely include the default after
`the type or list of allowed values in the attribute list declaration:
`
`
`<!ATTLIST SHIRT SIZE (SMALL|MEDIUM|LARGE) "MEDIUM">
`<!ATTLIST SHOES SIZE NUMBER "13">
`Any value that meets the constraints of the attribute list declaration
`is legal as a default value. You could not, however, use "abc" as a
`default value for an attribute with declared type NUMBER any more than
`you could do so in a start-tag in the document instance.
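How a processor supplies a default can be sketched as follows, assuming Python's standard `xml.parsers.expat` module. Its `specified_attributes` setting switches between reporting all attributes, including defaulted ones, and reporting only those the author actually wrote. The wrapper element `WARDROBE` and the function name `shirt_sizes` are hypothetical.

```python
import xml.parsers.expat

WARDROBE_DOC = """<!DOCTYPE WARDROBE [
<!ATTLIST SHIRT SIZE (SMALL|MEDIUM|LARGE) "MEDIUM">
]>
<WARDROBE><SHIRT/><SHIRT SIZE="LARGE"/></WARDROBE>"""


def shirt_sizes(document, specified_only=False):
    """Return the SIZE value reported for each SHIRT element, in order."""
    sizes = []
    parser = xml.parsers.expat.ParserCreate()
    if specified_only:
        # Report only attributes written in the instance, not defaults.
        parser.specified_attributes = 1

    def on_start(name, attrs):
        if name == "SHIRT":
            sizes.append(attrs.get("SIZE"))

    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return sizes


print(shirt_sizes(WARDROBE_DOC))                       # ['MEDIUM', 'LARGE']
print(shirt_sizes(WARDROBE_DOC, specified_only=True))  # [None, 'LARGE']
```

The first SHIRT receives the declared default, while the second keeps the value its author specified.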
`Sometimes you want to allow the user to omit a value for a particular
`attribute without forcing a particular default. For instance you could have
`an element SHIRT which has a SIZE attribute with a declared type of
`NUMBER. But some shirts are "one size fits all". They do not have a size. You
`want the author to be able