XML Schema
Eric van der Vlist. 2002. Sebastopol, CA: O'Reilly. [ISBN 0-596-00252-1. 400 pages. $39.95 USD (softcover).]
The main purpose of XML Schema is to detail the structure of XML instance documents and the datatype of each element and attribute in an XML document.
What is an XML Schema? It's an XML vocabulary for expressing data business rules or constraints. Why do you need XML Schema? An XML Schema lets you verify that a well-formed XML instance document is also valid, which is what proper data interchange requires. Why do you need a book like XML Schema? This is a very nicely organized book that provides a good foundation for understanding the process of building your own XML Schema, conforming to W3C XML Schema standards, revising your own XML Schema, and integrating it into your applications. However, writing any XML Schema is a very detail-oriented process, and you must read almost the entire book to really comprehend the concepts and understand how the author refined his schema, so that you can apply similar practices in making your own schema.
A typical XML Schema consists of individual elements, sequences of elements, attributes, processing instructions, comments, and text. W3C XML Schema is a "Schema of schema" and provides the guidelines to define "simple type" elements and "complex type" elements. A complex type structure is formed by arranging a sequence of predefined elements. The book presents multiple examples of simple and complex type elements to help the reader understand the concepts. In the process, van der Vlist also advises you to identify elements and attributes by their datatype using the W3C XML Schema, rather than by a set of rules or patterns using nonstandard schema.
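For readers who have not yet seen the two kinds of type, the distinction can be sketched as follows. The schema below is a hypothetical example written for this illustration, not one taken from the book, and the Python code only parses and inspects it with the standard library; it does not validate anything.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema, invented for this illustration only.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="isbnType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{9}[0-9X]"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="bookType">
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="isbn" type="isbnType"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:ID"/>
  </xs:complexType>
  <xs:element name="book" type="bookType"/>
</xs:schema>
"""


def describe_types(schema_text):
    """Map each global type name to 'simple' or 'complex'."""
    root = ET.fromstring(schema_text)
    kinds = {}
    for child in root:
        if child.tag == f"{{{XS}}}simpleType":
            kinds[child.get("name")] = "simple"
        elif child.tag == f"{{{XS}}}complexType":
            kinds[child.get("name")] = "complex"
    return kinds


def child_elements(schema_text, type_name):
    """List the element names in a complex type's xs:sequence, in order."""
    root = ET.fromstring(schema_text)
    path = f"xs:complexType[@name='{type_name}']/xs:sequence/xs:element"
    return [e.get("name") for e in root.findall(path, {"xs": XS})]


if __name__ == "__main__":
    print(describe_types(SCHEMA))
    print(child_elements(SCHEMA, "bookType"))
```

The simple type constrains a text value, while the complex type arranges child elements in a sequence and carries an attribute.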
Chapter 2 presents the first complete schema, introducing the basic features of the XML language. This schema is based on simple elements and attributes and is used throughout the book. Chapter 3 gives more depth to the basic schema, introducing a completely different style called "Russian doll design." Chapter 4 explores how some of the predefined simple datatypes can be bound to the contents of XML documents.
Chapter 5 explains two methods to create your own new simple types in XML Schema. In the first method, simple types are based on different derivation mechanisms, which derive from common primitive types such as integers and strings. In the second method, simple types are derived from basic types while adding restrictions to them.
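Derivation by restriction can be illustrated with a small sketch. The `pageCount` type is hypothetical, and the Python check is only an approximation: `int()` stands in for the `xs:integer` base type, whose lexical rules are not identical, and only the two bounding facets shown are handled.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical simple type: an integer restricted to the range 1..5000.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="pageCount">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="5000"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""


def load_bounds(schema_text, type_name):
    """Read the minInclusive and maxInclusive facets of a restricted type."""
    root = ET.fromstring(schema_text)
    ns = {"xs": XS}
    restriction = root.find(
        f"xs:simpleType[@name='{type_name}']/xs:restriction", ns
    )
    low = restriction.find("xs:minInclusive", ns)
    high = restriction.find("xs:maxInclusive", ns)
    return int(low.get("value")), int(high.get("value"))


def accepts(schema_text, type_name, lexical_value):
    """Check a string against the base type first, then against the facets."""
    try:
        number = int(lexical_value)  # rough stand-in for xs:integer
    except ValueError:
        return False  # not a member of the base type at all
    low, high = load_bounds(schema_text, type_name)
    return low <= number <= high


if __name__ == "__main__":
    for candidate in ("400", "0", "5001", "abc"):
        print(candidate, accepts(SCHEMA, "pageCount", candidate))
```

The derived type accepts a subset of what its base type accepts, which is the essence of restriction.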
In Chapter 6, the author demonstrates how to create datatypes that can be applied to attributes or simple type elements. Chapter 7 focuses on creating complex datatypes and complex content models, which describe the structure of the XML markup, and explains global and local definitions for simple and complex datatypes. The author provides concise examples of deriving simple, complex, empty, and mixed content datatypes, by extension or by restriction, from the basic datatypes.
Through Chapter 7, van der Vlist explains the most commonly used techniques for creating various complex structures in your XML Schema. In Chapter 8, he demonstrates how to reuse these structures from one schema to another. Multiple schemas can be included and redefined to create schema libraries. This is a very informative chapter for all XML Schema designers and developers.
In Chapter 9, the author presents two features for declaring identifiers and the references that point to them. One feature directly emulates the ID, IDREF, and IDREFS attribute types from XML document type definitions. The other provides more flexibility by using path expressions to select the nodes involved. These expressions are written in a subset of XPath, which is another W3C specification.
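A rough illustration of the XPath-based feature: in W3C XML Schema, `xs:key` and `xs:keyref` each name a selector path (the nodes to check) and a field path (the value that identifies each node). The sketch below imitates that check in Python on a hypothetical document. The element and attribute names are assumptions, and ElementTree's path support is its own limited subset, not the subset that the schema language defines.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document: the second loan points at a missing book.
DOC = """\
<library>
  <book id="b1"><title>First</title></book>
  <book id="b2"><title>Second</title></book>
  <loan book="b1"/>
  <loan book="b9"/>
</library>
"""


def key_values(root, selector, field):
    """Collect attribute `field` from every node matched by `selector`."""
    return [node.get(field) for node in root.findall(selector)]


def check_key_and_keyref(doc_text):
    """Return (duplicate key values, references that match no key)."""
    root = ET.fromstring(doc_text)
    keys = key_values(root, "book", "id")      # like selector "book", field "@id"
    known = set(keys)
    duplicates = sorted({k for k in keys if keys.count(k) > 1})
    refs = key_values(root, "loan", "book")    # like selector "loan", field "@book"
    dangling = sorted(r for r in refs if r not in known)
    return duplicates, dangling


if __name__ == "__main__":
    print(check_key_and_keyref(DOC))
```

A schema-aware validator performs both checks itself; the point here is only to show what the selector and field paths select.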
In Chapter 10, van der Vlist explains how the W3C chose to implement namespace support in W3C XML Schema. The explanation is very good, but the subject is inherently difficult and remains confusing even after Chapter 11.
Chapter 12 is dedicated to creating more structures using object-oriented features. The examples show how to create substitution groups in the XML world, which are similar to creating subclasses in the object-oriented world. The abstract elements and datatypes in XML Schema are created just as abstract classes are created in the C++ language. Creating final datatypes in XML Schema is like creating final classes in Java programming.
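The subclass analogy can be made concrete with a short sketch. The schema is a hypothetical example, not one from the book: `publication` is an abstract head element, and `book` and `magazine` declare themselves substitutable for it. The Python code only reads those declarations with the standard library.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: one abstract head element and two substitutes,
# plus a simple type marked final so nothing may derive from it.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="publication" type="xs:string" abstract="true"/>
  <xs:element name="book" type="xs:string" substitutionGroup="publication"/>
  <xs:element name="magazine" type="xs:string" substitutionGroup="publication"/>
  <xs:simpleType name="isbnType" final="#all">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
</xs:schema>
"""


def substitution_members(schema_text):
    """Map each head element name to the elements that may replace it."""
    root = ET.fromstring(schema_text)
    groups = {}
    for element in root.findall("xs:element", {"xs": XS}):
        head = element.get("substitutionGroup")
        if head is not None:
            groups.setdefault(head, []).append(element.get("name"))
    return groups


def abstract_elements(schema_text):
    """List global elements that cannot appear in instance documents."""
    root = ET.fromstring(schema_text)
    return [
        e.get("name")
        for e in root.findall("xs:element", {"xs": XS})
        if e.get("abstract") == "true"
    ]


if __name__ == "__main__":
    print(substitution_members(SCHEMA))
    print(abstract_elements(SCHEMA))
```

Wherever a content model refers to `publication`, an instance document would have to use `book` or `magazine` instead, much as code written against an abstract class receives a concrete subclass.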
Chapter 13 provides a solid discussion of creating extensible schema, and Chapter 14 deals with documenting schema so that they are readable both directly and by documentation extraction tools. Documenting an XML Schema using the annotation element and other methods is a very important but mostly neglected aspect of XML Schema.
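As a minimal sketch of what a documentation extraction tool does, the code below pulls the text of each `xs:documentation` element out of a hypothetical schema and reports which elements have none. The schema and the function name are assumptions made for illustration; real tools do considerably more.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: one documented element and one undocumented element.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="title" type="xs:string">
    <xs:annotation>
      <xs:documentation xml:lang="en">The title of the book.</xs:documentation>
    </xs:annotation>
  </xs:element>
  <xs:element name="isbn" type="xs:string"/>
</xs:schema>
"""


def extract_docs(schema_text):
    """Map each global element name to its documentation text, or None."""
    root = ET.fromstring(schema_text)
    ns = {"xs": XS}
    docs = {}
    for element in root.findall("xs:element", ns):
        note = element.find("xs:annotation/xs:documentation", ns)
        has_text = note is not None and note.text
        docs[element.get("name")] = note.text.strip() if has_text else None
    return docs


if __name__ == "__main__":
    for name, text in extract_docs(SCHEMA).items():
        print(name, "->", text or "(undocumented)")
```

Because annotations are ordinary XML inside the schema, any XML tool can read them, which is what makes this style of documentation usable by both people and programs.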
Chapter 15 provides a very useful and concise reference to all of the elements that the W3C XML Schema uses to define components of XML Schema. Each element is described with syntax, description, restrictions, attributes, and an example.
Appendix A presents a very brief description of various XML Schema Languages and one example of each schema language.
Overall, this book is very useful to XML Schema designers and developers. To get the most from the book, walk through the example the author builds and, at the same time, build your own XML Schema library by reusing and redefining your own XML Schema with the techniques described. Ultimately, you should be able to use all components of the XML Schema library, or at least some of them, in your own applications.
By the way, if your schema has evolved using techniques that van der Vlist does not cover, you are almost on your own. This book won't provide much in the way of guidelines for validating the complexities built into your own schema. However, rest assured that many commercial packages will check the validity and well-formedness of your XML documents against your schema as you progress in designing and refining your own complex schema, provided it is based on the W3C XML Schema Recommendation. Therefore, you don't have to spend time trying to understand how van der Vlist's schema gets refined. Even so, this book provides a great path and excellent guidelines for learning and refining your schema conforming to W3C standards and recommendations.
VIVEK VAISHAMPAYAN is an information technology analyst experienced in designing, developing, and testing computer systems. He has more than 15 years' experience in the industry and has taught courses in HTML, XML, and JavaScript. He has reviewed books on quality assurance and software development.