diff --git a/doc/faq-parse.xml b/doc/faq-parse.xml index 3fab629fa6e09ca6b7939a538425973224f92206..049f0bfff03fd35d65b3d6951326947d0fe1ab40 100644 --- a/doc/faq-parse.xml +++ b/doc/faq-parse.xml @@ -9,7 +9,13 @@ <a> - <p>See <jump href="schema.html">the Schema page. </jump></p> + <p>The &XercesCName; &XercesCVersion; contains an implementation + of a subset of the W3C XML Schema Language as specified + in the 2 May 2001 Recommendation for <jump + href="http://www.w3.org/TR/xmlschema-1/">Structures</jump> + and <jump href="http://www.w3.org/TR/xmlschema-2/"> + Datatypes</jump>. See <jump href="schema.html">the Schema + page</jump> for details.</p> </a> </faq> @@ -20,8 +26,14 @@ <a> - <p> See <jump href="schema.html#supportedFeature"> - supported schema features in &XercesCName; &XercesCVersion;</jump></p> + <p>The &XercesCName; &XercesCVersion; contains an implementation + of a subset of the W3C XML Schema Language as specified + in the 2 May 2001 Recommendation for <jump + href="http://www.w3.org/TR/xmlschema-1/">Structures</jump> + and <jump href="http://www.w3.org/TR/xmlschema-2/"> + Datatypes</jump>. You should not consider this implementation + complete or correct. Please refer to <jump href="schema.html#limitation"> + the Schema Limitations </jump>for further details.</p> </a> </faq> diff --git a/doc/migration.xml b/doc/migration.xml index 412ad65adba46d3731939a42056f946f32f9f064..27ca9ed2d4457aef4c1823cbfd91061dc91fa016 100644 --- a/doc/migration.xml +++ b/doc/migration.xml @@ -38,8 +38,8 @@ <s3 title="Compliance"> <p>Except for a couple of the very obscure (mostly related to the 'standalone' mode), this version should be quite compliant - to <jump href="http://www.w3.org/TR/REC-xml">XML 1.0</jump>. It also - tracks the latest changes to DOM, SAX and Namespace Specification. + to <jump href="http://www.w3.org/TR/REC-xml">XML 1.0</jump>. It also + tracks the latest changes to DOM, SAX and Namespace Specification. 
We have more than a thousand tests, some collected from various public sources and some IBM generated, which are used to do regression testing. The C++ parser is now passing all but a @@ -68,60 +68,53 @@ <s2 title="Changes required to migrate to &XercesCName; &XercesCVersion;"> <p>There are some architectural changes between the &XercesCName; - &XercesPreCVersion; and the &XercesCName; &XercesCVersion; releases - of the parser, and as a result, some code has undergone restructuring - as shown below. </p> + &XercesPreCVersion; and the &XercesCName; &XercesCVersion; releases + of the parser, and as a result, some code has undergone restructuring + as shown below. </p> <anchor name="Reorganization"/> <s3 title="Validator directory Reorganization"> <ul> <li>common content model files such as DFAContentModel ... - are moved to a new directory called src/validators/common</li> + are moved to a new directory called src/validators/common</li> <li>DTD related files are moved to a new directory called src/validators/DTD</li> <li>new directory src/validators/Datatype is created to store all datatype validators</li> <li>new directory src/validators/schema is created to store Schema related files</li> - </ul> + </ul> </s3> <anchor name="DTDValidator"/> <s3 title="DTDValidator"> - <p> DTDValidator was design to scan, validate and store the DTD in &XercesCName; &XercesPreCVersion; - or earlier. + <p> DTDValidator was designed to scan, validate and store the DTD in &XercesCName; &XercesPreCVersion; + or earlier.
In &XercesCName; &XercesCVersion;, this process is broken down into three components: </p> <ul> <li>new class DTDScanner - to scan the DTD</li> <li>new class DTDGrammar - to store the DTD Grammar</li> <li>DTDValidator - to validate the DTD only</li> - </ul> + </ul> </s3> </s2> <anchor name="NewFeatures"/> <s2 title="New features in &XercesCName; &XercesCVersion;"> - <p> Schema subset support is provided in this release. See - <jump href="schema.html#supportedFeature"> - supported schema features in &XercesCName; &XercesCVersion;.</jump>. An - experiemental IDOM is also available as well. - </p> - + <p>Schema subset support and an experimental IDOM are available + in this release. + </p> <anchor name="Schema"/> <s3 title="Schema Subset Support"> <ul> - <li>Schema Subset support is added</li> - <ul> - <li>New function "setDoSchema" is added to DOM/SAX parser.</li> - <li>New feature "http://apache.org/xml/features/validation/schema" is recognized by SAX2XMLReader.</li> - <li>New classes such as SchemaValidator, TraverseSchema ... are added.</li> - <li>The Scanner is enhanced to process schema.</li> - </ul> + <li>New function "setDoSchema" is added to DOM/SAX parser.</li> + <li>New feature "http://apache.org/xml/features/validation/schema" is recognized by SAX2XMLReader.</li> + <li>New classes such as SchemaValidator, TraverseSchema ... are added.</li> + <li>The Scanner is enhanced to process schema.</li> <li>New sample data files personal-schema.xml and personal.xsd.</li> <li>New command line option "-s" for samples.</li> - </ul> - <p> - See <jump href="schema.html#usage"> - Schema Usage</jump> - </p> + </ul> + <p> + See <jump href="schema.html">the Schema page</jump> for details. + </p> </s3> <anchor name="IDOM"/> @@ -138,8 +131,8 @@ <s2 title="Migration Archive"> - <p> For migration information from XML4C 2.x to &XercesCName; &XercesPreCVersion;, - please refer to <jump href="migrate_archive.html">Migration Archive. 
</jump></p> + <p>For migration information from XML4C 2.x to &XercesCName; &XercesPreCVersion;, + please refer to <jump href="migrate_archive.html">Migration Archive. </jump></p> </s2> diff --git a/doc/program.xml b/doc/program.xml index 133aacb818da4bed0ce4bcb3b300fe442cd8aab2..7d489c17fc03c12c48a141f0b0742e7d67b1d05e 100644 --- a/doc/program.xml +++ b/doc/program.xml @@ -38,7 +38,12 @@ <ul> <li><link anchor="Motivation">Motivation behind new design</link></li> <li><link anchor="IDOMClassNames">Class Names</link></li> - <li><link anchor="IDOMObjMemMgmt">Objects and Memory Management</link></li> + <li><link anchor="IDOMObjMgmt">Objects Management</link></li> + <li><link anchor="IDOMMemMgmt">Memory Management</link></li> + <ul> + <li><link anchor="IDOMMemImplicit">Implicit Object Deletion</link></li> + <li><link anchor="IDOMMemExplicit">Explicit Object Deletion</link></li> + </ul> <li><link anchor="DOMStringXMCh">DOMString vs. XMLCh</link></li> </ul> </ul> @@ -722,14 +727,11 @@ DOM_Text someText; </source> </s3> - <anchor name="IDOMObjMemMgmt"/> - <s3 title="Objects and Memory Management"> - <p>The C++ IDOM implementation no longer uses reference counting for - automatic memory management. The storage for a DOM document is - associated with the document node object. Applications would use - normal C++ pointers to directly access the implementation objects - for Nodes in IDOM C++, while they would use object references in - DOM C++. + <anchor name="IDOMObjMgmt"/> + <s3 title="Objects Management"> + <p>Applications would use normal C++ pointers to directly access the + implementation objects for Nodes in IDOM C++, while they would use + object references in DOM C++. </p> <p>Consider the following code snippets</p> @@ -752,9 +754,15 @@ aNode = someDocument.createElement("ElementName"); docRootNode = someDocument.getDocumentElement(); docRootNode.appendChild(aNode); </source> + </s3> - <p>The IDOM C++ uses an independent storage allocator per document. 
+ <anchor name="IDOMMemMgmt"/> + <s3 title="Memory Management"> + <p>The C++ IDOM implementation no longer uses reference counting for + automatic memory management. The C++ IDOM uses an independent storage + allocator per document. The storage for a DOM document is + associated with the document node object. The advantage here is that allocation would require no synchronization in most cases (based on the the same threading model that we have now - one thread active per document, but any number of @@ -776,35 +784,192 @@ docRootNode.appendChild(aNode); memory is automatically taken care of by the reference counting. </p> - <p>In C++ IDOM, there is an implict and explict object deletion. When parsing - a document using an IDOMParser, the storage allocated will be automatically - deleted when the parser instance is deleted (implicit). If a user is - manually building a DOM tree in memory using the document factory methods, - then the user needs to explicilty delete the document object to free all - allocated memory. + <p>In C++ IDOM, there are two kinds of object deletion: implicit and explicit. + </p> + </s3> + + <anchor name="IDOMMemImplicit"/> + <s3 title="Implicit Object Deletion"> + <p>When parsing a document using an IDOMParser, all memory allocated + for a DOM tree is associated with the DOM document, and this storage + is automatically deleted when the parser instance is deleted (implicit). + </p> + <p>If you parse multiple times using the same IDOMParser instance, then + multiple DOM documents will be generated and saved in a vector pool. + All these documents (and thus all the allocated memory) won't be deleted + until the parser instance is destroyed. If you want to release the memory + back to the system but don't want to destroy the IDOMParser instance yet, + you can call the method IDOMParser::resetDocumentPool to reset the document + vector pool, provided that you no longer need access to these documents.
+ </p> + + <p>Consider the following code snippets: </p> + + <source> + // C++ IDOM - implicit deletion + IDOMParser* parser = new IDOMParser(); + parser->parse(xmlFile); + IDOM_Document *doc = parser->getDocument(); + + unsigned int i = 1000; + while (i > 0) { + parser->parse(xmlFile); + IDOM_Document* myDoc = parser->getDocument(); + i--; + } + + // all allocated memory associated with these 1001 DOM documents + // will be deleted implicitly when the parser instance is destroyed + delete parser; + </source> + + <source> + // C++ IDOM - implicit deletion + // optionally release the memory + IDOMParser* parser = new IDOMParser(); + unsigned int i = 1000; + while (i > 0) { + parser->parse(xmlFile); + IDOM_Document *doc = parser->getDocument(); + i--; + } + + // instead of waiting until the parser instance is destroyed, + // the user can optionally release the memory back to the system + // if these 1000 parsed documents are no longer needed. + parser->resetDocumentPool(); + + // now the parser has some fresh memory to work on for the following + // big loop + i = 1000; + while (i > 0) { + parser->parse(xmlFile); + IDOM_Document *doc = parser->getDocument(); + i--; + } + delete parser; + + </source> + </s3> + + <anchor name="IDOMMemExplicit"/> + <s3 title="Explicit Object Deletion"> + <p>If a user is manually building a DOM tree in memory using the document factory methods, + then the user needs to explicitly delete the document object to free all the allocated memory. + This normally falls under one of the following three scenarios: </p> + <ul> + <li>If a user is manually creating a DOM document using the document implementation + factory methods, IDOM_DOMImplementation::getImplementation()->createDocument, + then the user needs to explicitly delete the document object to free all + allocated memory.
</li> + <li>If a user is creating a DocumentType object using the document implementation factory + method, IDOM_DOMImplementation::getImplementation()->createDocumentType, then + the user also needs to explicitly delete the document type object to free the + allocated memory.</li> + <li>Special case: If a user is creating a DocumentType using the document + implementation factory method, and clones the node WITHOUT assigning a document + owner to that DocumentType object, then the cloned node also needs to be explicitly + deleted.</li> + </ul> <p>Consider the following code snippets: </p> <source> // C++ IDOM - explicit deletion +// use the document implementation factory method to create a document type and a document +IDOM_DocumentType* myDocType; +IDOM_Document* myDocument; +IDOM_Node* root; +IDOM_Node* aNode; + +myDocType = IDOM_DOMImplementation::getImplementation()->createDocumentType(name, 0, 0); +myDocument = IDOM_DOMImplementation::getImplementation()->createDocument(0, name, myDocType); +root = myDocument->getDocumentElement(); +aNode = myDocument->createElement(anElementname); + +root->appendChild(aNode); + +// need to delete both myDocType and myDocument which are created through DOM Implementation +delete myDocType; +delete myDocument; + </source> + + <source> +// C++ IDOM - explicit deletion +// use the document implementation factory method to create a document +IDOM_DocumentType* myDocType; IDOM_Document* myDocument; +IDOM_Node* root; IDOM_Node* aNode; + myDocument = IDOM_DOMImplementation::getImplementation()->createDocument(); -aNode = myDocument->createElement("ElementName"); -myDocument->appendChild(aNode); +myDocType = myDocument->createDocumentType(name); +root = myDocument->createElement(name); +aNode = myDocument->createElement(anElementname); + +myDocument->appendChild(myDocType); +myDocument->appendChild(root); +root->appendChild(aNode); + +// the myDocType is created through myDocument, not through Document Implementation +// thus no need
to delete myDocType delete myDocument; </source> <source> -// C++ DOM - implicit deletion -IDOM_Document myDocument; -DOM_Node aNode; -myDocument = DOM_DOMImplementation::getImplementation().createDocument(); -aNode = myDocument.createElement("ElementName"); -myDocument.appendChild(aNode); +// C++ IDOM - explicit deletion +// manually build a DOM document +// clone the document type object which does not have an owner yet +IDOM_DocumentType* myDocType1; +IDOM_DocumentType* myDocType; +IDOM_Document* myDocument; +IDOM_Node* root; +IDOM_Node* aNode; + +myDocType = IDOM_DOMImplementation::getImplementation()->createDocumentType(name, 0, 0); +myDocType1 = (IDOM_DocumentType*) myDocType->cloneNode(false); +myDocument = IDOM_DOMImplementation::getImplementation()->createDocument(0, name, myDocType); + +root = myDocument->getDocumentElement(); +aNode = myDocument->createElement(anElementname); + +root->appendChild(aNode); + +// myDocType does not have an owner yet when myDocType1 was cloned. +// thus need to explicitly delete myDocType1 +delete myDocType1; +delete myDocType; +delete myDocument; </source> + <source> +// C++ IDOM - explicit deletion +// manually build a DOM document +// clone the document type object that has an owner already +// thus no need to delete the cloned object +IDOM_DocumentType* myDocType1; +IDOM_DocumentType* myDocType; +IDOM_Document* myDocument; +IDOM_Node* root; +IDOM_Node* aNode; + +myDocType = IDOM_DOMImplementation::getImplementation()->createDocumentType(name, 0, 0); +myDocument = IDOM_DOMImplementation::getImplementation()->createDocument(0, name, myDocType); +myDocType1 = (IDOM_DocumentType*) myDocType->cloneNode(false); + +root = myDocument->getDocumentElement(); +aNode = myDocument->createElement(anElementname); + +root->appendChild(aNode); + +// myDocType already has myDocument as the owner when myDocType1 was cloned +// thus NO need to explicitly delete myDocType1 +delete myDocType; +delete myDocument; + </source> + </s3> + <p>Key 
points to remember when using the C++ IDOM classes:</p> <ul> @@ -820,7 +985,6 @@ myDocument.appendChild(aNode); the IDOM parser when parsing an instance document.</li> </ul> - </s3> <anchor name="DOMStringXMCh"/> <s3 title="DOMString vs. XMLCh"> diff --git a/doc/releases.xml b/doc/releases.xml index c2fef82536764d68604359b5a3734fd026ad1efd..f31fdeb3a1c59c81f67823332fa4b0c1f459f5c3 100644 --- a/doc/releases.xml +++ b/doc/releases.xml @@ -20,8 +20,9 @@ all, <br/> any, <br/> anyAttribute, <br/> - redefine, <br/> annotation, <br/> + notation, <br/> + redefine, <br/> circular import. <br/> Add AnySimpleTypeDatatypeValidator. <br/> Add XercesGroupInfo. <br/> @@ -45,7 +46,7 @@ XMLFloat, <br/> XMLInteger, <br/> XMLNumber, <br/> - XMLUri. <br/> + XMLUri. </td> </tr> @@ -53,13 +54,33 @@ <td>2001-10-19</td> <td>Tinny Ng</td> <td> Schema:<br/> - Support Unique Particle Attribution Constraint Checking, <br/> - xsi:type, <br/> + Support xsi:type, <br/> + Unique Particle Attribution Constraint Checking, <br/> anyAttribute in Scanner and Validator. <br/> Add XercesElementWildCard, <br/> AllContentModel, <br/> - XMLInternalErrorHandler. <br/> - Enable those derived dataype like nonPositiveinteger, negativeInteger ... etc. + XMLInternalErrorHandler. + </td> + </tr> + + <tr> + <td>2001-10-18</td> + <td>Tinny Ng</td> + <td>[Bug 4015] IDDOMImplementation::createDocumentType hopelessly broken. + </td> + </tr> + + <tr> + <td>2001-10-16</td> + <td>Khaled Noaman</td> + <td>[Bug 3750] GeneralAttributeCheck threading bug. + </td> + </tr> + + <tr> + <td>2001-10-15</td> + <td>Khaled Noaman</td> + <td>[Bug 4177] setupRange uses non-portable code. </td> </tr> @@ -102,7 +123,7 @@ <tr> <td>2001-10-05</td> <td>PeiYong Zhang</td> - <td>[Bug 3831]: -1 returned from getIndex() needs to be checked. + <td>[Bug 3831] -1 returned from getIndex() needs to be checked. 
</td> </tr> @@ -289,7 +310,7 @@ </tr> <tr> - <td>2001-07-27</td> + <td>2001/07/27</td> <td>Tinny Ng</td> <td>Fix bug in 'transcode' functions reported by Evgeniy Gabrilovich. </td> diff --git a/doc/schema.xml b/doc/schema.xml index 368783c3048895075372829e7f5000ceaf025c85..e4ae65056b565a37ac1df3571e31e5361c823ecc 100644 --- a/doc/schema.xml +++ b/doc/schema.xml @@ -2,13 +2,13 @@ <!DOCTYPE s1 SYSTEM "./dtd/document.dtd"> <s1 title="Schema"> <s2 title="Disclaimer"> - <p>Schema is not fully supported in &XercesCName; yet. But an + <p>Schema is not fully supported in &XercesCName; yet. But an experimental implementation of a subset of the W3C XML Schema language is now available for review in &XercesCName; &XercesCVersion;. You should not consider this implementation complete or - correct. The limitations of this implementation are - detailed below. Please read this document before using - &XercesCName; &XercesCVersion;. + correct. The limitations of this implementation are + detailed below. Please read this document before using + &XercesCName; &XercesCVersion;. </p> </s2> <s2 title="Introduction"> @@ -29,99 +29,72 @@ Xerces-C mailing list &XercesCEmailAddress; </jump> . </p> </s2> + <anchor name="limitation"/> <s2 title="Limitations"> <p>The XML Schema implementation in the &XercesCName; &XercesCVersion; is a subset of the features defined in the 2 May 2001 XML Schema Recommendation. 
</p> </s2> - <anchor name="supportedFeature"/> - <s2 title='Features/Datatypes Supported'> + <s2 title='Features/Datatypes Not Supported'> <ul> - <li>Partial Simple type support </li> - <ul> - <li> Yes: atomic simple type </li> - <li> No: union and list </li> - </ul> - <li>Partial Complex type suppport </li> - <ul> - <li> Yes: choice, sequence </li> - <li> No: group, all </li> - </ul> - <li>Element and Attribute Declaration </li> - <ul> - <li> No: any/anyAttribute </li> - </ul> - <li>SubsitutionGroup</li> - <li>Subset of Built-in Datatypes</li> + <li>Identity Constraints</li> + <li>Particle Derivation Constraint Checking </li> + <li>Built-in Datatypes Not Supported</li> <ul> <li>Primitive Datatypes</li> <ul> - <li>string</li> - <li>boolean</li> - <li>decimal</li> - <li>hexbinary</li> - <li>base64binary</li> - </ul> - <li>Derived Datatypes</li> - <ul> - <li>integer</li> + <li>duration</li> + <li>dateTime</li> + <li>time</li> + <li>date</li> + <li>gYearMonth</li> + <li>gYear</li> + <li>gMonthDay</li> + <li>gDay</li> + <li>gMonth</li> </ul> </ul> - <li>xsi Markup</li> - <ul> - <li>Yes: xsi:nil</li> - <li>Yes: xsi:schemaLocation and xsi:noNamespaceSchemaLocation</li> - <li>No: xsi:type</li> - </ul> </ul> - <p> Additional Experimental Features (not tested and subject to change, use as is)</p> - <ul> - <li>Complex type derivation support (simpleContent and complexContent).</li> - <li>Element and attribute re-use using "ref".</li> - <li>Include support</li> - <li>Import Support</li> - <li>Element declaration <any></li> - <li>Subset of Built-in Datatypes</li> - <ul> - <li>Derived Datatypes</li> - <ul> - <li>normalizedString</li> - <li>token</li> - <li>language</li> - <li>Name</li> - <li>NCName</li> - <li>NMTOKEN</li> - <li>NMTOKENS</li> - <li>ID</li> - <li>IDREF</li> - <li>IDREFS</li> - <li>ENTITY</li> - <li>ENTITIES</li> - <li>nonNegativeInteger</li> - </ul> - </ul> - </ul> - - - <p>Other features in the Schema recommendation such as "redefine", - "identity constraints" 
and others which are not mentioned above, are not supported - yet. Also, particle and model group constraint checking is not yet fully implemented. But development is - continuing and we target to implement all the features of the current XML Schema - Recommendation before end of this year. Please note that the date is tentative and - subject to change. + <p>Development is ongoing and we aim to implement all the features of the + current XML Schema Recommendation before the end of this year. Please note that + the date is tentative and subject to change. </p> </s2> <s2 title="Other Limitations"> - <p>The schema must be specified by the xsi:schemaLocation or - xsi:noNamespaceSchemaLocation attribute on the root - element of the document. The xsi prefix must be bound to the - Schema document instance namespace, as specified by the - Recommendation. See the sample provided in the Usage section. - </p> + <ul> + <li>No interface is provided for exposing the post-schema + validation infoset, beyond + that provided by DOM or SAX.</li> + <li> The parser permits situations in which there is + circular or multiple importing. However, the parser only permits forward + references--that is, references directed from the + direction of the schema cited in the instance + document to other schemas. For instance, if schema A + imports both schema B and schema C, then + any reference in schema B to an information item from + schema C will produce an error. Circular or multiple + &lt;include&gt;s have similar limitations.</li> + <li>Due to the way in which the parser constructs content + models for elements with complex content, specifying large + values for the <code>minOccurs</code> or <code>maxOccurs</code> + attributes may cause a stack overflow or very poor performance + in the parser.
Large values for <code>minOccurs</code> should be + avoided, and <code>unbounded</code> should be used instead of + a large value for <code>maxOccurs</code>.</li> + <li>The parsers contained in this package are able to read and + validate XML documents with the grammar specified in either + DTD or XML Schema format, but not both.</li> + <li>The schema is specified by the xsi:schemaLocation or + xsi:noNamespaceSchemaLocation attribute on the root + element of the document. The xsi prefix must be bound to the + Schema document instance namespace, as specified by the + Recommendation. See the sample provided in the + Usage section.</li> + </ul> </s2> <anchor name="usage"/>