From 9bba601614ca0b3c07d171b0d238894850e17d0c Mon Sep 17 00:00:00 2001 From: Khaled Noaman <knoaman@apache.org> Date: Fri, 8 Jun 2001 14:25:25 +0000 Subject: [PATCH] Documentation update. git-svn-id: https://svn.apache.org/repos/asf/xerces/c/trunk@172750 13f79535-47bb-0310-9956-ffa450edef68 --- doc/idomcount.xml | 62 +++++++++++++ doc/idomprint.xml | 118 +++++++++++++++++++++++ doc/program.xml | 231 ++++++++++++++++++++++++++++++++++++++++++++++ doc/samples.xml | 9 ++ doc/sax2count.xml | 65 +++++++++++++ doc/sax2print.xml | 119 ++++++++++++++++++++++++ 6 files changed, 604 insertions(+) create mode 100644 doc/idomcount.xml create mode 100644 doc/idomprint.xml create mode 100644 doc/sax2count.xml create mode 100644 doc/sax2print.xml diff --git a/doc/idomcount.xml b/doc/idomcount.xml new file mode 100644 index 000000000..84e8f7043 --- /dev/null +++ b/doc/idomcount.xml @@ -0,0 +1,62 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "./dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 13: IDOMCount"> + + <s2 title="IDOMCount"> + <p>IDOMCount uses the provided IDOM API to parse an XML file, + constructs the DOM tree and walks through the tree counting + the elements (using just one API call).</p> + + <s3 title="Building on Windows"> + <p>Load the &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. 
Then + build the project marked IDOMCount.</p> + </s3> + + <s3 title="Building on UNIX"> +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd IDOMCount +gmake</source> + <p>This will create the object files in the current directory + and the executable named IDOMCount in '&XercesCInstallDir;-linux/bin' + directory.</p> + + <p>To delete all the generated object files and executables, type</p> +<source>gmake clean</source> + </s3> + + <s3 title="Running IDOMCount"> + + <p>The IDOMCount sample parses an XML file and prints out a count of the number of + elements in the file. To run IDOMCount, enter the following </p> +<source>IDOMCount <XML file></source> + <p>The following parameters may be set from the command line </p> +<source>Usage: + IDOMCount [-v -n] {XML file} + +This program invokes the XML4C IDOM parser, builds +the DOM tree, and then prints the number of elements +found in the input XML file. + +Options: + -v=xxx Validation scheme [always | never | auto*] + -n Enable namespace processing. Defaults to off. 
+ + * = Default if not provided explicitly</source> + <p><em>-v=always</em> will force validation<br/> + <em>-v=never</em> will not use any validation<br/> + <em>-v=auto</em> will validate if a DOCTYPE declaration is present in the XML document</p> + <p>Here is a sample output from IDOMCount</p> +<source>cd &XercesCInstallDir;-linux/samples/data +IDOMCount -v=always personal.xml +personal.xml: 20 ms (37 elems)</source> + + <p>The output of both versions should be the same.</p> + + <note>The time reported by the system may be different, depending on your + processor type.</note> + </s3> + </s2> +</s1> diff --git a/doc/idomprint.xml b/doc/idomprint.xml new file mode 100644 index 000000000..67828d4ed --- /dev/null +++ b/doc/idomprint.xml @@ -0,0 +1,118 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "./dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 14: IDOMPrint"> + + <s2 title="IDOMPrint"> + <p>IDOMPrint parses an XML file, constructs the DOM tree, and walks + through the tree printing each element. It thus dumps the XML back + (output same as SAXPrint).</p> + + <s3 title="Building on Windows"> + <p>Load the &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. Then + build the project marked IDOMPrint. 
+ </p> + </s3> + <s3 title="Building on UNIX"> +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd IDOMPrint +gmake</source> + <p> + This will create the object files in the current directory and the executable named + IDOMPrint in '&XercesCInstallDir;-linux/bin' directory.</p> + + <p>To delete all the generated object files and executables, type</p> +<source>gmake clean</source> + </s3> + + <s3 title="Running IDOMPrint"> + + <p>The IDOMPrint sample parses an XML file, using either a validating + or non-validating IDOM parser configuration, builds a DOM tree, + and then walks the tree and outputs the contents of the nodes + in a 'canonical' format. To run IDOMPrint, enter the following:</p> +<source>IDOMPrint <XML file></source> + <p>The following parameters may be set from the command line </p> +<source>Usage: IDOMPrint [options] file + +This program invokes the &XercesCName; IDOM parser and builds the DOM +tree. It then traverses the DOM tree and prints the contents +of the tree. Options are NOT case sensitive. + +Options: + -e Expand entity references. Default is no expansion. + -u=xxx Handle unrepresentable chars [fail | rep | ref*] + -v=xxx Validation scheme [always | never | auto*] + -n Enable namespace processing. Default is off. + -x=XXX Use a particular encoding for output. Default is + the same encoding as the input XML file. UTF-8 if + input XML file has not XML declaration. + -? 
Show this help (must be the only parameter) + + * = Default if not provided explicitly + +The parser has intrinsic support for the following encodings: + UTF-8, USASCII, ISO8859-1, UTF-16[BL]E, UCS-4[BL]E, + WINDOWS-1252, IBM1140, IBM037</source> + <p><em>-u=fail</em> will fail when unrepresentable characters are encountered<br/> + <em>-u=rep</em> will replace with the substitution character for that codepage<br/> + <em>-u=ref</em> will report the character as a reference</p> + <p><em>-v=always</em> will force validation<br/> + <em>-v=never</em> will not use any validation<br/> + <em>-v=auto</em> will validate if a DOCTYPE declaration is present in the XML document</p> + <p>Here is a sample output from IDOMPrint</p> +<source>cd &XercesCInstallDir;-linux/samples/data +IDOMPrint -v personal.xml + +<?xml version="1.0" encoding="iso-8859-1"?> + +<!DOCTYPE personnel SYSTEM "personal.dtd"> +<!-- @version: --> +<personnel> + +<person id="Big.Boss"> + <name><family>Boss</family> <given>Big</given></name> + <email>chief@foo.com</email> + <link subordinates="one.worker two.worker three.worker + four.worker five.worker"></link> +</person> + +<person id="one.worker"> + <name><family>Worker</family> <given>One</given></name> + <email>one@foo.com</email> + <link manager="Big.Boss"></link> +</person> + +<person id="two.worker"> + <name><family>Worker</family> <given>Two</given></name> + <email>two@foo.com</email> + <link manager="Big.Boss"></link> +</person> + +<person id="three.worker"> + <name><family>Worker</family> <given>Three</given></name> + <email>three@foo.com</email> + <link manager="Big.Boss"></link> +</person> + +<person id="four.worker"> + <name><family>Worker</family> <given>Four</given></name> + <email>four@foo.com</email> + <link manager="Big.Boss"></link> +</person> + +<person id="five.worker"> + <name><family>Worker</family> <given>Five</given></name> + <email>five@foo.com</email> + <link manager="Big.Boss"></link> +</person> + +</personnel></source> + <p>Note 
that IDOMPrint does not reproduce the original XML file. IDOMPrint and + SAXPrint produce different results because of the way the two APIs store data + and capture events.</p> + </s3> + </s2> +</s1> diff --git a/doc/program.xml b/doc/program.xml index 6688311df..6b2d06de0 100644 --- a/doc/program.xml +++ b/doc/program.xml @@ -31,6 +31,17 @@ <li><link anchor="Downcasting">Downcasting</link></li> <li><link anchor="Subclassing">Subclassing</link></li> </ul> + <li><link anchor="IDOMProgGuide">Experimental IDOM Programming Guide</link></li> + <ul> + <li><link anchor="ConstructIDOMParser">Constructing a parser</link></li> + <li><link anchor="DOMandIDOM">Comparison of C++ DOM and IDOM</link></li> + <ul> + <li><link anchor="Motivation">Motivation behind new design</link></li> + <li><link anchor="IDOMClassNames">Class Names</link></li> + <li><link anchor="IDOMObjMemMgmt">Objects and Memory Management</link></li> + <li><link anchor="DOMStringXMCh">DOMString vs. XMLCh</link></li> + </ul> + </ul> </ul> @@ -601,4 +612,224 @@ else </s2> + <anchor name="IDOMProgGuide"/> + <s2 title="Experimental IDOM Programming Guide"> + <p>The experimental IDOM API is a new design of the C++ DOM API. + Please note that this experimental IDOM API is only a prototype + and is subject to change.</p> + + <anchor name="ConstructIDOMParser"/> + <s3 title="Constructing a parser"> + <p>In order to use &XercesCName; to parse XML files using IDOM, you + will need to create an instance of the IDOMParser class. The example + below shows the code you need in order to create an instance of the + IDOMParser.</p> + + <source> +int main (int argc, char* args[]) { + + try { + XMLPlatformUtils::Initialize(); + } + catch (const XMLException& toCatch) { + cout << "Error during initialization! :\n" + << toCatch.getMessage() << "\n"; + return 1; + } + + const char* xmlFile = "x1.xml"; + IDOMParser* parser = new IDOMParser(); + parser->setValidationScheme(IDOMParser::Val_Always); // optional. 
+ parser->setDoNamespaces(true); // optional + + ErrorHandler* errHandler = (ErrorHandler*) new HandlerBase(); + parser->setErrorHandler(errHandler); + + try { + parser->parse(xmlFile); + } + catch (const XMLException& toCatch) { + cout << "\nFile not found: '" << xmlFile << "'\n" + << "Exception message is: \n" + << toCatch.getMessage() << "\n" ; + return -1; + } + + return 0; +} + </source> + </s3> + + <anchor name="DOMandIDOM"/> + <s3 title="Comparison of C++ DOM and IDOM"> + <p> + This section outlines the differences between the C++ DOM and IDOM APIs. + </p> + </s3> + + <anchor name="Motivation"/> + <s3 title="Motivation behind new design"> + <p> + The performance of the C++ DOM has not been as good as it + might be, especially for use in server-style applications. + The DOM's reference-counted automatic memory management has + been the biggest time consumer. The situation becomes worse + when running multi-threaded applications. + </p> + <p> + The experimental C++ IDOM is a new alternative to the C++ DOM, and aims at + meeting the following requirements: + </p> + <ul> + <li>Reduced memory footprint.</li> + <li>Fast.</li> + <li>Good scalability on multiprocessor systems.</li> + <li>More C++-like and less Java-like.</li> + </ul> + </s3> + + <anchor name="IDOMClassNames"/> + <s3 title="Class Names"> + <p> + The IDOM class names are prefixed with "IDOM_". The intent is + to prevent conflicts between IDOM class names and DOM class names + that may already be in use by an application or other + libraries that a DOM-based application must link with. + </p> + + + <source> +IDOM_Document* myDocument; // IDOM +IDOM_Node* aNode; +IDOM_Text* someText; + </source> + + <source> +DOM_Document myDocument; // DOM +DOM_Node aNode; +DOM_Text someText; + </source> + </s3> + + <anchor name="IDOMObjMemMgmt"/> + <s3 title="Objects and Memory Management"> + <p>The C++ IDOM implementation no longer uses reference counting for + automatic memory management. 
The storage for a DOM document is + associated with the document node object. Applications would use + normal C++ pointers to directly access the implementation objects + for Nodes in IDOM C++, while they would use object references in + DOM C++. + </p> + + <p>Consider the following code snippets:</p> + + + <source> +// IDOM C++ +IDOM_Node* aNode; +IDOM_Node* docRootNode; +aNode = someDocument->createElement("ElementName"); +docRootNode = someDocument->getDocumentElement(); +docRootNode->appendChild(aNode); + </source> + + <source> +// DOM C++ +DOM_Node aNode; +DOM_Node docRootNode; +aNode = someDocument.createElement("ElementName"); +docRootNode = someDocument.getDocumentElement(); +docRootNode.appendChild(aNode); + </source> + + + <p>The IDOM C++ uses an independent storage allocator per document. + The advantage here is that allocation would require no synchronization + in most cases (based on the same threading model that we + have now - one thread active per document, but any number of + documents running in parallel with separate threads). + </p> + + <p>The allocator does not support a delete operation at all - all + allocated memory would persist for the life of the document, and + then the larger blocks would be returned to the system without separately + deleting all of the individual nodes and strings within the document. + </p> + + <p>The C++ DOM and IDOM are similar in the use of factory methods in the + document class for all object creation. They differ in the object deletion + mechanism. + </p> + + <p>In C++ DOM, there is no explicit object deletion. The deallocation of + memory is automatically taken care of by the reference counting. + </p> + + <p>In C++ IDOM, there is both implicit and explicit object deletion. When parsing + a document using an IDOMParser, the storage allocated will be automatically + deleted when the parser instance is deleted (implicit). 
If a user is + manually building a DOM tree in memory using the document factory methods, + then the user needs to explicitly delete the document object to free all + allocated memory. + </p> + + <p>Consider the following code snippets: </p> + + <source> +// C++ IDOM - explicit deletion +IDOM_Document* myDocument; +IDOM_Node* aNode; +myDocument = IDOM_DOMImplementation::getImplementation()->createDocument(); +aNode = myDocument->createElement("ElementName"); +myDocument->appendChild(aNode); +delete myDocument; + </source> + + <source> +// C++ DOM - implicit deletion +DOM_Document myDocument; +DOM_Node aNode; +myDocument = DOM_DOMImplementation::getImplementation().createDocument(); +aNode = myDocument.createElement("ElementName"); +myDocument.appendChild(aNode); + </source> + + <p>Key points to remember when using the C++ IDOM classes:</p> + + <ul> + <li>The DOM objects are accessed via C++ pointers.</li> + + <li>The DOM objects - nodes, attributes, CData + sections, etc., are created with the factory methods + (create...) in the document class.</li> + + <li>If you are manually building a DOM tree in memory, you + need to explicitly delete the document object. + Memory management will be automatically taken care of by + the IDOM parser when parsing an instance document.</li> + + </ul> + </s3> + + <anchor name="DOMStringXMCh"/> + <s3 title="DOMString vs. XMLCh"> + <p>The IDOM C++ no longer uses DOMString to pass string data to + and from the DOM API. Instead, the IDOM C++ uses plain, null-terminated + (XMLCh *) UTF-16 strings. An (XMLCh*) UTF-16 string is much + simpler and has lower overhead. 
All string data remains in + memory until the document object is deleted.</p> + + <source> +//C++ IDOM +const XMLCh* nodeValue = aNode->getNodeValue(); + </source> + + <source> +//C++ DOM +DOMString nodeValue = aNode.getNodeValue(); + </source> + </s3> + + </s2> + </s1> diff --git a/doc/samples.xml b/doc/samples.xml index e6db26210..cb4b77e50 100644 --- a/doc/samples.xml +++ b/doc/samples.xml @@ -134,6 +134,15 @@ <br/>EnumVal shows how to enumerate the markup decls in a DTD Validator.</li> <li><link idref="createdoc">CreateDOMDocument</link> <br/>CreateDOMDocument creates a DOM tree in memory from scratch.</li> + <li><link idref="sax2count">SAX2Count</link> + <br/>SAX2Count counts the elements, attributes, spaces and + characters in an XML file.</li> + <li><link idref="sax2print">SAX2Print</link> + <br/>SAX2Print parses an XML file and prints it out.</li> + <li><link idref="idomcount">IDOMCount</link> + <br/>IDOMCount counts the elements in an XML file.</li> + <li><link idref="idomprint">IDOMPrint</link> + <br/>IDOMPrint parses an XML file and prints it out.</li> </ul> </s3> </s2> diff --git a/doc/sax2count.xml b/doc/sax2count.xml new file mode 100644 index 000000000..bad6f031c --- /dev/null +++ b/doc/sax2count.xml @@ -0,0 +1,65 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "./dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 11: SAX2Count"> + + <s2 title="SAX2Count"> + <p>SAX2Count is a simple application that counts the elements and characters of + a given XML file using the (event based) SAX2 API.</p> + + <s3 title="Building on Windows"> + <p>Load the &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. 
Then + build the project marked SAX2Count.</p> + </s3> + + <s3 title="Building on UNIX"> +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd SAX2Count +gmake</source> + <p>This will create the object files in the current directory + and the executable named + SAX2Count in '&XercesCInstallDir;-linux/bin' directory.</p> + + <p>To delete all the generated object files and executables, type</p> +<source>gmake clean</source> + </s3> + + <s3 title="Running SAX2Count"> + + <p>The SAX2Count sample parses an XML file and prints out a count of the number of + elements in the file. To run SAX2Count, enter the following </p> + <source>SAX2Count <XML File></source> + <p>The following parameters may be set from the command line </p> +<source>Usage: + SAX2Count [options] <XML file> + +Options: + -v=xxx Validation scheme [always | never | auto*] + -n Enable namespace processing. Defaults to off. + -s Disable schema processing. Defaults to on. + +This program prints the number of elements, attributes, +white spaces and other non-white space characters in the input file. 
+ + * = Default if not provided explicitly</source> + <p><em>-v=always</em> will force validation<br/> + <em>-v=never</em> will not use any validation<br/> + <em>-v=auto</em> will validate if a DOCTYPE declaration is present in the XML document</p> + <p>Here is a sample output from SAX2Count</p> +<source>cd &XercesCInstallDir;-linux/samples/data +SAX2Count -v=always personal.xml +personal.xml: 60 ms (37 elems, 12 attrs, 134 spaces, 134 chars)</source> + <p>Running SAX2Count with the validating parser gives a different result because + ignorable white-space is counted separately from regular characters.</p> +<source>SAX2Count -v=never personal.xml +personal.xml: 10 ms (37 elems, 12 attrs, 0 spaces, 268 chars)</source> + <p>Note that the sum of spaces and characters in both versions is the same.</p> + + <note>The time reported by the program may be different depending on your + machine's processor.</note> + </s3> + + </s2> +</s1> diff --git a/doc/sax2print.xml b/doc/sax2print.xml new file mode 100644 index 000000000..20e9f2b69 --- /dev/null +++ b/doc/sax2print.xml @@ -0,0 +1,119 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "./dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 12: SAX2Print"> + + <s2 title="SAX2Print"> + <p>SAX2Print uses the SAX2 APIs to parse an XML file and print + it back. Note that the output of this sample is not + exactly the same as the input (in terms of whitespace and the first + line), but the output has the same information content as the + input.</p> + + <s3 title="Building on Windows"> + <p>Load the + &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. Then + build the project marked SAX2Print. 
+ </p> + </s3> + + <s3 title="Building on UNIX"> + +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd SAX2Print +gmake</source> + + <p>This will create the object files in the current directory + and the executable named SAX2Print in + '&XercesCInstallDir;-linux/bin' directory.</p> + + <p>To delete all the generated object files and executables, type</p> +<source>gmake clean</source> + </s3> + + <s3 title="Running SAX2Print"> + + <p>The SAX2Print sample parses an XML file and prints out the + contents again in XML (some loss occurs). To run SAX2Print, + enter the following </p> + +<source>SAX2Print <XML file></source> + <p>The following parameters may be set from the command line </p> +<source>Usage: SAX2Print [options] file +This program prints the data returned by the various SAX2 +handlers for the specified input file. Options are NOT case +sensitive. + +Options: + -u=xxx Handle unrepresentable chars [fail | rep | ref*] + -v=xxx Validation scheme [always | never | auto*] + -e Expand Namespace Alias with URI's. + -x=XXX Use a particular encoding for output (LATIN1*). + -? 
Show this help + + * = Default if not provided explicitly + +The parser has intrinsic support for the following encodings: + UTF-8, USASCII, ISO8859-1, UTF-16[BL]E, UCS-4[BL]E, + WINDOWS-1252, IBM1140, IBM037</source> + + <p><em>-u=fail</em> will fail when unrepresentable characters are encountered<br/> + <em>-u=rep</em> will replace with the substitution character for that codepage<br/> + <em>-u=ref</em> will report the character as a reference</p> + <p><em>-v=always</em> will force validation<br/> + <em>-v=never</em> will not use any validation<br/> + <em>-v=auto</em> will validate if a DOCTYPE declaration is present in the XML document</p> + <p>Here is a sample output from SAX2Print</p> +<source>cd &XercesCInstallDir;-linux/samples/data +SAX2Print -v=always personal.xml + +<?xml version="1.0" encoding="LATIN1"?> +<personnel> + + <person id="Big.Boss"> + <name><family>Boss</family> <given>Big</given></name> + <email>chief@foo.com</email> + <link subordinates="one.worker two.worker three.worker + four.worker five.worker"></link> + </person> + + <person id="one.worker"> + <name><family>Worker</family> <given>One</given></name> + <email>one@foo.com</email> + <link manager="Big.Boss"></link> + </person> + + <person id="two.worker"> + <name><family>Worker</family> <given>Two</given></name> + <email>two@foo.com</email> + <link manager="Big.Boss"></link> + </person> + + <person id="three.worker"> + <name><family>Worker</family> <given>Three</given></name> + <email>three@foo.com</email> + <link manager="Big.Boss"></link> + </person> + + <person id="four.worker"> + <name><family>Worker</family> <given>Four</given></name> + <email>four@foo.com</email> + <link manager="Big.Boss"></link> + </person> + + <person id="five.worker"> + <name><family>Worker</family> <given>Five</given></name> + <email>five@foo.com</email> + <link manager="Big.Boss"></link> + </person> + +</personnel></source> + <note>SAX2Print does not reproduce the original XML file. 
+ SAX2Print and DOMPrint produce different results because of + the way the two APIs store data and capture events.</note> + </s3> + + </s2> +</s1> -- GitLab