diff --git a/doc/apidocs.xml b/doc/apidocs.xml new file mode 100644 index 0000000000000000000000000000000000000000..97df5030f1940368c546f6eedbbe880b71c37404 --- /dev/null +++ b/doc/apidocs.xml @@ -0,0 +1,28 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="API Documentation"> + <s2 title="API Docs for SAX and DOM"> + + <p>&XercesCName; is packaged with the API documentation for SAX and DOM, the two + most common programming interfaces for XML. The most common + framework classes have also been documented.</p> + + <p>&XercesCName; DOM is an implementation of the + <jump href="http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/level-one-core.html">Document Object + Model (Core) Level 1</jump> as defined in the W3C Recommendation of 1 October, 1998. + For a complete understanding of how the &XercesCName; APIs work, + we recommend you to read the DOM Level 1 specification.</p> + + <p>&XercesCName; SAX is an implementation of the + <jump href="http://www.megginson.com/SAX/index.html">SAX 1.0</jump> specification. + You are encouraged to read this document for a better + understanding of the SAX API in &XercesCName;.</p> + + <p><em><jump href="../apiDocs/index.html">Click here for the &XercesCName; API documentation.</jump></em></p> + + <note>The API documentation is automatically generated using + <jump href="http://www.zib.de/Visual/software/doc++/index.html">DOC++</jump>.</note> + + </s2> +</s1> diff --git a/doc/build.xml b/doc/build.xml new file mode 100644 index 0000000000000000000000000000000000000000..1fd4f393f29e840b691df4ce7d4f37e0569566af --- /dev/null +++ b/doc/build.xml @@ -0,0 +1,557 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="Building &XercesCName;"> + <s2 title="Building on Windows NT/98"> + &XercesCName; comes with Microsoft Visual C++ projects and workspaces to + help you build &XercesCName;. 
The following describes the steps you need + to build &XercesCName;. + <s3 title="Building &XercesCName; library"> + <p>To build &XercesCName; from its source (using MSVC), you will + need to open the workspace containing the project. If you are + building your application, you may want to add the &XercesCName; + project inside your application's workspace.</p> + <p>The workspace containing the &XercesCName; project file and + all other samples is:</p> +<source>&XercesCSrcInstallDir;\Projects\Win32\VC6\xerces-all\xerces-all.dsw</source> + <p>Once you are inside MSVC, you need to build the project marked + <em>XercesLib</em>.</p> + <p>If you want to include the &XercesCName; project separately, + you need to pick up:</p> +<source>&XercesCSrcInstallDir;\Projects\Win32\VC6\xerces-all\XercesLib\XercesLib.dsp</source> + <p>You must make sure that you are linking your application with + the &XercesCWindowsLib;.lib library and also make sure that + the associated DLL is somewhere in your path.</p> + <note>If you are working on the AlphaWorks version which uses ICU, + you must either have the environment variable ICU_DATA set, or + keep the international converter files relative to the + &XercesCProjectName; DLL (as they came with the original binary + drop) for the program to find them. To find out where you can + get ICU and how to build it, look at the last section of this page.</note> + </s3> + <s3 title="Building samples"> + <p>Inside the same workspace (xerces-all.dsw), you'll find several other + projects. These are for the samples. Select all the samples and right-click + on the selection. 
Then choose "Build (selection only)" to build all the + samples in one shot.</p> + </s3> + </s2> + + <s2 title="Building on UNIX platforms"> + <p>&XercesCName; uses + <jump href="http://www.gnu.org">GNU</jump> tools like + <jump href="http://www.gnu.org/software/autoconf/autoconf.html">Autoconf</jump> and + <jump href="http://www.gnu.org/software/make/make.html">GNU Make</jump> to build the system. You must first make sure you + have these tools installed on your system before proceeding. + If you do not have the required tools, ask your system administrator + to get them for you. These tools are free under the GNU General Public License + and may be obtained from the + <jump href="http://www.gnu.org">Free Software Foundation</jump>.</p> + + <p><em>Do not jump into the build directly before reading this.</em></p> + + <p>Spending some time reading the following instructions will save you a + lot of wasted time and support-related e-mail communication. + The &XercesCName; build instructions are a little different from + normal product builds. Specifically, there are some wrapper scripts + that have been written to make life easier for you. You are free + not to use these scripts and use + <jump href="http://www.gnu.org/software/autoconf/autoconf.html">Autoconf</jump> and + <jump href="http://www.gnu.org/software/make/make.html">GNU Make</jump> + directly, but we want to make sure you know what you are bypassing and + what risks you are taking. So read the following instructions + carefully before attempting to build it yourself.</p> + + <p>Besides having all necessary build tools, you also need to know what + compilers we have tested &XercesCName; on. 
The following table lists the + relevant platforms and compilers.</p> + + <table> + <tr><td><em>Operating System</em></td><td><em>Compiler</em></td></tr> + <tr><td>Redhat Linux 6.0</td><td>gcc</td></tr> + <tr><td>AIX 4.1.4 and higher</td><td>xlC 3.1</td></tr> + <tr><td>Solaris 2.6</td><td>CC version 4.2</td></tr> + <tr><td>HP-UX B10.2</td><td>aCC and CC</td></tr> + <tr><td>HP-UX B11</td><td>aCC and CC</td></tr> + </table> + + <p>If you are not using any of these compilers, you are taking a calculated risk + by exploring new ground. Your effort in making &XercesCName; work on a + new compiler is greatly appreciated and any problems you face can be addressed + on the &XercesCName; <jump href="mailto:&XercesCEmailAddress;">mailing list</jump>. + </p> + + <p><em>Differences between the UNIX platforms:</em> The description below is + generic, but as every programmer is aware, there are minor differences + between the various UNIX flavors the world has been bestowed with. + The one difference that you need to watch out for in the discussion below + pertains to the system environment variable for finding libraries. + On <em>Linux and Solaris</em>, the environment variable is called + <code>LD_LIBRARY_PATH</code>, on <em>AIX</em> it is <code>LIBPATH</code>, + while on <em>HP-UX</em> it is <code>SHLIB_PATH</code>. The following + discussion assumes you are working on Linux, on the + understanding that you know how to adapt it for the other UNIX flavors.</p> + + <note>If you wish to build &XercesCName; with + <jump href="http://www10.software.ibm.com/developerworks/opensource/icu/project/">ICU</jump>, + look at the <link anchor="icu">last section</link> of this page. 
+ It tells you where you can find ICU and how you can build &XercesCName; + to include the ICU internationalization library.</note> + + <s3 title="Setting build environment variables"> + <p>Before doing the build, you must first set your environment variables + to pick-up the compiler and also specify where you extracted &XercesCName; + on your machine. + While the first one is probably set for you by the system administrator, just + make sure you can invoke the compiler. You may do so by typing the + compiler invocation command without any parameters (e.g. xlc_r, or g++, or cc) + and check if you get a proper response back.</p> + <p>Next set your &XercesCName; root path as follows:</p> +<source>export XERCESCROOT=<full path to &XercesCSrcInstallDir;></source> + + <p>This should be the full path of the directory where you extracted &XercesCName;.</p> + </s3> + + <s3 title="Building &XercesCName; library"> + <p>As mentioned earlier, you must be ready with the GNU tools like + <jump href="http://www.gnu.org/software/autoconf/autoconf.html">autoconf</jump> and + <jump href="http://www.gnu.org/software/make/make.html">gmake</jump> + before you attempt the build.</p> + + <p>The autoconf tool is required on only one platform and produces + a set of portable scripts (configure) that you can run on all + other platforms without actually having the autoconf tool installed + everywhere. In all probability the autoconf-generated script + (called <code>configure</code>) is already in your <code>src</code> + directory. If not, type:</p> + +<source>cd $XERCESCROOT/src +autoconf</source> + + <p>This generates a shell-script called <code>configure</code>. It is tempting to run + this script directly as is normally the case, but wait a minute. If you are + using the default compilers like + <jump href="http://www.gnu.org/software/gcc/gcc.html">gcc</jump> and + <jump href="http://www.gnu.org/software/gcc/gcc.html">g++</jump> you do not have a problem. 
But + if you are not using the standard GNU compilers, you need to export a few more + environment variables before you can invoke configure.</p> + + <p>Rather than make you figure out what strange environment + variables you need to use, we have provided you with a wrapper + script that does the job for you. All you need to tell the script + is what your compiler is, and what options you are going to use + inside your build, and the script does everything for you. Here + is what the script takes as input:</p> + +<source>runConfigure +runConfigure: Helper script to run "configure" for one of the + supported platforms. +Usage: runConfigure "options" + where options may be any of the following: + -p <platform> (accepts 'aix', 'linux', 'solaris', + 'hp-10', 'hp-11', 'irix', 'unixware') + -c <C compiler name> (e.g. gcc, cc, xlc) + -x <C++ compiler name> (e.g. g++, CC, xlC) + -d (specifies that you want to build debug version) + -m <message loader> can be 'inmem', 'icu', 'iconv' + -n <net accessor> can be 'fileonly', 'libwww' + -t <transcoder> can be 'icu' or 'native' + -r <thread option> can be 'pthread' or 'dce' (only used on HP-11) + -l <extra linker options> + -z <extra compiler options> + -h (to get help on the above commands)</source> + + <note>&XercesCName; builds as a standalone library and also as a library + dependent on IBM's International Classes for Unicode (ICU). For simplicity, + the following discussion only targets standalone builds.</note> + + <p>One of the common ways to build &XercesCName; is as follows:</p> + +<source>runConfigure -plinux -cgcc -xg++ -minmem -nfileonly -tnative</source> + + <p>The response will be something like this:</p> +<source>Platform: linux +C Compiler: gcc +C++ Compiler: g++ +Extra compile options: +Extra link options: +Message Loader: inmem +Net Accessor: fileonly +Transcoder: native +Thread option: +Debug is OFF + +creating cache ./config.cache +checking for gcc... 
gcc +checking whether the C compiler (gcc -O -DXML_USE_NATIVE_TRANSCODER + -DXML_USE_INMEM_MESSAGELOADER ) works... yes +checking whether the C compiler (gcc -O -DXML_USE_NATIVE_TRANSCODER + -DXML_USE_INMEM_MESSAGELOADER ) is a cross-compiler... no +checking whether we are using GNU C... yes +checking whether gcc accepts -g... yes +checking for c++... g++ +checking whether the C++ compiler (g++ -O -DXML_USE_NATIVE_TRANSCODER + -DXML_USE_INMEM_MESSAGELOADER ) works... yes +checking whether the C++ compiler (g++ -O -DXML_USE_NATIVE_TRANSCODER + -DXML_USE_INMEM_MESSAGELOADER ) is a cross-compiler... no +checking whether we are using GNU C++... yes +checking whether g++ accepts -g... yes +checking for a BSD compatible install... /usr/bin/install -c +checking for autoconf... autoconf +checking for floor in -lm... yes +checking how to run the C preprocessor... gcc -E +checking for ANSI C header files... yes +checking for XMLByte... no +checking host system type... i686-pc-linux-gnu +updating cache ./config.cache +creating ./config.status +creating Makefile +creating util/Makefile +creating util/Transcoders/ICU/Makefile +creating util/Transcoders/Iconv/Makefile +creating util/Transcoders/Iconv400/Makefile +creating util/Platforms/Makefile +creating util/Compilers/Makefile +creating util/MsgLoaders/InMemory/Makefile +creating util/MsgLoaders/ICU/Makefile +creating util/MsgLoaders/MsgCatalog/Makefile +creating util/MsgLoaders/MsgFile/Makefile +creating validators/DTD/Makefile +creating framework/Makefile +creating dom/Makefile +creating parsers/Makefile +creating internal/Makefile +creating sax/Makefile +creating ../obj/Makefile +creating conf.h +conf.h is unchanged + +In future, you may also directly type the following commands to +create the Makefiles. 
+ +export TRANSCODER=NATIVE +export MESSAGELOADER=INMEM +export USELIBWWW=0 +export CC=gcc +export CXX=g++ +export CXXFLAGS=-O -DXML_USE_NATIVE_TRANSCODER -DXML_USE_INMEM_MESSAGELOADER +export CFLAGS=-O -DXML_USE_NATIVE_TRANSCODER -DXML_USE_INMEM_MESSAGELOADER +export LIBS= -lpthread +configure + +If the result of the above commands look OK to you, go to the directory +$XERCESCROOT/src and type "gmake" to make the XERCES-C system.</source> + + <p>So now you see what the wrapper script has actually been doing! It has + invoked <code>configure</code> + to create the Makefiles in the individual sub-directories, but in addition + to that, it has set a few environment variables to correctly configure + your compiler and compiler flags too.</p> + + <p>Now that the Makefiles are all created, you are ready to do the actual build.</p> + +<source>gmake</source> + + <p>Is that it? Yes, that's all you need to build &XercesCName;.</p> + </s3> + + <s3 title="Building samples"> + <p>Similarly, you can build the samples by giving the same commands in the + <code>samples</code> directory.</p> + +<source>cd $XERCESCROOT/samples +runConfigure -plinux -cgcc -xg++ +gmake</source> + + <p>The samples get built in the <code>bin</code> directory. Before you run the + samples, you must make sure that your library path is set to pick up + libraries from <code>$XERCESCROOT/lib</code>. If not, type the following to + set your library path properly.</p> + +<source>export LD_LIBRARY_PATH=$XERCESCROOT/lib:$LD_LIBRARY_PATH</source> + <p>You are now set to run the sample applications.</p> + + </s3> + </s2> + + <s2 title="Building &XercesCName; on Windows using Visual Age C++"> + <p>A few unsupported projects are also packaged with &XercesCName;. Due to + origins of &XercesCName; inside IBM labs, we do have projects for IBM's + <jump href="http://www-4.ibm.com/software/ad/vacpp/">Visual Age C++ compiler</jump> on Windows. 
+ The following describes the steps you need to build &XercesCName; using + Visual Age C++.</p> + + <s3 title="Building &XercesCName; library"> + <p><em>Requirements:</em></p> + + <ul> + <li>VisualAge C++ Version 4.0 with Fixpak 1: + <br/>Download the + <jump href="http://www-4.ibm.com/software/ad/vacpp/service/csd.html">Fixpak</jump> + from the IBM VisualAge C++ Corrective Services web page.</li> + </ul> + + <p>To include the ICU library:</p> + + <ul> + <li>ICU Build: + <br/>You should have the + <jump href="http://www10.software.ibm.com/developerworks/opensource/icu/project/icuhtml/index.html">ICU Library</jump> + in the same directory as the &XercesCName; library. For example, if + &XercesCName; is at the top level of the d drive, put the ICU + library at the top level of d as well, e.g. d:/xml4c, d:/icu.</li> + </ul> + + <p><em>Instructions:</em></p> + <ol> + <li>Change the directory to d:\xml4c\Projects\Win32</li> + <li>If a d:\xml4c\Projects\Win32\VACPP40 directory does not exist, create it.</li> + <li>Copy the IBM VisualAge project file, <code>XML4C2X.icc</code>, + to the VACPP40 directory.</li> + <li>From the VisualAge main menu enter the project file name and path.</li> + <li>When the build finishes the status bar displays this message: Last Compile + completed Successfully with warnings on date.</li> + </ol> + <note>These instructions assume that you install in drive d:\. + Replace d with the appropriate drive letter.</note> + </s3> + </s2> + + + <s2 title="Building on OS/2 using Visual Age C++"> + <p>OS/2 is a favourite IBM PC platform. The only + option on this platform is to use the + <jump href="http://www-4.ibm.com/software/ad/vacpp/">Visual Age C++ compiler</jump>. 
+ Here are the steps you need to build &XercesCName; using + Visual Age C++ on OS/2.</p> + <s3 title="Building &XercesCName; library"> + <p><em>Requirements:</em></p> + + <ul> + <li>VisualAge C++ Version 4.0 with Fixpak 1: + <br/>Download the + <jump href="http://www-4.ibm.com/software/ad/vacpp/service/csd.html">Fixpak</jump> + from the IBM VisualAge C++ Corrective Services web page.</li> + </ul> + + <p>To include the ICU library:</p> + + <ul> + <li>ICU Build: + <br/>You should have the + <jump href="http://www10.software.ibm.com/developerworks/opensource/icu/project/icuhtml/index.html">ICU Library</jump> + in the same directory as the &XercesCName; library. For example, if + &XercesCName; is at the top level of the d drive, put the ICU + library at the top level of d as well, e.g. d:/xml4c, d:/icu.</li> + </ul> + + <p><em>Instructions:</em></p> + <ol> + <li>Change directory to d:\xml4c\Projects\OS2</li> + <li>If a d:\xml4c\Projects\OS2\VACPP40 directory does not exist, create it.</li> + <li>Copy the IBM VisualAge project file, XML4C2X.icc, to the VACPP40 directory.</li> + <li>From the VisualAge main menu enter the project file name and path.</li> + <li>When the build finishes the status bar displays this message: Last Compile + completed Successfully with warnings on date.</li> + </ol> + <note>These instructions assume that you install in drive d:\. 
Replace d with the + appropriate drive letter.</note> + </s3> + </s2> + + + <s2 title="Building on Macintosh using CodeWarrior"> + + <s3 title="Building &XercesCName; library"> + <p>The directions in this file cover installing and building + &XercesCName; and ICU under the MacOS using CodeWarrior.</p> + <ol> + <li><em>Create a folder:</em> + <br/>for the &XercesCName; and ICU distributions, + the "src drop" folder </li> + + <li><em>Download and uncompress:</em> + <br/>the ICU and &XercesCName; source distribution + <br/>the ICU and &XercesCName; binary distributions, + for the documentation included </li> + + <li><em>Move the new folders:</em> + <br/>move the newly created &XercesCName; and icu124 + folders to the "src drop" folder.</li> + + <li><em>Drag and drop:</em> + <br/>the &XercesCName; folder into the "rename file" application located in + the same folder as this readme. + <br/>This is a MacPerl script that renames files that have + names too long to fit in a HFS/HFS+ filesystem. + It also searches through all of the source code and changes + the #include statements to refer to the new file names.</li> + + <li><em>Move the MacOS folder:</em> + <br/>from the in the Projects folder to "src drop:&XercesCName;:Projects".</li> + + <li><em>Open and build &XercesCName;:</em> + <br/>open the CodeWarrior project file + "src drop:&XercesCName;:Projects:MacOS:&XercesCName;:&XercesCName;" + and build the &XercesCName; library.</li> + + <li><em>Open and build ICU:</em> + <br/>open the CodeWarrior project file + "src drop:&XercesCName;:Projects:MacOS:icu:icu" + and build the ICU library.</li> + + <li><em>Binary distribution:</em> + <br/>If you wish, you can create projects for and build the rest of the tools and test + suites. They are not needed if you just want to use &XercesCName;. 
I suggest that you + use the binary data files distributed with the binary distribution of ICU instead of + creating your own from the text data files in the ICU source distribution.</li> + </ol> + + <p>There are some things to be aware of when creating your own + projects using &XercesCName;.</p> + <ol> + <li>You will need to link against both the ICU and &XercesCName; libraries.</li> + <li>The options "Always search user paths" and "Interpret DOS and Unix Paths" are + very useful. Some of the code won't compile without them set.</li> + <li>Most of the tools and test code will require slight modification to compile and run + correctly (typecasts, command line parameters, etc.), but it is possible to get + them working correctly.</li> + <li>You will most likely have to set up the Access Paths. The access paths in the + &XercesCName; projects should serve as a good example.</li> + </ol> + + + <note>These instructions were originally contributed by + <jump href="mailto:jbellardo@alumni.calpoly.edu">J. Bellardo</jump>. + &XercesCName; has undergone many changes since these instructions + were written, so they are not up to date. + They will still give you a jump start if you are struggling to get it + to work for the first time. We will be glad to get your changes. + Please respond to <jump href="mailto:&XercesCEmailAddress;"> + &XercesCEmailAddress;</jump> with your comments and corrections.</note> + + </s3> + </s2> + + <s2 title="How to Build ICU"> + <p>As mentioned earlier, &XercesCName; may be built in stand-alone mode using + native encoding support and also using ICU where you get support for hundreds + of encodings. ICU stands for International Classes for Unicode and is an + open source distribution from IBM. 
You can get + <jump href="http://www10.software.ibm.com/developerworks/opensource/icu/project/">ICU libraries</jump> from + <jump href="http://www.ibm.com/developerWorks">IBM's developerWorks site</jump> + or go to the ICU + <jump href="http://www10.software.ibm.com/developerworks/opensource/icu/project/download/index.html">download page</jump> + directly.</p> + <s3 title="Building ICU for &XercesCName;"> + <p>You can find generic instructions to build ICU in the ICU documentation. + What we describe below are the minimal steps needed to build ICU for &XercesCName;. + Not all ICU components need to be built to make it work with &XercesCName;.</p> + + <note><em>Important:</em> Please remember that <em>ICU and + &XercesCName; must be built with the same compiler</em>, + preferably with the same version. You cannot, for example, + build ICU with a threaded version of the xlC compiler and + build &XercesCName; with a non-threaded one.</note> + </s3> + + <s3 title="Building ICU on Windows"> + <p>To build ICU from its source, open the workspace + <code>\icu\source\allinone\allinone.dsw</code> + and build the sub-project labeled <code>common</code>. + You may also want to build <code>tools/makeconv</code> to make + the converter tool. The other sub-projects are not required for the &XercesCName; + build to proceed.</p> + + <p>To build &XercesCName; from its source, you will need to + include a project file in your workspace to program your + application. Otherwise, you can use the provided workspace and add + your application to it as a separate project.</p> + + <p>In the first case the project file is: + <code>xml4c2\Projects\Win32\VC6\IXXML4C2\IXXML4C2\IXXML4C2.dsp</code></p> + + <p>In the second case the workspace is: + <code>xml4c2\Projects\Win32\VC6\IXXML4C2\IXXML4C2.dsw</code></p> + + <p>You must make sure that you are linking your application + with the &XercesCWindowsLib;.lib library and also make sure + that the associated DLL is somewhere in your path. 
Note + that you must either have the environment variable + <code>ICU_DATA</code> set, or keep the international converter + files relative to the &XercesCProjectName; DLL (as they came with + the original binary drop) for the program to find them.</p> + </s3> + + <anchor name="icu"/> + <s3 title="Building ICU on UNIX platforms"> + + <p>To build ICU on any UNIX platform you need at least the + <code>autoconf</code> tool and GNU's <code>gmake</code> utility.</p> + + <p>First make sure that you have defined the following + environment variables:</p> + +<source>export ICUROOT=<icu_installdir> +export ICU_DATA=<icu_installdir>/data/</source> + + <p>Next, change to the ICU source directory. The following commands will create + a shell script called 'configure': </p> + +<source>cd $ICUROOT +cd source +autoconf</source> + + <p>Commands for specific UNIX platforms are different and are + described separately below.</p> + + <p>You will get a more detailed description of the use of + configure in the ICU documentation. 
The differences lie in the + arguments passed to the configure script, which is a + platform-independent generated shell-script (through + <code>autoconf</code>) and is used to generate platform-specific + <code>Makefiles</code> from generic <code>Makefile.in</code> files.</p> + + <p><em>For AIX:</em></p> + + <p>Type the following:</p> +<source>env CC="xlc_r -L/usr/lpp/xlC/lib" CXX="xlC_r -L/usr/lpp/xlC/lib" + C_FLAGS="-w -O" CXX_FLAGS="-w -O" +configure --prefix=$ICUROOT +cd common +gmake +gmake install +cd ../tools/makeconv +gmake</source> + + <p><em>For Solaris and Linux:</em></p> + +<source>env CC="cc" CXX="CC" C_FLAGS="-w -O" CXX_FLAGS="-w -O" + ./configure --prefix=$ICUROOT</source> + + <p><em>For HP-UX with the aCC compiler:</em></p> + +<source>env CC="cc" CXX="aCC" C_FLAGS="+DAportable -w -O" + CXX_FLAGS="+DAportable -w -O" ./configure --prefix=$ICUROOT</source> + + <p><em>For HP-UX with the CC compiler:</em></p> + +<source>env CC="cc" CXX="CC" C_FLAGS="+DAportable -w -O" + CXX_FLAGS="+eh +DAportable -w -O" ./configure --prefix=$ICUROOT</source> + + </s3> + + </s2> + + <s2 title="Where to look for more help"> + <p>If you read this page, followed the instructions, and + still cannot resolve your problem(s). 
You can find out if others have + solved this same problem before you by checking the + <jump href="http://xml-archive.webweaving.org/xml-archive-xerces/"> + &XercesCProjectName; mailing list archives</jump>.</p> + + <p>If all else fails, you can ask for help by joining the + <jump href="mailto:&XercesCEmailAddress;">&XercesCName; mailing list</jump>.</p> + </s2> + +</s1> \ No newline at end of file diff --git a/doc/caveats.xml b/doc/caveats.xml new file mode 100644 index 0000000000000000000000000000000000000000..366cfb9e8fb4253584115358c799cf168c1d75d0 --- /dev/null +++ b/doc/caveats.xml @@ -0,0 +1,16 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="Caveats and Limitations"> + <s2 title="Miscellaneous"> + <ul> + <li>SAXPrint does not output the <?XML ... ?> prologue + line (this means that it cannot process its own + output). This is because the SAX API doesn't provide + a callback handler for the prologue.</li> + <li>Only URLs of the form 'file://' are currently supported. + Others will be supported in future versions (we're adding + code to call libwww for this support).</li> + </ul> + </s2> +</s1> diff --git a/doc/createdoc.xml b/doc/createdoc.xml new file mode 100644 index 0000000000000000000000000000000000000000..1b1e4d4c636b6d34e7a2362ddaa7e956196a5365 --- /dev/null +++ b/doc/createdoc.xml @@ -0,0 +1,41 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 10"> + + <s2 title="CreateDOMDocument"> + <p>CreateDOMDocument illustrates how you can create a DOM tree in + memory from scratch. It then reports the number of elements in the tree that + was just created.</p> + + <s3 title="Building on Windows"> + <p>Load the &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. 
Then + build the project marked DOMCount.</p> + </s3> + + <s3 title="Building on UNIX"> +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd CreateDOMDocument +gmake</source> + <p>This will create the object files in the current directory and the executable named + CreateDOMDocument in ' &XercesCInstallDir;-linux/bin' directory.</p> + + <p>To delete all the generated object files and executables, type</p> +<source>gmake clean</source> + </s3> + + <s3 title="Running CreateDOMDocument"> + + <p>The CreateDOMDocument sample illustrates how you can create a DOM tree + in memory from scratch. To run CreateDOMDocument, enter the following</p> +<source>CreateDOMDocument</source> + <p>Here is a sample output from CreateDOMDocument</p> +<source>cd &XercesCInstallDir;-linux/samples/data +CreateDOMDocument +The tree just created contains: 4 elements.</source> + + </s3> + </s2> +</s1> diff --git a/doc/domcount.xml b/doc/domcount.xml new file mode 100644 index 0000000000000000000000000000000000000000..21e8189a126b88888baf3f62ec90af2ff5b8cf7e --- /dev/null +++ b/doc/domcount.xml @@ -0,0 +1,48 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 3"> + + <s2 title="DOMCount"> + <p>DOMCount uses the provided DOM API to parse an XML file, + constructs the DOM tree and walks through the tree counting + the elements (using just one API call).</p> + + <s3 title="Building on Windows"> + <p>Load the &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. 
Then + build the project marked DOMCount.</p> + </s3> + + <s3 title="Building on UNIX"> +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd DOMCount +gmake</source> + <p>This will create the object files in the current directory + and the executable named DOMCount in ' &XercesCInstallDir;-linux/bin' + directory.</p> + + <p>To delete all the generated object files and executables, type</p> +<source>gmake clean</source> + </s3> + + <s3 title="Running DOMCount"> + + <p>The DOMCount sample parses an XML file and prints out a count of the number of + elements in the file. To run DOMCount, enter the following </p> +<source>DOMCount <XML file></source> + <p>To use the validating parser, use </p> +<source>DOMCount -v <XML file></source> + <p>Here is a sample output from DOMCount</p> +<source>cd &XercesCInstallDir;-linux/samples/data +DOMCount -v personal.xml +personal.xml: 20 ms (37 elems)</source> + + <p>The output of both versions should be same.</p> + + <note>The time reported by the system may be different, depending on your + processor type.</note> + </s3> + </s2> +</s1> diff --git a/doc/domprint.xml b/doc/domprint.xml new file mode 100644 index 0000000000000000000000000000000000000000..b8a85898eb9fe2c147fc62d01b92510840d8624b --- /dev/null +++ b/doc/domprint.xml @@ -0,0 +1,92 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 4"> + + <s2 title="DOMPrint"> + <p>DOMPrint parses an XML file, constructs the DOM tree, and walks + through the tree printing each element. It thus dumps the XML back + (output same as SAXPrint).</p> + + <s3 title="Building on Windows"> + <p>Load the &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. Then + build the project marked DOMPrint. 
+ </p> + </s3> + <s3 title="Building on UNIX"> +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd DOMPrint +gmake</source> + <p> + This will create the object files in the current directory and the executable named + DOMPrint in '&XercesCInstallDir;-linux/bin' directory.</p> + + <p>To delete all the generated object files and executables, type</p> +<source>gmake clean</source> + </s3> + + <s3 title="Running DOMPrint"> + + <p>The DOMPrint sample parses an XML file, using either a validating + or non-validating DOM parser configuration, builds a DOM tree, + and then walks the tree and outputs the contents of the nodes + in a 'canonical' format. To run DOMPrint, enter the following:</p> +<source>DOMPrint [-v] <XML file></source> + <p>The -v option is used when you wish to use a validating parser. Here is a + sample output for DOMPrint when the validating parser is used: </p> +<source>cd &XercesCInstallDir;-linux/samples/data +DOMPrint -v personal.xml</source> + <p>Here is a sample output from DOMPrint</p> +<source>cd &XercesCInstallDir;-linux/samples/data +DOMPrint -v personal.xml + +<?xml version='1.0' encoding='utf-8?> +<!-- Revision: 63 1.7 samples/data/personal.xml --> +<personnel> + +<person id="Big.Boss"> + <name><family>Boss</family> <given>Big</given></name> + <email>chief@foo.com</email> + <link subordinates="one.worker two.worker three.worker + four.worker five.worker"></link> +</person> + +<person id="one.worker"> + <name><family>Worker</family> <given>One</given></name> + <email>one@foo.com</email> + <link manager="Big.Boss"></link> +</person> + +<person id="two.worker"> + <name><family>Worker</family> <given>Two</given></name> + <email>two@foo.com</email> + <link manager="Big.Boss"></link> +</person> + +<person id="three.worker"> + <name><family>Worker</family> <given>Three</given></name> + <email>three@foo.com</email> + <link manager="Big.Boss"></link> +</person> + +<person id="four.worker"> + 
<name><family>Worker</family> <given>Four</given></name> + <email>four@foo.com</email> + <link manager="Big.Boss"></link> +</person> + +<person id="five.worker"> + <name><family>Worker</family> <given>Five</given></name> + <email>five@foo.com</email> + <link manager="Big.Boss"></link> +</person> + +</personnel></source> + <p>Note that DOMPrint does not reproduce the original XML file. Also DOMPrint and + SAXPrint produce different results because of the way the two APIs store data + and capture events.</p> + </s3> + </s2> +</s1> diff --git a/doc/entities.ent b/doc/entities.ent new file mode 100644 index 0000000000000000000000000000000000000000..e19c0a1e627ee9a7384ff55836082047bd20c85c --- /dev/null +++ b/doc/entities.ent @@ -0,0 +1,11 @@ +<?xml encoding="US-ASCII"?> + +<!ENTITY XercesCFullName "Xerces C++ Parser"> <!-- fullproductname --> +<!ENTITY XercesCName "Xerces-C"> <!-- productname --> +<!ENTITY XercesCVersion "1.0.1"> <!-- versionnumber --> +<!ENTITY XercesCProjectName "Xerces"> <!-- projectname --> +<!ENTITY XercesCInstallDir "xerces-c-1_0_1"> <!-- installdirname --> +<!ENTITY XercesCSrcInstallDir "xerces-c-src-1_0_1"> <!-- sourcedirectory --> +<!ENTITY XercesCWindowsLib "xerces-c_1_0"> <!-- windowslibname --> +<!ENTITY XercesCUnixLib "libxerces-c1_0"> <!-- unixlibname --> +<!ENTITY XercesCEmailAddress "xerces-dev@xml.apache.org "> <!-- emailaddress --> diff --git a/doc/enumval.xml b/doc/enumval.xml new file mode 100644 index 0000000000000000000000000000000000000000..3161cc09eb4786f39d977a6026ad0a5d8bcfb888 --- /dev/null +++ b/doc/enumval.xml @@ -0,0 +1,69 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 9"> + + <s2 title="EnumVal"> + <p>EnumVal shows how to enumerate the markup decls in a DTD Validator.</p> + + <s3 title="Building on Windows"> + <p>Load the &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. 
Then + build the project marked EnumVal.</p> + </s3> + + <s3 title="Building on UNIX"> +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd EnumVal +gmake</source> + <p>This will create the object files in the current directory and the executable named + EnumVal in ' &XercesCInstallDir;-linux/bin' directory.</p> + + <p>To delete all the generated object files and executables, type</p> +<source>gmake clean</source> + </s3> + + <s3 title="Running EnumVal"> + <p>This program parses a file, then shows how to enumerate the contents of the validator pools. + To run EnumVal, enter the following </p> +<source>EnumVal <XML file></source> + <p>Here is a sample output from EnumVal</p> +<source>cd &XercesCInstallDir;-linux/samples/data +EnumVal personal.xml + +ELEMENTS: +---------------------------- + Name: personnel + Content Model: (person)+ + + Name: person + Content Model: (name,email*,url*,link?) + Attributes: + Name:id, Type: ID + + Name: name + Content Model: (#PCDATA|family|given)* + + Name: email + Content Model: (#PCDATA)* + + Name: url + Content Model: EMPTY + Attributes: + Name:href, Type: CDATA + + Name: link + Content Model: EMPTY + Attributes: + Name:subordinates, Type: IDREF(S) + Name:manager, Type: IDREF(S) + + Name: family + Content Model: (#PCDATA)* + + Name: given + Content Model: (#PCDATA)*</source> + </s3> + </s2> +</s1> diff --git a/doc/faq-distrib.xml b/doc/faq-distrib.xml new file mode 100644 index 0000000000000000000000000000000000000000..2437a14286915bed4f5e82e1820fd5c51d475bdb --- /dev/null +++ b/doc/faq-distrib.xml @@ -0,0 +1,229 @@ +<?xml version="1.0" ?> +<!DOCTYPE faqs SYSTEM "sbk:/style/dtd/faqs.dtd"> + +<faqs title="Distributing &XercesCName;"> + + <faq title="Which DLL's do I need to distribute with my application?"> + <q>Which DLL's do I need to distribute with my application?</q> + <a> + <p>There are currently two configurations in which Xerces-C binaries + are published. 
One is from the Apache site, while the other is + from the IBM Alphaworks Site.</p> + + <p>The <jump href="http://xml.apache.org/dist">Apache download + site</jump> binary drops only contain support for ASCII, UTF-8, UTF-16 + and UCS4 encodings. The parser intrinsically supports transcoding + input files in these encodings to Unicode (all internal processing in + the parser happens in Unicode). If you are using these Xerces-C + binaries in your application, then you only need to distribute + <em>one</em> file:<br></br> + &XercesCWindowsLib;.dll for Windows NT/95/98, or<br/> + &XercesCUnixLib;.a for AIX, or<br/> + &XercesCUnixLib;.so for Solaris/Linux, or<br/> + &XercesCUnixLib;.sl for HP-UX.</p> + + <p>However, if your application needs to support more international + encodings, other than the one's mentioned above, then you may use the + XML4C binaries published by IBM at their + <jump href="http://www.alphaworks.ibm.com/tech/xml4c">AlphaWorks</jump> + site. XML4C binaries use and include + <jump href="http://www10.software.ibm.com/developerworks/opensource/icu/index.html">IBM + Classes for Unicode</jump> (ICU) (also an open source project but under + <jump href="http://www10.software.ibm.com/developerworks/opensource/license10.html">IBM + Public License</jump>) for transcoding and as a result can parse input + files in over 100 different encodings. If you are using XML4C binaries + in your application, then in <em>addition</em> to the &XercesCName; library + file mentioned above, you also need to ship:</p> + + <ol> + <li><em>ICU shared library file</em>:<br></br> + icuuc.dll for Windows NT/95/98, or<br></br> + libicu-uc.a for AIX, or<br></br> + libicu-uc.so for Solaris/Linux, or<br></br> + libicu-uc.sl for HP-UX.</li> + <li><em>ICU converter files</em>: *.cnv.<br></br> + These are platform specific binary files which contain the tables used + to transcode characters from the respective encoding to Unicode. 
These + files may be found in the 'lib/icu/data' directory of the XML4C binary + archives.</li> + </ol> + </a> + </faq> + + <faq title="How do I package the sources to create a binary drop?"> + + <q>How do I package the sources to create a binary drop?</q> + + <a> + <p>You have to first compile the sources inside your IDE to + create the required DLLs and EXEs. Then you need to copy + over the binaries to another directory for the binary + drop. A perl script has been provided to give you a jump + start. You need to install perl on your machine for the script to work. + The file may not work if you have changed your + source tree. You have to modify the script to suit + your current state of the source tree. To invoke the + script, go to the \<&XercesCProjectName;>\scripts directory, and type:</p> +<source>perl packageBinaries.pl</source> + + <p>You will get a message that looks like: </p> + +<source>Usage is: packageBinaries <options> +options are: -s <source_directory> + -o <target_directory> + -c <C compiler name> (e.g. gcc or xlc) + -x <C++ compiler name> (e.g. g++ or xlC) + -m <message loader> can be 'inmem', 'icu' or 'iconv' + -n <net accessor> can be 'fileonly' or 'libwww' + -t <transcoder> can be 'icu' or 'native' + -r <thread option> can be 'pthread' or 'dce' (only used on HP-11) + -h to get help on these commands +Example: perl packageBinaries.pl -s$HOME/xerces-c_1_0_0 + -o$HOME/xerces-c_1_0_0 + -cgcc -xg++ -minmem + -nfileonly -tnative</source> + + <p>Make sure that your compiler can be invoked from the command line and + follow the instructions to produce a binary drop.</p> + </a> + </faq> + + <faq title="When will a port to my platform be available?"> + + <q>When will a port to my platform be available?</q> + + <a> + <p>Ports to other platforms are planned, but dates are not + fixed yet. 
In the meantime, look below to see a + description of the steps you need to follow to port it to + another platform.</p> + + <p>We strongly encourage you to submit the changes that were + required to make it work on another platform. We will + incorporate these changes in the source code base and make + them available in future releases.</p> + + <p>All such changes may be sent to: < + <jump href="mailto:&XercesCEmailAddress;">&XercesCEmailAddress;</jump>>.</p> + </a> + </faq> + + <faq title="How can I port &XercesCProjectName; to my favourite platform?"> + <q>How can I port &XercesCProjectName; to my favourite platform?</q> + <a> + <p>All platform dependent code in &XercesCProjectName; has been isolated to + a couple of files, which should ease the porting effort. + Here are the basic steps that should be followed to port + &XercesCProjectName;.</p> + + <ol> + <li>The directory 'src/util/Platforms' contains the + platform sensitive files while 'src/util/Compilers' contains all + development environment sensitive files. Each + operating system has a file of its own and each + development environment has another one of its own too. + <br/>As an example, the Win32 platform has a Win32Defs.hpp file + and the Visual C++ environment has a <code>VCPPDefs.hpp</code> file. + These files set up certain define tokens, typedefs, + constants, etc. that will drive the rest of the code to + do the right thing for that platform and development + environment. AIX/CSet have their own <code>AIXDefs.hpp</code> and + <code>CSetDefs.hpp</code> files, and so on. You should create new + versions of these files for your platform and environment + and follow the comments in them to set up your own. + The comments in the Win32 and Visual C++ files are probably + the best to follow, since that is where the main + development is done.</li> + + <li>Next, edit the file XML4CDefs.hpp, which is where all + of the fundamental stuff comes into the system. 
You will + see conditional sections in there where the above + per-platform and per-environment headers are brought in. + Add the new ones for your platform under the appropriate + conditionals.</li> + + <li>Now edit 'AutoSense.hpp'. Here we set canonical &XercesCProjectName; + internal #define tokens which indicate the platform and + compiler. These definitions are based on known platform + and compiler defines. + <br/> + AutoSense.hpp is included in XML4CDefs.hpp and the + canonical platform and compiler settings thus defined will + cause the particular platform and compiler headers to be + included at compilation. + <br/> + It might be a little tricky to decipher this file so be + careful. If you are using, say, another compiler on Win32, + it will probably use similar tokens, so the platform + will be picked up using what is already there.</li> + + <li>Once this is done, you will then need to implement a + version of the 'platform utilities' for your platform. + Each operating system has a file which implements some + methods of the XMLPlatformUtils class, specific to that + operating system. These are not terribly complex, so it + should not be a lot of work. The Win32 version is called + Win32PlatformUtils.cpp, the AIX version is + AIXPlatformUtils.cpp and so on. Create one for your + platform, with the correct name, and empty out all of the + implementation so that just the empty shells of the + methods are there (with dummy returns where needed to make + the compiler happy). Once you've done that, you can start + to get it to build without any real implementation.</li> + + <li>Once you have the system building, then start + implementing your own platform utilities methods. Follow + the comments in the Win32 version as to what they do; the + comments will be improved in subsequent versions, but they + should be fairly obvious now. 
Once you have these + implementations done, you should be able to start + debugging the system using the demo programs.</li> + </ol> + <p>That is the work required in a nutshell!</p> + </a> + </faq> + + <faq title="What application did you use to create the documentation?"> + <q>What application did you use to create the documentation?</q> + <a> + <p>We have used an internal XML based application to create the + documentation. The documentation files are all written in XML and the + application, internally codenamed StyleBook, makes use of XSL to transform + them into the HTML document that you are seeing right now. + It is currently available on the + <jump href="http://xml.apache.org/">Apache</jump> open source website as + <jump href="http://xml.apache.org/cocoon/index.html">Cocoon</jump>.</p> + + <p>The API documentation is created using + <jump href="http://www.zib.de/Visual/software/doc++/index.html">DOC++</jump>.</p> + </a> + </faq> + + <faq title="Source code for the C++ Builder TreeViewer?"> + <q>Can I get the source code for the C++ Builder TreeViewer application?</q> + <a> + <p>In view of the numerous requests that we have received for + the TreeViewer sample application (written using C++ + Builder), we have decided to make it available as an + independent download from the IBM + <jump href="http://www.alphaworks.ibm.com">AlphaWorks</jump> portal. Please + note, this is provided on an "as-is, no support" basis.</p> + + <p>TreeViewer parses the XML file, using &XercesCProjectName;, + and displays the data as a tree.</p> + + <p>We welcome your additional feedback at: < + <jump href="mailto:&XercesCEmailAddress;">&XercesCEmailAddress;</jump>></p> + </a> + + </faq> + <faq title="Can I use &XercesCProjectName; in my product?"> + <q>Can I use &XercesCProjectName; in my product?</q> + <a> + <p>Yes! 
Read the license agreement first and contact us at + <<jump href="mailto:&XercesCEmailAddress;">&XercesCEmailAddress;</jump>> + if you need assistance.</p> + </a> +</faq> +</faqs> + diff --git a/doc/faq-migrate.xml b/doc/faq-migrate.xml new file mode 100644 index 0000000000000000000000000000000000000000..71476ec08c2c81da0dbee234d0bb9b07f575c6b8 --- /dev/null +++ b/doc/faq-migrate.xml @@ -0,0 +1,15 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE faqs SYSTEM "sbk:/style/dtd/faqs.dtd"> + +<faqs title="Migrating to Xerces-C"> + <faq title="Migrating from xml4c"> + <q>tba</q> + <a><p>tba</p> + </a> + </faq> + <faq title="Migrating from X"> + <q>tba</q> + <a><p>tba</p> + </a> + </faq> +</faqs> diff --git a/doc/faq-other.xml b/doc/faq-other.xml new file mode 100644 index 0000000000000000000000000000000000000000..8d55dce9a00ef09e6a80256c766007a12e930c43 --- /dev/null +++ b/doc/faq-other.xml @@ -0,0 +1,42 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE faqs SYSTEM "sbk:/style/dtd/faqs.dtd"> + +<faqs title="Other &XercesCName; Questions"> + + <faq title="I can't use C++. Do you have a Java version?"> + <q>I can't use C++. Do you have a Java version?</q> + <a> + <p>Yes. The &XercesCProjectName; family of products also has a Java version.</p> + </a> + </faq> + + <faq title="I found a bug - what do I do?"> + <q>I found a bug - what do I do?</q> + <a> + <p>Send the bug report to <<jump href="mailto:&XercesCEmailAddress;"> + &XercesCEmailAddress;</jump>> with the version number, + the exact OS release number, the compiler version number, and a + copy of the XML document that generates the error. 
The more + information you can provide, the faster we can get a fix into the + build!</p> + </a> + </faq> + + <faq title="I have a question not covered -- who + do I contact?"> + <q>I have a question not covered here, or in the documentation -- who do I contact?</q> + <a> + <p>First post your question on the + <jump href="http://www.alphaworks.ibm.com/aw.nsf/discussion?ReadForm&/forum/xml4c.nsf/discussion?createdocument">XML4C + discussion group on Alphaworks</jump>, and someone from the + &XercesCProjectName; development team will answer that question. The list is + monitored very closely and the response is usually within 24 hours. + If you need to ask a special question privately, send email to + <<jump href="mailto:&XercesCEmailAddress;">&XercesCEmailAddress;</jump>> + and give us as much information as you can. We will get back to + you as soon as possible.</p> + </a> + </faq> + +</faqs> + diff --git a/doc/faq-parse.xml b/doc/faq-parse.xml new file mode 100644 index 0000000000000000000000000000000000000000..8feedc0cb3ab5e5093a32ef321e6e3a6d601a073 --- /dev/null +++ b/doc/faq-parse.xml @@ -0,0 +1,674 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE faqs SYSTEM "sbk:/style/dtd/faqs.dtd"> + +<faqs title="Parsing with &XercesCName;"> + <faq title="Why does my application crash on AIX when I run it under a + multi-threaded environment?"> + + <q>Why does my application crash on AIX when I run it under a + multi-threaded environment?</q> + + <a> + <p>AIX maintains two kinds of libraries on the system, + thread-safe and non-thread safe. Multi-threaded libraries on + AIX follow a different naming convention, Usually the + multi-threaded library names are followed with "_r". For + example, libc.a is single threaded whereas libc_r.a is + multi-threaded.</p> + + <p>To make your multi-threaded application run on AIX, you + MUST ensure that you do not have a 'system library path' in + your LIBPATH environment variable when you run the + application. 
The appropriate libraries (threaded or + non-threaded) are automatically picked up at runtime. An + application usually crashes when you build your application + for multi-threaded operation but don't point to the + thread-safe version of the system libraries. For example, + LIBPATH can be simply set as:</p> + +<source>LIBPATH=$HOME/<&XercesCProjectName;>/lib</source> + + <p>Where <&XercesCProjectName;> points to the directory where + &XercesCProjectName; application resides.</p> + + <p>If for any reason, unrelated to &XercesCProjectName;, you need to + keep a 'system library path' in your LIBPATH environment + variable, you must make sure that you have placed the + thread-safe path before you specify the normal system + path. For example, you must place <ref>/lib/threads</ref> before + <ref>/lib</ref> in your LIBPATH variable. That is to say your + LIBPATH may look like this:</p> + +<source>export LIBPATH=$HOME/<&XercesCProjectName;>/lib:/usr/lib/threads:/usr/lib</source> + + <p>Where /usr/lib is where your system libraries are.</p> + </a> + </faq> + + <faq title="What compilers are being used on the supported platforms?"> + + <q>What compilers are being used on the supported platforms?</q> + + <a> + <p>&XercesCProjectName; has been built on the following platforms with these + compilers</p> + + <table> + <tr><td><em>Operating System</em></td><td><em>Compiler</em></td></tr> + <tr><td>Windows NT SP5/98</td><td>MSVC 6.0</td></tr> + <tr><td>Redhat Linux 6.0</td><td>gcc</td></tr> + <tr><td>AIX 4.1.4 and higher</td><td>xlC 3.1</td></tr> + <tr><td>Solaris 2.6</td><td>CC version 4.2</td></tr> + <tr><td>HP-UX B10.2</td><td>aCC and CC</td></tr> + <tr><td>HP-UX B11</td><td>aCC and CC</td></tr> + </table> + </a> + </faq> + + <faq title="I cannot run my sample applications. What is wrong?"> + + <q>I cannot run my sample applications. 
What is wrong?</q> + <a> + <p>There are two major installation issues which must be dealt + with in order to use &XercesCProjectName; from your applications. The + DLL or shared library must be locatable via the system's + environment. And, the converter files used by &XercesCProjectName; for + its transcoding must be locatable. + </p> + <p>On UNIX platforms you need to ensure that your library search + environment variable includes the directory which has the + &XercesCProjectName; shared library (On AIX, this is LIBPATH, on + Solaris and Linux it is LD_LIBRARY_PATH while on HP-UX it is + SHLIB_PATH). Thus, if you installed your binaries under + <code>$HOME/fastxmlparser</code>, you need to point your + library path to that directory. + </p> + +<source>export LIBPATH=$LIBPATH:$HOME/fastxmlparser/lib # (AIX) + +export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/fastxmlparser/lib # (Solaris, Linux) + +export SHLIB_PATH=$SHLIB_PATH:$HOME/fastxmlparser/lib # (HP-UX)</source> + + <p>On Win32, you would ensure that the &XercesCProjectName; DLLs are in + the PATH environment.</p> + + <p>For the transcoding files <code>(*.cnv)</code>, the + easiest mechanism (which is used in the binary release) is to + place them relative to the shared library or DLL. The + transcoding converter files should be in the + <code>icu/data</code> directory relative to the shared library + or DLL. This will allow them to be located automatically.</p> + + <p>However, if you redistribute &XercesCProjectName; within some other + product, and cannot maintain this relationship, or if your + build scenario does not allow you to maintain this + relationship during debugging for instance, you can use the + ICU_DATA environment variable to point to these converter + files (make sure the variable ends with a backslash '\' on Windows platforms). 
+ This variable may be set system wide, within a + particular command window, or just within the client + application or higher level libraries, as is deemed + necessary. It must be set before the XML system is initialized + (see below). + </p> + </a> + </faq> + + <faq title="I just built my own application using the &XercesCProjectName; parser. Why does it + crash?"> + + <q>I just built my own application using the &XercesCProjectName; parser. Why does it + crash?</q> + <a> + <p>In order to work with the &XercesCProjectName; parser, you have to + first initialize the XML subsystem. The most common mistake is + to forget this initialization. Before you make any calls to + &XercesCProjectName; APIs, you must call <code>XMLPlatformUtils::Initialize()</code>:</p> + +<source>// Initialize the XML subsystem before any other call +try { + XMLPlatformUtils::Initialize(); +} +catch (const XMLException& toCatch) { + // Do your failure processing here +}</source> + + <p>This initializes the &XercesCProjectName; system and sets its + internal variables. Note that you must include the + <code>util/PlatformUtils.hpp</code> file for this to work.</p> + + <p>The second common problem is the absence of the transcoding + converter files <code>(*.cnv)</code>. This problem has a + simple fix, if you understand how the transcoding converter + files are located.</p> + + <p>&XercesCProjectName; first looks for the environment variable + ICU_DATA. If it finds this variable in your environment + settings, then it assumes that the transcoding converter files + are kept in that directory. 
Thus, for example, if you had set + your environment variable to (say):</p> + +<source>set ICU_DATA=d:\my&XercesCProjectName;\icu\data\</source> + + <p>The transcoding converter files (all files having extension + .cnv and convrtrs.txt) will be searched under + <code>d:\my&XercesCProjectName;\icu\data</code></p> + + <p>If you have not set your environment variable, then the + search for the transcoding converters is done relative to the + location of the shared library &XercesCWindowsLib;.dll (or + &XercesCUnixLib;.a on AIX and &XercesCUnixLib;.so on Solaris and + Linux, &XercesCUnixLib;.sl on HP-UX). Thus if your shared library + is in <code>d:\fastxmlparser\lib</code>, then your transcoding + converter files should be in + <code>d:\fastxmlparser\lib\icu\data.</code></p> + + <p>Before you run your application, make sure that you have + covered the two possibilities mentioned above.</p> + + </a> + </faq> + + + + <faq title="Is &XercesCProjectName; thread-safe?"> + + <q>Is &XercesCProjectName; thread-safe?</q> + + <a> + <p>This is not a question that has a simple yes/no answer. Here are + the rules for using &XercesCProjectName; in a multi-threaded environment:</p> + + <p>Within an address space, an instance of the parser may be used + without restriction from a single thread, or an instance of the + parser can be accessed from multiple threads, provided the + application guarantees that only one thread has entered a method + of the parser at any one time.</p> + + <p>When two or more parser instances exist in a process, the + instances can be used concurrently, and without external + synchronization. 
That is, in an application containing two + parsers and two threads, one parser can be running within the + first thread concurrently with the second parser running + within the second thread.</p> + + <p>The same rules apply to &XercesCProjectName; DOM documents - + multiple document instances may be concurrently accessed from + different threads, but any given document instance can only be + accessed by one thread at a time.</p> + + <p>DOMStrings allow multiple concurrent readers. All DOMString + const methods are thread safe, and can be concurrently entered + by multiple threads. Non-const DOMString methods, such as + appendData(), are not thread safe and the application must + guarantee that no other methods (including const methods) are + executed concurrently with them.</p> + </a> + </faq> + + <faq title="Why does my multi-threaded application crash on Solaris?"> + <q>Why does my multi-threaded application crash on Solaris?</q> + <a> + <p>The problem appears because the throw call on Solaris 2.6 + is not multi-thread safe. Sun Microsystems provides a patch to + solve this problem. To get the latest patch for solving this + problem, go to <jump href="http://sunsolve.sun.com">SunSolve.sun.com</jump> + and get the appropriate patch for your operating system. + For Intel machines running Solaris, you need to get Patch ID 104678. + For SPARC machines you need to get Patch ID 105591.</p> + </a> + </faq> + + <faq title="How do I find out what version of &XercesCProjectName; I am using?"> + <q>How do I find out what version of &XercesCProjectName; I am using?</q> + <a> + <p>The version string for &XercesCProjectName; happens to be in one of + the source files. Look inside the file + <code>src/util/XML4CDefs.hpp</code> and find out what the + static variable <code>gXML4CFullVersionStr</code> is defined + to be. (It is usually of the form 3.0.0 or something + similar). 
This is the version of &XercesCProjectName; you are using.</p> + + <p>If you don't have the source code, you have to find the version + information from the shared library name. On Windows NT/95/98 + right click on the DLL name &XercesCWindowsLib;.dll in the bin directory + and look up properties. The version information may be found on + the Version tab.</p> + + <p>On AIX, just look for the library name &XercesCUnixLib;.a (or + &XercesCUnixLib;.so on Solaris/Linux and &XercesCUnixLib;.sl on + HP-UX). The version number is coded in the name of the + library.</p> + </a> + </faq> + + <faq title="How do I uninstall &XercesCProjectName;?"> + <q>How do I uninstall &XercesCProjectName;?</q> + <a> + <p>&XercesCProjectName; only installs itself in a single directory and + does not set any registry entries. Thus, to uninstall, you + only need to remove the directory where you installed it, and + all &XercesCProjectName; related files will be removed.</p> + </a> + </faq> + + <faq title="How do I add an additional transcoding file in the existing set?"> + <q>How do I add an additional transcoding file in the existing set?</q> + <a> + <p>Transcoding files shipped with binary drops of &XercesCProjectName; + exist in the <code>bin/icu/data</code> directory on Win32 and + in the <code>lib/icu/data</code> directory on the various + UNIX platforms. All transcoding files have the extension .cnv and are + platform specific binary files. The ICU drop provides the + utility 'makeconv' to generate these binary files. To add an + additional transcoding file, you need to first define your new + code-set in ASCII format (which has the extension .ucm). The + coding format for an encoding may be obtained from one of the + existing files in icu/data (in the source drop). 
After you + create the .ucm file for your new code-set, you need to + convert it to a binary form using makeconv.</p> + + <p>Thus, if your new code-set is defined in file + mynewcodeset.ucm, you would type:</p> + +<source>makeconv mynewcodeset.ucm</source> + + <p>...to create the binary transcoding file mynewcodeset.cnv. Make + sure that this .cnv file is packaged in the same place as the + others, i.e. in a directory <code>icu/data</code> relative to + where your shared library is.</p> + + <p>You can also add aliases for this encoding in the file + 'convrtrs.txt', also present in the same directory as the + converter files.</p> + </a> + </faq> + + <faq title="How are entity reference nodes handled in DOM?"> + <q>How are entity reference nodes handled in DOM?</q> + <a> + <p>If you are using the native DOM classes, the function + <code>setExpandEntityReferences</code> controls how entities appear in the + DOM tree. When setExpandEntityReferences is set to false (the + default), an occurrence of an entity reference in the XML + document will be represented by a subtree with an + EntityReference node at the root whose children represent the + entity expansion. The entity expansion will be a DOM tree + representing the structure of the entity expansion, not a text + node containing the entity expansion as text.</p> + + <p>If setExpandEntityReferences is true, an entity reference in the + XML document is represented by only the nodes that represent the + entity expansion. The DOM tree will not contain any + EntityReference nodes.</p> + </a> + </faq> + + <faq title="What kinds of URLs are currently supported in &XercesCProjectName;?"> + <q>What kinds of URLs are currently supported in &XercesCProjectName;?</q> + <a> + <p>We now have a spec. 
compliant, but limited, implementation of + the class URL.</p> + <ul> + <li>The only protocol currently supported is the "file://" + which is used to refer to files locally.</li> + + <li>Only the 'localhost' string is supported in the host + placeholder in the URL syntax.</li> + </ul> + + <p>This should work for command line arguments to samples as well as + any usage in the XML file when referring to an external file.</p> + + <p>Examples of what this implementation will allow you to do are:</p> + +<source>e:\>domcount file:///e:/&XercesCProjectName;/build/win32/vc6/debug/abc.xml + +or + +e:\>domcount file::///&XercesCProjectName;/build/win32/vc6/debug/abc.xml +e:\>domcount file::///d:/abc.xml + +or + +e:\>domcount file:://localhost/d:/abc.xml</source> + + <p>Example of what you cannot do is:</p> + + <p>Refer to files using the 'file://' syntax and giving a + relative path to the file.</p> + + <p>This implies that if you are using the 'file://' syntax to + refer to external files, you have to give the complete path to + files even in the current directory.</p> + + <p>You always have the option of not using the 'file://' syntax + and referring to files by just giving the filename or a + relative path to it as in:</p> + +<source>domcount abc.xml</source> + </a> + </faq> + + <faq title="Can I use &XercesCProjectName; to parse HTML?"> + <q>Can I use &XercesCProjectName; to parse HTML?</q> + <a> + <p>Yes, if it follows the XML spec rules. Most HTML, however, + does not follow the XML rules, and will therefore generate XML + well-formedness errors.</p> + </a> + </faq> + + <faq title="I keep getting an error: "invalid UTF-8 character". What's wrong?"> + <q>I keep getting an error: "invalid UTF-8 character". What's wrong?</q> + <a> + <p>There are many Unicode characters that are not allowed in + your XML document, according to the XML spec. 
Typical + disallowed characters are control characters, even if you + escape them using the Character Reference form: See the XML + spec, sections 2.2 and 4.1 for details. If the parser is + generating this error, it is very likely that there's a + character in there that you can't see. You can generally use + a UNIX command like "od -hc" to find it.</p> + + <p>Another reason for this error is that your file is in some + non UTF/ASCII encoding but you gave no encoding="" string in + your file to tell the parser what its real encoding is.</p> + </a> + </faq> + + <faq title="What encodings are supported by &XercesCProjectName;?"> + <q>What encodings are supported by &XercesCProjectName;?</q> + <a> + <p>&XercesCProjectName; uses a subset of IBM's International Classes for + Unicode (ICU) for encoding & Unicode support. &XercesCFullName; is + Unicode 3.0 compliant. Please note that <em>these encodings are supported + only if you build &XercesCName; with ICU library</em>, not as a + standalone library using native encoding support.</p> + + <p>Besides ASCII, the following encodings are currrently supported:</p> + + <table> + <tr><td><em>Common Name</em></td><td><em>Use this MIME/IANA name in XML</em></td></tr> + <tr><td>8 bit Unicode</td><td>UTF-8</td></tr> + <tr><td>ISO Latin 1</td><td>ISO-8859-1</td></tr> + <tr><td>ISO Latin 2</td><td>ISO-8859-2</td></tr> + <tr><td>ISO Latin 3</td><td>ISO-8859-3</td></tr> + <tr><td>ISO Latin 4</td><td>ISO-8859-4</td></tr> + <tr><td>ISO Latin Cyrillic</td><td>ISO-8859-5</td></tr> + <tr><td>ISO Latin Arabic</td><td>ISO-8859-6</td></tr> + <tr><td>ISO Latin Greek</td><td>ISO-8859-7</td></tr> + <tr><td>ISO Latin Hebrew</td><td>ISO-8859-8</td></tr> + <tr><td>ISO Latin 5</td><td>ISO-8859-9</td></tr> + <tr><td>EBCDIC: US</td><td>ebcdic-cp-us</td></tr> + <tr><td>EBCDIC: Canada</td><td>ebcdic-cp-ca</td></tr> + <tr><td>EBCDIC: Netherlands</td><td>ebcdic-cp-nl</td></tr> + <tr><td>EBCDIC: Denmark</td><td>ebcdic-cp-dk</td></tr> + <tr><td>EBCDIC: 
Norway</td><td>ebcdic-cp-no</td></tr> + <tr><td>EBCDIC: Finland</td><td>ebcdic-cp-fi</td></tr> + <tr><td>EBCDIC: Sweden</td><td>ebcdic-cp-se</td></tr> + <tr><td>EBCDIC: Italy</td><td>ebcdic-cp-it</td></tr> + <tr><td>EBCDIC: Spain, Latin America</td><td>ebcdic-cp-es</td></tr> + <tr><td>EBCDIC: Great Britain</td><td>ebcdic-cp-gb</td></tr> + <tr><td>EBCDIC: France</td><td>ebcdic-cp-fr</td></tr> + <tr><td>EBCDIC: Arabic</td><td>ebcdic-cp-ar1</td></tr> + <tr><td>EBCDIC: Hebrew</td><td>ebcdic-cp-he</td></tr> + <tr><td>EBCDIC: Switzerland</td><td>ebcdic-cp-ch</td></tr> + <tr><td>EBCDIC: Roece</td><td>ebcdic-cp-roece</td></tr> + <tr><td>EBCDIC: Yugoslavia</td><td>ebcdic-cp-yu</td></tr> + <tr><td>EBCDIC: Iceland</td><td>ebcdic-cp-is</td></tr> + <tr><td>EBCDIC: Urdu</td><td>ebcdic-cp-ar2</td></tr> + <tr><td>Chinese for PRC, mixed 1/2 byte</td><td>gb2312</td></tr> + <tr><td>Extended UNIX code, packed for Japanese</td><td>euc-jp</td></tr> + <tr><td>Japanese: iso-2022-jp</td><td>iso-2022-jp</td></tr> + <tr><td>Japanese: Shift JIS</td><td>Shift_JIS</td></tr> + <tr><td>Chinese: Big5</td><td>Big5</td></tr> + <tr><td>Extended UNIX code, packed for Korean</td><td>euc-kr</td></tr> + <tr><td>Korean: iso-2022-kr</td><td>iso-2022-kr</td></tr> + <tr><td>Cyrillic</td><td>koi8-r</td></tr> + </table> + + <!-- <ul> + <li>UTF-8</li> + <li>UTF-16 Big Endian, UTF-16 Little Endian</li> + <li>IBM-1208</li> + <li>ISO Latin-1 (ISO-8859-1)</li> + <li>ISO Latin-2 (ISO-8859-2) [Bosnian, Croatian, Czech, + Hungarian, Polish, Romanian, Serbian (in Latin + transcription), Serbo-Croatian, Slovak, Slovenian, Upper + Sorbian and Lower Sorbian]</li> + <li>ISO Latin-3 (ISO-8859-3) [Maltese, Esperanto]</li> + <li>ISO Latin-4 (ISO-8859-4)</li> + <li>ISO Latin Cyrillic (ISO-8859-5)</li> + <li>ISO Latin Arabic (ISO-8859-6) [Arabic]</li> + <li>ISO Latin Greek (ISO-8859-7)</li> + <li>ISO Latin Hebrew (ISO-8859-8) [Hebrew]</li> + <li>ISO Latin-5 (ISO-8859-9) [Turkish]</li> + <li>Extended Unix Code, packed for 
Japanese (euc-jp, eucjis)</li> + <li>Japanese Shift JIS (shift-jis)</li> + <li>Chinese (big5)</li> + <li>Extended Unix Code, packed for Korean (euc-kr)</li> + <li>Russian Unix, Cyrillic (koi8-r)</li> + <li>Windows Thai (cp874)</li> + <li>Latin 1 Windows (cp1252)</li> + <li>cp858</li> + <li>EBCDIC encodings:</li></ul><ul> + <li>EBCDIC US (ebcdic-cp-us)</li> + <li>EBCDIC Canada (ebcdic-cp-ca)</li> + <li>EBCDIC Netherland (ebcdic-cp-nl)</li> + <li>EBCDIC Denmark (ebcdic-cp-dk)</li> + <li>EBCDIC Norway (ebcdic-cp-no)</li> + <li>EBCDIC Finland (ebcdic-cp-fi)</li> + <li>EBCDIC Sweden (ebcdic-cp-se)</li> + <li>EBCDIC Italy (ebcdic-cp-it)</li> + <li>EBCDIC Spain & Latin America (ebcdic-cp-es)</li> + <li>EBCDIC Great Britain (ebcdic-cp-gb)</li> + <li>EBCDIC France (ebcdic-cp-fr)</li> + <li>EBCDIC Hebrew (ebcdic-cp-he)</li> + <li>EBCDIC Switzerland (ebcdic-cp-ch)</li> + <li>EBCDIC Roece (ebcdic-cp-roece)</li> + <li>EBCDIC Yugoslavia (ebcdic-cp-yu)</li> + <li>EBCDIC Iceland (ebcdic-cp-is)</li> + <li>EBCDIC Urdu (ebcdic-cp-ar2)</li> + <li>Latin 0 EBCDIC</li></ul> + + <p>Additional encodings to be available later:</p> + + <ul> + <li>EBCDIC Arabic (ebcdic-cp-ar1)</li> + <li>Chinese for PRC (mixed 1/2 byte) (gb2312)</li> + <li>Japanese ISO-2022-JP (iso-2022-jp)</li> + <li>Cyrllic (koi8-r)</li> + </ul> + + <p>The ICU uses IBM's UPMAP format as source files for data-based + conversion. All codepages represented in that format are supported + (i.e: SBCS, DBCS, MBCS and EBCDIC_STATEFUL), with the exception of + codepages with a maximum character length strictly greater than + two bytes (e.g. 
this excludes 1350 and 964).</p> + + <p>The following is a non-exhaustive list of codepages that are + supported by the international library packaged with the product.</p> + + <p> + ibm-1004, + ibm-1006, + ibm-1008, + ibm-1038, + ibm-1041, + ibm-1043, + ibm-1047, + ibm-1051, + ibm-1088, + ibm-1089, + ibm-1098, + ibm-1112, + ibm-1114, + ibm-1115, + ibm-1116, + ibm-1117, + ibm-1118, + ibm-1119, + ibm-1123, + ibm-1140, + ibm-1141, + ibm-1142, + ibm-1143, + ibm-1144, + ibm-1145, + ibm-1146, + ibm-1147, + ibm-1148, + ibm-1149, + ibm-1153, + ibm-1154, + ibm-1155, + ibm-1156, + ibm-1157, + ibm-1158, + ibm-1159, + ibm-1160, + ibm-1164, + ibm-1250, + ibm-1251, + ibm-1252, + ibm-1253, + ibm-1254, + ibm-1255, + ibm-1256, + ibm-1257, + ibm-1258, + ibm-12712, + ibm-1275, + ibm-1276, + ibm-1277, + ibm-1280, + ibm-1281, + ibm-1282, + ibm-1283, + ibm-1361, + ibm-1362, + ibm-1363, + ibm-1364, + ibm-1370, + ibm-1371, + ibm-1383, + ibm-1386, + ibm-1390, + ibm-1399, + ibm-16684, + ibm-16804, + ibm-17248, + ibm-21427, + ibm-273, + ibm-277, + ibm-278, + ibm-280, + ibm-284, + ibm-285, + ibm-290, + ibm-297, + ibm-37, + ibm-420, + ibm-424, + ibm-437, + ibm-4899, + ibm-4909, + ibm-4930, + ibm-4971, + ibm-500, + ibm-5104, + ibm-5123, + ibm-5210, + ibm-5346, + ibm-5347, + ibm-5349, + ibm-5350, + ibm-5351, + ibm-5352, + ibm-5353, + ibm-5354, + ibm-803, + ibm-808, + ibm-813, + ibm-833, + ibm-834, + ibm-835, + ibm-837, + ibm-848, + ibm-8482, + ibm-849, + ibm-850, + ibm-852, + ibm-855, + ibm-856, + ibm-857, + ibm-858, + ibm-859, + ibm-860, + ibm-861, + ibm-862, + ibm-863, + ibm-864, + ibm-865, + ibm-866, + ibm-867, + ibm-868, + ibm-869, + ibm-871, + ibm-872, + ibm-874, + ibm-878, + ibm-891, + ibm-897, + ibm-901, + ibm-902, + ibm-9027, + ibm-903, + ibm-904, + ibm-9044, + ibm-9049, + ibm-9061, + ibm-907, + ibm-909, + ibm-910, + ibm-912, + ibm-913, + ibm-914, + ibm-915, + ibm-916, + ibm-920, + ibm-921, + ibm-922, + ibm-923, + ibm-9238, + ibm-924, + ibm-930, + ibm-933, + ibm-935, + ibm-937, + 
ibm-939, + ibm-941, + ibm-942, + ibm-943, + ibm-944, + ibm-946, + ibm-947, + ibm-948, + ibm-949, + ibm-950, + ibm-953, + ibm-955, + ibm-961, + ibm-964, + and ibm-970 + + </p> + --> + +</a> +</faq> + +</faqs> + diff --git a/doc/feedback.xml b/doc/feedback.xml new file mode 100644 index 0000000000000000000000000000000000000000..91b975d717d004e7db37ebb0ea9a20954aaf18fb --- /dev/null +++ b/doc/feedback.xml @@ -0,0 +1,46 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="Feedback Procedures"> + <s2 title="Questions or Comments"> + <p>For all questions or comments, write to the <jump href="mailto:&XercesCEmailAddress;">&XercesCProjectName; mailing list.</jump></p> + <p>If you are submitting a bug (and bug reports are definitely appreciated!), + please provide the following information:</p> + <ul> + <li>Version number of &XercesCName; (&XercesCVersion;?) </li> + <li>Which OS platform/version you are using (NT4+SP4? Win98? Redhat Linux 6.0? Solaris2.6? AIX 4.3? HP-UX10? HP-UX11?) </li> + <li>Which compiler/version you are using (MSVC6? gcc? cc? aCC?) </li> + <li>Sample XML file that causes the bug</li> + <li>Sample Schema file (if required to recreate the bug)</li> + <li>Sample DTD file (if required to recreate the bug)</li> + </ul> + </s2> + + <s2 title="Acknowledgements"> + + <p>Ever since this source code base was initially created, many + people have helped to port the code to different platforms and provide + constructive feedback to fix bugs and enhance features.</p> + + <p>Listed below are some names (in alphabetical order) of people + to whom we would like to give special thanks. 
</p> + <ul> + <li>Anupam Bagchi</li> + <li>John Bellardo</li> + <li>Arundhati Bhowmick</li> + <li>Paul Ferguson</li> + <li>Pierpaolo Fumagalli</li> + <li>Susan Hardenbrook</li> + <li>Andy Heninger</li> + <li>Rahul Jain</li> + <li>Andy Levine</li> + <li>Michael Ottati</li> + <li>Mike Pogue</li> + <li>Dean Roddey</li> + <li>Steven Rosenthal</li> + <li>Gereon Steffens</li> + <li>Tom Watson</li> + <li>Roger Webster</li> + </ul> + </s2> +</s1> diff --git a/doc/install.xml b/doc/install.xml new file mode 100644 index 0000000000000000000000000000000000000000..2cd7ac90e8328860b91432ace5edd4cfe9024218 --- /dev/null +++ b/doc/install.xml @@ -0,0 +1,74 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="Installation"> + + + <s2 title="Windows NT/98"> + <p>Install the binary &XercesCName; release by using <code>unzip</code> + on the <ref>file</ref>-win32.zip archive in the Windows environment. You can + use WinZip, or any other UnZip utility.</p> +<source>unzip &XercesCInstallDir;-win32.zip</source> + <p>This creates a '&XercesCInstallDir;-win32' sub-directory + containing the &XercesCName; distribution. </p> + + <p>You need to add the '&XercesCInstallDir;-win32\bin' + directory to your path: </p> + + <p>To do this under Windows NT, go to the start menu, click the + settings menu and select control panel. When the control panel opens, + double click on System and select the 'Environment' tab. + Locate the PATH variable under system variables + and add <full_path_to_&XercesCInstallDir;>\bin to the PATH variable. 
+ To do this under Windows 95/98 add this line to your AUTOEXEC.BAT file:</p> +<source>SET PATH=<full_path_to_&XercesCInstallDir;>\bin;%PATH%</source> + <p>or run the <code>SET PATH</code> command in your shell window.</p> + </s2> + + <s2 title="UNIX"> + <p>Binary installation of this release is to extract the files from the + compressed .tar archive (using 'tar').</p> +<source>cd $HOME +gunzip &XercesCInstallDir;-linux.tar.gz +tar -xvf &XercesCInstallDir;-linux.tar</source> + <p>This will create an '&XercesCInstallDir;-linux' sub-directory + (in the home directory) + which contains the &XercesCName; distribution. You will need to add the + &XercesCInstallDir;-linux/bin directory to your PATH environment variable:</p> + + <p>For Bourne Shell, K Shell or Bash, type: </p> +<source>export PATH="$PATH:$HOME/&XercesCInstallDir;-linux/bin"</source> + <p>For C Shell, type:</p> +<source>setenv PATH "$PATH:$HOME/&XercesCInstallDir;-linux/bin"</source> + + <p>If you wish to make this setting permanent, you need to change + your profile by changing your setup files which can be either .profile or .kshrc.</p> + + <p>In addition, you will also need to set the environment variables XERCESCROOT, + ICUROOT and the library search path. (LIBPATH on AIX, LD_LIBRARY_PATH on + Solaris and Linux, SHLIB_PATH on HP-UX).</p> + + <note>XERCESCROOT and ICUROOT are needed only if you intend to + recompile the samples or build your own applications. 
The library path is + necessary to link the shared libraries at runtime.</note> + + <p>For Bourne Shell, K Shell or Bash, type:</p> +<source>export XERCESCROOT=<wherever you installed &XercesCName;> +export ICUROOT=<wherever you installed ICU> +export LIBPATH=$XERCESCROOT/lib:$LIBPATH (on AIX) +export LD_LIBRARY_PATH=$XERCESCROOT/lib:$LD_LIBRARY_PATH (on Solaris, Linux) +export SHLIB_PATH=$XERCESCROOT/lib:$SHLIB_PATH (on HP-UX)</source> + + <p>For C Shell, type:</p> +<source>setenv XERCESCROOT "<wherever you installed &XercesCName;>" +setenv ICUROOT "<wherever you installed ICU>" +setenv LIBPATH "$XERCESCROOT/lib:$LIBPATH" (on AIX) +setenv LD_LIBRARY_PATH "$XERCESCROOT/lib:$LD_LIBRARY_PATH" (on Solaris, Linux) +setenv SHLIB_PATH "$XERCESCROOT/lib:$SHLIB_PATH" (on HP-UX)</source> + + <note>If you need to build the samples after installation, + make sure you read and follow the build instructions given in the + <jump href="faqs.html">FAQ</jump>.</note> + + </s2> +</s1> \ No newline at end of file diff --git a/doc/memparse.xml b/doc/memparse.xml new file mode 100644 index 0000000000000000000000000000000000000000..60acd1174d10a745fd440a02a34c812b30891b62 --- /dev/null +++ b/doc/memparse.xml @@ -0,0 +1,67 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 5"> + + <s2 title="MemParse"> + <p>MemParse uses the Validating SAX Parser to parse a memory buffer containing + XML statements, and reports the number of elements and attributes found.</p> + + <s3 title="Building on Windows"> + <p>Load the &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. Then + build the project marked MemParse. 
+ </p> + </s3> + <s3 title="Building on UNIX"> +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd MemParse +gmake</source> + <p> + This will create the object files in the current directory and the executable named + MemParse in the '&XercesCInstallDir;-linux/bin' directory.</p> + + <p>To delete all the generated object files and executables, type:</p> +<source>gmake clean</source> + </s3> + + <s3 title="Running MemParse"> + + <p>This program uses the SAX Parser to parse a memory buffer + containing XML statements, and reports the number of elements and attributes + found. </p> +<source>MemParse [-v]</source> + <p>The -v option is used to invoke the Validating SAX Parser instead. + + When invoked with a validating parser: </p> +<source>cd &XercesCInstallDir;-linux/samples/data +MemParse -v</source> + <p>The output is the following:</p> +<source>Finished parsing the memory buffer containing the following XML statements: + +<?xml version='1.0' encoding='ascii'?> +<!DOCTYPE company [ +<!ELEMENT company (product,category,developedAt)> +<!ELEMENT product (#PCDATA)> +<!ELEMENT category (#PCDATA)> +<!ATTLIST category idea CDATA #IMPLIED> +<!ELEMENT developedAt (#PCDATA)> +]> + +<company> + <product>&XercesCName;</product> + <category idea='great'>XML Parsing Tools</category> + <developedAt> + IBM Center for Java Technology, Silicon Valley, Cupertino, CA + </developedAt> +</company> + +Parsing took 0 ms (4 elements, 1 attributes, 16 spaces, 95 characters).</source> + + </s3> + </s2> + + + +</s1> diff --git a/doc/migration.xml b/doc/migration.xml new file mode 100644 index 0000000000000000000000000000000000000000..d2b4b04f2fb2a865051a878a5c2f66720a0c7f33 --- /dev/null +++ b/doc/migration.xml @@ -0,0 +1,316 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="Migrating from V2 to V3"> + <p>This document is a discussion of the technical differences between + the 2.x 
code base and the new code base. </p> + + <s2 title="General Improvements"> + + <p>The new version is improved in many ways. Some general improvements + are significantly better conformance to the XML spec, cleaner + internal architecture, many bug fixes, and better speed.</p> + + <s3 title="Compliance"> + <p>Except for a couple of very obscure cases (mostly related to + the 'standalone' mode), this version should be quite compliant. + We have more than a thousand tests, some collected from various + public sources and some IBM generated, which are used to do + regression testing. The C++ parser is now passing all but a + handful of them.</p> + </s3> + <s3 title="Bug Fixes"> + <p>This version has many bug fixes relative to Version 2. + Some of these were reported by users and some were brought up by + way of the conformance testing.</p> + </s3> + <s3 title="Speed"> + <p>Much work was done to speed up this version. Some of the + new features, such as namespaces and conformance checks, ended + up eating up some of these gains, but overall the new version + is significantly faster than previous versions, even while doing + more.</p> + </s3> + </s2> + + <s2 title="The Samples"> + <p>The sample programs no longer use any of the unsupported util/xxx + classes. Those classes existed only to + allow us to write portable samples. But, since we feel that the wide + character APIs are supported on a lot of platforms these days, it was + decided to go ahead and just write the samples in terms of these. + If your system does not support these APIs, you will not be able + to build and run the samples. On some platforms, these APIs might + be optional packages or require runtime updates or some such action.</p> + <p>More samples have been added as well. These show off some of the + new functionality introduced in the V3 code base. 
And the existing + ones have been tightened up a bit as well.</p> + <p>The new samples are:</p> + <ol> + <li>PParse - Demonstrates 'progressive parse' (see below)</li> + <li>StdInParse - Demonstrates use of the standard input (stdin) input source</li> + <li>EnumVal - Shows how to enumerate the markup decls in a DTD Validator</li> + </ol> + </s2> + <s2 title="Parser Classes"> + <p>In the 2.x code base, there were the following parser classes + (in the src/parsers/ source directory): NonValidatingSAXParser, + ValidatingSAXParser, NonValidatingDOMParser, ValidatingDOMParser. + The non-validating ones were the base classes and the validating + ones just derived from them and turned on the validation. + This was deemed a little bit overblown, considering the tiny + amount of code required to turn on validation and the fact + that it makes people use a pointer to the parser in most cases + (if they needed to support either validating or non-validating versions).</p> + <p>The new code base just has SAXParser and DOMParser classes. These + are capable of handling both validating and non-validating modes, + according to the state of a flag that you can set on them. For + instance, here is a code snippet that shows this in action.</p> + +<source>void ParseThis(const XMLCh* const fileToParse, + const bool validate) +{ + // + // Create a SAXParser. It can now just be + // created by value on the stack if we want + // to parse something within this scope. + // + SAXParser myParser; + + // Tell it whether to validate or not + myParser.setDoValidation(validate); + + // Parse and catch exceptions... + try + { + myParser.parse(fileToParse); + } + ... +}</source> + + <p>We feel that this is a simpler architecture, and that it makes things + easier for you. 
In the above example, for instance, the parser will be + cleaned up for you automatically upon exit since you don't have to + allocate it on the heap anymore.</p> + + </s2> + + <s2 title="DOM Level 2 support"> + <p>Experimental early support for some parts of the DOM level + 2 specification has been added. These address some of the + shortcomings in our DOM implementation, + such as the lack of a simple, standard mechanism for tree traversal.</p> + + </s2> + <s2 title="Progressive Parsing"> + <p>The new parser classes support, in addition to the <ref>parse()</ref> + method, two new parsing methods, <ref>parseFirst()</ref> and <ref>parseNext()</ref>. + These are designed to support 'progressive parsing', so that + you don't have to depend upon throwing an exception to + terminate the parsing operation. Calling parseFirst() will + cause the DTD (or in the future, Schema) to be parsed (both + internal and external subsets) and any pre-content, i.e. everything + up to but not including the root element. Subsequent calls to + parseNext() will cause one more piece of markup to be parsed, + and spit out from the core scanning code to the parser (and + hence either on to you if using SAX or into the DOM tree if + using DOM). You can quit the parse any time by just not + calling parseNext() anymore and breaking out of the loop. When + you call parseNext() and the end of the root element is the + next piece of markup, the parser will continue on to the + end of the file and return false, to let you know that the + parse is done. So a typical progressive parse loop will look + like this:</p> +<source>// Create a progressive scan token +XMLPScanToken token; + +if (!parser.parseFirst(xmlFile, token)) +{ + cerr << "parseFirst() failed\n" << endl; + return 1; +} + +// +// We started ok, so let's call parseNext() +// until we find what we want or hit the end. 
+// +bool gotMore = true; +while (gotMore && !handler.getDone()) + gotMore = parser.parseNext(token);</source> + + <p>In this case, our event handler object (named 'handler' surprisingly + enough) is watching for some criterion and will return a + status from its getDone() method. Since the handler sees + the SAX events coming out of the SAXParser, it can tell + when it finds what it wants. So we loop until we get no + more data or our handler indicates that it saw what it wanted to see.</p> + <note>In the case of the normal parse() call, the end of the parse + is unambiguous and the parser will flush any open entities on exit + (closing files, sockets, memory buffers, etc.) that were open + when the parse completed. In the progressive parse scenario, it cannot + know when you are done unless you parse to the end. So, to ensure + that all entities are closed, you should typically destroy the parser. + If you are going to reuse the parser again and again, each reuse will + implicitly flush any previous content, but any opened entities will + remain opened until you either parse to the end or destroy the parser.</note> + + <p>Also note that you must create a scan token and pass it back + in on each call. This ensures that things don't get done out of + sequence. When you call parseFirst() or parse(), any previous scan + tokens are invalidated and will cause an error if used again. This + prevents incorrect mixed use of the two different parsing + schemes or incorrect calls to parseNext().</p> + </s2> + <s2 title="Namespace support"> + <p>The C++ parser now supports namespaces. With current + XML interfaces (SAX/DOM) this doesn't mean very much because + these APIs are incapable of passing on the namespace information. + However, if you are using our internal APIs to write your own + parsers, you can make use of this new information. Since the + internal event APIs must now be able to support both namespace + and non-namespace information, they have more parameters. 
These + allow namespace information to be passed along.</p> + <p>Most of the samples now have a new command line parameter to + turn on namespace support. You turn on namespaces like this:</p> + +<source>SAXParser myParser; + +// Tell it whether to do namespaces +myParser.setDoNamespaces(true);</source> + </s2> + + <s2 title="Moved Classes to src/framework"> + <p>Some of the classes previously in the src/internal/ + directory have been moved to their more correct location + in the src/framework/ directory. These are classes used by + the outside world and should have been framework classes + to begin with. Also, to avoid name clashes in the absence + of C++ namespace support, some of these classes have been + renamed to make them more XML specific and less likely to + clash. More classes might end up being moved to framework + as well.</p> + <p>So you might have to change a few include statements to + find these classes in their new locations. And you might + have to update some class names, if you + used any of the ones whose names were changed.</p> + </s2> + + <s2 title="Loadable Message Text"> + <p>The system now supports loadable message text, instead of + having it hard coded into the program. The current drop + still just supports English, but it can now support other + languages. Anyone interested in contributing any translations + should contact us. This would be an extremely useful service.</p> + + <p>In order to support the local message loading services, we + have created a pretty flexible framework for supporting + loadable text. Firstly, there is now an XML file, in the + src/NLS/ directory, which contains all of the error messages. + There is a simple program, in the Tools/NLSXlat/ directory, + which can spit out that text in various formats. It currently + supports a simple 'in memory' format (i.e. an array of strings), + the Win32 resource format, and the message catalog format. 
+ The 'in memory' format is intended for very simple installations + or for use when porting to a new platform (since you can use + it until you can get your own local message loading support done.)</p> + <p>In the src/util/ directory, there is now an XMLMsgLoader class. + This is an abstraction from which any number of message loading + services can be derived. Your platform driver file can create + whichever type of message loader it wants to use on that platform. + We currently have versions for the in memory format, the Win32 + resource format, and the message catalog format. An ICU one is + present but not implemented yet. Some of the platforms can + support multiple message loaders, in which case a #define token + is used to control which one is used. You can set this in your + build projects to control the message loader type used.</p> + <p> + This is also a good place to mention that the Java and C++ + messages are now the <em>same</em>, since both are being taken + from the same XML message file.</p> + </s2> + + <s2 title="Pluggable Validators"> + <p>In a preliminary move to support Schemas, and to make them + first class citizens just like DTDs, the system has been + reworked internally to make validators completely pluggable. + So now the DTD validator code is under the src/validators/DTD/ + directory, with a future Schema validator probably going into + the src/validators. The core scanner architecture now works + completely in terms of the framework/XMLValidator abstract + interface and knows almost nothing about DTDs or Schemas. For + now, if you don't pass in a validator to the parsers, they + will just create a DTDValidator. This means that, theoretically, + you could write your own validator. 
But we would not encourage + this for a while, until the semantics of the XMLValidator + interface are completely worked out and proven to handle + DTD and Schema cleanly.</p> + </s2> + + + <s2 title="Pluggable Transcoders"> + <p>Another abstract framework added in the src/util/ directory + supports pluggable transcoding services. The XMLTransService + class is an abstract API that can be derived from, to support + any desired transcoding service. XMLTranscoder is the + abstract API for a particular instance of a transcoder + for a particular encoding. The platform driver file decides + what specific type of transcoder to use, which allows each + platform to use its native transcoding services, or the ICU + service if desired.</p> + <p>Implementations are provided for Win32 native services, ICU + services, and the <ref>iconv</ref> services available on many Unix + platforms. The Win32 version only provides native code page + services, so it can only handle XML code in the intrinsic + encodings (UTF-8, ASCII, UTF-16, UCS-4). The ICU version provides + all of the encodings that ICU supports. The <ref>iconv</ref> version will + support the encodings supported by the local system. You can use + transcoders we provide or create your own if you feel ours are + insufficient in some way, or if your platform requires an implementation + that we do not provide.</p> + </s2> + + <s2 title="Util directory Reorganization"> + <p>The src/util directory was becoming somewhat of a dumping + ground for platform and compiler stuff. So we reworked that + directory to better spread things out. 
The new scheme is: + </p> + + <s3 title="util - The platform independent utility stuff"> + <ul> + <li>MsgLoaders - Holds the msg loader implementations</li> + <ol> + <li>ICU</li> + <li>InMemory</li> + <li>MsgCatalog</li> + <li>Win32</li> + </ol> + <li>Compilers - All the compiler specific files</li> + <li>Transcoders - Holds the transcoder implementations</li> + <ol> + <li>Iconv</li> + <li>ICU</li> + <li>Win32</li> + </ol> + <li>Platforms</li> + <ol> + <li>AIX</li> + <li>HP-UX</li> + <li>Linux</li> + <li>Solaris</li> + <li>....</li> + <li>Win32</li> + </ol> + </ul> + </s3> + <p>This organization makes things much easier to understand. + And it makes it easier to find which files you need and + which are optional. Note that only per-platform files have + any hard coded references to specific message loaders or + transcoders. So if you don't include the ICU implementations + of these services, you don't need to link in ICU or use + any ICU headers. The rest of the system works only in + terms of the abstraction APIs.</p> + + </s2> + +</s1> diff --git a/doc/pparse.xml b/doc/pparse.xml new file mode 100644 index 0000000000000000000000000000000000000000..729ca0ef916b0dd6679481b5c5e3c13a9c8b257e --- /dev/null +++ b/doc/pparse.xml @@ -0,0 +1,47 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 7"> + + <s2 title="PParse"> + <p>PParse demonstrates progressive parsing.</p> + <p>In this example, the programmer doesn't have to depend upon throwing + an exception to terminate the parsing operation. Calling parseFirst() will + cause the DTD to be parsed (both internal and external subsets) and any + pre-content, i.e. everything up to but not including the root element. + Subsequent calls to parseNext() will cause one more piece of markup to + be parsed, and spit out from the core scanning code to the parser. 
You + can quit the parse any time by just not calling parseNext() anymore + and breaking out of the loop. When you call parseNext() and the end + of the root element is the next piece of markup, the parser will + continue on to the end of the file and return false, to let you + know that the parse is done.</p> + + <s3 title="Building on Windows"> + <p>Load the &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. Then + build the project marked PParse.</p> + </s3> + + <s3 title="Building on UNIX"> +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd PParse +gmake</source> + <p>This will create the object files in the current directory + and the executable named PParse in the '&XercesCInstallDir;-linux/bin' + directory.</p> + + <p>To delete all the generated object files and executables, type:</p> +<source>gmake clean</source> + </s3> + + <s3 title="Running PParse"> + <p>The program looks for the first 16 elements of the XML file, + and reports if successful.</p> +<source>PParse [-v] <XML file></source> + <p>The output is the following:</p> +<source>Got the required 16 elements.</source> + </s3> + </s2> +</s1> diff --git a/doc/program.xml b/doc/program.xml new file mode 100644 index 0000000000000000000000000000000000000000..248273613c44abb0c2db2a1b53d944bfd63d7141 --- /dev/null +++ b/doc/program.xml @@ -0,0 +1,388 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="Programming Guide"> + <s2 title="SAX Programming Guide"> + <s3 title="Constructing a parser"> + <p>In order to use &XercesCName; to parse XML files, you will + need to create an instance of the SAXParser class. The example + below shows the code you need in order to create an instance + of SAXParser. 
The DocumentHandler and ErrorHandler instances + required by the SAX API are provided using the HandlerBase + class supplied with &XercesCName;.</p> + + <source><![CDATA[ +int main (int argc, char* args[]) { + + try { + XMLPlatformUtils::Initialize(); + } + catch (const XMLException& toCatch) { + cout << "Error during initialization! :\n" + << toCatch.getMessage() << "\n"; + return 1; + } + + const char* xmlFile = "x1.xml"; + SAXParser* parser = new SAXParser(); + parser->setDoValidation(true); // optional. + + // HandlerBase derives from both DocumentHandler and ErrorHandler, + // so one instance can be registered for both roles. + HandlerBase* handler = new HandlerBase(); + parser->setDocumentHandler(handler); + parser->setErrorHandler(handler); + + int result = 0; + try { + parser->parse(xmlFile); + } + catch (const XMLException& toCatch) { + cout << "\nError while parsing: '" << xmlFile << "'\n" + << "Exception message is: \n" + << toCatch.getMessage() << "\n" ; + result = -1; + } + + // Release the parser and the handler allocated above. + delete parser; + delete handler; + return result; +} + ]]></source> + </s3> + <s3 title="Using the SAX API"> + <p>The SAX API for XML parsers was originally developed for + Java. Please be aware that there is no standard SAX API for + C++, and that use of the &XercesCName; SAX API does not + guarantee client code compatibility with other C++ XML + parsers.</p> + + <p>The SAX API presents a callback based API to the parser. An + application that uses SAX provides an instance of a handler + class to the parser. When the parser detects XML constructs, + it calls the methods of the handler class, passing them + information about the construct that was detected. The most + commonly used handler classes are DocumentHandler which is + called when XML constructs are recognized, and ErrorHandler + which is called when an error occurs. The header files for the + various SAX handler classes are in + '<&XercesCInstallDir;>/include/sax'.</p> + + <p>As a convenience, &XercesCName; provides the class + HandlerBase, a single class that is publicly derived + from all the Handler classes. 
HandlerBase's default + implementation of the handler callback methods is to do + nothing. A convenient way to get started with &XercesCName; is + to derive your own handler class from HandlerBase and override + just those methods in HandlerBase which you are interested in + customizing. This simple example shows how to create a handler + which will print element names, and print fatal error + messages. The source code for the sample applications shows + additional examples of how to write handler classes.</p> + + <p>This is the header file MySAXHandler.hpp:</p> + <source><![CDATA[ +#include <sax/HandlerBase.hpp> + +class MySAXHandler : public HandlerBase { +public: + MySAXHandler(); + void startElement(const XMLCh* const, AttributeList&); + void fatalError(const SAXParseException&); +}; + ]]></source> + + <p>This is the implementation file MySAXHandler.cpp:</p> + + <source><![CDATA[ +#include "MySAXHandler.hpp" +#include <iostream.h> + +MySAXHandler::MySAXHandler() +{ +} + +void MySAXHandler::startElement(const XMLCh* const name, + AttributeList& attributes) +{ + // transcode() is a user-defined function which + // converts Unicode strings to the usual 'char *'. Look at + // the sample program SAXCount for an example implementation. + cout << "I saw element: " << transcode(name) << endl; +} + +void MySAXHandler::fatalError(const SAXParseException& exception) +{ + cout << "Fatal Error: " << transcode(exception.getMessage()) + << " at line: " << exception.getLineNumber() + << endl; +} + ]]></source> + + <p>The XMLCh and AttributeList types are supplied by + &XercesCName; and are documented in the include + files. Examples of their usage appear in the source code to + the sample applications.</p> + + </s3> + </s2> + + <s2 title="DOM Programming Guide"> + <s3 title="Java and C++ DOM comparisons"> + <p>The C++ DOM API is very similar in design and use to the + Java DOM API bindings. 
As a consequence, conversion of + existing Java code that makes use of the DOM to C++ is a + straightforward process. + </p> + <p> + This section outlines the differences between Java and C++ bindings. + </p> + </s3> + + <s3 title="Accessing the API from application code"> + +<source><![CDATA[ +// C++ +#include <dom/DOM.hpp>]]></source> + +<source><![CDATA[ +// Java +import org.w3c.dom.*;]]></source> + + <p>The header file <dom/DOM.hpp> includes all the + individual headers for the DOM API classes. </p> + + </s3> + + + <s3 title="Class Names"> + <p>The C++ class names are prefixed with "DOM_". The intent is + to prevent conflicts between DOM class names and other names + that may already be in use by an application or other + libraries that a DOM-based application must link with.</p> + + <p>The use of C++ namespaces would also have solved this + conflict problem, but for the fact that many compilers do not + yet support them.</p> + +<source><![CDATA[ +DOM_Document myDocument; // C++ +DOM_Node aNode; +DOM_Text someText;]]></source> + +<source><![CDATA[ +Document myDocument; // Java +Node aNode; +Text someText;]]></source> + + <p>If you wish to use the Java class names in C++, then you need + to typedef them in C++. This is not advisable for the general + case - conflicts really do occur - but can be very useful when + converting a body of existing Java code to C++.</p> + +<source><![CDATA[ +typedef DOM_Document Document; +typedef DOM_Node Node; + +Document myDocument; // Now C++ usage is + // indistinguishable from Java +Node aNode;]]></source> + </s3> + + + <s3 title="Objects and Memory Management"> + <p>The C++ DOM implementation uses automatic memory management, + implemented using reference counting. 
As a result, the C++ + code for most DOM operations is very similar to the equivalent + Java code, right down to the use of factory methods in the DOM + document class for nearly all object creation, and the lack of + any explicit object deletion.</p> + + <p>Consider the following code snippets:</p> + +<source><![CDATA[ +// This is C++ +DOM_Node aNode; +aNode = someDocument.createElement("ElementName"); +DOM_Node docRootNode = someDocument.getDocumentElement(); +docRootNode.appendChild(aNode);]]></source> + +<source><![CDATA[ +// This is Java +Node aNode; +aNode = someDocument.createElement("ElementName"); +Node docRootNode = someDocument.getDocumentElement(); +docRootNode.appendChild(aNode);]]></source> + + <p>The Java and the C++ are identical on the surface, except for + the class names, and this similarity remains true for most DOM + code. </p> + + <p>However, Java and C++ handle objects in somewhat different + ways, making it important to understand a little bit of what + is going on beneath the surface.</p> + + <p>In Java, the variable <code>aNode</code> is an object reference, + essentially a pointer. It is initially == null, and references + an object only after the assignment statement in the second + line of the code.</p> + + <p>In C++ the variable <code>aNode</code> is, from the C++ language's + perspective, an actual live object. It is constructed when the + first line of the code executes, and DOM_Node::operator = () + executes at the second line. The C++ class DOM_Node is + essentially a form of smart pointer; it implements much of + the behavior of a Java object reference variable, and + delegates the DOM behaviors to an implementation class that + lives behind the scenes. </p> + + <p>Key points to remember when using the C++ DOM classes:</p> + + <ul> + <li>Create them as local variables, or as member variables of + some other class. 
Never "new" a DOM object into the heap or + make an ordinary C pointer variable to one, as this will + greatly confuse the automatic memory management. </li> + + <li>The "real" DOM objects - nodes, attributes, CData + sections, whatever, do live on the heap, are created with the + create... methods on class DOM_Document. DOM_Node and the + other DOM classes serve as reference variables to the + underlying heap objects.</li> + + <li>The visible DOM classes may be freely copied (assigned), + passed as parameters to functions, or returned by value from + functions.</li> + + <li>Memory management of the underlying DOM heap objects is + automatic, implemented by means of reference counting. So long + as some part of a document can be reached, directly or + indirectly, via reference variables that are still alive in + the application program, the corresponding document data will + stay alive in the heap. When all possible paths of access have + been closed off (all of the application's DOM objects have + gone out of scope) the heap data itself will be automatically + deleted. </li> + + <li>There are restrictions on the ability to subclass the DOM + classes. </li> + + </ul> + + </s3> + + <s3 title="DOMString"> + <p>Class DOMString provides the mechanism for passing string + data to and from the DOM API. 
DOMString is not intended to be + a completely general string class, but rather to meet the + specific needs of the DOM API.</p> + + <p>The design derives from two primary sources: from the DOM's + CharacterData interface and from class java.lang.String.</p> + + <p>Main features are:</p> + + <ul> + <li>Unicode, with fixed-size 16-bit storage elements.</li> + + <li>Automatic memory management, using reference counting.</li> + + <li>DOMStrings are mutable - characters can be inserted, + deleted or appended.</li> + + </ul> + <p></p> + + <p>When a string is passed into a method of the DOM, when + setting the value of a Node, for example, the string is cloned + so that any subsequent alteration or reuse of the string by + the application will not alter the document contents. + Similarly, when strings from the document are returned to an + application via the DOM API, the string is cloned so that the + document cannot be inadvertently altered by subsequent edits + to the string.</p> + + <note>The ICU classes are a more general solution to Unicode + character handling for C++ applications. ICU is an Open + Source Unicode library, available at the <jump + href="http://www.software.ibm.com/developerworks/opensource/icu/index.html">IBM + DeveloperWorks website</jump>.</note> + + </s3> + + <s3 title="Equality Testing"> + <p>The DOMString equality operators (and all of the rest of the + DOM class conventions) are modeled after the Java + equivalents. The equals() method compares the content of the + string, while the == operator checks whether the string + reference variables (the application program variables) refer + to the same underlying string in memory. This is also true of + DOM_Node, DOM_Element, etc., in that operator == tells whether + the variables in the application are referring to the same + actual node or not. It's all very Java-like.</p> + + <ul> + <li>bool operator == () is true if the DOMString variables + refer to the same underlying storage. 
</li> + + <li>bool equals() is true if the strings contain the same + characters. </li> + + </ul> + <p>Here is an example of how the equality operators work: </p> + <source><![CDATA[ +DOMString a = "Hello"; +DOMString b = a; +DOMString c = a.clone(); +if (b == a) // This is true +if (a == c) // This is false +if (a.equals(c)) // This is true +b = b + " World"; +if (b == a) // Still true, and the string's + // value is "Hello World" +if (a.equals(c)) // false. a is "Hello World"; + // c is still "Hello". + ]]></source> + </s3> + + <s3 title="Downcasting"> + <p>Application code sometimes must cast an object reference from + DOM_Node to one of the classes deriving from DOM_Node, + DOM_Element, for example. The syntax for doing this in C++ is + different from that in Java.</p> + +<source><![CDATA[ +// This is C++ +DOM_Node aNode = someFunctionReturningNode(); +DOM_Element el = (DOM_Element &) aNode;]]></source> + +<source><![CDATA[ +// This is Java +Node aNode = someFunctionReturningNode(); +Element el = (Element) aNode;]]></source> + + <p>The C++ cast is not type-safe; the Java cast is checked for + compatible types at runtime. If necessary, a type-check can + be made in C++ using the node type information: </p> + +<source><![CDATA[ +// This is C++ + +DOM_Node aNode = someFunctionReturningNode(); +DOM_Element el; // by default, el will == null. + +if (aNode.getNodeType() == DOM_Node::ELEMENT_NODE) + el = (DOM_Element &) aNode; +else + // aNode does not refer to an element. + // Do something to recover here.]]></source> + + </s3> + + <s3 title="Subclassing"> + <p>The C++ DOM classes, DOM_Node, DOM_Attr, DOM_Document, etc., + are not designed to be subclassed by an application + program. </p> + + <p>As an alternative, the DOM_Node class provides a User Data + field for use by applications as a hook for extending nodes by + referencing additional data or objects. 
See the API + description for DOM_Node for details.</p> + </s3> + + </s2> + +</s1> diff --git a/doc/readme.xml b/doc/readme.xml new file mode 100644 index 0000000000000000000000000000000000000000..8d84c3e6df0c8b827f00d75e8cf6e49f70ede65e --- /dev/null +++ b/doc/readme.xml @@ -0,0 +1,89 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="&XercesCFullName;"> + + <s2 title="&XercesCName;++ Version &XercesCVersion;"> + + <p>&XercesCName; is a validating XML parser written in a portable subset of C++. + &XercesCName; makes it easy to give your application the ability to read and write + <jump href="http://www.w3.org/XML/">XML</jump> data. + A shared library is provided for parsing, generating, manipulating, and validating XML + documents. &XercesCName; is faithful to the + <jump href="http://www.w3.org/TR/1998/REC-xml-19980210">XML 1.0</jump> recommendation + and associated standards ( + <jump href="http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/level-one-core.html">DOM 1.0</jump>, + <jump href="http://www.megginson.com/SAX/index.html">SAX 1.0</jump>, + <jump href="http://www.w3.org/TR/REC-xml-names/">Namespaces</jump>). It also provides + early implementations of + <jump href="http://www.w3.org/TR/1999/CR-DOM-Level-2-19991210/">DOM Level 2 version 1.0</jump> + and soon it will support + <jump href="http://www.w3.org/XML/Group/Schemas.html">XMLSchema</jump>, both the + <jump href="http://www.w3.org/TR/xmlschema-1/structures.html">Structures</jump> and the + <jump href="http://www.w3.org/TR/xmlschema-2/datatypes.html">Datatypes</jump>. + The parser provides high performance, modularity, and scalability. + Source code, samples and API documentation are provided with the parser. 
For + portability, care has been taken to make minimal use of templates, + no RTTI, no C++ namespaces, limited use of exceptions and minimal + use of #ifdefs.</p> + + <p>&XercesCName; is fully compliant with + <jump href="http://www.unicode.org/unicode/standard/versions/Unicode3.0.html">Unicode 3.0</jump> + specification, making it the first Unicode 3.0 compliant application.</p> + </s2> + + <s2 title="Applications of the &XercesCProjectName; Parser"> + + <p>&XercesCProjectName; has rich generating and validating capabilities. The parser is used for:</p> + + <ul> + <li>Building XML-savvy Web servers</li> + <li>Building next generation of vertical applications that use XML as + their data format</li> + <li>On-the-fly validation for creating XML editors</li> + <li>Ensuring the integrity of e-business data expressed in XML</li> + <li>Building truly internationalized XML applications</li> + </ul> + </s2> + + <s2 title="Features"> + <ul> + <li>Conforms to <jump href="http://www.w3.org/TR/1998/REC-xml-19980210">XML Spec 1.0</jump></li> + <li>Tracking of latest + <jump href="http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/level-one-core.html">DOM (Level 1.0)</jump>, + <jump href="http://www.megginson.com/SAX/index.html">SAX</jump> and + <jump href="http://www.w3.org/TR/REC-xml-names/">Namespace</jump> specifications.</li> + <li>Experimental + <jump href="http://www.w3.org/TR/1999/CR-DOM-Level-2-19991210/">DOM Level 2.0</jump> implementation</li> + <li>Source code, samples, and documentation is provided.</li> + <li>Programmatic generation and validation of XML</li> + <li>Pluggable catalogs, validators and encodings</li> + <li>High performance</li> + <li>Customizable error handling</li> + </ul> + </s2> + + <s2 title="Platforms with Binaries"> + <ul> + <li>Win32 (MSVC 6.0 compiler)</li> + <li>Linux (RedHat 6.0)</li> + <li>Solaris 2.6</li> + <li>AIX 4.1.5 and higher</li> + <li>HP-UX 10.2 (aCC and CC)</li> + <li>HP-UX 11 (aCC and CC)</li> + </ul> + </s2> + + <s2 
title="Platforms Coming Soon"> + <ul> + <li>FreeBSD</li> + <li>SGI IRIX</li> + <li>Unixware</li> + <li>OS/390</li> + <li>OS/2</li> + <li>AS/400</li> + <li>and more!</li> + </ul> + </s2> + +</s1> \ No newline at end of file diff --git a/doc/redirect.xml b/doc/redirect.xml new file mode 100644 index 0000000000000000000000000000000000000000..4a00d5becfb2af430ffb66a748a99ece06ade313 --- /dev/null +++ b/doc/redirect.xml @@ -0,0 +1,65 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 6"> + + <s2 title="Redirect"> + <p>Redirect uses the SAX EntityResolver handler to redirect the + input stream for external entities. It installs an entity + resolver, traps the call to the external DTD file and redirects + it to another specific file which contains the actual DTD.</p> + + <s3 title="Building on Windows"> + <p>Load the &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. Then + build the project marked Redirect. + </p> + </s3> + <s3 title="Building on UNIX"> +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd Redirect +gmake</source> + <p> + This will create the object files in the current directory and the executable named + Redirect in '&XercesCInstallDir;-linux/bin' directory.</p> + + <p>To delete all the generated object files and executables, type</p> +<source>gmake clean</source> + </s3> + + <s3 title="Running Redirect"> + + <p>This program illustrates how a XML application can use the SAX EntityResolver + handler to redirect the input stream for external entities. 
It installs an entity + resolver, traps the call to the external DTD file and redirects it to another specific + file which contains the actual DTD.</p> + + <p>The program then counts and reports the number of elements and attributes in + the given XML file.</p> +<source>Redirect [-v] <XML file></source> + <p>The -v option is used to invoke the Validating SAX Parser instead.</p> + + <p>When invoked as follows:</p> +<source>cd &XercesCInstallDir;-linux/samples/data +Redirect -v personal.xml</source> + <p>The output is the following:</p> +<source>cd &XercesCInstallDir;-linux/samples/data +Redirect -v personal.xml +personal.xml: 30 ms (37 elems, 12 attrs, 134 spaces, 134 chars)</source> + + <p>External files required to run this sample are 'personal.xml', 'personal.dtd' and + 'redirect.dtd', which are all present in the 'samples/data' directory. Make sure + that you run redirect in the samples/data directory.</p> + + <p>The 'resolveEntity' callback in this sample looks for an external entity with + system id as 'personal.dtd'. When it is asked to resolve this particular external + entity, it creates and returns a new InputSource for the file 'redirect.dtd'.</p> + + <p>A real-world XML application can similarly do application specific processing + when encountering external entities. 
For example, an application might want to + redirect all references to entities outside of its domain to local cached copies.</p> + + </s3> + </s2> +</s1> diff --git a/doc/releases.xml b/doc/releases.xml new file mode 100644 index 0000000000000000000000000000000000000000..bdd7aaed81ad8640ffdd147d9c58bd9cf13ef9ff --- /dev/null +++ b/doc/releases.xml @@ -0,0 +1,46 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="Releases"> + +<s2 title="&XercesCFullName; Version 1.1.0: January 28, 2000"> + <ul> + <li>Simplified platform support (removed need to support local standard output streams or to find the location of the parser DLL/SharedLib.)</li> + <li>Added support for the NetAccessor plug-in abstraction, which allows the parser to support HTTP/FTP based data sources</li> + <li>Added EBCDIC-US and ISO-8859-1 as intrinsic encodings</li> + <li>Added more DOM Level 2 features</li> + <li>Support for ICU 1.4, which makes &XercesCName; Unicode 3.0 compliant when using ICU</li> + <li>New samples and tests (DOM test, programmatic DOM sample, thread test)</li> + <li>Added support for multiply nested entities using relative paths or URLs</li> + <li>Significant internal architecture improvements in the handling of encodings and transcoding services.</li> + </ul> + </s2> + <s2 title="&XercesCFullName; Version 1.0.1: December 15, 1999"> + <ul> + <li>Port to Solaris.</li> + <li>Improved error recovery and clarified error messages.</li> + <li>Added DOMTest program.</li> + </ul> + </s2> + + <s2 title="&XercesCFullName; Version 1.0.0: December 7, 1999"> + <ul> + <li>Released &XercesCName; after incorporating ICU as a value-added plug-in.</li> + <li>Has bug fixes, better conformance, better speed and cleaner internal architecture</li> + <li>Three additional samples added: PParse, StdInParse and EnumVal</li> + <li>Experimental DOM Level 2 support</li> + <li>Support for namespaces</li> + <li>Loadable message text enabling 
future translations to be easily plugged-in</li> + <li>Pluggable validators</li> + <li>Pluggable transcoders</li> + <li>Reorganized the util directory to better manage different platforms and compilers</li> + </ul> + </s2> + + <s2 title="&XercesCFullName; November 5, 1999"> + <ul> + <li>Created initial code base derived from IBM's XML4C Version 2.0</li> + <li>Modified documentation to reflect new name (Xerces-C)</li> + </ul> + </s2> +</s1> diff --git a/doc/samples.xml b/doc/samples.xml new file mode 100644 index 0000000000000000000000000000000000000000..220b76f1f841bc3affd5e34de43b3b19048e6ea9 --- /dev/null +++ b/doc/samples.xml @@ -0,0 +1,81 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="&XercesCName; Samples"> + + <s2 title="Building the Samples"> + <p>&XercesCName; comes packaged with nine sample applications that + demonstrate salient features of the parser using simple + applications written on top of the SAX and DOM APIs provided by + the parser.</p> + + <p>Once you have set up your PATH variable, you can run the + samples by opening a command window (or your shell prompt for + UNIX environments). Sample XML data files are provided in the + samples/data directory.</p> + + <p>The installation process for the samples is same on all UNIX + platforms. + Note that <em>runConfigure</em> is just a helper script and you are free to + use <em>./configure</em> with the correct parameters to make it work + on any platform-compiler combination of your choice. The script needs the following parameters: + </p> +<source>Usage: runConfigure "options" + where options may be any of the following: + -p <platform> (accepts 'aix', 'linux', 'solaris', 'hp-10', 'hp-11') + -c <C compiler name> (e.g. gcc, xlc_r, cc or aCC) + -x <C++ compiler name> (e.g. 
g++, xlC_r, CC or aCC) + -d (specifies that you want to build a debug version) + -h (get help on the above commands)</source> + + <note><em>NOTE:</em> The code samples in this section assume that you are working on the Linux binary drop. + If you are using some other UNIX flavor, please replace '-linux' with the appropriate + platform name in the code samples.</note> + + </s2> + + <s2 title="Running the Samples"> + + <p>The sample applications are dependent on the &XercesCName; shared library + (and could also depend on the ICU library if you built &XercesCName; with ICU). + Therefore, on Windows platforms you must make sure that your <code>PATH</code> + environment variable is set properly to pick up these shared libraries at + runtime.</p> + + <p>On UNIX platforms you must ensure that the <ref>LIBPATH</ref> + environment variable is set properly to pick up the shared libraries at + runtime. (UNIX gurus will understand here that <ref>LIBPATH</ref> actually + translates to <em>LD_LIBRARY_PATH</em> on Solaris and Linux, <em>SHLIB_PATH</em> on HP-UX + and stays as <em>LIBPATH</em> on AIX).</p> + + <p>To set your LIBPATH (on AIX for example), you would type:</p> +<source>export LIBPATH=&XercesCInstallDir;/lib:$LIBPATH</source> + <p> </p> + + <s3 title="&XercesCName; Samples"> + <ul> + <li><link idref="saxcount">SAXCount</link> + <br/>SAXCount counts the elements, attributes, spaces and + characters in an XML file.</li> + <li><link idref="saxprint">SAXPrint</link> + <br/>SAXPrint parses an XML file and prints it out.</li> + <li><link idref="domcount">DOMCount</link> + <br/>DOMCount counts the elements in an XML file.</li> + <li><link idref="domprint">DOMPrint</link> + <br/>DOMPrint parses an XML file and prints it out.</li> + <li><link idref="memparse">MemParse</link> + <br/>MemParse parses XML in a memory buffer, outputting the number of elements and attributes.</li> + <li><link idref="redirect">Redirect</link> + <br/>Redirect redirects the input stream for external 
entities.</li> + <li><link idref="pparse">PParse</link> + <br/>PParse demonstrates progressive parsing.</li> + <li><link idref="stdinparse">StdInParse</link> + <br/>StdInParse demonstrates streaming XML data from standard input.</li> + <li><link idref="enumval">EnumVal</link> + <br/>EnumVal shows how to enumerate the markup decls in a DTD Validator.</li> + <li><link idref="createdoc">CreateDOMDocument</link> + <br/>CreateDOMDocument creates a DOM tree in memory from scratch.</li> + </ul> + </s3> + </s2> +</s1> diff --git a/doc/saxcount.xml b/doc/saxcount.xml new file mode 100644 index 0000000000000000000000000000000000000000..cea6d43a61ad511a506a9a0d087ae32f1db51caf --- /dev/null +++ b/doc/saxcount.xml @@ -0,0 +1,51 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 1"> + + <s2 title="SAXCount"> + <p>SAXCount is the simplest application that counts the elements and characters of + a given XML file using the (event based) SAX API.</p> + + <s3 title="Building on Windows"> + <p>Load the &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. Then + build the project marked SAXCount.</p> + </s3> + + <s3 title="Building on UNIX"> +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd SAXCount +gmake</source> + <p>This will create the object files in the current directory + and the executable named + SAXCount in '&XercesCInstallDir;-linux/bin' directory.</p> + + <p>To delete all the generated object files and executables, type</p> +<source>gmake clean</source> + </s3> + + <s3 title="Running SAXCount"> + + <p>The SAXCount sample parses an XML file and prints out a count of the number of + elements in the file. 
To run SAXCount, enter the following:</p> +<source>SAXCount <XML file></source> + <p>To use the validating parser, use:</p> +<source>SAXCount -v <XML file></source> + <p>Here is a sample output from SAXCount:</p> +<source>cd &XercesCInstallDir;-linux/samples/data +SAXCount -v personal.xml +personal.xml: 60 ms (37 elems, 12 attrs, 134 spaces, 134 chars)</source> + <p>Running SAXCount with the validating parser gives a different result because + ignorable white-space is counted separately from regular characters.</p> +<source>SAXCount personal.xml +personal.xml: 10 ms (37 elems, 12 attrs, 0 spaces, 268 chars)</source> + <p>Note that the sum of spaces and characters in both versions is the same.</p> + + <note>The time reported by the program may be different depending on your + machine processor.</note> + </s3> + + </s2> +</s1> \ No newline at end of file diff --git a/doc/saxprint.xml b/doc/saxprint.xml new file mode 100644 index 0000000000000000000000000000000000000000..ddef3d50784122e2c829c59f3c70146fa7a68d6a --- /dev/null +++ b/doc/saxprint.xml @@ -0,0 +1,89 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 2"> + + <s2 title="SAXPrint"> + <p>SAXPrint uses the SAX APIs to parse an XML file and print it back. + Notice that the output of this program is not exactly the same as + the input (in terms of whitespace), but the output has the same + information content as the input.</p> + + <s3 title="Building on Windows"> + <p>Load the &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. Then + build the project marked SAXPrint. 
+ </p> + </s3> + + <s3 title="Building on UNIX"> +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd SAXPrint +gmake</source> + <p>This will create the object files in the current directory and + the executable named SAXPrint in '&XercesCInstallDir;-linux/bin' + directory.</p> + + <p>To delete all the generated object files and executables, type</p> +<source>gmake clean</source> + </s3> + + <s3 title="Running SAXPrint"> + + <p>The SAXPrint sample parses an XML file and prints out a count of the number of + elements in the file. To run SAXPrint, enter the following </p> +<source>SAXPrint <XML file></source> + <p>To use the validating parser, use </p> +<source>SAXPrint -v <XML file></source> + <p>Here is a sample output from SAXPrint</p> +<source>cd &XercesCInstallDir;-linux/samples/data +SAXPrint -v personal.xml + +<personnel> + + <person id="Big.Boss"> + <name><family>Boss</family> <given>Big</given></name> + <email>chief@foo.com</email> + <link subordinates="one.worker two.worker three.worker + four.worker five.worker"></link> + </person> + + <person id="one.worker"> + <name><family>Worker</family> <given>One</given></name> + <email>one@foo.com</email> + <link manager="Big.Boss"></link> + </person> + + <person id="two.worker"> + <name><family>Worker</family> <given>Two</given></name> + <email>two@foo.com</email> + <link manager="Big.Boss"></link> + </person> + + <person id="three.worker"> + <name><family>Worker</family> <given>Three</given></name> + <email>three@foo.com</email> + <link manager="Big.Boss"></link> + </person> + + <person id="four.worker"> + <name><family>Worker</family> <given>Four</given></name> + <email>four@foo.com</email> + <link manager="Big.Boss"></link> + </person> + + <person id="five.worker"> + <name><family>Worker</family> <given>Five</given></name> + <email>five@foo.com</email> + <link manager="Big.Boss"></link> + </person> + +</personnel></source> + <note>SAXPrint does not 
reproduce the original XML file. + Also SAXPrint and DOMPrint produce different results because of + the way the two APIs store data and capture events.</note> + </s3> + + </s2> +</s1> diff --git a/doc/stdinparse.xml b/doc/stdinparse.xml new file mode 100644 index 0000000000000000000000000000000000000000..660ce85cc83b9f09e17f79a484ba0822fe46eee2 --- /dev/null +++ b/doc/stdinparse.xml @@ -0,0 +1,44 @@ +<?xml version="1.0" standalone="no"?> +<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd"> + +<s1 title="&XercesCName; Sample 8"> + + <s2 title="StdInParse"> + <p>StdInParse demonstrates streaming XML data from standard input.</p> + + <s3 title="Building on Windows"> + <p>Load the &XercesCInstallDir;-win32\samples\Projects\Win32\VC6\samples.dsw + Microsoft Visual C++ workspace inside your MSVC IDE. Then + build the project marked StdInParse.</p> + </s3> + + <s3 title="Building on UNIX"> +<source>cd &XercesCInstallDir;-linux/samples +./runConfigure -p<platform> -c<C_compiler> -x<C++_compiler> +cd StdInParse +gmake</source> + <p>This will create the object files in the current directory + and the executable named + StdInParse in ' &XercesCInstallDir;-linux/bin' directory.</p> + + <p>To delete all the generated object files and executables, type</p> + <source>gmake clean</source> + </s3> + + <s3 title="Running StdInParse"> + <p>The StdInParse sample parses an XML file and prints out a + count of the number of + elements in the file. 
To run StdInParse, enter the following: </p> +<source>StdInParse < <XML file></source> + <p>Here is a sample output from StdInParse</p> +<source>cd &XercesCInstallDir;-linux/samples/data +StdInParse < personal.xml +personal.xml: 60 ms (37 elems, 12 attrs, 0 spaces, 268 chars)</source> + + <note>The time reported by the program may be different depending on your + machine processor.</note> + </s3> + </s2> + + +</s1> diff --git a/doc/xerces-c_book.xml b/doc/xerces-c_book.xml new file mode 100644 index 0000000000000000000000000000000000000000..a50258e07947a6b4ffe9e511d93939bacbf85072 --- /dev/null +++ b/doc/xerces-c_book.xml @@ -0,0 +1,39 @@ +<?xml version="1.0"?> +<!DOCTYPE book SYSTEM "sbk:/style/dtd/book.dtd"> + +<book title="Xerces-C documentation" copyright="2000 The Apache Software Foundation"> + <external href="../index.html" label="Home"/> + + <separator/> + <document id="index" label="Readme" source="readme.xml" /> + <document id="install" label="Installation" source="install.xml" /> + <document id="build" label="Build" source="build.xml" /> + + <separator/> + <document id="api" label="API Docs" source="apidocs.xml" /> + <document id="samples" label="Samples" source="samples.xml" /> + <hidden id="saxcount" source="saxcount.xml"/> + <hidden id="saxprint" source="saxprint.xml"/> + <hidden id="domcount" source="domcount.xml"/> + <hidden id="domprint" source="domprint.xml"/> + <hidden id="memparse" source="memparse.xml"/> + <hidden id="redirect" source="redirect.xml"/> + <hidden id="pparse" source="pparse.xml"/> + <hidden id="stdinparse" source="stdinparse.xml"/> + <hidden id="enumval" source="enumval.xml"/> + <hidden id="createdoc" source="createdoc.xml"/> + + <document id="program" label="Programming" source="program.xml" /> + + <group id="faqs" label="FAQs"> + <entry id="faq-distrib" source="faq-distrib.xml"/> + <entry id="faq-parse" source="faq-parse.xml" /> + <entry id="faq-migrate" source="faq-migrate.xml" /> + <entry id="faq-other" source="faq-other.xml" 
/> + </group> + + <separator/> + <document id="releases" label="Releases" source="releases.xml" /> + <document id="caveats" label="Caveats" source="caveats.xml" /> + <document id="feedback" label="Feedback" source="feedback.xml" /> +</book>