source: NonGTP/Xerces/xerces-c_2_8_0/include/xercesc/dom/DOMWriter.hpp @ 2674

Revision 2674, 23.3 KB checked in by mattausch, 16 years ago
#ifndef DOMWriter_HEADER_GUARD_
#define DOMWriter_HEADER_GUARD_

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/*
 * $Id: DOMWriter.hpp 568078 2007-08-21 11:43:25Z amassari $
 */

/**
 *
 * DOMWriter provides an API for serializing (writing) a DOM document out as
 * an XML document. The XML data is written to an output stream, the type of
 * which depends on the specific language bindings in use. During
 * serialization of XML data, namespace fixup is done when possible.
 * <p> <code>DOMWriter</code> accepts any node type for serialization. For
 * nodes of type <code>Document</code> or <code>Entity</code>, well formed
 * XML will be created if possible. The serialized output for these node
 * types is either as a Document or an External Entity, respectively, and is
 * acceptable input for an XML parser. For all other types of nodes the
 * serialized form is not specified, but should be something useful to a
 * human for debugging or diagnostic purposes. Note: rigorously designing an
 * external (source) form for stand-alone node types that don't already have
 * one defined in the XML specification seems a bit much to take on here.
 * <p>Within a Document or Entity being serialized, Nodes are processed as
 * follows:
 * <ul>
 * <li>Documents are written including an XML declaration and a DTD
 * subset, if one exists in the DOM. Writing a document node serializes the
 * entire document.</li>
 * <li>Entity nodes, when written directly by
 * <code>writeNode</code> defined in the <code>DOMWriter</code> interface,
 * output the entity expansion but no namespace fixup is done. The resulting
 * output will be valid as an external entity.</li>
 * <li>Entity reference nodes are serialized as an entity reference of the
 * form <code>"&amp;entityName;"</code> in the output. Child nodes (the
 * expansion) of the entity reference are ignored.</li>
 * <li>CDATA sections containing content characters that cannot be
 * represented in the specified output encoding are handled according to the
 * "split-cdata-sections" feature. If the feature is <code>true</code>, CDATA
 * sections are split, and the unrepresentable characters are serialized as
 * numeric character references in ordinary content. The exact position and
 * number of splits is not specified. If the feature is <code>false</code>,
 * unrepresentable characters in a CDATA section are reported as errors. The
 * error is not recoverable; there is no mechanism for supplying
 * alternative characters and continuing with the serialization.</li>
 * <li>All other node types (DOMElement, DOMText, etc.) are serialized to
 * their corresponding XML source form.</li>
 * </ul>
 * <p> Within the character data of a document (outside of markup), any
 * characters that cannot be represented directly are replaced with
 * character references. Occurrences of '&lt;' and '&amp;' are replaced by
 * the predefined entities &amp;lt; and &amp;amp;. The other predefined
 * entities (&amp;gt;, &amp;apos;, etc.) are not used; these characters can be
 * included directly. Any character that cannot be represented directly in
 * the output character encoding is serialized as a numeric character
 * reference.
 * <p> Attributes not containing quotes are serialized in quotes. Attributes
 * containing quotes but no apostrophes are serialized in apostrophes
 * (single quotes). Attributes containing both forms of quotes are
 * serialized in quotes, with quotes within the value represented by the
 * predefined entity &amp;quot;. Any character that cannot be represented
 * directly in the output character encoding is serialized as a numeric
 * character reference.
 * <p> Within markup, but outside of attributes, any occurrence of a character
 * that cannot be represented in the output character encoding is reported
 * as an error. An example would be serializing the element
 * &lt;LaCañada/&gt; with encoding="us-ascii".
 * <p> When requested by setting the <code>normalize-characters</code> feature
 * on <code>DOMWriter</code>, all data to be serialized, both markup and
 * character data, is W3C Text normalized according to the rules defined in
 * the W3C Character Model. The W3C Text normalization process affects only
 * the data as it is being written; it does not alter the DOM's view of the
 * document after serialization has completed.
 * <p>Namespaces are fixed up during serialization. The serialization process
 * will verify that namespace declarations, namespace prefixes and the
 * namespace URIs associated with Elements and Attributes are consistent. If
 * inconsistencies are found, the serialized form of the document will be
 * altered to remove them. The algorithm used for doing the namespace fixup
 * while serializing a document is a combination of the algorithms used for
 * <code>lookupNamespaceURI</code> and <code>lookupNamespacePrefix</code>.
 * Note: the preceding paragraph is to be defined more precisely.
 * <p>Any changes made affect only the namespace prefixes and declarations
 * appearing in the serialized data. The DOM's view of the document is not
 * altered by the serialization operation, and does not reflect any changes
 * made to namespace declarations or prefixes in the serialized output.
 * <p> While serializing a document the serializer will write out
 * non-specified values (such as attributes whose <code>specified</code> is
 * <code>false</code>) if the <code>output-default-values</code> feature is
 * set to <code>true</code>. If the <code>output-default-values</code> flag
 * is set to <code>false</code> and the <code>use-abstract-schema</code>
 * feature is set to <code>true</code>, the abstract schema will be used to
 * determine if a value is specified or not. If
 * <code>use-abstract-schema</code> is not set, the <code>specified</code>
 * flag on attribute nodes is used to determine if attribute values should
 * be written out.
 * <p> See the Core specification (1.1.9, XML namespaces, 5th paragraph) for
 * the description of entity references and the warning about unbound entity
 * references. Entity references are always serialized as &amp;foo;. This
 * should also be mentioned in the load part of this specification.
 * <p> When serializing a document the DOMWriter checks to see if the document
 * element in the document is a DOM Level 1 element or a DOM Level 2 (or
 * higher) element (this check is done by looking at the localName of the
 * root element). If the root element is a DOM Level 1 element then the
 * DOMWriter will issue an error if a DOM Level 2 (or higher) element is
 * found while serializing. Likewise, if the document element is a DOM Level
 * 2 (or higher) element and the DOMWriter sees a DOM Level 1 element, an
 * error is issued. Mixing DOM Level 1 elements with DOM Level 2 (or higher)
 * elements is not supported.
 * <p> <code>DOMWriter</code>s have a number of named features that can be
 * queried or set. The names of <code>DOMWriter</code> features must be valid
 * XML names. Implementation specific features (extensions) should choose an
 * implementation dependent prefix to avoid name collisions.
 * <p>Here is a list of features that must be recognized by all
 * implementations.
 * <dl>
 * <dt><code>"normalize-characters"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[optional] (default) Perform the W3C Text Normalization of the
 * characters in the document as they are written out. Only the characters
 * being written are (potentially) altered. The DOM document itself is
 * unchanged. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required] Do not perform character normalization. </dd>
 * </dl></dd>
 * <dt><code>"split-cdata-sections"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[required] (default)
 * Split CDATA sections containing the CDATA section termination marker
 * ']]&gt;' or characters that cannot be represented in the output
 * encoding, and output the characters using numeric character references.
 * If a CDATA section is split a warning is issued. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required] Signal an error if a <code>CDATASection</code> contains an
 * unrepresentable character. </dd>
 * </dl></dd>
 * <dt><code>"validation"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[optional] Use the abstract schema to validate the document as it is
 * being serialized. If validation errors are found the error handler is
 * notified about the error. Setting this state will also set the feature
 * <code>use-abstract-schema</code> to <code>true</code>. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required] (default) Don't validate the document as it is being
 * serialized. </dd>
 * </dl></dd>
 * <dt><code>"expand-entity-references"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[optional] Expand <code>EntityReference</code> nodes when
 * serializing. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required] (default) Serialize all
 * <code>EntityReference</code> nodes as XML entity references. </dd>
 * </dl></dd>
 * <dt><code>"whitespace-in-element-content"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[required] (default) Output all white space in the document. </dd>
 * <dt><code>false</code></dt>
 * <dd>[optional] Only output white space that is not within element
 * content. The implementation is expected to use the
 * <code>isWhitespaceInElementContent</code> flag on <code>Text</code> nodes
 * to determine if a text node should be written out or not. </dd>
 * </dl></dd>
 * <dt><code>"discard-default-content"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[required] (default) Use whatever information is available to the
 * implementation (e.g. XML schema, DTD, the <code>specified</code> flag on
 * <code>Attr</code> nodes, and so on) to decide what attributes and content
 * should be serialized or not. Note that the <code>specified</code> flag on
 * <code>Attr</code> nodes in itself is not always reliable; it is only
 * reliable when it is set to <code>false</code>, since the only case where
 * it can be set to <code>false</code> is if the attribute was created by a
 * Level 1 implementation. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required] Output all attributes and
 * all content. </dd>
 * </dl></dd>
 * <dt><code>"format-canonical"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[optional]
 * This formatting writes the document according to the rules specified in
 * the Canonical XML specification. Setting this feature to true will set
 * the feature "format-pretty-print" to false. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required] (default) Don't canonicalize the
 * output. </dd>
 * </dl></dd>
 * <dt><code>"format-pretty-print"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[optional]
 * Format the output by adding whitespace to produce a pretty-printed,
 * indented, human-readable form. The exact form of the transformations is
 * not specified by this specification. Setting this feature to true will
 * set the feature "format-canonical" to false. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required]
 * (default) Don't pretty-print the result. </dd>
 * </dl></dd>
 * </dl>
 * <p>See also the <a href='http://www.w3.org/TR/2002/WD-DOM-Level-3-ASLS-20020409'>Document Object Model (DOM) Level 3 Abstract Schemas and Load
 * and Save Specification</a>.
 *
 * @since DOM Level 3
 */


#include <xercesc/dom/DOMNode.hpp>
#include <xercesc/dom/DOMWriterFilter.hpp>
#include <xercesc/dom/DOMErrorHandler.hpp>
#include <xercesc/framework/XMLFormatter.hpp>

XERCES_CPP_NAMESPACE_BEGIN

class CDOM_EXPORT DOMWriter {
protected:
    // -----------------------------------------------------------------------
    //  Hidden constructors
    // -----------------------------------------------------------------------
    /** @name Hidden constructors */
    //@{
    DOMWriter() {}
    //@}
private:
    // -----------------------------------------------------------------------
    // Unimplemented constructors and operators
    // -----------------------------------------------------------------------
    /** @name Unimplemented constructors and operators */
    //@{
    DOMWriter(const DOMWriter &);
    DOMWriter & operator = (const DOMWriter &);
    //@}


public:
    // -----------------------------------------------------------------------
    //  All constructors are hidden, just the destructor is available
    // -----------------------------------------------------------------------
    /** @name Destructor */
    //@{
    /**
     * Destructor
     *
     */
    virtual ~DOMWriter() {}
    //@}

    // -----------------------------------------------------------------------
    //  Virtual DOMWriter interface
    // -----------------------------------------------------------------------
    /** @name Functions introduced in DOM Level 3 */
    //@{
    // -----------------------------------------------------------------------
    //  Feature methods
    // -----------------------------------------------------------------------
    /**
     * Query whether setting a feature to a specific value is supported.
     * <br>The feature name has the same form as a DOM hasFeature string.
     *
     *  <p><b>"Experimental - subject to change"</b></p>
     *
     * @param featName The feature name, which is a DOM has-feature style string.
     * @param state The requested state of the feature (<code>true</code> or
     *   <code>false</code>).
     * @return <code>true</code> if the feature could be successfully set to
     *   the specified value, or <code>false</code> if the feature is not
     *   recognized or the requested value is not supported. The value of
     *   the feature itself is not changed.
     * @since DOM Level 3
     */
    virtual bool           canSetFeature(const XMLCh* const featName
                                       , bool               state) const = 0;
    /**
     * Set the state of a feature.
     * <br>The feature name has the same form as a DOM hasFeature string.
     * <br>It is possible for a <code>DOMWriter</code> to recognize a feature
     * name but to be unable to set its value.
     *
     *  <p><b>"Experimental - subject to change"</b></p>
     *
     * @param featName The feature name.
     * @param state The requested state of the feature (<code>true</code> or
     *   <code>false</code>).
     * @exception DOMException
     *   Raises a NOT_SUPPORTED_ERR exception when the <code>DOMWriter</code>
     *   recognizes the feature name but cannot set the requested value.
     *   <br>Raises a NOT_FOUND_ERR when the <code>DOMWriter</code> does not
     *   recognize the feature name.
     * @see   getFeature
     * @since DOM Level 3
     */
    virtual void            setFeature(const XMLCh* const featName
                                     , bool               state) = 0;

    /**
     * Look up the value of a feature.
     * <br>The feature name has the same form as a DOM hasFeature string.
     * @param featName The feature name, which is a string with DOM has-feature
     *   syntax.
     * @return The current state of the feature (<code>true</code> or
     *   <code>false</code>).
     * @exception DOMException
     *   Raises a NOT_FOUND_ERR when the <code>DOMWriter</code> does not
     *   recognize the feature name.
     *
     *  <p><b>"Experimental - subject to change"</b></p>
     *
     * @see   setFeature
     * @since DOM Level 3
     */
    virtual bool               getFeature(const XMLCh* const featName) const = 0;

    // -----------------------------------------------------------------------
    //  Setter methods
    // -----------------------------------------------------------------------
    /**
     * The character encoding in which the output will be written.
     * <br> The encoding to use when writing is determined as follows: If the
     * encoding attribute has been set, that value will be used. If the
     * encoding attribute is <code>null</code> or empty, but the item to be
     * written includes an encoding declaration, that value will be used. If
     * neither of the above provides an encoding name, a default encoding of
     * "UTF-8" will be used.
     * <br>The default value is <code>null</code>.
     *
     *  <p><b>"Experimental - subject to change"</b></p>
     *
     * @param encoding    The character encoding in which the output will be written.
     * @see   getEncoding
     * @since DOM Level 3
     */
    virtual void           setEncoding(const XMLCh* const encoding) = 0;

    /**
     * The end-of-line sequence of characters to be used in the XML being
     * written out. The only permitted values are these:
     * <dl>
     * <dt><code>null</code></dt>
     * <dd>
     * Use a default end-of-line sequence. DOM implementations should choose
     * the default to match the usual convention for text files in the
     * environment being used. Implementations must choose a default
     * sequence that matches one of those allowed by XML 1.0, section 2.11
     * "End-of-Line Handling". </dd>
     * <dt>CR</dt>
     * <dd>The carriage-return character (#xD).</dd>
     * <dt>CR-LF</dt>
     * <dd>The carriage-return and line-feed characters (#xD #xA).</dd>
     * <dt>LF</dt>
     * <dd>The line-feed character (#xA).</dd>
     * </dl>
     * <br>The default value for this attribute is <code>null</code>.
     *
     *  <p><b>"Experimental - subject to change"</b></p>
     *
     * @param newLine      The end-of-line sequence of characters to be used.
     * @see   getNewLine
     * @since DOM Level 3
     */
    virtual void          setNewLine(const XMLCh* const newLine) = 0;

    /**
     * The error handler that will receive error notifications during
     * serialization. The node where the error occurred is passed to this
     * error handler. Any modification to nodes from within an error
     * callback should be avoided since this will result in undefined,
     * implementation dependent behavior.
     *
     *  <p><b>"Experimental - subject to change"</b></p>
     *
     * @param errorHandler The error handler to be used.
     * @see   getErrorHandler
     * @since DOM Level 3
     */
    virtual void         setErrorHandler(DOMErrorHandler *errorHandler) = 0;

    /**
     * When the application provides a filter, the serializer will call out
     * to the filter before serializing each Node. Attribute nodes are never
     * passed to the filter. The filter implementation can choose to remove
     * the node from the stream or to terminate the serialization early.
     *
     *  <p><b>"Experimental - subject to change"</b></p>
     *
     * @param filter       The writer filter to be used.
     * @see   getFilter
     * @since DOM Level 3
     */
    virtual void         setFilter(DOMWriterFilter *filter) = 0;

    // -----------------------------------------------------------------------
    //  Getter methods
    // -----------------------------------------------------------------------
    /**
     * Return the character encoding in which the output will be written.
     *
     *  <p><b>"Experimental - subject to change"</b></p>
     *
     * @return The character encoding used.
     * @see   setEncoding
     * @since DOM Level 3
     */
    virtual const XMLCh*       getEncoding() const = 0;

    /**
     * Return the end-of-line sequence of characters to be used in the XML being
     * written out.
     *
     *  <p><b>"Experimental - subject to change"</b></p>
     *
     * @return             The end-of-line sequence of characters to be used.
     * @see   setNewLine
     * @since DOM Level 3
     */
    virtual const XMLCh*       getNewLine() const = 0;

    /**
     * Return the error handler that will receive error notifications during
     * serialization.
     *
     *  <p><b>"Experimental - subject to change"</b></p>
     *
     * @return             The error handler to be used.
     * @see   setErrorHandler
     * @since DOM Level 3
     */
    virtual DOMErrorHandler*   getErrorHandler() const = 0;

    /**
     * Return the writer filter used.
     *
     *  <p><b>"Experimental - subject to change"</b></p>
     *
     * @return             The writer filter used.
     * @see   setFilter
     * @since DOM Level 3
     */
    virtual DOMWriterFilter*   getFilter() const = 0;

    // -----------------------------------------------------------------------
    //  Write methods
    // -----------------------------------------------------------------------
    /**
     * Write out the specified node as described above in the description of
     * <code>DOMWriter</code>. Writing a Document or Entity node produces a
     * serialized form that is well formed XML. Writing other node types
     * produces a fragment of text in a form that is not fully defined by
     * this document, but that should be useful to a human for debugging or
     * diagnostic purposes.
     *
     *  <p><b>"Experimental - subject to change"</b></p>
     *
     * @param destination The destination for the data to be written.
     * @param nodeToWrite The <code>Document</code> or <code>Entity</code> node to
     *   be written. For other node types, something sensible should be
     *   written, but the exact serialized form is not specified.
     * @return  Returns <code>true</code> if <code>nodeToWrite</code> was
     *   successfully serialized and <code>false</code> in case a failure
     *   occurred and the failure wasn't canceled by the error handler.
     * @since DOM Level 3
     */
    virtual bool       writeNode(XMLFormatTarget* const destination
                               , const DOMNode         &nodeToWrite) = 0;

    /**
     * Serialize the specified node as described above in the description of
     * <code>DOMWriter</code>. The result of serializing the node is
     * returned as a string. Writing a Document or Entity node produces a
     * serialized form that is well formed XML. Writing other node types
     * produces a fragment of text in a form that is not fully defined by
     * this document, but that should be useful to a human for debugging or
     * diagnostic purposes.
     *
     *  <p><b>"Experimental - subject to change"</b></p>
     *
     * @param nodeToWrite  The node to be written.
     * @return  Returns the serialized data, or <code>null</code> in case a
     *   failure occurred and the failure wasn't canceled by the error
     *   handler. The returned string is always in UTF-16.
     *   The encoding information available in DOMWriter is ignored in writeToString().
     * @since DOM Level 3
     */
    virtual XMLCh*     writeToString(const DOMNode &nodeToWrite) = 0;

    //@}

    // -----------------------------------------------------------------------
    //  Non-standard Extension
    // -----------------------------------------------------------------------
    /** @name Non-standard Extension */
    //@{
    /**
     * Called to indicate that this Writer is no longer in use
     * and that the implementation may relinquish any resources associated with it.
     *
     * Accessing a released object will lead to unexpected results.
     */
    virtual void              release() = 0;
    //@}


};

XERCES_CPP_NAMESPACE_END

#endif
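
The escaping and quoting rules in the DOMWriter class comment can be illustrated with a small standalone sketch. It is not Xerces-C code: `escapeCharData` and `quoteAttrValue` are hypothetical helpers written only for this illustration, they work on plain `std::string` rather than `XMLCh`, and they leave out the numeric character references used for characters the output encoding cannot represent. Escaping '<' and '&' inside attribute values is an assumption taken from XML well-formedness rules; the class comment only states the quoting rule.

```cpp
#include <string>

// Hypothetical helper (not part of Xerces-C): escape character data as the
// class comment describes. Only '<' and '&' become predefined entities;
// '>', '\'' and '"' are written unchanged.
std::string escapeCharData(const std::string& text) {
    std::string out;
    for (char c : text) {
        if (c == '<')      out += "&lt;";
        else if (c == '&') out += "&amp;";
        else               out += c;
    }
    return out;
}

// Hypothetical helper: write an attribute value together with its delimiters.
//   no '"' in the value         -> delimit with double quotes
//   '"' but no '\''             -> delimit with apostrophes
//   both '"' and '\''           -> double quotes, embedded '"' as &quot;
std::string quoteAttrValue(const std::string& value) {
    const bool hasQuote = value.find('"') != std::string::npos;
    const bool hasApos  = value.find('\'') != std::string::npos;
    const char delim = (hasQuote && !hasApos) ? '\'' : '"';

    std::string out(1, delim);
    for (char c : value) {
        // Assumption: '<' and '&' are escaped here because XML requires it.
        if (c == '<')                      out += "&lt;";
        else if (c == '&')                 out += "&amp;";
        else if (c == '"' && delim == '"') out += "&quot;";
        else                               out += c;
    }
    out += delim;
    return out;
}
```

For instance, a value containing only double quotes comes back wrapped in apostrophes with its content untouched, while a value containing both kinds of quote is wrapped in double quotes with each embedded double quote written as `&quot;`.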
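
The feature contract (`canSetFeature`, `setFeature`, `getFeature`) and the encoding fallback documented for `setEncoding` can be modelled in a few lines. The sketch below is a toy stand-in, not the Xerces-C implementation: `FeatureTable` and `resolveEncoding` are hypothetical names, only a subset of the listed features is included, and it assumes every known feature accepts both values, so the NOT_SUPPORTED_ERR case is not modelled. Unknown names throw `std::out_of_range` in place of a `DOMException` with NOT_FOUND_ERR.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Toy feature table (hypothetical; not Xerces-C code).
class FeatureTable {
public:
    FeatureTable() {
        // Defaults as listed in the class comment (subset).
        state_["split-cdata-sections"]          = true;
        state_["whitespace-in-element-content"] = true;
        state_["discard-default-content"]       = true;
        state_["format-canonical"]              = false;
        state_["format-pretty-print"]           = false;
    }

    // Like canSetFeature: reports support without changing any state.
    // Assumption: a recognized feature supports both true and false.
    bool canSet(const std::string& name, bool /*value*/) const {
        return state_.count(name) != 0;
    }

    // Like getFeature: an unrecognized name is an error.
    bool get(const std::string& name) const {
        auto it = state_.find(name);
        if (it == state_.end())
            throw std::out_of_range("NOT_FOUND_ERR: " + name);
        return it->second;
    }

    // Like setFeature, including the documented coupling: turning on one of
    // format-canonical / format-pretty-print turns the other one off.
    void set(const std::string& name, bool value) {
        if (!canSet(name, value))
            throw std::out_of_range("NOT_FOUND_ERR: " + name);
        state_[name] = value;
        if (value && name == "format-canonical")
            state_["format-pretty-print"] = false;
        if (value && name == "format-pretty-print")
            state_["format-canonical"] = false;
    }

private:
    std::map<std::string, bool> state_;
};

// Encoding resolution order from the setEncoding comment: the explicitly set
// encoding, else the encoding declared by the item being written, else UTF-8.
// Empty strings stand in for "null or empty".
std::string resolveEncoding(const std::string& explicitEncoding,
                            const std::string& declaredEncoding) {
    if (!explicitEncoding.empty()) return explicitEncoding;
    if (!declaredEncoding.empty()) return declaredEncoding;
    return "UTF-8";
}
```

Calling `canSet` before `set` mirrors the intended use of `canSetFeature`: a caller can probe for an optional feature without triggering an exception.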