DOMWriter.hpp @ 2674

Revision 2674, 23.3 KB checked in by mattausch, 16 years ago (diff)

#ifndef DOMWriter_HEADER_GUARD_
#define DOMWriter_HEADER_GUARD_

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/*
 * $Id: DOMWriter.hpp 568078 2007-08-21 11:43:25Z amassari $
 */

/**
 *
 * DOMWriter provides an API for serializing (writing) a DOM document out as
 * an XML document. The XML data is written to an output stream, the type of
 * which depends on the specific language bindings in use. During
 * serialization of XML data, namespace fixup is done when possible.
 * <p> <code>DOMWriter</code> accepts any node type for serialization. For
 * nodes of type <code>Document</code> or <code>Entity</code>, well formed
 * XML will be created if possible. The serialized output for these node
 * types is either a Document or an External Entity, respectively, and is
 * acceptable input for an XML parser. For all other types of nodes the
 * serialized form is not specified, but should be something useful to a
 * human for debugging or diagnostic purposes. Note: rigorously designing an
 * external (source) form for stand-alone node types that don't already have
 * one defined seems a bit much to take on here.
 * <p>Within a Document or Entity being serialized, Nodes are processed as
 * follows:
 * <ul>
 * <li>Documents are written including an XML declaration and a DTD subset,
 * if one exists in the DOM. Writing a document node serializes the entire
 * document.</li>
 * <li>Entity nodes, when written directly by <code>writeNode</code> defined
 * in the <code>DOMWriter</code> interface, output the entity expansion but
 * no namespace fixup is done. The resulting output will be valid as an
 * external entity.</li>
 * <li>Entity reference nodes are serialized as an entity reference of the
 * form <code>"&entityName;"</code> in the output. Child nodes (the
 * expansion) of the entity reference are ignored.</li>
 * <li>CDATA sections containing content characters that cannot be
 * represented in the specified output encoding are handled according to the
 * "split-cdata-sections" feature. If the feature is <code>true</code>, CDATA
 * sections are split, and the unrepresentable characters are serialized as
 * numeric character references in ordinary content. The exact position and
 * number of splits is not specified. If the feature is <code>false</code>,
 * unrepresentable characters in a CDATA section are reported as errors. The
 * error is not recoverable: there is no mechanism for supplying
 * alternative characters and continuing with the serialization.</li>
 * <li>All other node types (DOMElement, DOMText, etc.) are serialized to
 * their corresponding XML source form.</li>
 * </ul>
 * <p> Within the character data of a document (outside of markup), any
 * characters that cannot be represented directly are replaced with
 * character references. Occurrences of '<' and '&' are replaced by
 * the predefined entities &lt; and &amp;. The other predefined
 * entities (&gt;, &apos;, etc.) are not used; these characters can be
 * included directly. Any character that cannot be represented directly in
 * the output character encoding is serialized as a numeric character
 * reference.
 * <p> Attributes not containing quotes are serialized in quotes. Attributes
 * containing quotes but no apostrophes are serialized in apostrophes
 * (single quotes). Attributes containing both forms of quotes are
 * serialized in quotes, with quotes within the value represented by the
 * predefined entity &quot;. Any character that cannot be represented
 * directly in the output character encoding is serialized as a numeric
 * character reference.
 * <p> Within markup, but outside of attributes, any occurrence of a character
 * that cannot be represented in the output character encoding is reported
 * as an error. An example would be serializing the element
 * <LaCañada/> with encoding="us-ascii".
 * <p> When requested by setting the <code>normalize-characters</code> feature
 * on <code>DOMWriter</code>, all data to be serialized, both markup and
 * character data, is W3C Text normalized. The W3C Text normalization
 * process affects only the data as it is being written; it does not alter
 * the DOM's view of the document after serialization has completed.
 * <p>Namespaces are fixed up during serialization: the serialization process
 * verifies that namespace declarations, namespace prefixes and the
 * namespace URIs associated with Elements and Attributes are consistent. If
 * inconsistencies are found, the serialized form of the document will be
 * altered to remove them. The algorithm used for doing the namespace fixup
 * while serializing a document is a combination of the algorithms used for
 * lookupNamespaceURI and lookupNamespacePrefix. Note: the previous
 * paragraph is to be defined more closely here.
 * <p>Any changes made affect only the namespace prefixes and declarations
 * appearing in the serialized data. The DOM's view of the document is not
 * altered by the serialization operation, and does not reflect any changes
 * made to namespace declarations or prefixes in the serialized output.
 * <p> While serializing a document the serializer will write out
 * non-specified values (such as attributes whose <code>specified</code> is
 * <code>false</code>) if the <code>output-default-values</code> feature is
 * set to <code>true</code>. If the <code>output-default-values</code> flag
 * is set to <code>false</code> and the <code>use-abstract-schema</code>
 * feature is set to <code>true</code>, the abstract schema will be used to
 * determine if a value is specified or not. If
 * <code>use-abstract-schema</code> is not set, the <code>specified</code>
 * flag on attribute nodes is used to determine if attribute values should
 * be written out.
 * <p> Note: refer to the Core spec (1.1.9, XML namespaces, 5th paragraph)
 * entity reference description for the warning about unbound entity
 * references. Entity references are always serialized as &foo;; this
 * should also be mentioned in the load part of this spec.
 * <p> When serializing a document the DOMWriter checks to see if the document
 * element in the document is a DOM Level 1 element or a DOM Level 2 (or
 * higher) element (this check is done by looking at the localName of the
 * root element). If the root element is a DOM Level 1 element then the
 * DOMWriter will issue an error if a DOM Level 2 (or higher) element is
 * found while serializing. Likewise, if the document element is a DOM Level
 * 2 (or higher) element and the DOMWriter sees a DOM Level 1 element, an
 * error is issued. Mixing DOM Level 1 elements with DOM Level 2 (or higher)
 * elements is not supported.
 * <p> <code>DOMWriter</code>s have a number of named features that can be
 * queried or set. The names of <code>DOMWriter</code> features must be valid
 * XML names. Implementation specific features (extensions) should choose an
 * implementation dependent prefix to avoid name collisions.
 * <p>Here is a list of features that must be recognized by all
 * implementations.
 * <dl>
 * <dt><code>"normalize-characters"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[optional] (default) Perform the W3C Text Normalization of the
 * characters in the document as they are written out. Only the characters
 * being written are (potentially) altered. The DOM document itself is
 * unchanged. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required] Do not perform character normalization. </dd>
 * </dl></dd>
 * <dt><code>"split-cdata-sections"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[required] (default) Split CDATA sections containing the CDATA
 * section termination marker ']]>' or characters that cannot be
 * represented in the output encoding, and output the characters using
 * numeric character references. If a CDATA section is split a warning is
 * issued. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required] Signal an error if a <code>CDATASection</code> contains an
 * unrepresentable character. </dd>
 * </dl></dd>
 * <dt><code>"validation"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[optional] Use the abstract schema to validate the document as it is
 * being serialized. If validation errors are found the error handler is
 * notified about the error. Setting this state will also set the feature
 * <code>use-abstract-schema</code> to <code>true</code>. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required] (default) Don't validate the document as it is being
 * serialized. </dd>
 * </dl></dd>
 * <dt><code>"expand-entity-references"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[optional] Expand <code>EntityReference</code> nodes when
 * serializing. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required] (default) Serialize all <code>EntityReference</code> nodes
 * as XML entity references. </dd>
 * </dl></dd>
 * <dt><code>"whitespace-in-element-content"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[required] (default) Output all white space in the document. </dd>
 * <dt><code>false</code></dt>
 * <dd>[optional] Only output white space that is not within element
 * content. The implementation is expected to use the
 * <code>isWhitespaceInElementContent</code> flag on <code>Text</code> nodes
 * to determine if a text node should be written out or not. </dd>
 * </dl></dd>
 * <dt><code>"discard-default-content"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[required] (default) Use whatever information is available to the
 * implementation (e.g. XML schema, DTD, the <code>specified</code> flag on
 * <code>Attr</code> nodes, and so on) to decide what attributes and content
 * should be serialized or not. Note that the <code>specified</code> flag on
 * <code>Attr</code> nodes in itself is not always reliable; it is only
 * reliable when it is set to <code>false</code>, since the only case where
 * it can be set to <code>false</code> is if the attribute was created by a
 * Level 1 implementation. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required] Output all attributes and all content. </dd>
 * </dl></dd>
 * <dt><code>"format-canonical"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[optional] This formatting writes the document in canonical form.
 * Setting this feature to true will set the feature "format-pretty-print"
 * to false. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required] (default) Don't canonicalize the output. </dd>
 * </dl></dd>
 * <dt><code>"format-pretty-print"</code></dt>
 * <dd>
 * <dl>
 * <dt><code>true</code></dt>
 * <dd>[optional] Format the output by adding whitespace to produce a
 * pretty-printed, indented, human-readable form. The exact form of the
 * transformations is not specified by this specification. Setting this
 * feature to true will set the feature "format-canonical" to false. </dd>
 * <dt><code>false</code></dt>
 * <dd>[required] (default) Don't pretty-print the result. </dd>
 * </dl></dd>
 * </dl>
 * <p>See also the <a href='http://www.w3.org/TR/2002/WD-DOM-Level-3-ASLS-20020409'>Document Object Model (DOM) Level 3 Abstract Schemas and Load
 * and Save Specification</a>.
 *
 * @since DOM Level 3
 */


#include <xercesc/dom/DOMNode.hpp>
#include <xercesc/dom/DOMWriterFilter.hpp>
#include <xercesc/dom/DOMErrorHandler.hpp>
#include <xercesc/framework/XMLFormatter.hpp>

XERCES_CPP_NAMESPACE_BEGIN

class CDOM_EXPORT DOMWriter {
protected :
    // -----------------------------------------------------------------------
    //  Hidden constructors
    // -----------------------------------------------------------------------
    /** @name Hidden constructors */
    //@{
    DOMWriter() {}
    //@}
private:
    // -----------------------------------------------------------------------
    //  Unimplemented constructors and operators
    // -----------------------------------------------------------------------
    /** @name Unimplemented constructors and operators */
    //@{
    DOMWriter(const DOMWriter &);
    DOMWriter & operator = (const DOMWriter &);
    //@}


public:
    // -----------------------------------------------------------------------
    //  All constructors are hidden, just the destructor is available
    // -----------------------------------------------------------------------
    /** @name Destructor */
    //@{
    /**
     * Destructor
     *
     */
    virtual ~DOMWriter() {}
    //@}

    // -----------------------------------------------------------------------
    //  Virtual DOMWriter interface
    // -----------------------------------------------------------------------
    /** @name Functions introduced in DOM Level 3 */
    //@{
    // -----------------------------------------------------------------------
    //  Feature methods
    // -----------------------------------------------------------------------
    /**
     * Query whether setting a feature to a specific value is supported.
     * <br>The feature name has the same form as a DOM hasFeature string.
     *
     * <p><b>"Experimental - subject to change"</b></p>
     *
     * @param featName The feature name, which is a DOM has-feature style string.
     * @param state The requested state of the feature (<code>true</code> or
     *   <code>false</code>).
     * @return <code>true</code> if the feature could be successfully set to
     *   the specified value, or <code>false</code> if the feature is not
     *   recognized or the requested value is not supported. The value of
     *   the feature itself is not changed.
     * @since DOM Level 3
     */
    virtual bool canSetFeature(const XMLCh* const featName
                             , bool state) const = 0;
    /**
     * Set the state of a feature.
     * <br>The feature name has the same form as a DOM hasFeature string.
     * <br>It is possible for a <code>DOMWriter</code> to recognize a feature
     * name but to be unable to set its value.
     *
     * <p><b>"Experimental - subject to change"</b></p>
     *
     * @param featName The feature name.
     * @param state The requested state of the feature (<code>true</code> or
     *   <code>false</code>).
     * @exception DOMException
     *   Raise a NOT_SUPPORTED_ERR exception when the <code>DOMWriter</code>
     *   recognizes the feature name but cannot set the requested value.
     *   <br>Raise a NOT_FOUND_ERR when the <code>DOMWriter</code> does not
     *   recognize the feature name.
     * @see getFeature
     * @since DOM Level 3
     */
    virtual void setFeature(const XMLCh* const featName
                          , bool state) = 0;

    /**
     * Look up the value of a feature.
     * <br>The feature name has the same form as a DOM hasFeature string.
     * @param featName The feature name, which is a string with DOM has-feature
     *   syntax.
     * @return The current state of the feature (<code>true</code> or
     *   <code>false</code>).
     * @exception DOMException
     *   Raise a NOT_FOUND_ERR when the <code>DOMWriter</code> does not
     *   recognize the feature name.
     *
     * <p><b>"Experimental - subject to change"</b></p>
     *
     * @see setFeature
     * @since DOM Level 3
     */
    virtual bool getFeature(const XMLCh* const featName) const = 0;

    // -----------------------------------------------------------------------
    //  Setter methods
    // -----------------------------------------------------------------------
    /**
     * The character encoding in which the output will be written.
     * <br> The encoding to use when writing is determined as follows: If the
     * encoding attribute has been set, that value will be used. If the
     * encoding attribute is <code>null</code> or empty, but the item to be
     * written includes an encoding declaration, that value will be used. If
     * neither of the above provides an encoding name, a default encoding of
     * "UTF-8" will be used.
     * <br>The default value is <code>null</code>.
     *
     * <p><b>"Experimental - subject to change"</b></p>
     *
     * @param encoding The character encoding in which the output will be written.
     * @see getEncoding
     * @since DOM Level 3
     */
    virtual void setEncoding(const XMLCh* const encoding) = 0;

    /**
     * The end-of-line sequence of characters to be used in the XML being
     * written out. The only permitted values are these:
     * <dl>
     * <dt><code>null</code></dt>
     * <dd>Use a default end-of-line sequence. DOM implementations should
     * choose the default to match the usual convention for text files in the
     * environment being used. Implementations must choose a default
     * sequence that matches one of those allowed by section 2.11,
     * "End-of-Line Handling", of the XML 1.0 specification. </dd>
     * <dt>CR</dt>
     * <dd>The carriage-return character (#xD).</dd>
     * <dt>CR-LF</dt>
     * <dd>The carriage-return and line-feed characters (#xD #xA). </dd>
     * <dt>LF</dt>
     * <dd>The line-feed character (#xA). </dd>
     * </dl>
     * <br>The default value for this attribute is <code>null</code>.
     *
     * <p><b>"Experimental - subject to change"</b></p>
     *
     * @param newLine The end-of-line sequence of characters to be used.
     * @see getNewLine
     * @since DOM Level 3
     */
    virtual void setNewLine(const XMLCh* const newLine) = 0;

    /**
     * The error handler that will receive error notifications during
     * serialization. The node where the error occurred is passed to this
     * error handler. Any modification to nodes from within an error
     * callback should be avoided, since this will result in undefined,
     * implementation dependent behavior.
     *
     * <p><b>"Experimental - subject to change"</b></p>
     *
     * @param errorHandler The error handler to be used.
     * @see getErrorHandler
     * @since DOM Level 3
     */
    virtual void setErrorHandler(DOMErrorHandler *errorHandler) = 0;

    /**
     * When the application provides a filter, the serializer will call out
     * to the filter before serializing each Node. Attribute nodes are never
     * passed to the filter. The filter implementation can choose to remove
     * the node from the stream or to terminate the serialization early.
     *
     * <p><b>"Experimental - subject to change"</b></p>
     *
     * @param filter The writer filter to be used.
     * @see getFilter
     * @since DOM Level 3
     */
    virtual void setFilter(DOMWriterFilter *filter) = 0;

    // -----------------------------------------------------------------------
    //  Getter methods
    // -----------------------------------------------------------------------
    /**
     * Return the character encoding in which the output will be written.
     *
     * <p><b>"Experimental - subject to change"</b></p>
     *
     * @return The character encoding used.
     * @see setEncoding
     * @since DOM Level 3
     */
    virtual const XMLCh* getEncoding() const = 0;

    /**
     * Return the end-of-line sequence of characters to be used in the XML being
     * written out.
     *
     * <p><b>"Experimental - subject to change"</b></p>
     *
     * @return The end-of-line sequence of characters to be used.
     * @see setNewLine
     * @since DOM Level 3
     */
    virtual const XMLCh* getNewLine() const = 0;

    /**
     * Return the error handler that will receive error notifications during
     * serialization.
     *
     * <p><b>"Experimental - subject to change"</b></p>
     *
     * @return The error handler to be used.
     * @see setErrorHandler
     * @since DOM Level 3
     */
    virtual DOMErrorHandler* getErrorHandler() const = 0;

    /**
     * Return the WriterFilter used.
     *
     * <p><b>"Experimental - subject to change"</b></p>
     *
     * @return The writer filter used.
     * @see setFilter
     * @since DOM Level 3
     */
    virtual DOMWriterFilter* getFilter() const = 0;

    // -----------------------------------------------------------------------
    //  Write methods
    // -----------------------------------------------------------------------
    /**
     * Write out the specified node as described above in the description of
     * <code>DOMWriter</code>. Writing a Document or Entity node produces a
     * serialized form that is well formed XML. Writing other node types
     * produces a fragment of text in a form that is not fully defined by
     * this document, but that should be useful to a human for debugging or
     * diagnostic purposes.
     *
     * <p><b>"Experimental - subject to change"</b></p>
     *
     * @param destination The destination for the data to be written.
     * @param nodeToWrite The <code>Document</code> or <code>Entity</code> node to
     *   be written. For other node types, something sensible should be
     *   written, but the exact serialized form is not specified.
     * @return Returns <code>true</code> if <code>nodeToWrite</code> was
     *   successfully serialized and <code>false</code> in case a failure
     *   occurred and the failure wasn't canceled by the error handler.
     * @since DOM Level 3
     */
    virtual bool writeNode(XMLFormatTarget* const destination
                         , const DOMNode &nodeToWrite) = 0;

    /**
     * Serialize the specified node as described above in the description of
     * <code>DOMWriter</code>. The result of serializing the node is
     * returned as a string. Writing a Document or Entity node produces a
     * serialized form that is well formed XML. Writing other node types
     * produces a fragment of text in a form that is not fully defined by
     * this document, but that should be useful to a human for debugging or
     * diagnostic purposes.
     *
     * <p><b>"Experimental - subject to change"</b></p>
     *
     * @param nodeToWrite The node to be written.
     * @return Returns the serialized data, or <code>null</code> in case a
     *   failure occurred and the failure wasn't canceled by the error
     *   handler. The returned string is always in UTF-16.
     *   The encoding information available in DOMWriter is ignored in writeToString().
     * @since DOM Level 3
     */
    virtual XMLCh* writeToString(const DOMNode &nodeToWrite) = 0;

    //@}

    // -----------------------------------------------------------------------
    //  Non-standard Extension
    // -----------------------------------------------------------------------
    /** @name Non-standard Extension */
    //@{
    /**
     * Called to indicate that this Writer is no longer in use
     * and that the implementation may relinquish any resources associated with it.
     *
     * Accessing a released object leads to unexpected results.
     */
    virtual void release() = 0;
    //@}


};

XERCES_CPP_NAMESPACE_END

#endif
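
The escaping and quoting rules described in the class comment can be sketched in standalone C++. This is an illustrative sketch only, not Xerces-C code: `escapeCharData` and `quoteAttribute` are hypothetical helper names, they operate on narrow `std::string` rather than `XMLCh*`, and they leave out the numeric-character-reference fallback for characters the output encoding cannot represent.

```cpp
#include <string>

// Hypothetical helper (not part of Xerces-C): escape character data the way
// the DOMWriter comment describes. Only '<' and '&' are replaced by the
// predefined entities; '>', '\'' and '"' are written through unchanged.
std::string escapeCharData(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        switch (text[i]) {
            case '<': out += "&lt;";  break;
            case '&': out += "&amp;"; break;
            default:  out += text[i]; break;
        }
    }
    return out;
}

// Hypothetical helper (not part of Xerces-C): pick the attribute delimiter.
//   no '"' in the value         -> wrap in double quotes
//   '"' present, no '\''        -> wrap in apostrophes
//   both '"' and '\'' present   -> wrap in double quotes, write '"' as &quot;
std::string quoteAttribute(const std::string& value) {
    const bool hasQuote = value.find('"') != std::string::npos;
    const bool hasApos  = value.find('\'') != std::string::npos;
    if (!hasQuote) {
        return '"' + value + '"';
    }
    if (!hasApos) {
        return '\'' + value + '\'';
    }
    std::string out = "\"";
    for (std::string::size_type i = 0; i < value.size(); ++i) {
        if (value[i] == '"') {
            out += "&quot;";
        } else {
            out += value[i];
        }
    }
    out += '"';
    return out;
}
```

For example, `quoteAttribute("plain")` yields the value wrapped in double quotes, while a value that contains a double quote but no apostrophe comes back wrapped in apostrophes.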

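The encoding fallback documented for `setEncoding` is a three-step rule, sketched below in standalone C++. `chooseEncoding` is a hypothetical helper, not a Xerces-C function, and it assumes an empty `std::string` stands in for a `null` or unset value.

```cpp
#include <string>

// Hypothetical helper (not part of Xerces-C) mirroring the setEncoding rule:
//   1. use the encoding attribute if it has been set;
//   2. otherwise use the encoding declared by the item being written;
//   3. otherwise fall back to "UTF-8".
// An empty string represents a null/unset value in this sketch.
std::string chooseEncoding(const std::string& encodingAttribute,
                           const std::string& declaredEncoding) {
    if (!encodingAttribute.empty()) {
        return encodingAttribute;
    }
    if (!declaredEncoding.empty()) {
        return declaredEncoding;
    }
    return "UTF-8";
}
```

Note that, per the `writeToString` comment, this selection applies to `writeNode` only; `writeToString` always returns UTF-16.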
source: NonGTP/Xerces/xerces-c_2_8_0/include/xercesc/dom/DOMWriter.hpp @ 2674