\chapter{Online Visibility Culling}
%*****************************************************************************

\section{Introduction}

Visibility culling is one of the major acceleration techniques for
the real-time rendering of complex scenes. The ultimate goal of
visibility culling techniques is to prevent invisible objects from
being sent to the rendering pipeline. A standard visibility-culling
technique is {\em view-frustum culling}, which eliminates objects
outside of the current view frustum. View-frustum culling is a fast
and simple technique, but it does not eliminate objects in the view
frustum that are occluded by other objects. This can lead to significant
{\em overdraw}, i.e., the same image area gets covered more than
once. Overdraw wastes computational effort in both the pixel and the
vertex processing stages of modern graphics hardware.
%that is, we render each pixel in the image several times. For many complex
%scenes with high occlusion, the effort wasted on overdraw cannot simply be
%eliminated by increasing the raw computational power of the hardware.
%This problem becomes even more apparent on recent graphics hardware with
%programmable vertex and pixel shaders: all complex per-vertex or per-pixel
%computations on an object are completely wasted if the object is occluded.
The elimination of occluded objects is addressed by {\em occlusion
culling}. In an optimized rendering pipeline, occlusion culling complements
other rendering acceleration techniques such as levels of detail or
impostors.
---|

Occlusion culling can be applied either offline or online. When
applied offline as a preprocess, we compute a potentially visible set
(PVS) for cells of a fixed subdivision of the scene. At runtime, we
can quickly identify a PVS for the given viewpoint. However, this
approach suffers from four major problems: (1) the PVS is valid only
for the original static scene configuration, (2) for a given
viewpoint, the corresponding cell-based PVS can be overly
conservative, (3) computing all PVSs is computationally expensive, and (4)
an accurate PVS computation is difficult to implement for general
scenes. Online occlusion culling can solve these problems at the cost
of performing extra computations in each frame. To make these additional
computations efficient, most online occlusion culling methods rely on a
number of assumptions about the scene structure and its occlusion
characteristics (e.g., presence of large occluders, occluder
connectivity, occlusion by a few closest depth layers).
---|

Recent graphics hardware natively supports an \emph{occlusion query}
that detects the visibility of an object against the current contents of the
z-buffer. Although the query itself is processed quickly using the
raw power of the graphics processing unit (GPU), its result is not
available immediately due to the delay between issuing the query and
its actual processing in the graphics pipeline. As a result, a naive
application of occlusion queries can even decrease the overall
application performance due to the associated CPU stalls and GPU
starvation. In this chapter, we present an algorithm that aims to
overcome these problems by reducing the number of issued queries and
eliminating CPU stalls and GPU starvation. To schedule the
queries, the algorithm makes use of both the spatial and the temporal
coherence of visibility. A major strength of our technique is its
simplicity and versatility: the method can be easily integrated into
existing real-time rendering packages on architectures supporting the
underlying occlusion query.
---|

%Using spatial and assuming temporal coherence

\section{Related Work}

---|
With the demand for rendering scenes of ever increasing size,
a number of visibility culling methods have been developed in the
last decade. A comprehensive survey of visibility culling methods
was presented by Cohen-Or et al.~\cite{Cohen:2002:survey}. Another
recent survey by Bittner and Wonka~\cite{bittner03:jep} discusses
visibility culling in the broader context of other visibility problems.

---|
According to the domain of visibility computation, we distinguish
between {\em from-point} and {\em from-region} visibility algorithms.
From-region algorithms compute a PVS and are applied offline in a
preprocessing phase~\cite{Airey90,Teller91a,Leyvand:2003:RSF}.
From-point algorithms are applied online for each particular
viewpoint~\cite{Greene93a,Hudson97,EVL-1997-163,bittner98b,Wonka:1999:OSF,Klosowski:2001:ECV}.
In the following discussion we focus on online occlusion culling methods
that exploit graphics hardware.
---|

A conceptually important online occlusion culling method is the
hierarchical z-buffer introduced by Greene et al.~\cite{Greene93a}. It
organizes the z-buffer as a pyramid, where the standard z-buffer is
the finest level. At all other levels, each z-value is the farthest in
the window corresponding to the adjacent finer level. The hierarchical
z-buffer allows us to quickly determine whether the geometry in question is
occluded. To a certain extent this idea is used in the current
generation of graphics hardware by applying early z-tests of
fragments in the graphics pipeline (e.g., the Hyper-Z technology of ATI or
the Z-cull of NVIDIA). However, the geometry still needs to be sent to the
GPU, transformed, and coarsely rasterized even if it is later
determined invisible.
---|

Zhang~\cite{EVL-1997-163} proposed hierarchical occlusion maps, which
do not rely on hardware support for the z-pyramid, but instead
make use of hardware texturing. The hierarchical occlusion map is
computed on the GPU by rasterizing and downsampling a given set of
occluders. The occlusion map is used for overlap tests, whereas the
depths are compared using a coarse depth estimation buffer. Wonka and
Schmalstieg~\cite{Wonka:1999:OSF} use occluder shadows to compute
from-point visibility in \m25d scenes with the help of the GPU. This
method has been further extended to online computation of from-region
visibility executed on a server~\cite{wonka:01:eg}.
---|

%% In
%% parallel to rendering, the visibility server calculates visibility for
%% the neighborhood of the given viewpoint and sends them back to the
%% display host.

%% A similar method designed by Aila and
%% Miettinen~\cite{aila:00:msc}. The incremental occlusion maps are
%% created on the CPU, which takes the CPU computational time but it does
%% not suffer the problem of slow read-back from the GPU.

%Klosowski:2001:ECV

---|
Bartz et al.~\cite{Bartz98} proposed an OpenGL extension for occlusion
queries along with a discussion concerning a potential realization in
hardware. A first hardware implementation of occlusion queries came
with the VISUALIZE fx graphics hardware~\cite{Scott:1998:OVF}. The
corresponding OpenGL extension is called \hpot{}. A more recent OpenGL
extension, \nvoq{}, was introduced by NVIDIA with the GeForce 3 graphics
card and is now also available as an official ARB extension.

%Both
%extensions have been used in several occlusion culling
%algorithms~\cite{Meissner01,Hillesland02,Staneker:2004:OCO}.
%\cite{Meissner01}

---|
Hillesland et al.~\cite{Hillesland02} have proposed an algorithm
which employs the \nvoq. They subdivide the scene using a uniform
grid. The cubes are then traversed in slabs roughly perpendicular to
the viewport. The queries are issued for all cubes of a slab at once,
after the visible geometry of this slab has been rendered. The method
can also use nested grids: a cell of the grid contains another grid
that is traversed if the cell is proven visible. This method, however,
does not exploit the temporal and spatial coherence of visibility, and it
is restricted to regular subdivision data structures. Our new method
addresses both of these problems and provides natural extensions to
balance the accuracy of visibility classification and the associated
computational costs.

---|
Recently, Staneker et al.~\cite{Staneker:2004:OCO} developed a method
integrating occlusion culling into the OpenSG scene graph
framework. Their technique uses occupancy maps maintained in software
to avoid queries on visible scene graph nodes, and temporal coherence
to reduce the number of occlusion queries. The drawback of the method
is that it performs the queries in a serial fashion and thus
suffers from CPU stalls and GPU starvation.

---|
%% The methods proposed in this paper can be used to make use of temporal
%% and spatial coherence in the scope of existing visibility algorithms,
%% that utilise a spatial hierarchy. Examples of these are algorithms
%% based on hierarchical occlusion maps~\cite{Zhang97}, coverage
%% masks~\cite{Greene:1996:HPT}, shadow frusta~\cite{Hudson97}, and
%% occlusion trees~\cite{bittner98b_long}.

---|
On a theoretical level, our work is related to methods aiming to
exploit the temporal coherence of visibility. Greene et
al.~\cite{Greene93a} used the set of visible objects from one frame to
initialize the z-pyramid in the next frame in order to reduce the
overdraw of the hierarchical z-buffer. The algorithm of Coorg and
Teller~\cite{Coorg96b} restricts the hierarchical traversal to nodes
associated with visual events that were crossed between successive
viewpoint positions. Another method of Coorg and Teller~\cite{Coorg97}
exploits temporal coherence by caching occlusion relationships.
Chrysanthou and Slater have proposed a probabilistic scheme for
view-frustum culling~\cite{Chrysanthou:97}.

---|
The above mentioned methods for exploiting temporal coherence are
tightly interwoven with the particular culling algorithm. In
contrast, Bittner et al.~\cite{bittner01:jvca} presented a general
acceleration technique for exploiting spatial and temporal coherence
in hierarchical visibility algorithms. The central idea, which is also
vital for this chapter, is to avoid repeated visibility tests of
interior nodes of the hierarchy. The problem with a direct adoption of
this method is that it is designed for use with instantaneous CPU-based
occlusion queries, whereas hardware occlusion queries exhibit
significant latency. The method presented herein efficiently overcomes
the problem of latency while keeping the benefits of the generality and
simplicity of the original hierarchical technique. As a result we
obtain a simple and efficient occlusion culling algorithm utilizing
hardware occlusion queries.


%In this chapter we show how to overcome this problem while
%keeping the benefits of exploiting.

---|
The rest of the chapter is organized as follows:
Section~\ref{sec:hwqueries} discusses hardware supported occlusion
queries and a basic application of these queries using a kD-tree.
Section~\ref{sec:hoq} presents our new algorithm and
Section~\ref{sec:optimizations} describes several additional
optimizations. Section~\ref{sec:results} presents results obtained by
an experimental evaluation of the method and discusses its
behavior. Finally, Section~\ref{sec:conclusion} concludes the chapter.


---|
\section{Hardware Occlusion Queries}
\label{sec:hwqueries}

---|
Hardware occlusion queries follow a simple pattern: to test the visibility
of an occludee, we send its bounding volume to the GPU. The volume is
rasterized and its fragments are compared to the current contents of
the z-buffer. The GPU then returns the number of visible
fragments. If there is no visible fragment, the occludee is invisible
and need not be rendered.

---|
\subsection{Advantages of hardware occlusion queries}

There are several advantages of hardware occlusion queries:

---|
\begin{itemize}

\item {\em Generality of occluders.} We can use the original scene geometry
as occluders, since the queries use the current contents of the z-buffer.

\item {\em Occluder fusion.} The occluders are merged in the z-buffer,
so the queries automatically account for occluder fusion. Additionally,
this fusion comes for free since we use the intermediate result of the
rendering itself.

\item {\em Generality of occludees.} We can use complex
occludees. Anything that can be rasterized quickly is suitable.

\item {\em Exploiting the GPU power.} The queries take full advantage of
the high fill rates and internal parallelism provided by modern GPUs.

\item {\em Simple use.} Hardware occlusion queries can be easily
integrated into a rendering algorithm. They provide a powerful tool to
minimize the implementation effort, especially when compared to
CPU-based occlusion culling.

\end{itemize}


---|
\subsection{Problems of hardware occlusion queries}

Currently there are two main hardware supported variants of occlusion
queries: the HP test (\hpot) and the more recent NV query (\nvoq, now
also available as \arboq). The most important difference between the
HP test and the NV query is that multiple NV queries can be
issued before asking for their results, while only one HP test is
allowed at a time, which severely limits its possible algorithmic
usage. Additionally, the NV query returns the number of visible pixels,
whereas the HP test returns only a binary visibility
classification.

---|
%% Issuing multiple independent NV queries is crucial for
%% the design of algorithms that strive to exploit the power of the GPU.

The main problem of both the HP test and the NV query is the latency
between issuing the query and the availability of the result. The
latency occurs due to the delayed processing of the query in a long
graphics pipeline, the cost of processing the query itself, and the
cost of transferring the result back to the CPU. The latency causes
two major problems: CPU stalls and GPU starvation. After issuing the
query, the CPU waits for its result and does not feed the GPU with new
data. When the result finally becomes available, the GPU pipeline can
already be empty. Thus the GPU needs to wait for the CPU to process
the result of the query and to feed the GPU with new data.

A major challenge when using hardware occlusion queries is to avoid
the CPU stalls by filling the latency time with other tasks, such as
rendering visible scene objects or issuing other, independent
occlusion queries (see Figure~\ref{fig:latency}).

---|

\begin{figure}[htb]
\centerline{
\includegraphics[width=0.5\textwidth,draft=\DRAFTFIGS]{figs/latency2}
}

\caption{(top) Illustration of CPU stalls and GPU starvation.
Qn, Rn, and Cn denote querying, rendering, and culling of object
n, respectively. Note that object 5 is found invisible by Q5 and
thus not rendered. (bottom) More efficient query scheduling. The
scheduling assumes that objects 4 and 6 will be visible in the
current frame and renders them without waiting for the result of
the corresponding queries.}
\label{fig:latency}
\end{figure}


---|
\subsection{Hierarchical stop-and-wait method}
\label{sec:basic}

Many rendering algorithms rely on hierarchical structures in order to
deal with complex scenes. In the context of occlusion culling, such a
data structure allows us to efficiently cull large scene blocks, to
exploit the spatial coherence of visibility, and thereby provides a key
to achieving output sensitivity.

This section outlines a naive application of occlusion queries in the
scope of a hierarchical algorithm. We refer to this approach as the
{\em hierarchical stop-and-wait} method. Our discussion is based on
kD-trees, which have proved to be efficient for point location, ray
tracing, and visibility
culling~\cite{MacDonald90,Hudson97,Coorg97,bittner01:jvca}. The
concept applies to general hierarchical data structures as well, though.

The hierarchical stop-and-wait method proceeds as follows: once a
kD-tree node passes view-frustum culling, it is tested for
occlusion by issuing the occlusion query and waiting for its
result. If the node is found visible, we continue by recursively
testing its children in a front-to-back order. If the node is a
leaf, we render its associated objects.

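A minimal sketch of this traversal, with a mock blocking query standing in for the hardware test (the node type and names are our own illustration):

```cpp
#include <cassert>
#include <vector>

// Hierarchical stop-and-wait traversal: query a node, block on the
// result, then either cull the subtree, recurse, or render a leaf.
struct Node {
    std::vector<Node*> children;   // empty for a leaf
    bool visible = false;          // mocked query result
    int objectCount = 0;           // geometry stored in a leaf
};

int renderedObjects = 0;

// Stands in for issuing a hardware occlusion query and waiting for
// its result; on real hardware this wait is what stalls the CPU.
bool issueQueryAndWait(const Node* n) {
    return n->visible;
}

void stopAndWait(Node* n) {
    if (!issueQueryAndWait(n))              // CPU stalls here on real HW
        return;                             // occluded: cull whole subtree
    if (n->children.empty()) {
        renderedObjects += n->objectCount;  // visible leaf: render it
        return;
    }
    for (Node* c : n->children)             // assumed front-to-back order
        stopAndWait(c);
}
```

Every single query blocks the traversal, which is precisely the weakness discussed next.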
---|
The problem with this approach is that we can continue the tree
traversal only when the result of the last occlusion query becomes
available. If the result is not available, we have to stall the CPU,
which causes significant performance penalties. As we document in
Section~\ref{sec:results}, these penalties together with the overhead
of the queries themselves can even decrease the overall application
performance compared to pure view-frustum culling. Our new method aims
to eliminate this problem by issuing multiple occlusion queries for
independent scene parts and exploiting the temporal coherence of
visibility classifications.


---|
\section{Coherent Hierarchical Culling}
\label{sec:hoq}

In this section we first present an overview of our new
algorithm. Then we discuss its steps in more detail.

%and present a
%generalization of the method to other spatial data structures.

\subsection{Algorithm Overview}
\label{sec:overview}

Our method is based on exploiting the temporal coherence of visibility
classification. In particular, it is centered on the following
three ideas:

---|
\begin{itemize}
\item We initiate occlusion queries on nodes of the hierarchy where
the traversal terminated in the last frame. Thus we avoid queries
on all previously visible interior nodes~\cite{bittner01:jvca}.
\item We assume that a previously visible leaf node remains visible
and render the associated geometry without waiting for the result
of the corresponding occlusion query.
\item Issued occlusion queries are stored in a query queue until they
are known to have been carried out by the GPU. This allows
interleaving the queries with the rendering of visible geometry.
\end{itemize}


---|
The algorithm performs a traversal of the hierarchy that is
terminated either at leaf nodes or at nodes that are classified as
invisible. Let us call such nodes the {\em termination nodes}, and
interior nodes that have been classified visible the {\em opened
nodes}. We denote the sets of termination and opened nodes in the $i$-th
frame $\mathcal{T}_i$ and $\mathcal{O}_i$, respectively. In the
$i$-th frame, we traverse the kD-tree in a front-to-back order, skip
all nodes of $\mathcal{O}_{i-1}$ and apply occlusion queries first on
the termination nodes $\mathcal{T}_{i-1}$. When reaching a
termination node, the algorithm proceeds as follows:

---|
\begin{itemize}
\item For a previously visible node (this must be a leaf), we issue
the occlusion query and store it in the query queue. Then we
immediately render the associated geometry without waiting for the
result of the query.
\item For a previously invisible node, we issue the query and store
it in the query queue.
\end{itemize}

---|
\begin{figure}[htb]
{\footnotesize
\input{code/pseudocode2}
}
\caption{Pseudo-code of coherent hierarchical culling.}
\label{fig:pseudocode}
\end{figure}

---|
When the query queue is not empty, we check whether the result of the
oldest query in the queue is already available. If the query result is
not available, we continue by recursively processing other nodes of
the kD-tree as described above. If the query result is available, we
fetch the result and remove the node from the query queue. If the node
is visible, we process its children recursively. Otherwise, the whole
subtree of the node is invisible and thus it is culled.

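The traversal and queue handling described above can be sketched as follows. This is our own simplified model, not the pseudo-code of Figure~\ref{fig:pseudocode}: query latency is mocked as a fixed number of polls before a result becomes available, and the pull-up of invisibility is reduced to updating per-node flags.

```cpp
#include <cassert>
#include <deque>
#include <vector>

struct Node {
    std::vector<Node*> children;   // empty for a leaf
    bool wasVisible = false;       // classification from the last frame
    bool queryResult = false;      // what the mock GPU will report
    int latency = 0;               // polls until the result is ready
};

std::deque<Node*> queryQueue;      // issued, not yet finished queries
std::vector<Node*> rendered;       // leaves whose geometry was drawn

void issueQuery(Node* n) { queryQueue.push_back(n); }

// Mock of "is the query result available?"; each poll advances time.
bool resultAvailable(Node* n) { return n->latency-- <= 0; }

void traverse(Node* n);            // forward declaration

void processResult(Node* n) {      // consume a finished query
    bool was = n->wasVisible;
    n->wasVisible = n->queryResult;
    if (!n->queryResult) return;                 // invisible: cull subtree
    if (n->children.empty()) {
        if (!was) rendered.push_back(n);         // newly visible leaf
    } else {
        for (Node* c : n->children) traverse(c); // pull-down
    }
}

void traverse(Node* n) {
    if (n->children.empty() && n->wasVisible) {
        issueQuery(n);                 // previously visible leaf:
        rendered.push_back(n);         // render without waiting
    } else if (!n->wasVisible) {
        issueQuery(n);                 // previously invisible node
    } else {
        for (Node* c : n->children)    // opened node: skip the query
            traverse(c);
    }
    // Interleave: consume query results that are already available.
    while (!queryQueue.empty() && resultAvailable(queryQueue.front())) {
        Node* done = queryQueue.front();
        queryQueue.pop_front();
        processResult(done);
    }
}

void finishFrame() {                   // at frame end, wait for the rest
    while (!queryQueue.empty()) {
        Node* done = queryQueue.front();
        queryQueue.pop_front();
        processResult(done);
    }
}
```

Note how a previously visible leaf is rendered immediately while its query is still in flight; only at the end of the frame does the algorithm block on the remaining results.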
---|
In order to propagate changes in visibility upwards in the hierarchy,
the visibility classification is \emph{pulled up} according to the
following rule: an interior node is invisible only if all its children
have been classified invisible. Otherwise, it remains visible and thus
opened. The pseudo-code of the complete algorithm is given in
Figure~\ref{fig:pseudocode}. An example of the behavior of the method
on a small kD-tree for two subsequent frames is depicted in
Figure~\ref{fig:cut}.

---|
\begin{figure*}[htb]
\centerline{
\includegraphics[width=0.73\textwidth,draft=\DRAFTFIGS]{figs/cut} }
\caption{(left) Visibility classification of a node of the kD-tree and
the termination nodes. (right) Visibility classification after the
application of the occlusion test and the new set of termination nodes.
Nodes on which occlusion queries were applied are depicted with a
solid outline. Note the pull-up and pull-down due to visibility changes.}
\label{fig:cut}
\end{figure*}


---|
The sets of opened nodes and termination nodes need not be maintained
explicitly. Instead, these sets can be easily identified by storing
with each node its visibility classification and the id of the last
frame in which it was visited. A node is an opened node
if it is a visible interior node that was visited in the last frame
(line 23 in the pseudo-code). Note that in the actual implementation of
the pull-up we can set all visited nodes to invisible by default and
then pull up any changes from invisible to visible (lines 25 and
12 in Figure~\ref{fig:pseudocode}). This modification eliminates
checking children for invisibility during the pull-up.

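The described pull-up trick can be sketched with a minimal node type of our own: every visited node starts the frame invisible, and a visible classification is propagated upwards until an already-visible ancestor is reached, so children never need to be checked for invisibility.

```cpp
#include <cassert>

struct Node {
    Node* parent = nullptr;
    bool visible = false;
    int lastVisited = -1;   // id of the last frame this node was visited
};

// Called when a node is first reached in frame `frameId`:
// mark it invisible by default.
void markVisited(Node* n, int frameId) {
    n->visible = false;
    n->lastVisited = frameId;
}

// Called when a query classifies `n` visible: pull the change up,
// stopping at the first ancestor that is already visible.
void pullUpVisible(Node* n) {
    while (n && !n->visible) {
        n->visible = true;
        n = n->parent;
    }
}
```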
---|
%% \begin{figure*}[htb]
%% \centerline{\includegraphics[width=0.8\textwidth,draft=\DRAFTFIGS]{figs/basic}}
%% \caption{Illustration of the hierarchy traversal. Initially the algorithm
%% starts at the root of the hierarchy (left). In the second frame the
%% opened nodes $\mathcal{O}_0$ are skipped and the occlusion queries
%% are first applied on the termination nodes $\mathcal{T}_0$. Visibility changes are propagated upwards in the hierarchy and a
%% new set of termination nodes $\mathcal{T}_1$ is established.}
%% \label{fig:basic}
%% \end{figure*}


\subsection{Reduction of the number of queries}

%Skipping the opened nodes and issuing occlusion queries for the
%termination nodes assists in exploiting two main characteristics of
%scene visibility:

%Identifying the set of termination nodes assists in finding a subdivision

---|
Our method reduces the number of visibility queries in two ways:
first, like other hierarchical culling methods, we consider only a
subtree of the whole hierarchy (opened nodes + termination
nodes). Second, by avoiding queries on opened nodes we eliminate
part of the overhead of identifying this subtree. These
reductions reflect the following coherence properties of scene
visibility:

---|
\begin{itemize}

\item {\em Spatial coherence.} The invisible termination nodes
approximate the occluded part of the scene with the smallest number of
nodes with respect to the given hierarchy, i.e., each invisible
termination node has a visible parent. This induces an adaptive
spatial subdivision that reflects the spatial coherence of visibility,
more precisely the coherence of occluded regions. The adaptive nature
of the subdivision allows us to minimize the number of subsequent
occlusion queries by applying the queries on the largest spatial
regions that are expected to remain occluded.

%, and visible termination
%nodes correspond to the smallest unoccluded regions (visible
%leafs)

\item {\em Temporal coherence.} If visibility remains constant, the set
of termination nodes needs no adaptation. If an occluded node becomes
visible, we recursively process its children (pull-down). If a visible
node becomes occluded, we propagate the change higher in the hierarchy
(pull-up). A pull-down reflects a spatial growing of visible
regions. Similarly, a pull-up reflects a spatial growing of occluded
regions.

%The first case corresponds to pull-down of the cut, the latter to pull-up.

\end{itemize}

---|
By avoiding queries on the opened nodes, we can save $1/k$ of the queries
for a hierarchy with branching factor $k$ (assuming visibility remains
constant). Thus for the kD-tree, up to half of the queries can be
saved. The actual savings in the total query time are even larger: the
higher we are in the hierarchy, the larger the boxes we would have to
check for occlusion, and consequently the higher the fill rate that
would have been required to rasterize the boxes. In particular,
assuming that the sum of the screen space projected areas of the nodes at
each level of the kD-tree is equal and that the opened nodes form a
complete binary subtree of depth $d$, the fill rate is reduced $(d+2)$
times.


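The $(d+2)$ factor follows from a short calculation of our own under the stated assumptions: let $A$ be the total projected screen-space area of the nodes at any single level of the tree (equal for all levels by assumption). Querying every node of the opened subtree (levels $0$ through $d$) plus the termination nodes (level $d+1$) rasterizes $d+2$ levels, whereas querying only the termination nodes rasterizes one:

```latex
\[
  \frac{\mbox{fill rate without skipping opened nodes}}
       {\mbox{fill rate of our method}}
  \;=\; \frac{\sum_{l=0}^{d+1} A}{A}
  \;=\; \frac{(d+2)\,A}{A}
  \;=\; d+2 .
\]
```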
---|
\subsection{Reduction of CPU stalls and GPU starvation}
\label{sec:latency}

The reduction of CPU stalls and GPU starvation is achieved by
interleaving occlusion queries with the rendering of visible geometry. The
immediate rendering of previously visible termination nodes and the
subsequent issuing of occlusion queries eliminate the need to wait
for a query result during the processing of the initial
depth layers containing previously visible nodes. In the optimal case,
new query results become available in the meantime and thus we completely
eliminate CPU stalls. In a static scenario, we achieve exactly the
same visibility classification as the hierarchical stop-and-wait
method.

If the visibility is changing, the situation can be different: if the
results of the queries arrive too late, it is possible that we
initiated an occlusion query on a previously occluded node $A$ that is
in fact occluded by another previously occluded node $B$ that became
visible. If $B$ is still in the query queue, we do not capture the
possible occlusion of $A$ by $B$, since the geometry associated with
$B$ has not yet been rendered. In Section~\ref{sec:results} we show
that the increase in the number of rendered objects compared to the
stop-and-wait method is usually very small.

---|
%% It is possible that we have already traversed all previously
%% visible nodes and we must stall the CPU by waiting for the result
%% of the oldest occlusion query. A technique completely eliminating
%% CPU stalls at the cost of an increase of the number of rendered
%% objects will be discussed in Section~\ref{sec:full}.

\subsection{Front-to-back scene traversal}

---|
539 | implemented using a depth first |
---|
540 | traversal~\cite{bittner01:jvca}. However, at a modest increase in |
---|
541 | computational cost we can also use a more general breadth-first |
---|
542 | traversal based on a priority queue. The priority of the node then |
---|
543 | corresponds to an inverse of the minimal distance of the viewpoint and |
---|
544 | the bounding box associated with the given node of the |
---|
545 | kD-tree~\cite{Klosowski:2001:ECV,Staneker:2004:OCO}. |
---|
546 | |
---|
547 | In the context of our culling algorithm, there are two main advantages |
---|
548 | of the breadth-first front-to-back traversal : |
---|
549 | |
---|
550 | \begin{itemize} |
---|
551 | \item {\em Better query scheduling.} By spreading the traversal of the |
---|
552 | scene in a breadth-first manner, we process the scene in depth |
---|
553 | layers. Within each layer, the node processing order is practically |
---|
554 | independent, which minimizes the problem of occlusion query |
---|
555 | dependence. The breadth-first traversal interleaves |
---|
556 | occlusion-independent nodes, which can provide a more accurate |
---|
557 | visibility classification if visibility changes quickly. In |
---|
558 | particular, it reduces the problem of false classifications due to |
---|
559 | missed occlusion by nodes waiting in the query queue (discussed in |
---|
560 | Section~\ref{sec:latency}). |
---|
561 | |
---|
562 | \item {\em Using other spatial data structures.} By using a |
---|
563 | breadth-first traversal, we are no longer restricted to the |
---|
564 | kD-tree. Instead we can use an arbitrary spatial data structure |
---|
565 | such as a bounding volume hierarchy, octree, grid, hierarchical grid, |
---|
566 | etc. Once we compute a distance from a node to the viewpoint, the |
---|
567 | node processing order is established by the priority queue. |
---|
568 | \end{itemize} |
---|
569 | |
---|
570 | When using the priority queue, our culling algorithm can also be |
---|
571 | applied directly to the scene graph hierarchy, thus avoiding the |
---|
572 | construction of any auxiliary data structure for spatial |
---|
573 | partitioning. This is especially important for dynamic scenes, in |
---|
574 | which maintenance of a spatial classification of moving objects can be |
---|
575 | costly. |
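A minimal sketch of such a priority-queue traversal follows. The dict-based node representation with \texttt{min}/\texttt{max} box corners is an assumption for illustration; note that a min-heap keyed by the minimal distance is equivalent to the max-heap on its inverse described above.

```python
import heapq
import math

def min_dist(viewpoint, box_min, box_max):
    """Minimal distance from the viewpoint to an axis-aligned bounding box
    (zero if the viewpoint lies inside the box)."""
    d2 = 0.0
    for v, lo, hi in zip(viewpoint, box_min, box_max):
        if v < lo:
            d2 += (lo - v) ** 2      # viewpoint below the slab on this axis
        elif v > hi:
            d2 += (v - hi) ** 2      # viewpoint above the slab on this axis
    return math.sqrt(d2)

def front_to_back(nodes, viewpoint):
    """Yield nodes in front-to-back order using a distance-keyed min-heap."""
    heap = [(min_dist(viewpoint, n["min"], n["max"]), i)
            for i, n in enumerate(nodes)]
    heapq.heapify(heap)
    while heap:
        _, i = heapq.heappop(heap)
        yield nodes[i]

nodes = [
    {"name": "far",    "min": (3, 0, 0),    "max": (4, 1, 1)},
    {"name": "inside", "min": (-1, -1, -1), "max": (1, 1, 1)},
    {"name": "near",   "min": (1.5, 0, 0),  "max": (2, 1, 1)},
]
order = [n["name"] for n in front_to_back(nodes, (0.0, 0.0, 0.0))]
```

Because only the per-node distance computation is specific to the data structure, the same loop works unchanged for bounding volume hierarchies, octrees, or a scene graph.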

\subsection{Checking the query result}

The presented algorithm repeatedly checks whether the result of an
occlusion query is available before fetching any node from the
traversal stack (line 6 in Figure~\ref{fig:pseudocode}). Our practical
experiments have shown that the cost of this check is negligible, and
thus it can be used frequently without any performance penalty. If the
cost of this check were significantly higher, we could delay asking
for the query result by a time established by empirical measurements
for the particular hardware. This delay should also reflect the size
of the queried node, to match the expected availability of the query
result as precisely as possible.


\section{Further Optimizations}

\label{sec:optimizations}

This section discusses several optimizations of our method that can
further improve the overall rendering performance. In contrast to the
basic algorithm from the previous section, these optimizations rely on
user-specified parameters that should be tuned for a particular scene
and hardware configuration.


\subsection{Conservative visibility testing}

The first optimization reduces the number of visibility tests at the
cost of a possible increase in the number of rendered objects. It is
based on the idea of skipping some occlusion tests of visible nodes:
we assume that whenever a node becomes visible, it remains visible for
a number of frames. Within the given number of frames we avoid issuing
occlusion queries and simply assume the node remains
visible~\cite{bittner01:jvca}.

This technique can significantly reduce the number of visibility tests
applied to visible nodes of the hierarchy. Especially in sparsely
occluded scenes, a large number of visible nodes is tested without any
benefit, since most of them remain visible. On the other hand, we do
not immediately capture all changes from visibility to invisibility,
and thus we may render objects that have become invisible since the
moment the last occlusion test was issued.

In the simplest case, the number of frames a node is assumed visible
can be a predefined constant. In a more elaborate scenario this number
should be influenced by the history of the success of occlusion
queries and/or the current speed of camera movement.
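The skipping rule reduces to a small predicate; the \texttt{NodeState} fields and the \texttt{assume\_visible} constant below are illustrative assumptions, not fields of the paper's implementation.

```python
class NodeState:
    """Per-node visibility bookkeeping (assumed fields, for illustration)."""
    def __init__(self):
        self.visible = False        # classification from the last query
        self.last_queried = -1      # frame in which the last query was issued

def needs_query(node, frame_id, assume_visible=2):
    """Query invisible nodes every frame; re-test visible nodes only once
    their last result is at least `assume_visible` frames old."""
    if not node.visible:
        return True
    return frame_id - node.last_queried >= assume_visible

n = NodeState()
n.visible, n.last_queried = True, 10
skip = needs_query(n, 11)   # queried last frame: assume it is still visible
retest = needs_query(n, 12)  # result too old: issue a new query
```

A frame-dependent or history-dependent value of \texttt{assume\_visible} would implement the more elaborate variant mentioned above.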


\subsection{Approximate visibility}

The algorithm as presented computes a conservative visibility
classification with respect to the resolution of the z-buffer. We can
easily modify the algorithm to cull nodes more aggressively in cases
when only a small part of a node is visible: we compare the number of
visible pixels returned by the occlusion query with a user-specified
constant and cull the node if this number drops below the constant.
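In code the modification is a single comparison on the query result; the \texttt{threshold} parameter is the user-specified constant, and the values used here are illustrative.

```python
def is_visible(visible_pixels, threshold=0):
    """threshold = 0 keeps the conservative classification (exact with
    respect to the z-buffer resolution); a positive threshold culls nodes
    that contribute only a few pixels to the image."""
    return visible_pixels > threshold

conservative = is_visible(10)                # 10 visible pixels: keep the node
approximate = is_visible(10, threshold=25)   # culled under a 25-pixel threshold
```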


\subsection{Complete elimination of CPU stalls}

\label{sec:full}

The basic algorithm eliminates CPU stalls unless the traversal stack
is empty. If there is no node to traverse in the traversal stack and
the result of the oldest query in the query queue is still not
available, the algorithm stalls the CPU by waiting for the query
result. To eliminate CPU stalls completely, we can speculatively
render some nodes with undecided visibility. In particular, we select
a node from the query queue and render the geometry associated with
the node (or the whole subtree if it is an interior node). The node is
marked as rendered, but the associated occlusion query is kept in the
queue so that its result can be fetched later. If we are unlucky and
the node remains invisible, the effort of rendering the node's
geometry is wasted. On the other hand, if the node has become visible,
we have used the time slot before the next query result arrives in an
optimal manner.

To avoid spending more time on rendering invisible nodes than would be
spent waiting for the query result, we select the node with the lowest
estimated rendering cost and compare this cost with a user-specified
constant. If the cost is larger than the constant, we conclude that it
is too risky to render the node and wait until the result of the query
becomes available.
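The selection rule can be sketched as follows; \texttt{cost} is an estimated rendering cost function and \texttt{max\_cost} the user-specified constant, both assumptions for illustration.

```python
def pick_speculative(query_queue, cost, max_cost):
    """If the traversal stack is empty and no query result is ready yet,
    return the pending node that is cheapest to render, but only when its
    estimated cost does not exceed `max_cost`; otherwise return None,
    meaning it is safer to wait for the query result."""
    candidate = min(query_queue, key=cost, default=None)
    if candidate is not None and cost(candidate) <= max_cost:
        return candidate
    return None

# Hypothetical pending nodes with estimated rendering costs (e.g. triangle counts).
pending = ["large_subtree", "single_lamp"]
est = {"large_subtree": 120.0, "single_lamp": 8.0}.__getitem__
choice = pick_speculative(pending, est, max_cost=50.0)    # cheap enough: render it
too_risky = pick_speculative(pending, est, max_cost=5.0)  # too expensive: wait
```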

\section{Results}
\label{sec:results}

We have incorporated our method into an OpenGL-based scene graph
library and tested it on three scenes of different types. All tests
were conducted on a PC with a 3.2\,GHz Pentium~4, 1\,GB of memory, and
a GeForce FX~5950 graphics card.

\subsection{Test scenes}

The three test scenes comprise a synthetic arrangement of 5000
randomly positioned teapots (11.6M polygons); an urban environment (1M
polygons); and the UNC power plant model (13M polygons). The test
scenes are depicted in Figure~\ref{fig:scenes}. All scenes were
partitioned using a kD-tree constructed according to the surface area
heuristic~\cite{MacDonald90}.


Although the teapot scene would intuitively offer good occlusion, it
is a complicated case for occlusion culling. Firstly, the teapots
consist of small triangles, so only the effect of fused occlusion due
to a large number of visible triangles can bring a culling
benefit. Secondly, there are many thin holes through which it is
possible to see quite far into the arrangement of teapots. Thirdly,
the arrangement is long and thin, so we can see almost half of the
teapots along its longer side.

%Although the teapot scene would intuitively seem to offer good
%occlusion, it is one of the more complicated cases for occlusion
%culling. Due to the high number of very dense objects, a very fine
%kD-tree subdivision was necessary, which in turn leads to higher costs
%for testing many nodes in the hierarchy. Additionally, misclassifying
%a single teapot can lead to a significant increase in frame time due
%to the high polygon density (2,304 polygons per teapot). There are also
%many locations where it is possible to see very far into the block of
%teapots. This can be seen in the kD-tree node classification which is
%shown in the accompanying video.

The complete power plant model is quite challenging even to load into
memory, but on the other hand it offers good occlusion. This scene is
an interesting test candidate not only due to its size, but also due
to the significant changes in visibility and depth complexity across
its different parts.

%as can also be seen in the
%walkthrough. A relatively coarse subdivision of the model triangles
%into about 4,600 nodes was sufficient to capture most occlusion
%interdependencies.

The city scene is a classical target for occlusion culling
algorithms. Due to the urban structure consisting of buildings and
streets, most of the model is occluded when viewed from the
streets. Note that the scene does not contain any detailed geometry
inside the buildings. See Figure~\ref{fig:city_vis} for a
visualization of the visibility classification of the kD-tree nodes
for the city scene.



%% However, we still included a section in the walkthrough where the
%% viewpoint is elevated above the building roofs in order to show that
%% the algorithm continues to work without modification, albeit at some
%% expense. It will be an interesting topic of research to automatically
%% recognize such situations and decrease the frequency of visibility
%% queries or turn the occlusion culling algorithm off completely.



\begin{figure}
\centering \includegraphics[width=0.32\textwidth,draft=\DRAFTIMAGES]{images/city_vis}
\caption{Visibility classification of the kD-tree nodes in the city scene.
  The orange nodes were found visible; all other depicted nodes are invisible.
  Note the increasing size of the occluded nodes with increasing distance from the visible set.}
\label{fig:city_vis}
\end{figure}


\subsection{Basic tests}


We have measured frame times for rendering with view-frustum culling
only (VFC), the hierarchical stop-and-wait method (S\&W), and our new
coherent hierarchical culling method (CHC). Additionally, we have
evaluated the time for an ``ideal'' algorithm. The ideal algorithm
renders the visible objects found by the S\&W algorithm without
performing any visibility tests. This is an optimal solution with
respect to the given hierarchy, i.e., no occlusion culling algorithm
operating on the same hierarchy can be faster. For the basic tests we
did not apply any of the optimizations discussed in
Section~\ref{sec:optimizations}, which require user-specified
parameters.

%% In order to obtain repeatable and comparable timings, the process
%% priority of our application was set to high, timings were obtained
%% using the CPU cycle counter instruction, and the OpenGL {\tt Finish()}
%% instruction was executed before each invocation of the timer at the
%% start and the end of a frame. Furthermore, we did not include clear
%% and buffer swapping times in our results, since they depend strongly
%% on the windowing mode used.

For each test scene, we have constructed a walkthrough which is shown
in full in the accompanying
video. Figures~\ref{fig:teapot_walkthrough},~\ref{fig:city_walkthrough},
and~\ref{fig:plant_walkthrough} depict the frame times measured for
the walkthroughs. Note that Figure~\ref{fig:plant_walkthrough} uses a
logarithmic scale to capture the high variation in frame times during
the power plant walkthrough.
\begin{figure}
\centering
\includegraphics[width=0.5\textwidth,draft=\DRAFTFIGS]{images/teapotgraph1}
\caption{Frame times for the teapot scene.}\label{fig:teapot_walkthrough}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=0.5\textwidth,draft=\DRAFTFIGS]{images/citygraph1}
\caption{Frame times for the city walkthrough. Note
  the spike around frame 1600, where the viewpoint was elevated above the roofs,
  practically eliminating any occlusion.}\label{fig:city_walkthrough}
\end{figure}
\begin{figure}
\centering
\includegraphics[width=0.5\textwidth,draft=\DRAFTFIGS]{images/plantgraph1log}
\caption{Frame times for the power plant walkthrough.
  The plot shows the weakness of the S\&W method: when there is not much occlusion, it becomes slower than VFC (near frame 2200).
  The CHC method can keep up even in these situations, and at the same time it exploits occlusion when it appears
  (e.g., near frame 3700).}
\label{fig:plant_walkthrough}
\end{figure}
To better demonstrate the behavior of our algorithm, all walkthroughs
contain sections with both restricted and unrestricted visibility. For
the teapots, we viewed the arrangement along its longer side (frames
25--90). In the city we elevated the viewpoint above the roofs and
gained sight over most of the city (frames 1200--1800). The power
plant walkthrough contains several viewpoints from which a large part
of the model is visible (spikes in Figure~\ref{fig:plant_walkthrough}
where all algorithms are slow), viewpoints along the border of the
model directed outwards with low depth complexity (holes in
Figure~\ref{fig:plant_walkthrough} where all algorithms are fast), and
viewpoints inside the power plant with high depth complexity, where
occlusion culling produces a significant speedup over VFC (e.g., frame
3800).

As we can see for a number of frames in the walkthroughs, the CHC
method can produce a speedup of more than one order of magnitude
compared to VFC. The maximum speedup for the teapot, city, and power
plant walkthroughs is 8, 20, and 70, respectively. We can also observe
that CHC maintains a significant gain over S\&W, and in many cases it
almost matches the performance of the ideal algorithm. In complicated
scenarios the S\&W method caused a significant slowdown compared to
VFC (e.g., frames 1200--1800 of
Figure~\ref{fig:city_walkthrough}). Even in these cases, the CHC
method maintained a good speedup over VFC except for a small number of
frames.

\begin{table*}
\centering \footnotesize
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline\hline
scene & method & \#queries & wait time [ms] & rendered triangles & frame time [ms] & speedup \\\hline\hline

Teapots & VFC & --- & --- & 11,139,928 & 310.42 & 1.0 \\\hhline{~|-|-|-|-|-|-|}
11,520,000 triangles & S\&W & 4704 & 83.19 & 2,617,801 & 154.95 & 2.3 \\\hhline{~|-|-|-|-|-|-|}
21,639 kD-tree nodes & {\bf CHC} & {\bf 2827} & {\bf 1.31} & {\bf 2,852,514} & {\bf 81.18} & {\bf 4.6} \\\hhline{~|-|-|-|-|-|-|}
 & Ideal & --- & --- & 2,617,801 & 72.19 & 5.2 \\
\hline\hline
City & VFC & --- & --- & 156,521 & 19.79 & 1.0 \\\hhline{~|-|-|-|-|-|-|}
1,036,146 triangles & S\&W & 663 & 9.49 & 30,594 & 19.9 & 1.5 \\\hhline{~|-|-|-|-|-|-|}
33,195 kD-tree nodes & {\bf CHC} & {\bf 345} & {\bf 0.18} & {\bf 31,034} & {\bf 8.47} & {\bf 4.0} \\\hhline{~|-|-|-|-|-|-|}
 & Ideal & --- & --- & 30,594 & 4.55 & 6.6 \\
\hline\hline
Power Plant & VFC & --- & --- & 1,556,300 & 138.76 & 1.0 \\\hhline{~|-|-|-|-|-|-|}
12,748,510 triangles & S\&W & 485 & 16.16 & 392,962 & 52.29 & 3.2 \\\hhline{~|-|-|-|-|-|-|}
18,719 kD-tree nodes & {\bf CHC} & {\bf 263} & {\bf 0.70} & {\bf 397,920} & {\bf 38.73} & {\bf 4.7} \\\hhline{~|-|-|-|-|-|-|}
 & Ideal & --- & --- & 392,962 & 36.34 & 5.8 \\\hline\hline
\end{tabular}
\caption{Statistics for the three test scenes. VFC is rendering with view-frustum culling only, S\&W is the
  hierarchical stop-and-wait method, CHC is our new method, and Ideal
  is a perfect method with respect to the given hierarchy. All values are averages over
  all frames (including the speedup).}
\label{tab:averages}
\end{table*}
Next, we summarize the scene statistics and the average values per
frame in Table~\ref{tab:averages}. The table shows the number of
issued occlusion queries, the wait time representing the CPU stalls,
the number of rendered triangles, the total frame time, and the
speedup over VFC.


We can see that the CHC method practically eliminates the CPU stalls
(wait time) compared to the S\&W method. This is paid for by a slight
increase in the number of rendered triangles. For the three
walkthroughs, the CHC method produces average speedups of 4.6, 4.0,
and 4.7 over view-frustum culling and average speedups of 2.0, 2.6,
and 1.6 over the S\&W method. CHC is only 1.1, 1.7, and 1.2 times
slower than the ideal occlusion culling algorithm. Concerning
accuracy, the increase in the average number of rendered triangles for
the CHC method compared to S\&W was 9\%, 1.4\%, and 1.3\%. For the
tested walkthroughs, this increase was always outweighed by the
reduction of CPU stalls.

%% While this number may be very low, the algorithm also incurs a
%% non-negligible overhead which is caused by the graphics card
%% reconfiguration required for the queries and rasterization of the
%% bounding volume geometries.




\subsection{Optimizations}

First of all, we observed that the technique for the complete
elimination of CPU stalls discussed in Section~\ref{sec:full} has a
very limited scope: for all our tests, the stalls were already almost
completely eliminated by the basic algorithm (see the wait times in
Table~\ref{tab:averages}), and we did not find constants that would
produce an additional speedup with this technique.

The measurements for the other optimizations discussed in
Section~\ref{sec:optimizations} are summarized in
Table~\ref{tab:assumed}. We have measured the average number of issued
queries and the average frame time in dependence on the number of
frames a node is assumed visible and on the pixel threshold of
approximate visibility. We observed that the effectiveness of the
optimizations depends strongly on the scene. If the hierarchy is deep
and the geometry associated with a leaf node is not too complex,
conservative visibility testing produces a significant speedup (city
and power plant). For the teapot scene, the penalty for falsely
rendering actually occluded objects became larger than the savings
achieved by the reduction of the number of queries. On the other hand,
since the teapot scene contains complex visible geometry, the
approximate visibility optimization produced a significant
speedup. This is, however, paid for by errors in the image
proportional to the pixel threshold used.

%Good parameters for these
%optimizations have to be estimated on a scene-by-scene basis.




%% , where we compare the reference frame time and
%% number of queries already shown in the previous table to the
%% approximate visibility optimization using a higher pixel threshold,
%% and to the conservative visibility testing optimization assuming a
%% node remains visible for 2 frames, both for the teapot and the city
%% scene. These tests clearly show that a high visibility threshold
%% improves frame times, albeit at the expense of image quality. The
%% benefit of the conservative visibility optimization, is however
%% limited. While the number of queries can be reduced significantly, the
%% number of misclassified objects rises and increases the frame time
%% again.



%% \begin{table}
%% \centering \footnotesize
%% \begin{tabular}{|l|c|c|}
%% \hline\hline
%% & Teapots & City \\\hline\hline
%% frame time CHC & 130.42 & 13.91\\\hline
%% \#queries & 2267 & 94.04\\\hline\hline
%% \multicolumn{3}{|l|}{assume 2 frames visible}\\\hline
%% frame time & 123.84 & 13.27 \\\hline
%% \#queries & 1342 & 42 \\\hline\hline
%% \multicolumn{3}{|l|}{pixel threshold 50 pixels and assume 2 frames visible}\\\hline
%% frame time & 73.22 & 11.47 \\\hline
%% \#queries & 1132 & 50 \\\hline\hline
%% \end{tabular}
%% \caption{Influence of optimizations on two test scenes. All values are averages
%% over all frames, frame times are in [ms].}\label{tab:assumed}
%% \end{table}

\begin{table}
\centering \footnotesize
\begin{tabular}{|l|c|c|c|c|}
\hline\hline
scene & $t_{av}$ & $n_{vp}$ & \#queries & frame time [ms] \\ \hline\hline
\multirow{3}{*}{Teapots} & 0 & 0 & 2827 & 81.18 \\ \hhline{~----}
 & 2 & 0 & 1769 & 86.31 \\ \hhline{~----}
 & 2 & 25 & 1468 & 55.90 \\ \hline\hline
\multirow{3}{*}{City} & 0 & 0 & 345 & 8.47 \\ \hhline{~----}
 & 2 & 0 & 192 & 6.70 \\ \hhline{~----}
 & 2 & 25 & 181 & 6.11 \\ \hline\hline
\multirow{3}{*}{Power Plant} & 0 & 0 & 263 & 38.73 \\ \hhline{~----}
 & 2 & 0 & 126 & 31.17 \\ \hhline{~----}
 & 2 & 25 & 120 & 36.62 \\ \hline\hline
\end{tabular}

\caption{Influence of the optimizations on the CHC method.
  $t_{av}$ is the number of frames a node is assumed visible for conservative
  visibility testing, and $n_{vp}$ is the pixel threshold for
  approximate visibility.}\label{tab:assumed}
\end{table}


\subsection{Comparison to PVS-based rendering}

We also compared the CHC method against precomputed visibility. In
particular, we used the PVS computed by an offline visibility
algorithm~\cite{wonka00}. While the walkthrough using the PVS was
1.26\,ms faster per frame on average, our method does not require
costly precomputation and can be used at any 3D position in the
model, not only within a predefined view space.


\section{Conclusion}
\label{sec:conclusion}

We have presented a method for the optimized scheduling of hardware
accelerated occlusion queries. The method schedules occlusion queries
so as to minimize both the number of queries and their latency. This
is achieved by exploiting the spatial and temporal coherence of
visibility. Our results show that CPU stalls and GPU starvation are
almost completely eliminated, at the cost of a slight increase in the
number of rendered objects.

%Additionally it schedules the queries in an order
%minimizing the overhead due to the latency of availability of query
%results.

Our technique can be used with practically arbitrary scene
partitioning data structures such as kD-trees, bounding volume
hierarchies, or hierarchical grids. The implementation of the method
is straightforward, as it uses a simple OpenGL interface to the
hardware occlusion queries. In particular, the method requires no
complicated geometric operations or data structures. The algorithm is
suitable for scenes of arbitrary structure and requires no
preprocessing or scene-dependent tuning.

We have experimentally verified that the method is well suited to the
\nvoq{} supported on current consumer-grade graphics hardware. We
obtained an average speedup of 4.0--4.7 compared to pure view-frustum
culling and of 1.6--2.6 compared to the hierarchical stop-and-wait
application of occlusion queries.

The major potential for improving the method lies in a better
estimation of changes in the visibility classification of hierarchy
nodes. If nodes tend to be mostly visible, we could automatically
decrease the frequency of occlusion tests and thus better adapt the
method to the actual occlusion in the scene. Another possibility for
improvement is better tuning for particular graphics hardware by means
of a more accurate rendering cost estimation: skipping occlusion tests
for simple geometry can be faster than issuing comparably expensive
occlusion queries.


\begin{figure*}[htb]
\centerline{
\hfill
\includegraphics[height=0.2\textwidth,draft=\DRAFTFIGS]{images/teapots}
\hfill
\includegraphics[height=0.2\textwidth,draft=\DRAFTFIGS]{images/city}
\hfill
\includegraphics[height=0.2\textwidth,draft=\DRAFTFIGS]{images/pplant}
\hfill
}
\caption{The test scenes: the teapots, the city, and the power plant.}
\label{fig:scenes}
\end{figure*}

%% \begin{figure}[htb]
%% \centerline{
%% \includegraphics[width=0.2\textwidth,draft=\DRAFTFIGS]{images/city}
%% }
%% \caption{The city scene.}
%% \label{fig:images1}
%% \end{figure}

%% \begin{figure}[htb]
%% \centerline{
%% \includegraphics[width=0.25\textwidth,draft=\DRAFTFIGS]{images/teapots}
%% }
%% \caption{The teapot scene.}
%% \label{fig:images2}
%% \end{figure}

%% \begin{figure}[htb]
%% \centerline{
%% \includegraphics[width=0.2\textwidth,draft=\DRAFTFIGS]{images/pplant}
%% }
%% \caption{The power plant scene.}
%% \label{fig:images3}
%% \end{figure}