Efficient Sparse Voxel Octrees 1.4
----------------------------------
This package contains a proof-of-concept implementation of the voxel
rendering system presented in "Efficient Sparse Voxel Octrees" at I3D 2010.
It utilizes the massively parallel computing power available in GPUs to
implement real-time ray tracing of voxel data represented using a compact
octree data structure.
Samuli Laine, Tero Karras
Copyright 2009-2011 NVIDIA Corporation
The source code is licensed under the New BSD License (see LICENSE) and is
hosted on Google Code:
http://code.google.com/p/efficient-sparse-voxel-octrees/
Original I3D paper:
http://www.nvidia.com/docs/IO/88889/laine2010i3d_paper.pdf
Technical report explaining most aspects of the system in detail:
http://www.nvidia.com/docs/IO/88972/nvr-2010-001.pdf
System requirements
-------------------
- Microsoft Windows XP, Vista, or 7. A 64-bit version is recommended to avoid
running out of virtual address space when operating on large scenes.
- At least 2 gigabytes of system memory.
- NVIDIA CUDA-compatible GPU with compute capability 1.1 and at least 256
megabytes of DRAM. Quadro FX 5800 is recommended.
- Microsoft Visual Studio 2010. Required even if you do not plan to build
the source code, as the CUDA Toolkit depends on it.
Instructions
------------
1. Install Visual Studio 2010. The Express edition can be downloaded from:
http://www.microsoft.com/visualstudio/en-us/products/2010-editions/visual-cpp-express
2. Install the latest NVIDIA GPU drivers and CUDA Toolkit.
http://developer.nvidia.com/object/cuda_archive.html
3. Run octree.exe to start the application in interactive mode. The first run
will execute a number of initialization tasks, including compilation of 24
variants of the CUDA code.
4. If you get an error during the initialization, the most probable
explanation is that the initialization is unable to launch nvcc.exe
contained in the CUDA Toolkit. In this case, you should:
- Set CUDA_BIN_PATH to point to the CUDA Toolkit "bin" directory, e.g.
"set CUDA_BIN_PATH=C:\Program Files (x86)\NVIDIA GPU Computing Toolkit\CUDA\v4.2\bin".
- Set CUDA_INC_PATH to point to the CUDA Toolkit "include" directory, e.g.
"set CUDA_INC_PATH=C:\Program Files (x86)\NVIDIA GPU Computing Toolkit\CUDA\v4.2\include".
- Run vcvars32.bat to set up Visual Studio paths, e.g.
"C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\vcvars32.bat".
5. Run "octree --help" to see a list of command-line options.
6. Optional: If you are running on Windows Vista or 7, the next step may
exceed the default GPU driver recovery timeout of 2 seconds. To avoid
difficulties, you should increase the timeout to at least 10 seconds.
- Launch regedit.exe.
- Go to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers
- If you do not see "TdrDelay", create a new DWORD value for it.
- Set the value of "TdrDelay" to 10.
- Restart your computer.
7. Optional: The package contains a pre-built octree for the default scene
with a relatively low resolution. As high-resolution octrees consume a
considerable amount of disk space, they must be built separately. The
process takes a long time to complete, so you may want to let it run
overnight.
The necessary mesh files are distributed in a separate package.
If you have a GPU with 2 gigabytes of DRAM or more, run
build_and_benchmark_2gb.cmd. The script will create the following files:
- octrees\conference_15.oct 1.06 GB
- octrees\conference_15_ao.oct 1.30 GB
- octrees\default_13.oct 888 MB
- octrees\default_13_ao.oct 1.01 GB
- octrees\hairball_11.oct 1.41 GB
- octrees\hairball_11_ao.oct 1.60 GB
If you have 1 gigabyte of DRAM, you can use build_and_benchmark_1gb.cmd:
- octrees\conference_14.oct 426 MB
- octrees\conference_14_ao.oct 516 MB
- octrees\default_12.oct 333 MB
- octrees\default_12_ao.oct 377 MB
- octrees\hairball_10.oct 474 MB
- octrees\hairball_10_ao.oct 523 MB
If you have less than 1 gigabyte, you will have to tweak the build
parameters manually.
The files with the "_ao" suffix contain ambient occlusion information
generated by an additional pre-processing step.
8. Optional: Build the application yourself.
- Open octree.sln in Visual Studio 2010.
- Right-click the "octree" project and select "Set as StartUp Project".
- Select the Release build. The Debug build is very slow, especially when
constructing octrees.
- Build and run.
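
Steps 4, 6, and 7 above can be sketched as three small Python helpers. This is
only an illustration and is not part of the package. The function names are
hypothetical. The nvcc lookup checks CUDA_BIN_PATH and PATH only, which may
differ from what the application's own detection does. The .reg text is an
alternative to editing the value by hand in regedit.exe. Treating GPUs
between 1 and 2 gigabytes as "1 gigabyte" is an assumption.

```python
import os
from typing import Mapping, Optional

# Registry key named in step 6 of the instructions.
TDR_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def find_nvcc(env: Mapping[str, str]) -> Optional[str]:
    """Return the path of nvcc.exe found via CUDA_BIN_PATH or PATH, else None.

    Assumption: only these two variables are searched here.
    """
    candidates = []
    if env.get("CUDA_BIN_PATH"):
        candidates.append(env["CUDA_BIN_PATH"])
    candidates.extend(p for p in env.get("PATH", "").split(os.pathsep) if p)
    for directory in candidates:
        exe = os.path.join(directory.strip('"'), "nvcc.exe")
        if os.path.isfile(exe):
            return exe
    return None


def tdr_reg_file(seconds: int = 10) -> str:
    """Build the text of a .reg file that sets TdrDelay to `seconds`."""
    if seconds <= 0:
        raise ValueError("timeout must be positive")
    return (
        "Windows Registry Editor Version 5.00\n\n"
        f"[{TDR_KEY}]\n"
        f'"TdrDelay"=dword:{seconds:08x}\n'  # DWORD values are written in hex
    )


def pick_build_script(gpu_dram_mb: int) -> Optional[str]:
    """Choose the build script from step 7 for a given amount of GPU DRAM."""
    if gpu_dram_mb >= 2048:
        return "build_and_benchmark_2gb.cmd"
    if gpu_dram_mb >= 1024:  # assumption: 1-2 GB GPUs use the 1 GB script
        return "build_and_benchmark_1gb.cmd"
    return None  # less than 1 GB: build parameters must be tweaked manually


if __name__ == "__main__":
    print("nvcc:", find_nvcc(os.environ) or "not found")
    print(tdr_reg_file(10))
    print("build script for 1536 MB:", pick_build_script(1536))
```

If find_nvcc returns None, setting CUDA_BIN_PATH as described in step 4 is
the first thing to try. A computer restart is still required after changing
TdrDelay.
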
Building octrees
----------------
The application supports two ways of building octrees:
- Online build: Increase resolution progressively while viewing the result.
- Offline build: Create a full octree file from the command line.
Online build works as follows:
- Run the application in interactive mode.
- Click "Show octree management controls".
- Click "New octree from mesh..." to start the build.
- The "Maximum octree levels" slider can be adjusted at any time.
- To modify other build options, adjust the corresponding sliders and then
click "Rebuild octree".
- Online build operates on a temporary octree file. To finish the build and
save the result, click "Save octree..."
For details on how to use the offline build, run "octree --help".
The builder runs entirely on the CPU without CUDA acceleration, and as a
consequence its performance is relatively low. It does, however, utilize
multiple CPU cores by creating a number of threads that operate on different
parts of the octree in parallel. While increasing performance, this also
increases the memory footprint considerably. It may sometimes be necessary
to limit the number of threads through the "--max-threads" command-line
option in order to avoid running out of memory.
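
The trade-off between thread count and memory can be illustrated with a
rough Python sketch. It is not taken from the builder. The per-thread memory
figure is a hypothetical input that you would have to estimate yourself,
since this document gives no number for it.

```python
import os
from typing import Optional


def capped_thread_count(memory_budget_mb: int,
                        per_thread_mb: int,
                        cpu_cores: Optional[int] = None) -> int:
    """Pick a thread count that fits both the CPU and a memory budget.

    per_thread_mb is a hypothetical estimate of the extra memory each
    builder thread needs; it is not a value documented by the package.
    """
    if per_thread_mb <= 0:
        raise ValueError("per_thread_mb must be positive")
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    by_memory = memory_budget_mb // per_thread_mb
    # Always allow at least one thread, never more than the core count.
    return max(1, min(cores, by_memory))


if __name__ == "__main__":
    # Example: 8 cores, 8 GB budget, an assumed 2 GB per thread.
    print(capped_thread_count(8192, 2048, cpu_cores=8))
```

The result would be passed through the "--max-threads" option. Run
"octree --help" for the exact syntax.
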
The command-line interface includes a number of useful tools to operate on
octree files:
- Inspect: Print detailed statistics about the contents of an octree file.
- Ambient: Augment an existing octree file with ambient occlusion
information for a more realistic look.
- Optimize: Defragment an octree file and remove builder internal data.
Version history
---------------
Version 1.4, May 22, 2012
- Switch to New BSD License (previously Apache License 2.0).
- Upgrade to Visual Studio 2010 (previously 2008).
- Support PNG textures through lodepng.
- Fix a compiler error in the benchmark mode.
- Fix a CUDA compilation issue with Visual Studio Express.
- General bug fixes and improvements to the framework.
Version 1.3, Jul 08, 2011
- Fix compatibility issues with CUDA 4.0.
Version 1.2, Dec 17, 2010
- Fix issues with nvcc path autodetection with CUDA 3.2.
Version 1.1, Dec 01, 2010
- Upgrade project files to Visual Studio 2008 (previously 2005).
- Update the codebase to support GF104 and CUDA 3.2.
- Minor stability improvements.
Version 1.0, Feb 17, 2010
- Initial release.
Known issues
------------
- When using CUDA 3.2 or later, the performance of device-side code drops
slightly in 64-bit builds. This is because CUDA 3.2 disallows "mixed-bitness
mode", which we utilize on earlier CUDA versions to get maximum performance.
With CUDA 3.2, we must always compile device code with the same bitness
as host code, which generally results in higher register pressure in 64-bit
builds.
For more information, see:
http://developer.download.nvidia.com/compute/cuda/3_2_prod/toolkit/docs/CUDA_3.2_Readiness_Tech_Brief.pdf
- The support for mesh and image formats is very limited. In particular, only
Wavefront OBJ meshes and truecolor PNG/TGA/TIFF/BMP textures are supported.
If you have trouble importing a mesh, you may want to try enabling
WAVEFRONT_DEBUG in src/framework/io/MeshWavefrontIO.cpp.
- Asynchronous file I/O does not work very well on Windows Vista. This
usually causes random delays and lag when loading/building octrees in the
interactive mode.
- The application may be able to allocate significantly less GPU memory for
the octree than what is actually available, especially on Windows Vista.
To find out how much memory you are getting, look for the following line
in the output: "MemoryManager: Allocated XXX megabytes."
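
If you capture the application's output to a file, the allocation line
mentioned above can be picked out with a short Python sketch such as the
following. The function name is hypothetical. Whether "XXX" is printed as an
integer or with decimals is not stated here, so the pattern accepts both. The
sample text in the example is made up for illustration.

```python
import re
from typing import Optional

# Matches the line quoted in the known issues: the number may be an
# integer or a decimal (assumption, the exact format is not documented).
_ALLOC_RE = re.compile(r"MemoryManager: Allocated (\d+(?:\.\d+)?) megabytes\.")


def allocated_megabytes(output: str) -> Optional[float]:
    """Return the megabytes reported by MemoryManager, or None if absent."""
    match = _ALLOC_RE.search(output)
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    # Made-up sample output; only the MemoryManager line format is
    # taken from this document.
    sample = "starting up\nMemoryManager: Allocated 768 megabytes.\ndone\n"
    print(allocated_megabytes(sample))
```
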