Procedural Data
Introduction
Terragen makes extensive use of procedural data. "Procedural" means that the data is generated on an as-needed basis using mathematical formulas or algorithms. In simple terms, the data is generated by following a procedure.
The best known example of something being generated procedurally is probably a fractal. The shape of a fractal is generated by following a mathematical formula. Different types of fractals are generated using different formulas. If you have used a fractal viewer you'll know you can zoom in on the fractal almost infinitely. This is because the fractal is generated mathematically, and each time you zoom a new view of the fractal is calculated. The amount of detail is more or less unlimited.
A key concept of procedural data is "data amplification". The idea is that a little bit of data can be used to generate a lot of data. This is a big part of the reason TG project files are so small in size. Only small amounts of data need to be stored in the file to generate the masses of data required for a visually complex scene.
Another key concept is that the procedures used to generate data are "deterministic". This means that if you use the same starting values you get the same results. This is the reason you can load a saved project that uses procedural data and the scene looks the same as when you created it.
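The two ideas above, data amplification and determinism, can be illustrated with a small sketch. This is not Terragen's algorithm; it is a generic one-dimensional value-noise fractal, and the names `lattice_value`, `value_noise` and `fractal_height` are made up for this example. The only stored "data" is a seed and an octave count, yet a height can be computed for any position.

```python
import hashlib
import math


def lattice_value(ix: int, seed: int) -> float:
    """Deterministic pseudo-random value in [0, 1) for an integer lattice point."""
    digest = hashlib.sha256(f"{seed}:{ix}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def value_noise(x: float, seed: int) -> float:
    """Smoothly interpolate between the lattice values on either side of x."""
    ix = math.floor(x)
    t = x - ix
    t = t * t * (3.0 - 2.0 * t)  # smoothstep easing between lattice points
    a = lattice_value(ix, seed)
    b = lattice_value(ix + 1, seed)
    return a + (b - a) * t


def fractal_height(x: float, seed: int = 42, octaves: int = 8) -> float:
    """Sum octaves of noise: each octave doubles frequency and halves amplitude."""
    total = 0.0
    amplitude = 1.0
    frequency = 1.0
    for octave in range(octaves):
        total += amplitude * value_noise(x * frequency, seed + octave)
        amplitude *= 0.5
        frequency *= 2.0
    return total


# Determinism: the same inputs always give the same height.
assert fractal_height(3.7) == fractal_height(3.7)

# Amplification: sample as densely as you like; nothing is stored but the seed.
close_up = [fractal_height(3.7 + i * 0.001) for i in range(5)]
```

Raising `octaves` adds finer detail without storing anything extra, which is the sense in which a little data generates a lot of data.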
The "opposite" of procedural data is what you might call static data. Static data is precalculated, generated entirely before being used. Some examples of static data used by TG are polygonal objects (usually imported), heightfields and images used as masks and textures.
Case Study: Procedural vs Static - Fractal Terrain vs Heightfield
Let's look at the differences between a procedural terrain and a heightfield based terrain. An example of a procedural terrain could be a Power Fractal or an Alpine Fractal terrain. Both of these are based on fractals. You can create one of these terrains using the "Add Terrain" button above the Terrain Node List.
The default TG project is set up with a heightfield ready to be generated. As well as generating heightfields TG can also read them from various sorts of files.
The first big difference between a procedural terrain and a heightfield is the area they can cover. A procedural terrain is theoretically infinite. You could travel across it forever and never come to the edge. When you create a procedural terrain with TG it covers the entire planet. By contrast the default heightfield is square and 10km on each side. You can certainly create large heightfields, and in fact you can create heightfields that cover entire planets. The Mars MOLA data set is an example of that. However it does require a lot of data, which is something we'll look at later.
The second big difference between a procedural terrain and a heightfield is resolution, or detail. Once again, the amount of detail in a procedural terrain is theoretically infinite. There are some practical limits to the detail. However the fact remains that you can zoom in very close to a procedural terrain and still see fine detail. By contrast a heightfield typically has a much coarser fixed resolution. The height measurements are in a fixed grid. A common spacing for this grid is 10m. That essentially means that there is no further detail in the terrain when you are looking at points less than 10m apart. The procedural terrain might have detail down to the sub-millimetre level.
Procedural terrains can be stored much more efficiently. The shape of the terrain is based on a mathematical formula and a set of numbers which define some of the characteristics of the terrain. When you save a project or clip with a procedural terrain those values take up very little space in a file. In contrast a heightfield will take up much, much more space. A procedural terrain covering a whole planet can take less than a kilobyte. Heightfields covering an entire planet, such as the Mars MOLA dataset, can run into the gigabytes and are still only accurate to the resolution of the heightfields. In non-scientific terms the heightfields need several jillion times more data.
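As a rough worked example of the storage difference, the sketch below estimates the size of the default-sized heightfield described earlier (10km per side, 10m grid). The 4 bytes per sample, the 8 bytes per parameter and the parameter names are assumptions made for illustration; they do not describe Terragen's actual file formats.

```python
def heightfield_bytes(side_m: float, spacing_m: float, bytes_per_sample: int = 4) -> int:
    """Storage for a square heightfield with one height sample per grid point."""
    samples_per_side = round(side_m / spacing_m) + 1  # grid points, both edges included
    return samples_per_side * samples_per_side * bytes_per_sample


# Static data: 10 km per side on a 10 m grid, assuming 32-bit samples.
static_size = heightfield_bytes(10_000, 10)

# Procedural data: a hypothetical handful of numbers describing the terrain.
procedural_params = {
    "seed": 42,
    "octaves": 8,
    "feature_scale_m": 5000.0,
    "amplitude_m": 2000.0,
}
procedural_size = len(procedural_params) * 8  # assume 8 bytes per stored value

print(static_size)      # 4008004 bytes, roughly 4 MB
print(procedural_size)  # 32 bytes
```

Halving the grid spacing roughly quadruples the heightfield size, while the procedural description stays the same size no matter how closely the terrain is viewed.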
Another important difference is that heightfields can't have overhangs or caves. Overhangs in particular are often found in natural terrain, and may be highly desirable in unnatural terrains too. These sorts of features are no problem for procedural terrains in TG. Actually creating caves is not the easiest thing in TG, but this is more down to tools rather than limitations in capability.
Of course when you are interested in somewhere real, or even somewhere fictitious but with a particular shape, heightfields are the right thing for the job. It's very difficult to recreate a piece of a real landscape, Mt St Helens for example, using entirely procedural data. It's much easier to use a DEM of the real location. Similarly if you're after a specific landscape shape it may be easier to create it as a heightfield using a heightfield editor and/or an image editor.
More Examples of Procedurals in TG
Here are some other examples of procedural data in TG:
- Shaders: Many shaders are procedural. One example is the Simple Shape Shader. Although it creates basic shapes which could also be done with image maps, using procedural generation means you can scale and rotate the shapes without edges becoming blurry or aliased.
- Populators: Populators generate the positions of their instances procedurally. When a populator is populating it's actually using an algorithm to generate locations for each instance. Instead of storing the positions of millions of instances in project files, which could take a lot of disk space, the populator just stores a few numbers.
- Surface Layers: Surface layers use fractals and other sorts of procedural shaders to create surface effects. Image maps can be used effectively, but zoom in too far or get too close and things will start to look blurry. From far away an image map may be obviously tiled. Procedural data avoids these issues.
- Objects: Certain built-in objects, such as the Sphere, are procedural objects. These are generated from mathematical formulas as opposed to being made up of individual polygons. Some objects like the Grass Clump and Rock are rendered as polygonal objects but are generated using procedural algorithms. The polygons making up the objects aren't stored in files.
- Function node networks: Function node networks, and to an extent other parts of the node network, are evaluated at render time to generate their results. This is an example of procedural data generation you can create yourself within TG.
- Animation curves: One use of procedurals that might not be obvious is animation curves. Animation curves are interpolated between key frames. This means values at frames between key frames are calculated mathematically. An entire complex curve can be stored using just a few key frames.
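To make the populator item in the list above more concrete, here is a minimal sketch of generating instance positions from a few stored numbers. It assumes a simple uniform random scatter over a square area, which is not necessarily how TG's populators place instances; `populate` and its parameters are hypothetical names.

```python
import random


def populate(seed: int, count: int, area_size_m: float) -> list[tuple[float, float]]:
    """Generate instance positions inside a square area from a seed."""
    rng = random.Random(seed)  # private generator, so results depend only on the seed
    return [
        (rng.uniform(0.0, area_size_m), rng.uniform(0.0, area_size_m))
        for _ in range(count)
    ]


# Only these three numbers would need saving, not the 100,000 positions.
first = populate(seed=7, count=100_000, area_size_m=1000.0)
second = populate(seed=7, count=100_000, area_size_m=1000.0)
assert first == second  # regenerating gives exactly the same layout
```

Changing the seed gives a different but equally repeatable layout.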
Why Use Static Data?
If procedural data is so great then why do we use static data? In many situations static data is just what is needed. The discussion of heightfields above is one example of that. Another obvious example is polygonal 3D models. Of course in many cases an image is exactly what you need for texturing or masking. A combination of static and procedural data is often the best way to achieve the results you want.
It's possible to use procedural data to improve static data. A good example of this is adding extra detail to heightfields using fractals. TG does this by default in the Heightfield Shader node. The shader can add extra small scale fractal detail which helps to make up for the limited resolution of the underlying heightfield. Of course the extra detail is "fake" but in most cases this doesn't matter. Here's an example of a heightfield with and without added fractal detail:
Heightfield with (left) and without (right) added fractal detail
You can see the extra detail makes quite a difference. The addition of extra detail is not limited to small scale features. You can easily create large procedural features on heightfields as well.
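The idea of layering procedural detail over a heightfield can be sketched as follows. This is an illustration only, not the Heightfield Shader's implementation: the heightfield is sampled with bilinear interpolation, and a sum of sine octaves stands in for real fractal detail. All function names and the `detail_amplitude_m` parameter are assumptions.

```python
import math


def sample_heightfield(grid: list[list[float]], spacing_m: float, x: float, y: float) -> float:
    """Bilinearly interpolate a heightfield stored as grid[row][col].

    Assumes x and y are non-negative and lie inside the area the grid covers.
    """
    gx, gy = x / spacing_m, y / spacing_m
    col = min(int(gx), len(grid[0]) - 2)
    row = min(int(gy), len(grid) - 2)
    tx, ty = gx - col, gy - row
    top = grid[row][col] * (1 - tx) + grid[row][col + 1] * tx
    bottom = grid[row + 1][col] * (1 - tx) + grid[row + 1][col + 1] * tx
    return top * (1 - ty) + bottom * ty


def procedural_detail(x: float, y: float, octaves: int = 4) -> float:
    """Stand-in for fractal detail: sine octaves, each finer and weaker than the last."""
    total, amplitude, frequency = 0.0, 0.5, 1.0
    for _ in range(octaves):
        total += amplitude * math.sin(x * frequency) * math.cos(y * frequency)
        amplitude *= 0.5
        frequency *= 2.0
    return total


def detailed_height(grid, spacing_m, x, y, detail_amplitude_m=0.5):
    """Static base shape plus small-scale procedural detail."""
    base = sample_heightfield(grid, spacing_m, x, y)
    return base + detail_amplitude_m * procedural_detail(x, y)


# A tiny 2x2 heightfield with 10 m spacing: a ramp rising from 0 m to 10 m.
grid = [[0.0, 10.0], [0.0, 10.0]]
print(sample_heightfield(grid, 10.0, 5.0, 5.0))  # smooth interpolated base
print(detailed_height(grid, 10.0, 5.0, 5.0))     # base plus sub-grid variation
```

The heightfield still controls the overall shape, while the procedural term supplies variation at scales smaller than the grid spacing.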
One of the strengths of procedural data can also be a weakness. Generating procedural data can be computationally intensive. In some cases it may be more efficient to use static data rather than procedural data. An example of this might be a mask made up of many procedural elements. If it was slow to compute and resolution wasn't much of an issue it could be better to paint the mask in an image editor and use the static image as the mask. Similarly a procedural terrain could be slow to calculate. There might be an advantage in saving the terrain as a heightfield, although of course you would need to take into consideration the limitations of heightfields, such as the lack of overhangs. In these kinds of situations you are making a trade-off between computation time and data size. An image might work as a mask and be faster to compute but it could take a lot of space on disk and/or in memory while rendering.
Another potential weakness of procedural data is the fact it's based on algorithms. This can be tricky when it comes to import and export. A prime example of this is procedural shaders. Many rendering applications support procedural shaders, but it's not very common to have import/export of shaders between applications. This is because each application doesn't know the precise shading algorithms of the other applications. Without knowing that, one application is not able to recreate the results of another application.
Another example of this is terrains. TG supports export of terrains but only as static data. You can either export the terrain as a heightfield or use a micro exporter to export rendered terrain as a polygon mesh. Another application wouldn't be able to recreate procedural terrains from TG because it doesn't know the algorithm TG uses to generate the terrain.
Even when algorithms are known in general it can still be tricky. The best example of this is animation curves. Many applications use the same type of curves for animation. However, unless the precise algorithm is known it's difficult to exactly reproduce the curve. This might be a problem if you were trying to match a camera move from TG with a camera move in another application. Although the curves might be roughly the same shape, minor differences could cause jitter or shake in the results. For this reason we suggest that animation curves are baked on import or export. Baking is a process where a key frame is placed at every frame. This effectively turns the procedural curve into a static one.
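A small sketch shows why baking helps. It assumes two hypothetical applications that interpolate between key frames differently (one linear, one with smoothstep easing); neither is claimed to be what TG or any other application actually uses. With only two key frames the applications disagree at in-between frames, but once a key is placed on every frame they agree at every whole frame.

```python
from collections.abc import Callable

Key = tuple[int, float]  # (frame number, value)


def linear(t: float) -> float:
    return t


def smoothstep(t: float) -> float:
    return t * t * (3.0 - 2.0 * t)


def evaluate(keys: list[Key], frame: float, ease: Callable[[float], float]) -> float:
    """Evaluate a keyframed curve at a frame using the given easing function."""
    if frame <= keys[0][0]:
        return keys[0][1]
    for (f0, v0), (f1, v1) in zip(keys, keys[1:]):
        if f0 <= frame < f1:
            t = (frame - f0) / (f1 - f0)
            return v0 + (v1 - v0) * ease(t)
    return keys[-1][1]


def bake(keys: list[Key], ease: Callable[[float], float]) -> list[Key]:
    """Place a key on every whole frame between the first and last key."""
    first, last = keys[0][0], keys[-1][0]
    return [(frame, evaluate(keys, frame, ease)) for frame in range(first, last + 1)]


keys = [(0, 0.0), (10, 100.0)]

# Unbaked: the two interpolation methods give different values at frame 2.
print(evaluate(keys, 2, smoothstep), evaluate(keys, 2, linear))

# Baked from the smoothstep curve: both methods now return the stored key value.
baked = bake(keys, smoothstep)
print(evaluate(baked, 2, smoothstep), evaluate(baked, 2, linear))
```

The cost is the usual static-data trade-off: the baked curve stores one key per frame instead of two keys in total.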
A heightmap or heightfield is an array of height values, usually in a grid which describe the height at specific points in a defined area. Heightfields are used to represent real-world and virtual terrain in a specific, easily converted format. Most heightfields can be represented as simple image data in grayscale, with black being minimum height and white being maximum height.
DEM stands for Digital Elevation Model (or Map). A DEM is similar to a heightfield. DEMs are normally generated from real world measurements of a planet's surface, for example Earth or Mars.
A node is a single object or device in the node network which generates or modifies data and may accept input data or create output data or both, depending on its function. Nodes usually have their own settings which control the data they create or how they modify data passing through them. Nodes are connected together in a network to perform work in a network-based user interface. In Terragen 2 nodes are connected together to describe a scene.
A shader is a program or set of instructions used in 3D computer graphics to determine the final surface properties of an object or image. This can include arbitrarily complex descriptions of light absorption and diffusion, texture mapping, reflection and refraction, shadowing, surface displacement and post-processing effects. In Terragen 2 shaders are used to construct and modify almost every element of a scene.


