Shadow Filtering For Pointlights


Dear reader,

I want to write a bit about PCF filtering for omnidirectional lights (see how I just rephrased the title?) because I think this is something a lot of beginners (like me) want to implement, but I think it has not been written about as thoroughly as I'd like. I would be very happy if you, the reader, would post links to good tutorials in case I missed them.


Shadow Filtering in General

Should you be looking for different shadow filtering algorithms, you have come to the wrong place – the right place would be here:

MJP provides a sample solution with numerous different ways to properly filter your shadows; I highly recommend checking them out (you can find the shader code in this file on github).

Shadow Projection

Most of the tutorials/papers about shadow filtering use spot lights or directional lights as the source of the shadow maps, so they just have to sample a simple 2-dimensional texture.

However, projecting the view of an omnidirectional light source onto a single texture is not trivial at all. One can do that, for example, with Paraboloid or Dual-Paraboloid projection.

A great resource for that is

You can see one side of such a projection to the right (taken from the link)

This sort of projection comes with a number of problems:

  • The texel-per-mesh-area covered ratio is high in the middle and very low at the edges
  • You will experience edge seams that need covering up
  • You waste a lot of memory on the black areas (1×1 − 0.5×0.5×π ≈ 21.46%)
  • Filtering is not trivial
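As a quick sanity check of the wasted-memory figure in the bullet above, here's the arithmetic in Python (the shadow map is treated as a unit square with the paraboloid filling its inscribed circle):

```python
import math

# A paraboloid projection only fills the inscribed circle of the square
# shadow-map texture; the corners stay black and are wasted.
texture_area = 1.0 * 1.0             # unit square
circle_area = math.pi * 0.5 ** 2     # inscribed circle, radius 0.5
wasted = texture_area - circle_area  # fraction of the texture left black
print(wasted)                        # ≈ 0.2146, i.e. about 21.46 %
```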

So this approach is often ignored and instead cubemap projection is chosen (both for shadow mapping as well as environment mapping).

There is a great tutorial for that here: Advanced-Lighting/Shadows/Point-Shadows

The basic idea is to sample the scene 6 times, each with a different orientation of the camera (Cubemaps: wikipedia).

Even better: DirectX and OpenGL combine these 6 textures into an array (TextureCube) and can read hardware-filtered texels with a simple direction vector as an input.

That makes reading out the shadow trivial: you can simply sample the texture with the vector from the light to the pixel's position.



Filtering a cubemap

The good part about cubemaps is that they can be bilinearly filtered by default. That means that if you sample a texel at the very edge of one face, it will be blended with a texel from the neighboring face.

However, for shadow filtering, we have to compare depth values from neighbors, we don’t just want to blend them. We also want to compare more than just neighboring pixels for a softer appearance. (Note: we could do that with Gather() instead of Sample(), but that would only give us 4 texels)

The tutorial I've linked to deals with that by giving the sampling vector (pixel-to-light direction) some offsets in all directions, to get some compare values and smooth the results.

I’ll copy the code just to make clear how it works, but I think the illustrations below help, too.

vec3 sampleOffsetDirections[20] = vec3[]
(
    vec3( 1,  1,  1), vec3( 1, -1,  1), vec3(-1, -1,  1), vec3(-1,  1,  1),
    vec3( 1,  1, -1), vec3( 1, -1, -1), vec3(-1, -1, -1), vec3(-1,  1, -1),
    vec3( 1,  1,  0), vec3( 1, -1,  0), vec3(-1, -1,  0), vec3(-1,  1,  0),
    vec3( 1,  0,  1), vec3(-1,  0,  1), vec3( 1,  0, -1), vec3(-1,  0, -1),
    vec3( 0,  1,  1), vec3( 0, -1,  1), vec3( 0, -1, -1), vec3( 0,  1, -1)
);

float shadow = 0.0;
float bias = 0.15;
int samples = 20;
float viewDistance = length(viewPos - fragPos);
float diskRadius = 0.05;
for (int i = 0; i < samples; ++i)
{
    float closestDepth = texture(depthMap, fragToLight + sampleOffsetDirections[i] * diskRadius).r;
    closestDepth *= far_plane; // undo mapping [0;1]
    if (currentDepth - bias > closestDepth)
        shadow += 1.0;
}
shadow /= float(samples);

Here is an old screenshot I took when I used this technique (don't mind the bad-looking colors etc.)


This exhibits a great number of issues and they all come from the simple fact that we use cubemaps, which will become apparent soon.

I'll try to visualize how this implementation works and how it can be improved.

For this animation I chose a top-down view and 4 offset vectors, all of which are part of the full array.


You can already see a problem here, where the -y vector and the +x vector sample almost the same point.

Another obvious problem arises in this scenario:

In this case we sample the same texel 3 times without any information gain.

I made a close-up of the actual shadows and you can see how the problem manifests:
Note how the offsets between different shades are very inconsistent.

This can be helped by calculating the normal and binormal of the sampling vector and using them for the offsets instead; however, it won't solve the underlying problem, since we can still skip texels or sample the same texel numerous times with a bad configuration and offset size.
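The tangent/bitangent idea can be sketched on the CPU like this (a minimal Python sketch with hand-rolled vector math; the choice of helper axis is an assumption, not code from the project):

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def basis_around(direction):
    """Build a tangent/bitangent pair perpendicular to the sampling vector,
    so offsets move across the face of the cubemap instead of sliding
    along the sampling vector itself."""
    d = normalize(direction)
    # pick a helper axis that is not parallel to d
    up = (0.0, 1.0, 0.0) if abs(d[1]) < 0.99 else (1.0, 0.0, 0.0)
    tangent = normalize(cross(up, d))
    bitangent = cross(d, tangent)
    return tangent, bitangent
```

Offsets built from `tangent` and `bitangent` are at least guaranteed to be perpendicular to the sampling direction, but as noted above they still don't land on exact texel centers.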

We really want to read out the neighboring texels instead of just trying to change our vector a bit and hoping we hit a good sampling point. This is also the way to ensure we can use smoothing on our edge taps (more on that later).

This brings us to a major flaw of the basic TextureCube element in HLSL. We can only sample it with a vector3 input; we do not know the sampled texel's position in the array, nor do we have easy access to its neighbors (Texture2D, for example, can use Sample() with an integer offset, or simply Load()).
(Noteworthy: we can use a texture array instead, read here)

One could use the dot product to get the sampling vectors to behave correctly and always snap one texel in a certain direction, but edges would have to be treated separately, since the offset does not map accurately to the texel of the next face. This is probably a much easier solution than what I did, but I didn't think of it at the time.

However, when I read this thread here, it seemed to me like cubemaps aren't really what I want.

Not using Cubemaps

It was suggested to fit all 6 texture maps onto one big texture instead and then manually create a conversion function that would return the accurate texture coordinate from any given 3d vector.

So that's what I did, and it might be a solution for you, too.

You can see the 6 shadow maps on the left in one big texture strip.

An excerpt from my conversion function:

//vec3 doesn't have to be normalized,
//Translates from a world space vector to a coordinate inside our 6x size shadow map
float2 GetSampleCoordinate(float3 vec3)
{
    float2 coord;
    float slice;
    vec3.z = -vec3.z;

    if (abs(vec3.x) >= abs(vec3.y) && abs(vec3.x) >= abs(vec3.z))
    {
        vec3.y = -vec3.y;
        if (vec3.x > 0) //Positive X
        {
            slice = 0;
            vec3 /= vec3.x;
            coord = vec3.yz;
        }
        else //Negative X
        {
            vec3.z = -vec3.z;
            slice = 1;
            vec3 /= vec3.x;
            coord = vec3.yz;
        }
    }

    // ... other directions, Y, Z

    // a possible precision problem?
    const float sixth = 1.0f / 6;

    //now we are in [-1,1]x[-1,1] space, so transform to texCoords
    coord = (coord + float2(1, 1)) * 0.5f;

    //now transform to the slice position
    coord.y = coord.y * sixth + slice * sixth;
    return coord;
}

You can find the full code in the github solution, this is just to give you a rough idea of how it works.
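To show the face selection for all six directions in one place, here is a CPU-side Python sketch of the same idea. The slice order and sign conventions here are illustrative assumptions; the actual shader in the repository chooses its own:

```python
def get_sample_coordinate(v):
    """Map a direction vector to a texcoord inside a 1x6 shadow-map strip.
    The dominant axis picks the face (slice); the other two components are
    projected onto that face and squeezed into the face's sixth of the strip."""
    x, y, z = v
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:                        # X-major faces
        slice_, u, w, major = (0, -z, -y, ax) if x > 0 else (1, z, -y, ax)
    elif ay >= az:                                   # Y-major faces
        slice_, u, w, major = (2, x, z, ay) if y > 0 else (3, x, -z, ay)
    else:                                            # Z-major faces
        slice_, u, w, major = (4, x, -y, az) if z > 0 else (5, -x, -y, az)
    # project onto the face plane: both coords end up in [-1, 1]
    u, w = u / major, w / major
    # [-1,1] -> [0,1]
    u, w = (u + 1.0) * 0.5, (w + 1.0) * 0.5
    # squeeze the y coordinate into this face's sixth of the strip
    sixth = 1.0 / 6.0
    return u, w * sixth + slice_ * sixth
```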

Looks like my code is working!

It's still blocky, but we could easily apply the offset-vector code and it would show the same results as with cubemaps.

But, since we are dealing with 2D texture coordinates now, we can plug in any code for shadow filtering (for example: PCF).

However, there is still a problem: what if we are at the very edge of a texture block and want to sample the right neighbor? We need another function that checks the offsets. If the texture coordinate plus the offset falls outside the current projection, we have to translate it to the correct coordinate in another slice.

This, of course, can sometimes be troublesome, because going to the right on our topview might be going down on our left side view.

Edge Tap Smoothing

Working with texels allows us to have very smooth transitions, since we know how far "into" the texel we are sampling.

For example, let's assume we have a center sample and one sample with a 1-texel offset.

If we are on the very left of the center texel, our right sample is also at the very left of its texel and should therefore not have a lot of impact on the final result, since we "cover" only a very small area of the right sample.

If our center sample is at 0.5, our right sample is at 0.5 inside its texel, too, and therefore covers half of the texel's area, so we weight this sample with 0.5.

You can see a simple illustration of this on the left side, but I admit it might be a bit misleading with the 0 and the 1 values.
Plus, it's not simply adding the colors; you also have to divide by the total weight, so for 0.5 of red it would be (yellow + 0.5 × red) / 1.5.
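The weighting described above can be written out as a one-dimensional Python sketch (the function name and signature are mine, for illustration only):

```python
def edge_tap_1d(center_val, right_val, frac):
    """1-D edge-tap smoothing: 'frac' is how far into the center texel we
    sample (0 = left edge, 1 = right edge). The right neighbor only
    contributes as much of its area as the filter footprint covers, and the
    sum is renormalized by the total weight."""
    weight_right = frac
    return (center_val + weight_right * right_val) / (1.0 + weight_right)
```

With `frac = 0.5` and values yellow = 1.0, red = 0.0 this reproduces the (yellow + 0.5 × red) / 1.5 example from the text.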
Regardless, here is the actual result on shadow maps (3×3 samples for PCF):



Of course, by increasing the number of samples we can get smoother results, but there really is a limit on how soft our shadows can become with PCF before performance tanks.

For soft shadows, perhaps VSM or ESM maps might be a better choice (I implemented them originally for spot lights here:

With PCF working correctly, one could also implement Percentage-Closer Soft Shadows (here is a paper from NVIDIA: Percentage-Closer Soft Shadows) and I might explore that in the future, but the implementation is pretty expensive by default.

So yeah, I hope you liked the read. I’m not a pro or anything, so advice and tips are greatly appreciated.

You can find the project here:


Shaders, Fun & Monogame

I haven't really written in a while on this blog, but I think it's a good time to compile some of the stuff I have been working on since the last blog posts, so I've decided to release a number of smaller articles.

A lot of the stuff is available for download or I’ve written more detailed posts about it on the forums, so you might find some things of interest.
I will also put up a list of resources and tutorials which helped me in building the stuff at the end of each segment.

Fun Shaders – Part 2

Here is a short overview of some casual shaders I’ve created in the last week or so.


Radial Blur


A fairly simple yet impressive effect: we basically just scale the image from the blur center several times and then average the results.
This can look much better if we repeat the effect on top of the already blurred image.

We can effectively square the used samples that way. For 10 samples and 3 passes the complete frametime is around 0.5ms on a Radeon 280.

My shader:

float4 PixelShaderFunction(VertexShaderOutput input) : COLOR
{
    float2 vec = MousePosition - input.TexCoord;

    float4 color = 0;

    //This is computed in a static global instead
    //float invBlurSamples = 1.0f / BlurSamples;

    for (int i = 0; i < BlurSamples; i++)
        color += Screen.Sample(texSampler, input.TexCoord + vec * i * invBlurSamples * BlurIntensity);

    color *= invBlurSamples;

    return float4(color.rgb, 1);
}

Note: I use a point sampler, so I might as well use Load instead of Sample. I have found the difference when using bilinear sampling to be minor, especially compared to the performance impact.

Bokeh Blur


Out-of-focus blur from cameras creates interesting shapes, which depend on the lens and its aperture.
For more information I'd recommend Wikipedia.

Anyways, creating this effect is pretty challenging from a rendering perspective, since it taps into the old "gather vs. scatter" problem.
The idea of this blur is that each point expands into a certain shape, for example a hexagon (which you can see to the right).

The problem lies in the fact that a pixel shader has a fixed pixel coordinate it has to work with. So for each given pixel we can:

  • Read other pixels from another texture

But we cannot

  • Read pixels from the rendertarget we are writing to
  • Move the current pixel around

So the problem is that we can basically only “gather” info, but not expand our pixel.

Therefore a normal blur implementation would look up the neighbor pixels and average them out.

For a bokeh implementation, one could read all neighbor pixels in an x*x neighborhood (x being the width/height of our bokeh texture), check whether our current pixel is part of the bokeh shape when viewed from the other pixel's perspective, and then average accordingly.

The trouble then is that we cannot have very large bokeh shapes. If our shape were 100×100 pixels, that would be 10,000 texture reads per pixel. Absolutely impossible to compute in real time.
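The gather variant described above can be sketched on the CPU like this (a toy Python version over grayscale grids; the function name, mask layout, and weighting are my own illustrative assumptions, and the O(r²) inner loop is exactly the cost problem just mentioned):

```python
def bokeh_gather(image, mask, px, py, radius):
    """Gather-style bokeh: for every neighbor inside the kernel, check the
    bokeh mask from *that* pixel's point of view and accumulate its color.
    'mask' is a (2r+1)x(2r+1) grid of 0/1 shape coverage; 'image' is a 2-D
    grid of grayscale values."""
    total, weight = 0.0, 0.0
    h, w = len(image), len(image[0])
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            nx, ny = px + dx, py + dy
            if 0 <= nx < w and 0 <= ny < h:
                # does the neighbor's bokeh shape cover us? The offset from
                # the neighbor back to us is (-dx, -dy).
                m = mask[radius - dy][radius - dx]
                total += image[ny][nx] * m
                weight += m
    return total / weight if weight > 0 else 0.0
```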

So the other approach then is to actually expand each of our pixels.


To do that we simply draw a quad for each pixel. This quad has the bokeh shape multiplied by the pixel’s color. Voila. Done.

Obviously this doesn’t sound like it would perform very well, and it’s true, it doesn’t.

However, if we cut our resolution down to a quarter (1/2 width, 1/2 height), this approach already becomes interactive (depending on the quad size).

In my implementation I prepare a vertex buffer with all the quads combined and draw everything in one drawcall. I read the pixel’s color in the vertex shader and the bokeh shape in the pixel shader.

However, we have to deal with massive overdraw: each pixel is potentially affected thousands of times, depending on how large our quads are.

Not only is that bad because of fillrate, but we also run into some heavy precision problems when trying to blend (additively). Even with FP16 rendertargets, precision problems crop up pretty fast when dealing with giant quads.
The issue is visible even in this highly compressed .gif:

However, switching to a 32bit rendertarget impacts the performance in an unacceptable way, so that is no real option.

So the solution I came up with is to resize the buffers after set thresholds.

E.g. after BokehSize = 5.0f I downsample the rendertarget (and the subsequent quad amount) another time, rescale the quads to fit again, and enjoy very good performance along with little enough overdraw to not need 32-bit rendertargets.

The transition is noticeable in some high-frequency areas, and could probably be improved if the base rendertarget were downscaled with priority on standout colors.


Resources used

I didn't use any templates for this, but I have stumbled upon this implementation by MJP.
It uses a stock blur for most parts of the image and only applies the bokeh effect to extracted highlights (similar to a bloom extract).


“Technic” Effect

A simple yet fun shader that looks sort of "techy", a bit like currents flowing through a microchip (as an engineer this feels wrong).


There’s really not much to it, just a simple expand pixel shader. I store the current age and direction of a lit pixel in the alpha value and read all 4 direct neighbors of black pixels to see if they are “next” to be part of the current line.

Additionally I have a random value (either static per pixel or truly random) that determines whether or not the line will split in two (changing directions along the way – from horizontal to vertical and vice versa) or end.

The appeal originally was to use the 8 bits of the alpha channel to store information efficiently, but it turned out I didn't need that, since a pixel is either black, freshly spawned (with new direction information for its neighbors), or fading into darkness (no neighbor will be affected).
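A toy CPU version of this kind of update rule might look like the following Python sketch. The state encoding, split probability, and turn logic are made-up illustrative choices, not the shader's actual values:

```python
import random

# States: 0 = black; otherwise direction + 1, like the age/direction info
# the shader packs into the alpha channel. Directions: 0=right, 1=down,
# 2=left, 3=up.
DIRS = [(1, 0), (0, 1), (-1, 0), (0, -1)]

def step(grid, rng):
    """One update of a toy 'technic' automaton: a black cell lights up if a
    freshly lit neighbor is heading towards it; lit cells fade next frame,
    and a line occasionally turns 90 degrees."""
    h, w = len(grid), len(grid[0])
    nxt = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if grid[y][x]:                 # lit cells fade after one step
                continue
            for d, (dx, dy) in enumerate(DIRS):
                nx_, ny_ = x - dx, y - dy  # neighbor heading our way
                if 0 <= nx_ < w and 0 <= ny_ < h and grid[ny_][nx_] == d + 1:
                    if rng.random() < 0.1:  # occasionally turn 90 degrees
                        d = (d + rng.choice((1, 3))) % 4
                    nxt[y][x] = d + 1
                    break
    return nxt
```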

Anyways, stuff like this is always fun.

Spring Particles

Similar to the bokeh effect, I draw potentially millions of quads, one per pixel of a rendertarget, onto the screen.


In the image above I used 128 × 80 particles.

On two separate rendertargets I store the current position (rg) and the current velocity (ba) of each particle. I switch these RTs to read back from each other and simulate simple spring equations, with k being the spring constant, plus a dampening factor (you can see that in the video at the beginning of the post).
Additionally there are extra attractors/repellents, which I map to the mouse coordinates.
In the .gif below I have one particle per pixel, for a total of 1,024,000.
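One integration step of such a spring, for a single particle, can be sketched in Python like this (the function and its parameters are illustrative; on the GPU the pos/vel pairs live in the two ping-ponging rendertargets):

```python
def spring_step(pos, vel, anchor, k, damping, dt):
    """One integration step of a per-particle spring: the particle is pulled
    back towards its anchor with spring constant k, and its velocity is
    damped each frame."""
    ax = -k * (pos[0] - anchor[0])
    ay = -k * (pos[1] - anchor[1])
    vx = (vel[0] + ax * dt) * damping
    vy = (vel[1] + ay * dt) * damping
    return (pos[0] + vx * dt, pos[1] + vy * dt), (vx, vy)
```

An attractor at the mouse position would simply add another acceleration term towards (or away from) that point before integrating.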


The interesting thing about this kind of simulation is that the GPU essentially doesn't care about a 1280×800 simulation if the math inside is pretty simple.
On the CPU this would be nearly impossible, while on the GPU I can run this at several hundred frames per second with one particle per pixel. And the main bottleneck is probably overdraw/blending anyway.

It's actually not the first time I've done this. A very similar approach was used when I created the grass simulation for Bounty Road, which is even more complex, with complicated wind functions affecting each sample point.

You can see an old video about that here:

Anyways, I hope you liked this short overview :)


Short video overview:


Motion Blur

In real life and in expensive rendering solutions, an image is not a real "snapshot" but rather the continuous integration of samples (or photons, if you will) over the frame's duration.

In real-time rendering it's not feasible to render multiple frames and average them into the final one, since we don't get framerates quite that high.

So the usual approach is to render a velocity buffer that basically says how much each pixel has moved since the last frame. We can do that by passing both the current and the previous model-view-projection matrix, calculating both positions, and saving the difference in screen-space coordinates. This can be done easily with MRTs.
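The per-vertex part of this can be sketched in Python with plain 4×4 matrices (row-major layout and the helper names are my assumptions; in the real shader this runs per vertex with the two MVP matrices as constants):

```python
def transform(mat, p):
    """Multiply a 4x4 row-major matrix with a (x, y, z, 1) point and apply
    the perspective divide, returning clip-space xy in [-1, 1]."""
    x, y, z = p
    out = [sum(mat[r][c] * v for c, v in enumerate((x, y, z, 1.0)))
           for r in range(4)]
    return out[0] / out[3], out[1] / out[3]

def velocity(world_pos, curr_mvp, prev_mvp):
    """Screen-space velocity written to the velocity buffer: the difference
    between where this point projects now and where it projected last frame."""
    cx, cy = transform(curr_mvp, world_pos)
    px, py = transform(prev_mvp, world_pos)
    return cx - px, cy - py
```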

Then in a post processing pass we can blur each pixel in the direction of the velocity information.

For more information see the tutorial from John Chapman, which I linked below.


A few issues remain:

  • For high framerates the difference between frames is not enough to have motion blur visible. We can calculate the factor between actual frametime and target frame time (for example 50 ms for 25Hz. If the actual frametime is 25ms our factor is 2) and increase the motion blur by that factor to give it a consistent strength across variable framerates.
  • The motion blur is limited only to the object. This is not what we actually want, since we want to smear the object with the background, so we need to somehow enlarge the blur field.
    This can be done with various dilation methods, a good reference is the presentation about “Next-Generation-Post-Processing in Call-of-Duty” linked below.

    What I did instead of any pixel operations is to render the velocity buffer in a whole different pass and extrude the balls in the direction of the velocity vector (and away from it on the backside).
    This is sort of “hacky” and it only works well for this sample, in many cases the vertex displacements are not accurate enough for what we want.
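The framerate-compensation factor from the first bullet above is just a ratio; a tiny Python sketch (the 50 ms / 25 Hz target is the article's example value, the function itself is mine):

```python
def blur_scale(actual_frametime_ms, target_frametime_ms=50.0):
    """Scale the sampled velocity so motion blur keeps a consistent strength
    across framerates: at 25 ms per frame against a 50 ms target, the factor
    is 2, so the blur is stretched twice as far."""
    return target_frametime_ms / actual_frametime_ms
```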

Also keep in mind that this motion blur technique is not super cheap. Your velocity buffer probably needs more than 8-bit precision (I worked with 16 bit). And you need to pass an additional transformation matrix to the vertex shader.

The second point might not sound like much, but if you are using skinned meshes you need to pass not only your current bone transformations but also the ones from the last frame. If you are using DirectX 9 you may run out of constant registers for the vertex shader.

Resources used


Order Independent Transparency

Just something I’ve implemented / copied based on Morgan McGuire’s work.


It’s really nothing to write home about from my side, I originally intended to research this topic more in depth, but haven’t found time or motivation to do so.

The whole premise of order independent transparency is that you don’t have to worry about the order in which you draw your transparent objects. This is not trivial at all, but Mr. McGuire and his team have found a relatively easy to implement solution.

In my tests I haven’t found it to be super robust and the algorithm seemed to struggle with relatively high alpha values, but that may be my fault.

The upside is the ease of use and implementation compared to all other relevant research, which usually revolves around some sort of per-pixel lists and compute shader computations. This one works with basic pixel shaders and some blend modes.

The one thing I did have to do is to implement different blend modes for MRT in monogame. By default Monogame/XNA uses the same blend mode for all rendertargets.

This behaviour can easily be changed if TargetBlendState inside the BlendState class is made public and then manipulated. TargetBlendState is an array of blend states, each corresponding to a rendertarget in an MRT setup, so we have to change TargetBlendState[1] for our second RT, for example.

Resources used

Ocean Scene


Before working on the model viewer I actually wanted to create a “water slice”. I found the idea pretty interesting both as a visual concept as well as an interesting challenge. Water in general is always an interesting rendering problem and I have not dealt with that one before.

What is it?

  • A FFT transformed height field used as base for the water surface
  • A water cylinder below
  • both have fresnel and light absorption functions applied
  • combined with some stencil shading


Above: First iterations, with some fake green subsurface scattering

How is it done?

Well, the whole thing went through several iterations, and in the end I stopped working on it, since it didn't seem interesting enough to release without fish and boats etc. Plus the rendering obviously needed more work.

The basic idea for the heightmap transformation is derived from J. Tessendorf's paper about simulating ocean behaviour by statistical analysis of waves, working out the current heightmap values with fast Fourier transforms. You can find the link below.

For realistic rendering I combined that with a Fresnel map for realistic reflection behaviour, generated at startup during runtime, as well as a water color map based on light absorption, which I generate with some real-world values.

I had to combine that with calculating the volume traversal for each pixel, which was interesting to solve, but I did something similar for volumetric lights before.

Basically you would want to know the entry and exit point in the water surface for each pixel. That can be done by either reading the pixel depth (for pixels on the water surface) or calculating intersection points for a line / circle (basic math).
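The line/circle case (the water cylinder seen from above) is the standard quadratic; a Python sketch, with names of my own choosing:

```python
import math

def ray_circle_intersection(origin, direction, center, radius):
    """Entry/exit distances along a 2-D ray through a circle. Solves
    |o + t*d - c|^2 = r^2 for t; returns None if the ray misses, otherwise
    (t_entry, t_exit). The water traversal length is t_exit - t_entry."""
    ox, oy = origin[0] - center[0], origin[1] - center[1]
    dx, dy = direction
    a = dx * dx + dy * dy
    b = 2.0 * (ox * dx + oy * dy)
    c = ox * ox + oy * oy - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    s = math.sqrt(disc)
    return (-b - s) / (2.0 * a), (-b + s) / (2.0 * a)
```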

Added to that, the pixel "vector" (the vector from the camera to the entry pixel) is skewed because of refraction inside the water.

Along with the knowledge of how far we travel through the water we also have to know how much light reaches the water depth in the first place and then integrate these together.

This poses an interesting issue: with a water cylinder like this, we have to decide whether the light outside the cylinder but below the water surface travels as if it were in air or in water we just can't see.
Depending on which one we choose, the water gets darker faster or slower with depth, both from above and from the sides.

The depth also depends on the dynamic surface height of the waves. You can see that pretty well in the first gif.

Another interesting challenge is to simulate actual refraction for meshes inside the water.
This is anything but trivial:

The game approach to refraction is usually a post-process effect:

  • Render the scene “as is” without any refraction to a buffer
  • On the objects that refract (glass, water etc.) we usually have a distortion texture or use normals to get some distortion value
  • read the buffer with the distortion offset and present the data
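The three steps above boil down to an offset buffer read; a toy Python sketch over 2-D grids (grid representation, `strength`, and clamping are illustrative assumptions):

```python
def refract_lookup(scene, distortion, x, y, strength=4):
    """Game-style post-process refraction: offset the scene-buffer read by
    the distortion value (e.g. derived from a normal map) at the refracting
    pixel. 'scene' is a 2-D grid of values, 'distortion' a 2-D grid of
    (dx, dy) offsets in texels; reads are clamped to the buffer edges."""
    h, w = len(scene), len(scene[0])
    dx, dy = distortion[y][x]
    sx = min(max(x + int(dx * strength), 0), w - 1)
    sy = min(max(y + int(dy * strength), 0), h - 1)
    return scene[sy][sx]
```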

This looks plausible enough in many cases, but has the major drawback of not being accurate at all.

The problem lies in the fact that, depending on how far the mesh point is from the point of refraction, the mesh will appear more skewed to one side for the viewer.

E.g. a fish right in front of the glass of an aquarium has more or less the same apparent position as if there were no water/glass. A fish at the back end, however, will have a great offset from where it would appear without refraction. This is not accounted for with the post-process approach.

So I attempted to "solve" the issue for a cylinder in the vertex shader and transform the position to where it should be after refraction. Some of it I could do analytically (because of the cylinder assumption) and some I had to do numerically.
Either way, it works *somewhat*, but it does have bugs from certain perspectives, and I'd have to fix those if I wanted to release the project.

Note: The fish looks undistorted when close to the camera but is stretched when at the far end of the cylinder.

Resources used

Monogame Model & Animation Viewer

A short video as an overview:

You can find the source code on github:

A direct download of the binaries here:

And a discussion link here:

What does it do?

  • You can import a variety of .obj and .fbx files and basically see if they were exported correctly.
  • You can try out different materials and settings in a PBR environment
  • You can see if the animations were correctly exported (.fbx only)
  • HBAO and Parallax Occlusion Mapping along with preconvoluted environment maps should help to give the objects a realistic feel
  • Use heightmaps as bump maps
  • You can use the included animation import libraries for your games to set up 3d skinned animation in your game.

Why did I do it?

  • learn about skinning (on the GPU) and animation setup
  • learn about Image Based Lighting
  • implement Parallax Occlusion Mapping
  • build a solid GUI
  • create / use tools to import new resources to a running program, handled in a different thread

In my previous (unfinished) game, proper skinned animations were never really implemented, and I had never dealt with skinning before. An artist I work with inquired about the possibilities of animations, and I put up this viewer to verify that his work exported correctly.
I've also learned a bunch of new stuff (see the list above) and created a basic, solid graphical user interface I can use in future projects.

The GUI is a good fit for a playground/debug user interface, since I can directly link fields/properties/methods of any instance to the interface elements. This might not be optimal for a giant UI in a game, but is just what I need for fast prototyping. Along with being easy to build and use, it is not resource intensive and does not produce any garbage, a flaw I have found in other solutions.

I think the source can be useful if you want to learn a bit about animation and/or rendering basics in monogame.
The renderer uses simple forward shading; it has a simple depth pre-pass that’s only used for ambient occlusion.

Even though, of all the things created lately, this is by far the largest and most polished, I think I'll leave it at that. The techniques themselves are well explained in other tutorials (and the links below).

I hope some of you find this tool useful :)

Resources used

Shadow Tactics – Rendering Breakdown

logoHi guys,

this one I almost missed. True to its name and heritage it hid from me, and if I hadn't noticed it on someone else's wishlist I would probably never have heard about Shadow Tactics – Blades of the Shogun.

It’s basically a fresh take on the good old realtime tactics genre, a sequel to the likes of Commandos – Behind Enemy Lines and Robin Hood: The Legend Of Sherwood. So basically “isometric” party spy action.

Regardless of whether you know these games, I found this one really intriguing.
Luckily for everyone, the developer released a pretty lengthy demo of the game; you'll find it in the Steam store.

The fact that there is a demo is pretty amazing, but what's even better is that the game itself is right up my alley, too. Go check it out. Or read some reviews, there are plenty (even though I managed to miss them all when the game released a month ago).

What makes this game even more interesting to me is that it’s made by local Munich developer Mimimi Productions. Should one of them read this: How about a beer? : P

Anyways, as with every game I play, I usually pay close attention to its graphics and capture some frame breakdowns with Intel GPA or RenderDoc to analyze.

This time I decided to share, hope you enjoy.

Frame Breakdown

This is our final image: output_final

How do we get there?

This game is made in Unity, so I assume it uses many of the default ways Unity renders things, but there are some unique elements as well.
EDIT: As I went on, I found a number of things I am not certain about, sorry. I decided to still release this blog post since I like the game and want to write something about it. I have never personally worked with Unity and am only assuming some things. The default Unity deferred rendering setup is a bit different from the one used here.

I will try to make this as friendly to beginners as possible, but first let’s see a quick overview. I will go over these points in more detail afterwards.

In chronological order:

  • Depth map generation for shadows and enemy view cones
  • Water reflection map (if needed)
  • G-Buffer generation
  • Deferred shadow mapping
  • Lighting
  • Water (if needed)
  • View cones
  • Outlines drawn
  • Screen Space Ambient Occlusion
  • Particle Effects
  • Bloom/Glow (Fullscreen Blur)
  • Combine (+tonemap) into final image
  • UI Elements (Buttons etc.)

However, since some things are done earlier, but only come up later to full effect, I will not talk about everything in chronological order, but instead in a way that is easier to explain and understand.


Let's start with the G-buffer generation. This game uses a deferred rendering engine, which means that we first store all relevant material information for each pixel and later calculate lighting with this information.

So we basically draw all our objects and store the relevant information, like for example “what color does the material have? How reflective is the material?” to a big texture / several textures.

Unfortunately a texture has at most 4 channels (red, green, blue and alpha/transparency), but we need more info, so we save to multiple textures at once.

I’ll break them down:

  • Albedo (rgb) – that's just the texture of the object without any lighting applied
  • Occlusion (a) – the baked-in darkening of the object, based on the fact that some parts are naturally occluded from ambient light (typical use case: cracks on rocks etc.)
  • Specular (rgb) – the brighter this value, the stronger the reflective properties of the material. If it is not gray it's usually a metal. For example the ninja's sword handle is gold, so it will reflect the environment and light with a golden tint.
  • Smoothness (a) – a super smooth object is basically a mirror; a rough object is for example cloth or dry dirt. This determines the way the material reacts to light.
  • Normals (rgb) – which direction does the current pixel face? We need this later for light calculations (how much does it face the light, i.e. how bright is it?)
  • Outlines sub (a) – I falsely presumed these were used for outlines. However, it turns out they are not. It is possible that this info is used in the lighting pass, since these meshes are dynamic/moving and do not have a precomputed light map. Maybe these pixels get lit with some (indirect) light approximations, which are not used elsewhere.
  • Lightmaps (rgb) – precomputed light bounces and indirect shadowing. These textures were rendered with a very sophisticated lighting model to simulate how light bounces around and distributes in shadows. For static lighting (the sun) and geometry (our objects) we don't have to calculate this every frame, so it is done beforehand. Notice how the stones on the right are lit from below, because the sunlight reflected off of the water.
  • Non-terrain (a) – I dubbed this one non-terrain, since that's what it is. I am not sure what exactly it is used for later on. It's basically all the meshes except for the terrain meshes. A good use case for this would be decals/textures, like tracks on a road, that should only affect the ground.
  • Depth (R) – the depth of the scene (distance from the camera). Used for light calculations and other stuff.
  • Stencil (R) – discussed later, used for view cones.

I can’t help but wonder if the engineers could have optimized the G-buffer setup quite a bit, for example by not having an rgb specular buffer. Anyways, let’s move on.

Comic / Toon look:
The artstyle demands a cartoonish look, so black outlines are applied to all relevant meshes (not all meshes, bushes for example look better without).
So every mesh affected (almost all) is drawn twice. Once with slightly larger outlines and only in black and once on top with the correct texture. If you want to learn more about this technique you should look for inverted normals.

Note: In this game’s implementation every mesh is drawn twice, right after one another. For better depth buffer usage and fewer texture changes it would make sense to first draw all black meshes and all the textured ones afterwards (basically like a depth pre-pass), so all geometry is drawn to the depth buffer first and we can potentially save a lot of pixel draws when we draw the colored meshes afterwards. This should work in most implementations of such a toon rendering.

A thought about terrain, should you be inclined to program your own:
The way it is handled here is that the terrain is split into chunks, so each chunk can be efficiently frustum culled (not drawn if not in the field of view of the camera).

In this image I only enabled some chunks of the terrain to draw (so it’s easy to see how large they are). For each point the terrain checks in the map on the left. The amount of green determines how much of the grass texture our pixel uses. This way we can blend smoothly between different ground textures, for example we can add a little bit of red to it to make it blend with rocks a bit.
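The splat-map blend described above can be sketched roughly like this (plain Python instead of shader code; the mapping of channels to ground textures – r for rocks, g for grass, b for dirt – is my own made-up example, not the game’s actual assignment):

```python
def blend_terrain(splat_rgb, grass, rock, dirt):
    """Blend ground textures by splat-map weights.

    splat_rgb: (r, g, b) weights read from the global map
               (here: r = rock, g = grass, b = dirt).
    grass/rock/dirt: per-texel colors as (r, g, b) tuples.
    """
    r, g, b = splat_rgb
    total = r + g + b or 1.0  # avoid division by zero on empty texels
    wr, wg, wb = r / total, g / total, b / total
    # weighted sum of the three ground textures, per color channel
    return tuple(wg * gc + wr * rc + wb * dc
                 for gc, rc, dc in zip(grass, rock, dirt))
```

With a fully green splat texel the result is pure grass; with a bit of red mixed in, the rock texture starts bleeding through, exactly the smooth blending described above.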

Note: The terrain is drawn first in this implementation. Don’t do that unless you have to! Draw it last, since it covers potentially the whole screen, but only parts of it are visible in the end. If we draw the terrain first we have a lot of pixels that are overdrawn later on (check out “non-terrain” above!) This wastes a lot of resources.

Deferred Shadows and Lighting

Next up we draw our sun shadows to another screen buffer. In a step before our g-buffer creation the scene was rendered from the perspective of the sun, but only the depth of the scene is stored. It looks like this:

Notice how there is no terrain below our objects, so in this game it doesn’t cast any shadow (but it can receive shadows)

In the next step we check every pixel in our base view and measure the distance to the sun / the shadow caster (we check where the pixel is from the viewpoint of the sun). If we are further away than the value stored in the depth map we must be in shadow, so we color this pixel in black.
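In pseudocode, the per-pixel test boils down to a single comparison. (The small bias is a standard trick against self-shadowing “acne” – I can’t confirm from the capture whether the game uses one.)

```python
def shadow_test(pixel_depth_in_light_space, stored_occluder_depth, bias=0.001):
    """The pixel is shadowed if it is farther from the sun than the
    nearest occluder recorded in the sun's depth map."""
    return pixel_depth_in_light_space > stored_occluder_depth + bias
```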

deferred shadows

We read this map in the next step and if the pixel is not black we perform a light calculation. We can combine the result with the light map buffer to get the scene properly lit. Yay!


Normal Transformation and Water

In the next step we transform our normals with some matrix math from world space to view space. That means that their basis now aligns with the basis of our camera view. You can look up more information on that if you want (View Space Normals), but it basically means that if a pixel looks to the right it’s 100% red, if it looks towards the top of the screen it’s 100% green, and yellowish in between.
World space normals point along the x, y and z axes and so need 3 channels (RGB), but in view space we only need 2, since the third axis points towards us or away from us and gives no new information.
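As a rough sketch of that transformation (plain Python instead of shader code, assuming the rows of the 3×3 view rotation are the camera’s right/up/forward axes in world space):

```python
def world_to_view_normal(n_world, view_rotation):
    """Rotate a world-space normal into view space; afterwards only
    x (red) and y (green) need to be stored."""
    nx = sum(view_rotation[0][i] * n_world[i] for i in range(3))
    ny = sum(view_rotation[1][i] * n_world[i] for i in range(3))
    # remap [-1, 1] -> [0, 1] so the values fit into a color channel
    return ((nx + 1.0) * 0.5, (ny + 1.0) * 0.5)
```

A normal pointing right comes out as 100% red, one pointing at the camera sits at the neutral 0.5/0.5.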

normals – > normalsVS

This obviously raises the question why the normals weren’t stored in view space from the get-go – this just adds one more matrix transformation per pixel. Most deferred engines do just that.

We will need the transformed normals further down the line, for example for screen space ambient occlusion effects. (It should be noted that that can be done with world space normals, too)

The depth is stored in the freed-up blue and alpha channels.

We draw our water on a mesh around our terrain and we can check against the depth to see how deep it is for each pixel and how much light passes through (how much we can see the ground!).

We save the mesh’s drawn pixels in a stencil buffer and only calculate water for these pixels.


Another thing that happened before the g-buffer creation was the rendering of the water’s reflection map. You can see the meshes that are mirrored so we see them from below. Note that only the right part of this reflection map is seen in the final render. It is not lit and low detail, because with the water waves etc. it’s hard to make out details anyways.

One can probably imagine that most of this geometry didn’t have to be rendered if a mask like the one on top was used. Since the water itself is not spanning all across the map anyways (it actually has a mesh that intersects only a little bit into the terrain) one could very cheaply write the water geometry to the stencil buffer and only reflect there.

That said, the reflection map is most likely not updated every frame anyways (because it is so hard to make out details)

water reflection

Now with some pixel shader magic and pretty textures we get this:


Notice how the water casts white foam near the cliffs. Just like the terrain it reads that from a global map. I think it looks really good!

View Cones

Now we get to the first signature feature of the game and something that is definitely not stock in any engine: The display of view cones of enemies!


That’s basically the number 1 technical thing that has to work right for gameplay to function at all. And it does just that!

The idea is that the view cone is fully blocked by high objects, while enemies can peek over objects of medium height, such as bushes or rocks.


Sadly, the inspection tool I used, Intel GPA, fails to read some input textures for the shadows, and if I save them, all the other tools say the file is corrupted. Which is a shame, since this is probably the most interesting feature. RenderDoc, sadly, crashes the game. If someone knows more, please leave a comment or write me a message on twitter.

I can tell it’s a depth map from the perspective of the soldier; the very first pass in the frame renders some proxy geometry to a giant 1024×4096 (yep, 4 times as high!) texture.
I can’t tell you what it looks like, but here is the depth buffer for this thing. I’ve chosen a different scene, because there it becomes clear that it’s really the view of the enemy unit.
The depth buffer on the right clearly gives the simplified outlines of the stairs to the left, the lamp in the middle and the crates to the right. We don’t see the house behind that though. Maybe the rest of the shadowing is done with compute shaders, I don’t know.

Another image makes it pretty clear a shadow mapping technique is used: Jaggies. Actually I think this is a bug or misplaced unit, because usually the whole area behind a bush/box is safe for crouching. Maybe this soldier’s eyes are a bit too high.


If I had to implement this technique with shadow mapping I probably would have used a 2-channel buffer / 2 textures: one depth for the crouchables and one depth for the total vision blockers. Then choose the appropriate texture based on which depth comparison fails. The soldier’s eyes are on crotch level anyways.

Note: Should you look to implement this effect in your game I think shadow volume / stencil shadowing is actually the perfect fit for this kind of thing, I cannot possibly think of any downsides to that and wonder what made Mimimi choose a different path.
(This technique works by extruding the backsides of shadow-casting geometry to the end of the shadow volume (our cone). So we basically make a new mesh where everything inside should be in shadow and draw that to a mask, which contains all shadow casting meshes. The upside to that is that it is pixel-perfect and cheap to compute, it was used in games like DOOM3 for example. It is not used nowadays for normal shadows because it can’t produce good soft shadow results, but we don’t want soft shadows anyways for the view cones)

The actual gameplay “detection” mechanic is probably using raytracing anyways and is most likely independent of the rendering technique.

What I can tell for sure though is this:
When the stencil buffer was created, one of the values represents occlusion for the view cones. On the white values nothing will be drawn. And a nice pattern was created for the bushes to differentiate them from the rest.


The lines on the ground where crouching is needed are supplied by a texture.



We already briefly talked about the black outlines for the characters, to give them a comic / toon look. But we also need some colored, thicker outlines for gameplay purposes.

Here is what we want:
You can see the blue ninja, the red enemy, the yellow scrolls and the green plants, which we can use for climbing.


The process is similar to the first round: Render our meshes a bit thicker than they really are (by extruding them in the normal direction for example). Curious: These meshes are drawn twice again. Maybe coverage was not optimal with only one.

However, in this case we want smooth outlines that fade out to the edge. Therefore, in the next pass we just dilate and blur the pixels instead of making the geometry thicker, and we repeat this step. Finally, when we are happy with the thickness, we can subtract the first pass (which had no outlines yet).
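A minimal sketch of the dilate-then-subtract idea on a binary mask (illustrative only – the real implementation works on blurred color buffers, not hard 0/1 masks):

```python
def dilate(mask, w, h):
    """One dilation step: a pixel is set if any 3x3 neighbor is set."""
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w and mask[yy][xx]:
                        out[y][x] = 1
    return out

def outline_only(thick, original):
    """Subtract the un-dilated first pass so only the grown rim remains."""
    return [[t - o for t, o in zip(trow, orow)]
            for trow, orow in zip(thick, original)]
```

Repeat `dilate` (plus a blur) until the outline is thick enough, then subtract.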

Voila – you can see the results to the right.

Screen Space Ambient Occlusion

Phew, almost done!

We just need to add a few more neat effects on top, for example Screen Space Ambient Occlusion. The basic idea of ambient occlusion is that, even though light is coming from almost all directions (sky, sunlight bouncing around etc.), some pixels are less likely to receive this ambient light than others. For example, some grass growing between two rocks is probably getting less light than some grass on the top of a mountain. Poor grass.
Anyways, SSAO approximates this effect by checking the surroundings of each pixel and how much it may or may not obstruct light.


The implementation used here is actually doing this in full-resolution for every single pixel. Usually this is done in smaller resolutions and the result is then upscaled, but obviously the final image is cleaner if you do it for every pixel (albeit more expensive in terms of rendering time).


Well, so far we have only dealt with opaque geometry, but in a deferred renderer, particles and see-through geometry have to be drawn on top of the lit world. Nothing fancy; here is the result (see the glowing beam into the sky next to the tutorial scroll or the smoke at the bottom of the screen).


Bloom and Combine

In the next pass we draw the outlines we created previously on top of the image. I think that’s obvious enough that I don’t have to provide a picture. It also appears that all other outlines/edges get a blur applied to make the image smoother (anti-aliased). Unfortunately I don’t have information about some textures again, so I have no way of knowing if the previous frame is combined into that to make it a temporal solution. Evidence speaks against that, however, since we have never computed a velocity buffer (how fast is each pixel moving), so it is probably just an edge blur. Pretty effective though.

Then we want to apply a bloom filter to make the bright spots in our image “glow” a little. We do that by extracting the bright spots of the image and blurring them.
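The bright-pass plus separable blur can be sketched like this (the luminance threshold and the tiny 3-tap kernel are made-up values for illustration; real bloom filters use wider Gaussian kernels and run the 1D blur once horizontally and once vertically):

```python
def bright_pass(pixels, threshold=0.8):
    """Keep only pixels above a luminance threshold; the rest go black."""
    out = []
    for r, g, b in pixels:
        lum = 0.2126 * r + 0.7152 * g + 0.0722 * b  # Rec. 709 luma weights
        out.append((r, g, b) if lum > threshold else (0.0, 0.0, 0.0))
    return out

def blur_1d(values, kernel=(0.25, 0.5, 0.25)):
    """One separable blur pass over a row of values (edge-clamped)."""
    n = len(values)
    return [sum(kernel[k] * values[min(max(i + k - 1, 0), n - 1)]
                for k in range(3))
            for i in range(n)]
```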

I almost wanted to leave out any image but actually this blur implementation goes crazy at one point with some stitching operations on top. Just watch ^^

First of all, the image gets split into parts that are affected and then blurred 4 times (2 times horizontally and 2 times vertically). The bright spots are marked in the alpha channel (hard to show), which is blurred as well. For the final image (the black one) I show only the parts that have value in the alpha channel, meaning the things that will appear blurry later.

I wonder why the image is stitched together for the ultimate step; I haven’t seen this before in a blur operation. Interesting. This stitching is usually done for texture maps, to make sure that there are no visible seams at UV mesh-part edges (you have to “unwrap” a model onto a flat texture, but some parts have to be split).


Finally we have rendered everything 3d.

If you know something about High Dynamic Range then you probably suspect that I have been lying in a lot of my images. That would be correct, since the lighting is done in HDR. That means the color range is much greater than what we want to / can display, and we have to bring it down to a good level with “tonemapping”. The original images were very dark and overly bright in some spots with very high contrast, so I had to manually bring them to appropriate levels in Photoshop.
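I don’t know which tonemapping operator the game actually uses, but the classic Reinhard curve illustrates the idea of squashing an unbounded HDR range down into displayable [0, 1):

```python
def reinhard(hdr, exposure=1.0):
    """Simple Reinhard tonemapping: maps [0, inf) into [0, 1).
    Small values pass through almost unchanged, huge values saturate."""
    v = hdr * exposure
    return v / (1.0 + v)
```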

So here is the final gif: Original, Tonemapped and with HUD on top.


Final Verdict

Final notes: I hope I didn’t bore you too much.

I actually wanted to go on and on about how things could be optimized (why a deferred shadowing solution that isn’t even screen space blurred and used for sun only? etc.) but I think there are many things I don’t know and one of them is how to ship a finished game. So Mimimi is probably right and I am most likely wrong.

I guess this isn’t the most technically accomplished title anyways, nor does it want to be, but the final look is just right. I have had so many of these in the pipe (Mad Max has pretty interesting rendering!) but this is the first time I actually released a technical overview, if you like it I might make more.

Speaking of which, I planned to make an actual video about Crysis (1) and its myriad of little technical wonders, but writing a script etc. takes ages. This write-up took me a lot more hours than I would have expected.

The technology is one side, but the art is another. And this game has a lot of beautiful art. I didn’t like it at first when I watched a review video, but in person it’s really nice and engaging. The UI and sounds are good, too. I love the main menu and the character portraits.

Now I gotta grab the full version when it comes around, because this seems like a worthy follow up to the old favorites.


Hi guys,

just a short blog post with super short content. I’ve seen a question on the monogame forums about how to render to multiple render targets at once in monogame (identical in XNA, too).

I am writing this again on my blog so it is preserved for future searchers :)

Here is the solution (160kb). You will have to build the content anew and build the solution itself to make it run. (Open the content pipeline -> rebuild)

The idea is very very simple: I draw red to one texture and blue to the other (but I do that at the same time with MRT rendering).
To showcase our results I render these two textures separately.

and here is a version which just uses spritebatch, in case you need that


I’ve also noticed that I didn’t even post about the bloom solution / integration I’ve made for fellow indie devs.

Here you go. The sample should be pretty easy to understand and integrate into your game if you want. For more info follow the link.

An example application is in my deferred engine, which you can download on github, too.
(Discussion here:

Overview MRT

Basically it is just these two functions

protected override void Initialize()
{
    //Create a mesh to render, in our case a simple rectangle / quad
    quadRenderer = new QuadRenderer();

    //Create our rendertargets!
    renderTarget1 = new RenderTarget2D(GraphicsDevice, width, height);
    renderTarget2 = new RenderTarget2D(GraphicsDevice, width, height);

    //Create our Rendertargetbinding
    renderTargetBinding[0] = renderTarget1;
    renderTargetBinding[1] = renderTarget2;

    base.Initialize();
}

protected override void Draw(GameTime gameTime)
{
    //Set our renderTargetBinding as the target!
    GraphicsDevice.SetRenderTargets(renderTargetBinding);

    //Apply our shader. It should make the first rendertarget red, and the second one blue.
    //You find the shader in content/BasicShader.fx (effect field name assumed here)
    basicShaderEffect.CurrentTechnique.Passes[0].Apply();

    //Draw our mesh! It's just a fullscreen quad / rectangle (We can use spritebatch, see second link)
    quadRenderer.RenderQuad(GraphicsDevice, -Vector2.One, Vector2.One);

    //Set our backbuffer as RenderTarget
    GraphicsDevice.SetRenderTarget(null);

    //Draw our rts
    spriteBatch.Begin();
    //Our first rt should be red! It's in the top left corner!
    spriteBatch.Draw(renderTarget1, new Rectangle(0, 0, width / 2, height / 2), Color.White);
    //Our second one should be blue in the bottom right corner
    spriteBatch.Draw(renderTarget2, new Rectangle(width / 2, height / 2, width / 2, height / 2), Color.White);
    spriteBatch.End();

    base.Draw(gameTime);
}


As you can see it’s very basic.

  • Draw a fullscreen quad with MRT, draw red to the first one and blue to the other one
  • Draw the RTs separately to the backbuffer to see what they look like now


struct VertexShaderInput
{
    float3 Position : POSITION0;
};

struct VertexShaderOutput
{
    float4 Position : POSITION0;
};

VertexShaderOutput VertexShaderFunction(VertexShaderInput input)
{
    VertexShaderOutput output;
    output.Position = float4(input.Position, 1);
    return output;
}

struct PixelShaderOutput
{
    float4 Color0 : COLOR0;
    float4 Color1 : COLOR1;
};

PixelShaderOutput PixelShaderFunction(VertexShaderOutput input)
{
    PixelShaderOutput output;
    //The first rendertarget is red!
    output.Color0 = float4(1, 0, 0, 0);
    //The second rendertarget is blue!
    output.Color1 = float4(0, 0, 1, 0);
    return output;
}

technique Technique1
{
    pass Pass1
    {
        VertexShader = compile vs_4_0 VertexShaderFunction();
        PixelShader = compile ps_4_0 PixelShaderFunction();
    }
}

Texture (Object) Space Lighting

I have been thinking about different implementations of sub-surface scattering, and I recalled that before it was done in screen space it was done in texture space.

I’ve looked it up and found this gem from ATI in 2004:


The idea is to not render to the pixels covered by the model on screen (view projection space) but instead to the texture’s uv space.

Shortly after starting to work on it, I found out that this technique is actually used in some game production right now. 

If you want to hear from the guys that actually work on this thing look at this amazing presentation from Oxide games.

Furthermore here are some more thoughts about it

Think of it this way: We render a new texture for the model, and for each pixel we calculate where this pixel would be in the world and how it reacts to light / shadows etc. in the world. Then we render a normal mesh to our view projection space and apply the texture like we would with any other texture – voila – the final lit mesh is on screen!

I thought this was a very interesting way to do things, in part because it is so fundamentally different than our normal form of 3d rendering.

For the sub-surface scattering approximation we would then blur the final lighting, if you want to learn more about that – follow the link provided above.

So I decided to make a fast and dirty ™ implementation of the general idea in monogame.

Here is the result:

Benefits (naive approach)

  • Potentially the biggest benefit of this method is that the final image is very stable.
  • If we create mip-maps for our lighting texture we can reap the benefits of some 2000s magic in form of texture filtering when drawing the model. That means that effectively shader aliasing is not an issue any more.
  • Also, if we use this naive approach and render the model from all sides, we basically don’t have to render it again if the lighting doesn’t change (pre-rendered essentially), and if it only changes a little we can get away with only shading at lower framerates (not particularly useful without async rendering, which comes with dx12 / Vulkan)
  • We can also reuse this texture for copies of the model, for example if we render reflections. Since the lighting calculations are already done our reflections become very cheap. (Disclaimer: Specular lighting should in theory be updated for reflections for example, but it’s plausible enough in games not to do so – see: screen space reflections are generally accepted by players as plausible)
  • We can also enforce a shading budget that is limited by the resolution of the calculated texture.

Downsides (naive approach)

  • We waste a lot of resources on shading stuff that we can’t even see, like backfaces.
  • The rendering quality is only as good as the resolution of our texture. If we zoom way out we render too much detail; when we zoom in the shading becomes blocky.
  • With increasing amount of meshes, our memory needs increase, too.
  • We can’t easily apply screen space effects.
  • Our models cannot have tiling textures.

The road to viability

The downsides are pretty severe, so let’s address them.

First of all, we want to limit our memory consumption, so instead of each mesh having its own texture, let’s use one big texture for all meshes. For each mesh we now have to store where its lighting texture is placed in the big texture. We can create a new address each frame for all meshes in the view frustum.

(So for example we tell our first mesh to cover the first 1024×1024 square in our 8k x 8k texture, so it should know that it can write to / read from [0,0] to [0.125, 0.125])
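The addressing from the example could be sketched like this (tile and atlas sizes taken from the example above; a simple linear slot numbering is assumed):

```python
def slot_to_uv_rect(slot_index, tile_px=1024, atlas_px=8192):
    """Map a tile slot in the big atlas texture to its
    [u0, v0]-[u1, v1] rectangle in normalized coordinates."""
    tiles_per_row = atlas_px // tile_px
    tx = slot_index % tiles_per_row
    ty = slot_index // tiles_per_row
    size = tile_px / atlas_px
    return (tx * size, ty * size, (tx + 1) * size, (ty + 1) * size)
```

Slot 0 is exactly the [0,0]-[0.125,0.125] square from the example; slot 8 wraps to the start of the second row.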

Then we should obviously scale our texturing resolution per mesh depending on distance to the camera – ideally each pixel should cover one texel. It’s important that the UV distribution of the model is uniform!

Then, to not draw invisible pixels, we have to add one pass up-front where we draw all meshes with their virtual texture address as the texture.

(So for example if our mesh’s texture in the virtual texture sits at [0,0]-[0.125,0.125] and the pixel we draw has uv-coordinates of [0.5, 0.5], we draw the color [0.0625, 0.0625])

We can then check each pixel’s color, which is also its position, and mark the pixel in the virtual texture at this very position, so we know it has to be rendered. This step could be done with a compute shader, which is unfortunately not available in Monogame.

Finally, some parts of our model may be closer to the camera than others (think of terrain for example), or some may not be rendered at all when we have huge models, so maybe splitting the model into smaller chunks or using other texturing LODs, like in more advanced virtual texturing / “megatextures”, would be good.

Benefits (improved approach)

  • fixed memory and shading budget dependent on the master texture size
  • consistent quality
  • less wasteful shading, better performance.
  • We can scale our quality by undersampling and it would still look okay-ish, sort of like blurry textures.

Downsides (improved approach)

  • Becomes pretty complicated, the time from nothing to first results to first robust implementation is many times longer than for the usual approaches
    • which is why I didn’t do it. Also monogame doesn’t support compute shaders and I wanted to stick to it.
  • If virtual textures are used, the amount of implementation woes goes up by another mile
  • since we draw only visible pixels, we can’t reuse as much


Implementation Overview

I went for the naive implementation and got it to work in a very short amount of time.

We have to render in two passes.

The first one is the texture (object) space pass. It works basically just like normal forward rendering with a small change to our vertex shader:

Our Output.Position changes from

Output.Position = mul(input.Position, WorldViewProj);

to

Output.Position = float4(input.TexCoord.x * 2.0f - 1.0f, -(input.TexCoord.y * 2.0f - 1.0f), 0, 1);

which may seem familiar if you ever worked with screen space effects. It’s basically the mapping of texture coordinates (which are in [0,1]x[0,1] space) to normalized projection space which is ([-1,1]x[-1,1] and the y is flipped)
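The same mapping in plain Python, just to make the coordinate spaces explicit:

```python
def uv_to_ndc(u, v):
    """Map texture coordinates [0,1]x[0,1] to normalized projection
    space [-1,1]x[-1,1], with the y axis flipped."""
    return (u * 2.0 - 1.0, -(v * 2.0 - 1.0))
```

The top-left corner of the texture (uv 0,0) lands at the top-left of the render target (ndc -1,1), so the lighting texture fills the whole target.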

In the video above you can see this output texture in the bottom left corner.

In the final pass we render our mesh with a traditional vertex shader again

DrawBasicMesh_VS DrawBasicMesh_VertexShader(DrawBasicMesh_VS input)
{
    DrawBasicMesh_VS Output;
    Output.Position = mul(input.Position, WorldViewProj);
    Output.TexCoord = input.TexCoord;
    return Output;
}

We just need position and texture uv as input, since our pixel shader is even simpler and just reads the texture drawn in the pass before (a one-liner):

return Texture.Sample(TextureSamplerTrilinear, input.TexCoord);

In the video above you can see a new GUI I’ve used, you can find it here

The implementation was super easy and quick. Check it out if you are looking for one to use in monogame!





Screen Space Emissive Materials

Hi guys,

today I want to talk about screen space emissive materials, a relatively simple technique I implemented which allows for some real-time lighting in a deferred engine (along with some problems).

Emissive materials in real life would for example be fluorescents. But they can also be used as lamp/light shapes that are hard to approximate with simple point lights. You can see some examples at the end of this blog entry.


So, I was just coming off of implementing screen space ambient occlusion (SSAO) into my deferred engine, along with trying to make screen space reflections (SSR) work.

I haven’t worked with ray marching in screen space before but their power became apparent immediately.

So I went ahead and implemented something I have been thinking about for quite some time – screen space emissive materials.

The idea is pretty easy – just ray march diffuse and specular contribution for each pixel.

Per Pixel Operations

First question would be – which pixels?


The pixels used are bound by a sphere – similar to normal point lights in a deferred rendering engine (We don’t want to check every pixel on the screen when the model is only covering a small fraction). I simply take the Bounding Sphere of the model (see the smaller circle around the dragon) and multiply it by some factor, depending on the emissive properties of the material.

Then I raymarch a number of times along some random vectors in a hemisphere around the pixel’s normal to get the diffuse contribution. If a ray hits the emissive material I add some diffuse contribution to the pixel.

For the specular contribution I reflect the incidence vector (camera direction) on the normal and raymarch and check if I hit something. I actually use more than one reflection vector – depending on the roughness of the material this is more of a cone actually.
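A stripped-down sketch of one such march (the hit predicate, step count and step size are made up for illustration; the real version works on screen-space buffers, not world-space predicates):

```python
def raymarch_hit(origin, direction, is_emissive_at, steps=8, step_size=0.5):
    """March along a ray in fixed steps; return the index of the first
    step that lands inside the emissive geometry, or -1 on a miss."""
    p = list(origin)
    for i in range(steps):
        p = [c + d * step_size for c, d in zip(p, direction)]
        if is_emissive_at(p):
            return i
    return -1
```

Run this for several hemisphere directions per pixel and accumulate a diffuse term for every hit; for specular, do the same along the (roughness-jittered) reflection vector.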









First results

Here is an early result. I think it looks pretty convincing.


Now, there are 3 major problems with the way I described it above:

  • If the emissive material is obstructed there is no lighting happening around it (think of a pillar in front)
  • If the emissive material is outside the screen space there is no lighting happening
  • The results are very noisy (see above)



There is a pretty easy solution for the first problem – draw the emissive mesh to another rendertarget and check against that one when ray marching.

In my case I went with the approach to save world space coordinates for the meshes (translated by the origin of the mesh, so precision is good). I draw the model on a new rendertarget so the scene depth is not considered and cannot obstruct.

One could go with a depth map here, but I went with this approach this time.

This makes depth comparison pretty trivial, but it may not be the most efficient solution.

Note: For each light source I clear the emissive depth/world position map, draw the object, and then calculate lighting and add it to the lighting buffer. This way emissives cannot obstruct each other and I can optimize the lighting steps for each individual mesh.



To combat the noise, instead of sampling in a random direction, we can sample only in the direction of the bounding box/sphere of the model. With the same amount of samples we get much smoother results.

Apart from that – all of the techniques that help SSAO and SSR can be applied here. Bilateral blur would be a prime example here.

Another often used solution that helps here is to actually change the noisy vectors per frame and then use some temporal accumulation to smooth out the results.

Simply more samples per pixel is the most simple solution obviously, but often performance limitations do not allow for that.

Screen Space Limitations


The good old “it only applies to screen space” problem:

As soon as the materials aren’t visible any more, the whole thing basically breaks, since we can’t ray march against anything any more.

Philippe Rollin on twitter (@prollin) suggested to “render a bigger frame but show only a part of it”

This would be a performance problem if we had to render the whole frame in a bigger resolution, but since we draw the emissive material to another texture we can use a neat trick here:

We can draw the emissive materials with another view projection – specifically with a larger field of view than the normal camera field of view (for example factor 2).

Then, when calculating the lighting, we reproject our screen coordinates to the new view*projection matrix to sample from there. Barely any cost.

Now, the local resolution goes down a bit, but, for a factor of 2 for example, it is not noticeable at all.
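To illustrate why the wider projection helps, here is a toy perspective projection of just the x coordinate (a simple symmetric frustum; the specific numbers are made up):

```python
import math

def project_x(view_x, view_z, fov_y, aspect):
    """Project a view-space x coordinate to NDC x with a standard
    perspective projection (f = cot of the half field of view)."""
    f = 1.0 / math.tan(fov_y * 0.5)
    return (f / aspect) * view_x / view_z

# A point well off to the side of the camera:
ndc_normal = project_x(2.0, 1.0, math.radians(60), 16 / 9)   # outside [-1, 1]
ndc_wide = project_x(2.0, 1.0, math.radians(120), 16 / 9)    # back inside [-1, 1]
```

The same point that falls off-screen with the normal 60° camera lands comfortably inside the doubled field of view, so the emissive mesh can still be found by the ray march.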

To address this issue one could change the alternate field of view depending on how much “out of view” the meshes are, but I found the results to be good enough with a constant factor 2.

Note: Simply changing the FOV is pretty naive. It would be more beneficial to also change the aspect ratio so the amount of additional coverage to the top/bottom is equal to the sides. A larger FOV gives proportionally more coverage in x-direction than in y direction if the aspect ratio is > 1. This should be adjusted for.

That is all great, but it won’t help when the emissive mesh is behind the camera. There is no way to beat around the bush, you won’t have any effect then.


Can anything be done about that?

Well you can always draw several emissive rendertargets with different orientations/projections and then check against each one of them (as is suggested for so many screen space effects), but this is honestly not viable in terms of performance.

What I would rather suggest is a fade to a deferred light with similar strength. Not optimal, but people overlook so many rendering errors and discontinuities it might work? I don’t know.


So yeah, that’s all thanks for reading, I hope you all enjoyed it. Bye :)


(click on the image above for a large view. Note that SSAO creates black shadows below the dragon, which obviously doesn’t make any sense with an emissive material)

Performance is relatively bad right now. As you can see in the image above, the emissive effect (which at that proximity covers all pixels) costs ~15 ms for one material alone at ~1080p (on a Radeon R9 280).

The SSEM is rendered in full resolution with 16 samples with 8 ray march steps for diffuse, and 4 samples with 4 steps for specular.

There is a lot of room for improvements – as mentioned above. Especially diffuse doesn’t have to be rendered full-resolution, a half or even quarter resolution with bilateral blur and upscale would most likely have little impact on visual fidelity.

A smaller sample count makes the results more noisy – but that can be helped with some blur also, especially for diffuse.


In this picture we have many more emissive materials, at 26 ms total at ~1080p. Because the screen coverage / overdraw is relatively low, the performance is not much worse.

Conclusion and further research

I presented a basic solution for rendering emissive materials in real time applications and proposed possible solutions / work-arounds for typical issues with screen space ray marching based algorithms. It does not need precomputation and all meshes can be transformed during runtime.

I am not sure whether or not this can actually be viable in high performance applications, but I am confident the rendering cost can be much improved.

I am sorry for not providing any code or pseudo code snippets, maybe I’ll update the article eventually.

A possible extension of this method would be to have textured materials and read out the color at the sampling position to add to the lighting contribution. This would greatly extend the use of the implementation but bring a number of new problems with it, for example when some color values are occluded.

Making the rendering more physically based would be another goal, currently there are many issues with accuracy and false positives/negatives based on wrong ray marching assumptions in my version.

I hope you enjoyed the read, and if you like you can track the progress of the test environment/engine here and find a download on github here: