Fun Shaders – Part 2

Here is a short overview of some casual shaders I’ve created in the last week or so.


Radial Blur


A fairly simple yet impressive effect: we scale the image around the blur center several times and then average the results.
This can look much better if we repeat the effect on top of the blurred image.

Each extra pass multiplies the effective sample count, so two passes effectively square it. For 10 samples and 3 passes the complete frametime is around 0.5ms on a Radeon 280.
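To see why repeating the pass multiplies the sample count, here is a short derivation. The symbols are mine and not taken from the shader: $c$ is the blur center, $S$ the source image, $B_p$ the result after $p$ passes, $N$ the samples per pass, and $s_i = i \cdot \mathrm{BlurIntensity} / N$ the step of sample $i$. It ignores the snapping to texel centers that point sampling introduces.

```latex
\begin{align*}
B_1(x) &= \frac{1}{N}\sum_{i=0}^{N-1} S\bigl(x + (c - x)\,s_i\bigr)
        = \frac{1}{N}\sum_{i=0}^{N-1} S\bigl(c + (x - c)(1 - s_i)\bigr) \\
B_2(x) &= \frac{1}{N}\sum_{j=0}^{N-1} B_1\bigl(c + (x - c)(1 - s_j)\bigr)
        = \frac{1}{N^2}\sum_{j=0}^{N-1}\sum_{i=0}^{N-1} S\bigl(c + (x - c)(1 - s_i)(1 - s_j)\bigr)
\end{align*}
```

So two passes of $N$ reads each behave like a single pass with $N^2$ scale factors (some of them coincide), and $p$ passes give $N^p$ taps for only $pN$ reads per pixel. With $N = 10$ and $p = 3$ that is 1000 taps for 30 reads.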

My shader:

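    float4 PixelShaderFunction(VertexShaderOutput input) : COLOR
    {
        // Vector from the current pixel towards the blur center
        float2 vec = MousePosition - input.TexCoord;

        // Start from zero so the accumulation below is well defined
        float4 color = 0;

        //This is done in a static global instead
        //float invBlurSamples = 1.0f / BlurSamples;

        // Step along the vector and accumulate the samples
        for (int i = 0; i < BlurSamples; i++)
            color += Screen.Sample(texSampler, input.TexCoord + vec * i * invBlurSamples * BlurIntensity);

        // Average the accumulated samples
        color *= invBlurSamples;

        return float4(color.rgb, 1);
    }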

Note: I use a point sampler, so I might as well use Load instead of Sample. I have found the difference when using bilinear sampling to be minor, especially compared to the performance impact.
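For reference, a minimal sketch of what the Load variant could look like. It is not the shader I actually run: Resolution is a hypothetical extra parameter and the declarations at the top are assumptions so the snippet stands on its own.

```hlsl
// Assumed declarations; names mirror the shader above where possible.
Texture2D Screen;
float2 Resolution;      // hypothetical: rendertarget size in pixels
float2 MousePosition;   // blur center in texture coordinates
float BlurIntensity;
static const int BlurSamples = 10;
static const float invBlurSamples = 1.0f / BlurSamples;

float4 RadialBlurLoadPS(float4 position : SV_Position, float2 texCoord : TEXCOORD0) : SV_Target
{
    float2 vec = MousePosition - texCoord;
    float4 color = 0;

    for (int i = 0; i < BlurSamples; i++)
    {
        float2 uv = texCoord + vec * i * invBlurSamples * BlurIntensity;

        // Load takes integer texel coordinates. Clamp manually, because an
        // out-of-range Load returns zero instead of following an address mode.
        int2 texel = clamp(int2(uv * Resolution), int2(0, 0), int2(Resolution) - 1);
        color += Screen.Load(int3(texel, 0));
    }

    color *= invBlurSamples;
    return float4(color.rgb, 1);
}
```

The third component of the int3 is the mip level, so 0 reads the full resolution image.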

Bokeh Blur


Out-of-focus blur from cameras creates interesting shapes, which depend on the lens and its aperture.
For more information I’d recommend Wikipedia.

Anyways, creating this effect is pretty challenging from a rendering perspective since it taps into the old “gather vs scatter” problem.
The idea of this blur is that each point expands into a certain shape, for example a hexagon (which you can see to the right).

The problem lies in the fact that a pixel shader has a fixed pixel coordinate that it has to work with. So for each given pixel we can:

  • Read other pixels from another texture

But we cannot

  • Read pixels from the rendertarget we are writing to
  • Move the current pixel around

So the problem is that we can basically only “gather” info, but not expand our pixel.

Therefore a normal blur implementation would look up the neighbor pixels and average them out.

For a bokeh implementation one could read all neighbor pixels in an x*x neighborhood (x being the width/height of our bokeh texture), check whether our current pixel is part of the bokeh shape when viewed from the other pixel’s perspective, and then average accordingly.

The trouble then is that we cannot have very large bokeh shapes. If our shape were to be 100×100 pixels that would be 10000 texture reads per pixel. Absolutely impossible to compute in real time.
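A minimal sketch of such a gather pass, to make the cost visible. This is an illustration under assumptions, not code from my project: all names are hypothetical, the bokeh size is the same for every pixel, and the result is normalized by the summed shape weights.

```hlsl
// Assumed declarations; all names here are hypothetical.
Texture2D Screen;        // scene color
Texture2D BokehShape;    // bokeh mask, e.g. a white hexagon on black
SamplerState pointSampler;
SamplerState linearSampler;
float2 TexelSize;        // 1 / rendertarget resolution
static const int BokehSize = 9; // shape width/height in pixels (x in the text)

float4 BokehGatherPS(float4 position : SV_Position, float2 texCoord : TEXCOORD0) : SV_Target
{
    float3 color = 0;
    float weightSum = 0;
    int halfSize = BokehSize / 2;

    // BokehSize * BokehSize reads of each texture per pixel
    for (int y = -halfSize; y <= halfSize; y++)
    {
        for (int x = -halfSize; x <= halfSize; x++)
        {
            float2 offset = float2(x, y);

            // Color of the neighbor whose bokeh shape may cover us
            float3 neighbor = Screen.SampleLevel(pointSampler, texCoord + offset * TexelSize, 0).rgb;

            // Seen from that neighbor we sit at -offset, so look up the
            // shape there (0.5 is the center of the shape texture)
            float2 shapeUV = 0.5f - offset / BokehSize;
            float weight = BokehShape.SampleLevel(linearSampler, shapeUV, 0).r;

            color += neighbor * weight;
            weightSum += weight;
        }
    }

    return float4(color / max(weightSum, 0.0001f), 1);
}
```

With BokehSize = 9 this is already 81 iterations per pixel, and the count grows with the square of the shape size.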

So the other approach then is to actually expand each of our pixels.


To do that we simply draw a quad for each pixel. This quad has the bokeh shape multiplied by the pixel’s color. Voila. Done.

Obviously this doesn’t sound like it would perform very well, and it’s true, it doesn’t.

However, if we cut our resolution down to 1/4th (1/2 width, 1/2 height) this approach becomes interactive already (depending on the quad size).

In my implementation I prepare a vertex buffer with all the quads combined and draw everything in one drawcall. I read the pixel’s color in the vertex shader and the bokeh shape in the pixel shader.
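The split between the two shader stages could look roughly like the sketch below. It is a simplified illustration, not my actual code: the vertex layout, all names, and the clip-space mapping are assumptions, and additive blending is expected to be set up on the CPU side.

```hlsl
// Assumed declarations; all names here are hypothetical.
Texture2D Screen;        // downsampled scene color
Texture2D BokehShape;    // bokeh mask texture
SamplerState pointSampler;
SamplerState linearSampler;
float2 TexelSize;        // 1 / resolution of the downsampled scene
float BokehSize;         // quad width/height in pixels

struct QuadVertex
{
    float2 PixelUV : TEXCOORD0; // center of the source pixel, same for all 4 corners
    float2 Corner  : TEXCOORD1; // (-1,-1), (1,-1), (-1,1) or (1,1)
};

struct QuadOutput
{
    float4 Position : SV_Position;
    float3 Color    : COLOR0;
    float2 ShapeUV  : TEXCOORD0;
};

QuadOutput BokehScatterVS(QuadVertex input)
{
    QuadOutput output;

    // One color read per vertex instead of one per covered pixel
    output.Color = Screen.SampleLevel(pointSampler, input.PixelUV, 0).rgb;

    // Push the corner outwards by half the quad size, in texture space
    float2 uv = input.PixelUV + input.Corner * 0.5f * BokehSize * TexelSize;

    // Texture space (0..1, y down) to clip space (-1..1, y up)
    output.Position = float4(uv.x * 2 - 1, 1 - uv.y * 2, 0, 1);
    output.ShapeUV = input.Corner * 0.5f + 0.5f;
    return output;
}

float4 BokehScatterPS(QuadOutput input) : SV_Target
{
    // The shape masks the quad; the blend state adds the result to the target
    float shape = BokehShape.Sample(linearSampler, input.ShapeUV).r;
    return float4(input.Color * shape, 1);
}
```

The sketch does not divide the color by the quad area, so larger quads make the image brighter. A real implementation would likely normalize for that.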

However, we have to deal with massive overdraw: each pixel is potentially affected thousands of times, depending on how large our quad is.

Not only is that bad because of fillrate, but we also run into some heavy precision problems when trying to blend (additively). Even with FP16 rendertargets precision problems crop up pretty fast when dealing with giant quads.
The issue is visible even with this highly compressed .gif:

However, switching to a 32bit rendertarget impacts the performance in an unacceptable way, so that is no real option.
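A rough estimate shows why FP16 runs out of precision here. The assumptions are mine: each quad's contribution is divided by the quad area $s^2$ to preserve overall brightness, and the accumulated value in the rendertarget has reached about 1.

```latex
\begin{align*}
\text{FP16 spacing in } [1, 2) &= 2^{-10} \approx 0.00098 \\
\text{largest contribution of one quad with } s = 64 &= \frac{1}{s^2} = \frac{1}{4096} = 2^{-12} \approx 0.00024
\end{align*}
```

Once the sum reaches 1, a single contribution is only a quarter of the smallest representable step, so it gets rounded away and the accumulation stalls. A 32bit float target has a spacing of $2^{-23}$ in the same range, which is why it does not show the problem.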

So the solution I came up with is to resize the buffers after set thresholds.

E.g. once BokehSize exceeds 5.0f I downsample the rendertarget (and with it the number of quads) another time and rescale the quads to fit again. That gives very good performance along with little enough overdraw to not need 32bit rendertargets.
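A back-of-the-envelope count shows why another downsample helps so much. Assume a rendertarget of $W \times H$ pixels with one quad of $s \times s$ pixels per pixel (the symbols are mine):

```latex
\begin{align*}
\text{before:} \quad \text{quads} &= W H, & \text{overdraw per pixel} &\approx s^2 \\
\text{after halving } W,\ H \text{ and } s: \quad \text{quads} &= \frac{W H}{4}, & \text{overdraw per pixel} &\approx \frac{s^2}{4}
\end{align*}
```

The quads still cover the same area on screen, but the total number of blended pixels drops from $W H s^2$ to $W H s^2 / 16$. Each pixel also accumulates only a quarter as many contributions, which eases the FP16 precision issue.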

The transition is noticeable in some high-frequency areas, and could probably be improved if the base rendertarget were downscaled with priority on standout colors.


Resources used

I didn’t use any templates for this, but I have stumbled upon this implementation by MJP.
It uses a stock blur for most parts of the image and only applies the bokeh effect on extracted highlights (similar to a bloom extract).


“Technic” Effect

A simple yet fun shader that looks sort of “techy”, a bit like currents flowing through a microchip (as an engineer this feels wrong).


There’s really not much to it, just a simple expand pixel shader. I store the current age and direction of a lit pixel in the alpha value and read all 4 direct neighbors of black pixels to see if they are “next” to be part of the current line.

Additionally I have a random value (either static per pixel or truly random) that determines whether or not the line will split in two (changing directions along the way – from horizontal to vertical and vice versa) or end.

The appeal originally was to use the 8 bits in the alpha to store information efficiently, but it turned out I didn’t need that, since a pixel is either black, freshly spawned with new direction information for neighbors, or fading into darkness (no neighbor will be affected).
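To make the three pixel states concrete, here is a heavily simplified sketch of one growth step. It is not my actual shader: the alpha encoding, all names, and the hash function are assumptions, and both the splitting of lines and the seeding of new lines are left out.

```hlsl
// Assumed declarations and encoding; all names here are hypothetical.
// rgb = brightness, a = direction of a freshly lit pixel:
// 0 = none, 0.25 = right, 0.5 = down, 0.75 = left, 1 = up
Texture2D PreviousState;
SamplerState pointSampler;
float2 TexelSize;   // 1 / resolution
float FadeStep;     // brightness lost per frame, e.g. 0.02
float EndChance;    // share of pixels that stop a line

// Cheap pseudo random value in 0..1, static per pixel
float Hash(float2 p)
{
    return frac(sin(dot(p, float2(12.9898f, 78.233f))) * 43758.5453f);
}

// True if the pixel at uv is fresh and carries the given direction code
bool Grows(float2 uv, float direction)
{
    return abs(PreviousState.SampleLevel(pointSampler, uv, 0).a - direction) < 0.1f;
}

float4 TechnicPS(float4 position : SV_Position, float2 texCoord : TEXCOORD0) : SV_Target
{
    float4 self = PreviousState.SampleLevel(pointSampler, texCoord, 0);

    // Lit pixels only fade; clearing alpha stops them from affecting neighbors
    if (self.r > 0)
        return float4(saturate(self.rgb - FadeStep), 0);

    // Some black pixels refuse to light up, which ends the line there
    if (Hash(texCoord) < EndChance)
        return float4(0, 0, 0, 0);

    // A line moving right reaches us from the left neighbor, and so on
    float direction = 0;
    if (Grows(texCoord - float2(TexelSize.x, 0), 0.25f)) direction = 0.25f;
    else if (Grows(texCoord - float2(0, TexelSize.y), 0.5f)) direction = 0.5f;
    else if (Grows(texCoord + float2(TexelSize.x, 0), 0.75f)) direction = 0.75f;
    else if (Grows(texCoord + float2(0, TexelSize.y), 1.0f)) direction = 1.0f;

    return direction > 0 ? float4(1, 1, 1, direction) : float4(0, 0, 0, 0);
}
```

A pixel carries its direction for exactly one frame, so each line advances by one pixel per pass.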

Anyways, stuff like this is always fun.

Spring Particles

Similar to the Bokeh effect, I draw one quad per pixel of a rendertarget onto the screen, so potentially millions of quads.


In the image above I used 128 x 80 particles.

On two separate rendertargets I store the current position (rg) and the current velocity (ba) of each particle. I switch these RTs to read back from each other and simulate simple spring equations, with k being the spring constant and a damping factor (you can see that in the video at the beginning of the post).
Additionally there are extra attractors/repellents, which I map to the mouse coordinates.
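A minimal sketch of what such an update pass could look like. It is an illustration under assumptions, not my actual shader: all names are hypothetical, position and velocity are packed into one float4 per texel, each particle rests at its own texture coordinate, and the rendertarget is a floating point format so that negative velocities survive.

```hlsl
// Assumed declarations; all names here are hypothetical.
Texture2D PreviousState;   // rg = position, ba = velocity, one texel per particle
SamplerState pointSampler;
float2 MousePosition;      // attractor position, same space as the particle positions
float SpringConstant;      // k
float Damping;             // velocity-proportional damping factor
float AttractorStrength;   // > 0 attracts, < 0 repels
float DeltaTime;           // time step in seconds

float4 SpringUpdatePS(float4 svPosition : SV_Position, float2 texCoord : TEXCOORD0) : SV_Target
{
    float4 state = PreviousState.SampleLevel(pointSampler, texCoord, 0);
    float2 position = state.rg;
    float2 velocity = state.ba;

    // Each particle rests at its own texel coordinate
    float2 restPosition = texCoord;

    // Hooke's law plus damping: F = -k * x - d * v
    float2 force = -SpringConstant * (position - restPosition) - Damping * velocity;

    // Pull towards the mouse that weakens with distance,
    // softened to avoid a division by zero
    float2 toMouse = MousePosition - position;
    force += AttractorStrength * toMouse / (dot(toMouse, toMouse) + 0.001f);

    // Semi-implicit Euler: update the velocity first, then move with it
    velocity += force * DeltaTime;
    position += velocity * DeltaTime;

    return float4(position, velocity);
}
```

After each pass the roles of the two rendertargets are swapped, so the output of this frame becomes PreviousState in the next one.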
In the .gif below I have one particle per pixel, so a total of 1024000.


The interesting thing about this kind of simulation is that the GPU essentially doesn’t care about a 1280×800 simulation if the math inside is pretty simple.
On the CPU this would be nearly impossible, while I can run this at several hundred frames per second with one particle per pixel on the GPU. And the main blocker is probably overdraw / blending anyways.

It’s actually not the first time I’ve done this. A very similar approach was used when I created the grass simulation for Bounty Road. That one is even more complex, with complicated wind functions affecting each sample point.

You can see an old video about that here:

Anyways, I hope you liked this short overview :)
