Hi guys,
this one I almost missed. True to its name and heritage it hid from me, and if I hadn't noticed it on someone else's wishlist I would probably never have heard about Shadow Tactics – Blades of the Shogun.
It's basically a fresh take on the good old real-time tactics genre, a spiritual successor to the likes of Commandos – Behind Enemy Lines and Robin Hood: The Legend Of Sherwood. So basically "isometric" party spy action.
Regardless of whether you know these games, I found this one really intriguing.
Luckily for everyone, the developer released a pretty lengthy demo of the game; you'll find it in the Steam store.
The fact that there is a demo is pretty amazing, but what's even better is that the game itself is right up my alley, too. Go check it out. Or read some reviews, there are plenty (even though I missed them all when the game released a month ago).
What makes this game even more interesting to me is that it’s made by local Munich developer Mimimi Productions. Should one of them read this: How about a beer? : P
Anyways, as with every game I play, I pay close attention to its graphics and usually capture some frame breakdowns with Intel GPA or RenderDoc to analyze.
This time I decided to share; I hope you enjoy it.
Frame Breakdown
How do we get there?
This game is made in Unity, so I assume it uses many of the default ways Unity renders things, but there are some unique touches as well.
EDIT: As I went on and on, I found a number of things I am not certain about, sorry. I decided to still release this blog post since I like the game and want to write something about it. I have never personally worked with Unity and am only assuming some things. The default Unity reference deferred rendering setup is a bit different from the one used here.
I will try to make this as friendly to beginners as possible, but first let’s see a quick overview. I will go over these points in more detail afterwards.
In chronological order:
- Depth map generation for shadows and enemy view cones
- Water reflection map (if needed)
- G-Buffer generation
- Deferred shadow mapping
- Lighting
- Water (if needed)
- View cones
- Outlines drawn
- Screen Space Ambient Occlusion
- Particle Effects
- Bloom/Glow (Fullscreen Blur)
- Combine (+tonemap) into final image
- UI Elements (Buttons etc.)
However, since some things are done early but only come to full effect later, I will not talk about everything in chronological order, but rather in a way that is easier to explain and understand.
G-Buffer
Let's start with the G-Buffer generation. This game uses a deferred rendering engine, which means we first store all relevant material information for each pixel and later calculate lighting from that information.
So we basically draw all our objects and store the relevant information, like for example "what color does the material have? How reflective is it?", to a big texture / several textures.
Unfortunately a texture has at most 4 channels (red, green, blue and alpha / transparency), but we need more info, so we write to multiple textures at once.
I’ll break them down:
- Albedo (rgb) – that’s just the texture of the object without any lighting applied
- Occlusion (a) – the baked-in darkening of the object based on the fact that some parts are naturally occluded from ambient light (typical use case: cracks etc. on rocks)
- Specular (rgb) – the brighter this value, the stronger the reflective properties of the material. If it is not gray, it's usually a metal. For example the ninja's sword handle is gold, so it will reflect the environment and light with a golden tint.
- Smoothness (a ) – a super smooth object is basically a mirror, a rough object is for example cloth or dry dirt. This determines the way the material reacts to light.
- Normals (rgb) – which direction does the current pixel face? We need this later for light calculations (how much does it face the light –> how bright is it?)
- Outlines sub (a ) – I falsely presumed these were used for outlines. However, it turns out they are not. It is possible that this is info that is used in the lighting pass, since these meshes are dynamic/moving and do not have a precomputed light map. Maybe these pixels get lit with some (indirect) light approximations, which are not used elsewhere.
- Lightmaps (rgb) – precomputed light bounces and indirect shadowing. These textures were rendered with a very sophisticated lighting model to simulate how light bounces around and distributes in shadows. For static lighting (the sun) and geometry (our objects) we don't have to calculate this every frame, so it is done beforehand. Notice how the stones on the right are lit from below because the sunlight reflected off of the water.
- Non-terrain (a) – I dubbed this one non-terrain, since that's what it is: basically all the meshes except the terrain meshes. I am not sure what exactly it is used for later on. A good use case would be, for example, decals / textures like tracks on a road that should only affect the ground.
- Depth (R ) – the depth of the scene (distance from the camera). Used for light calculations and other stuff.
- Stencil (R ) – discussed later, used for view cones.
I can't help but wonder if the engineers could optimize the G-buffer setup quite a bit, for example by not having an RGB specular buffer. Anyways, let's move on.
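To make the layout above a bit more concrete, here is a rough sketch of one pixel's worth of G-buffer data written down as plain Python. The grouping into render targets follows my reading of the captures; the names are mine, not Unity's or Mimimi's.

```python
from dataclasses import dataclass

# Rough sketch of the render-target layout described above (names are my own).
@dataclass
class GBufferPixel:
    albedo: tuple        # RT0.rgb - raw material color, no lighting
    occlusion: float     # RT0.a   - baked ambient occlusion (cracks on rocks etc.)
    specular: tuple      # RT1.rgb - reflectance color (gold sword handle -> golden tint)
    smoothness: float    # RT1.a   - 1.0 = mirror-like, 0.0 = rough cloth/dirt
    normal: tuple        # RT2.rgb - which direction the pixel faces
    dynamic_mask: float  # RT2.a   - the "outlines sub" channel, exact purpose unclear
    lightmap: tuple      # RT3.rgb - precomputed bounced light and indirect shadowing
    non_terrain: float   # RT3.a   - 1.0 for everything that is not terrain
    depth: float         # separate depth target
    stencil: int         # view-cone occlusion bits, discussed later
```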
Comic / Toon look:
The art style demands a cartoonish look, so black outlines are applied to all relevant meshes (not all meshes, bushes for example look better without).
So every affected mesh (almost all of them) is drawn twice: once slightly enlarged and only in black, and once on top with the correct texture. If you want to learn more about this technique, look up inverted normals / the "inverted hull" outline.
Note: In this game's implementation every mesh is drawn twice, right after one another. For better depth buffer usage and fewer texture changes it would make sense to first draw all the black meshes and all the textured ones afterwards (basically like a depth pre-pass), so all geometry is drawn to the depth buffer first and we can potentially save a lot of pixel draws when we draw the colored meshes afterwards. This should work in most implementations of such a toon rendering.
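For illustration, here is a tiny sketch of the inverted-hull idea in Python (my own reconstruction, not the game's shader): the first pass inflates each vertex along its normal and renders the mesh in flat black, the second pass draws the textured mesh on top.

```python
# Pass 1 of the inverted-hull outline (sketch, not Mimimi's shader):
# push every vertex out along its normal so the black mesh peeks out
# behind the textured mesh drawn in pass 2.
def outline_vertex(position, normal, outline_width=0.02):
    return tuple(p + n * outline_width for p, n in zip(position, normal))

# A vertex facing +X ends up 2 cm further out:
print(outline_vertex((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)))  # (0.02, 0.0, 0.0)
```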
A thought about terrain, should you be inclined to program your own:
The way it is handled here is that the terrain is split into chunks, so each chunk can be efficiently frustum culled (not drawn if not in the field of view of the camera).
In this image I only enabled some chunks of the terrain (so it's easy to see how large they are). For each pixel, the terrain checks the control map on the left: the amount of green determines how much of the grass texture our pixel uses. This way we can blend smoothly between different ground textures; for example, we can add a little bit of red to make it blend with rocks a bit.
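A minimal sketch of that kind of splat-map blend in Python; the channel-to-texture assignment here is my assumption based on the capture, not something I can confirm.

```python
# Blend ground textures by the weights stored in the terrain control map.
# Assumed mapping: red = rock, green = grass, blue = dirt, alpha = sand.
def blend_terrain(splat_rgba, rock, grass, dirt, sand):
    r, g, b, a = splat_rgba
    total = (r + g + b + a) or 1.0              # normalize so the weights sum to one
    return tuple((r * ro + g * gr + b * di + a * sa) / total
                 for ro, gr, di, sa in zip(rock, grass, dirt, sand))

# Mostly grass with a little rock mixed in:
print(blend_terrain((0.2, 0.8, 0.0, 0.0),
                    rock=(0.5, 0.5, 0.5), grass=(0.2, 0.5, 0.1),
                    dirt=(0.4, 0.3, 0.2), sand=(0.8, 0.7, 0.5)))
```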
Note: The terrain is drawn first in this implementation. Don't do that unless you have to! Draw it last, since it potentially covers the whole screen but only parts of it are visible in the end. If we draw the terrain first, we have a lot of pixels that are overdrawn later on (check out "non-terrain" above!), which wastes a lot of resources.
Deferred Shadows and Lighting
Next up we draw our sun shadows to another screen buffer. In a step before our g-buffer creation the scene was rendered from the perspective of the sun, but only the depth of the scene is stored. It looks like this:
Notice how there is no terrain below our objects, so in this game the terrain doesn't cast any shadows (but it can receive them).
In the next step we check every pixel in our base view and measure the distance to the sun / the shadow caster (we check where the pixel is from the viewpoint of the sun). If it is farther away than the value stored in the depth map, it must be in shadow, so we color this pixel black.
We read this map in the next step and if the pixel is not black we perform a light calculation. We can combine the result with the light map buffer to get the scene properly lit. Yay!
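Here is a tiny sketch of that depth comparison in Python. The projection helper is hypothetical, but the logic is the standard shadow-mapping test the captures suggest.

```python
# Standard shadow-map test (sketch): re-project the pixel into the sun's view,
# then compare its distance to the sun with the closest occluder the sun saw there.
def shadow_factor(world_pos, project_into_sun, shadow_map, bias=0.001):
    u, v, depth_from_sun = project_into_sun(world_pos)  # hypothetical projection helper
    stored_depth = shadow_map[v][u]                     # what the sun's depth pass stored
    return 0.0 if depth_from_sun > stored_depth + bias else 1.0  # 0 = in shadow

# Toy example: a 2x2 shadow map with one occluder at depth 0.3.
shadow_map = [[0.3, 1.0],
              [1.0, 1.0]]
project = lambda p: (0, 0, 0.5)  # our pixel lands on texel (0, 0), 0.5 away from the sun
print(shadow_factor((1.0, 0.0, 2.0), project, shadow_map))  # -> 0.0, pixel is shadowed
```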
Normal Transformation and Water
In the next step we transform our normals with some matrix math from world space to view space. That means their basis now aligns with the basis of our camera view. You can look up more information on that if you want (view-space normals), but it basically means that if a pixel faces to the right it's 100% red, if it faces towards the top of the screen it's 100% green, and yellowish in between.
World-space normals can point along the x, y and z axes and so need 3 channels (RGB), but in view space we only need 2, since the third axis points towards us or away from us and adds no new information.
This obviously raises the question of why the normals weren't stored in view space from the get-go; this just adds one more matrix transformation for every pixel. Most deferred engines do just that.
We will need the transformed normals further down the line, for example for screen space ambient occlusion effects. (It should be noted that that can be done with world space normals, too)
The depth is stored in the freed-up blue and alpha channels.
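Conceptually the encode/decode looks roughly like this (my reconstruction, just to show why two channels are enough):

```python
import math

# Rotate a world-space normal into view space and keep only x and y; z can be
# rebuilt later because a visible surface always faces towards the camera.
def encode_view_normal(world_normal, view_rotation):
    vx, vy = (sum(r * n for r, n in zip(row, world_normal))
              for row in view_rotation[:2])        # first two rows of the camera basis
    return (vx * 0.5 + 0.5, vy * 0.5 + 0.5)        # pack [-1, 1] into [0, 1] like a texture

def decode_view_normal(encoded):
    vx, vy = (c * 2.0 - 1.0 for c in encoded)
    vz = math.sqrt(max(0.0, 1.0 - vx * vx - vy * vy))  # reconstruct the third axis
    return (vx, vy, vz)

identity = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]           # camera looking straight ahead
print(decode_view_normal(encode_view_normal((0.0, 0.0, 1.0), identity)))  # (0.0, 0.0, 1.0)
```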
We draw our water on a mesh around our terrain, and we can check against the depth to see how deep the water is at each pixel and how much light passes through (how much of the ground we can see!).
We mark the pixels covered by the water mesh in a stencil buffer and only calculate water for those pixels.
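A hedged guess at how such a depth-based fade might look; I can't confirm the exact formula the game uses.

```python
import math

# The deeper the water column under a pixel (scene depth minus water surface
# depth), the less of the ground shines through.
def ground_visibility(scene_depth, water_surface_depth, murkiness=2.0):
    water_column = max(0.0, scene_depth - water_surface_depth)
    return math.exp(-murkiness * water_column)   # 1.0 = clear view of the ground

print(ground_visibility(5.2, 5.0))   # shallow shore -> ground mostly visible
print(ground_visibility(9.0, 5.0))   # deep water    -> nearly opaque
```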
Another thing that happened before the g-buffer creation was the rendering of the water’s reflection map. You can see the meshes that are mirrored so we see them from below. Note that only the right part of this reflection map is seen in the final render. It is not lit and low detail, because with the water waves etc. it’s hard to make out details anyways.
One can probably imagine that most of this geometry wouldn't have had to be rendered if a mask like the one on top had been used. Since the water doesn't span the whole map anyway (it actually has a mesh that intersects only a little bit into the terrain), one could very cheaply write the water geometry to the stencil buffer and only reflect there.
That said, the reflection map is most likely not updated every frame anyways (because it is so hard to make out details)
Now with some pixel shader magic and pretty textures we get this:
Notice the white foam the water forms near the cliffs. Just like the terrain, it reads that from a global map. I think it looks really good!
View Cones
Now we get to the first signature feature of the game and something that is definitely not stock in any engine: The display of view cones of enemies!
That’s basically the number 1 technical thing that has to work right for gameplay to function at all. And it does just that!
The idea is that the view cone is fully blocked by tall objects, while enemies can peek over objects of medium height, such as bushes or rocks.
Sadly, this is exactly where the inspection tool I used, Intel GPA, fails me: it can't read some of the input textures for these shadows, and if I save them, all the other tools say the file is corrupted. Which is a shame, since this is probably the most interesting feature. RenderDoc, sadly, crashes the game. If someone knows more, please leave a comment or write me a message on Twitter.
I can tell it's a depth map from the perspective of the soldier: the very first pass in the frame renders some proxy geometry to a giant 1024×4096 (yep, four times as tall as it is wide!) texture.
I can't show you what the missing textures look like, but here is the depth buffer for this pass. I've chosen a different scene, because there it becomes clear that this really is the view of the enemy unit.
The depth buffer on the right clearly gives the simplified outlines of the stairs to the left, the lamp in the middle and the crates to the right. We don’t see the house behind that though. Maybe the rest of the shadowing is done with compute shaders, I don’t know.
Another image makes it pretty clear a shadow mapping technique is used: Jaggies. Actually I think this is a bug or misplaced unit, because usually the whole area behind a bush/box is safe for crouching. Maybe this soldier’s eyes are a bit too high.
If I had to implement this technique with shadow mapping, I probably would have used a 2-channel buffer / 2 textures: one depth for the crouchables and one depth for the total vision blockers, then chosen the appropriate texture based on which depth comparison fails. The soldier's eyes are on crotch level anyways.
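In code, that two-layer idea could look something like this (again my own sketch of the proposal above, not what the game does):

```python
# Two depth maps rendered from the guard's eyes: one with only the full vision
# blockers (walls, houses), one that also contains crouch-height cover.
FULL_VIEW, CROUCH_SAFE, HIDDEN = "seen", "safe while crouching", "hidden"

def cone_state(dist_from_guard, blocker_depth, cover_depth, bias=0.01):
    if dist_from_guard > blocker_depth + bias:
        return HIDDEN        # a wall blocks the line of sight entirely
    if dist_from_guard > cover_depth + bias:
        return CROUCH_SAFE   # only a bush or crate is in the way
    return FULL_VIEW

print(cone_state(4.0, blocker_depth=10.0, cover_depth=3.0))  # safe while crouching
```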
Note: Should you look to implement this effect in your game I think shadow volume / stencil shadowing is actually the perfect fit for this kind of thing, I cannot possibly think of any downsides to that and wonder what made Mimimi choose a different path.
(This technique works by extruding the backsides of shadow-casting geometry to the end of the shadow volume (our cone). So we basically make a new mesh where everything inside should be in shadow and draw that to a mask, which contains all shadow casting meshes. The upside to that is that it is pixel-perfect and cheap to compute, it was used in games like DOOM3 for example. It is not used nowadays for normal shadows because it can’t produce good soft shadow results, but we don’t want soft shadows anyways for the view cones)
The actual gameplay “detection” mechanic is probably using raytracing anyways and is most likely independent of the rendering technique.
What I can tell for sure though is this:
When the stencil buffer was created, one of the values represented occlusion for the view cones. On the white values nothing will be drawn. And a nice pattern was created for the bushes to distinguish them from the rest.
The lines on the ground where crouching is needed are supplied by a texture.
Outlines
We already briefly talked about the black outlines for the characters, to give them a comic / toon look. But we also need some colored, thicker outlines for gameplay purposes.
Here is what we want:
You can see the blue ninja, the red enemy, the yellow scrolls and the green plants which we can use for climbing.
The process is similar to the first round: render our meshes a bit thicker than they really are (by extruding them in the normal direction, for example). Curiously, these meshes are drawn twice again. Maybe coverage was not good enough with only one pass.
However, in this case we want smooth outlines that fade out towards the edge. Therefore, in the next pass we just dilate and blur the pixels instead of making the geometry thicker, and we repeat this step. Finally, when we are happy with the thickness, we can subtract the first pass (which had no outlines yet).
Voila – you can see the results to the right.
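Here is a rough Python sketch of that dilate/blur/subtract step as I understand it, operating on a single-channel outline mask; the exact filter the game uses may differ.

```python
# Repeatedly average a 3x3 neighbourhood: this both spreads (dilates) and
# softens the outline mask.
def dilate_blur(mask, passes=2):
    for _ in range(passes):
        h, w = len(mask), len(mask[0])
        out = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                samples = [mask[yy][xx]
                           for yy in range(max(0, y - 1), min(h, y + 2))
                           for xx in range(max(0, x - 1), min(w, x + 2))]
                out[y][x] = sum(samples) / len(samples)
        mask = out
    return mask

# Subtract the original (un-blurred) silhouette so only the soft halo remains.
def soft_outline(original, blurred):
    return [[max(0.0, b - o) for o, b in zip(orow, brow)]
            for orow, brow in zip(original, blurred)]

silhouette = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # a single "thick outline" pixel
print(soft_outline(silhouette, dilate_blur(silhouette)))
```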
Screen Space Ambient Occlusion
Phew, almost done!
We just need to add a few more neat effects on top, for example Screen Space Ambient Occlusion. The basic idea of ambient occlusion is that, even though light is coming from almost all directions (sky, sunlight bouncing around, etc.), some pixels are less likely to receive this ambient light than others. For example, some grass growing between two rocks is probably getting less light than some grass on the top of a mountain. Poor grass.
Anyways, SSAO approximates this effect by checking the surroundings of each pixel and how much it may or may not obstruct light.
The implementation used here is actually doing this in full-resolution for every single pixel. Usually this is done in smaller resolutions and the result is then upscaled, but obviously the final image is cleaner if you do it for every pixel (albeit more expensive in terms of rendering time).
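To give a feel for the idea, here is a heavily reduced sketch; nothing like a production SSAO shader, and certainly not the one the game uses.

```python
# Look at a few neighbours in the depth buffer; if the neighbourhood tends to
# be closer to the camera than our pixel, the pixel sits in a crevice and gets
# darkened.
def ssao(depth, x, y, radius=1, strength=1.0):
    h, w = len(depth), len(depth[0])
    offsets = [(-radius, 0), (radius, 0), (0, -radius), (0, radius),
               (-radius, -radius), (radius, radius)]
    occluded = 0
    for dx, dy in offsets:
        sx = min(w - 1, max(0, x + dx))
        sy = min(h - 1, max(0, y + dy))
        if depth[sy][sx] < depth[y][x] - 0.01:   # neighbour is in front of us
            occluded += 1
    return 1.0 - strength * occluded / len(offsets)  # 1.0 = fully lit

depth = [[1.0, 1.0, 1.0],
         [1.0, 2.0, 1.0],    # centre pixel lies in a pit between rocks
         [1.0, 1.0, 1.0]]
print(ssao(depth, 1, 1))     # -> 0.0, heavily occluded
```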
Particles
Well, so far we have only dealt with opaque geometry, but in a deferred renderer, particles and see-through geometry have to be drawn on top of the lit world. Nothing fancy; here is the result (see the glowing beam into the sky next to the tutorial scroll or the smoke at the bottom of the screen).
Bloom and Combine
In the next pass we draw the outlines we created previously on top of the image. I think that’s obvious enough that I don’t have to provide a picture. It also appears that all other outlines/edges get a blur applied to make the image smoother (anti-aliased). Unfortunately I don’t have information about some textures again, so I have no way of knowing if the previous frame is combined into that to make it a temporal solution. Evidence speaks against that, however, since we have never computed a velocity buffer (how fast is each pixel moving), so it is probably just an edge blur. Pretty effective though.
Then we want to apply a bloom filter to make the bright spots in our image “glow” a little. We do that by extracting the bright spots of the image and blurring them.
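The bright-spot extraction could look roughly like this, a sketch of a typical threshold bright-pass; I don't know the exact threshold or curve the game uses.

```python
# Keep only pixels above a brightness threshold; everything else goes black.
# The surviving pixels are what gets blurred and added back on top later.
def bright_pass(color, threshold=1.0):
    luminance = 0.2126 * color[0] + 0.7152 * color[1] + 0.0722 * color[2]
    if luminance <= threshold:
        return (0.0, 0.0, 0.0)
    scale = (luminance - threshold) / luminance
    return tuple(c * scale for c in color)

print(bright_pass((0.4, 0.4, 0.4)))   # ordinary pixel -> black, no glow
print(bright_pass((4.0, 3.5, 2.0)))   # HDR highlight -> survives and will bloom
```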
I almost wanted to leave out any image but actually this blur implementation goes crazy at one point with some stitching operations on top. Just watch ^^
First of all, the image gets split into parts that are affected and then blurred 4 times (2 times horizontally and 2 times vertically). The bright spots are marked in the alpha channel (hard to show), which is blurred as well. For the final image (the black one) I show only the parts that have a value in the alpha channel, meaning the things that will appear blurry later.
I wonder why the image is stitched together for the final step; I haven't seen this before in a blur operation. Interesting. This kind of stitching is usually done for texture maps, to make sure that there are no visible seams at the edges of UV mesh parts (you have to "unwrap" a model onto a flat texture, but some parts have to be split).
Finally, we have rendered everything in 3D.
If you know something about high dynamic range, then you probably suspect that I have been lying in a lot of my images. That would be correct, since the lighting is done in HDR. That means the color range is much greater than what we want to / can display, and we have to bring it down to a good level with "tonemapping". The original images were very dark and overly bright in some spots with very high contrast, so I had to manually bring them to appropriate levels in Photoshop.
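Just to illustrate what "bringing it down" means, here is the classic Reinhard curve as a sketch; I don't know which tonemapping operator the game actually uses.

```python
# Compress arbitrarily bright HDR values into the displayable [0, 1) range.
def reinhard(hdr_color, exposure=1.0):
    return tuple((c * exposure) / (1.0 + c * exposure) for c in hdr_color)

print(reinhard((0.5, 0.5, 0.5)))   # mid tones stay roughly where they are
print(reinhard((8.0, 6.0, 4.0)))   # very bright values get squeezed below 1.0
```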
So here is the final gif: Original, Tonemapped and with HUD on top.
Final Verdict
Final notes: I hope I didn't bore you too much.
I actually wanted to go on and on about how things could be optimized (why a deferred shadowing solution that isn't even blurred in screen space and is only used for the sun? etc.), but I think there are many things I don't know, and one of them is how to ship a finished game. So Mimimi is probably right and I am most likely wrong.
I guess this isn't the most technically accomplished title anyways, nor does it want to be, but the final look is just right. I have had so many of these in the pipe (Mad Max has pretty interesting rendering!), but this is the first time I actually released a technical overview. If you like it, I might make more.
Speaking of which, I planned to make an actual video about Crysis (1) and its myriad of little technical wonders, but writing a script etc. takes ages. This write-up took me a lot more hours than I would have expected.
The technology is one side, but the art is another. And this game has a lot of beautiful art. I didn’t like it at first when I watched a review video, but in person it’s really nice and engaging. The UI and sounds are good, too. I love the main menu and the character portraits.
Now I gotta grab the full version when it comes around, because this seems like a worthy follow-up to the old favorites.
Cool breakdown :) let’s have a beer! Contact us via shadowtactics@mimimi-productions.de :)
done :)
This is insanely good! I would LOVE to see more!
Hey, btw, I’ve once written an article about bloom in Battlefield 3. They use quite a neat technique to make sure their bloom extends all the way into space. If you are interested, check it out:
http://www.moddb.com/games/zombie-hunter-inc/news/zombie-hunter-inc-how-we-made-an-hq-bloom-system-like-in-battlefield-3