Depth Of Field demo
-------------------

Run:
-----
Copy DepthOfField.exe (from the Debug or Release directory) to the solution's directory and run it.

Build:
------
Open the solution (DepthOfField.sln) and build it.

Controls:
---------
The mouse controls the camera orientation.
The Focal Distance parameter controls the focal depth of the simulated camera.

The implemented method:
-----------------------
Depth of Field with Simulation of Circle of Confusion

Multiple rays scattered from a given point on an object will pass through the camera lens, forming a cone of light. If the object is in focus, all rays will converge at a single point on the image plane. However, if a given point on an object in the scene is not near the focal distance, the cone of light rays will intersect the image plane in an area shaped like a conic section. Typically, the conic section is approximated by a circle called the circle of confusion.

The circle of confusion diameter b depends on the distance of the plane of focus and the lens aperture setting a (also known as the f-stop). For a known focus distance and lens parameters, the size of the circle of confusion can be calculated as:

    b = (D * f * (zfocus - z)) / (zfocus * (z - f))

where D = f / a is the lens diameter, f is the focal length of the lens, z is the distance of the point from the lens and zfocus is the focus distance. Any circle of confusion greater than the smallest point a human eye can resolve contributes to the blurriness of the image that we see as depth of field.

The methods presented in this section are all post-processing methods, which means that they consist of two main phases. In the first phase, the scene is rendered into an off-screen buffer one or more times. In the second phase, the final image is computed from the off-screen image buffers using some type of depth-controlled blurring.

Phase 1:

First, the whole scene is rendered, outputting the depth and a blurriness factor (which describes how much each pixel should be blurred) in addition to the resulting scene color.
This can be accomplished by rendering the scene to multiple buffers at one time. DirectX® 9 has a useful feature called Multiple Render Targets (MRT) that allows simultaneous shader output into multiple renderable buffers. Using this feature gives us the ability to output all of the data channels (scene color, depth and blurriness factor) in our first pass. One of the MRT restrictions on some hardware is the requirement for all render surfaces to have the same bit depth, while allowing use of different surface formats. Guided by this requirement we can pick the D3DFMT_A8R8G8B8 format for the scene color output and the two-channel texture format D3DFMT_G16R16 for the depth and blurriness factor. Both formats are 32 bits per pixel, and provide us with enough space for the necessary information at the desired precision.

The implementation of the first phase:

The pixel shader of the scene rendering pass needs to compute the blurriness factor and output it along with the scene depth and color. To abstract from the different display sizes and resolutions, the blurriness is defined to lie in the 0..1 range. A value of zero means the pixel is perfectly sharp, while a value of one corresponds to the pixel of the maximal circle of confusion size. The reason behind using the 0..1 range is twofold. First, the blurriness is not expressed in terms of pixels and can scale with resolution during the post-processing step. Second, the values can be directly used as sample weights when eliminating “bleeding” artifacts.
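As a rough CPU-side illustration of how a thin-lens circle of confusion could be turned into a 0..1 blurriness value, here is a minimal C++ sketch. It is not the demo's shader code. The function names and the pixelsPerUnit parameter are hypothetical, and taking the absolute value of the formula is an assumption (the formula is signed, while a diameter is a magnitude).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Thin-lens circle of confusion diameter, in the same unit as focalLength.
// z and zFocus are distances from the lens; fStop is the aperture setting a.
// The absolute value is an assumption: the formula is signed, but a
// diameter is a magnitude.
inline double CircleOfConfusion(double z, double zFocus, double focalLength, double fStop)
{
    assert(z > focalLength && zFocus > 0.0 && fStop > 0.0);
    const double lensDiameter = focalLength / fStop; // D = f / a
    return std::fabs(lensDiameter * focalLength * (zFocus - z) /
                     (zFocus * (z - focalLength)));
}

// Convert a CoC diameter to the 0..1 blurriness factor.
// pixelsPerUnit (hypothetical) scales the diameter to pixels for a given
// resolution and display size; maxCoCPixels plays the role of maxCoC.
inline double Blurriness(double coc, double pixelsPerUnit, double maxCoCPixels)
{
    assert(maxCoCPixels > 0.0);
    const double cocPixels = coc * pixelsPerUnit;
    return std::min(std::max(cocPixels / maxCoCPixels, 0.0), 1.0);
}
```

For example, with a 0.05 m lens at f/2 focused at 5 m, a point exactly at 5 m yields a blurriness of zero, and the value grows toward one as the point moves away from the focal plane in either direction.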
For each pixel of a scene, this shader computes the circle of confusion size based on the formula provided in the preceding discussion of the thin lens model. Later in the process, the size of the circle of confusion is scaled by the factor corresponding to the size of the circle in pixels for a given resolution and display size. As a last step, the blurriness value is divided by the maximal desired circle of confusion size in pixels (variable maxCoC) and clamped to the 0..1 range. Sometimes it might be necessary to limit the circle of confusion size (through the variable maxCoC) to reasonable values (e.g. 10 pixels) to avoid sampling artifacts caused by an insufficient number of filter taps.

    struct VS_INPUT
    {
        float4 Position  : POSITION;
        float3 Normal    : NORMAL;
        float3 Binormal  : BINORMAL;
        float3 Tangent   : TANGENT;
        float4 TexCoord0 : TEXCOORD0;
    };

    struct VS_OUTPUT
    {
        float4 hPosition : POSITION;   // point in normalized device space before homogeneous division
        float2 TexCoord  : TEXCOORD0;  // texture coordinates
        float3 tView     : TEXCOORD1;  // tangent space view vector
        float3 tLight    : TEXCOORD2;  // tangent space light vector
        float  Depth     : TEXCOORD3;  // depth passed to the pixel shader
    };

    struct PS_OUTPUT
    {
        float4 Color : COLOR0;  // scene color (first render target)
        float4 Depth : COLOR1;  // x = depth, y = blurriness (second render target)
    };

    VS_OUTPUT BumpVS(VS_INPUT IN)
    {
        VS_OUTPUT output;

        // object-space tangent matrix
        float3x3 Tan = float3x3(normalize(IN.Tangent), normalize(IN.Binormal), IN.Normal);

        // position in view-space
        float3 P = mul(IN.Position, WorldView);

        // model-space view vector
        float3 mView = mCameraPos - IN.Position;

        // model-space light vector
        float3 mLight = mLightPos - IN.Position;

        // tangent-space view vector
        output.tView = mul(Tan, mView);

        // tangent-space light vector
        output.tLight = mul(Tan, mLight);

        // vertex position before homogeneous division
        output.hPosition = mul(IN.Position, WorldViewProj);

        // tex coordinates passed to pixel shader
        output.TexCoord = IN.TexCoord0;

        // depth taken from the projected position
        output.Depth = output.hPosition.z;

        return output;
    }

    PS_OUTPUT BumpPS(VS_OUTPUT IN)
    {
        PS_OUTPUT output;

        // needs normalization because of linear interpolation
        float3 View = normalize( IN.tView );

        // needs normalization because of linear interpolation
        float3 Light = normalize( IN.tLight );

        // get tangent-space normal from normal map
        float3 Normal = tex2D(BumpMapSampler, IN.TexCoord).rgb;

        // illumination calculation
        output.Color = Illumination(Light, Normal, View, IN.TexCoord, Attenuation(IN.tLight));

        // blurriness in 0..1: distance from the focal plane scaled by focalRange
        float blur = saturate(abs(IN.Depth - focalDist) * focalRange);

        output.Depth = float4(IN.Depth, blur, 0, 0);

        return output;
    }

Phase 2:

During the post-processing phase, the results of the previous rendering are processed and the color image is blurred based on the blurriness factor computed in the first phase. Blurring is performed using a variable-sized filter representing the circle of confusion. To perform image filtering, a simple screen-aligned quadrilateral is drawn, textured with the results of the first phase.

The filter kernel in the post-processing step has 13 samples: a center sample and 12 outer samples. The number of samples can vary, but 12 represents the maximum number of outer samples that can be processed by a 2.0 pixel shader in a single pass.
The post-processing pixel shader computes the filter sample positions based on the 2D offsets stored in the filterTaps array, initialized by the following function:

    void SetupFilterKernel()
    {
        // size of one texel in texture coordinates
        FLOAT dx = 1.0f / (FLOAT)m_ScreenWidth;
        FLOAT dy = 1.0f / (FLOAT)m_ScreenHeight;

        // 12 outer tap offsets inside the unit circle, scaled to one texel
        D3DXVECTOR4 v[12];
        v[0]  = D3DXVECTOR4(-0.326212f * dx, -0.405805f * dy, 0.0f, 0.0f);
        v[1]  = D3DXVECTOR4(-0.840144f * dx, -0.07358f  * dy, 0.0f, 0.0f);
        v[2]  = D3DXVECTOR4(-0.695914f * dx,  0.457137f * dy, 0.0f, 0.0f);
        v[3]  = D3DXVECTOR4(-0.203345f * dx,  0.620716f * dy, 0.0f, 0.0f);
        v[4]  = D3DXVECTOR4( 0.96234f  * dx, -0.194983f * dy, 0.0f, 0.0f);
        v[5]  = D3DXVECTOR4( 0.473434f * dx, -0.480026f * dy, 0.0f, 0.0f);
        v[6]  = D3DXVECTOR4( 0.519456f * dx,  0.767022f * dy, 0.0f, 0.0f);
        v[7]  = D3DXVECTOR4( 0.185461f * dx, -0.893124f * dy, 0.0f, 0.0f);
        v[8]  = D3DXVECTOR4( 0.507431f * dx,  0.064425f * dy, 0.0f, 0.0f);
        v[9]  = D3DXVECTOR4( 0.89642f  * dx,  0.412458f * dy, 0.0f, 0.0f);
        v[10] = D3DXVECTOR4(-0.32194f  * dx, -0.932615f * dy, 0.0f, 0.0f);
        v[11] = D3DXVECTOR4(-0.791559f * dx, -0.597705f * dy, 0.0f, 0.0f);

        g_pPostEffect->SetVectorArray("filterTaps", v, 12);
    }

One of the problems with all post-filtering methods is leaking or "bleeding" of color from sharp objects onto the blurry background. This results in faint halos around sharp objects. The color leaking happens because the filter for the blurry background samples color from nearby sharp objects due to its large size.

To solve this problem, we discard the outer samples that can contribute to leaking according to the following criterion: if the outer sample is in focus and it is in front of the blurry center sample, it should not contribute to the blurred color. This can introduce a minor popping effect when objects go in or out of focus. To combat sample popping, the outer sample's blurriness factor is used as a sample weight to fade out its contribution gradually.
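The weighting rule described above can be sketched on the CPU as follows. This is only an illustrative C++ sketch under stated assumptions, not the demo's shader: the Sample struct and the function names are hypothetical, color is reduced to a single channel for brevity, and a larger depth value is assumed to mean farther from the camera.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-sample data: what the two render targets hold for a pixel.
struct Sample
{
    float color; // single channel for brevity
    float depth; // assumed: larger means farther from the camera
    float blur;  // blurriness factor in 0..1
};

// Weight of an outer tap: taps behind the center sample count fully,
// taps in front of it are weighted by their own blurriness, so a sharp
// foreground tap (blur near 0) fades out instead of leaking.
inline float TapWeight(const Sample& center, const Sample& tap)
{
    assert(tap.blur >= 0.0f && tap.blur <= 1.0f);
    return (tap.depth > center.depth) ? 1.0f : tap.blur;
}

// Weighted average of the center sample and its outer taps.
inline float FilterPixel(const Sample& center, const std::vector<Sample>& taps)
{
    float colorSum = center.color; // the center sample always has weight 1
    float totalWeight = 1.0f;
    for (const Sample& tap : taps)
    {
        const float w = TapWeight(center, tap);
        colorSum += tap.color * w;
        totalWeight += w;
    }
    return colorSum / totalWeight; // normalize to keep luminance
}
```

With a blurry background pixel in the center, a perfectly sharp foreground tap receives weight zero and leaves the result untouched, while a tap that lies farther away than the center is averaged in with full weight.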
The vertex shader is fed a fullscreen quad without texture coordinates. These are calculated in the vertex shader and passed to the pixel shader.

    struct VS_INPUT
    {
        float3 Position : POSITION;
    };

    struct VS_OUTPUT
    {
        float4 hPosition : POSITION;  // point in normalized device space before homogeneous division
        float2 TexCoord  : TEXCOORD0;
    };

    VS_OUTPUT DepthVS(VS_INPUT IN)
    {
        VS_OUTPUT output;

        output.hPosition = float4(IN.Position, 1);

        // map clip-space [-1, 1] to texture space [0, 1]
        output.TexCoord = IN.Position.xy * 0.5f + 0.5f;

        // flip V so that the top of the quad samples the top of the texture;
        // unlike multiplying by -1, this does not rely on wrap addressing
        output.TexCoord.y = 1.0f - output.TexCoord.y;

        return output;
    }

    //------------------------------------------------------------------------------------
    //
    // DoF pixel shader
    //
    //------------------------------------------------------------------------------------

    const float maxCoC = 5;

    float4 DepthPS(VS_OUTPUT IN) : COLOR
    {
        // Get center sample (x = depth, y = blurriness)
        float4 colorSum = tex2D(ColorMapSampler, IN.TexCoord);
        float2 centerDepthBlur = tex2D(DepthMapSampler, IN.TexCoord).xy;

        // Compute CoC size based on blurriness
        float sizeCoC = centerDepthBlur.y * maxCoC;

        float totalContribution = 1.0f;

        // Run through all taps
        for (int i = 0; i < NUM_DOF_TAPS; i++)
        {
            // Compute tap coordinates
            float2 tapCoord = IN.TexCoord + filterTaps[i] * sizeCoC;

            // Fetch tap sample
            float4 tapColor = tex2D(ColorMapSampler, tapCoord);
            float2 tapDepthBlur = tex2D(DepthMapSampler, tapCoord).xy;

            // Compute tap contribution: full weight if the tap is behind the
            // center sample, otherwise weighted by the tap's own blurriness
            float tapContribution = (tapDepthBlur.x > centerDepthBlur.x) ? 1.0f : tapDepthBlur.y;

            // Accumulate color and contribution
            colorSum += tapColor * tapContribution;
            totalContribution += tapContribution;
        }

        // Normalize to get proper luminance
        float4 finalColor = colorSum / totalContribution;

        return finalColor;
    }
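To make the expression filterTaps[i] * sizeCoC in the pixel shader more concrete, the following C++ sketch reproduces the arithmetic on the CPU. It is an illustration only: Float2, MakeFilterTap and TapCoord are hypothetical names, and the code assumes the tap offsets are scaled to one texel as in SetupFilterKernel, so that sizeCoC acts as a radius in pixels.

```cpp
#include <cassert>

struct Float2 { float x, y; };

// Mirrors SetupFilterKernel: an offset inside the unit circle is scaled
// to the size of one texel in texture coordinates.
inline Float2 MakeFilterTap(Float2 unitOffset, int screenWidth, int screenHeight)
{
    assert(screenWidth > 0 && screenHeight > 0);
    const float dx = 1.0f / static_cast<float>(screenWidth);
    const float dy = 1.0f / static_cast<float>(screenHeight);
    return { unitOffset.x * dx, unitOffset.y * dy };
}

// Mirrors the shader: tapCoord = texCoord + filterTap * sizeCoC,
// where sizeCoC = blurriness * maxCoC is the filter radius in pixels.
inline Float2 TapCoord(Float2 texCoord, Float2 filterTap, float blurriness, float maxCoC)
{
    const float sizeCoC = blurriness * maxCoC;
    return { texCoord.x + filterTap.x * sizeCoC,
             texCoord.y + filterTap.y * sizeCoC };
}
```

Under these assumptions, a fully blurred pixel (blurriness 1) with maxCoC = 5 samples its outermost taps almost 5 pixels away from the center, while a perfectly sharp pixel (blurriness 0) collapses all taps onto the center sample.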