Here's the second part of the custom Direct3D framework and 3D tutorial, encapsulated in the class CXD3D. In Part I, we covered some 3D concepts, some Direct3D architecture, and the enumeration object; you'll also find the demo project there. In this chapter, we'll go from choosing settings among the enumeration results all the way to the rendering loop.
The CXD3DSettings Class
When the enumeration of display adapters, display modes, and device settings succeeds, CXD3D::CreateD3D goes on to choose the best initial settings for rendering. This means the framework will now fill a Settings structure, the other internal setup object of the CXD3D class:
UINT ndm; UINT ndi; UINT ndc;
void SetDSFormat(UINT nFmt);
This class functions exactly like in the original SDK framework, but is implemented in a totally different manner.
First, we have an AdapterInfo pointer array, in which we store the best full-screen adapter info at index 0 and its best windowed counterpart at index 1. Next, we have the Windowed flag (0 or 1), used as an index into the previous array.
In turn, the ndi UINT array holds the index of the best DeviceInfo within each AdapterInfo, and the ndc UINT array holds the index of the best DeviceCombo within each DeviceInfo. The whole mess is nothing but a collection of nested indices, keeping references to the best settings the enumeration has to offer, instead of copying enumeration results. In other words, the Settings object always points to the best choice of D3D settings the Enumeration class holds.
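To make the nested indices less abstract, here is a minimal, portable sketch of the idea. It is not the framework's code: the struct layouts, the std::vector containers, the integer stand-in for D3DFORMAT, and the MakeDemoAdapter helper are all assumptions made for illustration; only the member names AdapterInfos, Windowed, ndi, ndc, and nDSFormat come from the text.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the enumeration tree; not the framework's types.
using Format = int;  // placeholder for D3DFORMAT

struct DeviceCombo { std::vector<Format> DSFormats; };
struct DeviceInfo  { std::vector<DeviceCombo> DeviceCombos; };
struct AdapterInfo { std::vector<DeviceInfo> DeviceInfos; };

// The settings object copies nothing: it only keeps pointers and indices.
struct SettingsSketch {
    const AdapterInfo* AdapterInfos[2] = {nullptr, nullptr};  // [0] full-screen, [1] windowed
    unsigned Windowed = 1;      // 0 or 1, doubles as an index into the arrays
    unsigned ndi[2] = {0, 0};   // index of the best DeviceInfo in each adapter
    unsigned ndc[2] = {0, 0};   // index of the best DeviceCombo in that DeviceInfo
    unsigned nDSFormat = 0;     // index of the current depth/stencil format

    // Walk the nested indices down to the depth/stencil format in use.
    Format GetDSFormat() const {
        const AdapterInfo* ai = AdapterInfos[Windowed];
        assert(ai != nullptr);
        return ai->DeviceInfos.at(ndi[Windowed])
                  .DeviceCombos.at(ndc[Windowed])
                  .DSFormats.at(nDSFormat);
    }
};

// Builds a one-adapter, one-device enumeration with two combos (demo data only;
// the format codes are arbitrary placeholders).
inline AdapterInfo MakeDemoAdapter() {
    DeviceCombo fullscreen;
    fullscreen.DSFormats.push_back(80);
    DeviceCombo windowed;
    windowed.DSFormats.push_back(80);
    windowed.DSFormats.push_back(75);
    windowed.DSFormats.push_back(77);
    DeviceInfo hal;
    hal.DeviceCombos.push_back(fullscreen);
    hal.DeviceCombos.push_back(windowed);
    AdapterInfo adapter;
    adapter.DeviceInfos.push_back(hal);
    return adapter;
}
```

Flipping Windowed from 1 to 0 is enough to make every getter resolve against the full-screen choice, which is the whole point of keeping indices rather than copies.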
Settings are initialised by CXD3D::ChooseInitialSettings, which branches into CXD3D::FindBestFullscreenMode and CXD3D::FindBestWindowedMode, both taking in a couple of flags that indicate whether we require a HAL device or a reference device. Let's go through each.
Finding the Best Settings to Render
FindBestFullscreenMode traverses the enumeration adapter info. For each adapter, it saves the adapter's current display mode, which is the desktop's display mode, in a temporary variable. That done, it traverses each adapter's DeviceInfos, skipping those that do not match the function arguments for type; in a third loop, it traverses each DeviceInfo's DeviceCombos. We are just going down through the tree-like enumeration structures (remember to check the enumeration structures graph of Part I), until we get to a DeviceCombo. Windowed device combos are skipped; each full-screen DeviceCombo is compared against the current best to ultimately decide which one is really the best one, or failing that, which one is 'better' than any other.
Initially, there is no best DeviceCombo, so the first one in the iteration is taken as the best (literally, better than nothing). In subsequent iterations, a HAL DeviceCombo is better than a non-HAL DeviceCombo, and becomes the best when both its display format and back buffer format match those of the desktop, meaning that the device can adopt the current display mode, and that no color conversion is required.
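The ranking rules just described can be sketched in a few lines of portable C++. This is only my reading of the rules, not the framework's function: ComboSketch, its fields, and FindBestFullscreenCombo are hypothetical names, and formats are plain integers standing in for D3DFORMAT values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified view of a device combo.
struct ComboSketch {
    bool isHAL;
    bool windowed;
    int  displayFormat;
    int  backBufferFormat;
};

// Returns the index of the best full-screen combo, or -1 if there is none.
inline int FindBestFullscreenCombo(const std::vector<ComboSketch>& combos,
                                   int desktopFormat) {
    int best = -1;
    for (std::size_t i = 0; i < combos.size(); ++i) {
        const ComboSketch& c = combos[i];
        if (c.windowed)
            continue;  // windowed combos are skipped
        // Ideal: a HAL combo whose formats both match the desktop's.
        const bool ideal = c.isHAL &&
                           c.displayFormat == desktopFormat &&
                           c.backBufferFormat == desktopFormat;
        const bool betterThanNothing = best < 0;
        const bool halBeatsNonHal = best >= 0 && c.isHAL && !combos[best].isHAL;
        if (betterThanNothing || halBeatsNonHal || ideal)
            best = static_cast<int>(i);
        if (ideal)
            break;  // nothing later in the list can rank higher
    }
    return best;
}
```

The early break reflects the fact that once a HAL combo matches the desktop formats, no later combo can do better under these rules.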
Once we end up with the best DeviceCombo (or a 'better than nothing' one), we need to find a display mode on the best AdapterInfo that uses its display format, and that is as close to the desktop display mode as possible, in terms of dimensions and refresh rate. Obviously, the best AdapterInfo is the one that owns the best DeviceInfo, which in turn is the one that owns the best DeviceCombo.
This rather intricate logic leads to the best full-screen settings that do not change the desktop's current display mode, and that do not require any color conversion, whenever possible.
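Picking the display mode 'as close as possible' to the desktop's can be sketched as a nearest-match search. The distance measure below (size difference first, refresh rate as the tie-breaker) is an assumption of mine, as are the ModeSketch struct and the FindClosestMode name; the framework may well weigh things differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a display mode (compare D3DDISPLAYMODE).
struct ModeSketch {
    unsigned width;
    unsigned height;
    unsigned refreshRate;
    int      format;
};

inline long AbsDiff(unsigned a, unsigned b) {
    return a > b ? static_cast<long>(a - b) : static_cast<long>(b - a);
}

// Returns the index of the mode with 'requiredFormat' that is closest to the
// desktop mode, or -1 if no mode uses that format.
inline int FindClosestMode(const std::vector<ModeSketch>& modes,
                           const ModeSketch& desktop, int requiredFormat) {
    int best = -1;
    long bestSize = 0;
    long bestRate = 0;
    for (std::size_t i = 0; i < modes.size(); ++i) {
        const ModeSketch& m = modes[i];
        if (m.format != requiredFormat)
            continue;  // only modes in the combo's display format qualify
        const long size = AbsDiff(m.width, desktop.width) +
                          AbsDiff(m.height, desktop.height);
        const long rate = AbsDiff(m.refreshRate, desktop.refreshRate);
        if (best < 0 || size < bestSize || (size == bestSize && rate < bestRate)) {
            best = static_cast<int>(i);
            bestSize = size;
            bestRate = rate;
        }
    }
    return best;
}
```

When the desktop mode itself is in the list with the required format, both distances are zero and it wins, which is exactly the 'do not change the display mode' outcome.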
The FindBestWindowedMode function follows a similar approach, except it will never change the display mode; it will enforce the current display mode of the primary adapter, based on the assumption that the windowed application will always start on the primary adapter.
Pointers to the resulting best full-screen and windowed AdapterInfos are saved in the Settings object, along with indices to the best DeviceCombo within them.
This is where the framework might fail dramatically for multiple monitor setups. A PC with two or more display adapters, or an adapter with multi-head display, might be using the desktop extension feature of Windows, or might have different display formats for each adapter. Even worse, there is software out there (like MaxiVista) that allows the display to span across multiple monitors in a network, and there might be other running applications with exclusive adapter ownership.
I have not tested either the default framework or this version of the framework in multiple monitor setups, but intuition tells me things may go south when the best windowed adapter is not the primary adapter; it would be cool, though, to have a windowed application create a full-screen render target in a secondary adapter, while showing standard Windows stuff like menus, dialogs, toolbars, and controls in the primary adapter, for the case of a level editor application. It would be even better to have a full-screen application render to three adapters, like Microsoft Flight Simulator does, but for the purposes of this article, rendering to multiple monitors is an advanced topic I simply will not cover. Check the Multihead topic in the SDK help for more.
The settings class also provides a set of 'Get' wrappers for each specific setting. When we need the current depth/stencil format, we get it from the Settings object with the GetDSFormat function.
(Please ignore the line breaks; the VC++ compiler does, anyway.) For a particular setup, this might evaluate to:
AdapterInfos->DeviceInfos.DeviceCombos.DSFormats == D3DFMT_D24X8
So, you can tell how important it is for the application to carefully set the correct indices.
There might be more than one Depth/Stencil format in the list, so to support changing from, say, D3DFMT_D16 at index 0 to D3DFMT_D24X8 at index 2, the class keeps track of the Depth/Stencil format index in the nDSFormat UINT, initialised by the settings constructor to 0, and changed via SetDSFormat, with just a range check:
void CXD3DSettings::SetDSFormat(UINT nFmt)
if (nFmt < AdapterInfos[Windowed]->
nDSFormat = nFmt;
Because this pimped framework does not allow changing the settings, I just provided the aforementioned nDSFormat index and the corresponding 'Get/Set' functions as a guide for the rest of the 'Set' wrappers; at this point, you should be able to do the rest if you want to provide the user with a way to change the settings, like the original SDK framework does with a Settings dialog.
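If you do write the remaining 'Set' wrappers, the range check is the one part they all share, so it may be worth factoring out. The helper below is a hypothetical sketch, not something the framework provides; the name SetIndexChecked and its signature are my own.

```cpp
#include <cstddef>

// Hypothetical helper: store 'value' in 'index' only if it addresses one of
// 'count' list entries; report whether the change was accepted.
inline bool SetIndexChecked(unsigned& index, unsigned value, std::size_t count) {
    if (value >= count)
        return false;  // out of range: keep the previous choice
    index = value;
    return true;
}
```

A wrapper would then pass its own index member and the length of the list it indexes, and could use the return value to tell the user that the requested setting is not available.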
Now that we have chosen the best settings to render, it is time to initialize the Direct3D environment, that is, create the Direct3D device with such settings.
Creating the Device
Once CXD3D::CreateD3D successfully creates the Direct3D object, enumerates all of the adapters, display modes, and devices, and chooses the best settings among them, it is ready to create the Direct3D device.
Before that, though, it gives the application a chance to create or initialise any custom stuff that does not depend on the device, with a call to CXD3D::OneTimeSceneInit. This is an overridable function returning S_OK in the base class implementation. Non-device objects might be cursors, Direct3D fonts, Direct3D mesh objects, or any app-defined objects. This function pairs up with CXD3D::FinalCleanup, where such objects should be released or deleted.
The CreateDevice Direct3D API (Application Programming Interface) function takes in the following arguments:
- the desired adapter ordinal in which to create the device;
- the desired device type;
- a window handle;
- a creation behavior, which comprises the vertex processing (VP) type and, optionally, a request for a pure device;
- a presentation parameters structure; and
- a DIRECT3DDEVICE pointer in which to return the device.
Take notice that from now on, I may refer to Direct3D API functions (or structures) in a shorthand fashion as functionX API, e.g., the CreateDevice API.
Back to the subject at hand, at this point, the framework has (valid) values for most of the arguments (either from the enumeration, wrapped by settings, or from the app itself), so let's examine the one not covered so far, the presentation parameters structure. Note, it is not a framework defined structure; it is a Direct3D API structure:
typedef struct _D3DPRESENT_PARAMETERS_
Presentation parameters are selected from the settings class; it is not a one-to-one correspondence deal, so a CXD3D function takes care of filling up one from the other:
m_d3dpp.Windowed = Settings.Windowed;
m_d3dpp.hDeviceWindow = m_hWndRender;
m_d3dpp.BackBufferCount = 1;
m_d3dpp.EnableAutoDepthStencil = Enumeration.AppUsesDepthBuffer;
m_d3dpp.MultiSampleType = Settings.GetMSType();
m_d3dpp.MultiSampleQuality = Settings.GetMSQuality();
m_d3dpp.SwapEffect = D3DSWAPEFFECT_DISCARD;
m_d3dpp.Flags = 0;
// with an automatic depth/stencil buffer, set its format and let it be discarded
if (Enumeration.AppUsesDepthBuffer)
{
    m_d3dpp.Flags = D3DPRESENTFLAG_DISCARD_DEPTHSTENCIL;
    m_d3dpp.AutoDepthStencilFormat = Settings.GetDSFormat();
}
if (Settings.Windowed)
{
    // windowed: size the back buffer to the client area, default refresh rate
    m_d3dpp.BackBufferWidth = m_rcClient.right - m_rcClient.left;
    m_d3dpp.BackBufferHeight = m_rcClient.bottom - m_rcClient.top;
    m_d3dpp.FullScreen_RefreshRateInHz = 0;
}
else
{
    // full-screen: size the back buffer to the chosen display mode
    m_d3dpp.BackBufferWidth = Settings.GetDisplayMode().Width;
    m_d3dpp.BackBufferHeight = Settings.GetDisplayMode().Height;
}
m_d3dpp.BackBufferFormat = Settings.GetBackBufferFormat();
m_d3dpp.PresentationInterval = Settings.GetPresentInterval();
Most parameters come straight from the settings, one comes from the enumeration, and some are hard-coded; let's go through the ones we have not discussed.
- The backbuffer count: either 0 (treated as 1), 1, 2, or 3. Typically, applications use a single back buffer, a.k.a. double-buffering. More buffers can smooth out your frame rate, but they can also cause input lag (delay between hitting a key and seeing the results), and they also consume extra memory.
- The swap effect: remember swap chains? Well, the swap effect indicates what to do with back buffers after they are presented; either their contents are discarded, or preserved. In terms of memory consumption and performance, it will always be more efficient to use D3DSWAPEFFECT_DISCARD for this parameter. If your application operates directly on the back buffer(s), you would require one of the other two types, namely flip or copy, but the framework discards them by default. Besides, tweaking or playing with the back buffer is an advanced topic that requires more than just setting this flag; check the SDK help for more.
- The EnableAutoDepthStencil flag: when set to true, it indicates that the device will automatically create a depth/stencil buffer, and that Direct3D will manage it. In the framework, it is set to true by default, which means the framework must also take care of setting a format for it in AutoDepthStencilFormat, and that it must set the D3DPRESENTFLAG_DISCARD_DEPTHSTENCIL flag, to enable discarding the buffer in the same fashion as back buffers are discarded, again to increase performance. Incidentally, the debug runtime version of Direct3D will enforce this flag after presenting the depth/stencil buffer.
- FullScreen_RefreshRateInHz: the rate at which the display adapter refreshes the screen. For windowed modes, this value must be 0 or D3DPRESENT_RATE_DEFAULT, telling the runtime to choose the presentation rate or to adopt the current rate; for full-screen modes, this value must be one of the refresh rates returned by the EnumAdapterModes API, or once again, D3DPRESENT_RATE_DEFAULT.
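The refresh rate rule in the last bullet lends itself to a small guard function. The sketch below is hypothetical (ChooseRefreshRate and kRateDefault are my names, not framework or SDK identifiers), and it assumes you have already collected the refresh rates that the enumeration reported for the chosen resolution and format.

```cpp
#include <algorithm>
#include <vector>

// Stand-in for D3DPRESENT_RATE_DEFAULT, which the SDK headers define as 0.
const unsigned kRateDefault = 0;

// Returns a refresh rate that is safe to put in the presentation parameters:
// the default for windowed mode, the requested rate if the adapter enumerated
// it, and the default otherwise.
inline unsigned ChooseRefreshRate(bool windowed, unsigned requested,
                                  const std::vector<unsigned>& enumeratedRates) {
    if (windowed)
        return kRateDefault;
    const bool supported =
        std::find(enumeratedRates.begin(), enumeratedRates.end(), requested) !=
        enumeratedRates.end();
    return supported ? requested : kRateDefault;
}
```

Falling back to the default rather than failing is a design choice of this sketch; an application that insists on a specific rate may prefer to report the mismatch instead.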
So, after we fill the presentation parameters with the chosen settings, we can issue, at last, the creation of the Direct3D device.
When the device is successfully created, we turn to filling a device statistics string, which on my laptop reads HAL software VP on S3 Graphics SuperSavage/IXC 1014, but on your PC might read HAL pure hardware VP on NVIDIA Multi-GPU GeForce 7950 GX2, in which case, I'd turn red with envy-dia.
At this point, in the 3D environment initialization, we have a device, so we can initialise any device-dependent object. This would be a good time for another break, since we are going even deeper into the inner workings of Direct3D.
Device-Dependent Objects, a.k.a. Resources
Device-dependent objects are those created through the IDirect3DDevice9 interface (in the framework, the m_pd3dDevice object returned by CreateDevice), and they are none other than Direct3D resources: vertex buffers, index buffers, and textures used to render the 3D scene; surfaces holding the pixel data of a render target, back buffers, the front buffer, or depth/stencil buffers; and surfaces holding data used to create volume textures, whatever they are.
Resources have a set of properties that define their type (vertex buffer, surface, etc.), their pool, or in which type of memory they are to be created (e.g., system memory, video memory, etc.), their format, (e.g., the pixel format of a 2D surface), and their usage, defining how the resource will be used, e.g., as a render target, as a texture, etc.
The framework makes an explicit distinction between resources that can survive a device reset and those that must be re-created after the device is reset. Resetting a device is an important topic concerning the normal (operational) state and the lost (non-operational) state of a device, and involving the pool where device objects (resources) are created, so we will deviate a little to get it over with.
A Direct3D device is lost when in full-screen mode and the user presses Alt+TAB (releasing the keyboard focus to another application), when in windowed mode and the window size changes, when a system dialog is initialised, when a power management event fires, or when another application claims exclusive full-screen operation. These are only the typical scenarios, but hopefully, you get the point: sometimes, something will scare the device away, and the application must wait until it comes back; that something might be as trivial as the user switching to the calculator or resizing the application window, but it might also be as important as an antivirus warning.
The framework handles a device reset in the following manner: during the rendering loop of the application, the one ultimately presenting the resources on the device window, the device's Present method might fail, returning D3DERR_DEVICELOST, in which case we set the 'device lost' state flag accordingly. On the next iterations, we detect that state and test the cooperative level of the device until it returns D3DERR_DEVICENOTRESET, indicating that it is not lost anymore, but that it needs to be reset; consequently, the framework calls for an environment reset, in which any device objects that will not survive the reset are invalidated (released) before the device is reset, and restored (re-created) after the device is successfully reset. Get it?
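The lost/not-reset/operational cycle is easier to follow as a tiny state machine. The sketch below is a portable simulation, not Direct3D code: CoopLevel stands in for the values TestCooperativeLevel can report, LoopSketch and Tick are hypothetical names, and the choice to skip the frame in which the reset happens is a simplification of mine.

```cpp
// Stand-ins for D3D_OK, D3DERR_DEVICELOST, and D3DERR_DEVICENOTRESET.
enum class CoopLevel { Ok, DeviceLost, DeviceNotReset };

struct LoopSketch {
    bool deviceLost = false;   // the 'device lost' state flag
    int  resets = 0;           // how many environment resets were performed
    int  framesPresented = 0;  // how many frames made it to Present

    // One loop iteration. 'coop' is what the cooperative level test would
    // report; 'presentReportsLost' is whether Present would report a lost device.
    void Tick(CoopLevel coop, bool presentReportsLost) {
        if (deviceLost) {
            if (coop == CoopLevel::DeviceLost)
                return;  // still lost: nothing to do but wait
            if (coop == CoopLevel::DeviceNotReset) {
                ++resets;  // invalidate objects, reset device, restore objects
                deviceLost = false;
                return;    // resume rendering on the next iteration
            }
            deviceLost = false;  // the device came back by itself
        }
        ++framesPresented;       // animate, draw, and present the frame
        if (presentReportsLost)
            deviceLost = true;   // detected on the next iteration
    }
};
```

Feeding it the sequence Ok, Ok with a failed present, DeviceLost, DeviceNotReset, Ok yields three presented frames and exactly one reset.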
All this complexity reduces to common-sense memory (resource) management. We have a good reason to call CXD3D::InvalidateDeviceObjects (we just lost the device!). Invalidating resources translates into releasing resources (for they are COM interfaces), particularly those that cannot survive a reset. In the event of regaining the device, we have to call CXD3D::RestoreDeviceObjects, where we created them in the first place, after the device is reset. Incidentally, such resources are exactly those created in the D3DPOOL_DEFAULT memory pool (essentially, the video memory).
CXD3D::InitDeviceObjects is the place to create resources that can survive a reset, exactly those in the D3DPOOL_MANAGED, D3DPOOL_SYSTEMMEM, or D3DPOOL_SCRATCH pools (essentially, non-video memory). It pairs up with CXD3D::DeleteDeviceObjects, where we release such resources, this time in response to an application shutdown.
OK, so let's track back a little: resources are created through the device interface, and you have a choice of where in memory to create them:
- Default pool resources are placed in the preferred memory for device access, mentioned before as driver-optimal memory: local video memory and/or accelerated graphics port (AGP) memory (though it can also be system memory if the driver so decides). Resources in this pool are also referred to as device-sensitive, since they must meet device requirements in terms of size and format. Vertex buffer resources allocated in the default pool yield the best performance in virtually all cases.
- Managed resources are copied automatically to video memory as needed, but are always backed by system memory. They are also device-sensitive. Managed is the preferred choice of memory pool for most resources, except (maybe) for vertex buffers.
- System memory resources reside in the PC's physical and/or virtual RAM, which is not typically available to the device, so it is best for resources you do not wish to render directly, e.g., textures or surfaces that will be used to update other textures or surfaces in the default pool.
- Scratch resources are also created in the system RAM, but they are never available to the device, which means they cannot be used as textures or render targets. On the other hand, they are not bound by device size or format restrictions, meaning you can use any format to create them, hence they are device-insensitive. Good examples are textures with proprietary formats, really large textures that the application will convert or slice into smaller, device-friendly textures, and an off-screen plain surface where to save a snapshot of the 3D scene.
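The four pools and what they imply for the framework overrides can be condensed into a couple of lookup functions. This is a sketch under my own naming: the Pool and CreateIn enumerations and the three functions are hypothetical, with Pool merely mirroring the D3DPOOL_DEFAULT, D3DPOOL_MANAGED, D3DPOOL_SYSTEMMEM, and D3DPOOL_SCRATCH values.

```cpp
// Hypothetical mirror of the four memory pools described above.
enum class Pool { Default, Managed, SystemMem, Scratch };

// The two framework overrides in which resources get created.
enum class CreateIn { InitDeviceObjects, RestoreDeviceObjects };

// Only default pool resources must be released before a reset and re-created
// afterwards; the other pools survive it.
inline bool MustRecreateAfterReset(Pool p) { return p == Pool::Default; }

// Resources that do not survive a reset belong in RestoreDeviceObjects, the
// rest in InitDeviceObjects.
inline CreateIn WhereToCreate(Pool p) {
    return MustRecreateAfterReset(p) ? CreateIn::RestoreDeviceObjects
                                     : CreateIn::InitDeviceObjects;
}

// Scratch resources are the ones not bound by device size or format limits.
inline bool IsDeviceInsensitive(Pool p) { return p == Pool::Scratch; }
```

Keeping this mapping in one place makes it harder to create a default pool resource in the wrong override by accident.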
The choice of memory pool for resources in your application depends highly on their type and usage, but whenever you have a choice, you must again balance the aforementioned, universal tradeoff between performance and visual quality; on my laptop's humble HAL soft VP device, I haven't got much choice other than to let Direct3D manage most resources, but a pure device might be capable of 3x the fps if resource management is set up correctly.
And that is that about resources: create some in InitDeviceObjects, and some in RestoreDeviceObjects; invalidate the latter before a reset, and recreate them after a reset; when shutting down, be sure to clean up the former in DeleteDeviceObjects; choose carefully what to create where, and tweak the choice to test your really-cool, top-of-the-line, brand-new graphics card.
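The pairing of the four overrides is easiest to see as a call sequence. The sketch below only logs the calls; the Startup, Reset, and Shutdown drivers are hypothetical, and the order they use (in particular, invalidating before deleting at shutdown) is an assumption based on how the original SDK framework behaves.

```cpp
#include <string>
#include <vector>

// Logs the order in which the device object overrides would be called.
struct LifecycleSketch {
    std::vector<std::string> log;

    void InitDeviceObjects()       { log.push_back("init"); }        // reset-proof resources
    void RestoreDeviceObjects()    { log.push_back("restore"); }     // default pool resources
    void InvalidateDeviceObjects() { log.push_back("invalidate"); }  // release default pool
    void DeleteDeviceObjects()     { log.push_back("delete"); }      // release the rest

    // Hypothetical drivers for the three situations discussed above.
    void Startup()  { InitDeviceObjects(); RestoreDeviceObjects(); }
    void Reset()    { InvalidateDeviceObjects(); /* device reset here */ RestoreDeviceObjects(); }
    void Shutdown() { InvalidateDeviceObjects(); DeleteDeviceObjects(); }
};
```

Note that InitDeviceObjects and DeleteDeviceObjects each run once per device, while the invalidate/restore pair runs once more for every reset.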
I know that is too much to chew in a mouthful, and I haven't even mentioned other subtleties about losing the device. Just picture the user switching from your 3D game to the calculator at some point; the reset logic is in place to pause the 3D app and free some memory for the OS and other apps, namely the memory whose release won't noticeably affect the 3D app's performance when it regains focus. As a general rule, the geometry resources (vertex and index buffers) are optimally placed in the default memory pool (ideally video memory) so that they can be discarded when the device is lost, and recreated when the app recovers, but resources that take the most time to reload (namely, textures or other app-defined objects) should be kept in memory across resets, to make the transition somewhat smoother. Of course, this is only a guideline, and you'll learn to manage memory as you go along. Keep in mind, the logic is covering for a somewhat rare event, but also for an application shutdown, so it makes sense to clean up at some point, just as for any Windows application, in order to avoid any memory leaks.
We created a Direct3D device and used it to create resources, yet something may have failed! When the creation of the HAL device truly fails, the framework will try switching to a reference device, show a warning message, and try again, but the reference device might not be able to render at all, in which case, there isn't much you can do (except switching from the debug runtime version of Direct3D to the retail version, but there are no guarantees). In any case, if everything goes according to plan, you should end up with a Direct3D HAL device so that now the application can start the rendering loop.
The Rendering Loop
if (FAILED(hr = m_pd3dDevice->TestCooperativeLevel()))
if (hr == D3DERR_DEVICELOST)
if (hr == D3DERR_DEVICENOTRESET)
if (FAILED(hr = ResetEnvironment()))
m_bDeviceLost = false;
FLOAT fTime = DXUtil_Timer(TIMER_GETAPPTIME);
FLOAT fElapsedTime = DXUtil_Timer(TIMER_GETELAPSEDTIME);
if (fElapsedTime == 0.0f)
m_fTime = fTime;
m_fElapsedTime = fElapsedTime;
if (FAILED(hr = FrameMove()))
if (FAILED(hr = Render()))
NULL) == D3DERR_DEVICELOST)
m_bDeviceLost = true;
There are two easily distinguishable sections: one handling device lost states (as explained before), and one actually calling the rendering functions. Also, nothing actually happens if the application is paused.
The framework's timing is handled by the DXUtil_Timer function in dxutil.h/dxutil.cpp, a wrapper for the QueryPerformanceFrequency() Windows API (or timeGetTime(), failing that). Incidentally, the RenderEnvironment function is the only place in the framework where the elapsed time is queried, so it is effectively the time elapsed between calls.
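If you want to play with the timing logic outside Windows, the two queries used by the loop can be approximated with the standard library. This is a portable stand-in of my own, not DXUtil_Timer: FrameTimerSketch and its methods are hypothetical, and std::chrono::steady_clock replaces the performance counter.

```cpp
#include <chrono>

// Portable stand-in for the two timer queries used by the rendering loop.
class FrameTimerSketch {
    using Clock = std::chrono::steady_clock;

public:
    FrameTimerSketch() : start_(Clock::now()), last_(start_) {}

    // Seconds since the timer was created (compare TIMER_GETAPPTIME).
    float AppTime() const { return Seconds(Clock::now() - start_); }

    // Seconds since the previous call (compare TIMER_GETELAPSEDTIME).
    float ElapsedTime() {
        const Clock::time_point now = Clock::now();
        const float elapsed = Seconds(now - last_);
        last_ = now;
        return elapsed;
    }

private:
    static float Seconds(Clock::duration d) {
        return std::chrono::duration<float>(d).count();
    }

    Clock::time_point start_;
    Clock::time_point last_;
};
```

Because ElapsedTime moves its reference point on every call, it only measures the time between frames if it is queried exactly once per frame, which is the same constraint the framework relies on.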
Whenever the device is operational and some time has elapsed between calls, the presentation is prepared in two separate overrides: CXD3D::FrameMove, in charge of animation, and CXD3D::Render, in charge of drawing.
FLOAT fTime = DXUtil_Timer(TIMER_GETABSOLUTETIME);
D3DXMatrixRotationY(&matWorld, fTime / 150.0f);
D3DCLEAR_TARGET | D3DCLEAR_ZBUFFER,
if (SUCCEEDED(hr = m_pd3dDevice->BeginScene()))
m_pd3dDevice->DrawPrimitive(D3DPT_LINESTRIP, 0, nVertices);
Note that these are only example overrides of what to do in these functions, but in general, FrameMove is the place to apply all the matrix algebra and geometry transformations that make the scene animate, and Render is the place to clear the viewport (mandatory) and call the device methods to draw the scene, within the (also mandatory) BeginScene/EndScene pair.
Also note that almost everything going on in both functions goes through device 'Set' calls; these methods are the ones defining the rendering state of the device (the what, when, where, and how 3D objects are displayed).
Imagine a First-Person Shooter 3D game; it makes sense to render the sky in the background first, then the skyline mountain range, also in the background, then the terrain, extending from left to right and front to back, then the buildings on top of the terrain, then the enemies, moving around, then the vehicle or shooter, which is the actual viewpoint, a.k.a. the 'eye' or 'camera', also moving, and then a HUD (Heads Up Display) on top of everything.
Every object mentioned is represented in Direct3D by one or more resources, presented in a logical order to the device; some are static, some are moving (transformed from frame to frame), some might be occluded at times and visible at others, one is visible at all times, and one constantly changes the view. The rendering state encompasses the entire dynamic scene.
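The 'logical order' of the shooter example can be sketched as a sort by layer before drawing. Everything here is hypothetical: the Layer enumeration simply lists the objects of the example from back to front, and DrawItem and SortForDrawing are names I made up; a real application would draw resources instead of sorting names.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Layers of the example scene, listed from back to front.
enum class Layer { Sky, Mountains, Terrain, Buildings, Enemies, Vehicle, Hud };

struct DrawItem {
    std::string name;
    Layer       layer;
};

// Orders the items back to front; items on the same layer keep their
// relative order.
inline void SortForDrawing(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.layer < b.layer;
                     });
}
```

The stable sort matters when several objects share a layer, such as a group of enemies that the application already keeps in a meaningful order.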
Of course, there is yet another set of key 3D concepts and Direct3D methods we must get through before we are able to present our dream scenario. There are tons of books and web pages discussing these, including the SDK help, but I am committed to covering all I've learned about them here, so be prepared for the next delivery of the series, in which I'll get into 3D transformations, 3D vector and matrix algebra, 3D geometry in general, and how it all works in Direct3D.
Genius is 1% inspiration and 99% perspiration - Thomas Edison.