Fun with GPUs


December 10

I am not dead

It has been nearly two months since my last post here. Shame on me, but living between two jobs does not make that thing called life any easier. Next month I will start working for a German game studio, and as tomorrow is my last day at my old company I should have a little more time again.

But there are still many things to do. First, I have to finish my part of Wolfgang Engel’s new Direct3D 10 book. Second, I will try to finish some of my various other Direct3D 10 projects. Hopefully the new DirectX SDK will show up in the next few days. I don’t expect any API changes, but the old SDK doesn’t work with the RTM build of Vista. Hopefully we will see a D3D10 driver for nVidia’s first Direct3D 10 GPU in the same timeframe.

Currently there is only a Windows XP driver. It already supports new OpenGL extensions for the new features, and many of them are EXT extensions, although there are still some new vendor-specific ones. Anyway, it seems the OpenGL guys have gotten a little head start when it comes to using real next-generation hardware. There are rumors that some people outside nVidia already have drivers because they are working on the first wave of Direct3D 10 games or patches for already released games. I don’t know, but I wouldn’t be that surprised.

Enough for now and back to work.

October 16

What’s new in the DirectX October SDK?

I finally found some time to check which headers have changed in the new SDK.

First, Direct3D 10 seems stable now, as there are no more changes in this release. Direct3DX 10 got a new function: D3DXCpuOptimizations lets you control whether D3DX 10 should use optimized math functions.

The Direct3DX 9 texture atlas functions have an additional options parameter, which completes this minor update.

The new effect framework for D3D9 that was shown at the Gamefest conference is not in yet. But there is a beta version of the new shader compiler for D3D9.

October 1

MVP Award

I just got the email notification that I have received an MVP Award for DirectX. I don’t know who suggested me for this honor, but I want to thank this person. It seems I have to update my MVP profile now as my first official task.

September 25

New Vista build for everyone!?

Another week, another Vista build (5728-16387 to be exact), and like the original RC1 build it is publicly available. But as with other builds besides the big releases (Beta 2, RC 1), Direct3D 10 is out of sync again. Fortunately I haven’t overwritten my RC 1 installation with this new build.

This is something I could live with, but the WDK (Windows Driver Kit) updates that are released with every new Vista build are starting to make me angry. As I want to make my SSE D3D10 software device compatible with the IDXGIFactory::CreateSoftwareAdapter method, I need the WDK that contains the driver interfaces the D3D10 runtime uses to talk to a driver. Software devices are nothing other than D3D10 user mode drivers that also “emulate” the kernel mode part of the driver.

I haven’t counted the number of WDK updates, but every time it’s the same game. The necessary D3D10 headers are still missing and the documentation is crap. Therefore I cannot work on the SSE software device, and the same is partially true for the D3D10 to D3D9 layer.

I really hope they include these headers soon.

Apart from these downsides, at least the managed D3D10 layer is close to being finished, but October and a new SDK are near. Maybe the API is already stable and we will not see any further changes.

September 16

CodePlex project

The managed layer for Direct3D 10 now has its own CodePlex project. Thanks to the CodePlex team for providing the infrastructure.

September 7

XNA Capabilities Viewer

As XNA uses different names for all the surface formats compared to other DirectX versions, it’s a little bit cumbersome to use the standard DirectX Caps Viewer. To ease my pain a little, I have written an XNA version of the viewer. I will upload it later (including source) for everyone.

August 31

XNA Framework (Beta) finally arrived.

As I prefer to camp in the managed world, I couldn’t resist downloading the new XNA beta. I must confess that I was less interested in the Game Studio; I was looking for what Microsoft had done with MDX 2. The first impression wasn’t bad, as it looks very similar to what we have seen in MDX 1 and 2. Even if there are some leftovers, the fixed functions are gone as announced. Some classes are renamed, but that’s fine. I like the step that gives us properties in place of the old description structures. The new effect framework looks very similar to what I have already seen for Direct3D 10. Good work, guys.

But with light come shadows. I really don’t like that they have cut the possibility to lock and unlock resources. This makes dynamic resources less effective. Removing most of Direct3DX doesn’t make me happy either. It reduces the usefulness of the XNA framework for writing tools.

As this is only a first impression of a first beta, I am still undecided how useful the final version will be for me. If I am not happy in the end, I can still use MDX 1 or write my own wrapper.

August 30

The power of reflection

Last night I spent some time on my managed layer for Direct3D 10. It’s nearly complete, so I started playing with some advanced features. One thing I have never really liked since Direct3D 9 is the need to define your vertex structure twice: once for the compiler and then again for Direct3D. It’s still the same in D3D10. The name has changed from vertex declaration to input layout. Some other details have changed too, but the pain of doing it twice is still there.

 

C++:

 

struct SimpleVertex
{
    D3DXVECTOR3 Pos;
    D3DXVECTOR4 Color;
};

 

To create and set the input layout for this structure the following code is necessary.

 

// Define the input layout
D3D10_INPUT_ELEMENT_DESC layout[] =
{
    { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D10_INPUT_PER_VERTEX_DATA, 0 },
    { "COLOR", 0, DXGI_FORMAT_R32G32B32A32_FLOAT, 0, 12, D3D10_INPUT_PER_VERTEX_DATA, 0 },
};
UINT numElements = sizeof(layout)/sizeof(layout[0]);

// Create the input layout
D3D10_PASS_DESC PassDesc;
g_pTechnique->GetPassByIndex( 0 )->GetDesc( &PassDesc );
hr = g_pd3dDevice->CreateInputLayout( layout, numElements, PassDesc.pIAInputSignature, PassDesc.IAInputSignatureSize, &g_pVertexLayout );
if( FAILED(hr) )
    return hr;

// Set the input layout
g_pd3dDevice->IASetInputLayout( g_pVertexLayout );

 

This can easily be ported to C#:

 

struct SimpleVertex
{
    public SimpleVertex(DXGI.R32G32B32Float position, DXGI.R32G32B32A32Float color)
    {
        this.position = position;
        this.color = color;
    }

    DXGI.R32G32B32Float position;
    DXGI.R32G32B32A32Float color;
}

// Define the input layout
D3D10.InputElement[] layout = new D3D10.InputElement[]
{
    new D3D10.InputElement ("position", 0, DXGI.Format.R32G32B32Float, 0, 0, D3D10.InputClassification.PerVertexData, 0),
    new D3D10.InputElement ("color", 0, DXGI.Format.R32G32B32A32Float, 0, 12, D3D10.InputClassification.PerVertexData, 0)
};

// Create the input layout
inputLayout = new D3D10.InputLayout(device, layout, technique.Passes[0].InputAssemblerSignature);

// Set the input layout
device.InputAssembler.InputLayout = inputLayout;

 

Clever usage of properties saves some lines of code, but the main problem is still the same. Let’s add some reflection magic.

 

// Create the input layout
inputLayout = new D3D10.InputLayout(device, typeof(SimpleVertex), technique.Passes[0].InputAssemblerSignature);

// Set the input layout
device.InputAssembler.InputLayout = inputLayout;

 

This new constructor, which I added last night, uses reflection to analyse the vertex structure and create the necessary layout on the fly. If you change the structure, the layout will automatically stay in sync.
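C++ has no runtime reflection comparable to .NET, but the hand-maintained byte offsets (the literal 12 in the layout above) can at least be derived from the structure itself. The following is only a minimal sketch under assumptions: Float3 and Float4 are plain stand-ins for D3DXVECTOR3 and D3DXVECTOR4, and ElementDesc, VERTEX_ELEMENT and LayoutElement are hypothetical names that come neither from the Direct3D 10 SDK nor from the managed layer.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for D3DXVECTOR3 / D3DXVECTOR4 so the sketch builds without the SDK.
struct Float3 { float x, y, z; };
struct Float4 { float x, y, z, w; };

struct SimpleVertex
{
    Float3 Pos;
    Float4 Color;
};

// Hypothetical, reduced counterpart of D3D10_INPUT_ELEMENT_DESC.
struct ElementDesc
{
    const char* semantic;  // HLSL semantic name
    std::size_t offset;    // byte offset of the member inside the vertex
    std::size_t size;      // size of the member in bytes
};

// Derives offset and size from the struct itself, so reordering or resizing
// members cannot leave a stale hard-coded offset behind.
#define VERTEX_ELEMENT(type, member, semantic) \
    ElementDesc{ semantic, offsetof(type, member), sizeof(type::member) }

static const ElementDesc kSimpleVertexLayout[] =
{
    VERTEX_ELEMENT(SimpleVertex, Pos,   "POSITION"),
    VERTEX_ELEMENT(SimpleVertex, Color, "COLOR"),
};

// Bounds-checked access to the generated table.
inline const ElementDesc& LayoutElement(std::size_t index)
{
    assert(index < sizeof(kSimpleVertexLayout) / sizeof(kSimpleVertexLayout[0]));
    return kSimpleVertexLayout[index];
}
```

Unlike the reflection-based constructor, this still lists every member by hand; it only removes the manually computed offsets.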

August 26

Time to upgrade

After staying with Vista Beta 2 for a long time, I moved forward to the new pre-RC1 build last night. Installation from the ~2.5 GB ISO went very well. A nice new feature compared to Beta 2 is that it grabs missing drivers from Windows Update before it starts installing. No pain finding and installing drivers for my two Ethernet adapters without network access this time.

As this pre-RC1 build supports the current August version of D3D 10, I need to upgrade my code to the new interfaces. But with the limited number of changes this should not take long. I am trying to deliver an updated and more complete version of my managed D3D10 layer over at MDX Info next week.

Muhammad Haggag has written a fine managed MilkShape model viewer for MDX 1.1. As he was kind enough to release it into the public domain, I thought I would port it to my managed D3D10 layer and include it as a sample of how to use it.

August 4

Direct3D 10 changes in the August 2006 release

  • The Shader Mirror interface is removed
    • D3D10_VERTEX_SHADER_DESC is removed
    • D3D10_GEOMETRY_SHADER_DESC is removed
    • D3D10_PIXEL_SHADER_DESC is removed
    • D3D10_INPUT_LAYOUT_DESC is removed
  • D3D10_COMMONSHADER_CONSTANT_BUFFER_SLOT_COUNT (15) is replaced by D3D10_COMMONSHADER_CONSTANT_BUFFER_API_SLOT_COUNT (14) and D3D10_COMMONSHADER_CONSTANT_BUFFER_HW_SLOT_COUNT (15)
  • D3D10_FTOU_INSTRUCTION_MAX_INPUT is changed from 4294967296.999f to 4294967295.999f
  • D3D10_REQ_MIP_LEVELS is changed from 13 to 14
  • The specification date is now 06/21/2006 and the version is 1.050004
  • The WGF version is finally removed
  • The constant D3D10_APPEND_ALIGNED_ELEMENT is added
  • The D3D10_FRONT_WINDING enumeration is removed
  • The D3D10_BUFFEROFFSET_APPEND constant is removed
  • Two new format support flags
    • D3D10_FORMAT_SUPPORT_MULTISAMPLE_RENDERTARGET
    • D3D10_FORMAT_SUPPORT_MULTISAMPLE_LOAD
  • Get and Set methods now use “StartSlot” instead of “Offset” as variable name
  • Methods that need a pointer to shader byte code need the size too
  • The new SDK Version is 29
  • D3D10_COMPILE_CHILD_EFFECT is replaced with D3D10_EFFECT_COMPILE_CHILD_EFFECT
  • D3D10_COMPILE_DISABLE_PERFORMANCE_MODE is replaced with D3D10_EFFECT_COMPILE_ALLOW_SLOW_OPS
  • New effect variable flag: D3D10_EFFECT_VARIABLE_EXPLICIT_BIND_POINT
  • D3D10_EFFECT_VARIABLE_DESC has a new member: ExplicitBindPoint
  • New As methods for effect variables:
    • AsRenderTargetView
    • AsDepthStencilView
  • New element in D3D10_PASS_DESC structure: IAInputSignatureSize
  • New shader variable flag: D3D10_SVF_USED
  • New shader variable types:
    • D3D10_SVT_TEXTURE2DMS
    • D3D10_SVT_TEXTURE2DMSARRAY
  • Multiple new count values in the shader description structure:
    • TempArrayCount
    • DefCount
    • DclCount
    • TextureNormalInstructions
    • TextureLoadInstructions
    • TextureCompInstructions
    • TextureGradientInstructions
    • FloatInstructionCount
    • IntInstructionCount
    • UintInstructionCount
    • StaticFlowControlCount
    • DynamicFlowControlCount
    • MacroInstructionCount
    • ArrayInstructionCount
    • CutInstructionCount
    • EmitInstructionCount
  • The D3D10_SHADER_TYPE_DESC structure contains a new offset member
  • The GetShaderSize function is removed
  • D3D10ReflectShader, D3D10GetInputSignatureBlob, D3D10GetOutputSignatureBlob and D3D10GetInputAndOutputSignatureBlob now require the size of the shader code
  • The DXGI_SWAP_CHAIN_DESC structure no longer has the MaxFrameLatency and BufferRotation members
  • CreateDataTransportDevice is removed
  • New status result code: DXGI_STATUS_MODE_CHANGED
  • New error result: DXGI_ERROR_FRAME_STATISTICS_DISJOINT

DirectX SDK August 2006

Two months have passed again and we have a new SDK.

Warning: The Direct3D 10 part will require RC1.

July 26

“Capsed” Direct3D 10

Time goes by very fast if you are busy. The hot weather over here seems to make me a little bit slower, too.

Anyway, even with this handicap the “capsed” Direct3D 10 layer will shortly reach a point where it can be used for some real projects. It’s time to think about the aim of this project again. The motivation was the missing backward compatibility of the Direct3D 10 implementation that Microsoft will ship with Windows Vista. If we leave the strict feature requirements aside, the API itself could work fine with every kind of shader hardware. Even non-shader hardware would be possible, as every fixed-function setup could be encoded in its own text and binary format. In the context of Direct3D something like this could be called shader model 0.X. This makes mapping to another 3D API possible. I decided to go with Direct3D 9 first, as it is the closest match. But a mapping to OpenGL may be an additional project for the future.

Direct3D 9 has its limits, and even if a card reports support for every single Direct3D 9 feature you can’t provide a full-featured Direct3D 10 on top of it. But this was never the aim. The capsed D3D10 layer for Direct3D 9 is built to make sure that one code base will work with a large range of hardware and not only on Windows Vista. If we limited all our development to Direct3D 9 we would already have reached this aim without any additional software. But as the first companies have already talked about Direct3D 10 features in their games, not using Direct3D 10 where it is available would be no real solution for the future. As developers we could go with OpenGL too, but there isn’t any public word yet about the extensions for the new features. Will nVidia and ATI find a way to provide a common set of extensions for the new features? Currently Direct3D 10 seems to be the safer way.

Even with the limitations it was surprisingly easy to map the core Direct3D 10 functions to Direct3D 9. There are some minor problems, like the differences in channel ordering between Direct3D 9 and 10, but these can easily be solved with some swizzling code. All Direct3D 10 samples that don’t require a special Direct3D 10 feature like the geometry shader would already work fine if there weren’t another big problem. Nearly every single sample makes use of the Direct3D 10 effect framework, which works on top of the core API but lives in the same DLL. This means that a layer solution that should work on Windows XP needs to implement this part of Direct3D 10, too. The same is true for other parts like the shader reflection system and everything you can find in Direct3DX 10. This makes the development of the whole layer more time consuming.
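To illustrate the channel-ordering fix mentioned above, here is a minimal sketch. It assumes the mismatch is between the RGBA byte order of DXGI_FORMAT_R8G8B8A8_UNORM and the BGRA byte order of D3DFMT_A8R8G8B8 in memory; the function name SwapRedBlue is hypothetical, and this is not the code the layer actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical helper: swaps the red and blue bytes of tightly packed
// 32-bit pixels in place, converting RGBA byte order to BGRA or back.
void SwapRedBlue(std::uint8_t* pixels, std::size_t byteCount)
{
    assert(pixels != nullptr || byteCount == 0);
    assert(byteCount % 4 == 0);  // whole 4-byte pixels only

    for (std::size_t i = 0; i + 3 < byteCount; i += 4)
    {
        std::swap(pixels[i], pixels[i + 2]);  // bytes 1 (green) and 3 (alpha) stay put
    }
}
```

Because the operation is its own inverse, the same routine would serve for uploads and read-backs. A real layer would probably prefer a matching Direct3D 9 format or a swizzle in the shader where one is available, to avoid touching every pixel on the CPU.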

But I think it is still worth it, because it will make the development of the next generation of Direct3D 10 software that needs to support Direct3D 9 and 10 hardware easier. It removes the “no caps” advantage from Direct3D 10, but as long as older hardware and Windows XP need to be supported, everybody has to deal with caps anyway. Sure, this can be solved with the classic multiple render DLL system that the Unreal engine uses, but not everybody has the time and money to write two render systems for their software.

Before I go into more detail about the mapping in another post, I will end with the announcement that we will shortly start a beta program for this layer. If you are interested, please contact me.

July 16

Another spinning cube

From a visual point of view it is the same cube as last time, but this time it is drawn by a real GPU. As I am not one of the lucky guys with access to a Direct3D 10 GPU, I have used a good old Direct3D 9 chip and a Direct3D 10 to Direct3D 9 layer. The layer doesn’t provide additional features for this card, but it makes it possible to write D3D10 code that runs on Direct3D 9 hardware. Like the software device, it is currently only a prototype and supports only parts of the whole Direct3D 10 interface. As it needs nothing from the original Direct3D 10 runtime, it works well with Windows XP, too.

July 8

Concepts of Direct3D 10: State Objects

The removal of the fixed-function pipeline in Direct3D 10 makes many render states obsolete. But even for those that are left, the rules are different now. Direct3D 10 doesn’t let you change every single state with a simple SetRenderState call like Direct3D 9 does. Render states can only be changed in groups called state objects. There are four of them:

  • Rasterizer
  • Depth Stencil
  • Blend
  • Sampler

With the input layout state object the documentation adds another one, but this is a replacement for the Direct3D 9 vertex declaration and not a collection of render states.

Working with state objects is not hard at all if you start a new Direct3D 10 project from scratch. You always initialize the matching description structure before you call the create method. Later, when you want to set the states, all you have to do is call the set method for this kind of state object. It is easy, but there are still some ways to get it wrong. Like shaders, every state object should be created in advance during initialization or level loading. As there is a good chance that different shaders are combined with the same set of render states, you may come to the conclusion that a state object manager is a good idea to avoid creating multiple objects for the same set. Don’t do it if it means comparing every single state against all of your already created state objects. The runtime already does this and returns the same object again with a higher reference count, so don’t do the same work twice. A hash map that manages state objects based on IDs or names would be fine.
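Such a name-based manager could look roughly like the sketch below. It is an illustration under assumptions, not a finished design: NamedStateCache is a hypothetical class, StateObject stands in for an interface pointer such as ID3D10BlendState*, and the factory is whatever code fills the description structure and calls the matching create method. Reference counting and release are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical cache keyed by name. It never compares individual states;
// it only remembers which object was created under which name.
template <typename StateObject>
class NamedStateCache
{
public:
    using Factory = std::function<StateObject()>;

    // Returns the object registered under `name`, creating it on first use.
    StateObject GetOrCreate(const std::string& name, const Factory& create)
    {
        assert(create && "a callable factory is required");

        auto it = objects_.find(name);
        if (it == objects_.end())
        {
            it = objects_.emplace(name, create()).first;
        }
        return it->second;
    }

    std::size_t Size() const { return objects_.size(); }

private:
    std::unordered_map<std::string, StateObject> objects_;
};
```

All lookups would ideally happen during initialization or level loading, so that the render loop only sets objects it already holds.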

But you will not always have the advantage of starting a new Direct3D 10 only project from scratch. In that case you have to convert from Direct3D 9 or, even worse, make sure that your app runs with both APIs. Most of us have learned that too many calls to SetRenderState are bad and should be avoided. Because of this there are many approaches for making fine-grained render state changes. That is the exact opposite of what Direct3D 10 forces you to do. But if you have to merge these two different worlds, there is no time to cry; better find a solution.

There are two of them, and which one to use depends on the level of control you have over the render engine. If you control the interface between the core render engine and the higher parts of the game, you can rebuild the state object system for Direct3D 9. Simply group the matching states together. You still need to be careful when the higher levels set a state object: in the case of Direct3D 9 you should only update the render states that differ from the current ones. For the moment I will leave this task open.
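The "only update what differs" rule for an emulated state object on Direct3D 9 can be sketched as follows. Everything here is an assumption made for illustration: MockDevice stands in for the Direct3D 9 device and merely counts calls, a group holds four states identified by index, and ApplyStateGroup is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kGroupSize = 4;  // illustrative number of states per group
using StateGroup = std::array<std::uint32_t, kGroupSize>;

// Stand-in for the Direct3D 9 device; it counts calls to show what is skipped.
struct MockDevice
{
    int calls = 0;

    void SetRenderState(std::size_t state, std::uint32_t value)
    {
        assert(state < kGroupSize);
        (void)value;  // a real device would forward the value to the driver
        ++calls;
    }
};

// Applies an emulated state object: only states whose value differs from the
// tracked current value reach the device, and the tracking copy is updated.
void ApplyStateGroup(MockDevice& device, StateGroup& current, const StateGroup& wanted)
{
    for (std::size_t i = 0; i < kGroupSize; ++i)
    {
        if (current[i] != wanted[i])
        {
            device.SetRenderState(i, wanted[i]);
            current[i] = wanted[i];
        }
    }
}
```

The tracked copy has to be initialized from the device defaults and kept in sync if any other code path changes render states directly.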

If you have to migrate an engine to add Direct3D 10 support, there is a high chance that the interfaces are already there and should not change more than necessary. This means you need a system that allows fine-grained render state changes for Direct3D 10. As state objects cannot be changed after they have been created, you may come up with the idea of reading the description of the object and changing the single state there. Then a new state object can be created and set on the pipeline. It will work, but expect some performance problems. The better way is a cache that stores the different state objects. Additionally you will need an array of the single render states. To bring both together you can calculate a hash value over the states, which is then used to find the right state object. Again, I will postpone the presentation of an implementation of such a system to a later time.
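Until then, the idea of an array of single states plus a hash-keyed cache can be outlined like this. It is a sketch under assumptions rather than the promised implementation: the array holds four 32-bit values, StateObject is a stand-in for a real interface such as ID3D10RasterizerState, the hash is 64-bit FNV-1a (swapped in here as one simple choice), and all names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

constexpr std::size_t kTrackedStates = 4;  // illustrative size
using RenderStates = std::array<std::uint32_t, kTrackedStates>;

// 64-bit FNV-1a over the bytes of every state value.
std::uint64_t HashStates(const RenderStates& states)
{
    std::uint64_t hash = 14695981039346656037ull;  // FNV offset basis
    for (std::uint32_t value : states)
    {
        for (int shift = 0; shift < 32; shift += 8)
        {
            hash ^= (value >> shift) & 0xffu;
            hash *= 1099511628211ull;              // FNV prime
        }
    }
    return hash;
}

// Stand-in for an immutable Direct3D 10 state object.
struct StateObject
{
    RenderStates desc;
};

class StateObjectCache
{
public:
    // Returns the cached object for this combination, creating it if needed.
    const StateObject& Get(const RenderStates& states)
    {
        const std::uint64_t key = HashStates(states);
        auto it = cache_.find(key);
        if (it == cache_.end())
        {
            ++created_;
            it = cache_.emplace(key, StateObject{states}).first;
        }
        // A production system must handle hash collisions; the sketch only checks.
        assert(it->second.desc == states);
        return it->second;
    }

    int Created() const { return created_; }

private:
    std::unordered_map<std::uint64_t, StateObject> cache_;
    int created_ = 0;
};
```

The engine would change a single entry of its RenderStates array whenever the old interface sets one state, and call Get once before drawing to fetch the matching object.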

With Direct3D 10 state objects you will be able to reconfigure the whole pipeline with fewer calls than with Direct3D 9. But this change of concept can give you some headaches during migration or in multi-API projects.

July 6

The smart guys

By now most people who have anything to do with GPUs should have heard the message: “No Direct3D 10 for Windows XP”. No surprise after so many news sites have spread the word. After the first wave of curses against Microsoft, some people have started to believe in “the smart guys” who will magically bring Direct3D 10 to Windows XP. Maybe some remember that Direct3D did not work on Windows NT, but these smart guys solved the problem and we didn’t need to buy Windows 2000 with the new driver model that was needed for Direct3D. But wait a minute. Let’s remember what this “hack” did. It allowed you to use the Direct3D runtime and the software devices, but you were still not able to install a Windows 2000 driver on Windows NT to get fully hardware-accelerated 3D. Today a software device would not be an option for a whole game. More work for our smart guys, but we still believe in them.

Let’s look at the way Direct3D 10 will be distributed. It will be part of Vista and is contained in some DLLs. Nice! Let’s just copy these files to an XP system and we will have Direct3D 10 there. First of all this is not legal, but let us ignore that for a moment and try to start a Direct3D 10 application (only as a thought experiment). It will not work, and your favorite DLL dependency viewer will tell you why. The Direct3D 10 runtime is linked against a Visual C runtime that was linked against versions of system DLLs that contain functions that only exist on Vista. So a simple file transfer will not make any user happy. But as our guys are really smart, they can use the Direct3D 10 documentation and implement the whole runtime for Windows XP. As the shader compiler and the effect framework are now part of the core, this will take some time and will only lead to the next problem.

Implementing the user mode part of Direct3D 10 only solves half of the problem. There is another runtime in the kernel. But as we have already gone this far, we can implement this piece of software too. The WDK will be our friend and tell us how. Another huge amount of time passes, but finally Windows XP loads Vista drivers and the screen goes black. Unfortunately the new driver model does not only change the rules for Direct3D, it changes them for GDI too. Without a working GDI most Windows applications will not work. And the desktop is an application too.

 

Let’s implement a new GDI for Windows XP that can work with a Vista driver, and hopefully we can work with our XP again. But gaming would be a little hard. Sure, we have a running Direct3D 10 now, but there are still many games that use older versions of Direct3D, DirectDraw or OpenGL. Some more runtimes that someone needs to write for XP again.

In the end the “smart guys” we believe in have to touch every single part of Windows XP that has something to do with graphics. If they are smart enough to do this without any help from Microsoft, they should be smart enough not to waste their time on it.

But there is still some hope for some kind of Direct3D 10 for Windows XP. Look over at some other smart guys who prefer to use Linux and try to solve the gaming problem there. Both Wine and Cedega provide DirectX support for Linux. Unfortunately it is still limited, but I am sure that at some point in the future they will support everything that the original version supports today. Once they offer Direct3D 10 for Linux, the necessary code should be portable to Windows XP. The only question left is: how many people will still use Windows XP by then?

If you really want to play Direct3D 10 games, you should not put too much hope in the smart guys out there solving this problem for you.
July 5

The pain of cutting compatibility

Warning: Could contain rants.

 

Many people are angry that there will be no Direct3D 10 for Windows XP. I have heard more than once that the decision to make Direct3D 10 exclusive to Vista was only made to force players to upgrade, as Vista will not give them any other advantage. That’s the user side of the decision, but what about us developers?

Using only Direct3D 10 will not only cut off the whole XP user base; at the moment it will cut off the whole user base altogether. Nobody (maybe some lucky guys who work for ATI or nVidia are exceptions) has Direct3D 10 hardware at the moment, and even after the launch of Vista the user base with the right GPU will be small. Isn’t this a clear stop sign for everybody who wants to use Direct3D 10? It surely is, as long as your competitor doesn’t drive through and sell their newest game with “Direct3D 10 effects”. If you don’t have these effects, game reviewers will write that your game is not “visually up to date” anymore.

At this moment the missing compatibility bites you. You have to add Direct3D 10 support to get good reviews, but you can’t drop Direct3D 9 support because there will simply be too many users without the right hardware or operating system to run Direct3D 10. I am not sure how many developers believed that Direct3D would always go the way of one API for every GPU out there. But everybody who did could possibly face some hard work in the near future. Migrating from a monolithic engine that was bound to only one API to a multi-API design will need some time.

I am sure that most developers are not very satisfied with this situation. I like clean cuts that create new ground to start from. But this is more like building new roads on which no current car can drive. Microsoft should avoid creating situations that make game developers’ lives harder, particularly after they have recognized that playing games is among the top three ways people use their computers.

The Xbox team found a way to make sure that games from the first Xbox run on the new one too. Why can’t the responsible Windows team offer a solution that uses only one API for most 3D hardware out there? A solution that gives developers access to the newest GPU technologies but stays compatible with older ones. I was never a friend of the OpenGL extension system, but at this point it can play to its strengths.

Anyway, it could have been so easy: build an additional version of Direct3D 10 that uses the same base concepts and interfaces but works with Direct3D 9 drivers. Add an additional interface for checking caps that can be accessed with a simple QueryInterface call. Call this Direct3D 10 version the “capsed” one and the version we have now the “pure” one. They could even put it only in the SDK as some kind of extension like D3DX. Developers out there would be more willing to migrate to the Direct3D 10 interfaces and would start to build effects that require real Direct3D 10 GPUs and therefore Vista.
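How an optional caps interface behind QueryInterface could be used is sketched below. This is purely illustrative and heavily simplified: string IDs replace GUIDs, bool replaces HRESULT, reference counting is omitted, and IUnknownLike, IDeviceCaps, CapsedDevice and PureDevice are hypothetical names, not part of any Microsoft SDK.

```cpp
#include <cassert>
#include <cstring>

// Simplified stand-in for IUnknown.
struct IUnknownLike
{
    virtual ~IUnknownLike() = default;
    // Returns true and fills *out when the requested interface is supported.
    virtual bool QueryInterface(const char* iid, void** out) = 0;
};

// Hypothetical caps interface that only the "capsed" version exposes.
struct IDeviceCaps
{
    virtual ~IDeviceCaps() = default;
    virtual bool SupportsGeometryShader() const = 0;
};

// "Capsed" device running on Direct3D 9 class hardware.
class CapsedDevice : public IUnknownLike, public IDeviceCaps
{
public:
    bool QueryInterface(const char* iid, void** out) override
    {
        assert(iid != nullptr && out != nullptr);
        if (std::strcmp(iid, "IDeviceCaps") == 0)
        {
            *out = static_cast<IDeviceCaps*>(this);
            return true;
        }
        *out = nullptr;
        return false;
    }

    bool SupportsGeometryShader() const override { return false; }
};

// "Pure" device: no caps interface, every feature is guaranteed.
class PureDevice : public IUnknownLike
{
public:
    bool QueryInterface(const char* iid, void** out) override
    {
        assert(iid != nullptr && out != nullptr);
        *out = nullptr;
        return false;
    }
};
```

An application written this way would treat a failed query as "pure Direct3D 10, everything available" and would only consult the caps when the interface is present.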

 

You may have noticed that one of the Direct3D projects on my list is such a migration helper, but it would be much better if something like this came from Microsoft with full official support. This way more people would know about it and could save themselves from the Direct3D 10 migration hell.

 

July 4

The broken feature?

Some may remember that one of the features nVidia talked about during the NV40 (GeForce 6800) launch was a hardware tone mapping unit. But later it went missing in action. The public never saw an API extension or a demo of this feature. Today tone mapping is still done with a pixel shader program, and the rumor mill says the feature was broken.

But finally there is a sign of life. US Patent 7,071,947, “Automatic adjustment of floating point output images”, describes how it might have been supposed to work. If you are not afraid of reading patents, have a look.
 
Ralf
July 3

First Cube

Finally I have something to show. The software device is able to render the cube from the Direct3D 10 tutorial 04.

Still a long way to go but the first step is done.

 

Ralf

Direct3D 10 Software Device

Someone may ask why I am working on a software device for Direct3D 10 when there is already the reference rasterizer and real hardware is around the corner. This question is not easy to answer, but one of the primary motivations was the technical challenge. But this alone was not enough to fire up my compiler and write some code. The annoying slowness of RefRast was another brick in the wall. In the end, though, the motivation comes from the idea of using Direct3D 10 for every kind of SIMD processor. As modern multicore CPUs are SIMD processors too, it was natural for me to start there.

This leads to the next question: why would someone use Direct3D 10 to make use of CPU power? Well, multicore and SSE programming is something that not everyone wants to do. But nearly every graphics programmer out there knows how to write code for a SIMD processor using a 3D API and shaders. Sure, we have OpenMP, and the SSE output from the different compilers is getting better too. But that doesn’t solve the load-balancing problem. PCs are individually configured and we never know where we can find some additional calculation power. Having a common API for every device in the system that can do SIMD math would let us move the SIMD calculations anywhere without changing the code.

But hasn’t RefRast shown us that the CPU is too slow to handle Direct3D 10 quickly? No, it hasn’t. RefRast shows that it is possible to write a slow software emulation of a Direct3D 10 GPU. RefRast was built to be correct and to act as a development tool; it was never built to be fast. Think of shader programs as byte code like Java or .NET use. Java was not very fast as long as the byte code was interpreted, but a JIT compiler can do wonders. The same is true for shader model 4 shaders, and with new Direct3D 10 features like stream out we don’t even have to pay for rasterization if we don’t need it.

Maybe I am wrong, as many developers fear losing direct control, but for anybody who is willing to allow an additional layer between the program and the CPU, a Direct3D 10 software device could be a useful extension to the toolbox.

 

Ralf