DirectDraw with system memory very slow?
programmer_ted
08-06-2003, 08:06 PM
Hi! If you haven't seen me here before, that's because I'm new. Just a quick question. I'm using DirectDraw 7 in the game I'm working on, and it appears that when using system memory, the game is SIGNIFICANTLY slower. We're talking from 100+ FPS, to 0-10 FPS. Why? Thanks in advance!
BarrySlisk
08-07-2003, 02:30 AM
Because system memory is slow.
The image data must be sent via the bus to the graphics card before it can be drawn.
When using video mem the image data is already there, and therefore faster.
Khaile
08-07-2003, 03:42 AM
Well... system memory isn't "slow"; it's the transfer rate between system memory and the video card (via the bus) that is slow. If your backbuffer is placed in the same memory space as the images you want to draw on it, you shouldn't lose as much performance. I don't really remember if the DDraw flipchain allowed backbuffers to be placed in system memory, so you will perhaps need to create a new surface with the same resolution as the screen and then blit that surface instead of using a flip.
Another important thing to know is that the AGP bus is intended for one-way transmissions. Going from system memory to video memory is really fast, but going back the other way is really slow. This means that you don't have to worry too much about copying images and pixels to your video memory, but you should never ever read from it, unless you're doing something where the user can accept a slowdown (such as when capturing a screenshot).
When I worked at Oblivion Entertainment, and we were still using DDraw for graphics, we had a system memory buffer instead of a video memory buffer, since it allowed us to use blending and similar effects in our 2D game (blending requires reading, so video memory was a no-no). Though I must admit we decided to scrap DDraw in favor of D3D, since it makes much better use of the video hardware most people have today. Of course, shareware games often target people who don't always have the latest technology, so I'm not blaming anybody for sticking with DDraw.
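To illustrate why blending forces a read of the destination, here is a minimal sketch of a software alpha blend on a buffer in system memory. It assumes a 16bpp RGB565 pixel format (the thread does not say which format was used), and `blend565` and `blendSpan565` are hypothetical names, not DirectDraw functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Blend one RGB565 source pixel over a destination pixel.
// alpha runs from 0 (keep destination) to 256 (take source).
inline std::uint16_t blend565(std::uint16_t src, std::uint16_t dst, unsigned alpha)
{
    assert(alpha <= 256);
    const unsigned inv = 256 - alpha;
    const unsigned r = (((src >> 11) & 0x1F) * alpha + ((dst >> 11) & 0x1F) * inv) >> 8;
    const unsigned g = (((src >> 5) & 0x3F) * alpha + ((dst >> 5) & 0x3F) * inv) >> 8;
    const unsigned b = ((src & 0x1F) * alpha + (dst & 0x1F) * inv) >> 8;
    return static_cast<std::uint16_t>((r << 11) | (g << 5) | b);
}

// Blend a run of pixels. Every destination pixel is read before it is
// written, which is the access pattern that is slow on video memory.
inline void blendSpan565(std::uint16_t* dst, const std::uint16_t* src,
                         std::size_t count, unsigned alpha)
{
    for (std::size_t i = 0; i < count; ++i)
        dst[i] = blend565(src[i], dst[i], alpha);
}
```

With the destination in system memory, both the read and the write stay on the CPU side of the bus; only the finished frame has to cross it.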
Hope that helps!
LordKronos
08-07-2003, 04:01 AM
No, system memory isn't that slow. More than likely, your problem is that you have some things stored in video memory, some in system, and you keep blitting back and forth. If you do everything in system memory and then just blit to the screen once per frame, you shouldn't have any problem getting 50 FPS on a P200 unless you are doing massive amounts of drawing or tons of alpha blending.
BrewKnowC
08-07-2003, 05:32 AM
Originally posted by LordKronos
If you do everything in system memory and then just blit to the screen once per frame, you shouldn't have any problem getting 50 FPS on a P200 unless you are doing massive amounts of drawing or tons of alpha blending.
Sorry if this sounds ignorant, but can/should the primary surface get stored in system memory as well?
BarrySlisk
08-07-2003, 07:00 AM
Originally posted by Khaile
Well... system memory isn't "slow"; it's the transfer rate between system memory and the video card (via the bus) that is slow.
Yes, I explained it poorly.
BarrySlisk
08-07-2003, 07:02 AM
Originally posted by BrewKnowC
Sorry if this sounds ignorant, but can/should the primary surface get stored in system memory as well?
I don't think that's possible, but creating the surface in system mem manually is probably just as good.
programmer_ted
08-07-2003, 08:58 AM
Hi. Basically, so far we've allowed DirectDraw to choose where it should allocate surfaces. But I did try yesterday putting everything in system memory and got 0-10 FPS. Actually, now that I think about it, the back buffer was still in video memory. That way I'd be drawing sprites (system memory)->back buffer (video memory)->primary surface (video memory). Still, there doesn't appear to be a problem here. Strange.
programmer_ted
08-07-2003, 10:11 AM
Hmm...Just tried putting the back buffer in system memory too and had only a 5-10 FPS increase (now 0-20). Weird.
Dan MacDonald
08-07-2003, 10:14 AM
Are you running windowed or fullscreen?
programmer_ted
08-07-2003, 10:43 AM
Fullscreen.
BrewKnowC
08-07-2003, 01:23 PM
I'm really interested in hearing if you come up with a solution because I believe my game is suffering from the same thing. If I come to any epiphanies, I'll let you know.
programmer_ted
08-07-2003, 01:42 PM
Thanks. I'll be sure to let you know if I come up with something.
programmer_ted
08-07-2003, 02:01 PM
Well I found a page here that says you should have all your sprites and an off-screen back buffer in system memory, and then in video memory just the primary surface and back buffer (in a flipping chain). Then copy to the buffer that isn't showing and flip. I don't really know how this would solve the problem, but if you want to give it a shot, go ahead. I don't have the time at this moment to try it though, so if you finish before I do, tell me how it went ;)
programmer_ted
08-07-2003, 02:02 PM
Oh, and here's the page (courtesy of Experts-Exchange):
http://www.experts-exchange.com/Programming/Programming_Languages/Cplusplus/Q_20605141.html
jordan1207
08-07-2003, 06:29 PM
Hey guys... I'm working with Ted.... so obviously I have basically the same question he does ;) . I'm just wondering how people make games that run alright without requiring a 16MB video card... I mean, people do do that, right? We just can't seem to get around this...
MirekCz
08-07-2003, 11:30 PM
jordan: it's simple.
step 1.
Create primary and back buffer in vidmem
(for an 800x600x16bpp image you need 480,000 pixels * 2 bytes * 2 surfaces = 1,920,000 bytes, so 2 MB of vidmem should be enough)
step 2.
Create 800x600x16bpp surface in sysmemory
step 3.
Each frame:
a)draw new frame to sysmemory (using some asm in your blitting routines and mmx will help heaps to speed it up)
b)copy sysmemory surface to vidmemory backbuffer
c)flip vidmemory front/back buffer to see new image
d)process game data
e)goto step a
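The steps above can be sketched in portable C++ as follows. This is only a stand-in, not DirectDraw code: `Surface16`, `Display`, and the helper functions are hypothetical names, and plain vectors play the role of the two video memory surfaces and the one system memory surface so the sketch runs anywhere. In a real program the copy in step b would be a blit (or a locked copy) into the back buffer and step c would be a flip of the chain.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Stand-in for one 16bpp surface.
struct Surface16 {
    std::size_t width, height;
    std::vector<std::uint16_t> pixels;
    Surface16(std::size_t w, std::size_t h) : width(w), height(h), pixels(w * h, 0) {}
    std::size_t bytes() const { return pixels.size() * sizeof(std::uint16_t); }
};

struct Display {
    Surface16 front, back;  // step 1: primary and back buffer ("vidmem")
    Surface16 work;         // step 2: drawing surface ("sysmemory")
    Display(std::size_t w, std::size_t h) : front(w, h), back(w, h), work(w, h) {}
};

// Step 3a: all drawing happens on the system memory surface.
// A solid fill stands in for the game's real blitters.
inline void drawFrame(Surface16& s, std::uint16_t color)
{
    std::fill(s.pixels.begin(), s.pixels.end(), color);
}

// Step 3b: one bulk copy per frame from system memory to the back buffer.
inline void copyToBackBuffer(Display& d)
{
    assert(d.work.bytes() == d.back.bytes());
    std::memcpy(d.back.pixels.data(), d.work.pixels.data(), d.work.bytes());
}

// Step 3c: swap front and back so the new image becomes visible.
inline void flip(Display& d)
{
    std::swap(d.front.pixels, d.back.pixels);
}

inline void presentFrame(Display& d, std::uint16_t color)
{
    drawFrame(d.work, color);
    copyToBackBuffer(d);
    flip(d);
}
```

The point of the layout is that the bus is crossed exactly once per frame, in one direction, and nothing ever reads back from the video memory surfaces.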
Everything is quite simple. Getting basic blitters to work in software isn't a problem. With a week or two of work you can get some advanced RLE blitting with alpha-blending that is asm/MMX optimized and has pretty good performance.
The biggest downside of this method is the sysmemory->vidmemory copy, as it's time consuming. For example, on a D600 with a KT133 chipset, 133 MHz SDRAM and an AGP 2x TNT video card, it takes about 1/90 of a second to copy a 640x480x32bpp screen using a small MMX routine.
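Working through the figures quoted above: a 640x480x32bpp frame is 1,228,800 bytes, so one copy per 1/90 of a second comes to about 110.6 million bytes per second, and at a 30 FPS target the copy alone uses roughly a third of each frame's time. A small sketch of that arithmetic (the function names are hypothetical, and the 30 FPS target is an assumption taken from later in the post):

```cpp
#include <cassert>
#include <cstdint>

// Size of one full-screen frame in bytes.
inline std::uint64_t frameBytes(std::uint64_t width, std::uint64_t height,
                                unsigned bitsPerPixel)
{
    assert(bitsPerPixel % 8 == 0);
    return width * height * (bitsPerPixel / 8);
}

// Sustained copy rate in bytes per second if one frame takes
// 1/copiesPerSecond of a second to transfer.
inline double copyBandwidth(std::uint64_t bytes, double copiesPerSecond)
{
    return static_cast<double>(bytes) * copiesPerSecond;
}

// Fraction of each frame's time budget spent on the copy.
inline double frameTimeShare(double copiesPerSecond, double targetFps)
{
    assert(copiesPerSecond > 0.0);
    return targetFps / copiesPerSecond;
}
```

Dropping to 16bpp halves the bytes per frame, which is one reason the lower colour depths mentioned in the post help on slow machines.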
Overall, depending on your screen res (640x480x16bpp, 800x600x16bpp, or higher, like 32bpp), your game will have different min. spec requirements.
To get 30fps at 640x480x16bpp, something like a P200MMX should be enough, depending on how much graphics you process. (there's obviously a huge difference between a puzzle game with a few alpha-blending tricks and a Diablo clone :-)
For slower machines, obviously 640x480x8bpp mode (like Diablo 1) or lower should be used. (also remember that most CPUs under 200 MHz don't have MMX support)
PS.
To make one thing clear, when you draw to sysmemory in point "a)" you use resources from sysmemory. So you keep all your images/animations/etc in sysmemory.
I never initialize any surfaces other than those 3 (2x in vidmem and 1x in sysmem). When drawing, I simply lock the surface in sysmemory and use malloc'ed data in sysmemory to draw to it, without any other surfaces.
PPS. It might be possible (well, I have never tried) that keeping 2 surfaces in sysmemory and writing to the other one each frame will speed things up a bit, as you can draw to one surface while the other is copied to vidmem in the "background". Although I doubt it will work out, you might want to try it if you care :-)
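As a rough sketch of that approach, the function below draws a sprite held in ordinary heap memory into a locked 16bpp surface, skipping a transparent colour. It assumes the lock gives you a pointer to the surface bits and a pitch in bytes (in DirectDraw these would come from the surface description filled in by the lock); `blitColorKey16` and its parameters are hypothetical names, and the colour-key scheme is an assumption, not something described in the post.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copy a spriteW x spriteH block of 16bpp pixels to (x, y) on the
// destination, leaving destination pixels alone wherever the sprite
// pixel equals colorKey. The blit is clipped to the destination.
inline void blitColorKey16(std::uint8_t* dstBits, std::size_t dstPitch,
                           int dstW, int dstH,
                           const std::uint16_t* sprite, int spriteW, int spriteH,
                           int x, int y, std::uint16_t colorKey)
{
    assert(dstBits != nullptr && sprite != nullptr);
    // The pitch may be larger than width * 2, so rows are addressed by pitch.
    assert(dstPitch >= static_cast<std::size_t>(dstW) * sizeof(std::uint16_t));

    const int x0 = std::max(x, 0);
    const int y0 = std::max(y, 0);
    const int x1 = std::min(x + spriteW, dstW);
    const int y1 = std::min(y + spriteH, dstH);

    for (int dy = y0; dy < y1; ++dy) {
        std::uint16_t* row = reinterpret_cast<std::uint16_t*>(
            dstBits + static_cast<std::size_t>(dy) * dstPitch);
        const std::uint16_t* srcRow = sprite + (dy - y) * spriteW;
        for (int dx = x0; dx < x1; ++dx) {
            const std::uint16_t p = srcRow[dx - x];
            if (p != colorKey)
                row[dx] = p;
        }
    }
}
```

Stepping rows by the pitch rather than by the width matters because a surface's rows are not guaranteed to be packed back to back.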
Mattias
08-08-2003, 12:24 AM
Hey, Khaile, this is very off-topic, but you mentioned Oblivion Entertainment... is that the Swedish company you're talking about?
LordKronos
08-08-2003, 03:53 AM
If you are still having performance problems, I would try to break this into 2 steps to see where your problem is. First I would try disabling the copying from system memory to video memory and see what kind of performance you get. Of course, without the copying, you aren't actually going to see anything on the screen, so you will have to log your FPS counter to a file once a second or something. If your performance there is good, then the bottleneck is in your copying.
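A minimal sketch of such a once-a-second FPS log, assuming you call it once per frame with the current time in seconds. `FpsLogger`, `nowSeconds`, and the log file name are hypothetical; the time is passed in rather than read inside the class so the counter can be checked with fixed timestamps.

```cpp
#include <cassert>
#include <chrono>
#include <fstream>
#include <string>

class FpsLogger {
public:
    explicit FpsLogger(const std::string& path) : out_(path) {}

    // Call once per frame. Returns true when a line was written to the log.
    bool frame(double nowSeconds)
    {
        if (!started_) {            // first call only marks the start time
            started_ = true;
            windowStart_ = nowSeconds;
            return false;
        }
        ++frames_;
        const double elapsed = nowSeconds - windowStart_;
        if (elapsed < 1.0)
            return false;
        lastFps_ = frames_ / elapsed;
        out_ << lastFps_ << " FPS\n";
        out_.flush();               // keep the log useful if the game is killed
        frames_ = 0;
        windowStart_ = nowSeconds;
        return true;
    }

    double lastFps() const { return lastFps_; }

private:
    std::ofstream out_;
    bool started_ = false;
    double windowStart_ = 0.0;
    double lastFps_ = 0.0;
    unsigned frames_ = 0;
};

// One possible time source for real use: a monotonic clock in seconds.
inline double nowSeconds()
{
    using clock = std::chrono::steady_clock;
    return std::chrono::duration<double>(clock::now().time_since_epoch()).count();
}
```

In the game loop this would be something like `logger.frame(nowSeconds());` once per frame, first with the sysmemory->vidmemory copy enabled and then with it disabled, to compare the two logs.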
Also, make sure you are running your performance tests on an optimized build. Depending on what you are doing, sometimes unoptimized (debug) code can run at 1/10 the speed of optimized (release) code (so your 20FPS could actually be 200FPS once you rebuild it).
programmer_ted
08-08-2003, 07:15 AM
MirekCz, thank you for your very informative post. I changed around the code a little to reflect what you said, but the game still runs incredibly slow (when using system memory). I'm not exactly sure what I've done wrong here...
LordKronos: I'll try that, and I am testing this on an optimized build.
programmer_ted
08-08-2003, 07:25 AM
It appears that the problem is in blitting. I'm using the BltFast function of the IDirectDrawSurface7 interface, with the DDBLTFAST_WAIT flag. I'm going to try using Blt, and if that doesn't help, try removing DDBLTFAST_WAIT.
programmer_ted
08-08-2003, 07:37 AM
Well I guess that didn't work either. Hm...
MirekCz
08-08-2003, 09:48 AM
programmer_ted:
email me at kherin@go2.pl and I will dig out my old source code and send it to you.
It basically only draws some alpha-blended sprites to sysmem and then blits the result to vidmem. It's quickly written MMX code, but it works quite fine.