Yes, you are reading this right. NVIDIA has finally released its long-awaited next-generation video card, one that moves away from the previous architecture. The GeForce GTX 480 is the high-end GF100 card, making it the most powerful video card NVIDIA currently offers. While the delays have stacked up a little, I can finally say NVIDIA has a DirectX 11 card out to compete with AMD's offering. Now, was the card worth the wait?
At their press conference a few months ago, NVIDIA laid out the goals behind their next GPU architecture, which I will reiterate here. There were four points outlined in the presentation: Geometric Realism, Unrivaled Image Quality, Revolutionary GPU Compute for Gaming, and Highest Performance GPU Ever Built. The first two and the last one should always be, in my opinion, the targets for a brand new architecture. It’s only recently that we've seen the GPU’s computing power put to work on more than graphics, so that is a natural addition to the goals. The GF100 is what came out of addressing those four points. NVIDIA could have bolted parts onto their current architecture to handle tessellation, a core component of DirectX 11, but there are bottlenecks there that would’ve gotten in the way of producing a powerful and fast video card. Below is the brand new design of the GF100.
As you can see from the diagram, there are four GPCs, or graphics processing clusters. Each cluster contains four SMs, or streaming multiprocessors. Within each SM are 32 CUDA cores, shared memory and L1 cache that can be configured as 48KB/16KB or 16KB/48KB, ISA improvements, 4 texture units, and 1 PolyMorph engine. Compared to GT200, their last architecture, each SM has 4X the CUDA cores and 3X the shared memory, and it adds an L1 cache where GT200 had none. But this was a few months ago. The GeForce GTX 480 won't ship with 512 cores as initially thought but with 480. Word around the Net is that yields were low for 512-core parts, which may be why the initial high-end cards have 480 cores. The good news is that it gives NVIDIA a little room to grow, and they can release a higher-end part using all 512 cores in the future. Also, from the presentation a few days ago, there are about the same number of transistors on the GTX 480 as there are in four, yes four, Intel Core i7 CPUs. We're talking over 3 billion transistors, so there's a ton of them in this video card alone.
In the middle of it all is a big L2 cache block. L2 cache size is determined by the number of memory controllers: with six memory controllers there is a total of 768KB of L2 cache, and taking away one memory controller removes 128KB. The L2 cache allows for efficient communication between SMs.
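The unit counts above are easy to sanity-check with a little arithmetic. The sketch below is only an illustration: the figure of four SMs per GPC is an assumption read off NVIDIA's block diagram, and the function names are made up for this example.

```python
# Back-of-the-envelope unit counts for GF100, using the figures quoted in this review.
GPCS = 4                      # graphics processing clusters on the full chip
SMS_PER_GPC = 4               # assumption: read from NVIDIA's block diagram
CORES_PER_SM = 32             # CUDA cores in each SM
L2_KB_PER_CONTROLLER = 128    # L2 cache contributed by each memory controller


def cuda_cores(active_sms: int) -> int:
    """Total CUDA cores for a given number of enabled SMs."""
    return active_sms * CORES_PER_SM


def l2_cache_kb(memory_controllers: int) -> int:
    """Total L2 cache in KB for a given number of memory controllers."""
    return memory_controllers * L2_KB_PER_CONTROLLER


full_chip_sms = GPCS * SMS_PER_GPC
print(cuda_cores(full_chip_sms))      # 512, the full GF100
print(cuda_cores(full_chip_sms - 1))  # 480, the GTX 480 with one SM disabled
print(l2_cache_kb(6))                 # 768
print(l2_cache_kb(5))                 # 640
```

Dropping a single SM is what takes the part from 512 cores to 480, and each memory controller removed takes 128KB of L2 with it.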
So you have these split up units working together to render scenes. Because many of the items can be rendered in parallel, the architecture is set to work on these tasks efficiently. There’s a lot of work though to ensure that in the end everything renders and comes out in the right order, which NVIDIA has spent a lot of time perfecting.
Coming with the release of the GeForce GTX 480 and 470 is a new coverage sample anti-aliasing (CSAA) mode, 32x CSAA, which consists of 8 color samples and 24 coverage samples. It improves on the previous method with better accuracy and quality. One example shown was a railing rendered with the old CSAA against the new one. With the old method, some railings were missing, but 32x CSAA showed a more accurate picture with the railings appearing. It didn’t look that nice up close, but from a distance the new technique gives a more accurate representation.
What NVIDIA is trying to accomplish with Geometric Realism is to make rounded objects look less triangulated. For example, in the Far Cry 2 picture supplied by NVIDIA, you can see angles on the holster of the gun and the shoulder of the character. Yes, it’s supposed to be a curved surface, but everything is made up of triangles and you need a lot of little triangles to make a curved shape. GF100 is looking to solve this problem, and part of the solution, as I mentioned earlier, is tessellation.
With tessellation, one stores a rough shape of the object, which is also what the program animates, while the hardware adds in more triangles to make the object look smoother. The hardware subdivides the triangles of the rough shape to produce a much smoother one. Developers have control over how much they want objects tessellated. Tessellation can also add detail and definition, such as turning a rather flat stone road into a very bumpy one by adding more triangles, all the while keeping it looking smooth. The process also knows what to tessellate and what not to, so you won't see added geometry in places such as the corner of a wall. To do this without bogging down the machine, DirectX 11 supports hardware-accelerated tessellation, and GF100 is NVIDIA’s first architecture to support DX11 and, in turn, hardware tessellation.
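To get a feel for how quickly subdividing adds geometry, here is a minimal sketch of the idea. It uses plain midpoint subdivision on the CPU, which is a stand-in chosen for illustration and not how the DirectX 11 tessellation stages or the GF100 hardware actually work; real tessellation also moves the new vertices to follow a curve or displacement map. All names here are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]
Triangle = Tuple[Point, Point, Point]


def midpoint(a: Point, b: Point) -> Point:
    """Point halfway between a and b."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2)


def subdivide(tri: Triangle) -> List[Triangle]:
    """Split one triangle into four by connecting the midpoints of its edges."""
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]


def tessellate(mesh: List[Triangle], levels: int) -> List[Triangle]:
    """Apply midpoint subdivision to every triangle, `levels` times over."""
    for _ in range(levels):
        mesh = [small for tri in mesh for small in subdivide(tri)]
    return mesh


# A single coarse triangle standing in for the "rough shape" the game stores.
coarse = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))]
for level in range(4):
    print(level, len(tessellate(coarse, level)))  # 1, 4, 16, 64 triangles
```

Each level quadruples the triangle count, which is why the game only stores and animates the coarse mesh and leaves the fine geometry to the hardware.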
DirectCompute should also be a big thing for gaming in the near future. Right now, you can experience some benefits first hand with some of the CUDA-enabled applications. By taking advantage of the cores on the GPU, some operations can be performed quicker than by using the CPU alone. Developers will be able to use the GPU for far more than just outputting graphics. Metro 2033 was one example given of a game using DirectCompute to improve its visuals. The picture shown was of a scene where the items up close were clear while those farther away were blurred. Using DirectCompute, it was faster to accomplish this effect than with standard rendering modes.
So let's take a look at the reference card that NVIDIA sent to me. Now what you see here will be the cards that the other OEMs will be releasing for the short time after the launch. It will take some time for their board partners to put their special touches on the card to make it unique to them but the initial releases will definitely be the same as the reference card.
The GeForce GTX 480 is their flagship and will retail for $499. Here are the specifications:
- 700MHz GPU
- 1401MHz CUDA cores
- 924MHz Memory
- 15 streaming multiprocessors
- 480 CUDA cores
- 1536MB of memory
- 384-bit memory interface
- 250W TDP
The architecture is built on the 40nm process, which is not their first 40nm release. It's got double the shader power and double the number of ROP units per partition of their previous generation high end card. The texture units have been redesigned for better performance. What NVIDIA claims is the GeForce GTX 480 will be 1.5 to 3 times more powerful than the GeForce GTX 285 because of all the changes. The GeForce GTX 480 and the GTX 470 are the first NVIDIA cards to support DirectX 11 so they are ready for the next generation of PC gaming.
The card measures 10.5" long, which matches the reference GeForce GTX 275 that was sent to me a while ago. Because the card has a 250W TDP, it requires a 6-pin and an 8-pin power connector. These two connectors are located on top and face up. Because of how the power connectors are situated, the designers at NVIDIA opted to have the four heat pipes protrude from the top, both for performance and aesthetics. Some people like it and some people don't, but I have no issues with the card showing off some of the heat pipes that way. The dual-slot card is pretty heavy, so it feels solid and well built when you hold it in your hands. The bracket holds two dual-link DVI connectors and a mini-HDMI connector. DisplayPort can be added by another vendor if they choose to do so.
The highest priced AMD card with a single GPU is the AMD Radeon HD 5870. A check at NewEgg.com shows that the card sells for as little as $420 from XFX. The NVIDIA GeForce GTX 480 will retail for $499, a $79 premium over the competition's best card. It's also coming about five months after the release of the AMD Radeon HD 5870. It's too bad I don't have a 5870 to compare the GeForce GTX 480 with, but from talking to my colleagues, the card does perform faster than the AMD offering in most situations. The gap isn't big in games using DirectX 10 or earlier, but the card is built for future games, so we won't know whether the GeForce GTX 480 will pull farther ahead of the AMD Radeon HD 5870 until more DirectX 11 games using tessellation come out.
While we don't have an AMD card to test against, I will show how much of an improvement you get over NVIDIA's last generation card in this piece. We have a reference GeForce GTX 275 here to run some benchmarks against. The test setup includes:
- Intel i7-860 CPU
- 4GB DDR3 memory
- MSI P55-GD65 motherboard
- Seagate 2TB 7200RPM drive
- Windows 7 64-bit Ultimate
- 196.21 drivers for the GeForce GTX 275
- 197.12 drivers for the GeForce GTX 480
All tests were run at 1680x1050 resolution except for 3DMark Vantage, which was run at its default settings.
First up will be 3DMark Vantage, the synthetic benchmark by Futuremark.
Batman: Arkham Asylum uses the Unreal engine and was one of the surprise hit games of last year. Here we turned on everything at the maximum and ran the built in benchmark five times, averaging the scores. PhysX was turned off in these runs but I'll get into how much of a performance hit you do get when you turn on PhysX in Batman.
As you can see, there's a nice performance gain in this game over the GeForce GTX 275.
Dawn of War II is an awesome RTS from the makers of Company of Heroes. If you're an RTS fan, you owe it to yourself to pick up this game. Here we turned on all the max settings and used the built-in benchmark.
Now, here's a game that didn't really show that much improvement, which is a little disappointing. I would've expected more than a 3.5 FPS increase but Dawn of War II's.
When talking about video cards, someone almost always asks whether it can run Crysis well. With Crysis, I set everything to maximum and ran the built-in flyby benchmark of the island a few times.
Now this is a lot better. Here we see a healthy 16 FPS increase or a 38% increase in speed.
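For anyone who wants to relate the FPS delta to the percentage, the arithmetic can be sketched as follows. The only inputs are the two numbers quoted above (a 16 FPS gain and 38%); the baseline and new frame rates printed here are implied by those rounded figures, not measured values, and the function names are just for this example.

```python
def percent_gain(old_fps: float, new_fps: float) -> float:
    """Percentage increase going from old_fps to new_fps."""
    return (new_fps - old_fps) / old_fps * 100


def implied_baseline(fps_delta: float, gain_percent: float) -> float:
    """Baseline frame rate implied by an absolute gain and a percentage gain."""
    return fps_delta / (gain_percent / 100)


baseline = implied_baseline(16, 38)          # approximate, since 38% is rounded
print(round(baseline, 1))                    # 42.1
print(round(baseline + 16, 1))               # 58.1
print(round(percent_gain(baseline, baseline + 16)))  # 38
```

In other words, the quoted figures work out to roughly 42 FPS on the GeForce GTX 275 and roughly 58 FPS on the GeForce GTX 480.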
Dragon Age: Origins is up next and it's also one of the best PC RPG games released recently. For testing, I loaded up a saved game near the beginning inside the castle and did a run through of various areas. I took the same path a few times and averaged out the score reported by Fraps.
There's an increase here of 19% over the previous generation's card on the system, which isn't too shabby.
Metro 2033 is a brand new game from THQ using an engine that's got some great DirectX 11 features such as tessellation. It's also pretty hard, as I'm even having a tough time getting through the game with enough ammunition in tow. For testing, I loaded up a saved game in a tunnel, made a run through, and recorded the score from Fraps. For this test, I had it set at DirectX 10 for two runs and included the score when I changed to DirectX 11.
Here we get a good increase of 43% when using the GeForce GTX 480 in DirectX 10 mode. Turning on DirectX 11 costs about 25 FPS on average, but the result is still ahead of the GeForce GTX 275's DirectX 10 score. Also, you'll have to turn off Depth of Field if you want to play with the GeForce 3D Vision kit in DirectX 11. Otherwise, there will be no 3D when you put the glasses on. From the scores though, I'm very happy with what I got from the new engine on the GeForce GTX 480 in DirectX 10 mode. In a future article, I'll talk more about the differences between DirectX 10 and DirectX 11, but for those wanting the advanced graphical features Metro 2033 brings in DX11, the GeForce GTX 480 will let you have them.
Far Cry 2's a little older game but it's a solid DirectX 10 game. I used the built in benchmark to run through a few iterations while averaging out the score.
Like Metro 2033, we got a very high increase in FPS with a 48% change over the GeForce GTX 275.
Left 4 Dead 2 is one of my favorite PC games and it's such a great sequel to a game I've spent over 150 hours playing. At least that's what my Steam client tells me. I've set everything to maximum and turned on multi-core rendering as well.
Not as much of an increase compared to Metro 2033 or Far Cry but 26% is still pretty good. It's actually at the point where I could get close to 60 FPS when using 3D Vision, which would make for some buttery smooth 3D playing.
The final test goes to Battlefield: Bad Company 2, which is a great, great game from DICE if you guys haven't tried it out. I love the destructible environments, and the multiplayer fun is really, really top notch. I turned everything on and ran the beginning of the High Value Target level, where I rode in a jeep across a bridge while being attacked by enemies with rockets. There are a lot of water effects and foliage in this level. I ran the scene multiple times and averaged out the scores from Fraps.
Like Dawn of War II, there's only a minor increase in FPS, but I did find that the minimum FPS for the GeForce GTX 480 was double that of the GeForce GTX 275, going from 20 to 40. I'm a little disappointed in the small increase, but I am happy it seemed to give more consistent performance throughout testing.
So depending on the game you'll either see minor gains or very high gains. Some of the newer games fared a lot better than others but all in all, I was pretty happy with how things turned out.
When idling, the card is pretty warm to the touch. Run a game, though, and the temperature climbs quickly until the card becomes too hot to keep your finger on for more than a second or two. Idle sits at around 41C. I used FurMark to run the card at load, and it wasn't long before the card heated up to 96C. It was then that the fan kicked up and cooled it down to 91C, holding it there throughout the 15 minute test. Yeah, it gets pretty hot, so I'm glad the card has a two-slot cooler that funnels most of the heat out the back. Still, there's a good amount of heat radiating from the heat sink and heat pipes, so it might be a factor in heating the inside of your case a bit. This test was done on an open bench, though, so it might be a little better once you put it in a case and have nice air flow through it.
Now, you know you get more than just graphics with NVIDIA cards. Stereoscopic 3D is one of the cooler technologies available and it's only currently available with NVIDIA's GeForce 3D Vision kit. When activated, you can pretty much bet that your framerate will be cut in half since the scene is rendered twice. The added speed with the GeForce GTX 480 over my previous card makes the gaming experience in 3D a lot better because I am getting smoother gameplay. I won't get into 3D Vision too much as I talk more in depth in the review as well as my Through Active Shutter Glasses piece. In any case, I'm happy for the extra power the GeForce GTX 480 provides in making 3D gaming a little more pleasant.
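The "cut in half" rule of thumb is simple enough to put into a couple of lines. This is a rough sketch of that rule only, assuming the cost of stereo rendering is exactly two renders per frame; real-world overhead will vary by game, and the function names are made up for this example.

```python
def stereo_fps(mono_fps: float) -> float:
    """Approximate per-eye frame rate when every scene is rendered twice."""
    return mono_fps / 2


def mono_fps_needed(target_per_eye_fps: float) -> float:
    """Approximate 2D frame rate needed to reach a given per-eye rate in 3D."""
    return target_per_eye_fps * 2


print(stereo_fps(120.0))      # 60.0
print(mono_fps_needed(60.0))  # 120.0
```

By this estimate, a game needs to run at about 120 FPS in 2D to hold 60 FPS per eye in 3D, which is where the extra headroom of a faster card pays off.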
Coming soon though will be 3D Vision Surround, which lets you experience 3D gaming using three monitors. This feature will first be available for the GeForce GTX 480 and 470 cards, but you'll also be able to use the older GTX 200 series of GPUs to play this way. Since you are putting three monitors side by side, there will be a nice piece of software to account for the bezels so the pictures line up correctly. It's going to be an expensive venture, though, because you have to pick up at least two GeForce GTX 470 or 480 cards as well as three 120Hz monitors. It's definitely something you have to experience, as I found the feature to be pretty fun to play with at CES.
PhysX acceleration is another feature that's only available on NVIDIA cards, and I've actually enjoyed some of the added effects when turning this option on. I use another NVIDIA card in my system to do PhysX acceleration so it takes the load off the main video card, but what if you don't have that option? I've provided one example here, comparing PhysX performance in Batman: Arkham Asylum with the GeForce GTX 480 doing both the video and the PhysX. I ran through the built-in benchmark at three settings and compared the values.
The added visual items when turning on PhysX in Batman make for a more immersive experience. You're still getting good performance with PhysX set to Normal as well as High, but I would suggest using a separate NVIDIA card for PhysX if you can.
NVIDIA’s going to be bundling this cool little PhysX program where you shoot this guy down a track in a rocket sled and can do things like blow the sled apart or watch the bridge collapse in various pieces. It really shows some of the cool things you can do with PhysX and with the power of the GeForce GTX 480. If you saw the demo at CES, it’s the same thing and it’s quite fun to play with.
Like all NVIDIA cards, the GeForce GTX 480 will support SLI and as you can see from the pictures, there are SLI connectors on the top. NVIDIA claims you’re going to get near linear scaling in performance when paired up with one or more cards so that means if you run two of these bad boys, you should get about a 90% performance increase on average. If that is the case, that would be a nice speed boost later on down the road should you decide to add another one to your system. Like past cards, 3-way SLI is supported with the GeForce GTX 480.
Ray tracing has been a feature of past NVIDIA cards but NVIDIA says that the GeForce GTX 480 will be one of the first cards that makes ray tracing interactive. It’ll be up to 3.5 times faster than the GeForce GTX 285 in performance and because of this, they are trying to get developers to use it in games for such things as gallery modes or a 3D model display that you can rotate around in. NVIDIA’s including a cool little ray tracing program that shows off some car models in various garages that lets you change some variables in the environment as well as camera effects.
Yes, Fermi is finally here and yes, it's pretty fast. It's the fastest single GPU card out there now and it's the card that NVIDIA purchasers have been waiting for. My concern though is that, from reports and NVIDIA's own documents, it might not be an overwhelming speed increase over the 5870 cards. Considering how long the 5870 cards have been out, the numbers might not be what the public wants, but I've been told that these cards will really show off their goods when more DX11 games come out. We will have to wait for more DirectX 11 games to see whether NVIDIA's claim that these cards shine on next generation games holds true, but what it can do now is pretty damn fast. NVIDIA cards do more than just graphics, as they provide PhysX acceleration, CUDA support, ray tracing, and 3D gaming. The GeForce GTX 480 offers the fastest experience so far, but it might not be what people are hoping for considering the delays. Even so, I am glad to see NVIDIA offering up something new in terms of architecture, and I await what they can do in the future with the growth that's possible. For those that want to pick up the card, it will be available the week of April 12, 2010, and NVIDIA is expecting tens of thousands of units to be available at that time.