Benchmarking the new Kepler (GTX 680)

Pavan Yalamanchili | Benchmarks, CUDA | 13 Comments

NVIDIA has launched its next-generation GPU, based on the Kepler architecture, and followed it up with a rather quick update to the CUDA toolkit. Since we have access to three generations of GTX cards (480, 580 and 680), we thought we would showcase how performance has changed over the generations.

Matrix multiplication:

One teraflop!

It can be seen that the GTX 680 comfortably breaches the 1 teraflop mark for single precision, while the GTX 580 barely scratches it. However, the 680's performance seems to peak around 2048 x 2048 and then declines to match the GTX 580 at larger sizes. The high-end Tesla C2070 finishes last for single precision, behind the third-placed GTX 480.
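As a rough guide to how throughput figures like these are usually derived, the sketch below converts a matrix-multiply time into GFLOPS. It assumes the common convention of 2·N³ floating point operations for an N x N product. The function names `gemm_gflops` and `seconds_for_target` are illustrative and are not taken from the benchmark code.

```python
def gemm_gflops(n, seconds):
    """Throughput in GFLOPS for an n x n by n x n matrix multiply.

    Assumes 2 * n**3 floating point operations (one multiply and one
    add per inner-product term). This is a convention, not a measurement.
    """
    flops = 2 * n ** 3
    return flops / seconds / 1e9


def seconds_for_target(n, target_gflops):
    """Time budget in seconds needed to sustain target_gflops at size n."""
    return 2 * n ** 3 / (target_gflops * 1e9)


# Hypothetical question: how quickly must a 2048 x 2048 multiply finish
# to sustain 1 teraflop (1000 GFLOPS)?
budget = seconds_for_target(2048, 1000.0)
print(f"{budget * 1e3:.2f} ms per multiply")
```

Under this convention, a card has to finish a 2048 x 2048 multiply in roughly 17 ms to sustain one teraflop.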

For double precision, as expected, the C2070 is well ahead of the pack. The most interesting result here is that the GTX 680 finishes dead last, behind its predecessors. At about 1/10th of its single precision performance, the 680 is about half as fast as the 580, which settles at ~1/5th of its single precision performance.
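The "half as fast" comparison follows from the two ratios only if both cards have similar single precision throughput, which is roughly what the single precision results suggest at larger sizes. A minimal sketch of that arithmetic, using a made-up single precision figure purely for illustration:

```python
def dp_gflops(sp_gflops, dp_to_sp_ratio):
    """Double precision throughput implied by a DP:SP ratio."""
    return sp_gflops * dp_to_sp_ratio


# Hypothetical single precision throughput, assumed equal for both cards.
# This is not a measured value from the benchmarks.
sp = 1000.0

gtx680_dp = dp_gflops(sp, 1 / 10)  # ~1/10th of SP, as described above
gtx580_dp = dp_gflops(sp, 1 / 5)   # ~1/5th of SP, as described above

relative = gtx680_dp / gtx580_dp
print(f"680 DP throughput relative to 580: {relative:.2f}")
```

If the single precision numbers differ between the cards, the relative figure shifts accordingly.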

Fast Fourier Transform:

Fast fast aww no more memory!

The performance gain moving from the 480 to the 580 is significant (~20%), while the 680 does not seem to have huge wins over its immediate predecessor. The Fast Fourier Transform is an interesting benchmark in that these cards run out of memory before peak performance is reached. At 2 GB, the 680 can hold two 8192×8192 single precision complex matrices, but the scratch space required for this algorithm is more than the free space that remains. All the transforms were 2D, real-to-complex.
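The memory arithmetic behind that statement can be checked with a short sketch. It treats "2GB" as 2 GiB and ignores whatever the driver and display reserve, both of which are assumptions. The helper name `complex_matrix_bytes` is illustrative, and the actual scratch space used by the FFT is not stated here.

```python
GIB = 2 ** 30


def complex_matrix_bytes(rows, cols, bytes_per_component=4):
    """Bytes for a dense complex matrix (real and imaginary parts).

    bytes_per_component=4 corresponds to single precision.
    """
    return rows * cols * 2 * bytes_per_component


one_matrix = complex_matrix_bytes(8192, 8192)
two_matrices = 2 * one_matrix

card_memory = 2 * GIB  # assumption: "2GB" means 2 GiB, all of it usable
left_over = card_memory - two_matrices

print(f"one matrix:   {one_matrix / GIB:.2f} GiB")
print(f"two matrices: {two_matrices / GIB:.2f} GiB")
print(f"left over:    {left_over / GIB:.2f} GiB")
```

Each matrix takes half a GiB, so the pair occupies 1 GiB and leaves at most 1 GiB for everything else. Per the benchmark, the transform's scratch space does not fit in what is actually free.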

SORT:

Here the GTX 680 starts off strong before losing out to the GTX 580, and eventually to the 480.  We are using the same radix-sort algorithm for all the benchmarks.  It is really astonishing that the 680 is more than 20% slower at peak.

Resources:

Benchmark Code

Benchmark Results

Related links:

Tom’s hardware: LuxMark Benchmarks

Anandtech: Retaking the performance crown

 

 

Comments 13

  1. Cool!  Thank you for checking this out.  It’s nice to be reminded occasionally that NVIDIA is first and foremost a company that makes graphics cards for gamers.

    Out of curiosity, have you done any benchmarks using the OpenCL versions of your products on the HD 7970?

  2. Pingback: Benchmarks sobre Kepler | Paralelizados.com

  3. Pingback: Walking Randomly » A Month of Math Software – April 2012

  4. Hi Pavan.

    Great comparison. Finally the 1 TFlops barrier broken for the 680. Somewhat disappointing performance as such though. With low-cost 580 cards with 3 GB memory they seem very attractive.

    I assume you do not count the PCI transfers. Could you include that also? In my apps I sometimes need to consider if it is worth the effort to move the data, compute on the GPU and return the result to the host. Would be nice to see them compared. The disadvantage is that the PCI transfers (latency, rate) vary quite a lot from computer to computer. Interesting it is though. Torben

    1. Torben,

      The motherboards the cards were on were wildly different (the 680 and the 480 are a bit similar). Benchmarks across multiple machines would not provide consistent results I think.

  5. We are running GTX680 benchmarks and disappointed with performance. However, the great thing about the new Kepler chip is low power consumption though. So having ability to put two chips in a single card and have the same power output as 580 is impressive and very useful. But Nvidia was positioning 680 as being 50% faster than 580, that part is disappointingly not true.

  6. What happens when the matrix size doesn’t fit in the on board GDDR5. How much performance hit do you get from the maximum

  7. I sometime need to consider if it is worth the effort to move the data,
    compute on the GPU and return the result to the host. Would be nice to
    see them compared. The disadvantage is that the PCI transfers (latency,
    rate) varies quite a lot from computer to computer. Interesting it is
    though.Torben http://www.datanetsolutions.org/

  8. Pingback: Intel’s Haswell is an unprecedented threat to Nvidia, AMD | CD DISK

  9. Pingback: The danger Intel's Haswell poses to Nvidia/AMD

  10. Pingback: Gregory Smith
