Hi folks,
The state of the ANSYS-compatible GPU market is currently a little weird. NVIDIA's product portfolio is especially confused right now, which I'll explain below. But in short, I'm trying to figure out which GPU we should get for our mechanical solutions. The rest of this message goes into great detail, but fundamentally, my question is: which GPU should we buy?
Parameters:
-----------------
Our models are typically 10-75 MDOF and generally use the PCG solver, the sparse direct solver, or the various Lanczos solvers. We also frequently use MPC184 joint elements, which preclude the use of a GPU. We've got a single-seat HPC license (not an HPC Pack), so we're either solving on four cores + GPU or five cores, depending on the model in question. We're running Windows 10 with 128 GB of RAM.
Current status:
-----------------
It's pretty unclear which NVIDIA card we should get. The Quadro GV100 is lovely, but at over $10K it's over our budget. The Tesla V100 is essentially the same thing without monitor outputs, and it's available in 16 GB and 32 GB configurations. We found a reputable vendor who had the 16 GB version for $5800, so we got one. And hey, it works with ANSYS 19.2! We're off to the races, right?
Not quite. The V100 is passively cooled (no fan) and implicitly relies on a rack-mount server case to flow enough air to keep it cool. I'm able to solve with it, but it never reaches thermal equilibrium; it just keeps getting hotter, even when it's idling. So I'm forced to shut down my machine when the V100 gets to about 80 degrees C. (The card will shut itself down around 88-90 degrees C, which crashes my workstation immediately.) IMHO, NVIDIA does a terrible job of explaining that the V100 expects a rack-mount case that flows enough air to cool it. Their documentation only specifies that "air can flow from left-to-right or from right-to-left," which is pretty unhelpful.
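For anyone fighting the same problem, here's a minimal watchdog sketch in Python. It assumes nvidia-smi is on the PATH; the 80 C threshold is just the margin I've been using, not an NVIDIA spec:

    # Poll nvidia-smi and warn before the V100 reaches its ~88-90 C
    # self-shutdown threshold. Ctrl-C to stop.
    import subprocess
    import time

    WARN_C = 80  # my own margin, well below the card's shutdown point

    def gpu_temp_c() -> int:
        """Read the first GPU's core temperature via nvidia-smi."""
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=temperature.gpu",
             "--format=csv,noheader,nounits"],
            text=True,
        )
        return int(out.splitlines()[0])

    while True:
        t = gpu_temp_c()
        print(f"GPU temperature: {t} C")
        if t >= WARN_C:
            print("WARNING: approaching thermal shutdown -- stop the solve!")
        time.sleep(30)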
Next steps:
-----------------
Option 1: add a fan to the V100 we have. I'm not the first to try to put a V100 into a non-rack-mount case; others have posted .STL files that you can use to 3D print a manifold that puts a fan in the right place to cool a V100 card. This would work, but if I pursued it, I suspect my manager would be highly dubious and our IT people would freak out.
Option 2: return the V100 and buy a different card. As I said before, it's not clear which card to buy. There is no actively cooled version of the V100. Here's what's out there:
- NVIDIA Quadro GV100: ideal from a performance perspective, but nearly $11K; not an option.
- NVIDIA Titan V: essentially a Quadro GV100 with only 12 GB of RAM and slightly less memory bandwidth.
  - It's only $3K, so obviously that helps its appeal.
  - The Titan V is neither fish nor fowl. It's not officially part of the Quadro line, but it's a whole lot like the Quadro GV100. It's not part of the GeForce consumer line either.
  - It's hard to tell how limiting the 12 GB RAM spec will be. The sparse direct solver wants ~10 GB of RAM per MDOF to solve in-core, so 12 GB covers only ~1.2 MDOF, which is about 400,000 nodes using SOLID187 elements (3 DOF per node; the arithmetic is worked through in the sketch after this list). A modal analysis that finds and extracts the first 12 mode shapes of a model that size takes less than a minute using four CPU cores, so there's not much to be gained there.
  - The Titan V is not officially supported by ANSYS. Supposedly, it was on track to be supported in 19.1, but that didn't happen for reasons that are a little hazy (at least to me).
  - The fact that a Titan V is nearly identical to a GV100 (architecturally speaking) makes me a little suspicious about why it's not supported, or at least on track for support in 19.3.
- NVIDIA Quadro RTX 6000: The RTX series is the newest high-end Quadro and sells for about $6300.
  - The RTX 6000 uses NVIDIA's Turing architecture, an evolution of the GV100's Volta architecture. It's not immediately clear whether this is a good thing; the Volta cards may be faster for ANSYS purposes.
  - This card has fewer CUDA cores (4608) than the GV100/V100/Titan V cards (5120).
  - Higher clock speeds (and a few other things) mean that single-precision (SP) performance (16.3 TFLOPS) is slightly better than the Volta-based cards (~14 TFLOPS).
  - Double-precision (DP) performance is much worse, probably by design. The math-hamsters on the RTX 6000 produce about 0.509 TFLOPS in DP mode (16.3 SP TFLOPS / 32), while the Volta-based cards are all at about 7 TFLOPS (14 SP TFLOPS / 2); see the sketch after this list.
  - The RTX 6000 has 24 GB of RAM, 50% more than our V100, so it could solve some models that the V100 can't (for lack of RAM). However, it would be effectively useless for the sparse and Block Lanczos solvers, which lean heavily on the DP throughput that NVIDIA has hobbled on the RTX line.
  - The RTX 6000 isn't currently supported by ANSYS for GPU solving, but it's highly likely that it will be in 19.3 or shortly thereafter.
  - The Quadro RTX cards have hardware-based ray tracing features that supposedly allow real-time rendering of CAD models, which would be a minor benefit for us.
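To make the arithmetic in the list above explicit, here's a back-of-the-envelope sketch. The ~10 GB/MDOF in-core figure is the usual rule of thumb rather than something I've measured, and the TFLOPS numbers are NVIDIA's published peaks:

    # In-core sizing: how big a model fits in each card's RAM for the
    # sparse direct solver, assuming ~10 GB per MDOF and SOLID187's
    # 3 structural DOF per node.
    GB_PER_MDOF_INCORE = 10.0
    DOF_PER_NODE = 3

    def max_incore_nodes(gpu_ram_gb):
        mdof = gpu_ram_gb / GB_PER_MDOF_INCORE
        return mdof * 1e6 / DOF_PER_NODE

    for name, ram_gb in [("Titan V", 12), ("Tesla V100 (ours)", 16),
                         ("Quadro RTX 6000", 24)]:
        print(f"{name}: ~{max_incore_nodes(ram_gb):,.0f} nodes in-core")
    # Titan V: ~400,000 / V100: ~533,333 / RTX 6000: ~800,000

    # DP throughput: Volta runs DP at 1/2 of SP; Turing cuts it to 1/32.
    print(f"Volta DP:    {14.0 / 2:.2f} TFLOPS")   # ~7.0
    print(f"RTX 6000 DP: {16.3 / 32:.3f} TFLOPS")  # ~0.509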
Discussion:
-----------------
It's unclear which of these options to pursue. Does anyone here have insight or direct experience with one or more of these cards?
I believe (but don't know for sure) that, though the Titan V isn't officially supported, solutions would work fine via the documented procedure (new as of 19.1) of setting the ANSGPU_OVERRIDE environment variable. That said, I can't really recommend that my company buy a GPU that ANSYS has decided not to support.
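For reference, here's an untested sketch of what I understand that procedure to be: set the variable before launching MAPDL, then request GPU acceleration as usual. The executable path and the value of 1 are my assumptions, so check the 19.1 documentation before trusting either:

    # Untested: force GPU acceleration on an unsupported card, per my
    # reading of the 19.1 docs. Path and variable value are assumptions.
    import os
    import subprocess

    env = dict(os.environ, ANSGPU_OVERRIDE="1")  # value assumed; see docs

    MAPDL = r"C:\Program Files\ANSYS Inc\v192\ansys\bin\winx64\ANSYS192.exe"

    subprocess.run(
        [MAPDL, "-b", "-dis", "-np", "4",   # four cores, per our license
         "-acc", "nvidia", "-na", "1",      # one GPU accelerator
         "-i", "model.dat", "-o", "solve.out"],
        env=env,
        check=True,
    )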
I suspect that the RTX 6000 offers the most solving performance per dollar among the officially supported GPUs, but the Titan V may well be a better value, since it's less than half the cost and works with the sparse solver(s).
Part of the problem is that there are very few benchmarks out there that cleanly compare the effect that different GPUs have on ANSYS performance. Even a comparison involving older cards and versions of ANSYS would be really helpful (e.g., the Quadro P6000 vs the Quadro GP100 on ANSYS 18).
Of course, the performance of different cards is highly problem-dependent, so there is no clear hierarchy. But I wish there were an apples-to-apples comparison. It would be delightful if ANSYS would provide benchmark data for all officially supported cards with each release, but that seems really unlikely.
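In the absence of published numbers, the comparison I'd want is simple enough to script: run the same input deck on the same machine, once on five cores and once on four cores + GPU, and compare wall times. Here's a sketch; the file names are hypothetical, and the regex assumes the "Elapsed Time (sec)" line that MAPDL prints near the end of its output file:

    # Compare wall time between a CPU-only solve and a GPU-assisted
    # solve of the same input deck. File names are hypothetical.
    import re

    def elapsed_seconds(output_file):
        """Pull the elapsed wall time from a MAPDL output file."""
        pattern = re.compile(r"Elapsed Time \(sec\)\s*=\s*([\d.]+)")
        with open(output_file) as f:
            for line in f:
                m = pattern.search(line)
                if m:
                    return float(m.group(1))
        raise ValueError(f"no elapsed-time line in {output_file}")

    cpu = elapsed_seconds("solve_5core.out")      # five cores, no GPU
    gpu = elapsed_seconds("solve_4core_gpu.out")  # four cores + GPU
    print(f"GPU speedup: {cpu / gpu:.2f}x")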
I'd appreciate any feedback or insight anyone might have.
Thanks,
Jason
Jason Krantz
Mechanical Engineer
jason.krantz@flir.com
FLIR Systems, Inc.
27700 SW Parkway Avenue
Wilsonville, OR 97070
Bonus conspiracy theory:
-----------------
While trying to figure out whether NVIDIA supported putting a Tesla V100 in a non-rack-mount computer, I spoke to NVIDIA about the Titan V. The NVIDIA rep said that the Titan V was soon to be "de-emphasized" because NVIDIA doesn't want it to cannibalize their Quadro and Tesla sales. I can't help wondering whether ANSYS (which seems to have a fairly close relationship with NVIDIA) dropped plans for Titan V support at NVIDIA's request. The answer may be unknowable. Or maybe there's an unrelated technical reason why ANSYS has decided not to support the Titan V. I wish I could benchmark one, because it might be the best performance bargain currently available for running ANSYS solutions with a GPU.