
New build for development


Comments

  • Registered Users Posts: 3,965 ✭✭✭mp3guy


    I would definitely go Intel on the CPU. For vector instructions, the latencies and clocks are often better on Intel because, well, it's Intel's instruction set. If you can afford it, get a CPU that supports AVX-512 for the best bleeding-edge performance.


  • Moderators, Society & Culture Moderators Posts: 15,750 Mod ✭✭✭✭smacl


    mp3guy wrote: »
    I would definitely go Intel on the CPU. For vector instructions, the latencies and clocks are often better on Intel because, well, it's Intel's instruction set. If you can afford it, get a CPU that supports AVX-512 for the best bleeding-edge performance.

    Arguably better short-term performance, but the plan is to drop in a 16-core/32-thread 3950X next year without any other mods needed to the system. VS2017 basically compiles one translation unit per available thread as separate processes, so AVX-512 can't help much here. The other issue with newer SIMD extensions is that I'm developing commercial software that needs to run on most hardware, so I can't actually use them in delivered products unless I'm putting a customer-specific build together.
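
    For the "needs to run on most hardware" constraint, the usual workaround rather than a customer-specific build is runtime dispatch: check the instruction set once at startup and pick a code path. A minimal C++ sketch, assuming MSVC on x86 (the HasAvx2 helper name is just for illustration):

        #include <intrin.h>   // MSVC intrinsics: __cpuidex, _xgetbv

        // Returns true if both the CPU and the OS support AVX2, so a generic
        // build can select a vectorized code path at runtime.
        bool HasAvx2()
        {
            int regs[4];
            __cpuidex(regs, 7, 0);                            // leaf 7, subleaf 0
            const bool cpuAvx2 = (regs[1] & (1 << 5)) != 0;   // EBX bit 5 = AVX2

            __cpuidex(regs, 1, 0);
            const bool osxsave = (regs[2] & (1 << 27)) != 0;  // ECX bit 27 = OSXSAVE
            if (!cpuAvx2 || !osxsave)
                return false;

            // The OS must also save/restore the YMM registers (XCR0 bits 1 and 2).
            return (_xgetbv(0) & 0x6) == 0x6;
        }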


  • Registered Users Posts: 3,965 ✭✭✭mp3guy


    smacl wrote: »
    Arguably better short-term performance, but the plan is to drop in a 16-core/32-thread 3950X next year without any other mods needed to the system. VS2017 basically compiles one translation unit per available thread as separate processes, so AVX-512 can't help much here. The other issue with newer SIMD extensions is that I'm developing commercial software that needs to run on most hardware, so I can't actually use them in delivered products unless I'm putting a customer-specific build together.

    The Ryzen 3000 series will still only have AVX2, so the high-end Intel CPUs will have double the vector FLOPS (16 single-precision lanes per 512-bit register versus 8 with AVX2). Half the cores on an Intel CPU with AVX-512 will crush an AMD CPU with just AVX2.

    Not sure what you're trying to say about VS2017: https://devblogs.microsoft.com/cppblog/microsoft-visual-studio-2017-supports-intel-avx-512/

    Your end point really hits the nail on the head as to why you'd stick with AMD, but then that conflicts with your desire for vendor-specific GPGPU via CUDA.


  • Moderators, Society & Culture Moderators Posts: 15,750 Mod ✭✭✭✭smacl


    mp3guy wrote: »
    The Ryzen 3000 series will still only have AVX2, so the high-end Intel CPUs will have double the vector FLOPS (16 single-precision lanes per 512-bit register versus 8 with AVX2). Half the cores on an Intel CPU with AVX-512 will crush an AMD CPU with just AVX2.
    Not sure what you're trying to say about VS2017: https://devblogs.microsoft.com/cppblog/microsoft-visual-studio-2017-supports-intel-avx-512/

    The C++ compiler itself uses one single-threaded process for each file it is compiling, so 16 threads means 16 source files compiling at a time. SIMD (single instruction, multiple data) doesn't help this as we're talking multiple instructions, multiple data across multiple processes. This is also true of any code I write myself, where the SIMD benefits come largely from compiler optimizations rather than explicit code, whereas the multi-threaded optimizations come from the code. Typically, a piece of code that would benefit from explicit SIMD/AVX code would be a good candidate for porting to the GPU. For the same number of threads, Intel is faster, so if I wasn't going to upgrade an i9-9900 would be a better bet, but the AMD gives me the option to jump to 32 threads for ~€600 next year, where there's no sign of this being an option for Intel.
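
    As a concrete illustration of leaning on the compiler rather than explicit SIMD, a loop like this is a typical auto-vectorization candidate (a minimal sketch; built with MSVC /O2 /arch:AVX2, and /Qvec-report:2 will report whether it was vectorized):

        // The compiler may turn this scalar loop into AVX instructions on its
        // own; no intrinsics or assembly in the source.
        void Scale(float* dst, const float* src, float k, int n)
        {
            for (int i = 0; i < n; ++i)
                dst[i] = src[i] * k;
        }
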
    mp3guy wrote: »
    Your end point really hits the nail on the head as to why you'd stick with AMD, but then that conflicts with your desire for vendor-specific GPGPU via CUDA.

    The CUDA stuff can be ported onto AMD GPUs with HIP/hipify, but it's extra work I don't want in my development cycle. It's good enough to go through this once the code is developed and test on another box. It's also the case in my industry that people commonly spec nVidia cards for CUDA apps but also use Ryzen and Threadripper.
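
    For anyone curious what the HIP side looks like, the port is mostly mechanical renaming of the runtime API (hipify-perl does it textually). A rough host-side sketch, with a made-up AllocAndUpload helper just for illustration:

        #include <hip/hip_runtime.h>   // the CUDA original would include <cuda_runtime.h>
        #include <cstddef>

        // After hipify, cudaMalloc/cudaMemcpy become their hip* equivalents 1:1.
        float* AllocAndUpload(const float* host, std::size_t count)
        {
            float* dev = nullptr;
            hipMalloc(reinterpret_cast<void**>(&dev), count * sizeof(float));   // was cudaMalloc
            hipMemcpy(dev, host, count * sizeof(float),
                      hipMemcpyHostToDevice);                                   // was cudaMemcpy
            return dev;
        }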


  • Registered Users Posts: 3,965 ✭✭✭mp3guy


    smacl wrote: »
    The C++ compiler itself uses one single-threaded process for each file it is compiling, so 16 threads means 16 source files compiling at a time. SIMD (single instruction, multiple data) doesn't help this as we're talking multiple instructions, multiple data across multiple processes. This is also true of any code I write myself, where the SIMD benefits come largely from compiler optimizations rather than explicit code, whereas the multi-threaded optimizations come from the code. Typically, a piece of code that would benefit from explicit SIMD/AVX code would be a good candidate for porting to the GPU. For the same number of threads, Intel is faster, so if I wasn't going to upgrade an i9-9900 would be a better bet, but the AMD gives me the option to jump to 32 threads for ~€600 next year, where there's no sign of this being an option for Intel.

    Oh, compilation of code versus execution. Sure, vector instructions aren't very useful for compilation, but at runtime they make all the difference. You can just write using intrinsics if you want to leverage the SIMD instructions without writing assembly; then you don't have to worry about the compiler being too dumb to do it automatically, but you still get the register allocation for free. Porting to the GPU would only make sense if the GPU is not required for something else, e.g. rendering in a game loop. But I get where you're coming from: you want fast compilation, and that's where many cores (and M.2 SSDs in RAID 0) make all the difference (as long as you don't also have a lot of linking to do).
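
    A minimal intrinsics sketch of that, for an 8-wide AVX scaling loop (function name is just for illustration; compile with /arch:AVX or -mavx):

        #include <immintrin.h>   // AVX intrinsics

        // Explicit SIMD without assembly: the compiler still does the
        // register allocation for the __m256 values.
        void ScaleAvx(float* dst, const float* src, float k, int n)
        {
            const __m256 vk = _mm256_set1_ps(k);
            int i = 0;
            for (; i + 8 <= n; i += 8)
            {
                __m256 v = _mm256_loadu_ps(src + i);              // load 8 floats (unaligned ok)
                _mm256_storeu_ps(dst + i, _mm256_mul_ps(v, vk));  // multiply and store
            }
            for (; i < n; ++i)                                    // scalar tail
                dst[i] = src[i] * k;
        }
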
    smacl wrote: »
    The CUDA stuff can be ported onto AMD GPUs with HIP/hipify, but it's extra work I don't want in my development cycle. It's good enough to go through this once the code is developed and test on another box. It's also the case in my industry that people commonly spec nVidia cards for CUDA apps but also use Ryzen and Threadripper.

    I typically never trust automatic converters like that; the fact of the matter is NVIDIA pumped resources into CUDA to sell GPUs while OpenCL and AMD rotted in a corner with just Compute Shaders and the like. Much better developer experience with CUDA, plus more neat features.


  • Moderators, Society & Culture Moderators Posts: 15,750 Mod ✭✭✭✭smacl


    mp3guy wrote: »
    Oh, compilation of code versus execution. Sure, vector instructions aren't very useful for compilation, but at runtime they make all the difference. You can just write using intrinsics if you want to leverage the SIMD instructions without writing assembly; then you don't have to worry about the compiler being too dumb to do it automatically, but you still get the register allocation for free. Porting to the GPU would only make sense if the GPU is not required for something else, e.g. rendering in a game loop. But I get where you're coming from: you want fast compilation, and that's where many cores (and M.2 SSDs in RAID 0) make all the difference (as long as you don't also have a lot of linking to do).

    A second M.2 might be worth considering there, and linking is also a plus for Intel, though it tends not to be an issue in my typical development cycle. The point about the GPU is well made. While it won't typically be rendering for me, mixing GPU work and multi-threading can lead to GPU resource issues, where SIMD won't.
    I typically never trust automatic converters like that; the fact of the matter is NVIDIA pumped resources into CUDA to sell GPUs while OpenCL and AMD rotted in a corner with just Compute Shaders and the like. Much better developer experience with CUDA, plus more neat features.

    Same, I've been down that road once with OpenCL, which seems to be dying a death, and similarly with AMP. The Direct3D compute shader is my current choice for Windows apps, but it's no good for Unix or Mac. CUDA also has a bunch of good library code out there, which makes it attractive and has me leaning towards nVidia for dev at least.
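
    On the library point, as an example of why the CUDA ecosystem is attractive, something like Thrust gives you GPU algorithms with no hand-written kernels. A small sketch, assuming a .cu file built with nvcc:

        #include <thrust/device_vector.h>
        #include <thrust/sort.h>
        #include <thrust/copy.h>
        #include <vector>

        // Sort a host vector on the GPU: copy up, sort with thrust::sort,
        // copy the result back down.
        void SortOnGpu(std::vector<float>& values)
        {
            thrust::device_vector<float> d(values.begin(), values.end());
            thrust::sort(d.begin(), d.end());
            thrust::copy(d.begin(), d.end(), values.begin());
        }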

