Thesis: Hardware-Assisted Software Testing and Debugging for Heterogeneous Computing

Even now, Thrust as a dependency is one of the main reasons why we have a #CUDA backend, a #HIP / #ROCm backend and a pure #CPU backend in #GPUSPH, but not a #SYCL or #OneAPI backend (which would allow us to extend hardware support to #Intel GPUs). <https://doi.org/10.1002/cpe.8313>
This is also one of the reasons why we implemented our own #BLAS routines when we introduced the semi-implicit integrator. A side-effect of this choice is that it allowed us to develop the improved #BiCGSTAB that I've had the opportunity to mention before <https://doi.org/10.1016/j.jcp.2022.111413>. Sometimes I do wonder if it would be appropriate to “excorporate” it into its own library for general use, since it's something that would benefit others. OTOH, this one was developed specifically for GPUSPH and it's tightly integrated with the rest of it (including its support for multi-GPU), and refactoring to turn it into a library like cuBLAS is
a. too much effort
b. probably not worth it.
Again, following @eniko's original thread, it's really not that hard to roll your own, and probably less time consuming than trying to wrangle your way through an API that may or may not fit your needs.
6/
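To give a rough sense of what "rolling your own" can look like, here is a minimal CPU-only sketch of a few BLAS-style helpers and a solver built on them. It is an assumption-laden illustration, not GPUSPH code: the solver is the textbook unpreconditioned BiCGSTAB, not the improved variant from the paper linked above, and the names (`Vec`, `Mat`, `dot`, `axpy`, `matvec`, `bicgstab`) and the dense row-major matrix are hypothetical choices made for brevity.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration: dense, row-major, CPU only.
using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Dot product of two equally sized vectors (BLAS level 1 "dot").
double dot(const Vec& x, const Vec& y) {
    assert(x.size() == y.size());
    double s = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) s += x[i] * y[i];
    return s;
}

// y <- a*x + y (BLAS level 1 "axpy").
void axpy(double a, const Vec& x, Vec& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) y[i] += a * x[i];
}

// Returns A*x for a dense matrix stored as a vector of rows.
Vec matvec(const Mat& A, const Vec& x) {
    Vec y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i) {
        assert(A[i].size() == x.size());
        y[i] = dot(A[i], x);
    }
    return y;
}

// Textbook unpreconditioned BiCGSTAB with initial guess x = 0.
// Breakdown cases (zero denominators) are not handled in this sketch.
Vec bicgstab(const Mat& A, const Vec& b, double tol = 1e-12, int max_iter = 100) {
    const std::size_t n = b.size();
    Vec x(n, 0.0), r = b, r_hat = b, p(n, 0.0), v(n, 0.0);
    double rho = 1.0, alpha = 1.0, omega = 1.0;
    for (int k = 0; k < max_iter; ++k) {
        if (std::sqrt(dot(r, r)) < tol) break;  // residual small enough
        const double rho_new = dot(r_hat, r);
        const double beta = (rho_new / rho) * (alpha / omega);
        // p <- r + beta * (p - omega * v)
        axpy(-omega, v, p);
        for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + beta * p[i];
        v = matvec(A, p);
        alpha = rho_new / dot(r_hat, v);
        Vec s = r;
        axpy(-alpha, v, s);                     // s <- r - alpha * v
        axpy(alpha, p, x);                      // x <- x + alpha * p
        if (std::sqrt(dot(s, s)) < tol) break;  // converged after the half step
        const Vec t = matvec(A, s);
        omega = dot(t, s) / dot(t, t);
        axpy(omega, s, x);                      // x <- x + omega * s
        r = s;
        axpy(-omega, t, r);                     // r <- s - omega * t
        rho = rho_new;
    }
    return x;
}
```

For example, calling `bicgstab` with the 2×2 matrix `{{4, 1}, {1, 3}}` and right-hand side `{1, 2}` returns approximately `(1/11, 7/11)`. A GPU version would replace the loops with kernels and handle multi-device reductions, which is where the tight integration mentioned above comes in.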
I'm getting the material ready for my upcoming #GPGPU course that starts in March. Even though I most probably won't get to it, I also checked my trivial #SYCL programs. Apparently the 2025.0 version of the #Intel #OneAPI #DPCPP runtime doesn't like any #OpenCL platform except Intel's own (I have two other platforms that support #SPIRV, so why aren't they showing up? From the documentation I can find online this should be sufficient, but apparently it's not …)
Exploring data flow design and vectorization with oneAPI for streaming applications on CPU+GPU
Just how deep is #Nvidia's #CUDA moat really?
Not as impenetrable as you might think, but still more than Intel or AMD would like
It's not enough just to build a competitive part: you also have to have #software that can harness all those #FLOPS, something Nvidia has spent the better part of two decades building with its CUDA runtime, while competing frameworks for low-level #GPU #programming, like AMD's #ROCm or Intel's #OneAPI, are far less mature.
https://www.theregister.com/2024/12/17/nvidia_cuda_moat/ #developers
Analyzing the Performance Portability of SYCL across CPUs, GPUs, and Hybrid Systems with Protein Database Search
#SYCL #oneAPI #Bioinformatics #Databases #HPC #PerformancePortability #Package
Howdy all - registrations are still open for the first oneAPI DevSummit hosted by the UXL Foundation! Learn about GPGPU programming, oneAPI and how companies are coalescing around #oneapi / #sycl
https://linuxfoundation.regfox.com/oneapiuxldevsummit2024
Registration will close at 5pm today. The DevSummit will start at 8pm PT or 8:30am IST. See you there!
Intel(R) SHMEM: GPU-initiated OpenSHMEM using SYCL
Introduction to #oneAPI, #SYCL2020 & #OpenMP offloading
September 23-25, 2024
In this 3-day online course, HLRS - High-Performance Computing Center Stuttgart provides an introduction to Intel Corporation's oneAPI implementation
Read more & Register https://www.hlrs.de/training/2024/intel-oneapi
Just one more day to submit your session for the UXL oneAPI DevSummit being held October 9th & 10th!
Learn more: https://sessionize.com/uxldevsummit
#SYCL #oneAPI #UXL
Evaluating Operators in Deep Neural Networks for Improving Performance Portability of SYCL
@pytorch 2.4 upstream now includes a prototype feature supporting Intel GPUs through source build using #SYCL and #oneDNN as well as a backend integrated to inductor on top of Triton - enabling a path for millions and millions of GPUs through #oneAPI for #AI.
Lots of important milestones to make this happen - including support for #UXL Foundation open AI technologies. Just a prototype, but a big step forward... thanks to all in the PyTorch community. Feedback welcome!
Can we run TornadoVM applications on CPUs and take advantage of all CPU cores? The answer is YES. All you need is an OpenCL implementation that can run on your CPU. In this video, I will show you how you can configure TornadoVM to run on such systems using the Intel oneAPI base toolkit for Intel CPUs, and even FPGAs.
A coalition led by Qualcomm, Google, and Intel, under the UXL Foundation, aims to break the stronghold of $2.2 trillion Nvidia on the AI market by developing an open-source software suite that supports diverse AI accelerator chips, leveraging Intel's OneAPI. #nvidia #google #intel #Qualcomm #ai #opensource #chips #semiconductor #strategy #partnership #oneapi #api #engineer #engineering #software #market
This is what #CUDA is: a prison from which there seems to be no possibility of escaping. It's also the fault of developers who rely on it, forcing everyone to buy these cards. I hope this nightmare ends soon; I'm done with having to buy Nvidia products. #NVIDIA #SYCL #intel #OneApi #opensource
Intel Proposes Adding Full SYCL Programming Model Support To Upstream LLVM: https://www.phoronix.com/news/Intel-Full-SYCL-Upstream-LLVM
James Reinders et al. have released the second edition of their SYCL book "Data Parallel C++", available for free in PDF and EPUB: https://link.springer.com/book/10.1007/978-1-4842-9691-2
"SYCL is a royalty-free open standard developed by the Khronos Group that allows developers to program heterogeneous architectures [such as CPUs, GPUs, and FPGAs] in standard C++."