[OpenMP][flang] Support GPU team reductions on allocatables
Extends the work started in #165714 by supporting team reductions.
Similar to what was done in #165714, this PR introduces proper
allocations, loads, and stores for by-ref reductions in teams-related
callbacks:
* `_omp_reduction_list_to_global_copy_func`,
* `_omp_reduction_list_to_global_reduce_func`,
* `_omp_reduction_global_to_list_copy_func`, and
* `_omp_reduction_global_to_list_reduce_func`.
Address reviewer feedback: fix getWaitCountMax and reduce code duplication
- Fix getWaitCountMax() to use the correct bitmasks for each architecture:
  - Pre-GFX12: Use getVmcntBitMask/getLgkmcntBitMask for LOAD_CNT/DS_CNT
  - GFX12+: Use getLoadcntBitMask/getDscntBitMask for LOAD_CNT/DS_CNT
- Refactor repetitive if-blocks for LOAD_CNT, DS_CNT, EXP_CNT into
  a single loop using the getCounterRef helper function
- Fix X_CNT to return proper getXcntBitMask(IV) instead of 0
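The shape of this change can be sketched as follows. This is a self-contained toy model, not the code from the patch: the types, the mask values, and the clamping done in the loop body are all assumptions made for illustration, and the functions only borrow their names from the list above.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

// Toy stand-ins for the real AMDGPU types (assumed, simplified).
enum Counter { LOAD_CNT, DS_CNT, EXP_CNT, X_CNT };
struct IsaVersion { unsigned Major; };
struct Waitcnt { unsigned LoadCnt; unsigned DsCnt; unsigned ExpCnt; unsigned XCnt; };

// Placeholder mask values, chosen only so the two paths differ.
unsigned getVmcntBitMask(const IsaVersion &) { return 0x3F; }
unsigned getLgkmcntBitMask(const IsaVersion &) { return 0xF; }
unsigned getExpcntBitMask(const IsaVersion &) { return 0x7; }
unsigned getLoadcntBitMask(const IsaVersion &) { return 0x3F; }
unsigned getDscntBitMask(const IsaVersion &) { return 0x3F; }
unsigned getXcntBitMask(const IsaVersion &) { return 0x7F; }

// Maps a counter kind to the matching field, so callers can loop over kinds.
unsigned &getCounterRef(Waitcnt &W, Counter C) {
  switch (C) {
  case LOAD_CNT: return W.LoadCnt;
  case DS_CNT:   return W.DsCnt;
  case EXP_CNT:  return W.ExpCnt;
  default:       return W.XCnt; // X_CNT
  }
}

// Picks the bitmask by architecture; X_CNT returns its own mask, not 0.
unsigned getWaitCountMax(const IsaVersion &IV, Counter C) {
  const bool IsGFX12Plus = IV.Major >= 12;
  switch (C) {
  case LOAD_CNT: return IsGFX12Plus ? getLoadcntBitMask(IV) : getVmcntBitMask(IV);
  case DS_CNT:   return IsGFX12Plus ? getDscntBitMask(IV) : getLgkmcntBitMask(IV);
  case EXP_CNT:  return getExpcntBitMask(IV);
  default:       return getXcntBitMask(IV); // X_CNT
  }
}

// One loop replaces three near-identical if-blocks. Clamping to the maximum
// is an assumed loop body; the commit message does not say what the blocks do.
void clampToMax(Waitcnt &W, const IsaVersion &IV) {
  for (Counter C : {LOAD_CNT, DS_CNT, EXP_CNT}) {
    unsigned &Ref = getCounterRef(W, C);
    Ref = std::min(Ref, getWaitCountMax(IV, C));
  }
}
```

Returning a reference from getCounterRef is what lets the per-counter logic be written once instead of once per field.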
[HIP][AMDGPU] Remove 't' from all __builtin_*_load_lds builtins
Allows for type checking depending on the builtin signature.
stack-info: PR: https://github.com/llvm/llvm-project/pull/165389, branch: users/jmmartinez/fix/load_lds_typesignature/3
[NFC][HIP] Add __builtin_*_load_lds type check test cases
These tests show that type checking is performed for
__builtin_amdgcn_load_to_lds, but not for
__builtin_amdgcn_raw_ptr_buffer_load_lds,
__builtin_amdgcn_struct_ptr_buffer_load_lds and
__builtin_amdgcn_global_load_lds, since they are declared with the 't'
attribute.
stack-info: PR: https://github.com/llvm/llvm-project/pull/165388, branch: users/jmmartinez/fix/load_lds_typesignature/2
[HIP] Perform implicit pointer cast when compiling HIP, not when -fcuda-is-device (#165387)
When compiling HIP device code, we add implicit casts for the pointer arguments passed to built-in calls.
When compiling for the host, apply the same casts, since the device side of the source (device functions and kernels) should still pass type checks.
[BOLT][PAC] Warn about synchronous unwind tables
BOLT currently ignores functions with synchronous PAuth DWARF info.
When more than 10% of functions get ignored for inconsistencies, we
should emit a warning advising users to use only asynchronous unwind tables.
See also: #165215
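The threshold check described above can be sketched as follows. This is a minimal illustration, not BOLT's actual code: the function name and its parameters are hypothetical, and it assumes "more than 10%" means strictly greater than one tenth of all functions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns true when the share of ignored functions is
// strictly greater than 10% of all functions seen.
bool shouldWarnAboutSyncUnwindTables(std::size_t Ignored, std::size_t Total) {
  assert(Ignored <= Total && "cannot ignore more functions than exist");
  if (Total == 0)
    return false; // nothing was processed, so there is nothing to warn about
  // Integer form of Ignored / Total > 0.10, avoiding floating point.
  return Ignored * 10 > Total;
}
```

With this definition, 10 ignored functions out of 100 do not trigger the warning, while 11 out of 100 do.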
[BOLT][NFC] Rename Pointer Auth DWARF rewriter passes
Original names were "working titles". After initial patches are merged,
I'd like to rename these passes to names that reflect their intent
better and show their relationship to each other:
InsertNegateRAStatePass renamed to PointerAuthCFIFixup,
MarkRAStates renamed to PointerAuthCFIAnalyzer.
Reland: [GPUToXeVMPipeline][Pipeline] Modify pipeline to add `convert-vector-to-llvm`. (#169573)
`convert-vector-to-llvm` pass applies a set of vector transformation
patterns that are not included in the standard `convert-to-llvm` pass
interface. These additional transformations are required to properly
lower MLIR vector operations. Since not all vector ops have direct
`llvm` dialect lowering, many of them must first be progressively
rewritten into simpler or more canonical vector ops, which are then
lowered to `llvm`. Therefore, running `convert-vector-to-llvm` is
necessary to ensure a complete and correct lowering of vector operations
to the `llvm` dialect.
Original PR: https://github.com/llvm/llvm-project/pull/166204 +
post-commit failure fixes.
[Delinearization] Remove tryDelinearizeFixedSizeImpl (#169046)
`tryDelinearizeFixedSizeImpl` is a heuristic function relying on GEP's
type information. Using this information to drive an optimization
heuristic is not allowed, so this function should be removed. As #161822
and #164798 have eliminated all calls to this, this patch removes the
function itself.
Reland "[clang][Driver] Support for the SPIR-V backend when compiling HIP" (#169637)
This relands "[clang][Driver] Support for the SPIR-V backend when compiling HIP" #167543. The only new change is a small fix for the multicall driver.
For HIP, the SPIR-V backend can be optionally activated with the -use-spirv-backend flag. This option uses the SPIR-V backend instead of the SPIR-V translator. These changes also ensure that -use-spirv-backend does not require external dependencies, such as spirv-as and spirv-link.