[LLVM][IR] Add support for address space names in DataLayout (#170559)
Add support for specifying the names of address spaces when specifying
pointer properties for an address space. Update LLVM's AsmPrinter and
LLParser to print and read these symbolic address space names.
AMDGPU/GlobalISel: Regbanklegalize rules for G_UNMERGE_VALUES
Move G_UNMERGE_VALUES handling to AMDGPURegBankLegalizeRules.cpp.
Fix sgpr S16 unmerge by lowering using shift and using S32.
Previously sgpr S16 unmerge was selected using _lo16 and _hi16 subreg
indexes which are exclusive to vgpr register classes.
For the remaining cases we do a trivial mapping: assign the same reg
bank to all operands, vgpr or sgpr.
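
For illustration only, the shift-based lowering can be pictured in plain
C++. This is a sketch of the arithmetic, not the MachineIRBuilder code in
the patch; the helper name `unmergeS32ToS16` is hypothetical.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper: split a 32-bit scalar into its low and high 16-bit
// halves using only 32-bit operations, then truncate each result.
std::pair<uint16_t, uint16_t> unmergeS32ToS16(uint32_t Src) {
  uint32_t Lo32 = Src;       // low half already sits in bits [15:0]
  uint32_t Hi32 = Src >> 16; // logical shift moves bits [31:16] down
  return {static_cast<uint16_t>(Lo32), static_cast<uint16_t>(Hi32)};
}
```

Both intermediate values stay 32 bits wide, so no 16-bit subregister
index is needed.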
[lldb] Log when we use fallback register information
These fallback layouts are essentially guesses, used when there is
no other way to query register information from the debug server.
Therefore there is a risk that LLDB and the debug server disagree,
which can produce strange effects.
I have added a log message here so we have a clue when triaging
these problems.
Note that it's not wrong to assume a layout in some situations.
It's how some debug servers were built. However, if you end up
using the fallback when the server expected you to use XML,
you're likely going to have a bad time.
[MLIR][XeGPU] Support subview memref: handling the base address during xegpu to xevm type conversion (#170541)
During the XeGPU-to-XeVM type conversion, a memref is lowered to its
base address. This PR extends the conversion to correctly handle memrefs
that include an offset, such as those generated by memref.subview.
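
As a rough sketch of the idea (not the conversion code itself), the
adjusted base address can be thought of as the allocation pointer plus
the offset scaled by the element size. The struct and field names below
are hypothetical, and the offset is assumed to be counted in elements.

```cpp
#include <cstdint>

// Hypothetical stand-in for a memref descriptor; names are illustrative.
struct MemRefDesc {
  uint64_t AlignedPtr; // address of the underlying allocation
  int64_t Offset;      // offset in elements, e.g. from memref.subview
  unsigned ElemBytes;  // element size in bytes
};

// Base address of the view: allocation address plus the byte offset.
uint64_t effectiveBaseAddress(const MemRefDesc &D) {
  return D.AlignedPtr + static_cast<uint64_t>(D.Offset) * D.ElemBytes;
}
```

With a zero offset this reduces to the plain base address, which matches
the case the conversion already handled.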
Revert "[flang][OpenMP] Fix firstprivate not working with lastprivate in DO SIMD" (#171646)
Reverts llvm/llvm-project#170163
Regression in the Fujitsu test suite.
[AMDGPU][SDAG] Add missing cases for SI_INDIRECT_SRC/DST
Before, instruction selection would fail to select extract/insert
elements for i32/float vectors of sizes 3, 5, 6 and 7 when -O0 was used.
This patch adds the missing SI_INDIRECT_SRC/DST cases for those sizes.
[mlir][acc] Add isValidValueUse to OpenACCSupport (#171538)
Add a new API `isValidValueUse` to OpenACCSupport. This is used in
ACCImplicitData to check for values that are already legal in the OpenACC
region and do not require an implicit clause to be generated. An example
would be a CUDA Fortran device variable that is already on the GPU.
[flang][AliasAnalysis] Cray pointers/pointees might alias with anything (#170900)
The LOC intrinsic allows a cray pointer to alias with ordinary variables
with no other attribute. See the new test for an example.
This is not enabled by default. The functionality can be used with
`-mmlir -funsafe-cray-pointers`.
First part of the un-revert of #169544. That will handle TBAA.
[clang][tooling] Fix `getFileRange` false negative (#171555)
When an expression is in a single macro argument but also contains a
macro, `getFileRange` would incorrectly reject that expression,
concluding that it came from two different macro arguments because they
came from two different expansions.
We adjust the logic to look at the full path of macro argument expansion
locations instead, tracking whether our traversal up the macro
expansions passes through macro arguments all the way to the top.
This is similar to the technique used by `makeFileCharRange`.
We also add some test cases to ensure we don't introduce any false
positives.
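
A minimal input of the kind described above might look like the
following. The macro names `ID` and `ONE` are made up for illustration
and are not taken from the new test cases.

```cpp
#define ONE 1
#define ID(x) x

// `ONE + 2` is spelled entirely inside the single argument of ID, but the
// token `1` comes from a nested expansion of ONE, so the expression spans
// two different expansions.
int Value = ID(ONE + 2);
```

The whole expression is written in the file as one macro argument, so a
file range for it should be obtainable.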
[CIR] Implement builtin extractf (#170427)
Implement builtin extractf; tests are from
clang/test/CodeGen/X86/avx512f-builtins.c.
Added a new type constraint, "element or vector of element", since
LLVMIR also has said constraint. The new getBoolMaskValue exists because
the existing SelectOp accepts only a boolean condition; it'd make more
sense for it to accept a vector of booleans instead of a vector of
i32.
[OpenMP][Offload] Continue to update libomptarget debug messages (#170425)
* Add support to use lambdas to output debug messages (like LDBG_OS)
* Update messages for interface.cpp and omptarget.cpp
[ROCDL] Added global/flag data prefetch ops (#171449)
This PR brings data prefetch ops to ROCDL for the gfx1250 architecture.
Extended all necessary rocdl tests.