[AArch64][GlobalISel] Enable BF16 legalization for fadd and friends. (#196081)
This enables bf16 promotion for the following operations in GISel,
promoting them to f32 and truncating the result back:
G_FADD, G_FSUB, G_FMUL, G_FDIV, G_FMA, G_FSQRT, G_FMAXNUM, G_FMINNUM,
G_FMAXIMUM, G_FMINIMUM, G_FCEIL, G_FFLOOR, G_FRINT, G_FNEARBYINT,
G_INTRINSIC_TRUNC, G_INTRINSIC_ROUND, G_INTRINSIC_ROUNDEVEN
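As a host-side illustration of "promote to f32 and truncate the result back", here is a minimal sketch that operates on raw bf16 bit patterns. It is not the legalizer code; the helper names are hypothetical, and the NaN handling is a simplifying assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// A bf16 value is the upper 16 bits of an IEEE-754 binary32 value.
using BF16Bits = std::uint16_t;

// Extend: place the 16 bits in the top half of a 32-bit float (always exact).
inline float extendBF16(BF16Bits B) {
  std::uint32_t Bits = static_cast<std::uint32_t>(B) << 16;
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}

// Truncate: round to nearest, ties to even, on the 16 discarded bits.
// Assumption: any NaN input is turned into a quiet NaN.
inline BF16Bits truncToBF16(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  if ((Bits & 0x7fffffffu) > 0x7f800000u) // NaN
    return static_cast<BF16Bits>((Bits >> 16) | 0x0040u);
  std::uint32_t RoundingBias = 0x7fffu + ((Bits >> 16) & 1u);
  return static_cast<BF16Bits>((Bits + RoundingBias) >> 16);
}

// The promotion scheme: compute in f32, then narrow the result.
inline BF16Bits addBF16ViaF32(BF16Bits A, BF16Bits B) {
  return truncToBF16(extendBF16(A) + extendBF16(B));
}

// Usage check: 1.0 (0x3F80) + 2.0 (0x4000) gives 3.0 (0x4040).
inline void checkAddExample() {
  assert(addBF16ViaF32(0x3F80, 0x4000) == 0x4040);
}
```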
[clang][AMDGPU] Reject malformed target IDs with empty components (#196140)
Fixes #196078
An extra colon in `-mcpu` (e.g. `gfx900::xnack+`) produced an empty
feature component and triggered an assertion in `StringRef::back()`.
Return `std::nullopt` for malformed target IDs instead.
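The idea can be sketched outside clang as follows. This is a simplified, hypothetical parser (`parseTargetIdSketch` and `TargetIdSketch` are illustrative names, not clang's API); it assumes a target ID is a processor name followed by `:feature+` or `:feature-` components.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type for this sketch.
struct TargetIdSketch {
  std::string Processor;
  std::map<std::string, bool> Features; // feature name -> enabled
};

inline std::optional<TargetIdSketch> parseTargetIdSketch(std::string_view Id) {
  TargetIdSketch Result;
  std::size_t Colon = Id.find(':');
  Result.Processor = std::string(Id.substr(0, Colon));
  if (Result.Processor.empty())
    return std::nullopt;
  while (Colon != std::string_view::npos) {
    Id.remove_prefix(Colon + 1);
    Colon = Id.find(':');
    std::string_view Component = Id.substr(0, Colon);
    // Reject empty (or sign-only) components before calling back();
    // back() on an empty view is the kind of access that asserted.
    if (Component.size() < 2)
      return std::nullopt;
    char Sign = Component.back();
    if (Sign != '+' && Sign != '-')
      return std::nullopt;
    std::string Name(Component.substr(0, Component.size() - 1));
    if (!Result.Features.emplace(Name, Sign == '+').second)
      return std::nullopt; // duplicate feature
  }
  return Result;
}

// Usage check: the extra colon from the bug report is rejected.
inline void checkMalformedId() {
  assert(!parseTargetIdSketch("gfx900::xnack+").has_value());
}
```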
[RISCV][GISel] Add test coverage for the srliw+shXadd patterns. NFC (#196676)
GISel isn't canonicalizing the shift pair to an AND the same way
SelectionDAG does, so the patterns weren't firing. Add more directed
tests that use an AND explicitly.
[DAGTypeLegalizer] Add missing BR_CC handler for soft-promoted half operands (#196214)
`SoftPromoteHalfOperand` had no case for `ISD::BR_CC`, causing a crash
when a half-typed `fcmp` result fed directly into a conditional branch.
All other comparison-related nodes (`SETCC`, `SELECT_CC`) were already
handled. Add `SoftPromoteHalfOp_BR_CC` following the same pattern as
`SoftPromoteHalfOp_SELECT_CC`.
Fixes #195562
---------
Co-authored-by: Tony Varghese <tony.varghese at ibm.com>
[Utils] Fix duplicate DomTree updates in SplitIndirectBrCriticalEdges (#196475)
SplitIndirectBrCriticalEdges generates DomTree Insert/Delete pairs for
each predecessor in OtherPreds. However, OtherPreds can contain
duplicate entries when a conditional branch has both targets pointing to
the same block (e.g., `br i1 %c, label %X, label %X`). This produces
duplicate DomTree updates for the same edge, triggering the assertion
`std::abs(NumInsertions) <= 1 && "Unbalanced operations!"` in
LegalizeUpdates.
Fix by tracking which source blocks have already had DomTree updates
emitted, and skipping duplicates.
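A minimal sketch of the deduplication, assuming blocks are modelled as integers and updates as plain structs (these are stand-ins, not the real DomTree update types):

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-ins: a block is an int, an update is (kind, from, to).
enum class UpdateKind { Insert, Delete };
struct EdgeUpdate {
  UpdateKind Kind;
  int From;
  int To;
};

// Emit one Insert/Delete pair per distinct predecessor, even if the
// predecessor appears several times (e.g. both branch targets are the same).
inline std::vector<EdgeUpdate> buildUpdates(const std::vector<int> &OtherPreds,
                                            int OldTarget, int NewTarget) {
  std::vector<EdgeUpdate> Updates;
  std::set<int> Seen; // source blocks that already had updates emitted
  for (int Pred : OtherPreds) {
    if (!Seen.insert(Pred).second)
      continue; // duplicate predecessor: skip
    Updates.push_back({UpdateKind::Insert, Pred, NewTarget});
    Updates.push_back({UpdateKind::Delete, Pred, OldTarget});
  }
  return Updates;
}

// Usage check: predecessor 1 is listed twice but yields a single pair.
inline void checkDedup() {
  assert(buildUpdates({1, 1, 2}, 10, 11).size() == 4);
}
```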
[Clang] Do not eat SFINAE diagnostics for explicit template arguments (#139066)
Instead of merely suggesting the template arguments are invalid, we now
provide an explanation of why the explicit template argument is invalid.
[clang] Don't warn on __COUNTER__ in system macros
The introduction of extension and compatibility warnings means
that __COUNTER__ has started causing warnings (and -Werror= build
failures) when it is used inside system APIs.
This PR ensures that these diagnostics are not reported
for system macro expansions.
Revert "[lldb] Handle SIGINT via the MainLoop signal thread (on POSIX)" (#196684)
Reverts llvm/llvm-project#195959 because it caused
`TestIOHandlerCompletion.py` to fail in CI (GreenDragon).
[OpenMP][MLIR] Modify OpenMP Dialect lowering to support attach mapping
This PR adjusts the LLVM-IR lowering to support the new attach map type that the runtime
uses to link data and pointer together. In most cases this replaces the older
OMP_MAP_PTR_AND_OBJ map type, and it allows slightly more complicated ref_ptr/ref_ptee
and attach semantics.
[Flang][OpenMP][Offload] Modify MapInfoFinalization to handle attach mapping and 6.1's ref_* and attach map keywords
This PR is one of four required to implement the attach mapping semantics in Flang, alongside the
ref_ptr/ref_ptee/ref_ptr_ptee map modifiers and the attach(always/never/auto) modifiers.
This PR contains the MapInfoFinalization changes required to support these features. It mainly deals with
applying the correct attach map type and manipulating the descriptor type's maps for base address
and descriptor, so that when ref_ptr or ref_ptee is specified we emit one of the two maps, and when
ref_ptr_ptee is specified we emit our usual default maps. In all cases we add the "glue" of a new
attach map, except where the user has provided attach(never). Where the user has provided
attach(always), we apply the always map type to our attach maps.
It's important to note that the runtime has a toggle for the auto map behaviour, which flips the
attach behaviour between the newer semantics and the older semantics for backwards compatibility (outside
the purview of this PR but good to mention).
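The selection rules above can be sketched as a small decision function. Everything here is hypothetical: the enum names, the string labels, the emission order, and in particular the assumption that ref_ptr keeps the descriptor map while ref_ptee keeps the base address map, which the description does not spell out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical modifier enums for this sketch.
enum class RefMod { PtrPtee, Ptr, Ptee };
enum class AttachMod { Auto, Always, Never };

// Returns labels for the maps that would be emitted for one mapped variable.
inline std::vector<std::string> plannedMaps(RefMod Ref, AttachMod Attach) {
  std::vector<std::string> Maps;
  // Assumption: ref_ptr keeps the descriptor map, ref_ptee the data map,
  // and ref_ptr_ptee (the default) keeps both.
  if (Ref != RefMod::Ptee)
    Maps.push_back("descriptor");
  if (Ref != RefMod::Ptr)
    Maps.push_back("base_address");
  // The attach "glue" map is added unless attach(never) was given;
  // attach(always) additionally carries the always map type.
  if (Attach != AttachMod::Never)
    Maps.push_back(Attach == AttachMod::Always ? "attach|always" : "attach");
  return Maps;
}

// Usage check: the default case emits both maps plus the attach map.
inline void checkDefaultMaps() {
  assert(plannedMaps(RefMod::PtrPtee, AttachMod::Auto).size() == 3);
}
```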
[AMDGPU] Support atomic load and store for vector float types (v2f16, v2i16, v4i16, v4f16, v2f32) (#192904)
Add support for atomic load and store on <2 x half>, <4 x half>,
<2 x i16>, <4 x i16>, and <2 x float> vector types in the AMDGPU backend.
These types are promoted to equivalently sized integer types before
instruction selection:
<2 x half> -> i32
<4 x half> -> i64
<2 x i16> -> i32
<4 x i16> -> i64
<2 x float> -> i64
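A host-side analogy of the `<2 x half> -> i32` case, as a sketch only: the two 16-bit lanes are reinterpreted as one 32-bit integer so that a single integer atomic access covers the whole vector. The type and function names are hypothetical, and the memory orderings are an arbitrary choice for the example.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

// Raw half-precision bit patterns; two lanes model <2 x half>.
using HalfBits = std::uint16_t;
using V2Half = std::array<HalfBits, 2>;

static_assert(sizeof(V2Half) == sizeof(std::uint32_t),
              "<2 x half> must be the same size as i32");

// Store both lanes with one 32-bit atomic store.
inline void atomicStoreV2Half(std::atomic<std::uint32_t> &Slot, V2Half V) {
  std::uint32_t Packed;
  std::memcpy(&Packed, V.data(), sizeof(Packed));
  Slot.store(Packed, std::memory_order_release);
}

// Load both lanes with one 32-bit atomic load.
inline V2Half atomicLoadV2Half(const std::atomic<std::uint32_t> &Slot) {
  std::uint32_t Packed = Slot.load(std::memory_order_acquire);
  V2Half V;
  std::memcpy(V.data(), &Packed, sizeof(Packed));
  return V;
}

// Usage check: a value written as a vector reads back unchanged.
inline void checkRoundTrip() {
  std::atomic<std::uint32_t> Slot{0};
  atomicStoreV2Half(Slot, V2Half{0x3C00, 0x4000}); // 1.0h, 2.0h
  assert((atomicLoadV2Half(Slot) == V2Half{0x3C00, 0x4000}));
}
```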
[Flang][MLIR][OpenMP] Add distinct var_ptr_ptr_type to omp.map.info operations & remove ref_ptr_ptee
This is a precursor patch to the attach and ref_ptr/ref_ptee mapping that I intend to upstream
over the next few weeks. The attach maps require both the type of the descriptor and
the type of the pointed-to data to calculate the appropriate offload/base pointers and size. In
the base case of ref_ptr_ptee all of this information can be gathered from the pointer
and pointee maps, but in cases where we have only one (i.e. ref_ptr or ref_ptee) we will
be missing one of the key elements required to create a corresponding attach map.
So this PR adds the ability to ferry around the types of both var_ptr and
var_ptr_ptr, as opposed to just var_ptr, so that we can emit attach maps as separate
map.info operations that carry all the prerequisite information for lowering to LLVM-IR. Beyond that,
it seems reasonable to have var_ptr_ptr mirror var_ptr in all aspects for
consistency.
It also removes ref_ptr_ptee, instead treating the setting of both ref_ptr and
ref_ptee as meaning ref_ptr_ptee.
[Flang][OpenMP][MLIR] Add attach and ref map type lowering to MLIR
This doesn't implement the functionality, just the relevant map type
lowering to MLIR's omp.map.info. The more complicated changes to
MapInfoFinalizationPass.cpp and OpenMPToLLVMIRTranslation.cpp to support
attach map and the various ref/attach semantics will come in a subsequent
set of PRs. This just helps compartmentalize the changeset.
[lldb] Fix CommandObjects that don't set a return status (#196588)
Several CommandObject subclasses had DoExecute paths that returned
without ever calling SetStatus on the CommandReturnObject. The status
was silently left at its initial eReturnStatusStarted value, which made
Succeeded() report false for what were really successful commands and
left CommandReturnObject in an undefined state.
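The failure mode can be reproduced with a miniature stand-in. The class and enumerators below are hypothetical simplifications for illustration, not lldb's actual types.

```cpp
#include <cassert>

// Hypothetical miniature of a return-status object.
enum class Status { Started, SuccessFinishResult, Failed };

class ReturnObjectSketch {
  Status St = Status::Started; // initial value, like eReturnStatusStarted

public:
  void SetStatus(Status S) { St = S; }
  bool Succeeded() const { return St == Status::SuccessFinishResult; }
};

// Buggy shape: does its work but returns without setting a status.
inline void executeWithoutStatus(ReturnObjectSketch &Result) { (void)Result; }

// Fixed shape: every path sets a status before returning.
inline void executeWithStatus(ReturnObjectSketch &Result) {
  Result.SetStatus(Status::SuccessFinishResult);
}

// Usage check: the forgotten status makes a successful command look failed.
inline void checkStatusExample() {
  ReturnObjectSketch Buggy, Fixed;
  executeWithoutStatus(Buggy);
  executeWithStatus(Fixed);
  assert(!Buggy.Succeeded());
  assert(Fixed.Succeeded());
}
```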
[CodeGen] Use unique_ptr for FunctionInfo to prevent memory leaks (#196603)
Raw pointer return from `FunctionInfo::create` caused leaks in callers
like `computeABIInfoUsingLib`, breaking BPF tests on ASan bots.
Using `std::unique_ptr` enforces automatic cleanup.
Fixes leak from #194460.
Buildbot: https://lab.llvm.org/buildbot/#/builders/52/builds/17090
Assisted-by: Gemini
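A minimal sketch of why returning `std::unique_ptr` from a factory prevents this class of leak. `FunctionInfoSketch` and the live-object counter are hypothetical and only mimic the shape of `FunctionInfo::create`.

```cpp
#include <cassert>
#include <memory>

// Counts live objects so the example can observe cleanup.
static int LiveFunctionInfos = 0;

// Hypothetical stand-in for a factory-created object.
struct FunctionInfoSketch {
  FunctionInfoSketch() { ++LiveFunctionInfos; }
  ~FunctionInfoSketch() { --LiveFunctionInfos; }

  // Ownership is explicit in the return type; callers cannot forget to free.
  static std::unique_ptr<FunctionInfoSketch> create() {
    return std::make_unique<FunctionInfoSketch>();
  }
};

// A caller that never deletes anything explicitly and still does not leak.
inline int useAndDrop() {
  {
    std::unique_ptr<FunctionInfoSketch> FI = FunctionInfoSketch::create();
    assert(LiveFunctionInfos == 1);
  } // FI goes out of scope here and the object is destroyed
  return LiveFunctionInfos;
}
```

With a raw-pointer return, the same caller would need a matching `delete` on every exit path.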
[mlir][tensor] Enhance pattern to fold extract_slice(insert_slice) (#195045)
Extend the DropRedundantRankExpansionOnExtractSliceOfInsertSlice pattern
to support cases where the expanded dimensions are a subset of the
dropped dimensions, rather than requiring them to be exactly equal.
For example:
```
%inserted_slice = tensor.insert_slice %src into %dest[0, 0, 0, 0] [1, 1, 128, 480] [1, 1, 1, 1] :
tensor<128x480xf32> into tensor<1x1x128x480xf32>
%extracted_slice = tensor.extract_slice %inserted_slice[0, 0, 0, 0] [1, 1, 123, 1] [1, 1, 1, 1] :
tensor<1x1x128x480xf32> to tensor<123xf32>
```
can be folded into:
```
%extracted_slice = tensor.extract_slice %src[0, 0] [123, 1] [1, 1] :
tensor<128x480xf32> to tensor<123xf32>
```
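For the example above, the expanded dimensions are {0, 1} and the dimensions dropped by the extract are {0, 1, 3}, so the former is a subset of the latter. The relaxed precondition can be sketched with plain index lists; the helper names are hypothetical and this is not the pattern's actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Both inputs are sorted lists of dimension indices of the expanded tensor.
inline bool expandedIsSubsetOfDropped(const std::vector<int> &Expanded,
                                      const std::vector<int> &Dropped) {
  return std::includes(Dropped.begin(), Dropped.end(), Expanded.begin(),
                       Expanded.end());
}

// Remove the expanded dimensions from a per-dimension list
// (offsets, sizes, or strides) to get the list for the folded extract.
inline std::vector<int> dropDims(const std::vector<int> &Values,
                                 const std::vector<int> &Expanded) {
  std::vector<int> Out;
  for (int I = 0; I < static_cast<int>(Values.size()); ++I)
    if (!std::binary_search(Expanded.begin(), Expanded.end(), I))
      Out.push_back(Values[I]);
  return Out;
}

// Usage check with the numbers from the example.
inline void checkFoldExample() {
  assert(expandedIsSubsetOfDropped({0, 1}, {0, 1, 3}));
  assert((dropDims({1, 1, 123, 1}, {0, 1}) == std::vector<int>{123, 1}));
}
```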
Revert "[BOLT] Fix EH data encoding checks in relocateEHFrameSection (#195691)" (#196672)
This reverts commit 7ab26d7c3a160e1dc166f2673644baa396703ee5.
There is a test failure in bolt-tests::exceptions-split-strip.test.
[RISCV] Use the nhs.lea.h/w/d instead of nhs.lea.h/w/d.ze with Sh1AddPat. (#196660)
The srliw already took care of zeroing the upper bits. Using the non-.ze
form is consistent with the Zba version of this pattern.
[clang][deps] Move `ScanningOutputFormat` out of the library (#196631)
Basing behavior of the dependency scanner on the final output format is
a leaky abstraction. Instead, we should aim to introduce proper feature
flags.