[OpenACC] Replace terminators with scf.yield in wrapMultiBlockRegionWithSCFExecuteRegion (#184458)
When wrapping a multi-block region in `scf.execute_region`, replace
`func::ReturnOp` (if flag `convertFuncReturn` is set) and `acc::YieldOp`
in all the blocks with `scf.yield` so the region has a valid SCF
terminator.
[SelectionDAG] Fix -Wunused-variable after #179318 (#184623)
```
llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:3572:26: error: unused variable 'NanEnc' [-Werror,-Wunused-variable]
3572 | const fltNanEncoding NanEnc = SrcSem.nanEncoding;
| ^~~~~~
```
Simply inline the definition of the variable given it is not used
anywhere else and the assignment is a simple copy.
[flang] Fix distribution build of Fortran builtin/intrinsic modules. (#184204)
Currently, `-DLLVM_DISTRIBUTION_COMPONENTS="flang-module-interfaces"`
doesn't work: the Fortran builtin/intrinsic modules fail to build as
part of the distribution build (`install-distribution`).
This PR fixes that.
[Hexagon] Use __HVX_IEEE_FP__ to guard protos that need -mhvx-ieee-fp (#184422)
Hexagon clang recently started to define __HVX_IEEE_FP__ when the
-mhvx-ieee-fp option is specified. Guard the intrinsic macros for
instructions that should only be available with -mhvx-ieee-fp with
__HVX_IEEE_FP__.
Additionally, the following NFC changes are included:
- NFC: Remove guards around HVX v60 intrinsic macros
Hexagon v60 is the oldest Hexagon version that supports HVX so these
guards were redundant. Presence of HVX is guarded separately, once
per the whole file.
- Remove comments from closing guards (HVX protos)
These comments served a very limited function, as each guard covers only
one macro. They were also incorrect. Instead of fixing them, remove them.
This also halves the number of changed lines whenever the guarding
conditions change.
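As a rough illustration of the guarding pattern described above, the sketch below defines a macro only when `__HVX_IEEE_FP__` is defined. The macro name `EXAMPLE_HVX_IEEE_ONLY_OP`, its placeholder body, and the helper names are hypothetical and do not come from the Hexagon headers.

```cpp
#include <cassert>

// Hypothetical intrinsic macro: only visible when the compiler defines
// __HVX_IEEE_FP__ (per the commit, when -mhvx-ieee-fp is specified).
// The body is a placeholder, not a real Hexagon builtin.
#if defined(__HVX_IEEE_FP__)
#define EXAMPLE_HVX_IEEE_ONLY_OP(a, b) ((a) + (b))
#endif

// Records whether this translation unit was compiled with the macro set.
#if defined(__HVX_IEEE_FP__)
constexpr bool kCompiledWithHvxIeeeFp = true;
#else
constexpr bool kCompiledWithHvxIeeeFp = false;
#endif

// Reports whether the guarded macro is available to user code.
constexpr bool ieeeOnlyOpAvailable() {
#ifdef EXAMPLE_HVX_IEEE_ONLY_OP
  return true;
#else
  return false;
#endif
}
```

Code that uses the guarded macro without the option then fails at compile time with an undefined-identifier error instead of reaching instruction selection.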
[flang][OpenMP] Avoid implicit default mapper on pointer captures (#184382)
This change fixes incorrect implicit declare mapper behavior in Flang
OpenMP lowering.
Issue:
Implicit default mappers were being attached/generated for pointer-based
implicit captures, and also on data-motion directives. That could
trigger recursive component mapping that overlaps/conflicts with
explicit user mappings, causing runtime mapping failures.
Fix:
- Skip implicit default mapper generation for implicit pointer captures
(keep support for allocatables).
- Do not auto-attach implicit mappers on target enter data, target exit
data, or target update.
- Apply the same pointer guard in the implicit target-capture lowering
path.
[mlir][AMDGPU] Add folders for memref aliases to TDM base creation (#184567)
The TDM base creation ops (amdgpu.make_tdm_base and
amdgpu.make_gather_tdm_base) take references to a
`%memref[%i0, %i1, ...]` for the starting point of the tiles in
global/shared memory that the TDM descriptor refers to. Memory alias ops
can be safely folded into these operations, since these two memref
operands are just pointers to a scalar starting point and don't have
semantics that depend on the memref layout (except to the extent that it
defines a location in memory).
While I'm here, I've cleaned up a few things, like the incorrect file
header, and fixed the tests to not use integer address spaces.
Co-authored-by: Claude Opus 4.6 <noreply at anthropic.com>
[X86] remove unnecessary movs when %rdx is an input to mulx (#184462)
Closes: https://github.com/llvm/llvm-project/issues/174912
When generating a `mulx` instruction for a widening multiplication, LLVM
doesn't place an input in the implicit first operand slot even if that
input is already in %rdx. Instead, it generates two unnecessary movs
before the mulx to swap the registers. GCC already has this optimization
(as shown in the issue), so this brings the two compilers closer to each
other on that front.
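For context, a widening multiplication of the kind discussed here can be written as below. This is a minimal sketch, not the reproducer from the issue: the `Wide` struct and `mulWide` name are made up for illustration, and `unsigned __int128` is a GCC/Clang extension. Whether the compiler actually selects `mulx` depends on the target features and the compiler version.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type holding both halves of a 128-bit product.
struct Wide {
  std::uint64_t Lo;
  std::uint64_t Hi;
};

// 64x64 -> 128-bit multiply. Relies on the unsigned __int128 extension
// available in GCC and Clang on 64-bit targets.
Wide mulWide(std::uint64_t A, std::uint64_t B) {
  const unsigned __int128 Product = static_cast<unsigned __int128>(A) * B;
  return {static_cast<std::uint64_t>(Product),
          static_cast<std::uint64_t>(Product >> 64)};
}
```

On x86-64, `mulx` reads one operand implicitly from %rdx, so the register holding each input determines whether extra moves are needed before the instruction.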
Co-authored-by: Aiden Grossman <aidengrossman at google.com>
[DTU] fix dominator tree update eliding reachable nodes (#177683)
The initial CFG looks like this:

After inlining, it looks like this:

It should be sufficient to add and remove the edges shown in the test, i.e.:
- add: `bb3->bb1.i` and `bb3->bb2.i`
- remove: `bb3->bb4`, `bb3->bb5` and `bb5->bb8`
New nodes, like `bb5.body`, get discovered when adding `bb3->bb2.i` (see the "StepByStep" variant of the test). Without the fix in this patch, however, `bb5.body` gets elided when the deleted edges get taken into account, and `DT` is left invalid.
[mlir][Func] Fix FuncOp verifier ordering via hasRegionVerifier (#184612)
FuncOp::verify() iterated over all blocks and called
getMutableSuccessorOperands() on any RegionBranchTerminatorOpInterface
terminator to check return types. This ran during the entrance phase of
verification — before child ops had been verified — so a malformed
terminator whose getMutableSuccessorOperands() assumed invariants
established by its own verify() could crash instead of emitting a clean
diagnostic.
Fix by switching to hasRegionVerifier=1: rename verify() →
verifyRegions() so the return-type checks run in the exit phase, after
all nested ops have already been verified.
To demonstrate the bug and guard against regression, add
TestCrashingReturnOp to the test dialect. The op implements
RegionBranchTerminatorOpInterface and calls report_fatal_error in
getMutableSuccessorOperands() when its 'valid' unit-attr is absent,
reproducing the class of crash described above. The accompanying lit
test confirms a clean diagnostic is emitted rather than a crash.
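The ordering argument can be modeled with a toy verifier, sketched below. It does not use MLIR; `ToyOp` and `verifyOp` are hypothetical stand-ins that only mirror the entrance-phase / nested-ops / exit-phase order described above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy operation: Valid stands in for the invariants an op's own
// verify() would establish.
struct ToyOp {
  std::string Name;
  bool Valid;
  std::vector<ToyOp> Nested;
};

// Verifies Op, its nested ops, and then runs the exit-phase check.
// Returns false on the first failure and records each step in Trace.
bool verifyOp(const ToyOp &Op, std::vector<std::string> &Trace) {
  // Entrance phase: op-local checks, before any nested op is verified.
  Trace.push_back("verify " + Op.Name);
  if (!Op.Valid) {
    Trace.push_back("error: " + Op.Name + " is malformed");
    return false;
  }
  for (const ToyOp &Child : Op.Nested)
    if (!verifyOp(Child, Trace))
      return false;
  // Exit phase: runs only after every nested op verified successfully,
  // so it may safely rely on the nested ops' invariants.
  Trace.push_back("verifyRegions " + Op.Name);
  return true;
}
```

With a malformed terminator nested in a function, the trace ends at the terminator's own diagnostic and the function's exit-phase check never runs, which is the behavior the patch relies on.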
[SPIRV] Fix global emission for modules with no functions (#183833)
Right now we have a problem where if you have an LLVM module with globals
but no functions, a completely empty SPIR-V module is emitted.
This is because global emission is dependent on tracking intrinsic
functions being emitted in functions.
As a simple fix, just insert a service function, which the backend is
already set up to not actually emit, if there are no real functions.
The current use case of the service function is for function pointers. I
don't think it's possible that we need to generate a service function
both for function pointers and for globals with no functions, so I
just added an error (not an assert) in case we do need it for
both cases.
Probably we should rework global handling in the future to work without
these workarounds, but this is a pretty fundamental issue so let's work
[15 lines not shown]
[AArch64] Update clmul tests after #184403 (#184611)
This was likely a mid-air collision with #183282. Update the tests to
match the current state of HEAD.
[mlir][shape] Fix crash when shape.lib array references undefined symbol (#184613)
In verifyOperationAttribute(), the single-symbol path for shape.lib used
SymbolTable::lookupSymbolIn() followed by an explicit null check. The
array path at lines 196-197 used dyn_cast<FunctionLibraryOp>() directly
on the lookup result, which asserts when the symbol is not found (null
pointer).
Fix: use dyn_cast_or_null<> instead of dyn_cast<> so that a missing
symbol falls through to the existing "does not refer to
FunctionLibraryOp" error diagnostic instead of asserting.
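The difference between the two casts can be sketched with toy analogues, shown below. These are not the LLVM templates: `toyDynCast` and `toyDynCastOrNull` are hypothetical stand-ins built on `dynamic_cast`, and the `Operation` hierarchy is a simplified model.

```cpp
#include <cassert>

// Simplified stand-ins for the real op classes.
struct Operation {
  virtual ~Operation() = default;
};
struct FunctionLibraryOp : Operation {};
struct OtherOp : Operation {};

// Toy analogue of dyn_cast<>: the input must be non-null.
template <typename To> To *toyDynCast(Operation *Op) {
  assert(Op && "toyDynCast on a null pointer");
  return dynamic_cast<To *>(Op);
}

// Toy analogue of dyn_cast_or_null<>: a null input yields a null result.
template <typename To> To *toyDynCastOrNull(Operation *Op) {
  return Op ? dynamic_cast<To *>(Op) : nullptr;
}

// Models the fixed array path: a missing symbol (null lookup result) and a
// symbol of the wrong kind both take the same "not a library" path.
bool refersToFunctionLibrary(Operation *LookupResult) {
  return toyDynCastOrNull<FunctionLibraryOp>(LookupResult) != nullptr;
}
```

Calling `toyDynCast` on the null lookup result would trip the assertion, which mirrors the crash; the null-tolerant variant lets the caller emit its normal diagnostic.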
Fixes #159653
[mlir][affine] Fix crash in vectorizeAffineLoopNest test utility for reduction loops (#184617)
The test utility function `testVecAffineLoopNest` called
`isLoopParallel` with a `reductions` output parameter, which populates
reduction descriptors when the loop performs a reduction. However, these
descriptors were never added to `strategy.reductionLoops` before calling
`vectorizeAffineLoopNest`. When the vectorizer then processed a loop
with `iter_args`, it found no reduction descriptors in the strategy and
hit an assertion failure.
Fix by registering the reduction loop descriptors in the strategy before
vectorization, matching what the production vectorizer code already does
correctly.
Fixes #128334
[VPlan] Preserve IsSingleScalar for hoisted predicated load. (#184453)
The predicated loads may be single scalar (e.g. for VF = 1). We should
preserve IsSingleScalar when hoisting them. As all loads access the same
address, IsSingleScalar must match across all loads in the group.
This fixes an assertion when interleaving-only with hoisted loads.
Fixes https://github.com/llvm/llvm-project/issues/184372
PR: https://github.com/llvm/llvm-project/pull/184453
[clang][Lex] Preserve MultipleIncludeOpt state in Lexer::peekNextPPToken (#183425)
Fixes https://github.com/llvm/llvm-project/issues/180155.
This is a duplicate of https://github.com/llvm/llvm-project/pull/180700
except that I also added some tests. It's fine to go with either PR, but
we should add the tests.
peekNextPPToken lexed a token and mutated MIOpt, which could clear the
controlling-macro state for main files in C++20 modules mode.
Save/restore MIOpt in Lexer::peekNextPPToken.
Add regression coverage in
LexerTest.MainFileHeaderGuardedWithCPlusPlusModules that checks to make
sure the controlling macro is properly set in C++20 mode.
Add source level lit test in miopt-peek-restore-header-guard.cpp that
checks to make sure that the warnings that depend on the MIOpt state
machine are emitted in C++20 mode.
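The save/restore idea can be sketched with a toy lexer, as below. This is not Clang's `Lexer`: `ToyLexer`, `ToyMIOpt`, and the rule that any lexed token clears the flag are simplifying assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for the multiple-include optimization state.
struct ToyMIOpt {
  bool ControllingMacroValid = true;
};

class ToyLexer {
public:
  explicit ToyLexer(std::vector<std::string> Toks)
      : Tokens(std::move(Toks)) {}

  // Consumes one token; as a side effect, updates the MIOpt state.
  std::string lex() {
    if (Pos >= Tokens.size())
      return "";
    MIOpt.ControllingMacroValid = false; // simplified invalidation rule
    return Tokens[Pos++];
  }

  // Looks at the next token without consuming it. Both the position and
  // the MIOpt state are saved first and restored afterwards.
  std::string peek() {
    const std::size_t SavedPos = Pos;
    const ToyMIOpt SavedMIOpt = MIOpt;
    std::string Tok = lex();
    Pos = SavedPos;
    MIOpt = SavedMIOpt;
    return Tok;
  }

  bool controllingMacroValid() const { return MIOpt.ControllingMacroValid; }

private:
  std::vector<std::string> Tokens;
  std::size_t Pos = 0;
  ToyMIOpt MIOpt;
};
```

Without the restore of `MIOpt` in `peek()`, a lookahead alone would clear the flag even though no token was consumed, which is the class of bug the patch addresses.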
[analyzer] Suppress optin.cplusplus.VirtualCall warnings in system headers (#184183)
Fixes #184178
The optin.cplusplus.VirtualCall checker reports warnings for virtual
method calls during construction/destruction even when the call site is
in a system header (included via -isystem). Users cannot fix such code
and must resort to NOLINT suppressions.
Add a system header check in checkPreCall before emitting the report,
consistent with how other checkers (e.g. MallocChecker) handle this.
[SPIRV] Rename `selectSelectDefaultArgs` to `selectBoolToInt` (#184120)
The function is used to extend a `bool` (vector or scalar) into `1/-1`
for `true` and `0` for `false` (vector or scalar).
There is no obvious "default" argument for a select operation, so the
original name is confusing.
This patch:
* renames this function to better signal its intent,
* makes the boolean argument explicit in the function (instead of
implicit through the first register operand of the instruction),
* renames `I` to `InsertAt`.
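The value mapping that the renamed function implements can be modeled on scalars as follows. This is only a sketch of the semantics, not the SPIR-V backend code; the `IsSigned` parameter is an assumption that models choosing `-1` (sign extension) versus `1` (zero extension) for `true`.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of extending a bool to an integer:
// true maps to 1 (zero-extend) or -1 (sign-extend), false maps to 0.
std::int32_t boolToInt(bool Value, bool IsSigned) {
  const std::int32_t TrueValue = IsSigned ? -1 : 1;
  return Value ? TrueValue : 0;
}
```

For a vector `bool`, the same mapping would apply per element.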
[LV] Add `-force-target-supports-masked-memory-ops` option (#184325)
This can be used to make target agnostic tail-folding tests much less
verbose, as masked loads/stores can be used rather than scalar
predication.
[TableGen] Complete the support for artificial registers (#183371)
Artificial registers were added in
eb0c510ecde667cd911682cc1e855f73f341d134
as a means of giving super-registers heavier weights than that
of their subregisters, even when they only contain a single
physical subregister.
Artificial registers thus do exist in code and participate in
register unit weight calculations, but are not supposed to be
available for register allocation.
This patch completes the support for artificial registers to:
- Ignore artificial registers when joining register unit uber
sets. Artificial registers may be members of classes that
together include registers and their sub-registers, making it
impossible to compute normalised weights for uber sets they
belong to.
[28 lines not shown]