[AArch64] Fix `shufflevector` miscompilation on `aarch64_be` (#193076)
A function like
```llvm
define <4 x i16> @xtn_shuffle_even_v8i16(<8 x i16> %a) {
entry:
%r = shufflevector <8 x i16> %a, <8 x i16> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
ret <4 x i16> %r
}
```
will use the `xtn` instruction, which for each 32-bit vector element
keeps only the lower 16 bits, so effectively this is a truncation.
However, if the vector actually has 16-bit elements, then the conversion
from a shuffle to a truncation is only valid on LE, not on BE. On BE,
`uzp1` should be used instead. So this PR moves some logic to right
after a check for LE, so that BE does not miscompile.
[5 lines not shown]
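The endianness dependence described above can be illustrated with a small model. This is only a sketch, assuming the usual IR-level view that a vector bitcast behaves like a store followed by a load; the helper names are hypothetical and nothing here reflects the actual AArch64 lowering code.

```python
import struct

def bitcast_v8i16_to_v4i32(elems, byteorder):
    """Model a bitcast as 'store <8 x i16>, then load <4 x i32>' (assumption)."""
    prefix = "<" if byteorder == "little" else ">"
    raw = struct.pack(prefix + "8H", *elems)
    return list(struct.unpack(prefix + "4I", raw))

def truncate_to_i16(words):
    """Keep only the low 16 bits of each 32-bit element, as a truncation does."""
    return [w & 0xFFFF for w in words]

def shuffle_even(elems):
    """The shuffle mask <0, 2, 4, 6> from the example function."""
    return elems[0::2]

a = [0x1000, 0x1001, 0x1002, 0x1003, 0x1004, 0x1005, 0x1006, 0x1007]

le_result = truncate_to_i16(bitcast_v8i16_to_v4i32(a, "little"))
be_result = truncate_to_i16(bitcast_v8i16_to_v4i32(a, "big"))
```

In this model the little-endian truncation matches the even-element shuffle, while the big-endian truncation yields the odd elements instead, which is why the shuffle-to-truncation rewrite is only valid on LE.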
Prevent undefined behavior caused by combination of branch and load delay slots on MIPS1 (#185427)
Under certain conditions, the LLVM `MipsDelaySlotFiller` fills a branch
delay slot with an instruction that itself requires a load delay slot.
However, `MipsDelaySlotFiller` does not check the filling instruction for
this hazard, which leads to code like this:
```asm
beqz $1, $BB0_5
lbu $2, %lo(_RNvCs5jWYnRsDZoD_3app13CONTROLLERS_A)($2)
# --- Some other instructions
$BB0_5:
andi $1, $2, 1
```
`lbu` was moved into the branch delay slot, but it also has a load delay
slot. When the branch to `$BB0_5` is taken, the value of `$2` is not yet
ready, which leads to undefined behavior.
This PR proposes treating instructions with a load delay slot as
hazardous for the branch delay slot, only on `MIPS1`. This will prevent
[23 lines not shown]
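The proposed rule can be sketched roughly as follows. This is a toy model, not the `MipsDelaySlotFiller` implementation: the `Instr` type, the `has_load_delay_slot` flag, and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instr:
    opcode: str
    has_load_delay_slot: bool = False  # hypothetical flag, set for loads such as lbu

def is_branch_delay_slot_hazard(candidate, is_mips1):
    """On MIPS1 only, a load with its own delay slot must not fill a branch delay slot."""
    return is_mips1 and candidate.has_load_delay_slot

def pick_delay_slot_filler(candidates, is_mips1):
    """Return the first non-hazardous candidate, or a nop if none qualifies."""
    for instr in candidates:
        if not is_branch_delay_slot_hazard(instr, is_mips1):
            return instr
    return Instr("nop")

lbu = Instr("lbu", has_load_delay_slot=True)
addu = Instr("addu")
```

With this check, `lbu` is rejected as a filler when targeting MIPS1, and stays eligible on later ISAs that interlock loads in hardware.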
[RFC][NFCI][Constants] Add `Constant::isZeroValue`
The old `isZeroValue` was removed because it was functionally identical to
`Constant::isNullValue`. Currently, a "null value" in LLVM means a zero value.
We are moving toward changing the semantics of `ConstantPointerNull` to
represent a semantic null pointer instead of a zero-valued pointer. As a result,
the meaning of "null value" will also change in the future.
This PR series is the first step toward renaming the two widely used "null
value" interfaces to "zero value". As the first PR in the series, this change
adds a "new" `isZeroValue` alongside `isNullValue`, and makes `isNullValue` call
`isZeroValue` directly. Then, all uses of `isNullValue` in LLVM are replaced
with `isZeroValue`. Uses in other projects will be updated in separate PRs.
The plan is to eventually remove `isNullValue` after all uses have been
migrated.
[VPlan] Get GEP wrap flags from VPInstructions (NFCI). (#195730)
Add helper to retrieve GEP no-wrap flags from VPInstructions, looking
through zero-index GEPs and pointer casts, like
Value::stripPointerCasts. Removes an access to underlying IR.
[ModuleInliner] Skip function declarations during candidate scan (#195567)
This patch skips function declarations during the candidate scan in
ModuleInlinerPass::run as declarations do not have bodies.
[InlineOrder] Fix assertion failure in CostBenefitPriority (#195564)
InlineCost::getStaticBonusApplied() triggers an assertion failure
if the CostBenefitPriority constructor calls it when
IC.isVariable() is false. This is because
getStaticBonusApplied() expects isVariable() to be true.
Unconditionally populating CostBenefit also incorrectly prioritizes
a NeverInline candidate with a cost-benefit pair over other
valid variable-cost sites.
This patch fixes the crash and the sorting issue by calling
getStaticBonusApplied() and populating CostBenefit only when
IC.isVariable() is true. For AlwaysInline and NeverInline costs,
CostBenefit is explicitly set to std::nullopt.
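The guard described above might look like the following in outline. This is a simplified Python model under stated assumptions: `InlineCostModel` and `CostBenefitPrioritySketch` are hypothetical stand-ins for `InlineCost` and `CostBenefitPriority`, and the cost-benefit pair is reduced to a plain tuple.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class InlineCostModel:
    kind: str  # "always", "never", or "variable"
    cost_benefit: Optional[Tuple[int, int]] = None
    static_bonus: int = 0

    def is_variable(self):
        return self.kind == "variable"

    def get_static_bonus_applied(self):
        # Mirrors the assertion mentioned in the commit message.
        assert self.is_variable(), "only valid for variable costs"
        return self.static_bonus

class CostBenefitPrioritySketch:
    def __init__(self, ic):
        if ic.is_variable():
            # Only variable costs carry a meaningful bonus and cost-benefit pair.
            self.static_bonus_applied = ic.get_static_bonus_applied()
            self.cost_benefit = ic.cost_benefit
        else:
            # AlwaysInline / NeverInline: leave both unset.
            self.static_bonus_applied = 0
            self.cost_benefit = None

never = InlineCostModel("never", cost_benefit=(1, 100))
variable = InlineCostModel("variable", cost_benefit=(5, 50), static_bonus=10)
```

Constructing the priority for `never` no longer trips the assertion, and its stale cost-benefit pair is dropped so it cannot outrank variable-cost sites.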
[IPO] Fix infinite recursive inlining in ModuleInliner (#195471)
The ModuleInliner currently lacks inline history tracking. Without
it, the inliner can get stuck in an infinite loop when mutually
recursive functions are involved.
This patch enables inline history tracking in the ModuleInliner to
address this issue.
The minsize attribute in the test case lowers the threshold for the
mutually recursive functions, ensuring the bug reproduces in pass
isolation.
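To illustrate why history tracking terminates the loop, here is a minimal sketch. It assumes a history shaped as a list of (callee, parent index) pairs that is walked back to the root; the toy call graph and function names are hypothetical and do not reflect the ModuleInliner's actual data structures.

```python
def history_includes(callee, history_id, history):
    """Walk the parent chain and report whether callee was already inlined on it."""
    while history_id != -1:
        func, parent = history[history_id]
        if func == callee:
            return True
        history_id = parent
    return False

def inline_all(call_graph, root):
    """Inline call sites reachable from root, skipping ones already in their history."""
    history = []  # entries are (callee, parent history index)
    worklist = [(callee, -1) for callee in call_graph[root]]
    inlined = []
    while worklist:
        callee, history_id = worklist.pop()
        if history_includes(callee, history_id, history):
            continue  # would re-inline a function already on this chain
        history.append((callee, history_id))
        new_id = len(history) - 1
        inlined.append(callee)
        # Call sites exposed by inlining inherit the new history entry.
        for nested in call_graph[callee]:
            worklist.append((nested, new_id))
    return inlined

# f and g call each other; without the history check this would never finish.
mutual = {"main": ["f"], "f": ["g"], "g": ["f"]}
result = inline_all(mutual, "main")
```

Each exposed call site remembers the chain of inlines that produced it, so the second attempt to inline `f` is rejected and the worklist drains.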
[SSAF] Add CLI option --ssaf-apply-source-pass for SourcePassAnalysis
The '--ssaf-apply-source-pass' option takes a list of
SourcePassAnalysis names. The option expects the user to provide a WPA
result using '--ssaf-load-wpa-result'.
Provided SourcePassAnalysis passes will be run by clang on each AST
with the WPA result.
[mlir][Vector] Add load, store, etc. to dropleadunitdim (#195686)
Discussions on improvements to fold-memref-alias-ops changes revealed
that the patterns meant to drop leading unit dimensions from vector
operations weren't handling load, store, and other "terminal" vector
dialect operations. This PR adds the patterns to fix that.
Assisted-by: Claude 4.7
[VPlan] Bail out on recipes without live-outs in narrowIG. (#195729)
When narrowing interleave groups, recipes with users outside the loop
region are not handled properly. We would need to check whether
the operations can be narrowed in a way that still provides the correct
results to those users.
For now, just bail out to fix miscompiles/crashes.
[libsycl] Add explicit LLVMSupport dependency (#195371)
libLLVMSYCL.so includes <llvm/Object/OffloadBinary.h>, which
transitively instantiates LLVMSupport templates.
With -Wl,-z,defs those symbols must be on the direct link line.
LLVMObject (which contains OffloadBinary.cpp) was already linked, but
LLVMSupport was not, causing undefined-reference errors in clean builds.
[clang] correctly handle +/- features when matching modules
By sorting and then comparing, we made +sse2 -sse2 equal to
-sse2 +sse2, where the former has sse2 disabled, and the latter
enabled. I verified this is actually the case by compiling the
following:
```
#ifdef __SSE2__
#error X
#endif
```
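The underlying problem can be reproduced in a few lines. This is a sketch, assuming that the last occurrence of a feature wins when flags are applied in order; `enabled_features` is a hypothetical helper and not part of clang.

```python
def enabled_features(flags):
    """Apply +feat / -feat flags in order; the last mention of a feature wins."""
    state = {}
    for flag in flags:
        state[flag[1:]] = flag.startswith("+")
    return {name for name, on in state.items() if on}

first = ["+sse2", "-sse2"]   # sse2 ends up disabled
second = ["-sse2", "+sse2"]  # sse2 ends up enabled

same_after_sorting = sorted(first) == sorted(second)
same_semantics = enabled_features(first) == enabled_features(second)
```

Here `same_after_sorting` is true while `same_semantics` is false, so a comparison on the sorted lists treats two different configurations as matching.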
Pull Request: https://github.com/llvm/llvm-project/pull/187624