[clang][X86] Add constexpr support for mpsadbw128/256 intrinsics (#202257)
Enable constexpr evaluation for `_mm_mpsadbw_epu8` and
`_mm256_mpsadbw_epu8` (`__builtin_ia32_mpsadbw128`/`mpsadbw256`).
Fixes #157522.
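For reference, the arithmetic a constant evaluator has to reproduce for the 128-bit form can be modeled in portable C++14. This is only a sketch of the documented MPSADBW semantics, not the Clang implementation; `Bytes16`, `Words8`, and `mpsadbwModel` are hypothetical names, and the model does not call the real intrinsics.

```cpp
#include <cstdint>

// Hypothetical stand-ins for a 16 x u8 vector and an 8 x u16 vector.
struct Bytes16 { std::uint8_t v[16]; };
struct Words8 { std::uint16_t v[8]; };

// Scalar model of the 128-bit MPSADBW operation:
//   imm bit 2    selects the start offset (0 or 4 bytes) into `a`,
//   imm bits 1:0 select one 4-byte block (offset 0, 4, 8 or 12) of `b`.
// Result word j is the sum of absolute differences between the four
// bytes of `a` starting at aOff + j and the selected block of `b`.
constexpr Words8 mpsadbwModel(const Bytes16 &a, const Bytes16 &b,
                              unsigned imm) {
  const unsigned aOff = ((imm >> 2) & 1u) * 4;
  const unsigned bOff = (imm & 3u) * 4;
  Words8 out{};
  for (unsigned j = 0; j < 8; ++j) {
    unsigned sum = 0;
    for (unsigned k = 0; k < 4; ++k) {
      const int d = int(a.v[aOff + j + k]) - int(b.v[bOff + k]);
      sum += unsigned(d < 0 ? -d : d);
    }
    out.v[j] = static_cast<std::uint16_t>(sum);
  }
  return out;
}

constexpr Bytes16 kRamp = {{0, 1, 2, 3, 4, 5, 6, 7,
                            8, 9, 10, 11, 12, 13, 14, 15}};
constexpr Bytes16 kZero = {{}};
constexpr Bytes16 kTens = {{0, 0, 0, 0, 10, 10, 10, 10}};

// Evaluated entirely at compile time.
static_assert(mpsadbwModel(kRamp, kZero, 0).v[0] == 6, "0+1+2+3");
static_assert(mpsadbwModel(kRamp, kZero, 4).v[0] == 22, "4+5+6+7");
static_assert(mpsadbwModel(kRamp, kTens, 1).v[0] == 34, "10+9+8+7");
```

The 256-bit form applies the same computation to each 128-bit lane, with the upper lane taking its control bits from imm bits 5:3.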
[SystemZ] Enable liveness reduction in pre-RA sched strategy. (#188823)
Add some handling of register pressure by scheduling an SU "low" if it closes a
live range (under certain conditions).
As this is checked before latency reduction, the "data-sequences" check that was
used to selectively enable latency reduction can now be removed.
This gives good improvements on several benchmarks and is also a simplification
of the SystemZPreRASchedStrategy.
[X86] phaddsub.ll - update PR39921/PR39936 test case to a vector.reduce.v8i32 call (#205310)
Matches the middle-end IR produced from the tests' C++ source since #199872.
[clang-tidy] Avoid token merging in redundant-parentheses fix-its (#202365)
The readability-redundant-parentheses check emitted fix-its that simply
removed both parentheses. Tools that apply those fix-its directly could
join adjacent tokens and produce invalid code, e.g. `return(0)` becoming
`return0`.
Replace the opening parenthesis with a space when removing it would
merge identifier characters across the removed token.
AI Usage: Test assisted by Codex.
Closes https://github.com/llvm/llvm-project/issues/185108
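The rule described above can be sketched on plain strings. This is a simplified model, not the check's code: the real fix-it works on tokens and source locations, and `isIdentChar` and `stripParens` are hypothetical helpers.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// True for characters that can appear inside an identifier or number.
static bool isIdentChar(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Removes the parentheses at positions `open` and `close`. If dropping
// the opening parenthesis would glue two identifier characters together,
// a single space is emitted in its place.
std::string stripParens(const std::string &src, std::size_t open,
                        std::size_t close) {
  assert(open < close && close < src.size());
  assert(src[open] == '(' && src[close] == ')');

  const bool wouldMerge = open > 0 && isIdentChar(src[open - 1]) &&
                          isIdentChar(src[open + 1]);

  std::string out = src.substr(0, open);
  if (wouldMerge)
    out += ' ';
  out.append(src, open + 1, close - open - 1); // text between the parens
  out.append(src, close + 1, std::string::npos); // text after ')'
  return out;
}
```

With this model, `stripParens("return(0);", 6, 8)` yields `return 0;`, while `stripParens("x = (a + b);", 4, 10)` yields `x = a + b;` with no extra space.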
[clang-tidy][NFC] Update CERT wiki link across all clang-tidy docs (#205086)
This patch updates the outdated CMU wiki link in the clang-tidy
documentation.
The old link currently returns a `301 Moved Permanently` redirecting to
the new GitHub Pages location. This patch updates the source file to
point directly to the new destination to prevent future link rot.
Closes #200277
[BOLT][AArch64] reproducible output with constant islands (#204546)
Optimized binaries from subsequent llvm-bolt runs may sometimes differ
due to the unordered set (SmallPtrSet), even if the input binary and
parameters are the same. Usage of SetVector guarantees a deterministic
sequence of binary functions while keeping each function as a single
instance.
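The property relied on here can be illustrated with a minimal insertion-ordered set. This is a standard-library stand-in written for illustration, not `llvm::SetVector` itself; `InsertionOrderedSet` is a hypothetical name.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Keeps each element once and remembers the order of first insertion,
// so iteration does not depend on hashing or pointer values.
template <typename T> class InsertionOrderedSet {
  std::vector<T> Order;       // deterministic iteration order
  std::unordered_set<T> Seen; // membership test only, never iterated

public:
  // Returns true if the value was newly inserted.
  bool insert(const T &value) {
    if (!Seen.insert(value).second)
      return false;
    Order.push_back(value);
    assert(Order.size() == Seen.size() && "containers out of sync");
    return true;
  }

  const std::vector<T> &items() const { return Order; }
};
```

Inserting 3, 1, 3, 2 always iterates as 3, 1, 2, whereas iterating a hash-based set of pointers can change from run to run as addresses change.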
Below are two different main functions produced by two llvm-bolt runs
before the fix (same input binary, same arguments).
```
0000000000210400 <main>:
210400: 10000140 adr x0, 0x210428 <main+0x28>
210404: f9400000 ldr x0, [x0]
210408: 10000140 adr x0, 0x210430 <main+0x30>
21040c: f9400000 ldr x0, [x0]
210410: 10000180 adr x0, 0x210440 <main+0x40>
210414: f9400000 ldr x0, [x0]
```
[38 lines not shown]
[NFC] UnicodeNameMappingGenerator: restore #include <unordered_map> (#205316)
#204303 removed this include while converting `unordered_map` uses to
`DenseMap`, but `loadDataFiles` still uses `unordered_multimap`.
See
https://ci.swift.org/job/llvm.org/job/clang-stage2-Rthinlto/job/main/360/
```
[2026-06-23T05:46:26.519Z] /Users/ec2-user/jenkins/workspace/m.org_clang-stage2-Rthinlto_main/llvm-project/llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp:34:13: error: missing '#include <unordered_map>'; 'unordered_multimap' must be declared before it is used
[2026-06-23T05:46:26.519Z] 34 | static std::unordered_multimap<char32_t, std::string>
[2026-06-23T05:46:26.519Z] | ^
```
Reland "[clang][ssaf][NFC] Move SSAF flags from FrontendOptions to a dedicated SSAFOptions" (#205312)
Third attempt of #204686
Previous attempt was: #204798
This was last reverted in #205279
This class helps keep SSAF options apart from the generic
FrontendOptions. It is inspired by AnalyzerOptions.
This way all current (and future) SSAF options live in one
centralized place.
In preparation of rdar://179151023
---
The previous attempt had issues on Windows with `/permissive` configs.
The issue was that `GENERATE_OPTION_WITH_MARSHALLING` had a generic
lambda capture and that does not constitute an ODR-use of the
[5 lines not shown]
[SystemZ] Add serialization strings for some MO target flags. (#203053)
These strings are needed for the MIR textual representation: if one is
missing, stopping with -stop-before=XXX and then resuming with
-start-before=XXX does not work.
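A small model of why a missing string breaks the round trip: printing needs a flag-to-name lookup and parsing needs the reverse. The flag values and spellings below are made up for illustration and are not the SystemZ ones; in LLVM, targets expose such tables through hooks like `getSerializableDirectMachineOperandTargetFlags`, which this sketch does not use.

```cpp
#include <cassert>
#include <cstring>

struct FlagName {
  unsigned flag;
  const char *name;
};

// Hypothetical table: flag 3 deliberately has no entry.
static const FlagName kFlagNames[] = {
    {1u, "example-got"},
    {2u, "example-indntpoff"},
};

// Printer side: returns nullptr when the flag has no serialization string.
const char *flagToName(unsigned flag) {
  for (const FlagName &entry : kFlagNames)
    if (entry.flag == flag)
      return entry.name;
  return nullptr;
}

// Parser side: returns false when the name is unknown.
bool nameToFlag(const char *name, unsigned &flag) {
  assert(name && "name must not be null");
  for (const FlagName &entry : kFlagNames) {
    if (std::strcmp(entry.name, name) == 0) {
      flag = entry.flag;
      return true;
    }
  }
  return false;
}

// A flag survives print-then-parse only if it has a table entry.
bool roundTrips(unsigned flag) {
  const char *name = flagToName(flag);
  unsigned parsed = 0;
  return name && nameToFlag(name, parsed) && parsed == flag;
}
```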
[X86][TTI] Handle structs in areTypesABICompatible() (#205308)
Fixes a regression from #205106. getValueType() asserts on aggregate
types. Use ComputeValueVTs() to compute the de-aggregated VTs.
Performing argument promotion for struct types seems pretty
dubious to me, but it was previously allowed, so I'm retaining
that behavior. We may want to disable promotion of aggregates
in ArgPromotion entirely though.
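The de-aggregation step can be pictured with a toy type model. This is not LLVM's `ComputeValueVTs()`; `ToyType`, `scalar`, `structOf`, `singleValueType`, and `flatten` are hypothetical, and the sketch only shows how a nested struct becomes a flat list of leaf types that can each be checked.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy type: either a named scalar or a struct with fields.
struct ToyType {
  bool aggregate;
  std::string name;            // used when !aggregate
  std::vector<ToyType> fields; // used when aggregate
};

ToyType scalar(std::string name) {
  return ToyType{false, std::move(name), {}};
}
ToyType structOf(std::vector<ToyType> fields) {
  return ToyType{true, "", std::move(fields)};
}

// Mirrors the failure mode in the commit: only valid for non-aggregates.
const std::string &singleValueType(const ToyType &t) {
  assert(!t.aggregate && "aggregate has no single value type");
  return t.name;
}

// Recursively collects the scalar leaves in declaration order.
static void flattenInto(const ToyType &t, std::vector<std::string> &out) {
  if (!t.aggregate) {
    out.push_back(singleValueType(t));
    return;
  }
  for (const ToyType &field : t.fields)
    flattenInto(field, out);
}

std::vector<std::string> flatten(const ToyType &t) {
  std::vector<std::string> out;
  flattenInto(t, out);
  return out;
}
```

For `{ i32, { float, i64 } }` the flattened list is `i32, float, i64`; a compatibility check can then iterate over the leaves instead of asking for the value type of the struct itself.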
[AMDGPU][HWEvents] Refactor VMEM_ACCESS as VMEM_READ_ACCESS
Instead of having an HWEvent that can be either a read or a write
depending on the target, keep the events as straightforward as
possible and let InsertWaitCnt interpret it. Rename VMEM_ACCESS
to VMEM_READ_ACCESS and set VMEM_STORE_ACCESS & similar events
even if the target does not have a VSCnt.
I think this conceptually makes more sense.
This separates concerns better so that HWEvents models events
objectively, and InsertWaitCnt handles them as necessary for the task
it is trying to achieve (insert wait instructions).
[AMDGPU][InsertWaitCnts] Move TENSOR/ASYNC event detection to separate header
I forgot to move those out of the way as they were not grouped with the others.
Now `getEventsFor` does all the work.
[AMDGPU][InsertWaitCnts] Make HWEvent a BitMask
Follow up from comments on https://github.com/llvm/llvm-project/pull/202886
Make HWEvent a bitmask by default instead of having both the enum and a separate HWEventSet. This has the advantage of streamlining the code a bit and opening the possibility of adding "modifiers" to events, e.g. I imagine we could now fold "VMemType" into the Events.
We already do this with things like SMEM_GROUP. At least now it's baked into the design.
I opted for a bit more verbosity by taking inspiration from FastMathFlags (FMF): instead of exposing a raw enum, I wrap it in a class with helper functions. The downside is having to reimplement all the little bitwise ops, but the result is a cleaner, simpler interface than a raw enum (class) with many helper functions. I initially tried that, but I recoiled at the sight of things like `contains(A, B)`, which isn't very clear, while `A.contains(B)` is self-explanatory.
Considering HWEvent is a bitmask, I also implemented a simple iterator to iterate over all set bits of the mask, which is useful because some APIs in InsertWaitCnt handle one event at a time.
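A rough sketch of the shape described above: a class wrapping the bits, a member `contains`, and an iterator over set bits. This is not the code from the patch; `EventBit`, `EventMask`, `countEvents`, and the bit values are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical event bits; the values are assumptions.
enum class EventBit : std::uint32_t {
  VMemRead = 1u << 0,
  VMemStore = 1u << 1,
  SMem = 1u << 2,
  Lds = 1u << 3,
};

class EventMask {
  std::uint32_t Bits = 0;

  static EventMask fromRaw(std::uint32_t raw) {
    EventMask m;
    m.Bits = raw;
    return m;
  }

public:
  EventMask() = default;
  EventMask(EventBit b) : Bits(static_cast<std::uint32_t>(b)) {}

  std::uint32_t raw() const { return Bits; }
  bool empty() const { return Bits == 0; }

  // True if every bit set in `other` is also set in this mask.
  bool contains(EventMask other) const {
    return (Bits & other.Bits) == other.Bits;
  }

  friend EventMask operator|(EventMask a, EventMask b) {
    return fromRaw(a.Bits | b.Bits);
  }

  // Visits one set bit at a time, lowest bit first.
  class iterator {
    std::uint32_t Rest;

  public:
    explicit iterator(std::uint32_t rest) : Rest(rest) {}
    EventMask operator*() const {
      assert(Rest != 0 && "dereferencing end iterator");
      return fromRaw(Rest & (0u - Rest)); // isolate lowest set bit
    }
    iterator &operator++() {
      Rest &= Rest - 1; // clear lowest set bit
      return *this;
    }
    bool operator!=(const iterator &other) const {
      return Rest != other.Rest;
    }
  };
  iterator begin() const { return iterator(Bits); }
  iterator end() const { return iterator(0); }
};

// Example consumer that handles one event at a time.
unsigned countEvents(EventMask mask) {
  unsigned n = 0;
  for (EventMask single : mask) {
    (void)single;
    ++n;
  }
  return n;
}
```

Usage reads as `Mask.contains(EventBit::Lds)` rather than a free function `contains(Mask, Bit)`, which is the readability point made in the message.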
[LSR] Don't merge ICmpZero uses outside loop (#205131)
In NarrowSearchSpaceByMergingUsesOutsideLoop, don't merge ICmpZero uses
outside the loop with uses inside the loop, as the resulting use will
have a kind that's not ICmpZero, which will mean the compare won't be
expanded correctly later.
[Clang][RISCV] packed exchanged add/sub intrinsics (#205251)
Add the `__riscv_{pas,psa,psas,pssa,paas,pasa}_x_*` header wrappers over
new `__builtin_riscv_*` builtins.
[MLIR][ADT] Improve matcher compatibility with C++20 STL (#205255)
When building MLIR on C++20 in Visual Studio with clang-cl, there are
several related compiler errors, grouped by project:
MLIRQueryMatcher
```C
type '_Mybase' (aka 'typename conditional<conjunction_v<is_trivially_destructible<DynMatcher>, is_trivially_move_constructible<DynMatcher>, is_trivially_move_assignable<DynMatcher>>, typename conditional<conjunction_v<is_trivially_destructible<DynMatcher>, is_trivially_copy_constructible<DynMatcher>, is_trivially_copy_assignable<DynMatcher>>, _Non_trivial_move<_Optional_construct_base<DynMatcher>, DynMatcher>, _Non_trivial_copy_assign<_Optional_construct_base<DynMatcher>, DynMatcher>>::type, _Non_trivial_move_assign<_Optional_construct_base<DynMatcher>, DynMatcher>>::type') is not a direct or virtual base of 'std::optional<mlir::query::matcher::DynMatcher>'
no member named '_Value' in 'std::optional<mlir::query::matcher::DynMatcher>'
no member named '_Has_value' in 'std::optional<mlir::query::matcher::DynMatcher>'
no matching function for call to '_Destroy_range'
invalid application of 'sizeof' to an incomplete type 'mlir::query::matcher::DynMatcher'
invalid application of 'alignof' to an incomplete type 'mlir::query::matcher::DynMatcher'
```
MLIRQueryMatcher, MLIRQuery, MLIRQueryLib, and mlir-query
```C
no viable conversion from 'std::vector<DynMatcher>' to 'ArrayRef<DynMatcher>'
incomplete type 'mlir::query::matcher::DynMatcher' used in type trait expression
```
[13 lines not shown]
[FixIrreducible] Use reportFatalUsageError for unsupported terminators (#205244)
`opt -passes=fix-irreducible` crashed via `llvm_unreachable` on a
`switch` terminator incident to an irreducible cycle header. Such
terminators must be lowered first (`lower-switch`); replace the
`llvm_unreachable` at both sites with `reportFatalUsageError` so the
pass fails gracefully instead of crashing.
Fixes #191978
Signed-off-by: AvhiMaz <avhimazumder5 at outlook.com>
clang: Change TargetInfo::setCPU to take StringRef (#205278)
The related APIs all use StringRef, so use StringRef for
consistency.
Co-Authored-By: Claude (Opus 4.8) <noreply at anthropic.com>