[RISCV] Fix fixed-length masked.{u,s}{div,rem} lowering not converting operands (#197913)
Similar to #197724, but this time I also somehow forgot to convert the
operands to scalable vectors. I'm surprised that nothing asserted here,
since SDT_RISCVIntBinOp_VL has a type profile constraint that the
operands and result types need to be the same.
Reland [C++20] [Modules] Don't profiling the callee of CXXFoldExpr (#190732) (#195983)
Close https://github.com/llvm/llvm-project/issues/190333
For the test case, the root cause of the problem is that the compiler
thought the declaration of `operator &&` in consumer.cpp might change
the meaning of '&&' in the requires clause of `F::operator()`, which
doesn't make sense. Here we skip profiling the callee to solve the
problem. Note that we already record the kind of the operator, so '&&'
and '||' won't be confused.
---
See the discussion in https://github.com/llvm/llvm-project/pull/194283
For the newly found pattern, where the requires clause may contain other
binary operators (such as operator +), e.g.,
```C++
[8 lines not shown]
[Clang][Sema] Fix crash in __builtin_dump_struct with immediate callables (#192880)
## Motivation
`ComplexRemove` (used by `Sema::PopExpressionEvaluationContext` to strip
nested `ConstantExpr` wrappers) inherits the default
`TreeTransform::TransformOpaqueValueExpr`, which asserts on any
`OpaqueValueExpr` with a non-null `SourceExpr` unless a binding has
already been set up.
`__builtin_dump_struct` binds the record pointer to an `OpaqueValueExpr`
inside a `PseudoObjectExpr`. When the callable argument is
immediate-escalated (e.g. via `__builtin_is_within_lifetime`),
`RemoveNestedImmediateInvocation` roots `ComplexRemove` inside the PSE's
semantic form, reaching that OVE without the binding the assert expects
- triggering a crash.
## Closing Issues
[6 lines not shown]
[CoroSplit] Never collect allocas used by catchpad into frame (#186728)
Windows EH requires exception objects to be allocated on the stack, but
there is no reliable way to identify them. CoroSplit employs a
best-effort algorithm to determine whether allocas persist on the stack
or on the frame, which may result in miscompilation when Windows
exceptions are used.
This patch proposes that we treat allocas used by catchpad as exception
objects and never place them on the frame. A verifier check is added to
enforce that operands of catchpad are either constants or allocas.
Close #143235 Close #153949 Close #182584
[VPlan] Fold canonical IV recipe creation into createLoopRegion. (#198383)
Remove the separate addCanonicalIVRecipes transform and create the
canonical IV's increment and the latch's exiting branch directly in
createLoopRegion, using the loop region's VPRegionValue for the
canonical IV. The temporary VPPhi placeholder previously inserted in the
header is no longer needed.
PR: https://github.com/llvm/llvm-project/pull/198383
[Clang][AMDGPU] Add ``amdgcn_av("none")`` attribute for atomic expressions
Add a statement attribute that suppresses MakeAvailable/MakeVisible
cache operations on AMDGPU atomic instructions while preserving memory
ordering (waits).
The attribute takes a string argument specifying the mode. Currently
"none" is the only supported mode. The resulting atomic or fence
instruction carries !mmra !{!"amdgcn-av", !"none"} metadata.
Assisted-By: Claude Opus 4.6
[IR] Introduce an appendTags() idiom to set MMRA metadata [NFC]
This is a simple set-union of new tags and existing tags. This is safer than
directly setting metadata, which can over-write existing MMRAs.
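
The difference between overwriting and appending can be sketched in plain
C++ as below. This is only an illustration under stated assumptions: `Tag`,
`TagSet`, `setTags` and the signature of `appendTags` are hypothetical
stand-ins built on `std::set`, not the LLVM API added by this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>

// Hypothetical model of an MMRA tag: a (prefix, suffix) pair such as
// {"amdgcn-av", "none"}.
using Tag = std::pair<std::string, std::string>;
using TagSet = std::set<Tag>;

// Direct setting: whatever tags were already attached are lost.
TagSet setTags(const TagSet & /*Existing*/, const TagSet &New) { return New; }

// Appending: a plain set-union of the existing and the new tags.
TagSet appendTags(TagSet Existing, const TagSet &New) {
  const std::size_t Before = Existing.size();
  Existing.insert(New.begin(), New.end());
  // A union can only keep or grow the set, never drop an existing tag.
  assert(Existing.size() >= Before);
  return Existing;
}
```

In this model, appending `{"amdgcn-av", "none"}` to a set that already holds
another tag yields both tags, while the overwriting variant keeps only the
new one.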
Assisted-By: Claude Opus 4.6
[AArch64][GlobalISel] Add tablegen pattern for uaddo combine (#198724)
Related to #197693, which filters the worklist to only include opcodes
for which there are combines. That is mostly handled by canMatchOpcode,
which is tablegen'ed, but some old combines like this one are missing a
tablegen pattern and require extra handling. This adds a simple wrapper
so it gets picked up by canMatchOpcode and we can delete the C++
handling.
Assisted-by: codex
[clang][ASTImporter] Fix of crash at ConstraintSatisfaction import (#197407)
A null pointer dereference could happen during `ASTImporter` import of a
`ConstraintSatisfaction` object.
[GVN] Properly combine AA metadata if available load is hoisted (#197948)
Ensure the AA metadata are properly merged between the new load and the
old one during PRE. Actually set `DoesKMove` in `combineMetadataForCSE`;
otherwise the new load is assumed not to move, which is not correct if
the new load has been hoisted.
Fixes: https://github.com/llvm/llvm-project/issues/196787.
[DenseMap] Invalidate iterators on erase (#199369)
Tighten DenseMap's `erase` contract so that, like `insert` and `grow`,
it invalidates iterators and references obtained before the call.
Under the current tombstone-based deletion this is purely an
LLVM_ENABLE_ABI_BREAKING_CHECKS check — the bucket array is not actually
mutated for other entries — but it surfaces stale-iterator-after-erase
patterns now rather than when DenseMap's deletion scheme changes.
Mirrors the SmallPtrSet change in #96762, which dropped tombstones in
small mode and likewise had `erase` invalidate iterators.
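
One way for callers to avoid holding an iterator across `erase` is to
collect the keys first and erase by key afterwards. The sketch below uses
`std::unordered_map` as a stand-in so that it is self-contained; the
function name is hypothetical and the code does not reproduce DenseMap's
API or its ABI-breaking checks.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Removes every entry whose value is odd and returns how many were removed.
// No iterator or reference obtained before an erase is used after it.
std::size_t eraseOddValues(std::unordered_map<int, int> &M) {
  const std::size_t Before = M.size();

  // Phase 1: read-only walk that records the keys to drop.
  std::vector<int> Doomed;
  for (const auto &KV : M)
    if (KV.second % 2 != 0)
      Doomed.push_back(KV.first);

  // Phase 2: erase by key; each call does its own fresh lookup.
  for (int Key : Doomed)
    M.erase(Key);

  assert(M.size() + Doomed.size() == Before);
  return Doomed.size();
}
```

The two-phase shape costs an extra lookup per erased key, but it does not
depend on which iterators the container keeps valid after an erase.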
Depends on #198982 and #199365
[IR] Inline remove_if in PMDataManager::removeNotPreservedAnalysis (#199571)
PR #198982 rewrote removeNotPreservedAnalysis to use DenseMap::remove_if
with one predicate shared across two call sites. The predicate is always
inlined; the cost is that two call sites make
DenseMapBase::remove_if<...> itself emit out of line instead of inlining
into the caller. As this runs after every modifying codegen pass (legacy
PM), it shows up as a small instructions:u regression, most visibly at
-O0 where the legacy codegen PM is a large fraction of compile time:
https://llvm-compile-time-tracker.com/compare.php?from=69a5cf515fd317bcf918e48de9137dd8549870c5&to=6302439f5aaea6cb776d8ceb5c2ef9108fccf702&stat=instructions%3Au
Collect the maps into a SmallVector and prune them from a single
remove_if call site, so the instantiation is inlined again.
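
The restructuring can be pictured with standard containers as below. This
is a rough sketch, not the code from the patch: `removeIf`, `AnalysisMap`
and `pruneNotPreserved` are hypothetical names, `std::map` and `std::vector`
stand in for DenseMap and SmallVector, and whether a given compiler inlines
the single instantiation remains a heuristic decision.

```cpp
#include <cassert> // used by the checks in the usage note
#include <map>
#include <vector>

// Generic erase-while-iterating helper, standing in for remove_if.
template <typename MapT, typename Pred> void removeIf(MapT &M, Pred P) {
  for (auto It = M.begin(); It != M.end();) {
    if (P(*It))
      It = M.erase(It); // std::map::erase returns the next valid iterator
    else
      ++It;
  }
}

// Hypothetical model: analysis ID -> "is preserved by the last pass".
using AnalysisMap = std::map<int, bool>;

void pruneNotPreserved(AnalysisMap &Own, std::vector<AnalysisMap> &Inherited) {
  // Gather every map first ...
  std::vector<AnalysisMap *> Maps;
  Maps.push_back(&Own);
  for (AnalysisMap &M : Inherited)
    Maps.push_back(&M);

  // ... so that removeIf is instantiated and called from one place only.
  for (AnalysisMap *M : Maps)
    removeIf(*M, [](const AnalysisMap::value_type &KV) { return !KV.second; });
}
```

After `pruneNotPreserved(Own, Inherited)`, a check such as
`assert(Own.count(Id) == 0)` holds for every `Id` that was mapped to `false`.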
[InferAttrs] Annotate math and basic string libcalls with `nosync` (#197761)
Math libcalls as well as some simple string ones do not create
synchronizes-with edges, thus `nosync` may be derived.
Co-authored-by: Johannes Doerfert <jdoerfert.llvm at gmail.com>
[RISCV] Use append TableGen feature in RISCVInstrInfoXqci.td (#199603)
This improves the readability of the file.
An AI came up with the patch which I reviewed and ensured that the tests
pass.
[mlir][x86] Fix - multiple issues / F8 support for AMX dot-product lowering (#196984)
This patch fixes several issues and adds support for additional patterns
in AMX `dot-product` lowering.
1. Fixes an issue related to write-back to the `C` matrix,
2. Supports an additional lowering pattern where the cache tile sizes
are 32,32,32,
3. Online packing: loop peeling is now based on the `step` size,
4. Extends support for `f8` lowering (`mx-fp8` lowering will be
supported after vector.contract has `mx` support).
[NFC] [clangd] [C++20] [Modules] Fix false duplicate module warning for equivalent paths (#199343)
When checking for multiple source files declaring the same module, the
comparison used raw string equality on file paths. This causes false
positives when the same file is represented by different but equivalent
path strings.
Use pathEqual(normalizePath(...), normalizePath(...)) instead to compare
canonical paths, consistent with how clangd handles path comparisons
elsewhere.
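
Why raw string equality gives false positives can be shown with a small
standalone sketch. It uses `std::filesystem::path::lexically_normal` as a
stand-in and does not reproduce clangd's `normalizePath` or `pathEqual`
helpers; the function name and the example paths are hypothetical.

```cpp
#include <cassert> // used by the checks in the usage note
#include <filesystem>
#include <string>

// Compares two paths after purely lexical normalization ("." and ".."
// components are folded; the file system itself is not consulted).
bool samePathLexically(const std::string &A, const std::string &B) {
  namespace fs = std::filesystem;
  return fs::path(A).lexically_normal() == fs::path(B).lexically_normal();
}
```

With this helper, `assert(samePathLexically("/src/./mod/a.cppm",
"/src/mod/a.cppm"))` holds even though the two strings differ, so a check
built on it would not report the same file as two declarations of one
module.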
sanitizer_common: Fix build on MIPS with _TIME_BITS=64 (#199590)
When we build sanitizer_common with -D_TIME_BITS=64, the assert on
struct_kernel_stat_sz fails because struct stat has a different size.