[libcxx][string] Test: fix copy&paste typo for safe_allocator (#195820)
In
test/std/strings/basic.string/string.modifiers/string_insert/iter_iter_iter.pass.cpp
the test for safe_allocator was missing (the test for min_allocator was
called twice). Testing safe_allocator is consistent with the rest of that
PR.
[NVPTX] Add commutativity to SETP instructions to enable MachineCSE of inverted predicates
Inverted predicates can be used freely in PTX. If we can invert a
predicate and CSE the generating instruction we can save calculating
the inverse.
Teach the NVPTX commuteInstructionImpl that SETP instructions can be
inverted, allowing CSE with a previous SETP that matches the inverted
form. This also inverts the branch users of the predicate to maintain
correctness.
Currently, the SETP inversion is only allowed if all users are branches.
Future work can extend this to sel and not instructions.
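The inversion described above can be illustrated with a small standalone model. This is only a sketch for integer comparisons: `CmpMode`, `invertIntegerCmp`, `setp`, and the branch helpers are hypothetical names invented for illustration, not the NVPTX backend's types or functions. It shows that inverting the comparison and negating the branch that consumes the predicate leaves the branch decision unchanged.

```cpp
#include <cassert>

// Hypothetical subset of integer comparison modes (not NVPTX::PTXCmpMode).
enum class CmpMode { EQ, NE, LT, LE, GT, GE };

// Returns the mode whose result is the logical negation of M for integers.
inline CmpMode invertIntegerCmp(CmpMode M) {
  switch (M) {
  case CmpMode::EQ: return CmpMode::NE;
  case CmpMode::NE: return CmpMode::EQ;
  case CmpMode::LT: return CmpMode::GE;
  case CmpMode::GE: return CmpMode::LT;
  case CmpMode::GT: return CmpMode::LE;
  case CmpMode::LE: return CmpMode::GT;
  }
  assert(false && "unknown comparison mode");
  return M;
}

// Models the predicate produced by a SETP-like compare of A and B.
inline bool setp(CmpMode M, int A, int B) {
  switch (M) {
  case CmpMode::EQ: return A == B;
  case CmpMode::NE: return A != B;
  case CmpMode::LT: return A < B;
  case CmpMode::LE: return A <= B;
  case CmpMode::GT: return A > B;
  case CmpMode::GE: return A >= B;
  }
  assert(false && "unknown comparison mode");
  return false;
}

// Models a conditional branch guarded by a predicate that may be negated.
inline bool branchTaken(bool Pred, bool Negated) {
  return Negated ? !Pred : Pred;
}

// Branch guarded by the original compare.
inline bool originalBranch(CmpMode M, int A, int B) {
  return branchTaken(setp(M, A, B), /*Negated=*/false);
}

// Same branch after the rewrite: inverted compare, negated guard.
inline bool rewrittenBranch(CmpMode M, int A, int B) {
  return branchTaken(setp(invertIntegerCmp(M), A, B), /*Negated=*/true);
}
```

Because `rewrittenBranch` uses the inverted compare, a model like this makes it easy to see why an existing compare of the inverted form could be reused instead of computing a second predicate. Floating-point modes are left out on purpose, since NaN handling makes their inversion less direct.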
Made-with: Cursor
Revert "move cmp modes into td and update users"
This reverts commit 5950d9fcd6b2053e71929972b89cc983ce2cccaa, restoring
the hand-written PTXCmpMode enum in NVPTX.h and the switch-based
implementations of invertIntegerCmpMode, invertScalarFloatCmpMode,
NVPTXInstPrinter::printCmpMode and NVPTXDAGToDAGISel::getPTXCmpMode.
The TableGen GenericTable migration consolidated the comparison-mode
data but at the cost of an extra .inc file, an ODR-driven split between
NVPTXCodeGen and NVPTXDesc, and indirection through a generated lookup
where the local switches were already self-contained. Reverting until a
broader cleanup of NVPTX::PTXCmpMode is taken on as part of a larger
refactor.
Co-authored-by: Cursor <cursoragent at cursor.com>
[RISCV] Fix inconsistent usage of ValVT and LocVT in CC_RISCV_Impl. NFCI (#195368)
I think all of our checks should be against LocVT. If LocVT is different
than ValVT, that means the location has already been changed and we
should be acting on that changed type. For the most part, I don't think
that happens for RISC-V.
[NFC][SSAF][EntityPointerLevel] Move EntityID-to-EPL map serialization to the EPL module (#193092)
Factor out the serialization of `std::map<EntityId,
EntityPointerLevelSet>` to `EntityPointerLevelFormat.h`.
---------
Co-authored-by: Balázs Benics <benicsbalazs at gmail.com>
Co-authored-by: Jan Korous <jkorous at apple.com>
[CIR] Use SymbolUserMap in applyReplacements to fix quadratic behavior (#195883)
applyReplacements() previously called replaceAllSymbolUses() for each
replacement, which walks the entire module every time — O(R × M) for R
replacements and M operations. For C++ programs with heavy template
instantiation (e.g., Eigen), this quadratic behavior dominated compile
time.
Replace the per-replacement module walk with a single SymbolUserMap
built once (O(M)), then use replaceAllUsesWith() which scopes each
replacement to only the actual user operations. The debug-only
verifyPointerTypeArgs helper is also updated to reuse the map.
Measured on Eigen's basicstuff.cpp (356 lines, heavy template
instantiation): compile time dropped from 20m29s to 1m2s (20x speedup).
CIR-to-classic ratio improved from 117x to 7.2x.
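The difference between the two strategies can be sketched with a toy model in plain C++. This is not the MLIR API: `Op`, `UserMap`, `buildUserMap`, and `applyReplacementsSketch` are hypothetical names, and the sketch assumes that no replacement's new name is also the old name of another replacement. The user map is built in one pass over all operations, and each replacement then visits only the operations that reference the old symbol.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy operation: just a list of the symbol names it references.
struct Op {
  std::vector<std::string> symbolRefs;
};

using UserMap = std::unordered_map<std::string, std::vector<Op *>>;

// One pass over all operations: O(M) in the total number of references.
inline UserMap buildUserMap(std::vector<Op> &ops) {
  UserMap users;
  for (Op &op : ops)
    for (const std::string &ref : op.symbolRefs)
      users[ref].push_back(&op);
  return users;
}

// Applies (old, new) renames by visiting only the recorded users of each
// old symbol instead of re-walking every operation per replacement.
// Assumption: new names are never the old name of another replacement.
inline void applyReplacementsSketch(
    std::vector<Op> &ops,
    const std::vector<std::pair<std::string, std::string>> &replacements) {
  UserMap users = buildUserMap(ops);
  for (const auto &replacement : replacements) {
    auto it = users.find(replacement.first);
    if (it == users.end())
      continue; // The old symbol has no users; nothing to rewrite.
    for (Op *user : it->second) {
      assert(user && "user map must not contain null operations");
      for (std::string &ref : user->symbolRefs)
        if (ref == replacement.first)
          ref = replacement.second;
    }
  }
}
```

In this model the cost is one walk to build the map plus work proportional to the actual users of each replaced symbol, rather than a full walk per replacement.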
Made with [Cursor](https://cursor.com)
Co-authored-by: Cursor <cursoragent at cursor.com>
[CIR] Add pass_object_size hidden parameter support (#191482)
Emit the hidden `i64` parameter that
`__attribute__((pass_object_size(N)))` requires. At call sites the size
is constant-folded when possible (e.g. `&a` → 4) and falls back to
`cir.objsize` / `@llvm.objectsize` otherwise (e.g. VLAs).
On the callee side, `buildFunctionArgList` now creates an
`ImplicitParamDecl` for each annotated parameter so that
`emitBuiltinObjectSize` can load the passed size instead of re-computing
it.
This also fixes the `llvm_unreachable("NYI")` in
`RequiredArgs::getFromProtoWithExtraSlots` and the `errorNYI` in
`appendParameterTypes` / `arrangeFreeFunctionLikeCall` that fired
whenever `hasExtParameterInfos()` was true.
New test: `clang/test/CIR/CodeGen/pass-object-size.c` (CIR / LLVM /
OGCG).
[5 lines not shown]
[MLIR][XeGPU] Support pointer/dynamic-memref sources in array-length optimization (#195872)
Extend `OptimizeCreateNdDescOp` to handle the two remaining
`create_nd_tdesc` source forms — `i64` pointer and dynamic-shape memref
— by forwarding the existing shape/strides operands through the general
builder. The memory region is unchanged by the rewrite; only the
`tensor_desc` view is narrowed along the FCD and tagged with
`array_length`.
Co-authored-by: Claude Opus 4.7 <noreply at anthropic.com>
clang: Avoid dummy LAST entry in OffloadArch (#195952)
Make LAST an alias of the final entry rather than its own
enum value. This allows writing covered switches that
don't need to handle this case, and matches how other
enums with an end entry handle it.
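A minimal sketch of the pattern, using a hypothetical `Arch` enum rather than the real `OffloadArch` definition: because `Last` shares the value of the final entry, a switch that names every real enumerator is fully covered without a separate case for the sentinel.

```cpp
#include <cassert>

// Hypothetical enum for illustration; the enumerator names are made up.
enum class Arch {
  Unknown,
  Generic,
  GpuA,
  GpuB,
  Last = GpuB, // Alias of the final entry, not a distinct value.
};

// Covered switch: every distinct value is handled, and no case for the
// sentinel is needed (adding one would duplicate the GpuB case).
inline const char *archName(Arch A) {
  switch (A) {
  case Arch::Unknown: return "unknown";
  case Arch::Generic: return "generic";
  case Arch::GpuA:    return "gpu-a";
  case Arch::GpuB:    return "gpu-b";
  }
  assert(false && "invalid Arch value");
  return "";
}

// The alias still supports range-style uses such as counting the entries.
inline int archCount() { return static_cast<int>(Arch::Last) + 1; }
```

With a dedicated sentinel value instead of an alias, the same switch would need an extra case (or a default) just to satisfy the compiler's covered-switch warning.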
[mlir][acc] Improve implicit deviceptr detection for alias (#195934)
ACCImplicitData can automatically use the deviceptr clause when a
variable is detected as being device data. However, it was missing a
check for the variable's own `acc declare deviceptr` attribute.