[NFC][SSAF][EntityPointerLevel] Move EntityID-to-EPL map serialization to the EPL module (#193092)
Factor out the serialization of `std::map<EntityId,
EntityPointerLevelSet>` to `EntityPointerLevelFormat.h`.
---------
Co-authored-by: Balázs Benics <benicsbalazs at gmail.com>
Co-authored-by: Jan Korous <jkorous at apple.com>
[CIR] Use SymbolUserMap in applyReplacements to fix quadratic behavior (#195883)
applyReplacements() previously called replaceAllSymbolUses() for each
replacement, which walks the entire module every time — O(R × M) for R
replacements and M operations. For C++ programs with heavy template
instantiation (e.g., Eigen), this quadratic behavior dominated compile
time.
Replace the per-replacement module walk with a single SymbolUserMap
built once (O(M)), then use replaceAllUsesWith() which scopes each
replacement to only the actual user operations. The debug-only
verifyPointerTypeArgs helper is also updated to reuse the map.
Measured on Eigen's basicstuff.cpp (356 lines, heavy template
instantiation): compile time dropped from 20m29s to 1m2s (20x speedup).
CIR-to-classic ratio improved from 117x to 7.2x.
Made with [Cursor](https://cursor.com)
Co-authored-by: Cursor <cursoragent at cursor.com>
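The complexity argument above can be illustrated with a toy model. This is a minimal sketch, not the CIR implementation: `ToyOp`, `replaceByWalking`, and `replaceWithUserMap` are hypothetical names, a "module" is just a vector of operations that each reference one symbol, and the returned counter stands in for the work done.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an operation that references one symbol by name.
struct ToyOp {
  std::string symbolRef;
};

using Replacements = std::vector<std::pair<std::string, std::string>>;

// Per-replacement walk: every replacement scans the whole module, O(R x M).
// Returns the number of operations visited.
std::size_t replaceByWalking(std::vector<ToyOp> &module,
                             const Replacements &repls) {
  std::size_t visited = 0;
  for (const auto &repl : repls) {
    for (ToyOp &op : module) {
      ++visited;
      if (op.symbolRef == repl.first)
        op.symbolRef = repl.second;
    }
  }
  return visited;
}

// User-map approach: index the users once (O(M)), then touch only the
// operations that actually use the symbol being replaced.
std::size_t replaceWithUserMap(std::vector<ToyOp> &module,
                               const Replacements &repls) {
  std::size_t visited = 0;
  std::unordered_map<std::string, std::vector<std::size_t>> users;
  for (std::size_t i = 0; i < module.size(); ++i) {
    ++visited;
    users[module[i].symbolRef].push_back(i);
  }
  for (const auto &repl : repls) {
    auto it = users.find(repl.first);
    if (it == users.end())
      continue;
    std::vector<std::size_t> moved = std::move(it->second);
    users.erase(it);
    for (std::size_t idx : moved) {
      ++visited;
      module[idx].symbolRef = repl.second;
    }
    // Keep the index current so later replacements see the renamed users.
    auto &dst = users[repl.second];
    dst.insert(dst.end(), moved.begin(), moved.end());
  }
  return visited;
}
```

For a four-operation module and two replacements the difference is negligible, but the first function's visit count grows with the product of replacements and operations, while the second grows with their sum plus the number of actual uses.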
[CIR] Add pass_object_size hidden parameter support (#191482)
Emit the hidden `i64` parameter that
`__attribute__((pass_object_size(N)))` requires. At call sites the size
is constant-folded when possible (e.g. `&a` → 4) and falls back to
`cir.objsize` / `@llvm.objectsize` otherwise (e.g. VLAs).
On the callee side, `buildFunctionArgList` now creates an
`ImplicitParamDecl` for each annotated parameter so that
`emitBuiltinObjectSize` can load the passed size instead of re-computing
it.
This also fixes the `llvm_unreachable("NYI")` in
`RequiredArgs::getFromProtoWithExtraSlots` and the `errorNYI` in
`appendParameterTypes` / `arrangeFreeFunctionLikeCall` that fired
whenever `hasExtParameterInfos()` was true.
New test: `clang/test/CIR/CodeGen/pass-object-size.c` (CIR / LLVM /
OGCG).
[5 lines not shown]
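For readers unfamiliar with the attribute, here is a minimal source-level sketch of what `pass_object_size` does. It is not taken from the new test file: `TOY_PASS_OBJECT_SIZE`, `sizeSeenByCallee`, and `callWithLocalInt` are hypothetical names, and the macro guard is an assumption added so the snippet still compiles on compilers that lack the attribute (where the builtin is expected to report an unknown size instead).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical guard: expand to the attribute only where it is supported.
#if defined(__has_attribute)
#if __has_attribute(pass_object_size)
#define TOY_PASS_OBJECT_SIZE(N) __attribute__((pass_object_size(N)))
#endif
#endif
#ifndef TOY_PASS_OBJECT_SIZE
#define TOY_PASS_OBJECT_SIZE(N)
#endif

// With the attribute, the caller passes the object size as a hidden extra
// argument, and __builtin_object_size(p, 0) in the callee reads that value.
// The annotated parameter must itself be const-qualified.
std::size_t sizeSeenByCallee(const int *const p TOY_PASS_OBJECT_SIZE(0)) {
  return __builtin_object_size(p, 0);
}

// Call site where the size can be folded to a constant: the address of a
// single local int.
std::size_t callWithLocalInt() {
  int a = 0;
  return sizeSeenByCallee(&a);
}
```

With the attribute in effect, `callWithLocalInt()` should return `sizeof(int)`; without it, the callee cannot see the caller's object and the builtin typically returns `(size_t)-1`.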
[MLIR][XeGPU] Support pointer/dynamic-memref sources in array-length optimization (#195872)
Extend `OptimizeCreateNdDescOp` to handle the two remaining
`create_nd_tdesc` source forms — `i64` pointer and dynamic-shape memref
— by forwarding the existing shape/strides operands through the general
builder. The memory region is unchanged by the rewrite; only the
`tensor_desc` view is narrowed along the FCD and tagged with
`array_length`.
Co-authored-by: Claude Opus 4.7 <noreply at anthropic.com>
clang: Avoid dummy LAST entry in OffloadArch (#195952)
Make LAST an alias of the final entry rather than its own enum
value. This allows writing covered switches that don't need to
handle a dummy case, and matches how other enums with an end
entry handle it.
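A minimal sketch of the pattern, using a hypothetical `ToyArch` enum rather than the real `OffloadArch` entries. Because `Last` shares the value of the final real entry, a switch that lists every real enumerator is fully covered without a dummy case.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum; the entry names are illustrative only.
enum class ToyArch {
  Unknown,
  GpuA,
  GpuB,
  Last = GpuB, // alias of the final entry, not a distinct value
};

// Covered switch: no `case ToyArch::Last` is needed (or allowed), because it
// would duplicate `case ToyArch::GpuB`.
std::string toyArchName(ToyArch arch) {
  switch (arch) {
  case ToyArch::Unknown:
    return "unknown";
  case ToyArch::GpuA:
    return "gpu-a";
  case ToyArch::GpuB:
    return "gpu-b";
  }
  return "invalid"; // only reachable for out-of-range values
}

// Code that needs the entry count derives it from the alias.
constexpr int kNumToyArchs = static_cast<int>(ToyArch::Last) + 1;
```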
[mlir][acc] Improve implicit deviceptr detection for alias (#195934)
ACCImplicitData can automatically use the deviceptr clause when a
variable is detected as being device data. However, it was missing a check
for the variable's own `acc declare deviceptr` attribute.
[LLDB] Fix UBSan issue with ValueType enums. (#195540)
ValueTypeSyntheticMask, when bitwise OR'd with ValueType enums, produces
a value that is outside the official enum range for ValueTypes. This
causes UBSan errors when UBSan is set to check enum values. For example,
if you build LLDB with the CMake flags
-DCMAKE_CXX_FLAGS="-fsanitize=enum -fsanitize-trap=enum"
-DCMAKE_C_FLAGS="-fsanitize=enum -fsanitize-trap=enum"
and then run the LLDB test TestScriptedFrameProvider, it crashes with
a SIGILL from UBSan.
This change fixes that by pulling ValueTypeSyntheticMask into the
ValueType enums, expanding the valid enum range and making the bitwise
OR'd values valid.
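The range issue can be shown with a small hypothetical enum; the names and values below are illustrative and are not LLDB's actual `ValueType` definition. For an unscoped enum without a fixed underlying type, the valid values are those that fit in the smallest bit-field able to hold every enumerator, so adding the mask as an enumerator widens that range to cover the OR'd results.

```cpp
#include <cassert>

// Hypothetical miniature of the pattern. Without eToySyntheticMask the
// largest enumerator would be 4, the valid range would be 0..7, and
// (eToyLocal | 0x08) == 12 would fall outside it. With the mask as an
// enumerator the range becomes 0..15, so 12 is a valid value.
enum ToyValueType {
  eToyInvalid = 0,
  eToyGlobal = 1,
  eToyLocal = 4,
  eToySyntheticMask = 0x08,
};

inline ToyValueType markSynthetic(ToyValueType t) {
  return static_cast<ToyValueType>(t | eToySyntheticMask);
}

inline bool isSynthetic(ToyValueType t) {
  return (t & eToySyntheticMask) != 0;
}

inline ToyValueType stripSynthetic(ToyValueType t) {
  return static_cast<ToyValueType>(t & ~eToySyntheticMask);
}
```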
[gn] use action() instead of copy() for libcxx headers (#195948)
copy() doesn't handle file deletions. Use an action() that syncs the
output directory with the input list via a response file, removing files
that are no longer in the list.
This works because if files are added or removed, ninja's command line
tracking re-runs the script, and if contents of existing files change,
ninja's input mtime checking reruns it.
This also makes the remove_float_h workaround unnecessary.
Motivated by all the recent header removals in libc++.
[SSAF][WPA] Add "no-op" PointerFlow and UnsafeBufferUsage analysis (#193089)
Added 'no-op' PointerFlow and UnsafeBufferUsage analyses to convert
summary data into AnalysisResult, which DerivedAnalysis can then consume.
Also, refactored the PointerFlow and UnsafeBufferUsage serialization
for code sharing.
rdar://174874942
---------
Co-authored-by: Balázs Benics <benicsbalazs at gmail.com>
Co-authored-by: Jan Korous <jkorous at apple.com>
[flang][cuda] Fix unregistered allocator (#195924)
#194290 changed how we register the constructor and added an early return
that skips adding the constructor to `llvm.mlir.global_ctors`. This
leads to a runtime failure because the allocators for CUDA Fortran are not
registered.
[AMDGPU] Implement -amdgpu-spill-cfi-saved-regs
These spills need special CFI anyway, so implementing them directly
where CFI is emitted avoids the need to invent a mechanism to track them
from ISel.
Change-Id: If4f34abb3a8e0e46b859a7c74ade21eff58c4047
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
[AMDGPU] Implement CFI for CSR spills
Introduce new SPILL pseudos to allow CFI to be generated for only CSR
spills, and to make that information accurate at the ISA-instruction level.
Other targets either generate slightly incorrect information or rely on
conventions for how spills are placed within the entry block. The
approach in this change produces larger unwind tables, with the
increased size being spent on additional DW_CFA_advance_location
instructions needed to describe the unwinding accurately.
Change-Id: I9b09646abd2ac4e56eddf5e9aeca1a5bebbd43dd
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>