[clang][deps] Simplify VFS overlays (#197785)
Instead of operating on on-disk files, the scanner can be made to
operate on in-memory buffers and module names. This is facilitated by
changes to the command line and the VFS, where an imaginary file is
injected (mainly to make the driver happy). Currently, this is
implemented by functions external to the worker that take its base VFS,
wrap it with an overlay VFS, and pass it back to the worker. Since the
worker _needs_ to operate on top of the base VFS, it performs a sanity
check like so:
```c++
#ifndef NDEBUG
bool SawDepFS = false;
OverlayFS->visit(
[&](llvm::vfs::FileSystem &VFS) { SawDepFS |= &VFS == DepFS.get(); });
assert(SawDepFS && "OverlayFS not based on DepFS");
#endif
```
[6 lines not shown]
[Clang] Default to async unwind tables for amdgcn (#183148)
To avoid codegen changes when enabling debug-info (see
https://bugs.llvm.org/show_bug.cgi?id=37240) we want to
enable unwind tables by default.
There is some pessimization in post-prologepilog scheduling, and a
general solution to the problem of CFI_INSTRUCTION-as-scheduling-barrier
should be explored.
Change-Id: I83625875966928c7c4411cd7b95174dc58bda25a
Fix MSVC template parsing error in SerializationFormat (#196571)
This commit fixes a hard compilation error on Windows (when building with
Clang's MSVC compatibility mode) and a subsequent access violation that
occurred during Windows CI testing.
Root Causes:
1. When compiling with `-fms-compatibility`, Clang's two-phase template
lookup fails to resolve function-local static variables (`SavedSerialize`
and `SavedDeserialize`) captured by a local class (`ConcreteCodec`) inside
an uninstantiated template. It incorrectly assumes they are members of a
dependent base class.
2. Originally, `TypedSerializerFn` and `DeserializerFn` were typed as
`llvm::function_ref`. Storing these in static variables created dangling
pointers, as `function_ref` is a non-owning wrapper that only referenced
the temporaries decaying on the constructor's stack, causing an 0xC0000005
access violation on x64 Windows.
The Fix:
[11 lines not shown]
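To illustrate root cause 2 in isolation: a non-owning callable reference stored in a static outlives the temporary it points at, whereas an owning wrapper copies the callable into the static. The sketch below shows only the owning side, since the dangling variant is undefined behavior. All names are hypothetical, and `std::function` is used as a generic owning wrapper; this is not necessarily what the commit itself does.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical names throughout; this is not the SerializationFormat code.
// std::function owns a copy of the callable, unlike a non-owning reference.
using SerializerFn = std::function<std::string(int)>;

// Function-local static slot that holds the registered callable.
SerializerFn &savedSerializer() {
  static SerializerFn Saved;
  return Saved;
}

// Moves the callable into the static, so the static owns its own copy.
void registerSerializer(SerializerFn Fn) { savedSerializer() = std::move(Fn); }

std::string registerThenCall(int Value) {
  {
    std::string Prefix = "v=";
    // The lambda object is a temporary. A non-owning reference to it would
    // dangle once this statement ends; the owning wrapper keeps a copy.
    registerSerializer([Prefix](int V) { return Prefix + std::to_string(V); });
  }
  // Safe: the saved callable carries its own copy of Prefix.
  return savedSerializer()(Value);
}
```

Calling `registerThenCall(7)` after the inner scope has ended still works here because the static owns the lambda and its capture.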
[LifetimeSafety] Expand diagnostic list that enables analysis (#198599)
Now, when any lifetime safety related diagnostic is not ignored, we run
the analysis.
No tests were added since this does not add new functionality.
[NVPTX] Constant fold clusterDim when reqnctapercluster is specified (#195967)
This is a follow-up of https://github.com/llvm/llvm-project/pull/191575.
Currently, NVPTX cannot fold the `cluster_nctaid.x/y/z` and
`cluster_nctarank` intrinsic calls into const values when
`reqnctapercluster` is specified, which prevents the code from further
optimization.
Therefore, in this change, we extend the `NVVMIntrRange` pass to:
- Tighten `cluster_nctaid.x/y/z` intrinsic calls to a single-value range,
which can be constant folded in a later InstCombine pass
- Tighten `cluster_nctarank` intrinsic calls to a single-value range when
`cluster_dim` is specified
- Tighten `cluster_ctaid.x/y/z` range attributes to use per-dimension
`cluster_dim` bounds
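The three range computations listed above can be sketched on toy types, assuming a fully known cluster shape and half-open `[Lo, Hi)` ranges. `Dim3`, `Range`, and the helper names are hypothetical and are not the `NVVMIntrRange` API. The sketch also assumes the usual PTX meaning of the registers: `cluster_nctaid` is the cluster size per dimension, `cluster_nctarank` is the total number of CTAs in the cluster, and `cluster_ctaid` is a CTA's index within the cluster.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy types; not the LLVM ConstantRange or NVVMIntrRange API.
struct Dim3 { uint32_t X, Y, Z; };
struct Range { uint32_t Lo, Hi; }; // half-open interval [Lo, Hi)

// cluster_nctaid.<dim>: the cluster size in one dimension is a known constant.
Range nctaidRange(uint32_t DimSize) { return {DimSize, DimSize + 1}; }

// cluster_nctarank: total CTA count in the cluster is the product of the dims.
Range nctarankRange(const Dim3 &D) {
  uint32_t Total = D.X * D.Y * D.Z;
  return {Total, Total + 1};
}

// cluster_ctaid.<dim>: an index within the cluster, bounded per dimension.
Range ctaidRange(uint32_t DimSize) { return {0, DimSize}; }

// A range holding exactly one value is what a later pass can fold.
bool isSingleValue(const Range &R) { return R.Hi - R.Lo == 1; }
```

For a 2x2x1 cluster, `nctaidRange(2)` and `nctarankRange({2, 2, 1})` each contain exactly one value, while `ctaidRange(2)` stays a two-value range.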
[clang-format] Harden annotation of operator keywords (#196768)
The star was already annotated as TT_PointerOrReference, so just overwrite
it to avoid crashing. Also remove the annotation above, since it would
always be overwritten (at least I don't see a case where it isn't, and no
test fails).
Fixes #196054.
Require explicit yield in iterator op
Remove the implicit terminator trait from omp.iterator so iterator
modifiers must explicitly yield the value used to form the iterated list.
Add and update the verifier and tests accordingly.
Reject target map iterators without captures
Reject target map iterators until the follow-up capture-binding
representation is added, since map_iterated on omp.target currently
only represents the dynamic map list and does not account for the
target-region arguments required by IsolatedFromAbove.
Simplify map iterator clause assembly
- Split MLIR map syntax into separate map_entries(...) and map_iterated(...),
removing the custom MapEntryList parser/printer.
- Moved omp.target map_iterated out of TargetOpRegion; it now prints before
the target region instead of as map_iterated_entries(...) after the region.
- Renamed LLVMIR TODO helper to clause-style checkMap.
- Added DeclareMapperInfoOp builder from DeclareMapperInfoOperands
and updated Flang call sites so they do not need to spell out newly
added operands.
[mlir][OpenMP] Add iterator support to motion clauses
Extend omp.target_data, omp.target_enter_data, omp.target_exit_data,
and omp.target_update to support `!omp.iterated<Ty>`.
This is part of feature work for #188061
Assisted with copilot
[CIR][CodeGen] Extract shared LinkModule struct and loadLinkModules helper
The bitcode-link plumbing is ABI-neutral and identical between the classic
CodeGen path and the ClangIR path. Prior to this change, each side carried
its own copy of the `LinkModule` struct and `loadLinkModules` routine;
CIRGenAction.cpp explicitly flagged the copy with a TODO.
Move the struct and loader into clang/CodeGen/ModuleLinker.{h,cpp} so both
frontends share one definition. `LinkInModules` remains per-consumer because
the classic path threads `CurLinkModule` into its diagnostic handler while
CIR does not.
This is a straight refactor -- no behavior change.
(cherry picked from commit 09f730c2ed5ae3c5e08232f8d6f4050f2c341b08)
[CIR] Move bitcode-link state from CIRGenConsumer to CIRGenAction
Match the classic CodeGenAction architecture: the action owns the
LLVMContext and the vector of LinkModules. CIRGenConsumer receives them
by reference. Bitcode files are loaded in BeginSourceFileAction (matching
OG timing) so missing-file errors fire before the translation unit is
parsed.
Prepares CIRGen for driver-side offload work (CUDA/HIP split-compilation)
that needs to inspect the llvm::Module after lowering. The previous
shape kept the LLVMContext on the stack inside HandleTranslationUnit,
making it impossible to hand the module back to the driver.
No behavior change for existing callers.
(cherry picked from commit 91f4a07c91a898423b085507289cf92ab92b5a33)
[libc][mathvec] Add exhaustive tester for SIMD math routines (#189488)
An exhaustive tester based on the scalar version.
Uses LIBC scalar math routines as a reference rather than MPFR.
Also corrects a missed 1ULP value in expf when the target doesn't
support FMAs.
[CIR] Fix get_method callee type for member pointer calls (#198358)
Member-pointer calls through `cir.get_method` were lowering to an
indirect callee type that still listed the member function's implicit
`this`
parameter after `createGetMethod` had already prepended the adjusted
`void*` receiver. A call like `(obj->*pmf)(arg)` therefore carried a
three-parameter `var_callee_type` but only two argument operands, and
`-fclangir -emit-llvm` failed LLVM's variadic-call verifier with
`expected var_callee_type to have at most N parameters`. Classic codegen
emits `(ptr, …)` for the same pattern.
The libc++ sweep had one remaining `frontend-crash-other` bucket hit on
`F_nullptr.pass.cpp`, which boils down to `__builtin_invoke` on a
varargs member function pointer — the same callee/operand mismatch in a
minimal repro.
The fix skips the implicit-`this` slot when cloning the member signature
into the callee function type in `createGetMethod`, and tightens
[3 lines not shown]
[CIR] Guard union ABI alignment when getLargestMember is empty (#198340)
Padding-only unions (an empty union lowered as a single `!u8i`
padding member) leave `getLargestMember()` null when CIRGen walks
record layout through MLIR's DataLayout API. `RecordType::getABIAlignment`
then passed that null `Type` into `getTypeABIAlignment` and crashed.
This showed up compiling libc++ types such as
`std::__variant_detail::__union` nested under `common_iterator`.
Return ABI alignment `1` when there is no largest member, matching a
byte-padded empty union. This parallels how empty unions are already
handled for size (`getTypeSizeInBits` uses zero size in that situation).
Regression coverage adds a nested-union global in `empty-union.cpp`.
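The guard described above amounts to a null check before the largest member is consulted. A minimal sketch on toy types follows; `ToyMember`, `ToyUnion`, and `unionAbiAlignment` are hypothetical stand-ins, not the CIR `RecordType` interface.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical toy model; not the CIR RecordType or MLIR DataLayout API.
struct ToyMember { uint64_t AbiAlign; };

struct ToyUnion {
  // Empty when the union is padding-only and has no largest member.
  std::optional<ToyMember> Largest;
};

uint64_t unionAbiAlignment(const ToyUnion &U) {
  // Guard: a padding-only union has no member to query, so fall back to
  // byte alignment instead of dereferencing a missing member.
  if (!U.Largest)
    return 1;
  return U.Largest->AbiAlign;
}
```

With this shape, a padding-only union reports alignment 1 and any other union reports its largest member's alignment.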
[mlir][LLVMIR] Allow address-of-global as a leaf in array constants (#198424)
Large `llvm.mlir.global` initializers built as nested `llvm.insertvalue`
chains make `LLVMModuleTranslation::convertGlobalsAndAliases` call
`ConstantFoldInsertValueInstruction` on every step, rebuilding the
whole `ConstantArray` each time. That is O(N²) in the number of
elements and shows up as multi-minute compiles on translation units with
huge pointer tables (SPEC CPU 2026 `gcc/insn-automata.cc` is the
motivating case; Eric Keane's `convertOperationImpl` profile matches
this path).
This change lets `llvm.mlir.constant` carry an `ArrayAttr` of
`FlatSymbolRefAttr` leaves that name globals (not just functions), adds
a name-keyed global map beside the existing op-keyed map, and resolves
those refs in `getLLVMConstant`. A translate test checks the resulting
single LLVM constant array initializer.
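The quadratic cost can be seen in a toy model where each insert into an immutable aggregate copies the whole aggregate, compared with building the array once from its leaves. The functions below are hypothetical and only count element copies; they do not use any LLVM or MLIR API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy immutable insert: returns a fresh aggregate and records how many
// elements had to be copied to produce it.
std::vector<int> insertValue(const std::vector<int> &Agg, std::size_t Idx,
                             int V, std::size_t &Copies) {
  std::vector<int> Next(Agg); // copies every element of the aggregate
  Copies += Agg.size();
  Next[Idx] = V;
  return Next;
}

// N chained inserts, each rebuilding the full N-element aggregate.
std::size_t copiesViaInsertChain(std::size_t N) {
  std::size_t Copies = 0;
  std::vector<int> Agg(N, 0);
  for (std::size_t I = 0; I < N; ++I)
    Agg = insertValue(Agg, I, static_cast<int>(I), Copies);
  return Copies;
}

// Building the array once from a flat list of leaves touches each element once.
std::size_t copiesViaSingleArray(std::size_t N) {
  std::size_t Copies = 0;
  std::vector<int> Agg;
  Agg.reserve(N);
  for (std::size_t I = 0; I < N; ++I) {
    Agg.push_back(static_cast<int>(I));
    ++Copies;
  }
  return Copies;
}
```

For 100 elements the chained form performs 10000 element copies in this model, against 100 for the single-array form.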
[flang] Recognize effects on non-addressable resources in opt-bufferization (#198051)
opt-bufferization has only been handling `fir::DebuggingResource`
explicitly. This patch adds support for other non-addressable resources,
such as `fir::VolatileMemoryResource`. This allows merging
elemental/assign for the `volatile_src_nonvolatile_dst` example in the
updated LIT test.
[Docs] Recommend Container Pinning (#198572)
Add some info to CI Best Practices about pinning container images to a
specific image SHA, which we agreed was a best practice in #197315 (and
maybe somewhere else, but I cannot find anything).
This updates the best practices but does not currently attempt to
actually fix all the cases where we are using unpinned container images.
[llvm-dwp] Fix incorrect ELF OS/ABI in DWP output (#198486)
I received a report internally that
https://github.com/llvm/llvm-project/pull/192112 caused issues with
lldb.
LLDB was not able to load the dwp files because of the OS mismatch
between the binary and the dwp file.
Investigating, it turns out that the refactor caused DWPWriter to set
the output OS/ABI from `ELFObjectFileBase::getOS()`, but getOS()
returns `Triple::OSType`, not the raw `e_ident[EI_OSABI]` byte. These
enums are numbered differently :( oops.
This caused certain tools that validate OS/ABI consistency between a
binary and its DWP to reject the debug info.
Fix by adding getEIdentOSABI() to ELFObjectFileBase (parallel to
getEIdentABIVersion()) and using it instead of getOS().
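A toy model of the mismatch: casting an OS enum to a byte is not the same as reading the OS/ABI byte from the ELF identification array. `ToyOSType` and its numbering are made up for illustration and do not reflect the real `Triple::OSType` values. The `EI_OSABI` index (7) and the OS/ABI values for GNU/Linux (3) and FreeBSD (9) follow the ELF specification. All function and type names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an OS enum; the numbering here is invented.
enum class ToyOSType : uint8_t { Unknown = 0, FreeBSD = 1, Linux = 2 };

// e_ident layout constants from the ELF specification.
constexpr unsigned ToyEI_NIDENT = 16;
constexpr unsigned ToyEI_OSABI = 7;
constexpr uint8_t ToyOSABI_GNU = 3;
constexpr uint8_t ToyOSABI_FreeBSD = 9;

struct ToyELFHeader { uint8_t Ident[ToyEI_NIDENT]; };

// Buggy shape: reinterprets the OS enum value as the OS/ABI byte.
uint8_t osabiFromOSTypeBuggy(ToyOSType OS) { return static_cast<uint8_t>(OS); }

// Fixed shape: reads the raw OS/ABI byte straight from the input header.
uint8_t osabiFromIdent(const ToyELFHeader &H) { return H.Ident[ToyEI_OSABI]; }
```

Copying the raw byte keeps the output's OS/ABI identical to the input's, regardless of how the OS enum happens to be numbered.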
Assisted-by: Claude
[lldb-dap] Add missing `arguments` field to LldbDapProcessEntry (#198597)
The TypeScript interface was missing the optional `arguments` field that
`parseListProcessesOutput` reads and `pick-process` displays, breaking
the extension build.
[lldb] Don't require a real target for `target modules list -g` (#198594)
The `-g` flag lists the global module list, which doesn't need a target.
Switch to eCommandAllowsDummyTarget and error out explicitly in
DoExecute on the non-global paths when no real target is selected.
Fixes a regression introduced by #198429.
[AMDGPU] Disable dpp src1 sgpr on gfx11 (#164241)
https://github.com/llvm/llvm-project/pull/67461 enabled SGPRs as src1 by
default for all dpp opcodes with manual checks for targets where this is
not supported. In that case, isOperandLegal checked if the second
operand is legal as src0.
https://github.com/llvm/llvm-project/pull/155595 disabled this check by
removing the calls to isOperandLegal, which resulted in SGPRs being used
as operands for src1 on gfx11. This PR reenables this check and fixes
the lit test.
---------
Co-authored-by: Paul Trojahn <paul.trojahn at amd.com>