[WebAssembly] Fix exception handling initialization order in TargetMachine constructor (#177542)
The WebAssemblyTargetMachine constructor had an ordering issue where
initAsmInfo() was called before basicCheckForEHAndSjLj(). This caused
problems in incremental compilation scenarios where:
1. `initAsmInfo()` sets `MCAsmInfo` exception type based on
   `Options.ExceptionModel`
2. But `Options.ExceptionModel` might still be `None` at this point
3. `basicCheckForEHAndSjLj()` runs later and updates
   `Options.ExceptionModel` based on command-line flags like
   `-wasm-enable-eh`
4. `MCAsmInfo` retains the incorrect exception type (`None` instead of
   `Wasm`)
5. This prevents WebAssembly exception handling passes from running
The fix swaps the order so basicCheckForEHAndSjLj() runs first to
establish the correct exception model before initAsmInfo() configures
MCAsmInfo based on that model.
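The ordering problem can be illustrated with a toy model. This is only a
sketch: `ToyTargetMachine`, `ToyOptions`, `ToyAsmInfo` and both `build*`
helpers are hypothetical stand-ins, not the LLVM classes, and the flag is
reduced to a single boolean.

```cpp
// Toy model of the constructor ordering issue. None of these types are
// the real LLVM ones; they only mimic the "snapshot, then update" hazard.
enum class ExceptionModel { None, Wasm };

struct ToyOptions {
  ExceptionModel Model = ExceptionModel::None;
};

struct ToyAsmInfo {
  ExceptionModel Model = ExceptionModel::None;
};

struct ToyTargetMachine {
  ToyOptions Options;
  ToyAsmInfo AsmInfo;

  // Stand-in for initAsmInfo(): copies whatever model is set right now.
  void initAsmInfo() { AsmInfo.Model = Options.Model; }

  // Stand-in for basicCheckForEHAndSjLj(): derives the model from a flag.
  void checkForEH(bool WasmEnableEH) {
    if (WasmEnableEH)
      Options.Model = ExceptionModel::Wasm;
  }
};

// Old order: the snapshot is taken before the model is established.
inline ExceptionModel buildOldOrder(bool WasmEnableEH) {
  ToyTargetMachine TM;
  TM.initAsmInfo();
  TM.checkForEH(WasmEnableEH);
  return TM.AsmInfo.Model;
}

// Fixed order: establish the model first, then take the snapshot.
inline ExceptionModel buildFixedOrder(bool WasmEnableEH) {
  ToyTargetMachine TM;
  TM.checkForEH(WasmEnableEH);
  TM.initAsmInfo();
  return TM.AsmInfo.Model;
}
```

In this model, the old order leaves the snapshot at `None` even when the
flag is set, while the fixed order records `Wasm`.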
[2 lines not shown]
[ELF] Set vna_flags to VER_FLG_WEAK if all references are weak (#176673)
When all undefined references to a version are weak, set vna_flags to
VER_FLG_WEAK in the .gnu.version_r section. This enables glibc ld.so to
report a warning instead of an error when the required version is not
found at runtime, supporting optional dependencies.
Per https://sourceware.org/bugzilla/show_bug.cgi?id=24718#c20 ,
glibc rtld since 2.30 (BZ #24741) tolerates missing versioned symbols
when the runtime shared object defines the required version. With this
vna_flags VER_FLG_WEAK change, rtld can also tolerate a completely
missing version, printing a message like:
```
% LD_PRELOAD=c2.so ./a
./a: /tmp/t/v2/c2.so: weak version `v1' not found (required by /tmp/t/v2/b.so)
a
```
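The flag decision described above can be sketched roughly as follows. This
is not the lld implementation: `VersionRef` and `computeVnaFlags` are
hypothetical names, and returning 0 for an empty reference list is an
assumption made for the sketch. `VER_FLG_WEAK` uses the value from
`<elf.h>`.

```cpp
#include <cstdint>
#include <vector>

// Same value as VER_FLG_WEAK in <elf.h>.
constexpr std::uint16_t kVerFlgWeak = 0x2;

// Hypothetical record of one undefined reference to a versioned symbol.
struct VersionRef {
  bool IsWeak;
};

// Returns the vna_flags value for one required version: the weak flag is
// set only if every reference to that version is weak.
inline std::uint16_t computeVnaFlags(const std::vector<VersionRef> &Refs) {
  if (Refs.empty())
    return 0; // Assumption: no references means no weak flag.
  for (const VersionRef &R : Refs)
    if (!R.IsWeak)
      return 0; // One strong reference keeps the version mandatory.
  return kVerFlgWeak;
}
```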
[2 lines not shown]
[Flang][OpenMP][Offload] Modify MapInfoFinalization to handle attach mapping and 6.1's ref_* and attach map keywords
This PR is one of four required to implement the attach mapping semantics in Flang, alongside the
ref_ptr/ref_ptee/ref_ptr_ptee map modifiers and the attach(always/never/auto) modifiers.
This PR contains the MapInfoFinalization changes required to support these features. It mainly deals with
applying the correct attach map type and manipulating the descriptor type's maps for the base address
and the descriptor, so that when ref_ptr or ref_ptee is specified we emit one of the two maps, and when
ref_ptr_ptee is specified we emit our usual default maps. In all cases we add the "glue" of a new
attach map, except when the user has provided attach(never). When the user has provided
attach(always), we apply the always map type to our attach maps.
It's important to note that the runtime has a toggle for the auto map behaviour, which flips the
attach behaviour between the newer semantics and the older semantics for backwards compatibility (outside
the purview of this PR but good to mention).
[lldb] Fix data buffer regression in ObjectFile (#177724)
This fixes a regression in `ObjectFile` and `ObjectFileELF` introduced
by #171574.
The original code created a `DataBuffer` using `MapFileDataWritable`.
```
data_sp = MapFileDataWritable(*file, length, file_offset);
if (!data_sp)
  return nullptr;
data_offset = 0;
```
The new code requires converting the `DataBuffer` to a `DataExtractor`:
```
DataBufferSP buffer_sp = MapFileDataWritable(*file, length, file_offset);
if (!buffer_sp)
[11 lines not shown]
[CodeGen][NPM] Specify Loop pass adaptor to not use MSSA (#176690)
This needs to be done because the "loop-mssa" adaptor assumes that every
pass it contains preserves MSSA, and CanonicalizeFreezeInLoopsPass
doesn't. I'm not really sure of the history here (why there are two
variants of loop pass adaptors).
[bazel] Fixes for compiler-rt Bazel build rules (#177287)
Update the compiler-rt arch-specific file groups to include `.h` file
extensions. At least `arm` and `ppc` have these, and it seems better to
be consistent and defensive.
Also add `5` to the model list for outlined atomics, matching CMake.
[AMDGPU][GFX1250] Optimize s_wait_xcnt for back-to-back atomic RMWs (#177620)
This patch optimizes the insertion of the s_wait_xcnt instruction for
sequences of atomic read-modify-write (RMW) operations in the
SIInsertWaitcnts pass. The Memory Legalizer conservatively inserts a
soft xcnt instruction before each atomic RMW operation as part of PR
168852, which is correct given the nature of atomic operations.
However, for back-to-back atomic RMWs, only the first s_wait_xcnt is
necessary, and dropping the rest improves runtime performance. This
patch tracks atomic RMW blocks within each basic block and removes
redundant soft xcnt instructions, keeping only the first wait in each
sequence. An atomic
RMW block continues through subsequent atomic RMWs and non-memory
instructions (e.g., ALU operations) but is broken by CU-scoped memory
operations, atomic stores, or basic block boundaries.
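The tracking rule can be sketched on a simplified instruction stream. This
is a toy model, not the SIInsertWaitcnts code: `Kind` and
`dropRedundantSoftWaits` are hypothetical, instructions are reduced to five
categories, and the function handles exactly one basic block, so the state
resets at block boundaries by construction.

```cpp
#include <vector>

// Hypothetical instruction categories for one basic block.
enum class Kind {
  SoftXcntWait,   // soft s_wait_xcnt inserted before an atomic RMW
  AtomicRMW,      // atomic read-modify-write
  NonMemory,      // e.g. ALU operations
  CUScopedMemory, // CU-scoped memory operation
  AtomicStore     // atomic store
};

// Keeps the first soft wait of each atomic RMW block and drops the rest.
inline std::vector<Kind> dropRedundantSoftWaits(const std::vector<Kind> &Block) {
  std::vector<Kind> Out;
  bool InRMWBlock = false; // True once an atomic RMW has been seen.
  for (Kind K : Block) {
    if (K == Kind::SoftXcntWait) {
      if (InRMWBlock)
        continue; // Redundant: an earlier wait already covers this run.
    } else if (K == Kind::AtomicRMW) {
      InRMWBlock = true; // Start or continue the atomic RMW block.
    } else if (K != Kind::NonMemory) {
      InRMWBlock = false; // CU-scoped memory op or atomic store ends it.
    }
    Out.push_back(K);
  }
  return Out;
}
```

Non-memory instructions leave the state untouched, which is what lets a
block continue through intervening ALU operations.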
Cleanup: Remove SmallVector hacks (#177667)
We no longer support any platform that uses Clang 5. This was a
workaround for older clang versions where template arguments weren't
merged between forward declarations and definitions correctly.
Since we don't support anything this old anymore, we can drop this
workaround.
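For context, the language rule involved looks roughly like this. The
snippet is only an illustration of default template arguments being merged
across declarations; `ToyVector` is a hypothetical type and this is not the
removed workaround itself.

```cpp
// Forward declaration carries the default template argument.
template <typename T, unsigned N = 4> class ToyVector;

// The definition does not repeat the default; a conforming compiler
// merges it from the earlier declaration.
template <typename T, unsigned N> class ToyVector {
public:
  static constexpr unsigned inlineCapacity() { return N; }
};

// Uses the merged default, so this is ToyVector<int, 4>.
using DefaultIntVector = ToyVector<int>;
```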
Note: I do not have merge permissions.
[flang] Added ConditionallySpeculatable and Pure for some FIR ops. (#174013)
This patch implements `ConditionallySpeculatable` interface for some
FIR operations (`embox`, `rebox`, `box_addr`, `box_dims` and `convert`).
It also adds `Pure` trait for `fir.shape`, `fir.shapeshift`,
`fir.shift` and `fir.slice`.
I could have split this into multiple patches, but the changes
are better tested together on real apps, and the amount of affected
code is small.
There are more `NoMemoryEffect` operations for which I am planning
to do the same in future PRs.
[flang] Support cuf.device_address in FIR AliasAnalysis. (#177518)
Support `cuf.device_address` the same way as `fir.address_of`.
This implementation implies that the host address and the device
address `MustAlias` (as shown in the new test). This should be
conservatively correct as long as `MustAlias` does not permit
assuming that the actual addresses are the same (that is what
the LLVM documentation implies, I believe).
It is probably worth adding an operation interface to handle
`fir::AddrOfOp` and `cuf::DeviceAddressOp` in FIR AliasAnalysis,
but for the initial implementation I hardcoded the checks.
I also removed the call to `fir::valueHasFirAttribute`, which performs
on-demand SymbolTable lookups that may be costly, and added
SymbolTable caching in the FIR AliasAnalysis object. In any case,
`fir::valueHasFirAttribute` does not work for `cuf::DeviceAddressOp`.
[HIP] Provide implicit include to ROCm library directory (#177704)
Summary:
It's more correct to directly link the HIP runtime if we know the path;
however, some users were relying on the old `-L` to pass in some other
non-standard HIP libraries. Put that part back in for now.
[lldb] Avoid redundant calls to `std::shared_ptr::get` (NFC) (#177720)
Avoid redundant calls to `std::shared_ptr::get()`. The class provides a
dereference operator and using that is the standard, idiomatic way to
access the underlying object.
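A minimal illustration of the pattern, using a hypothetical `Widget` type
rather than any LLDB class:

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical type used only for this illustration.
struct Widget {
  std::string Name;
  std::size_t nameLength() const { return Name.size(); }
};

// Redundant: fetches the raw pointer just to dereference it.
inline std::size_t lengthViaGet(const std::shared_ptr<Widget> &W) {
  return W.get()->nameLength();
}

// Idiomatic: std::shared_ptr::operator-> reaches the same object.
inline std::size_t lengthViaArrow(const std::shared_ptr<Widget> &W) {
  return W->nameLength();
}
```

Both functions return the same value for any non-null pointer; the second
form simply drops the extra call.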
[CIR][NFC] Fix build after emitIntrinsicCallOp change (#177706)
The emitIntrinsicCallOp function was moved from the
architecture-specific builtin implementation files to a shared location
in CIRGenBuilderTy by https://github.com/llvm/llvm-project/pull/172735.
Unfortunately, a few changes had been merged before that change
landed, which broke the build. This updates the broken call sites.