[MLIR] Support dynamic traits in `DynamicDialect` (#177735)
Unlike Interfaces, Traits in MLIR are static: they are defined via CRTP
templates and used as base classes of an `Op`, which makes them
difficult to attach to an op dynamically.
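To illustrate why CRTP-based traits are static, here is a minimal sketch. The names (`IsTerminatorTrait`, `ReturnLikeOp`, `AddLikeOp`, `hasTrait`) are hypothetical and are not MLIR APIs; the point is only that the trait is baked into the class hierarchy at compile time.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical CRTP trait: an op "has" the trait by inheriting from it.
template <typename ConcreteOp>
struct IsTerminatorTrait {};

// The trait is fixed when the class is written.
struct ReturnLikeOp : IsTerminatorTrait<ReturnLikeOp> {};
struct AddLikeOp {};

// The query is answered entirely by the type system, so there is no
// place to attach a trait to an op that is only defined at runtime.
template <typename Op, template <typename> class Trait>
constexpr bool hasTrait() {
  return std::is_base_of<Trait<Op>, Op>::value;
}
```

An op created at runtime has no C++ class of its own, so a check of this shape has nothing to inspect.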
However, in IRDL and the Python bindings, we define operations
dynamically through `DynamicDialect`, which means the traditional static
traits cannot be applied to them. Traits are important: for example,
they are how MLIR marks an op as a terminator or a non-terminator.
If `DynamicDialect` does not support traits, then even though we can
define an op with regions, we cannot define new terminators or mark an
op as a non-terminator. This makes `DynamicDialect` very limited in
region-related scenarios.
In this PR, we introduce a `DynamicOpTrait` type that “dynamizes”
`OpTrait`, enabling traits to be attached to ops in `DynamicDialect`.
The key design goal is that existing checks in the MLIR codebase such as
[9 lines not shown]
[InstCombine] Combine `select(C0, select(C1, b, a), b)` -> `select(C0&&!C1, a, b)` (#177410)
Fixes #82350
Address cases like:
```
select(C0, select(C1, b, a), b) -> select(C0&!C1, a, b)
select(C0, a, select(C1, b, a)) -> select(C0|!C1, a, b)
```
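The two folds can be sanity-checked at the boolean level with a small exhaustive test. This is only a sketch: it models `select` as a plain C++ ternary and ignores poison/undef semantics, which is what the Alive2 proofs are for. The helper names `sel` and `foldsAgree` are made up for illustration.

```cpp
#include <cassert>

// Hypothetical model of `select`: a plain ternary, no poison semantics.
static int sel(bool c, int t, int f) { return c ? t : f; }

// Returns true if both folds agree with the original nested selects
// for every combination of C0 and C1.
bool foldsAgree(int a, int b) {
  for (int c0 = 0; c0 < 2; ++c0) {
    for (int c1 = 0; c1 < 2; ++c1) {
      bool C0 = c0 != 0, C1 = c1 != 0;
      // select(C0, select(C1, b, a), b) -> select(C0 & !C1, a, b)
      if (sel(C0, sel(C1, b, a), b) != sel(C0 && !C1, a, b))
        return false;
      // select(C0, a, select(C1, b, a)) -> select(C0 | !C1, a, b)
      if (sel(C0, a, sel(C1, b, a)) != sel(C0 || !C1, a, b))
        return false;
    }
  }
  return true;
}
```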
It seems to generate better code for the real-world examples on the
few targets I have checked: https://godbolt.org/z/KeEMd9b8E .
In the most generic case it generates the same assembly for the
source and the target on all targets except RISC-V, where the target
seems shorter and better (less branching):
https://godbolt.org/z/3has1Td5G So I did not observe a regression
on any target in any scenario.
Proofs: https://alive2.llvm.org/ce/z/DoL3zQ
InstCombine: Fold known-qnan results to a literal nan
Previously we only considered fcNan to fold to qnan for canonicalizing
results, ignoring the simpler case where we know the nan is already
quiet.
[test][NFC] Add more keys to test SDKSettings files (#177538)
Every time DarwinSDKInfo reads a new key out of SDKSettings, a boatload
of test SDKSettings files need to be updated across several repositories
and forks and branches. It’s tedious to be careful to update those with
real values so that the tests are properly regression testing older
SDKs. It’s important to be careful so that the tests are accurate, e.g.
to prevent the scenario where DarwinSDKInfo starts reading a new key out
of SDKSettings and assumes that it’s always available everywhere, when
in reality it was only added a few releases ago and will break with
older SDKs. If the test SDKSettings files continue to be updated ad hoc,
it’s going to be really easy to copy/paste a default value everywhere,
and then clients will see incorrect behaviors with the real SDKs, or
even compiler crashes if the key is unconditionally read. Preemptively
add all of the maybe-possibly-compiler relevant keys to the test
SDKSettings files from the real SDKs so that the test files are an
accurate representation and shouldn't need to be touched in the future.
Where the test SDKSettings have intentionally doctored data, add a
Comments key explaining what is changed from the real SDK, and alter the
SDK name with a tag indicating the change.
[WebAssembly] Fix exception handling initialization order in TargetMachine constructor (#177542)
The WebAssemblyTargetMachine constructor had an ordering issue where
initAsmInfo() was called before basicCheckForEHAndSjLj(). This caused
problems in incremental compilation scenarios where:
1. `initAsmInfo()` sets `MCAsmInfo` exception type based on
`Options.ExceptionModel`
2. But `Options.ExceptionModel` might still be None at this point
3. `basicCheckForEHAndSjLj()` runs later and updates
`Options.ExceptionModel`
based on command-line flags like `-wasm-enable-eh`
4. `MCAsmInfo` retains the incorrect exception type (`None` instead of
`Wasm`)
5. This prevents WebAssembly exception handling passes from running
The fix swaps the order so basicCheckForEHAndSjLj() runs first to
establish the correct exception model before initAsmInfo() configures
MCAsmInfo based on that model.
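The ordering problem can be reproduced in a toy model. The types below (`ToyOptions`, `ToyAsmInfo`, `ToyTargetMachine`) are hypothetical stand-ins, not the real LLVM classes; the `CheckFirst` parameter exists only so both orderings can be compared side by side.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real LLVM types.
enum class ExceptionHandling { None, Wasm };

struct ToyOptions {
  ExceptionHandling ExceptionModel = ExceptionHandling::None;
};
struct ToyAsmInfo {
  ExceptionHandling ExceptionsType = ExceptionHandling::None;
};

struct ToyTargetMachine {
  ToyOptions Opts;
  ToyAsmInfo MAI;
  bool WasmEnableEH; // models the -wasm-enable-eh flag

  ToyTargetMachine(bool EnableEH, bool CheckFirst) : WasmEnableEH(EnableEH) {
    if (CheckFirst) {
      basicCheck();  // fixed order: establish the model first
      initAsmInfo();
    } else {
      initAsmInfo(); // old order: snapshot taken too early
      basicCheck();
    }
  }

  // Updates the exception model from the flag.
  void basicCheck() {
    if (WasmEnableEH)
      Opts.ExceptionModel = ExceptionHandling::Wasm;
  }
  // Copies whatever the exception model is at the time of the call.
  void initAsmInfo() { MAI.ExceptionsType = Opts.ExceptionModel; }
};
```

With the old order the asm info keeps `None` even though the options end up as `Wasm`; with the swapped order both agree.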
[2 lines not shown]
[ELF] Set vna_flags to VER_FLG_WEAK if all references are weak (#176673)
When all undefined references to a version are weak, set vna_flags to
VER_FLG_WEAK in the .gnu.version_r section. This enables glibc ld.so to
report a warning instead of an error when the required version is not
found at runtime, supporting optional dependencies.
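A minimal sketch of the rule, assuming a hypothetical `SymRef` record per undefined reference; this is not lld's actual data structure or code. The constant value 0x2 is the standard `VER_FLG_WEAK` value. Returning 0 for an empty reference list is an assumption made here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Same value as VER_FLG_WEAK in <elf.h>; renamed to avoid macro clashes.
constexpr std::uint16_t kVerFlgWeak = 0x2;

// Hypothetical record for one undefined reference to a versioned symbol.
struct SymRef {
  bool isWeak;
};

// Returns the vna_flags for a version given all references to it:
// the weak flag is set only when every reference is weak.
std::uint16_t computeVnaFlags(const std::vector<SymRef> &refs) {
  bool allWeak =
      !refs.empty() && std::all_of(refs.begin(), refs.end(),
                                   [](const SymRef &r) { return r.isWeak; });
  return allWeak ? kVerFlgWeak : 0;
}
```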
Per https://sourceware.org/bugzilla/show_bug.cgi?id=24718#c20 ,
glibc rtld since 2.30 (BZ #24741) tolerates missing versioned symbols
when the runtime shared object defines the required version. With this
vna_flags VER_FLG_WEAK change, rtld can also tolerate a completely
missing version, printing a message like:
```
% LD_PRELOAD=c2.so ./a
./a: /tmp/t/v2/c2.so: weak version `v1' not found (required by /tmp/t/v2/b.so)
a
```
[2 lines not shown]
[Flang][OpenMP][Offload] Modify MapInfoFinalization to handle attach mapping and 6.1's ref_* and attach map keywords
This PR is one of four required to implement the attach mapping semantics in Flang, alongside the
ref_ptr/ref_ptee/ref_ptr_ptee map modifiers and the attach(always/never/auto) modifiers.
This PR contains the MapInfoFinalization changes required to support these features. It mainly deals
with applying the correct attach map type and manipulating the descriptor types maps for base address
and descriptor so that when we specify ref_ptr/ref_ptee we emit one of the two maps and when we
specify ref_ptr_ptee we emit our usual default maps. In all cases we add the "glue" of a new
attach map except in cases where a user has provided attach never. In cases where we are
provided always, we apply the always map type to our attach maps.
It's important to note the runtime has a toggle for the auto map behaviour, which will flip the
attach behaviour to the newer semantics or the older semantics for backwards compatibility (outside
the purview of this PR but good to mention).
[lldb] Fix data buffer regression in ObjectFile (#177724)
This fixes a regression in `ObjectFile` and `ObjectFileELF` introduced
by #171574.
The original code created a `DataBuffer` using `MapFileDataWritable`.
```
data_sp = MapFileDataWritable(*file, length, file_offset);
if (!data_sp)
  return nullptr;
data_offset = 0;
```
The new code requires converting the `DataBuffer` to a `DataExtractor`:
```
DataBufferSP buffer_sp = MapFileDataWritable(*file, length, file_offset);
if (!buffer_sp)
[11 lines not shown]
[CodeGen][NPM] Specify Loop pass adaptor to not use MSSA (#176690)
This needs to be done because the "loop-mssa" adaptor assumes that all
passes that are part of it preserve MSSA, and
CanonicalizeFreezeInLoopsPass doesn't. I'm not really sure of the
history here (about having two variants of loop pass adaptors).
[bazel] Fixes for compiler-rt Bazel build rules (#177287)
Update the compiler-rt arch-specific file groups to include `.h` file
extensions. At least `arm` and `ppc` have these, and it seems better to be
consistent and defensive.
Also add `5` to model list for outlined atomics, matching CMake.
[AMDGPU][GFX1250] Optimize s_wait_xcnt for back-to-back atomic RMWs (#177620)
This patch optimizes the insertion of s_wait_xcnt instruction for
sequences of atomic read-modify-write (RMW) operations in the
SIInsertWaitcnts pass. The Memory Legalizer conservatively inserts a
soft xcnt instruction before each atomic RMW operation as part of PR
168852, which is correct given the nature of atomic operations.
However, for back-to-back atomic RMWs, only the first s_wait_xcnt is
necessary, and dropping the rest improves runtime performance. This patch tracks atomic
RMW blocks within each basic block and removes redundant soft xcnt
instructions, keeping only the first wait in each sequence. An atomic
RMW block continues through subsequent atomic RMWs and non-memory
instructions (e.g., ALU operations) but is broken by CU-scoped memory
operations, atomic stores, or basic block boundaries.
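The tracking described above can be sketched as a single linear scan. This is a toy model, not the SIInsertWaitcnts implementation: the `Kind` enum and `findRedundantWaits` are hypothetical, and one call models one basic block, so the state resets at block boundaries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction categories; the real pass inspects MachineInstrs.
enum class Kind { SoftXcntWait, AtomicRMW, ALU, CUScopedMemOp, AtomicStore };

// Returns the indices of soft xcnt waits in one basic block that this toy
// model treats as redundant: waits directly before an atomic RMW while an
// atomic RMW block is already open.
std::vector<std::size_t> findRedundantWaits(const std::vector<Kind> &block) {
  std::vector<std::size_t> redundant;
  bool inRMWBlock = false; // fresh per call, i.e. per basic block
  for (std::size_t i = 0; i < block.size(); ++i) {
    switch (block[i]) {
    case Kind::SoftXcntWait:
      if (inRMWBlock && i + 1 < block.size() &&
          block[i + 1] == Kind::AtomicRMW)
        redundant.push_back(i);
      break;
    case Kind::AtomicRMW:
      inRMWBlock = true; // opens or continues an RMW block
      break;
    case Kind::ALU:
      break; // non-memory instructions do not break the block
    case Kind::CUScopedMemOp:
    case Kind::AtomicStore:
      inRMWBlock = false; // these end the current RMW block
      break;
    }
  }
  return redundant;
}
```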
Cleanup: Remove SmallVector hacks (#177667)
We no longer support any platform that uses Clang 5. This was a
workaround for older clang versions where template arguments weren't
merged between forward declarations and definitions correctly.
Since we don't support anything this old anymore, we can drop this
workaround.
Note: I do not have merge permissions.
[flang] Added ConditionallySpeculatable and Pure for some FIR ops. (#174013)
This patch implements the `ConditionallySpeculatable` interface for some
FIR operations (`embox`, `rebox`, `box_addr`, `box_dims` and `convert`).
It also adds the `Pure` trait for `fir.shape`, `fir.shapeshift`,
`fir.shift` and `fir.slice`.
I could have split this into multiple patches, but the changes
are better tested together on real apps, and the amount of affected
code is small.
There are more `NoMemoryEffect` operations for which I am planning
to do the same in future PRs.
[flang] Support cuf.device_address in FIR AliasAnalysis. (#177518)
Support `cuf.device_address` the same way as `fir.address_of`.
This implementation implies that the host address and the device
address `MustAlias` (as shown in the new test). This should be
conservatively correct as long as `MustAlias` does not allow one
to assume that the actual addresses are the same (that is what
LLVM documentation implies, I believe).
It is probably worth adding an operation interface to handle
`fir::AddrOfOp` and `cuf::DeviceAddressOp` in FIR AliasAnalysis,
but for the initial implementation I hardcoded the checks.
I also removed the call to `fir::valueHasFirAttribute`, which performs
on-demand SymbolTable lookups that may be costly, and added
SymbolTable caching in the FIR AliasAnalysis object. Anyway,
`fir::valueHasFirAttribute` does not work for `cuf::DeviceAddressOp`.
[HIP] Provide implicit include to ROCm library directory (#177704)
Summary:
It's more correct to directly link the HIP runtime if we know the path,
however some users were relying on the old `-L` to pass in some other
non-standard HIP libraries. Put that part back in for now.
[lldb] Avoid redundant calls to `std::shared_ptr::get` (NFC) (#177720)
Avoid redundant calls to `std::shared_ptr::get()`. The class provides a
dereference operator and using that is the standard, idiomatic way to
access the underlying object.
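A small before/after sketch of the pattern, using a hypothetical `Widget` type rather than any real lldb class:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical type used only for illustration.
struct Widget {
  std::string name;
  std::size_t nameLength() const { return name.size(); }
};

// Before: the explicit get() adds nothing.
std::size_t lengthViaGet(const std::shared_ptr<Widget> &sp) {
  return sp.get()->nameLength();
}

// After: shared_ptr already provides operator-> and operator*.
std::size_t lengthViaArrow(const std::shared_ptr<Widget> &sp) {
  return sp->nameLength();
}
```

Both functions behave identically for a non-null pointer; the second is simply the idiomatic spelling.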