[MLIR][GPU] Update serializeToObject to use SerializedObject wrapper and include ISA compiler logs (#176697)
This PR makes the compilation log from the ISA compiler available to users
by returning it as part of the `gpu::ObjectAttr` properties, following
the existing pattern used for `LLVMIRToISATimeInMs`.
Currently, the compiler log (which contains useful information such as
spill statistics when --verbose is passed) is only accessible in debug
builds via `LLVM_DEBUG`. However, there are good reasons to make this
information available in release builds as well:
1. Both `ptxas` and `libnvptxcompiler` are publicly available
tools/libraries distributed with the CUDA Toolkit. The `--verbose` flag
and its output are documented public features, not internal debug
information.
2. The verbose output provides valuable insights for users.
A new `SerializedObject` class is used to carry the metadata alongside
the binary when returning from `serializeToObject`.
fuzzer: modernize FuzzedDataProvider conversions (#177794)
This change modernizes FuzzedDataProvider.h now that C++17+ is standard
in LLVM.
- Replace the runtime `if` with `if constexpr` in `ConvertUnsignedToSigned`
- Make the unsigned/signed comparison explicit by casting `TS::max()` to `TU`
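
A minimal sketch of the two changes, assuming a free function in place of the `FuzzedDataProvider` member. The name `ConvertUnsignedToSignedSketch` is a hypothetical stand-in and the body is an approximation, not the code from the header:

```cpp
#include <cstdint>
#include <limits>

// Hypothetical free-function stand-in for the member in FuzzedDataProvider.h.
template <typename TS, typename TU>
TS ConvertUnsignedToSignedSketch(TU value) {
  static_assert(sizeof(TS) == sizeof(TU), "Incompatible data types.");
  static_assert(!std::numeric_limits<TU>::is_signed,
                "Source type must be unsigned.");

  // Resolved at compile time: only one branch is instantiated.
  if constexpr (std::numeric_limits<TS>::is_modulo) {
    return static_cast<TS>(value);
  } else {
    // Compare in the unsigned domain by casting TS::max() to TU explicitly.
    if (value <= static_cast<TU>(std::numeric_limits<TS>::max()))
      return static_cast<TS>(value);
    // Map the upper half of the unsigned range onto the negative values
    // without relying on an out-of-range unsigned-to-signed conversion.
    constexpr TS kMin = std::numeric_limits<TS>::min();
    return static_cast<TS>(kMin +
                           static_cast<TS>(value - static_cast<TU>(kMin)));
  }
}
```

For example, `ConvertUnsignedToSignedSketch<std::int8_t, std::uint8_t>(200)` yields `-56`, the two's-complement reading of the same bit pattern.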
[flang][NFC] Converted five tests from old lowering to new lowering (part 12) (#178831)
Tests converted from test/Lower: derived-pointer-components.f90,
derived-type-finalization.f90, derived-types.f90, do_loop.f90,
do_loop_unstructured.f90
ValueTracking: Revert noundef checks in computeKnownFPClass for fmul/fma (#178850)
This functionally reverts fd5cfcc41311c6287e9dc408b8aae499501660e1 and
35ce17b6f6ca5dd321af8e6763554b10824e4ac4.
The checks were correct and necessary, but they are causing performance
regressions because isGuaranteedNotToBeUndef is apparently not smart enough
to see through recurrences. Revert them for the release branch.
Also, the test coverage was inadequate for the fma case, so add a new
case whose output differs with and without the check.
AMDGPU/GlobalISel: Regbanklegalize rules for G_UNMERGE_VALUES
Move G_UNMERGE_VALUES handling to AMDGPURegBankLegalizeRules.cpp.
Fix sgpr S16 unmerge by lowering it with shifts on S32.
Previously, sgpr S16 unmerge was selected using the _lo16 and _hi16 subreg
indexes, which are exclusive to vgpr register classes.
For the remaining cases we use a trivial mapping that assigns the same
reg bank, vgpr or sgpr, to all operands.
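
As a rough scalar analogy of the shift-based lowering (plain C++ on integers, not MIR; the function name is hypothetical and the exact instruction sequence the patch emits is not shown here), the two 16-bit halves can be obtained using only 32-bit operations:

```cpp
#include <cstdint>
#include <utility>

// Hypothetical scalar analogy: split a 32-bit value into its low and high
// 16-bit halves using only 32-bit operations.
std::pair<std::uint32_t, std::uint32_t> UnmergeS16ViaS32(std::uint32_t src) {
  std::uint32_t lo = src & 0xFFFFu;  // low half, kept in a 32-bit value
  std::uint32_t hi = src >> 16;      // high half, moved down by a shift
  return {lo, hi};
}
```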
[lldb] Fix SBBreakpointName::SetEnabled to propagate changes to breakpoints (#178734)
When setting the enabled state of a breakpoint name via the API, the
change was not being propagated to breakpoints using that name.
This was inconsistent with the CLI behaviour where `breakpoint name
configure --enable/--disable` correctly updates all associated
breakpoints.
[SPIRV] Split async copy tests and fix invalid tests
After a spirv-val update, tests that mix spirv32 and spirv64 targets with
the same LLVM IR are now correctly flagged as invalid. The SPIR-V
specification requires that NumElements and Stride operands in
OpGroupAsyncCopy must be 32-bit integers when the addressing model is
Physical32, and 64-bit integers for Physical64.
[GlobalISel] Insert bitcast instead of register replacement when types don't match. (#177397)
Cases like the newly added test with vector types currently hit
``Assertion `canReplaceReg(OldReg, Replacement, MRI) && "Cannot replace register?"' failed.``
because the source and destination registers have mismatching types.
Apart from the assertion, it also fails when using
`--verify-machineinstrs`. This PR adds a bitcast in those cases.
[DAG] Enable bitcast STLF for Constant/Undef (#172523)
This patch introduces support for Store-to-Load Forwarding (STLF) in
`DAGCombiner::ForwardStoreValueToDirectLoad` when the store and load
have **different types but equal memory size** (e.g., storing an `i32`
then loading a `float` from the same location).
### What this patch does:
**Enables optimization:** The stored value is safely forwarded as a
bitcast when the memory sizes match (`LdMemSize` == `StMemSize`) and the
value is one of:
* A **Constant** (`ConstantSDNode`, `ConstantFPSDNode`,
`ConstantPoolSDNode`).
* **Undef**.
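
To illustrate the pattern at the source level, here is a small C++ sketch. It only illustrates the semantics and is not DAGCombiner code; the function names are hypothetical and a 32-bit IEEE-754 `float` is assumed. Storing a 32-bit integer constant and then loading a `float` from the same location gives the same value as reinterpreting the constant's bits directly, which is what forwarding the stored value as a bitcast relies on:

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "sketch assumes a 32-bit IEEE-754 float");

// Store an i32-sized value to memory, then load a float from the same bytes.
float StoreThenLoad(std::uint32_t bits) {
  unsigned char slot[sizeof(float)];
  std::memcpy(slot, &bits, sizeof(bits));      // store: 4 bytes
  float loaded;
  std::memcpy(&loaded, slot, sizeof(loaded));  // load: 4 bytes, other type
  return loaded;
}

// What forwarding amounts to: skip the memory round trip and reinterpret
// the stored bits as the loaded type.
float ForwardAsBitcast(std::uint32_t bits) {
  float result;
  std::memcpy(&result, &bits, sizeof(result));
  return result;
}
```

Both functions return `1.0f` for the input `0x3F800000`, since that is the IEEE-754 single-precision encoding of 1.0.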
### Scope and Next Steps:
This patch **only implements forwarding for constant and undef values
that have the same memory size** so far.
[14 lines not shown]
[MemCpyOpt] Extend `performMemCpyToMemSetOptzn` to partially memset'd region
While doing memset-to-memcpy forwarding, also handle a memset that
covers the memory region only from a given offset, where the leading
bytes of that region are undef.
Fixes: https://github.com/llvm/llvm-project/issues/172326.
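
A minimal C++ sketch of the kind of source pattern this appears to target. The function name, offset, and sizes are hypothetical, and what the pass actually emits for it is not shown here:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical source pattern: `buf` starts out uninitialized, the memset
// begins at offset 4, and the memcpy reads the whole buffer from offset 0.
void CopyPartiallyMemsetBuffer(unsigned char *dst) {
  unsigned char buf[16];
  std::memset(buf + 4, 0, 12);  // bytes [4, 16) are zero
  std::memcpy(dst, buf, 16);    // bytes [0, 4) of the source are undef
}
```

Because the leading four source bytes are undef, any value may be written for them, so only bytes 4 through 15 of the destination have a defined value (zero).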
[clang][bytecode] Clean up `interp::Function` parameter handling (#178621)
Replace the multiple data structures with a vector + a map holding all
`ParamDescriptor`s. Update docs.
Add test for amdgcn.if/else uniformity analysis
This test documents the current behavior where both outputs of
amdgcn.if and amdgcn.else are marked as divergent. The second
output (exec mask) should be uniform.
[Driver][Frontend] Add -f[no-]ms-anonymous-structs flag to control Microsoft anonymous struct/union extension (#176551)
Add the Clang driver options `-fms-anonymous-structs` and
`-fno-ms-anonymous-structs` to enable or disable Microsoft anonymous
struct/union support independently of `-fms-extensions`.
**Motivation**:
- On some platforms (e.g. AIX), enabling `-fms-extensions` can conflict
with system headers (such as usage of `__ptr32`).
- Some codebases rely specifically on Microsoft anonymous struct/union
behavior without requiring other Microsoft extensions.
This change allows users to selectively enable the anonymous struct/union
extension at the driver level without enabling full Microsoft
compatibility mode.
[28 lines not shown]
[lldb-dap] Fix the completion provided to the DAP client. (#177151)
Previously, completion behavior was inconsistent, sometimes including
the partial token or removing existing user text.
Since LLDB completions include the partial token by default, we now
strip it before sending to the client.
The completion heuristic:
1. Strip the commandEscapePrefix
2. Request completions from the debugger
3. Get the line at cursor position
4. Calculate the length of any partial token
5. Offset each completion by the partial token length
In all cases, the completion starts from the cursor position, then
offsets by `Length` to the left and inserts the completion.
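
Steps 3 to 5 can be sketched as follows. This is a minimal illustration, not the lldb-dap implementation: the type and function names are hypothetical, and it assumes the partial token is the run of non-whitespace characters immediately to the left of the cursor.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical description of where a completion is inserted.
struct CompletionEdit {
  std::size_t start;   // cursor position moved left by `length`
  std::size_t length;  // length of the partial token being replaced
  std::string text;    // completion text, including the partial token
};

// Length of the non-whitespace run ending at the cursor (assumed tokenizer).
std::size_t PartialTokenLength(std::string_view line, std::size_t cursor) {
  if (cursor > line.size())
    cursor = line.size();
  std::size_t begin = cursor;
  while (begin > 0 && line[begin - 1] != ' ' && line[begin - 1] != '\t')
    --begin;
  return cursor - begin;
}

// Start at the cursor, then offset to the left by the partial token length.
CompletionEdit MakeEdit(std::string_view line, std::size_t cursor,
                        std::string_view completion) {
  if (cursor > line.size())
    cursor = line.size();
  const std::size_t length = PartialTokenLength(line, cursor);
  return {cursor - length, length, std::string(completion)};
}

// Apply the edit the way a client would: replace the partial token.
std::string ApplyEdit(std::string_view line, const CompletionEdit &edit) {
  std::string result(line);
  result.replace(edit.start, edit.length, edit.text);
  return result;
}
```

With the line `breakpoint s` and the cursor at the end, the partial token is `s`, so the completion `set` replaces one character and the line becomes `breakpoint set`.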
Examples (single quotes show whitespace and are not part of the input):
```md
[15 lines not shown]
```
[mlir][shape] Fix crash in FromExtentsOp::fold with poison operands (#178844)
## Summary
- Fix assertion failure in `shape.from_extents` fold when operands
include `ub.poison`
- The fold assumed all non-null attributes were `IntegerAttr`, but
poison produces a different attribute type
- Use `dyn_cast_if_present` to safely handle non-integer attributes
Fixes #178820
## Test plan
- Added regression test `@from_extents_poison` in canonicalize.mlir
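
The idea behind the fix can be sketched with a standard-library analogy. This is not MLIR code: the `Attribute`, `IntegerAttr`, and `PoisonAttr` types and the `FoldExtents` function below are hypothetical stand-ins, and `dynamic_cast` on a possibly null pointer plays the role of `dyn_cast_if_present`, yielding null both for a missing attribute and for an attribute of another type.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-ins for the attribute hierarchy.
struct Attribute {
  virtual ~Attribute() = default;
};
struct IntegerAttr : Attribute {
  explicit IntegerAttr(std::int64_t v) : value(v) {}
  std::int64_t value;
};
struct PoisonAttr : Attribute {};

// Fold only when every operand is a present integer attribute; bail out
// instead of asserting when an operand is missing or has another type.
std::optional<std::vector<std::int64_t>>
FoldExtents(const std::vector<const Attribute *> &operands) {
  std::vector<std::int64_t> extents;
  for (const Attribute *attr : operands) {
    // Null for a null input as well as for a non-integer attribute.
    const auto *integer = dynamic_cast<const IntegerAttr *>(attr);
    if (!integer)
      return std::nullopt;
    extents.push_back(integer->value);
  }
  return extents;
}
```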
[ExpandIRInst] Support expanding fptoi to smaller type (#178690)
In order to support expanding fptoi where the target type is smaller,
make most of the code work on the float-as-integer type, rather than the
target type of the cast. We only need to cast the final result to the
target type, or prior to performing a left shift.
This not only allows us to handle casts to a smaller type, but also
avoids performing intermediate calculations on unnecessarily large
types.
This also matches how compiler-rt handles this:
https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/builtins/fp_fixint_impl.inc
Proof: https://alive2.llvm.org/ce/z/3pJ9pE
(Note that there is a pre-existing issue that we produce the same code
for fptosi and fptoui.)
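
A minimal C++ sketch of the approach, loosely modelled on the compiler-rt routine linked above rather than on the IR the pass emits. The function name is hypothetical, a 32-bit IEEE-754 `float` is assumed, and saturating on out-of-range input is a choice made for this sketch. All intermediate work happens on the 32-bit float-as-integer representation, and the value is narrowed to the 16-bit target type only at the end:

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "sketch assumes a 32-bit IEEE-754 float");

// Hypothetical helper: float -> int16_t, computed on the 32-bit bit pattern.
std::int16_t FloatToI16Sketch(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));

  const bool negative = (bits >> 31) != 0;
  const std::uint32_t abs_bits = bits & 0x7FFFFFFFu;
  const int exponent = static_cast<int>(abs_bits >> 23) - 127;
  const std::uint32_t significand = (abs_bits & 0x007FFFFFu) | 0x00800000u;

  // Magnitude below 1 (including zero and denormals) truncates to 0.
  if (exponent < 0)
    return 0;
  // Magnitude of at least 2^15 does not fit; saturate. NaN and infinity
  // also land here, as does -32768.0f, whose result is exactly the minimum.
  if (exponent >= 15)
    return negative ? std::numeric_limits<std::int16_t>::min()
                    : std::numeric_limits<std::int16_t>::max();

  // exponent is in [0, 14], so a right shift of the 24-bit significand is
  // always enough; no left shift is needed for this pair of types.
  const std::uint32_t magnitude = significand >> (23 - exponent);
  const std::int32_t wide = negative ? -static_cast<std::int32_t>(magnitude)
                                     : static_cast<std::int32_t>(magnitude);
  return static_cast<std::int16_t>(wide);  // narrow only at the end
}
```

For example, `FloatToI16Sketch(3.75f)` returns `3` and `FloatToI16Sketch(-3.75f)` returns `-3`, matching truncation toward zero.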