[lldb-dap] Use MainLoop instead of a background thread in OutputRedirector. (#199970)
Replace the background thread in OutputRedirector with LLDB's MainLoop
event loop. This reduces the number of threads created and ensures file
descriptors are properly closed when no longer needed.
Since the debugger's output is not I/O intensive, there is no risk of
hitting the pipe buffer limit with this approach.
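A minimal sketch of the idea, assuming a POSIX system: rather than
dedicating a thread to a blocking read(), the read end of the pipe is
polled from the loop that is already running. This is not LLDB's MainLoop
API; `pumpOnce` and its parameters are hypothetical names used only for
illustration.

```cpp
#include <poll.h>
#include <unistd.h>

#include <functional>
#include <string>

// Hypothetical helper: polls `fd` once and, if data is available, forwards
// it to `callback`. Returns false on EOF or error, which tells the caller
// to stop watching the descriptor and close it.
bool pumpOnce(int fd, int timeoutMs,
              const std::function<void(const std::string &)> &callback) {
  pollfd p{fd, POLLIN, 0};
  int ready = poll(&p, 1, timeoutMs);
  if (ready < 0)
    return false; // poll() failed.
  if (ready == 0)
    return true; // Timed out with nothing to read; keep watching.

  char buf[4096];
  ssize_t n = read(fd, buf, sizeof(buf));
  if (n <= 0)
    return false; // EOF (writer closed) or read error.
  callback(std::string(buf, static_cast<size_t>(n)));
  return true;
}
```

An event loop would call something like this whenever the descriptor is
reported readable, and close the descriptor once it returns false.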
[mlir][SliceAnalysis] Fix visited set to avoid infinite recursion (#200008)
Fixes #139694, which introduced use-def cycle detection during slice
analysis, but some cycles were still not detected, potentially leading
to infinite recursion.
This PR fixes the handling of the visited set, which tracks the current
DFS path during recursion. Previously, the set could fail to detect
double cycles because entries were erased even when no recursive call
was made. The insert/erase operations are now only performed when
recursion actually occurs, ensuring that cycle detection correctly
reflects the active DFS path.
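The invariant described above can be sketched on a toy graph. This is
not the SliceAnalysis code; `Graph`, `reachesCycle`, and `onPath` are
hypothetical names. The point is only that insert and erase are paired
around an actual recursive descent, so the set always equals the active
DFS path.

```cpp
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical toy graph: node id -> ids of the nodes it depends on.
using Graph = std::unordered_map<int, std::vector<int>>;

// Returns true if a cycle is reachable from `node`. `onPath` holds exactly
// the nodes on the active DFS path.
bool reachesCycle(const Graph &g, int node, std::unordered_set<int> &onPath) {
  auto it = g.find(node);
  if (it == g.end())
    return false; // Leaf: no recursion happens, so `onPath` is not touched.

  // Insert only because we are about to recurse through this node.
  if (!onPath.insert(node).second)
    return true; // Node is already on the active path: cycle found.

  bool found = false;
  for (int next : it->second) {
    if (reachesCycle(g, next, onPath)) {
      found = true;
      break;
    }
  }

  // Erase is paired with the insert above, never performed on its own.
  onPath.erase(node);
  return found;
}
```

Because a frame only erases the entry it inserted itself, an entry
belonging to an outer frame cannot be removed by accident.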
[AArch64][GlobalISel] Add BF16 fabs and fneg (#198655)
These should be very simple as they are just legal or expanded based on
whether fullfp16 is available, as the FP16 FNEG and FABS instructions can
be used equally for BF16.
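As a rough illustration of why the same instructions work for both
types: FP16 and BF16 are both 16-bit formats with the sign in bit 15, so
fneg and fabs are pure sign-bit operations that do not depend on how the
remaining bits are split between exponent and mantissa. The helper names
below are hypothetical.

```cpp
#include <cstdint>

// Works on the raw 16-bit pattern of either an FP16 or a BF16 value.
constexpr uint16_t kSignBit = 0x8000;

// fneg: flip the sign bit.
constexpr uint16_t fnegBits(uint16_t bits) { return bits ^ kSignBit; }

// fabs: clear the sign bit.
constexpr uint16_t fabsBits(uint16_t bits) {
  return bits & static_cast<uint16_t>(~kSignBit);
}
```

For example, 1.0 is 0x3F80 in BF16 and 0x3C00 in FP16; negating either
one only sets the top bit.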
[flang-rt][cuda] Use a thinner I/O in CUDA build (#199769)
Reduce the footprint of IO in the CUDA build. This helps when including
IO in non-relocatable device code mode.
[AtomicExpand] Preserve volatile in widenPartwordAtomicRMW. (#199722)
widenPartwordAtomicRMW widens a sub-word atomicrmw to the target's
minimum cmpxchg size by calling CreateAtomicRMW, which has no
IsVolatile parameter, and it did not copy isVolatile() from the original
instruction. Every other expansion path in this file already does. This
affects targets whose MinCmpXchgSizeInBits exceeds the value width
(RISC-V without Zabha, LoongArch base, SPARC, AMDGPU, etc.).
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
[ProfCheck] Fix #199174 (#200013)
The patch added another large fp conversion test for which some profile
annotations are currently missing, so add it to the xfail list for now.
[flang][OpenMP] Optionally get final symbol in Get(Argument|Object)Symbol (#196816)
Originally these functions returned the ultimate symbol for the one
obtained from the argument or object. However, this may be somewhat
unintuitive/unexpected, so instead return the original symbol, and add a
flag to optionally return the ultimate one.
[Flang][OpenMP] Support declare reduction without initializer (#196211)
For declare reduction without an explicit initializer clause, the init
callback now handles initialization inline rather than relying on the
_FortranAInitialize runtime call, which is available on the device
runtime but has known issues on GPU targets.
The initialization logic first checks whether an initializer clause is
present. If one is provided, it is used directly. Otherwise, for derived
types, the code checks whether the type has default component
initialization. If it does, each component is initialized inline:
components with explicit default values use those values, components
that are themselves derived types with defaults are recursively
initialized, and components without any default are zero-initialized.
Derived types with allocatable components that require runtime
initialization are guarded by a TODO.
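The per-component decision for the case without an initializer clause
can be sketched with a toy model. This is not the Flang lowering code:
`Component`, `DerivedType`, and `initInline` are hypothetical, values are
plain ints, and allocatable components are left out, matching the TODO.

```cpp
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <vector>

struct DerivedType;

// Hypothetical model of one component of a derived type.
struct Component {
  std::string name;
  std::optional<int> defaultValue;     // Explicit default value, if any.
  std::shared_ptr<DerivedType> nested; // Set when the component is itself
                                       // of derived type.
};

struct DerivedType {
  std::vector<Component> components;
};

// Initializes every scalar leaf of `type` inline, recording the chosen value
// under a path such as "b%x".
void initInline(const DerivedType &type, const std::string &prefix,
                std::map<std::string, int> &out) {
  for (const Component &c : type.components) {
    std::string path = prefix + c.name;
    if (c.defaultValue)
      out[path] = *c.defaultValue; // Explicit default: use it.
    else if (c.nested)
      initInline(*c.nested, path + "%", out); // Derived type: recurse.
    else
      out[path] = 0; // No default: zero-initialize.
  }
}
```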
Assisted by: Claude Opus 4.6
[ExpandIRInsts] Support llvm.fpto{u,s}i.sat (#199174)
Previously, running ExpandIRInsts on a program which needs to expand a
vector fptoui.sat would hit llvm_unreachable, because the `scalarize`
function didn't handle this intrinsic.
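For reference, scalarizing a vector fptosi.sat amounts to applying the
scalar saturating conversion lane by lane. The sketch below models the
documented semantics of llvm.fptosi.sat for float to i32 (NaN becomes 0,
out-of-range values clamp to the integer limits); it is not the
ExpandIRInsts implementation, and the function names are hypothetical.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

// Scalar model of llvm.fptosi.sat.i32.f32.
int32_t fpToSiSat(float v) {
  if (std::isnan(v))
    return 0; // NaN converts to zero.
  if (v >= 2147483648.0f) // 2^31, exactly representable as a float.
    return std::numeric_limits<int32_t>::max();
  if (v <= -2147483648.0f)
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(v); // In range: truncate toward zero.
}

// "Scalarized" vector form: one scalar conversion per lane.
template <std::size_t N>
std::array<int32_t, N> fpToSiSatVec(const std::array<float, N> &in) {
  std::array<int32_t, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = fpToSiSat(in[i]);
  return out;
}
```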
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
[flang][OpenMP] Lower target in_reduction for host fallback
Teach Flang lowering and MLIR OpenMP translation to carry
in_reduction through omp.target for the host-fallback path.
The translation looks up task reduction-private storage with
__kmpc_task_reduction_get_th_data and binds the target region's
in_reduction block argument to that private pointer, so uses inside the
region do not keep referring to the original variable.
The patch also preserves in_reduction operands in the TargetOp builder
path and ensures target in_reduction list items are mapped into the
target region when needed.
The device/offload-entry path remains diagnosed as not yet implemented.
[InstCombine] Use sadd.sat for chained ldexp fold (#199274)
ldexp(ldexp(x, a), b) -> ldexp(x, a + b) didn't consider the fact that
`a + b` may overflow! Use a saturating add instead.
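A small sketch of the difference, assuming 32-bit exponents: `saddSat`
models llvm.sadd.sat.i32, and `foldedLdexp` is a hypothetical stand-in
for the folded expression, not the InstCombine code.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Signed saturating add: clamps to the int32_t limits instead of wrapping.
int32_t saddSat(int32_t a, int32_t b) {
  int64_t wide = static_cast<int64_t>(a) + static_cast<int64_t>(b);
  if (wide > std::numeric_limits<int32_t>::max())
    return std::numeric_limits<int32_t>::max();
  if (wide < std::numeric_limits<int32_t>::min())
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(wide);
}

// Folded form of ldexp(ldexp(x, a), b), with the exponents combined by a
// saturating add.
double foldedLdexp(double x, int32_t a, int32_t b) {
  return std::ldexp(x, saddSat(a, b));
}
```

With a wrapping add, INT32_MAX + 1 would turn into a large negative
exponent and the fold would change the sign of the scaling; the
saturating add keeps the combined exponent at INT32_MAX.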
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
[X86][AvoidStoreForwardingBlocks] Skip volatile/atomic accesses. (#199698)
The pass splits an XMM/YMM load+store pair into smaller copies when a
preceding narrower store would block store-to-load forwarding into the
load, but it didn't check the MachineMemOperand's isVolatile/isAtomic
bits.
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
[win][x64] Updated `llvm-objdump` and `llvm-readobj` to be able to dump Windows x64 Unwind v3 information. (#199120)
Public docs:
<https://learn.microsoft.com/en-us/cpp/build/x64-unwind-information-v3?view=msvc-170>
The change adds Windows x64 unwind v3 info decoding and printing support
in LLVM, including new data structures, enums, and decoding functions to
handle the different WOD opcodes and epilog descriptors. It also updates
the dumping utilities (llvm-readobj and llvm-objdump) to correctly
interpret v3 unwind info.
[RISCV][P-ext] Make the direction argument for RVPPairShift* classes required. NFC (#199799)
It's part of the encoding. I don't think we should have a preference for
one of the bit values being the default.