[libc++] Remove non-standard member type iterator_type from __wrap_iter (#186871)
Resolves #186801
Removed the non-standard member type `iterator_type` from `__wrap_iter`.
This member exposed the underlying iterator type, and its removal
prevents users from relying on the implementation detail.
[mlir][memref] Simplify expand_shape size/stride computation using output_shape (#187844)
This PR refactors `getExpandedSizes` and `getExpandedStrides` to compute
their results directly from the `output_shape` of `memref.expand_shape`.
Instead of reconstructing expanded sizes/strides through manual
inference, we now rely on the operation’s explicit shape information.
The previous implementation imposed the restriction that there must be
at most one dynamic size per reassociation group. This limitation is
removed by the new approach: any number of dynamic dimensions within a
group is now supported, as long as they are represented in the
`output_shape`.
As a result, the code becomes both simpler and more expressive, while
better matching the semantics of `memref.expand_shape`.
[CoroSplit] Never collect allocas used by catchpad into frame (#186728)
Windows EH requires exception objects allocated on stack. But there is
no reliable way to identify them. CoroSplit employs a best-effort
algorithm to determine whether allocas persist on the stack or the
frame, which may result in miscompilation when Windows exceptions are
used.
This patch proposes that we treat allocas used by catchpad as exception
objects and never place them on the frame. A verifier check is added to
enforce that operands of catchpad are either constants or allocas.
Close #143235 Close #153949 Close #182584
[ORC] EPCGenericJITLinkMemoryManager from SimpleNativeMemoryMap syms. (#188391)
Adds a new EPCGenericJITLinkMemoryManager convenience constructor that
constructs an instance by looking up the given symbol names in the
bootstrap JITDylib of the given ExecutionSession.
The symbol names default to the SimpleNativeMemoryMap SPS-interface
symbol names provided by the new ORC runtime.
[LoongArch] Fix incorrect reciprocal sqrt estimate semantics (#187621)
The current implementation of getSqrtEstimate() has incorrect semantics
when using `FRSQRTE`.
`FRSQRTE` computes an approximation to 1/sqrt(x), but the existing code
multiplies the estimate by the operand when Reciprocal is true. This
results in returning sqrt(x) instead of 1/sqrt(x), effectively reversing
the intended semantics of the 'Reciprocal' flag.
Additionally, the implementation does not properly account for LLVM's
Newton-Raphson refinement pipeline. When refinement steps are requested,
the initial estimate must be in reciprocal form so that the generic
DAGCombiner can apply NR iterations correctly.
This patch fixes the behavior by:
- Returning the raw FRSQRTE result when Reciprocal is true, or when
refinement steps are required.
[13 lines not shown]
[RISCV] Fix ssub_sat cost model to use signed VSSUB instead of VSSUBU (#188195)
Intrinsic::ssub_sat was incorrectly mapped to RISCV::VSSUBU_VV (unsigned
saturating subtract) instead of RISCV::VSSUB_VV (signed saturating
subtract), causing wrong cost estimates.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply at anthropic.com>
[flang][acc] Handle fir.undefined with OutlineRematerializationOpInterface in OffloadLiveInValueCanonicalization (#188325)
Example:
```fortran
!$ACC KERNELS PRESENT(CG, W1)
CG(1:W1%WDES1%NPL, NN) = W1%CPTWFP(1:W1%WDES1%NPL)
CPROJ(:, NN) = W1%CPROJ(1:SIZE(CPROJ,1))
!$ACC END KERNELS
```
When compiling OpenACC kernels containing array section assignments of
rank-2 arrays with a scalar index in one dimension (e.g. `CG(1:NPL,
NN)`), the Fortran lowering creates a `fir.slice` where collapsed
(scalar) dimensions use `fir.undefined index` as the stop/step values.
`SliceOp::getOutputRank()` relies on `getDefiningOp()` returning
`fir::UndefOp` to identify these collapsed dimensions and compute the
correct output rank.
When `fir.undefined` values defined outside an offload region are used
[15 lines not shown]
[libc] Wrong guards for `totalorderbf16` and `totalordermagbf16` (#188241)
Currently the guards for `totalorderbf16` and `totalordermagbf16` are as
follows:
```
#ifndef LLVM_LIBC_SRC_MATH_TOTALORDERMAGF16_H
#define LLVM_LIBC_SRC_MATH_TOTALORDERMAGF16_H
-
#endif // LLVM_LIBC_SRC_MATH_TOTALORDERMAGF16_H
```
and
```
#ifndef LLVM_LIBC_SRC_MATH_TOTALORDERF16_H
#define LLVM_LIBC_SRC_MATH_TOTALORDERF16_H
-
#endif // LLVM_LIBC_SRC_MATH_TOTALORDERF16_H
```
As we can see these are for F16 and not BF16 .
This Pr intends to fix that with correct guards as `TOTALORDERBF16` and
`TOTALORDERMAGBF16`
[lld][WebAssembly] Propagate +atomics for ThinLTO when using --shared-memory (#188381)
When compiling WebAssembly with ThinLTO, functions are partitioned into
isolated `.bc` modules and dispatched to individual LTO backend threads.
During code generation, the `CoalesceFeaturesAndStripAtomics` pass
iterates over the module to gather the union of target features (like
`+atomics`) attached to defined functions. In particular when not using
threads, it lowers away atomics and TLS variables to their
single-threaded equivalents.
However, if a partitioned module only contains globally defined TLS
variables (e.g. there are no functions, or all functions were fully
inlined or stripped by dropDeadSymbols before ThinLTO optimization), the
module becomes completely devoid of function definitions. The coalescing
pass then falls back to fetching features from the `TargetMachine`.
Because in LTO the `TargetMachine` defaults to a generic target without
atomics enabled, the TLS is lowered away and the `wasm-feature-atomics`
flag is omitted from the resulting ThinLTO object partition, causing
`wasm-ld` to immediately reject it.
[8 lines not shown]
[clang-tidy] Add missing #include insertion in macros for modernize-use-std-format (#188247)
Added missing ``#include`` insertion when the format function call
appears as an argument to a macro.
Part of #175183
---------
Co-authored-by: Victor Chernyakin <chernyakin.victor.j at outlook.com>