[LoongArch] Fix incorrect reciprocal sqrt estimate semantics (#187621)
The current implementation of getSqrtEstimate() has incorrect semantics
when using `FRSQRTE`.
`FRSQRTE` computes an approximation to 1/sqrt(x), but the existing code
multiplies the estimate by the operand when Reciprocal is true. This
results in returning sqrt(x) instead of 1/sqrt(x), effectively reversing
the intended semantics of the 'Reciprocal' flag.
Additionally, the implementation does not properly account for LLVM's
Newton-Raphson refinement pipeline. When refinement steps are requested,
the initial estimate must be in reciprocal form so that the generic
DAGCombiner can apply NR iterations correctly.
This patch fixes the behavior by:
- Returning the raw FRSQRTE result when Reciprocal is true, or when
refinement steps are required.
[13 lines not shown]
[RISCV] Fix ssub_sat cost model to use signed VSSUB instead of VSSUBU (#188195)
Intrinsic::ssub_sat was incorrectly mapped to RISCV::VSSUBU_VV (unsigned
saturating subtract) instead of RISCV::VSSUB_VV (signed saturating
subtract), causing wrong cost estimates.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply at anthropic.com>
[flang][acc] Handle fir.undefined with OutlineRematerializationOpInterface in OffloadLiveInValueCanonicalization (#188325)
Example:
```fortran
!$ACC KERNELS PRESENT(CG, W1)
CG(1:W1%WDES1%NPL, NN) = W1%CPTWFP(1:W1%WDES1%NPL)
CPROJ(:, NN) = W1%CPROJ(1:SIZE(CPROJ,1))
!$ACC END KERNELS
```
When compiling OpenACC kernels containing array section assignments of
rank-2 arrays with a scalar index in one dimension (e.g. `CG(1:NPL,
NN)`), the Fortran lowering creates a `fir.slice` where collapsed
(scalar) dimensions use `fir.undefined index` as the stop/step values.
`SliceOp::getOutputRank()` relies on `getDefiningOp()` returning
`fir::UndefOp` to identify these collapsed dimensions and compute the
correct output rank.
When `fir.undefined` values defined outside an offload region are used
[15 lines not shown]
[libc] Wrong guards for `totalorderbf16` and `totalordermagbf16` (#188241)
Currently the guards for `totalorderbf16` and `totalordermagbf16` are as
follows:
```
#ifndef LLVM_LIBC_SRC_MATH_TOTALORDERMAGF16_H
#define LLVM_LIBC_SRC_MATH_TOTALORDERMAGF16_H
-
#endif // LLVM_LIBC_SRC_MATH_TOTALORDERMAGF16_H
```
and
```
#ifndef LLVM_LIBC_SRC_MATH_TOTALORDERF16_H
#define LLVM_LIBC_SRC_MATH_TOTALORDERF16_H
-
#endif // LLVM_LIBC_SRC_MATH_TOTALORDERF16_H
```
As we can see these are for F16 and not BF16 .
This Pr intends to fix that with correct guards as `TOTALORDERBF16` and
`TOTALORDERMAGBF16`
[lld][WebAssembly] Propagate +atomics for ThinLTO when using --shared-memory (#188381)
When compiling WebAssembly with ThinLTO, functions are partitioned into
isolated `.bc` modules and dispatched to individual LTO backend threads.
During code generation, the `CoalesceFeaturesAndStripAtomics` pass
iterates over the module to gather the union of target features (like
`+atomics`) attached to defined functions. In particular when not using
threads, it lowers away atomics and TLS variables to their
single-threaded equivalents.
However, if a partitioned module only contains globally defined TLS
variables (e.g. there are no functions, or all functions were fully
inlined or stripped by dropDeadSymbols before ThinLTO optimization), the
module becomes completely devoid of function definitions. The coalescing
pass then falls back to fetching features from the `TargetMachine`.
Because in LTO the `TargetMachine` defaults to a generic target without
atomics enabled, the TLS is lowered away and the `wasm-feature-atomics`
flag is omitted from the resulting ThinLTO object partition, causing
`wasm-ld` to immediately reject it.
[8 lines not shown]
[clang-tidy] Add missing #include insertion in macros for modernize-use-std-format (#188247)
Added missing ``#include`` insertion when the format function call
appears as an argument to a macro.
Part of #175183
---------
Co-authored-by: Victor Chernyakin <chernyakin.victor.j at outlook.com>
[SandboxIR][Tracker] Test UncondBrInst CondBrInst setters (#187549)
This checks the `setCondition()` and `setSuccessor()` setters introduced
in #187196.
[lldb] use the Py_REFCNT() macro instead of directly accessing member (#188161)
[PyObject members are not to be accessed
directly](https://docs.python.org/3/c-api/structures.html#c.PyObject),
but rather through macros, in this case `Py_REFCNT()`.
In most, ie Global Interpreter Lock-enabled, CPython cases,
`Py_REFCNT()` expands to accessing `ob_refcnt` anyway. However, in a
free-threaded CPython, combined with disabling the limited API (since it
requires the GIL for now), the direct member does not exist, causing the
build to fail. The macro expands to the correct access method in the
free-threaded configuration.
[libc++] Fix type confusion in hash_{,multi}map
The type `__gnu_cxx::hash_{,multi}map` creates objects of type
`std::pair<Key, Value>` and returns pointers to them of type
`std::pair<const Key, Value>`. If either `Key` or `Value` are
non-standard-layout, this is UB, and is furthermore considered by
pointer field protection to be a type confusion, which leads to a
program crash. Fix it by using the correct type for the pair's storage
and using const_cast to form a pointer to the key in the one place where
that is needed.
Reviewers: ldionne
Reviewed By: ldionne
Pull Request: https://github.com/llvm/llvm-project/pull/183223
[mlir][mem2reg] Process direct uses inside other regions. (#188359)
We need to add the regions with the direct uses into the list
for processing, otherwise the direct uses will not be removed
and will use the slot after the promotion.
The added LIT test was triggering "after promotion, the slot pointer
should not be used anymore" assertion.
[lldb][DWARFASTParserClang] Handle pointer-to-member-data non-type template (#187598)
## Description
### Problem
MakeAPValue in DWARFASTParserClang.cpp did not handle
pointer-to-member-data non-type template parameters (e.g., template <int
S::*P>), causing LLDB to produce incorrect results or crash.
DWARF encodes pointer-to-member-data NTTPs as
`DW_TAG_template_value_parameter` with a `DW_AT_const_value`
representing the byte offset of the member within the containing struct.
MakeAPValue is responsible for converting this value into a clang
APValue, but it only handled integer/enum and floating-point types. For
pointer-to-member types, it returned `std::nullopt`.
This caused the caller (ParseTemplateDIE) to fall back to creating a
type-only TemplateArgument (kind=Type) instead of a value-carrying one.
When two specializations differ only by which member they point to
[209 lines not shown]