[Object][ELF] Pass Error to WarningHandler
Warning consumers may need to handle errors based on their type. Pass
the Error object instead of a string representation to enable this. This
also brings WarningHandler in line with Support/WithColor.h.
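A minimal sketch of why passing the error object helps, assuming a toy error hierarchy. `DiagBase`, `TruncatedSectionDiag`, `BadStringTableDiag` and `collectWarnings` are hypothetical stand-ins, not LLVM's `Error` or the actual `WarningHandler` signature:

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are not LLVM's Error or WarningHandler types.
struct DiagBase {
  virtual ~DiagBase() = default;
  virtual std::string message() const = 0;
};

struct TruncatedSectionDiag : DiagBase {
  std::string message() const override { return "section is truncated"; }
};

struct BadStringTableDiag : DiagBase {
  std::string message() const override { return "invalid string table"; }
};

// The handler receives the error object itself, so it can branch on its type.
using WarningHandler = std::function<void(std::unique_ptr<DiagBase>)>;

// Reports two warnings through a handler that drops one kind by type and
// records the message of everything else.
std::vector<std::string> collectWarnings() {
  std::vector<std::string> seen;
  WarningHandler handler = [&seen](std::unique_ptr<DiagBase> diag) {
    assert(diag && "handler expects a non-null diagnostic");
    if (dynamic_cast<TruncatedSectionDiag *>(diag.get()))
      return; // type-based filtering, impossible with a plain string
    seen.push_back(diag->message());
  };
  handler(std::make_unique<TruncatedSectionDiag>());
  handler(std::make_unique<BadStringTableDiag>());
  return seen;
}
```

With a string-only handler, the same filtering would have to match on message text.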
[LV] Fix the cost of first order recurrence splice (#192473)
The index had the wrong sign (for splice.right, the sign is negative),
which meant that the cost of a splice.left operation was calculated. For
SVE this makes a difference because a splice.left is lowered using an
unpredicated EXT instruction, whereas a splice.right is lowered using a
predicated SPLICE instruction, which needs a slightly higher cost.
The change in `reduction-recurrence-costs-sve.ll` happens because the
vector loop is now less profitable (higher cost) and therefore requires
a higher trip-count to be profitable (hence the extra umax).
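To illustrate why the sign of the index matters, here is a scalar model of a splice on two equal-length vectors. It is a sketch only: `vectorSplice` is a hypothetical helper, and mapping a negative offset to splice.right follows the statement above rather than any LLVM header:

```cpp
#include <cassert>
#include <vector>

// Scalar model of a vector splice on two equal-length vectors.
// offset >= 0: take n elements of concat(a, b) starting at index offset.
// offset <  0: take the last -offset elements of a, then leading elements of b.
std::vector<int> vectorSplice(const std::vector<int> &a,
                              const std::vector<int> &b, int offset) {
  assert(a.size() == b.size());
  const int n = static_cast<int>(a.size());
  assert(offset >= -n && offset < n);
  const int start = offset >= 0 ? offset : n + offset;
  std::vector<int> concat(a);
  concat.insert(concat.end(), b.begin(), b.end());
  return std::vector<int>(concat.begin() + start, concat.begin() + start + n);
}
```

A first-order recurrence needs the last element of the previous vector followed by the leading elements of the current one, which is the offset `-1` case. An offset of `+1` selects different elements and, per the message above, a different instruction on SVE.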
[LifetimeSafety] Handle xvalue operand of LValueToRValue cast (#192312)
Under C++23, P2266 wraps the operand of `return p;` in an xvalue NoOp
cast for by-value parameters. The `CK_LValueToRValue` branch in
FactsGenerator guarded on `!SubExpr->isLValue()`, breaking origin flow
and silencing the suggestion for `int* id(int* p) { return p; }`.
Use `isGLValue()`, matching how origins are built and stripped elsewhere
in the analysis.
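The difference between the two predicates can be seen with standard type traits, since `decltype((e))` is `T&` for an lvalue, `T&&` for an xvalue and `T` for a prvalue. This is an illustration of the value categories only, not of the FactsGenerator code; `param` and the trait wrappers are hypothetical names:

```cpp
#include <type_traits>
#include <utility>

// decltype((e)) encodes the value category of e in the reference kind.
template <typename T> struct IsLValue : std::is_lvalue_reference<T> {};
template <typename T> struct IsXValue : std::is_rvalue_reference<T> {};
template <typename T>
struct IsGLValue
    : std::integral_constant<bool, IsLValue<T>::value || IsXValue<T>::value> {};

int *param = nullptr; // stands in for the by-value parameter `p`

// `param` names an object: lvalue.
constexpr bool plainIsLValue = IsLValue<decltype((param))>::value;
// `std::move(param)` models the xvalue wrapper: not an lvalue, but a glvalue.
constexpr bool movedIsLValue = IsLValue<decltype((std::move(param)))>::value;
constexpr bool movedIsGLValue = IsGLValue<decltype((std::move(param)))>::value;
// `param + 0` is a prvalue: neither lvalue nor glvalue.
constexpr bool sumIsGLValue = IsGLValue<decltype((param + 0))>::value;
```

A guard written in terms of "is an lvalue" rejects the xvalue case, while one written in terms of "is a glvalue" accepts both.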
Only add a RUN line to the suggestion test file, since some tests in
`warn-lifetime-safety.cpp` cause a hard error under C++23. For example:
`MyObj& f() { MyObj s; return s; }`. `error: non-const lvalue reference
to type 'MyObj' cannot bind to a temporary of type 'MyObj'`.
Fixes: #176292
[Flang] Adding first lowering for the allocation and deallocation of coarrays (#182110)
This PR adds support for coarray allocation and deallocation in Flang and
adds two new operations to MIF:
- `mif::AllocaCoarrayOp` : Allocates a coarray
using `prif_allocate_coarray` PRIF procedure.
- `mif::DeallocaCoarrayOp` : Deallocates a coarray
using `prif_deallocate_coarray` PRIF procedure.
This PR does not yet handle allocation for the following cases (which
will be added in future PRs):
- Coarrays with ALLOCATABLE and/or POINTER components (PRIF has
procedures (`prif_(de)allocate`) for this).
- Coarray dummy arguments (PRIF also has procedures for this)
- Finalization of coarrays
- non-ALLOCATABLE SAVE coarrays outside the scoping unit of the main
program (e.g. non-ALLOCATABLE coarrays declared in a module or a
procedure)
[5 lines not shown]
[mlir][arith] Add rounding mode flags to binary arithmetic operations (#188458)
Add rounding mode flags for `addf`, `subf`, `mulf`, `divf`. This
addresses a TODO in the op description.
The folder now takes into account the specified rounding mode. If no
rounding mode is specified, the folders/canonicalizations default to
`rmNearestTiesToEven`. (This behavior has not changed.) This is
documented in the top-level arith dialect documentation. The default
arith rounding mode applies only to "internal" transformations such as
foldings/canonicalizations. In case of an unspecified explicit rounding
mode, the runtime behavior is up to the target backend.
Also add a lowering to LLVM intrinsics such as
`llvm.intr.experimental.constrained.fadd`.
Assisted-by: claude-4.6-opus-high
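To show why a folder has to respect the rounding mode, the sketch below adds two floats under a chosen mode using the host `<cfenv>` environment. This is a stand-in for illustration and is not how the MLIR folder is implemented; `addWithRounding` is a hypothetical helper, and the example assumes the host supports `FE_UPWARD` and honours `fesetround` for `float` addition:

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Adds two floats under the given <cfenv> rounding mode (e.g. FE_TONEAREST),
// then restores the previous mode.
float addWithRounding(float lhs, float rhs, int mode) {
  const int saved = std::fegetround();
  const int rc = std::fesetround(mode);
  assert(rc == 0 && "rounding mode not supported on this host");
  (void)rc;
  // volatile keeps the compiler from folding the add at compile time or
  // moving it across the fesetround calls.
  volatile float a = lhs;
  volatile float b = rhs;
  volatile float sum = a + b;
  std::fesetround(saved);
  return sum;
}
```

For example, `1.0f + 2^-25` lies less than half an ulp above `1.0f`, so it rounds to `1.0f` under round-to-nearest-even but to the next representable float under round-upward. A fold that ignored the requested mode would produce the wrong constant.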
[LV] Replace "BinOp" with "ExtendedOp" in partial reduction transforms (NFCI) (#192422)
"BinOp" has not been accurate for a while (as it's sometimes just an
extend). After #188043, it can now also be an "abs" in some cases.
This patch renames "BinOp" to "ExtendedOp" (in line with
matchExtendedReductionOperand). It also updates some doc comments and
tweaks matching the "ExtendedOp" in transformToPartialReduction.
[MLIR] make One-Shot and SCF bufferization TensorLikeType-aware (#189073)
Fix bufferization inconsistencies between builtin tensor types and
custom TensorLikeType implementations across One-Shot analysis/module
paths and SCF bufferization interfaces.
The main issue was a mix of TensorType/RankedTensorType checks in places
that need TensorLikeType-aware handling. This could leave
function-boundary equivalence/aliasing incomplete for custom tensor-like
types, leading to spurious SCF loop equivalence verification failures.
This change:
- switches relevant One-Shot analysis/module checks from TensorType/
RankedTensorType to TensorLikeType;
- updates generic/default aliasing utilities to treat TensorLikeType
consistently;
- updates SCF BufferizableOpInterface implementations
(for/while/if/yield related paths) to use TensorLikeType/BufferLikeType
where appropriate;
[16 lines not shown]
[mlir][vector] Fold poison operands into vector.shuffle mask (#190932)
Fold poison operands into the `vector.shuffle` mask. This commit also
splits up the `vector::ShuffleOp::fold` implementation into multiple
helper functions.
Assisted-by: claude-4.6-opus-high
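A rough model of the fold on a plain index mask, assuming that mask entries index into the concatenation of the two operands and that `-1` marks a poison lane. `foldPoisonOperandsIntoMask` and `kPoison` are hypothetical names, not the helpers introduced by this commit:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::int64_t kPoison = -1; // assumed marker for a poison lane

// Rewrites every mask entry that selects from a poison operand to kPoison.
// Entries below v1Size select from the first operand, the rest from the second.
std::vector<std::int64_t>
foldPoisonOperandsIntoMask(std::vector<std::int64_t> mask, std::int64_t v1Size,
                           std::int64_t v2Size, bool v1IsPoison,
                           bool v2IsPoison) {
  for (std::int64_t &idx : mask) {
    if (idx == kPoison)
      continue; // already poison
    assert(idx >= 0 && idx < v1Size + v2Size);
    const bool fromV1 = idx < v1Size;
    if ((fromV1 && v1IsPoison) || (!fromV1 && v2IsPoison))
      idx = kPoison;
  }
  return mask;
}
```

Once the mask no longer refers to a poison operand, that operand is unused by the shuffle.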
[NFC][mlir][shard] Unify MoveLastSplitAxisPattern/MoveLastSplitAxisPattern (#192295)
Made MoveLastSplitAxisPattern more general to also cover MoveLastSplitAxisPattern.
Less code, same functionality.
Assisted by claude.
[LV][RISCV] Fix incorrect pointer operand in interleaved access tests. nfc (#192464)
In some load cases, the index 1 member used the same pointer as the
index 0 member. This patch corrected the pointer use.
[CFI] Extract DropTypeTestsPass from LowerTypeTestsPass (#192578)
This patch introduces `DropTypeTestsPass` as a dedicated pass
to handle the dropping of type tests. Previously, this was handled
by `LowerTypeTestsPass` with a specific parameter.
By splitting this into its own pass, we simplify the pass pipeline
construction and make the intent clearer in `PassRegistry.def` and
various pipeline builders.
It's almost NFC, apart from opt command-line changes.
[libc][nfc] Fix ucontext buildbot failure with noexcept (#192343) (#192601)
Added noexcept to getcontext and setcontext declarations and definitions
to resolve missing attribute warning on aliases.
This fixes failures on builders using GCC like
libc-x86_64-debian-gcc-fullbuild-dbg.
[AMDGPU][ASAN] Move allocas to entry block in amdgpu-sw-lower-lds pass (#190772)
The `amdgpu-sw-lower-lds` pass inserts a workitem-0 check, malloc, and
barrier before the original entry block, creating a new entry block.
This pushes the original allocas into a non-entry block, causing LLVM to
treat them as dynamic allocas.
The AMDGPU backend generates incorrect flat addresses for dynamic alloca
addrspacecasts at -O0, causing memory faults when ASan is enabled with
LDS.
This PR hoists constant-size allocas to the new entry block so they
remain static.
[AMDGPU] Report only local per-function resource usage when object linking is enabled
With object linking the linker aggregates resource usage across TUs via
`.amdgpu.info`, so compile-time pessimism and call-graph propagation duplicate
the linker's work or pollute its inputs.
In this mode, skip the per-callsite conservative bumps in
`AMDGPUResourceUsageAnalysis` and assign each resource symbol in
`AMDGPUMCResourceInfo` a concrete local constant instead of building call-graph
max/or expressions.
[lldb] Add synthetic variable support to Get*VariableList.
This patch adds a new flag to the lldb_private::StackFrame API to get variable lists: `include_synthetic_vars`. This allows ScriptedFrame (and other future synthetic frames) to construct 'fake' variables and return them in the VariableList, so that commands like `fr v` and `SBFrame::GetVariables` can show them to the user as requested.
This patch includes all changes necessary to call the API the new way - I tried to use my best judgement on when to include synthetic variables or not and leave comments explaining the decision.
As a consequence of producing synthetic variables, this patch means that ScriptedFrame can produce Variable objects with ValueType that contains a ValueTypeExtendedMask in a high bit. This necessarily complicates some of the switch/case handling in places where we would expect to find such variables, and this patch makes a best effort to address all such cases as well. From experience, they tend to show up whenever we're checking if a Variable is in a specified scope, which means we basically have to check the high bit against some user input saying "yes/no synthetic variables".
stack-info: PR: https://github.com/llvm/llvm-project/pull/181501, branch: users/bzcheeseman/stack/9
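The high-bit check described above can be sketched as follows. All names and numeric values here (`kLocal`, `kExtendedMask`, `matchesScope`) are hypothetical and not taken from the lldb headers; only the idea of masking off a high bit before comparing scopes comes from the description:

```cpp
#include <cstdint>

// Hypothetical encodings; not copied from lldb.
constexpr std::uint32_t kLocal = 4;              // some base value type
constexpr std::uint32_t kExtendedMask = 1u << 31; // high bit marks synthetic

// Strips the synthetic marker so the base type can be compared directly.
constexpr std::uint32_t baseType(std::uint32_t vt) {
  return vt & ~kExtendedMask;
}

constexpr bool isSynthetic(std::uint32_t vt) {
  return (vt & kExtendedMask) != 0;
}

// True if the variable is in the wanted scope and synthetic variables are
// either allowed or not involved.
constexpr bool matchesScope(std::uint32_t vt, std::uint32_t wantedBase,
                            bool includeSynthetic) {
  return (includeSynthetic || !isSynthetic(vt)) && baseType(vt) == wantedBase;
}
```

A plain `switch` on the raw value would miss the synthetic variants, which is why the mask has to be handled before the scope comparison.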
[clang][bytecode] Mark pointers destroyed in destructors (#192460)
We didn't use to do this at all, so calling the destructor explicitly
twice in a row wasn't an error. Calling it and accessing the object
afterwards wasn't an error either.
[LoongArch] Add support for vector FP_ROUND from vxf64 to vxf32
In LoongArch, [x]vfcvt.s.d instructions require two vector registers
for v4f64->v4f32, v8f64->v8f32 conversions.
This patch handles these cases:
- For FP_ROUND v2f64->v2f32 (illegal), add a customized v2f32 widening
to convert it into a target-specific LoongArchISD::VFCVT.
- For FP_ROUND v4f64->v4f32, on LSX platforms, v4f64 is illegal and will
be split into two v2f64->v2f32, resulting in two LoongArchISD::VFCVT.
Finally, they are combined into a single node when combining
LoongArchISD::VPACKEV. On LASX platforms, v4f64->v4f32 can directly
lower to vfcvt.s.d in lowerFP_ROUND.
- For FP_ROUND v8f64->v8f32, on LASX platforms, v8f64 is illegal and
will be split into two v4f64->v4f32 and then combined using
ISD::CONCAT_VECTORS, so xvfcvt.s.d is generated during its
combination.