LLVM/project bcc606cmlir/lib/Dialect/Shard/Transforms Partition.cpp

[NFC][mlir][shard] Unify MoveLastSplitAxisPattern/MoveLastSplitAxisPattern (#192295)

Made MoveLastSplitAxisPattern more general to also cover MoveLastSplitAxisPattern.
Less code, same functionality.
Assisted by claude.
DeltaFile
+7-99mlir/lib/Dialect/Shard/Transforms/Partition.cpp
+7-991 files

LLVM/project 8671b79llvm/test/Transforms/LoopVectorize/RISCV tail-folding-interleave.ll

[LV][RISCV] Fix incorrect pointer operand in interleaved access tests. nfc (#192464)

In some load cases, the index 1 member used the same pointer as the
index 0 member. This patch corrected the pointer use.
DeltaFile
+40-47llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-interleave.ll
+40-471 files

LLVM/project 10536d4clang/lib/CodeGen BackendUtil.cpp, llvm/include/llvm/Transforms/IPO LowerTypeTests.h

[CFI] Extract DropTypeTestsPass from LowerTypeTestsPass (#192578)

This patch introduces `DropTypeTestsPass` as a dedicated pass
to handle the dropping of type tests. Previously, this was handled
by `LowerTypeTestsPass` with a specific parameter.

By splitting this into its own pass, we simplify the pass pipeline
construction and make the intent clearer in `PassRegistry.def` and
various pipeline builders.

It's almost NFC, apart from the opt command line changes.
DeltaFile
+50-47llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+16-8llvm/include/llvm/Transforms/IPO/LowerTypeTests.h
+8-15llvm/lib/Passes/PassBuilderPipelines.cpp
+20-0llvm/lib/Passes/PassBuilder.cpp
+1-4clang/lib/CodeGen/BackendUtil.cpp
+5-0llvm/lib/Passes/PassRegistry.def
+100-7410 files not shown
+112-8616 files

LLVM/project b4e75e1libc/src/ucontext getcontext.h setcontext.h, libc/src/ucontext/x86_64 getcontext.cpp setcontext.cpp

[libc][nfc] Fix ucontext buildbot failure with noexcept (#192343) (#192601)

Added noexcept to the getcontext and setcontext declarations and definitions
to resolve a missing-attribute warning on aliases.

This fixes failures on builders that use GCC, such as
libc-x86_64-debian-gcc-fullbuild-dbg.
DeltaFile
+2-1libc/src/ucontext/x86_64/getcontext.cpp
+1-1libc/src/ucontext/getcontext.h
+1-1libc/src/ucontext/setcontext.h
+1-1libc/src/ucontext/x86_64/setcontext.cpp
+5-44 files

LLVM/project 19463aallvm/lib/DebugInfo/DWARF DWARFCFIPrinter.cpp

[llvm][DebugInfo] Use formatv in DWARFCFIPrinter (#191982)

This relates to #35980.
DeltaFile
+10-10llvm/lib/DebugInfo/DWARF/DWARFCFIPrinter.cpp
+10-101 files

LLVM/project ede75e5clang/lib/AST/ByteCode Interp.cpp, clang/test/AST/ByteCode cxx20.cpp

[clang][bytecode] Don't diagnose const assignments... (#192593)

... when we're in CPCE mode.
DeltaFile
+9-0clang/test/AST/ByteCode/cxx20.cpp
+5-3clang/lib/AST/ByteCode/Interp.cpp
+14-32 files

LLVM/project 81af175llvm/tools/llvm-readobj ELFDumper.cpp

fixup! [Object][ELF] Pass Error to WarningHandler
DeltaFile
+32-34llvm/tools/llvm-readobj/ELFDumper.cpp
+32-341 files

LLVM/project 218e747clang/lib/CIR/CodeGen CIRGenBuiltinAMDGPU.cpp, clang/test/CIR/CodeGenHIP builtins-amdgcn-logb-scalbn.hip

[CIR][AMDGPU] Fix FltSemantics, naming convention, and CIR APIs
DeltaFile
+52-61clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp
+12-0clang/test/CIR/CodeGenHIP/builtins-amdgcn-logb-scalbn.hip
+64-612 files

LLVM/project 95b516eclang/lib/CIR/CodeGen CIRGenBuiltinAMDGPU.cpp, clang/test/CIR/CodeGenHIP builtins-amdgcn-logb-scalbn.hip

[CIR][AMDGPU] Adds amdgcn logb and scalebn builtins
DeltaFile
+89-10clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp
+42-0clang/test/CIR/CodeGenHIP/builtins-amdgcn-logb-scalbn.hip
+131-102 files

LLVM/project 02bee9aclang/lib/CIR/CodeGen CIRGenBuiltin.cpp CIRGenBuiltinAMDGPU.cpp, clang/test/CIR/CodeGenHIP builtins-amdgcn-logb-scalbn.hip

[CIR][AMDGPU] Fix constrained FP and library calls path
DeltaFile
+26-1clang/lib/CIR/CodeGen/CIRGenBuiltin.cpp
+9-17clang/test/CIR/CodeGenHIP/builtins-amdgcn-logb-scalbn.hip
+2-3clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp
+3-0clang/lib/CIR/CodeGen/TargetInfo.h
+2-0clang/lib/CIR/CodeGen/TargetInfo.cpp
+42-215 files

LLVM/project fca80b4llvm/lib/Target/AMDGPU AMDGPUSwLowerLDS.cpp, llvm/test/CodeGen/AMDGPU amdgpu-sw-lower-lds-static-alloca-placement.ll

[AMDGPU][ASAN] Move allocas to entry block in amdgpu-sw-lower-lds pass (#190772)

The `amdgpu-sw-lower-lds` pass inserts a workitem-0 check, malloc, and
barrier before the original entry block, creating a new entry block.
This pushes the original allocas into a non-entry block, causing LLVM to
treat them as dynamic allocas.

The AMDGPU backend generates incorrect flat addresses for dynamic alloca
addrspacecasts at -O0, causing memory faults when ASan is enabled with
LDS.

This PR hoists constant-size allocas to the new entry block so they
remain static.
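
The hoisting step described above can be sketched on a toy IR. This is only an illustration under stated assumptions: `Inst`, `Block`, and `hoistStaticAllocas` are hypothetical names, not the pass's real data structures, and placing the hoisted allocas at the front of the new entry block is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy instruction: just enough state to tell allocas apart.
struct Inst {
  std::string name;
  bool isAlloca;
  bool constantSize;
};
using Block = std::vector<Inst>;

// Moves every constant-size alloca from oldEntry to the front of newEntry,
// preserving relative order. Variable-size allocas stay where they are.
void hoistStaticAllocas(Block &newEntry, Block &oldEntry) {
  Block hoisted, remaining;
  for (const Inst &I : oldEntry) {
    if (I.isAlloca && I.constantSize)
      hoisted.push_back(I);
    else
      remaining.push_back(I);
  }
  newEntry.insert(newEntry.begin(), hoisted.begin(), hoisted.end());
  oldEntry = remaining;
}
```

After the call, the constant-size allocas live in the block that is now the function entry, which is the condition for treating them as static.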
DeltaFile
+61-0llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-static-alloca-placement.ll
+14-1llvm/lib/Target/AMDGPU/AMDGPUSwLowerLDS.cpp
+75-12 files

LLVM/project 9854bf4llvm/lib/Target/AMDGPU AMDGPUMCResourceInfo.cpp AMDGPUResourceUsageAnalysis.cpp, llvm/test/CodeGen/AMDGPU object-linking-local-resources.ll lds-link-time-codegen-indirect.ll

[AMDGPU] Report only local per-function resource usage when object linking is enabled

With object linking the linker aggregates resource usage across TUs via
`.amdgpu.info`, so compile-time pessimism and call-graph propagation duplicate
the linker's work or pollute its inputs.

In this mode, skip the per-callsite conservative bumps in
`AMDGPUResourceUsageAnalysis` and assign each resource symbol in
`AMDGPUMCResourceInfo` a concrete local constant instead of building call-graph
max/or expressions.
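
As a rough illustration of the difference between the two modes, the sketch below reports either a function's own usage or the maximum over its callees. `FunctionUsage`, `localRegs`, and `reportedRegs` are hypothetical names, the call graph is assumed to be acyclic, and the real code builds MC expressions rather than computing integers.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-function record: local register usage plus callee indices.
struct FunctionUsage {
  int localRegs;
  std::vector<int> callees;
};

// With object linking, only the local value is reported and the linker is
// expected to aggregate. Otherwise the value is the max over the call graph.
// Assumes the call graph has no cycles.
int reportedRegs(const std::vector<FunctionUsage> &fns, int f,
                 bool objectLinking) {
  int result = fns[f].localRegs;
  if (objectLinking)
    return result;
  for (int callee : fns[f].callees)
    result = std::max(result, reportedRegs(fns, callee, false));
  return result;
}
```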
DeltaFile
+104-0llvm/test/CodeGen/AMDGPU/object-linking-local-resources.ll
+26-8llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.cpp
+10-1llvm/lib/Target/AMDGPU/AMDGPUResourceUsageAnalysis.cpp
+4-0llvm/lib/Target/AMDGPU/AMDGPUMCResourceInfo.h
+1-1llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-indirect.ll
+145-105 files

LLVM/project 2d70192lldb/include/lldb/Target StackFrame.h, lldb/source/API SBFrame.cpp

[lldb] Add synthetic variable support to Get*VariableList.

This patch adds a new flag to the lldb_private::StackFrame API to get variable lists: `include_synthetic_vars`.  This allows ScriptedFrame (and other future synthetic frames) to construct 'fake' variables and return them in the VariableList, so that commands like `fr v` and `SBFrame::GetVariables` can show them to the user as requested.

This patch includes all changes necessary to call the API the new way - I tried to use my best judgement on when to include synthetic variables or not and leave comments explaining the decision.

A consequence of producing synthetic variables is that ScriptedFrame can produce Variable objects whose ValueType contains a ValueTypeExtendedMask in a high bit. This necessarily complicates some of the switch/case handling in places where we would expect to find such variables, and this patch makes a best effort to address all such cases as well. From experience, they tend to show up whenever we check whether a Variable is in a specified scope, which means we basically have to check the high bit against some user input saying "yes/no synthetic variables".
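
The scope check described above might look roughly like the following. The constants and function names are hypothetical stand-ins: the actual LLDB enumerator values and the bit position of the extended mask are not given in the commit message and may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical values; the real LLDB enumerators may differ.
constexpr uint32_t kExtendedMask = 1u << 31;  // assumed "synthetic" high bit
constexpr uint32_t kScopeArgument = 3;        // assumed base scope value
constexpr uint32_t kScopeLocal = 4;           // assumed base scope value

// Clears the extended bit to recover the base scope.
constexpr uint32_t baseScope(uint32_t valueType) {
  return valueType & ~kExtendedMask;
}

constexpr bool isSynthetic(uint32_t valueType) {
  return (valueType & kExtendedMask) != 0;
}

// True when a variable should be listed for the requested scope, given the
// caller's choice about synthetic variables.
constexpr bool shouldInclude(uint32_t valueType, uint32_t wantedScope,
                             bool includeSyntheticVars) {
  if (isSynthetic(valueType) && !includeSyntheticVars)
    return false;
  return baseScope(valueType) == wantedScope;
}
```

Masking before comparing keeps a plain `switch` on the base scope usable, while the flag decides separately whether synthetic entries are visible.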

stack-info: PR: https://github.com/llvm/llvm-project/pull/181501, branch: users/bzcheeseman/stack/9
DeltaFile
+42-11lldb/source/API/SBFrame.cpp
+44-8lldb/source/Plugins/Process/scripted/ScriptedFrame.cpp
+31-11lldb/source/Commands/CommandObjectFrame.cpp
+27-8lldb/test/API/functionalities/scripted_frame_provider/TestScriptedFrameProvider.py
+16-2lldb/source/Target/StackFrame.cpp
+16-0lldb/include/lldb/Target/StackFrame.h
+176-407 files not shown
+217-5613 files

LLVM/project ab94dbcclang/lib/AST/ByteCode Interp.cpp Pointer.cpp, clang/test/AST/ByteCode cxx20.cpp

[clang][bytecode] Mark pointers destroyed in destructors (#192460)

We didn't use to do this at all, so calling the destructor explicitly
twice in a row wasn't an error. Calling it and accessing the object
afterwards wasn't an error either.
DeltaFile
+39-23clang/lib/AST/ByteCode/Interp.cpp
+13-25clang/lib/AST/ByteCode/Pointer.cpp
+19-8clang/test/AST/ByteCode/cxx20.cpp
+3-0clang/lib/AST/ByteCode/Compiler.cpp
+1-0clang/lib/AST/ByteCode/Descriptor.h
+1-0clang/lib/AST/ByteCode/Pointer.h
+76-562 files not shown
+78-568 files

LLVM/project 8398672llvm/test/CodeGen/LoongArch/lasx/ir-instruction fptrunc.ll, llvm/test/CodeGen/LoongArch/lsx/ir-instruction fptrunc.ll

[LoongArch][NFC] Pre-commit tests for vector fptrunc from vxf64 to vxf32 (#164058)
DeltaFile
+117-0llvm/test/CodeGen/LoongArch/lasx/ir-instruction/fptrunc.ll
+112-0llvm/test/CodeGen/LoongArch/lsx/ir-instruction/fptrunc.ll
+229-02 files

LLVM/project 2487d43llvm/lib/Target/LoongArch LoongArchISelLowering.cpp

stricter restrictions on original types
DeltaFile
+5-4llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+5-41 files

LLVM/project e1878d7llvm/lib/Target/LoongArch LoongArchISelLowering.cpp LoongArchLSXInstrInfo.td, llvm/test/CodeGen/LoongArch/lasx/ir-instruction fptrunc.ll

[LoongArch] Add support for vector FP_ROUND from vxf64 to vxf32

In LoongArch, [x]vfcvt.s.d instructions require two vector registers
for v4f64->v4f32, v8f64->v8f32 conversions.

This patch handles these cases:
- For FP_ROUND v2f64->v2f32 (illegal), add a customized v2f32 widening
  to convert it into a target-specific LoongArchISD::VFCVT.
- For FP_ROUND v4f64->v4f32, on LSX platforms, v4f64 is illegal and will
  be split into two v2f64->v2f32, resulting in two LoongArchISD::VFCVT
  nodes. These are then merged into a single node when combining
  LoongArchISD::VPACKEV. On LASX platforms, v4f64->v4f32 can be lowered
  directly to vfcvt.s.d in lowerFP_ROUND.
- For FP_ROUND v8f64->v8f32, on LASX platforms, v8f64 is illegal and
  will be split into two v4f64->v4f32, which are then combined using
  ISD::CONCAT_VECTORS, so xvfcvt.s.d is generated when that node is
  combined.
DeltaFile
+131-0llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+10-38llvm/test/CodeGen/LoongArch/lasx/ir-instruction/fptrunc.ll
+5-22llvm/test/CodeGen/LoongArch/lsx/ir-instruction/fptrunc.ll
+7-0llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+2-0llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
+1-1llvm/lib/Target/LoongArch/LoongArchISelLowering.h
+156-616 files

LLVM/project 2acb741llvm/lib/Target/LoongArch LoongArchISelLowering.cpp, llvm/test/CodeGen/LoongArch/lasx/ir-instruction fptrunc.ll

Fixes according to reviews
DeltaFile
+4-27llvm/test/CodeGen/LoongArch/lasx/ir-instruction/fptrunc.ll
+2-12llvm/test/CodeGen/LoongArch/lsx/ir-instruction/fptrunc.ll
+9-2llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+15-413 files

LLVM/project 7cb8a80llvm/lib/Target/LoongArch LoongArchLASXInstrInfo.td LoongArchLSXInstrInfo.td, llvm/lib/Target/LoongArch/AsmParser LoongArchAsmParser.cpp

Address wanglei's comments
DeltaFile
+12-24llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
+12-24llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+1-4llvm/lib/Target/LoongArch/LoongArchInstrInfo.td
+0-1llvm/lib/Target/LoongArch/AsmParser/LoongArchAsmParser.cpp
+25-534 files

LLVM/project 85f485ellvm/lib/Target/LoongArch LoongArchLSXInstrInfo.td LoongArchLASXInstrInfo.td, llvm/test/CodeGen/LoongArch/lsx/ir-instruction and.ll xor.ll

[LoongArch] Select `V{AND,OR,XOR,NOR}I.B` for bitwise with byte splat immediates

The `V{AND,OR,XOR,NOR}I.B` instructions operate on byte elements and accept
an 8-bit immediate. However, when the same byte splat constant is used with
wider vector element types (e.g. v8i16, v4i32, v2i64), instruction selection
currently falls back to materializing the constant in a temporary register.

```
vrepli.b  -1
vxor.v
```

even though the immediate form is available:

```
vxori.b 255
```

This happens because selectVSplatImm requires the splat bit width to match

    [11 lines not shown]
DeltaFile
+29-2llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+24-0llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
+3-6llvm/test/CodeGen/LoongArch/lsx/ir-instruction/and.ll
+3-6llvm/test/CodeGen/LoongArch/lsx/ir-instruction/xor.ll
+3-6llvm/test/CodeGen/LoongArch/lsx/ir-instruction/or.ll
+3-6llvm/test/CodeGen/LoongArch/lsx/ir-instruction/icmp.ll
+65-2614 files not shown
+102-8520 files
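
For the byte-splat commit above, the core observation is that a wider element constant such as 0xFFFF (v8i16) is made of identical bytes, so a byte-wise immediate form can express it. A minimal sketch of that check follows. `isByteSplat` is a hypothetical helper, not the actual `selectVSplatImm` logic, which operates on DAG nodes.

```cpp
#include <cassert>
#include <cstdint>

// Returns true and stores the repeated byte when the low `elemBits` bits of
// `value` consist of one byte repeated. `elemBits` must be 8, 16, 32, or 64.
bool isByteSplat(uint64_t value, unsigned elemBits, uint8_t &byte) {
  assert(elemBits >= 8 && elemBits <= 64 && elemBits % 8 == 0);
  const uint8_t low = static_cast<uint8_t>(value & 0xFF);
  for (unsigned shift = 8; shift < elemBits; shift += 8) {
    if (static_cast<uint8_t>((value >> shift) & 0xFF) != low)
      return false;
  }
  byte = low;
  return true;
}
```

An all-ones element of any width passes the check with byte 0xFF, which corresponds to the immediate 255 in the `vxori.b` example in the commit message.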

LLVM/project 35b981bllvm/lib/Target/LoongArch LoongArchISelDAGToDAG.cpp

Address wanglei's comments
DeltaFile
+3-4llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp
+3-41 files

LLVM/project 0c4cb19llvm/lib/Target/LoongArch LoongArchISelDAGToDAG.cpp

Fix a typo
DeltaFile
+1-1llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp
+1-11 files

LLVM/project a758238llvm/lib/Target/LoongArch LoongArchISelDAGToDAG.cpp LoongArchLSXInstrInfo.td, llvm/test/CodeGen/LoongArch/lasx/ir-instruction add.ll sub.ll

[LoongArch] Select V{ADD,SUB}I for operations with negative splat immediates

Currently, vector add/sub with a negative splat immediate is lowered as a
vector splat followed by a register-register add, e.g.:

```
vrepli.b $vr1, -1
vadd.b   $vr0, $vr0, $vr1
```

This misses the opportunity to use the more efficient V{ADD,SUB}I instruction
with a positive immediate.

This patch introduces `selectVSplatImmNeg` to detect negative splat
immediates whose negated value fits in a 5-bit unsigned immediate. New
patterns `(Pat{Vr,Vr}Nimm5)` are added to match:

```
add v, splat(-imm)  -->  vsubi v, v, imm

    [7 lines not shown]
DeltaFile
+22-0llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp
+17-0llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+5-10llvm/test/CodeGen/LoongArch/lsx/ir-instruction/sub.ll
+5-10llvm/test/CodeGen/LoongArch/lasx/ir-instruction/add.ll
+5-10llvm/test/CodeGen/LoongArch/lasx/ir-instruction/sub.ll
+5-10llvm/test/CodeGen/LoongArch/lsx/ir-instruction/add.ll
+59-402 files not shown
+74-408 files
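
For the negative-splat commit above, the condition "negated value fits in a 5-bit unsigned immediate" can be sketched as a range check. `matchNegSplatImm5` is a hypothetical helper written for illustration; the real `selectVSplatImmNeg` works on splat nodes and element widths that are not modeled here.

```cpp
#include <cassert>
#include <cstdint>

// Returns true and stores -splat in `imm` when `splat` is negative and its
// negation fits in an unsigned 5-bit field, i.e. splat is in [-31, -1].
bool matchNegSplatImm5(int64_t splat, unsigned &imm) {
  if (splat >= 0 || splat < -31)
    return false;
  imm = static_cast<unsigned>(-splat);
  return true;
}
```

With a match, `add v, splat(-imm)` can be rewritten as a subtract-immediate with the positive `imm`, as in the pattern shown in the commit message.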

LLVM/project 2bac8d6llvm/test/CodeGen/LoongArch/lasx/ir-instruction nor.ll and.ll, llvm/test/CodeGen/LoongArch/lsx/ir-instruction nor.ll and.ll

[LoongArch][NFC] Add tests for bitwise with byte splat immediates (#192216)
DeltaFile
+36-0llvm/test/CodeGen/LoongArch/lsx/ir-instruction/nor.ll
+36-0llvm/test/CodeGen/LoongArch/lasx/ir-instruction/nor.ll
+33-0llvm/test/CodeGen/LoongArch/lsx/ir-instruction/and.ll
+33-0llvm/test/CodeGen/LoongArch/lsx/ir-instruction/or.ll
+33-0llvm/test/CodeGen/LoongArch/lsx/ir-instruction/xor.ll
+33-0llvm/test/CodeGen/LoongArch/lasx/ir-instruction/and.ll
+204-02 files not shown
+270-08 files

LLVM/project 7cabc53llvm/test/CodeGen/LoongArch/lasx/ir-instruction add.ll sub.ll, llvm/test/CodeGen/LoongArch/lsx/ir-instruction add.ll sub.ll

[LoongArch][NFC] Add tests for add/sub with negative splat immediates (#191965)
DeltaFile
+66-0llvm/test/CodeGen/LoongArch/lsx/ir-instruction/add.ll
+66-0llvm/test/CodeGen/LoongArch/lasx/ir-instruction/add.ll
+66-0llvm/test/CodeGen/LoongArch/lasx/ir-instruction/sub.ll
+66-0llvm/test/CodeGen/LoongArch/lsx/ir-instruction/sub.ll
+264-04 files

LLVM/project 6527bf9llvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

Merge branch 'main' into users/ylzsx/precommit-fptrunc
DeltaFile
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+57,682-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/double-nested-loops-complex-cfg.mir
+41,844-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills2.mir
+40,613-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills1.mir
+37,209-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/test_ers_multiple_spills3.mir
+597,128-01,735 files not shown
+976,030-41,5901,741 files

LLVM/project 685ee06utils/bazel/llvm-project-overlay/libc BUILD.bazel

[Bazel] Fixes 7094eb5 (#192584)

This fixes 7094eb52d8cbaa9faeb635bfb6f6c06e6cd52b64.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+12-0utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+12-01 files

LLVM/project 7094eb5libc/src/__support/threads futex_utils.h raw_rwlock.h, libc/src/__support/threads/darwin futex_utils.h

[libc][threads] adjust futex library and expose requeue API (#192478)

Make futex a common abstraction layer across platforms.
(Linux, wasm, macOS, Windows, and Fuchsia all support futex-like
operations, and their implementations can be aligned later on.)

This patch also exposes a requeue API that returns ENOSYS on unsupported
platforms. The requeue operation will be needed to reimplement a strict
FIFO-style condvar similar to musl's.

Additional cleanup changes the raw syscall return value to
`ErrorOr<int>`.

Assisted-by: Codex with gpt-5.4 medium fast
DeltaFile
+64-29libc/src/__support/threads/linux/futex_utils.h
+54-26libc/src/__support/threads/darwin/futex_utils.h
+80-0libc/test/integration/src/__support/threads/futex_requeue_test.cpp
+37-0libc/test/src/__support/threads/futex_utils_test.cpp
+20-0libc/src/__support/threads/futex_utils.h
+8-7libc/src/__support/threads/raw_rwlock.h
+263-627 files not shown
+305-7313 files

LLVM/project 91fcdabmlir/lib/Dialect/MemRef/Transforms FoldMemRefAliasOps.cpp, mlir/test/Dialect/MemRef fold-memref-alias-ops.mlir

[mlir][memref] Remove unit-stride restriction in SubViewOp folding (#192437)

This PR replaces manual offset/size resolution with `affine::mergeOffsetsSizesAndStrides`, simplifying the code and extending subview-of-subview folding to support non-unit strides.
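
The arithmetic behind folding a subview of a subview with non-unit strides can be sketched per dimension. This is an illustration only: `Slice1D`, `composeSlices`, and `sourceIndex` are hypothetical names, the sketch uses plain integers for one dimension, and it does not model rank reduction or the affine expressions the MLIR utility produces.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical one-dimensional slice: element i maps to offset + i * stride.
struct Slice1D {
  int64_t offset;
  int64_t size;
  int64_t stride;
};

// Source index of element i of slice s.
int64_t sourceIndex(const Slice1D &s, int64_t i) {
  return s.offset + i * s.stride;
}

// Composes `inner` (taken from the result of `outer`) into a single slice of
// the original buffer: offsets add after scaling by the outer stride, and
// strides multiply.
Slice1D composeSlices(const Slice1D &outer, const Slice1D &inner) {
  assert(inner.size <= outer.size);
  return Slice1D{outer.offset + inner.offset * outer.stride, inner.size,
                 outer.stride * inner.stride};
}
```

For example, an outer slice with offset 2 and stride 3 followed by an inner slice with offset 1 and stride 2 composes to offset 5 and stride 6. With unit strides the stride product stays 1, which is the case the earlier folding was limited to.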
DeltaFile
+8-26mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
+22-0mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
+30-262 files

LLVM/project 7f6c395llvm/include/llvm/Transforms/IPO LowerTypeTests.h, llvm/lib/Passes PassBuilder.cpp PassRegistry.def

enum

Created using spr 1.3.7
DeltaFile
+20-0llvm/lib/Passes/PassBuilder.cpp
+13-6llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+12-3llvm/include/llvm/Transforms/IPO/LowerTypeTests.h
+3-6llvm/lib/Passes/PassRegistry.def
+1-1llvm/lib/Passes/PassBuilderPipelines.cpp
+49-165 files