LLVM/project a33a7d9llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll, llvm/test/CodeGen/RISCV clmul.ll

Merge branch 'main' into users/eas/mem-widen-subpasses
DeltaFile
+25,784-36,416llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+12,227-23,140llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+4,004-11,142llvm/test/CodeGen/RISCV/clmul.ll
+6,940-6,782llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+3,502-9,174llvm/test/CodeGen/X86/clmul-vector.ll
+3,985-7,989llvm/test/CodeGen/Thumb2/mve-clmul.ll
+56,442-94,643718 files not shown
+90,649-138,929724 files

LLVM/project f65782bllvm/lib/Transforms/Vectorize VPlanTransforms.cpp

Update llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

Co-authored-by: Florian Hahn <flo at fhahn.com>
DeltaFile
+1-1llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+1-11 files

LLVM/project 729d7abclang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/Sema SemaHLSL.cpp

Warns that the WaveSize attribute is unsupported for the SPIR-V target. (#196004)

Addresses #187188.
DeltaFile
+3-1clang/lib/Sema/SemaHLSL.cpp
+4-0clang/include/clang/Basic/DiagnosticSemaKinds.td
+7-12 files

LLVM/project d098154llvm/docs AMDGPUDwarfExtensionsForHeterogeneousDebugging.rst

[AMDGPU][docs] Remove abandoned augmentation-related changes (#204420)

These haven't been carried forward in the DWARF committee proposal, and
we don't expect them to be standardized (at least in the form presented
here). Drop them to avoid confusion.

Change-Id: I60dd6ffb5df1bb63d132733466ecf3d697f79276
DeltaFile
+7-117llvm/docs/AMDGPUDwarfExtensionsForHeterogeneousDebugging.rst
+7-1171 files

LLVM/project 8d2165emlir/include/mlir/Dialect/SPIRV/IR SPIRVGLOps.td, mlir/test/Dialect/SPIRV/IR gl-ops.mlir

[mlir][SPIR-V] Add GL Radians and Degrees ops (#203879)
DeltaFile
+52-0mlir/test/Dialect/SPIRV/IR/gl-ops.mlir
+48-0mlir/include/mlir/Dialect/SPIRV/IR/SPIRVGLOps.td
+4-0mlir/test/Target/SPIRV/gl-ops.mlir
+104-03 files

LLVM/project 8eb3b11mlir/include/mlir/Dialect/SPIRV/IR SPIRVCLOps.td, mlir/test/Dialect/SPIRV/IR ocl-ops.mlir

[mlir][SPIR-V] Add CL expm1 and log1p ops (#203881)
DeltaFile
+42-0mlir/include/mlir/Dialect/SPIRV/IR/SPIRVCLOps.td
+40-0mlir/test/Dialect/SPIRV/IR/ocl-ops.mlir
+4-0mlir/test/Target/SPIRV/ocl-ops.mlir
+86-03 files

LLVM/project a20244bllvm/lib/Transforms/Vectorize VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize narrow-interleave-groups-scalable-vf.ll pr128062-interleaved-accesses-narrow-group.ll

[VPlan] Narrow interleave groups with distinct live-in operands. (#203778)

Extend narrowInterleaveGroups so bundles with live-ins can be narrowed
by using BuildVector for the operands.

This only applies to fixed VFs: for scalable VFs the number of original
iterations processed by the narrowed plan depends on vscale, so a fixed
per-field vector cannot be built.

Another missing piece for
https://github.com/llvm/llvm-project/issues/128062

On a large IR corpus based on C/C++ workloads (32k modules), this
triggers in ~38 modules.

PR: https://github.com/llvm/llvm-project/pull/203778
DeltaFile
+137-4llvm/test/Transforms/LoopVectorize/narrow-interleave-groups-scalable-vf.ll
+29-63llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory-constant-ops.ll
+32-10llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+4-15llvm/test/Transforms/LoopVectorize/RISCV/interleaved-accesses.ll
+7-11llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory-with-wide-ops-chained.ll
+2-2llvm/test/Transforms/LoopVectorize/pr128062-interleaved-accesses-narrow-group.ll
+211-1056 files

LLVM/project 5471cdbcompiler-rt/lib/builtins/cpu_model/riscv hwprobe.inc, llvm/lib/Target/RISCV RISCVFeatures.td

[RISCV] Implement Zicfiss Extension Bitmask (#201699)

This implements the proposal here:
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/187

This was prepared with the assistance of AI.
DeltaFile
+82-69compiler-rt/lib/builtins/cpu_model/riscv/hwprobe.inc
+5-1llvm/lib/TargetParser/Host.cpp
+2-1llvm/lib/Target/RISCV/RISCVFeatures.td
+89-713 files

LLVM/project def1355llvm/include/llvm/Analysis AssumptionCache.h, llvm/lib/Analysis AssumptionCache.cpp

[AssumptionCache] Add replaceAssumption() to replace in-place. (#204432)

Add replaceAssumption and use it to replace assumptions when removing
bundles from assume in DropUnnecessaryAssumesPass, as suggested in PR
#203765.

Unfortunately I could not find any other candidates that would not
require finding the WeakVH entry for assumes manually.

Compile-time impact is neutral/in the noise:

https://llvm-compile-time-tracker.com/compare.php?from=a2f4a1cabb083337ccb17c77cafb36d94c1ef52b&to=61a64fc62cbb38001894b75053f4add124869fe0&stat=instructions:u

PR: https://github.com/llvm/llvm-project/pull/204432
DeltaFile
+5-10llvm/lib/Transforms/Scalar/DropUnnecessaryAssumes.cpp
+10-1llvm/lib/Analysis/AssumptionCache.cpp
+7-0llvm/include/llvm/Analysis/AssumptionCache.h
+22-113 files

LLVM/project 6093c0eclang/docs ReleaseNotes.rst, clang/docs/analyzer checkers.rst

[analyzer] Bring unix.cstring.UninitializedRead checker out of alpha (#196292)

There have been recent improvements (#186802) and fixes (#191061)
related to this checker. The reports are no longer noisy, as evaluated
on 14 open-source projects.

---------

Co-authored-by: Donát Nagy <donat.nagy at ericsson.com>
DeltaFile
+34-33clang/docs/analyzer/checkers.rst
+5-5clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
+4-4clang/test/Analysis/bstring.c
+6-0clang/docs/ReleaseNotes.rst
+2-2clang/test/Analysis/cstring-uninitread-notes.c
+2-2clang/test/Analysis/wstring.c
+53-465 files not shown
+59-4911 files

LLVM/project 46b5bc7llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV rvp-simd-64.ll

[RISCV][P-ext] Fold (PSRL/PSRA (concat (trunc (PSRL X, C1)), (trunc (PSRL Y, C1))), C2). (#204659)

into (concat (trunc (PSRL/PSRA X, C1+C2)), (trunc (PSRL/PSRA Y,
C1+C2))), provided C1 equals the number of bits discarded by the truncate.

We recently added this for a single truncate. This expands it to
concatenated truncates.
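
The identity behind the fold can be checked on a single lane with plain integers. The sketch below is only a scalar model, not the RISCVISelLowering.cpp code: it assumes 32-bit source lanes truncated to 16-bit lanes (so C1 is 16), and the function names are made up for illustration. It relies on right shifts of negative values being arithmetic, which C++20 guarantees and mainstream compilers already did before that.

```cpp
#include <cassert>
#include <cstdint>

// Assumed lane widths for this sketch: 32-bit source, 16-bit result,
// so the truncate discards 16 bits and C1 must be 16.
constexpr unsigned DiscardedBits = 16;

// Before the fold: logical shift by C1, truncate, then logical shift by C2.
uint16_t srlBefore(uint32_t x, unsigned c2) {
  uint16_t t = static_cast<uint16_t>(x >> DiscardedBits);
  return static_cast<uint16_t>(t >> c2);
}

// After the fold: one logical shift by C1 + C2, then truncate.
uint16_t srlAfter(uint32_t x, unsigned c2) {
  return static_cast<uint16_t>(x >> (DiscardedBits + c2));
}

// Before the fold: logical shift by C1, truncate, then arithmetic shift by C2.
int16_t sraBefore(uint32_t x, unsigned c2) {
  int16_t t = static_cast<int16_t>(static_cast<uint16_t>(x >> DiscardedBits));
  return static_cast<int16_t>(t >> c2);
}

// After the fold: one arithmetic shift by C1 + C2 on the wide value, then truncate.
int16_t sraAfter(uint32_t x, unsigned c2) {
  int32_t s = static_cast<int32_t>(x);
  return static_cast<int16_t>(s >> (DiscardedBits + c2));
}
```

For every C2 from 0 to 15 the two forms agree, because the truncated lane is exactly the top half of the source, so its sign bit is the sign bit of the wide value.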

Assisted-by: Claude
DeltaFile
+58-23llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+4-6llvm/test/CodeGen/RISCV/rvp-simd-64.ll
+62-292 files

LLVM/project e078606clang/include/clang/Analysis/Analyses/LifetimeSafety Origins.h, clang/lib/Analysis/LifetimeSafety Origins.cpp

[LifetimeSafety][NFC] Collect accessed fields in a unified pre-scan
DeltaFile
+22-8clang/lib/Analysis/LifetimeSafety/Origins.cpp
+15-4clang/include/clang/Analysis/Analyses/LifetimeSafety/Origins.h
+37-122 files

LLVM/project e819ec7clang/include/clang/Analysis/Analyses/LifetimeSafety Origins.h, clang/lib/Analysis/LifetimeSafety FactsGenerator.cpp Facts.cpp

[LifetimeSafety][NFC] Add field-labeled child edges to OriginNode and generalize subtree walks
DeltaFile
+78-36clang/include/clang/Analysis/Analyses/LifetimeSafety/Origins.h
+47-25clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+18-8clang/lib/Analysis/LifetimeSafety/Facts.cpp
+8-3clang/lib/Analysis/LifetimeSafety/LiveOrigins.cpp
+7-3clang/lib/Analysis/LifetimeSafety/Origins.cpp
+158-755 files

LLVM/project bfad800clang/include/clang/Analysis/Analyses/LifetimeSafety Origins.h FactsGenerator.h, clang/lib/Analysis/LifetimeSafety Origins.cpp FactsGenerator.cpp

[LifetimeSafety] Track per-field origins for record types
DeltaFile
+348-5clang/test/Sema/warn-lifetime-safety.cpp
+106-7clang/lib/Analysis/LifetimeSafety/Origins.cpp
+69-37clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+30-0clang/include/clang/Analysis/Analyses/LifetimeSafety/Origins.h
+4-6clang/test/Sema/warn-lifetime-safety-dangling-field.cpp
+0-2clang/include/clang/Analysis/Analyses/LifetimeSafety/FactsGenerator.h
+557-576 files

LLVM/project 0f4d8ceoffload/plugins-nextgen/level_zero/include L0CmdListManager.h L0Queue.h, offload/plugins-nextgen/level_zero/src L0Queue.cpp L0Device.cpp

[offload][l0] Implement olLaunchHost function for level zero plugin (#204173)

Adds support for enqueueing host tasks in in-order queues for the L0 plugin.
DeltaFile
+23-0offload/plugins-nextgen/level_zero/src/L0Queue.cpp
+22-1offload/plugins-nextgen/level_zero/include/L0CmdListManager.h
+2-8offload/unittests/OffloadAPI/queue/olLaunchHostFunction.cpp
+9-0offload/plugins-nextgen/level_zero/include/L0Queue.h
+9-0offload/plugins-nextgen/level_zero/src/L0Device.cpp
+2-5offload/plugins-nextgen/level_zero/include/L0Device.h
+67-143 files not shown
+79-209 files

LLVM/project df9b1f8clang/test/Driver femit-dwarf-unwind.c femit-dwarf-unwind.s, llvm/lib/Target/X86/MCTargetDesc X86AsmBackend.cpp

[clang][Mach-O] Add an option to force UNWIND_*_MODE_DWARF compact unwind info (#204005)

This adds a new value to an existing option: `-femit-dwarf-unwind=dwarf-only`. This is
primarily intended as a testing mechanism to ensure coverage on the
DWARF-only parts of the unwinder, where previously the compact unwinder
would have taken care of most functions.
DeltaFile
+54-0llvm/test/MC/X86/compact-unwind-force-dwarf.s
+42-0llvm/test/MC/AArch64/compact-unwind-force-dwarf.s
+19-8llvm/test/MC/MachO/ARM/compact-unwind-armv7k.s
+13-2clang/test/Driver/femit-dwarf-unwind.c
+12-1clang/test/Driver/femit-dwarf-unwind.s
+4-0llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
+144-118 files not shown
+159-1314 files

LLVM/project 19f6c18llvm/lib/Transforms/Instrumentation InstrProfiling.cpp

[NFC] Add comments and Fix No-Asserts build by using [[maybe_unused]] (#202488) (#204672)
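
For context, the usual shape of this kind of fix is sketched below. It is a generic illustration, not the InstrProfiling.cpp change itself, and `sumChecked` is a made-up function: a variable that is only read inside `assert` becomes unused when `NDEBUG` is defined, and `[[maybe_unused]]` silences the resulting warning (which is an error under `-Werror`).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums the values, checking the caller's expected element count in
// assert-enabled builds only.
int sumChecked(const std::vector<int> &values, std::size_t expectedCount) {
  // Only read by the assert below, so it would be flagged as unused when
  // NDEBUG is defined; the attribute documents that this is intentional.
  [[maybe_unused]] bool countMatches = values.size() == expectedCount;
  assert(countMatches && "caller passed an unexpected element count");

  int total = 0;
  for (int v : values)
    total += v;
  return total;
}
```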

Co-authored-by: Wael Yehia <wyehia at ca.ibm.com>
DeltaFile
+18-6llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp
+18-61 files

LLVM/project ca9675aopenmp/docs/design Runtimes.rst

Fix documentation
DeltaFile
+6-5openmp/docs/design/Runtimes.rst
+6-51 files

LLVM/project 8c3fc44clang/lib/Format TokenAnnotator.cpp, clang/unittests/Format FormatTest.cpp

[clang-format] Fix crash on malformed operator input (#199098)

Fixes the remaining clang-format crash case after #199100 landed.

The problematic input is:
```cpp
{ operator } a
```
When annotating `operator`, clang-format should stop scanning at `}` instead
of consuming it and disturbing brace scope tracking. This also adds a
no-crash regression test.
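
A toy model of the problem, under heavy assumptions: tokens are plain strings and the two helpers below are hypothetical, whereas the real TokenAnnotator works on FormatToken objects and is far more involved. The point it illustrates is that if the scan after `operator` swallows the `}`, the brace depth never returns to zero.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: skips the tokens that form an operator name and
// returns the index of the first token that is not part of it. Stopping at
// "}" is the behaviour the commit message describes.
std::size_t scanOperatorName(const std::vector<std::string> &toks,
                             std::size_t operatorIdx) {
  std::size_t i = operatorIdx + 1;
  while (i < toks.size() && toks[i] != "(" && toks[i] != "}")
    ++i;
  return i;
}

// Hypothetical helper: returns the brace depth left over after walking all
// tokens; 0 means every "{" was matched by a "}".
int braceDepthAfter(const std::vector<std::string> &toks) {
  int depth = 0;
  for (std::size_t i = 0; i < toks.size();) {
    if (toks[i] == "operator") {
      i = scanOperatorName(toks, i);
      continue;
    }
    if (toks[i] == "{")
      ++depth;
    else if (toks[i] == "}")
      --depth;
    ++i;
  }
  return depth;
}
```

With the tokens of `{ operator } a`, the scan stops at index 2, the `}` is then seen by the brace counter, and the depth ends at 0.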
DeltaFile
+2-1clang/lib/Format/TokenAnnotator.cpp
+1-0clang/unittests/Format/FormatTest.cpp
+3-12 files

LLVM/project 2848608llvm/lib/Target/AMDGPU AMDGPU.td GCNHazardRecognizer.cpp, llvm/test/CodeGen/AMDGPU wmma-hazards-gfx1250-w32.mir wmma-coexecution-valu-hazards.mir

[AMDGPU] Introduce WMMACoexecutionHazards target feature (#204654)

gfx1250, gfx1251 and gfx12-5-generic have this feature, but gfx1310
does not have it.
DeltaFile
+541-1llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx1250-w32.mir
+349-1llvm/test/CodeGen/AMDGPU/wmma-coexecution-valu-hazards.mir
+11-3llvm/lib/Target/AMDGPU/AMDGPU.td
+1-1llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+902-64 files

LLVM/project 6c118c3llvm/lib/Target/AMDGPU AMDGPU.td, llvm/test/CodeGen/AMDGPU trans-coexecution-hazard.mir carryout-selection.ll

[AMDGPU] Remove TransCoexecutionHazard feature from gfx13 (#204645)

No TRANS coexecution hazard on gfx13 based on the latest SPG.
DeltaFile
+0-4llvm/test/CodeGen/AMDGPU/trans-coexecution-hazard.mir
+1-2llvm/lib/Target/AMDGPU/AMDGPU.td
+1-2llvm/test/CodeGen/AMDGPU/carryout-selection.ll
+2-83 files

LLVM/project e99e343clang/lib/Format TokenAnnotator.cpp, clang/unittests/Format TokenAnnotatorTest.cpp FormatTest.cpp

[clang-format] Fix annotation of alternative operator and (#199112)

I now annotate `and` as TT_BinaryOperator before the pointer/reference
heuristic. I left `bitand` alone since, like `&`, it can still be a
reference.
Fixes #199027.
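
To illustrate the distinction in source code (this is an illustration, not a test from the patch): `and` is the alternative spelling of `&&` and `bitand` of `&`, so `bitand` can appear both as a binary operator and as a reference declarator. The function names are made up. This compiles as-is with GCC and Clang; MSVC may need `/permissive-` for alternative tokens.

```cpp
#include <cassert>

// `and` used as the logical binary operator (same as &&).
bool bothPositive(int a, int b) { return a > 0 and b > 0; }

// `bitand` used as a binary operator (same as &).
int maskBits(int a, int b) { return a bitand b; }

// `bitand` used as a reference declarator (same as `int &ref = value;`),
// which is why a formatter cannot blindly treat it as a binary operator.
int incrementThroughRef(int value) {
  int bitand ref = value;
  ref += 1;
  return value;
}
```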
DeltaFile
+8-0clang/lib/Format/TokenAnnotator.cpp
+4-0clang/unittests/Format/TokenAnnotatorTest.cpp
+2-0clang/unittests/Format/FormatTest.cpp
+14-03 files

LLVM/project 631aa1cutils/bazel/llvm-project-overlay/openmp/runtime/src BUILD.bazel

[Bazel] Fixes d81069d (#204629)

This fixes d81069d7c3da46ea7bd000c3d6ab618e3db79bd4.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+1-0utils/bazel/llvm-project-overlay/openmp/runtime/src/BUILD.bazel
+1-01 files

LLVM/project 01b8de8mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp, mlir/test/Target/LLVMIR openmp-taskloop-reduction.mlir openmp-todo.mlir

[mlir][OpenMP] Translate reductions on taskloop (#199670)

This patch adds LLVM IR translation for `reduction` and `in_reduction`
clauses on `omp.taskloop.context`.

For `taskloop reduction`, the lowering emits the implicit taskgroup
reduction setup, builds the task-reduction descriptor array, and maps
each generated task to runtime-provided private reduction storage
through `__kmpc_task_reduction_get_th_data`.

For `taskloop in_reduction`, the lowering uses the same runtime lookup
path with a null descriptor, allowing the runtime to find the enclosing
task-reduction context.

Unsupported byref, cleanup-region, and two-argument initializer forms
remain diagnosed.
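
For readers unfamiliar with the construct, here is a minimal source-level sketch of a `taskloop reduction` in C++ (OpenMP 5.0 or later). It only shows the directive being lowered; it says nothing about the MLIR or the generated runtime calls, and `sumWithTaskloop` is a made-up name. Built without OpenMP support the pragmas are ignored and the loop runs serially with the same result.

```cpp
#include <cassert>
#include <vector>

// Sums the values with a taskloop; each generated task accumulates into
// private reduction storage that the runtime combines at the end.
long sumWithTaskloop(const std::vector<int> &values) {
  long sum = 0;
  const int n = static_cast<int>(values.size());

  // One thread creates the tasks; the whole team may execute them.
  #pragma omp parallel
  #pragma omp single
  #pragma omp taskloop reduction(+ : sum)
  for (int i = 0; i < n; ++i)
    sum += values[i];

  return sum;
}
```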

### Stack / review order


    [18 lines not shown]
DeltaFile
+373-0mlir/test/Target/LLVMIR/openmp-taskloop-reduction.mlir
+238-27mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+92-10mlir/test/Target/LLVMIR/openmp-todo.mlir
+703-373 files

LLVM/project adae5c0llvm/lib/Target/AMDGPU AMDGPUISelLowering.cpp AMDGPUCodeGenPrepare.cpp, llvm/test/CodeGen/AMDGPU srem64.ll sdiv64.ll

[AMDGPU] Fix 64->32 bit division corner case (#204469)

Do not implement 64-bit signed division with 32-bit division if operands
are only constrained to a 32-bit signed range.
-2147483648/-1 != -2147483648/1, but their lower 32-bits are identical.
32-bit division cannot generate the correct result for both sets of
operands. Only use 32-bit division if operands are constrained to a
31-bit signed range.
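
The corner case can be reproduced with ordinary 64-bit arithmetic, reading the message as referring to the low 32 bits of the two quotients. The sketch below is not the AMDGPU lowering code; `canNarrowSDiv` in particular is a hypothetical guard that checks concrete values, where a compiler would reason about known bits instead.

```cpp
#include <cassert>
#include <cstdint>

// Low 32 bits of a 64-bit value, i.e. all a 32-bit operation can produce.
uint32_t low32(int64_t v) {
  return static_cast<uint32_t>(static_cast<uint64_t>(v));
}

// Sign-extends a 32-bit result back to 64 bits.
int64_t signExtend32(uint32_t v) {
  return static_cast<int64_t>(static_cast<int32_t>(v));
}

// True if v is representable as a signed integer of the given bit width.
bool fitsSigned(int64_t v, unsigned bits) {
  const int64_t lo = -(int64_t{1} << (bits - 1));
  const int64_t hi = (int64_t{1} << (bits - 1)) - 1;
  return v >= lo && v <= hi;
}

// Hypothetical guard: narrow only when both operands fit in 31 signed bits,
// so the quotient always fits in 32 signed bits.
bool canNarrowSDiv(int64_t num, int64_t den) {
  return fitsSigned(num, 31) && fitsSigned(den, 31);
}
```

With `n = -2147483648`, the 64-bit quotients `n / -1` (2147483648) and `n / 1` (-2147483648) share the low 32 bits `0x80000000`, so sign-extending a 32-bit result can only be right for one of them. Both operands fit in 32 signed bits, but `n` does not fit in 31, so the guard rejects the pair.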

Bug appears in both AMDGPUCodeGenPrepare.cpp and AMDGPUISelLowering.cpp.

Tested in https://github.com/llvm/llvm-test-suite/pull/428.

---------

Signed-off-by: John Lu <John.Lu at amd.com>
DeltaFile
+244-66llvm/test/CodeGen/AMDGPU/srem64.ll
+234-69llvm/test/CodeGen/AMDGPU/sdiv64.ll
+5-2llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+5-1llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
+488-1384 files

LLVM/project d028359llvm/lib/DWARFLinker/Parallel DWARFLinkerImpl.cpp DWARFLinkerImpl.h, llvm/test/tools/dsymutil/X86 fat-multiarch-parallel-link.test

[DWARFLinker] Fix data race on the global parallel strategy (#204642)

DWARFLinkerImpl::link() assigned the process-global
llvm::parallel::strategy on entry. dsymutil runs link() concurrently,
one call per architecture of a universal binary, so those assignments
race. An inconsistent strategy can route per-compile-unit cloning onto a
thread that is not an llvm::parallel ThreadPoolExecutor worker, where
the per-thread allocators call getThreadIndex().

This manifested itself as an assertion failure, but otherwise results in
an out-of-bounds access.

```
Assertion failed: ((threadIndex != UINT_MAX) && "getThreadIndex() must be called from a thread created by " "ThreadPoolExecutor"), function getThreadIndex, file Parallel.h, line 51.
```

The assert is non-deterministic and needs more than one architecture to
reproduce.


    [5 lines not shown]
DeltaFile
+16-0llvm/test/tools/dsymutil/X86/fat-multiarch-parallel-link.test
+8-5llvm/lib/DWARFLinker/Parallel/DWARFLinkerImpl.cpp
+0-3llvm/lib/DWARFLinker/Parallel/DWARFLinkerImpl.h
+24-83 files

LLVM/project de9a994clang/lib/CIR/CodeGen CIRGenModule.cpp, clang/test/CIR/CodeGenCUDA address-spaces.cu

[CIR][CUDA] Replace poison attr usages with undef for global shared/shadow/device-shadow instantiation
DeltaFile
+17-17clang/test/CIR/CodeGenCUDA/address-spaces.cu
+1-1clang/lib/CIR/CodeGen/CIRGenModule.cpp
+18-182 files

LLVM/project a358c6cclang/cmake/caches Fuchsia-stage2.cmake

[CMake][Fuchsia] Add llvm-profgen to Fuchsia toolchain (#204638)
DeltaFile
+1-0clang/cmake/caches/Fuchsia-stage2.cmake
+1-01 files

LLVM/project 0607caallvm/include/llvm/Target/GlobalISel Combine.td, llvm/test/CodeGen/AArch64/GlobalISel combine-or-and-xor.ll combine-or-and-xor.mir

[GlobalISel] Add `or_and_xor_to_or` pattern from SelectionDAG (#204614)

PR #201108 was merged and then reverted due to a failing test. This PR
fixes the tests that failed.
DeltaFile
+213-0llvm/test/CodeGen/AArch64/GlobalISel/combine-or-and-xor.ll
+206-0llvm/test/CodeGen/AArch64/GlobalISel/combine-or-and-xor.mir
+40-1llvm/include/llvm/Target/GlobalISel/Combine.td
+1-1llvm/test/CodeGen/AMDGPU/bitop3-shared-operand.ll
+460-24 files

LLVM/project 9824d35lldb/source/Plugins/ExpressionParser/Clang ClangExpressionParser.cpp, lldb/source/Plugins/LanguageRuntime/ObjC ObjCLanguageRuntime.cpp ObjCLanguageRuntime.h

[lldb] Don't enable Objective-C in expressions on unsupported formats (#204639)

Evaluating any expression against a WebAssembly target aborted LLDB:

```
(lldb) expr (int)sizeof(Point)
LLVM ERROR: Objective-C support is unimplemented for object file format
```

WebAssembly can't JIT expressions (RuntimeDyld doesn't support the Wasm
object format, so ProcessWasm sets CanJIT to false), but it can handle
simple expressions that can be IR interpreted.

When setting up the expression's language options, LLDB speculatively
enables Objective-C, which trips up the fatal error as Objective-C code
generation only supports Mach-O, ELF, and COFF.
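
The gating logic described above can be sketched in isolation. Everything in this block is hypothetical: the enum, the struct and both functions are stand-ins, not LLDB's API, and the actual signature of `ObjCLanguageRuntime::IsSupportedForArchitecture` is not shown in this message.

```cpp
#include <cassert>

// Stand-in for the target's object file format (not LLDB's or LLVM's type).
enum class ObjectFormat { MachO, ELF, COFF, Wasm };

// Stand-in for the expression's language options.
struct ExprLangOptions {
  bool ObjC = false;
};

// Per the message, Objective-C code generation only supports these formats.
bool objcCodegenSupported(ObjectFormat format) {
  switch (format) {
  case ObjectFormat::MachO:
  case ObjectFormat::ELF:
  case ObjectFormat::COFF:
    return true;
  case ObjectFormat::Wasm:
    return false;
  }
  return false;
}

// Enables Objective-C speculatively, but only where codegen can handle it.
ExprLangOptions makeExprLangOptions(ObjectFormat format) {
  ExprLangOptions opts;
  opts.ObjC = objcCodegenSupported(format);
  return opts;
}
```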

Add ObjCLanguageRuntime::IsSupportedForArchitecture and disable
Objective-C in the expression's language options when the target's

    [2 lines not shown]
DeltaFile
+18-0lldb/test/Shell/Expr/wasm-no-objc-codegen.test
+14-0lldb/source/Plugins/LanguageRuntime/ObjC/ObjCLanguageRuntime.cpp
+6-0lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionParser.cpp
+5-0lldb/source/Plugins/LanguageRuntime/ObjC/ObjCLanguageRuntime.h
+43-04 files