LLVM/project b28ad9c llvm/tools/llvm-dwp Opts.td

[llvm-dwp] Fix typo in --help
Delta File
+1-1 llvm/tools/llvm-dwp/Opts.td
+1-1 1 file

LLVM/project ed9e8e6 clang/lib/CodeGen/TargetBuiltins ARM.cpp, clang/lib/Headers arm_acle.h

fixup! More small fixes
Delta File
+15-37 clang/lib/Sema/SemaARM.cpp
+11-0 clang/test/Sema/AArch64/pcdphint-atomic-store.c
+5-3 clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+2-2 clang/lib/Headers/arm_acle.h
+0-1 clang/test/CodeGen/arm_acle.c
+33-43 5 files

LLVM/project 4082092 llvm/test/Transforms/LoopUnrollAndJam dependencies.ll

[LoopUnrollAndJam] Update test dependencies.ll (NFC) (#183509)

Recent ongoing work to fix correctness issues in DA will affect
some existing regression tests for passes that rely on it. As a result,
the original intent of several tests would be lost.
This patch updates `dependencies.ll` to avoid such issues and preserve
its intent. Specifically, it changes the loop bounds from
parameters to constants, which allows SCEV to infer no-wrap flags for
the addrecs. It also fixes other minor issues in the test,
such as adding pseudocode and removing some `nuw` flags to avoid UB.
Delta File
+160-64 llvm/test/Transforms/LoopUnrollAndJam/dependencies.ll
+160-64 1 file

LLVM/project decb5d3 clang/lib/CIR/CodeGen CIRGenCleanup.cpp EHScopeStack.h, clang/test/CIR/CodeGen label-values.c nrvo.cpp

[CIR] Remove branch through cleanup fixups (#182953)

Because we are using a structured representation of cleanups in CIR, we
don't need to handle branching through cleanups during codegen. These
branches are created during CFG flattening instead. However, we had
already committed some code that copied the classic codegen behavior for
branching through cleanups. This change deletes that unneeded code.

The most significant change here is that when we encounter a return
statement we emit the return directly in the current location.

The coroutine implementation still creates a return block in the current
lexical scope and branches to that block. Cleaning up that
representation is left as future work.

The popCleanupBlock handling still has a significant amount of logic
that is carried over from the classic codegen implementation. It is left
in place until we can be sure we won't need it.
Delta File
+8-91 clang/lib/CIR/CodeGen/CIRGenCleanup.cpp
+0-62 clang/lib/CIR/CodeGen/EHScopeStack.h
+32-14 clang/lib/CIR/CodeGen/CIRGenStmt.cpp
+0-42 clang/lib/CIR/CodeGen/CIRGenFunction.h
+12-16 clang/test/CIR/CodeGen/label-values.c
+17-11 clang/test/CIR/CodeGen/nrvo.cpp
+69-236 6 files not shown
+107-270 12 files

LLVM/project 361e235 mlir/python/mlir/dialects ext.py, mlir/test/python/dialects ext.py

[MLIR][Python] Support op adaptor for Python-defined operations (#183528)

Previously, in #177782, we added support for dialect conversion and
generated an `OpAdaptor` subtype for every ODS-defined operation. In
this PR, we will also generate `OpAdaptor` subtypes for Python-defined
operations, so that they can be applied in dialect conversion as well.
Delta File
+49-1 mlir/python/mlir/dialects/ext.py
+10-0 mlir/test/python/dialects/ext.py
+59-1 2 files

LLVM/project 9b708b0 mlir/lib/Conversion/ArithToSPIRV ArithToSPIRV.cpp, mlir/test/Conversion/ArithToSPIRV arith-to-spirv-unsupported.mlir

[mlir][arith-to-spirv] Fix null dereference when converting trunci/extui with tensor types (#183654)

`getScalarOrVectorConstInt` only handles `VectorType` and `IntegerType`,
returning `nullptr` for any other type (e.g., a `RankedTensorType` that
slips through after type emulation maps `tensor<Nxi16>` to
`tensor<Nxi32>` with the same destination type). The callers in
`TruncIPattern` and `ExtUIPattern` passed this null value directly to
`spirv::BitwiseAndOp::create`, causing a null-pointer dereference in
`OperandStorage`.

Similarly, the signed-extension pattern passes the result of
`getScalarOrVectorConstInt` as a shift amount to
`ShiftLeftLogicalOp::create` without a null check.

Add `if (!mask)` / `if (!shiftSize)` guards that return a match
failure in all three cases, converting the crash into a proper
legalization failure.

Fixes #178214
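
The guard described above can be sketched in isolation. This is a minimal illustration rather than the MLIR code: `TypeKind`, `Constant`, `makeConstInt`, and `rewriteTrunc` are hypothetical stand-ins for the real types and for `getScalarOrVectorConstInt`, and they assume only the behaviour stated in the commit message.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins; none of these names are MLIR's API.
enum class TypeKind { Integer, Vector, RankedTensor };
struct Constant { long bits; };
enum class MatchResult { Success, Failure };

// Like the helper described above: only integer and vector types are
// handled, and any other type yields a null result.
std::unique_ptr<Constant> makeConstInt(TypeKind kind, long value) {
  if (kind == TypeKind::Integer || kind == TypeKind::Vector)
    return std::unique_ptr<Constant>(new Constant{value});
  return nullptr;
}

// Guarded rewrite: report a match failure instead of dereferencing a
// null mask. On failure, `bits` is left unchanged.
MatchResult rewriteTrunc(TypeKind kind, long &bits) {
  std::unique_ptr<Constant> mask = makeConstInt(kind, 0xFFFF);
  if (!mask)
    return MatchResult::Failure;
  bits &= mask->bits;
  return MatchResult::Success;
}
```

With a tensor-like kind, `rewriteTrunc` returns `MatchResult::Failure` and leaves its argument alone, which corresponds to the legalization failure the patch aims for.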
Delta File
+10-0 mlir/test/Conversion/ArithToSPIRV/arith-to-spirv-unsupported.mlir
+6-0 mlir/lib/Conversion/ArithToSPIRV/ArithToSPIRV.cpp
+16-0 2 files

LLVM/project ba7fc2f llvm/lib/Transforms/Vectorize VPlanTransforms.cpp LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize vplan-based-stride-mv.ll

[VPlan] Implement VPlan-based stride speculation
Delta File
+928-1,076 llvm/test/Transforms/LoopVectorize/vplan-based-stride-mv.ll
+273-150 llvm/test/Transforms/LoopVectorize/VPlan/vplan-based-stride-mv.ll
+235-3 llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+49-3 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+43-0 llvm/lib/Transforms/Vectorize/VPlan.h
+5-5 llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
+1,533-1,237 7 files not shown
+1,563-1,242 13 files

LLVM/project 42a3ac5 llvm/lib/Transforms/Vectorize LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize induction.ll single-value-blend-phis.ll

[VPlan] Process instructions in reverse order when widening

It doesn't matter right now because we're using CM's decision, but
https://github.com/llvm/llvm-project/pull/182595 introduces some
scalarization (first-lane-only) opportunities that aren't known in CM.
Supporting those requires reverse iteration order, because they are
determined by VPUsers and not operands.
Delta File
+27-27 llvm/test/Transforms/LoopVectorize/induction.ll
+7-3 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+3-3 llvm/test/Transforms/LoopVectorize/AArch64/predication_costs.ll
+3-3 llvm/test/Transforms/LoopVectorize/X86/induction-costs.ll
+2-2 llvm/test/Transforms/LoopVectorize/RISCV/induction-costs.ll
+1-1 llvm/test/Transforms/LoopVectorize/single-value-blend-phis.ll
+43-39 6 files

LLVM/project 99c4635 mlir/lib/Debug DebugCounter.cpp, mlir/test/mlir-opt debugcounter-invalid-cl-options.mlir

[MLIR] Do not abort on invalid --mlir-debug-counter values (#181751)

Use `cl::Option::error()` diagnostics for invalid `--mlir-debug-counter`
arguments and exit with status 1 (no stack dump).

Added `mlir/test/mlir-opt/debugcounter-invalid-cl-options.mlir`
covering:
  - non-numeric value (`-1n`)
  - missing `=`
  - missing `-skip`/`-count` suffix

Fixes #180117
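
The three rejected inputs listed above can be illustrated with a small standalone parser. This is only a sketch under assumptions: `CounterArg` and `parseCounterArg` are hypothetical names, the `name-skip=N` / `name-count=N` shape is inferred from the test cases in the commit message, and the real change reports errors through `cl::Option::error()` rather than a string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical result type for one parsed counter argument.
struct CounterArg {
  std::string name;
  bool isSkip = false; // true for "-skip", false for "-count"
  long long value = 0;
};

// True when `s` ends with `suffix`.
static bool endsWith(const std::string &s, const std::string &suffix) {
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns true on success. On failure, writes a message to `error` and
// returns false instead of aborting. Range errors are not checked here.
bool parseCounterArg(const std::string &arg, CounterArg &out,
                     std::string &error) {
  std::size_t eq = arg.find('=');
  if (eq == std::string::npos) {
    error = "expected '=' in counter argument";
    return false;
  }
  std::string key = arg.substr(0, eq);
  std::string val = arg.substr(eq + 1);

  // strtoll stops at the first non-numeric character, so "-1n" leaves
  // `end` pointing at 'n' and is rejected.
  char *end = nullptr;
  long long number = std::strtoll(val.c_str(), &end, 10);
  if (val.empty() || *end != '\0') {
    error = "expected an integer counter value";
    return false;
  }

  const std::string skip = "-skip", count = "-count";
  if (endsWith(key, skip)) {
    out.name = key.substr(0, key.size() - skip.size());
    out.isSkip = true;
  } else if (endsWith(key, count)) {
    out.name = key.substr(0, key.size() - count.size());
    out.isSkip = false;
  } else {
    error = "expected counter name to end with '-skip' or '-count'";
    return false;
  }
  out.value = number;
  return true;
}
```

A caller would print `error` and exit with status 1 when the function returns false, which matches the behaviour the commit describes.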
Delta File
+17-16 mlir/lib/Debug/DebugCounter.cpp
+27-0 mlir/test/mlir-opt/debugcounter-invalid-cl-options.mlir
+44-16 2 files

LLVM/project d149830 llvm/test/CodeGen/AMDGPU load-saddr-offset-imm.ll

[AMDGPU] Pre-Commit tests for handle mbcnt in computeKnownBitsFromOperator (#178607)

For PR #183229
Delta File
+89-0 llvm/test/CodeGen/AMDGPU/load-saddr-offset-imm.ll
+89-0 1 file

LLVM/project 26b4c25 flang-rt/lib/cuda stream.cpp allocator.cpp, flang-rt/unittests/Runtime/CUDA Allocatable.cpp

[flang][cuda] Add support for cudaStreamDestroy (#183648)

Add specific lowering and an entry point for cudaStreamDestroy. Since we
keep the associated stream for some allocations, we need to reset it when
the stream is destroyed so we don't use it anymore.
Delta File
+35-0 flang-rt/unittests/Runtime/CUDA/Allocatable.cpp
+19-0 flang/lib/Optimizer/Builder/CUDAIntrinsicCall.cpp
+10-4 flang-rt/lib/cuda/stream.cpp
+10-0 flang/test/Lower/CUDA/cuda-default-stream.cuf
+9-0 flang-rt/lib/cuda/allocator.cpp
+8-0 flang/module/cuda_runtime_api.f90
+91-4 3 files not shown
+96-4 9 files

LLVM/project 5e6f0c4 clang/lib/Basic Targets.cpp, clang/lib/Basic/Targets OSTargets.h

[Clang][Hexagon] Add QURT as recognized OS in target triple (#183622)

Add support for QURT as a recognized OS type in the LLVM triple
system, and define the `__qurt__` predefined macro when targeting it.
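
As a small illustration of how user code might consume the new macro, the sketch below tests for `__qurt__` at compile time. The function name `isQurtTarget` is hypothetical and not part of the patch.

```cpp
#include <cassert>

// Hypothetical helper: true only when the compiler predefines __qurt__,
// which per the commit message happens when targeting QURT.
bool isQurtTarget() {
#ifdef __qurt__
  return true;
#else
  return false;
#endif
}
```

On any other target the macro is absent and the function returns false.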
Delta File
+13-0 clang/lib/Basic/Targets/OSTargets.h
+5-1 llvm/include/llvm/TargetParser/Triple.h
+6-0 clang/test/Preprocessor/hexagon-predefines.c
+3-0 llvm/lib/TargetParser/Triple.cpp
+2-0 clang/lib/Basic/Targets.cpp
+29-1 5 files

LLVM/project 7c022af compiler-rt/lib/scudo/standalone wrappers_c.inc report.cpp, compiler-rt/lib/scudo/standalone/tests wrappers_c_test.cpp report_test.cpp

[scudo] Add reallocarray C wrapper. (#183385)

`reallocarray()` is a POSIX extension to the C standard that wraps the
`realloc` function and adds `calloc`-like overflow detection. It is
available in glibc and some other standard library implementations. Add
`reallocarray` to the list of Scudo C wrappers, so that code that
depends on the presence of `reallocarray` continues to work.
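
The `calloc`-like overflow detection mentioned above can be sketched as follows. This is a minimal illustration of the semantics, not Scudo's wrapper: `checkedReallocArray` is a hypothetical name, and the `ENOMEM` result on overflow follows glibc's documented behaviour for `reallocarray`.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical name; shows the overflow check, not Scudo's implementation.
void *checkedReallocArray(void *ptr, std::size_t count, std::size_t size) {
  // count * size overflows size_t exactly when count > SIZE_MAX / size.
  if (size != 0 && count > SIZE_MAX / size) {
    errno = ENOMEM;
    return nullptr; // the original block is left untouched
  }
  return std::realloc(ptr, count * size);
}
```

The division-based check avoids performing the multiplication that might overflow, which is the same idea `calloc` implementations use.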
Delta File
+29-0 compiler-rt/lib/scudo/standalone/tests/wrappers_c_test.cpp
+13-0 compiler-rt/lib/scudo/standalone/wrappers_c.inc
+7-0 compiler-rt/lib/scudo/standalone/report.cpp
+3-3 compiler-rt/lib/scudo/standalone/wrappers_c_checks.h
+2-0 compiler-rt/lib/scudo/standalone/tests/report_test.cpp
+1-0 compiler-rt/lib/scudo/standalone/report.h
+55-3 6 files

LLVM/project 7e39b28 libc/shared/math nextafterf16.h nextafterf128.h, libc/src/__support/math CMakeLists.txt nextafterf16.h

[libc][math] Refactor nextafter family to header-only (#181673)

closes #181672
Delta File
+88-3 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+57-0 libc/src/__support/math/CMakeLists.txt
+32-0 libc/src/__support/math/nextafterf16.h
+32-0 libc/src/__support/math/nextafterf128.h
+29-0 libc/shared/math/nextafterf16.h
+29-0 libc/shared/math/nextafterf128.h
+267-3 18 files not shown
+502-40 24 files

LLVM/project 2f6b31f compiler-rt/test/safestack overflow.c

merge

Created using spr 1.3.7
Delta File
+6-0 compiler-rt/test/safestack/overflow.c
+6-0 1 file

LLVM/project f659508 clang/lib/Driver SanitizerArgs.cpp, clang/test/Driver fsanitize-minimal-runtime.c

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta File
+3-3 clang/lib/Driver/SanitizerArgs.cpp
+6-0 compiler-rt/test/cfi/icall/bad-signature.c
+6-0 compiler-rt/test/safestack/overflow.c
+4-0 clang/test/Driver/fsanitize-minimal-runtime.c
+19-3 4 files

LLVM/project 92317c2 clang/lib/Driver SanitizerArgs.cpp, clang/test/Driver fsanitize-minimal-runtime.c

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.7

[skip ci]
Delta File
+3-3 clang/lib/Driver/SanitizerArgs.cpp
+6-0 compiler-rt/test/safestack/overflow.c
+4-0 clang/test/Driver/fsanitize-minimal-runtime.c
+13-3 3 files

LLVM/project 9f3fc05 clang/lib/Driver SanitizerArgs.cpp, clang/test/Driver fsanitize-minimal-runtime.c

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta File
+3-3 clang/lib/Driver/SanitizerArgs.cpp
+6-0 compiler-rt/test/safestack/overflow.c
+4-0 clang/test/Driver/fsanitize-minimal-runtime.c
+13-3 3 files

LLVM/project 55a3cd2 clang/lib/Driver SanitizerArgs.cpp, clang/test/Driver fsanitize-minimal-runtime.c

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.7

[skip ci]
Delta File
+3-3 clang/lib/Driver/SanitizerArgs.cpp
+4-0 clang/test/Driver/fsanitize-minimal-runtime.c
+7-3 2 files

LLVM/project a7ffa75 clang/lib/Driver SanitizerArgs.cpp, clang/test/Driver fsanitize-minimal-runtime.c

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta File
+3-3 clang/lib/Driver/SanitizerArgs.cpp
+4-0 clang/test/Driver/fsanitize-minimal-runtime.c
+7-3 2 files

LLVM/project 20ec9a9 clang/cmake/modules AddClang.cmake

build: correct `MSVC` and Windows mixup for `CLANG_BUILD_STATIC` (#183609)

The build incorrectly used `MSVC` to determine that we were building for
Windows (MS ABI). This prevents the use of the GNU driver for building
LLVM for Windows. Adjust the condition to `WIN32 AND NOT MINGW` to
correctly identify that we are building for Windows MS ABI.
Delta File
+1-1 clang/cmake/modules/AddClang.cmake
+1-1 1 file

LLVM/project e559455 compiler-rt/lib/scudo/standalone secondary.h

[scudo] Change header tagging for the secondary allocator (#182487)

When the secondary allocator allocates a new chunk, the allocation is
prepended with a chunk header (shared with the primary allocator)
and a large header (used only by the secondary).
Only the headers are tagged, not the data, and the headers are
tagged individually because they use different tags.
In the current implementation, tagging the large header also tags the
unused area, so the allocator can tag up to a page (in the
worst case), which is costly and brings no security benefit (as the
area is unused).
With this fix we can eliminate around 97-98% of the tagging for
the secondary allocator, as measured with random benchmarks.

Co-authored-by: Christopher Ferris <cferris1000 at users.noreply.github.com>
Delta File
+4-7 compiler-rt/lib/scudo/standalone/secondary.h
+4-7 1 file

LLVM/project 2fc0733 llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 faddv.ll faddv-fp16.ll

[AArch64] Decompose FADD reductions with known zero elements (#167313)

FADDV is matched into FADDPv4f32 + FADDPv2i32p, but this can be relaxed
when one or more elements (usually the 4th) are known to be zero.

Before:
```
movi d1, #0000000000000000
mov v0.s[3], v1.s[0]
faddp v0.4s, v0.4s, v0.4s
faddp s0, v0.2s
```

After:
```
mov s1, v0.s[2]
faddp s0, v0.2s
fadd s0, s0, s1
```

    [2 lines not shown]
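
The arithmetic behind the two instruction sequences above can be modelled in scalar code. This is a sketch only: `reducePairwise` and `reduceKnownZeroLane3` are hypothetical names that mirror the order of additions in the "before" and "after" listings, and the sketch does not model signed-zero corner cases or whatever conditions the patch requires.

```cpp
#include <array>
#include <cassert>

// "Before": two pairwise adds, i.e. (v0 + v1) + (v2 + v3).
float reducePairwise(const std::array<float, 4> &v) {
  return (v[0] + v[1]) + (v[2] + v[3]);
}

// "After", valid when lane 3 is known to be zero: one pairwise add of
// lanes 0 and 1, then a scalar add of lane 2, i.e. (v0 + v1) + v2.
float reduceKnownZeroLane3(const std::array<float, 4> &v) {
  return (v[0] + v[1]) + v[2];
}
```

When `v[3]` is `0.0f`, the inner sum `v[2] + v[3]` is just `v[2]` for ordinary values, so both functions perform the same remaining additions in the same order.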
Delta File
+301-0 llvm/test/CodeGen/AArch64/faddv.ll
+130-0 llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+68-0 llvm/test/CodeGen/AArch64/faddv-fp16.ll
+2-3 llvm/test/CodeGen/AArch64/vecreduce-fadd.ll
+501-3 4 files

LLVM/project 6c41d12 llvm/lib/Transforms/Vectorize VPlanTransforms.cpp LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize vplan-based-stride-mv.ll

[VPlan] Implement VPlan-based stride speculation
Delta File
+928-1,076 llvm/test/Transforms/LoopVectorize/vplan-based-stride-mv.ll
+273-150 llvm/test/Transforms/LoopVectorize/VPlan/vplan-based-stride-mv.ll
+237-3 llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+54-3 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+43-0 llvm/lib/Transforms/Vectorize/VPlan.h
+5-5 llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
+1,540-1,237 6 files not shown
+1,568-1,240 12 files

LLVM/project e92dd71 llvm/lib/Target/RISCV RISCVInstrInfoP.td

[RISCV] Add Defs = VXSAT to P extension instructions. (#183455)

Delta File
+91-2 llvm/lib/Target/RISCV/RISCVInstrInfoP.td
+91-2 1 file

LLVM/project 9bdcb45 llvm/test/CodeGen/PowerPC clmul-vector.ll, llvm/test/CodeGen/RISCV clmul.ll clmulr.ll

Merge branch 'fix-blockfreq-unroll-unconditional-latches--fast' into fix-blockfreq-unroll-unconditional-latches--uniform
Delta File
+25,051-14,920 llvm/test/CodeGen/RISCV/clmul.ll
+16,004-0 llvm/test/MC/AMDGPU/gfx13_asm_vopd3.s
+13,198-0 llvm/test/CodeGen/RISCV/clmulr.ll
+12,863-0 llvm/test/CodeGen/RISCV/clmulh.ll
+5,835-5,584 llvm/test/tools/llvm-dwarfdump/X86/simplified-template-names.s
+8,874-0 llvm/test/CodeGen/PowerPC/clmul-vector.ll
+81,825-20,504 6,673 files not shown
+440,525-170,131 6,679 files

LLVM/project c6db35f mlir/lib/Dialect/XeGPU/Transforms XeGPUPeepHoleOptimizer.cpp, mlir/test/Dialect/XeGPU peephole-optimize.mlir

[mlir][xegpu] Retain order attribute during load + transpose optimization. (#183608)

As described in the title, the `order` attribute is ignored in this
transformation, causing downstream test failures.
Delta File
+59-47 mlir/test/Dialect/XeGPU/peephole-optimize.mlir
+3-2 mlir/lib/Dialect/XeGPU/Transforms/XeGPUPeepHoleOptimizer.cpp
+62-49 2 files

LLVM/project 1ab63d5 llvm/lib/Transforms/Vectorize LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize induction.ll single-value-blend-phis.ll

[VPlan] Process instructions in reverse order when widening

It doesn't matter right now because we're using CM's decision, but
https://github.com/llvm/llvm-project/pull/182595 introduces some
scalarization (first-lane-only) opportunities that aren't known in CM.
Supporting those requires reverse iteration order, because they are
determined by VPUsers and not operands.
Delta File
+27-27 llvm/test/Transforms/LoopVectorize/induction.ll
+4-2 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+3-3 llvm/test/Transforms/LoopVectorize/AArch64/predication_costs.ll
+3-3 llvm/test/Transforms/LoopVectorize/X86/induction-costs.ll
+1-1 llvm/test/Transforms/LoopVectorize/single-value-blend-phis.ll
+38-36 5 files

LLVM/project 6bc9ba7 llvm/lib/Target/Hexagon HexagonISelLowering.cpp, llvm/test/CodeGen/Hexagon vgather-memvt.ll

[Hexagon] Fix memory type for vgather intrinsics (#183563)

Some of the Hexagon vgather intrinsics were picking the memory type
(memVT) from a fixed argument position, but for several variants (e.g.
the predicated ones), that argument isn’t actually the data vector being
gathered. As a result, LLVM could end up recording the wrong memory type
or size (e.g. i32 or mask instead of the vector arg). This patch fixes
that by always taking memVT from the last intrinsic argument, which is
the actual data vector.
Delta File
+35-0 llvm/test/CodeGen/Hexagon/vgather-memvt.ll
+2-1 llvm/lib/Target/Hexagon/HexagonISelLowering.cpp
+37-1 2 files

LLVM/project 10abb23 flang/docs GettingInvolved.md

[flang] Update the Flang Community Call to the new MS Teams series (#183576)

Delta File
+10-10 flang/docs/GettingInvolved.md
+10-10 1 file