LLVM/project fd51c78clang/lib/AST DeclTemplate.cpp, clang/lib/Sema SemaTemplateDeductionGuide.cpp SemaTemplate.cpp

[clang] fix alias ctad producing function template with no template parameters (#195303)
DeltaFile
+40-43clang/lib/Sema/SemaTemplateDeductionGuide.cpp
+18-9clang/lib/Sema/SemaTemplate.cpp
+9-0clang/lib/AST/DeclTemplate.cpp
+4-3clang/test/SemaCXX/cxx20-ctad-type-alias.cpp
+3-3clang/lib/Sema/SemaInit.cpp
+2-2clang/test/SemaTemplate/deduction-guide.cpp
+76-604 files not shown
+82-6210 files

LLVM/project b506ef0clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp, llvm/lib/Analysis ValueTracking.cpp

[ValueTracking] Add CharWidth argument to getConstantStringInfo (NFC)

The method assumes that host chars and target chars have the same width.
Add a CharWidth argument so that it can bail out if the requested char
width differs from the host char width.

Alternatively, the check could be done at call sites, but this is more
error-prone.

In the future, this method will be replaced with a different one that
allows host/target chars to have different widths. The prototype will
be the same except that StringRef is replaced with something that is
byte width agnostic. Adding CharWidth argument now reduces the future
diff.
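
The bail-out described above can be sketched roughly as follows. This is a minimal illustration in Python, not the LLVM implementation: `HOST_CHAR_WIDTH`, `get_constant_string_info`, and the bytes-based data model are hypothetical names chosen for the example.

```python
from typing import Optional

# Assumption for this sketch: host chars are 8 bits wide.
HOST_CHAR_WIDTH = 8


def get_constant_string_info(data: bytes, char_width: int) -> Optional[bytes]:
    """Return the bytes before the first NUL, or None if the query fails."""
    # Bail out when the requested (target) char width differs from the host's,
    # instead of relying on every call site to perform this check.
    if char_width != HOST_CHAR_WIDTH:
        return None
    end = data.find(b"\x00")
    if end == -1:
        return None  # not NUL-terminated in this simplified model
    return data[:end]
```

Keeping the check inside the helper means a caller that forgets about unusual char widths gets a failed query rather than a misinterpreted string.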
DeltaFile
+72-31llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
+9-3llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
+5-2llvm/lib/Analysis/ValueTracking.cpp
+2-2clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+2-2llvm/lib/Transforms/Utils/AMDGPUEmitPrintf.cpp
+2-2llvm/lib/Target/AMDGPU/AMDGPUPrintfRuntimeBinding.cpp
+92-424 files not shown
+96-4610 files

LLVM/project 3a59b36llvm/include/llvm/IR PatternMatch.h, llvm/lib/Analysis InstructionSimplify.cpp

[IR] Account for byte width in m_PtrAdd

The method has few uses yet, so just pass DL argument to it. The change
follows m_PtrToIntSameSize, and I don't see a better way of delivering
the byte width to the method.
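
One plausible reading of "account for byte width" is sketched below, assuming a pointer add is modeled as a single-index GEP whose source element type is an integer as wide as a byte. The `DataLayout` and `GEP` classes and `matches_ptradd` are hypothetical stand-ins, not LLVM APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataLayout:
    byte_width: int = 8  # bits per byte, as given by the data layout


@dataclass(frozen=True)
class GEP:
    source_elem_bits: int  # width of the integer source element type
    num_indices: int


def matches_ptradd(gep: GEP, dl: DataLayout) -> bool:
    """A GEP counts as a ptradd only if it steps in units of one byte."""
    return gep.num_indices == 1 and gep.source_elem_bits == dl.byte_width
```

With a hard-coded 8-bit check, an `i8` GEP on a 16-bit-byte target would be misread as byte addressing; passing the data layout removes that assumption.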
DeltaFile
+19-6llvm/unittests/IR/PatternMatch.cpp
+8-5llvm/include/llvm/IR/PatternMatch.h
+4-3llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+3-2llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp
+1-1llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+1-1llvm/lib/Analysis/InstructionSimplify.cpp
+36-186 files

LLVM/project 68d0d2bllvm/include/llvm/IR IRBuilder.h, llvm/lib/Transforms/Instrumentation SanitizerCoverage.cpp

[IRBuilder] Add getByteTy and use it in CreatePtrAdd

The change requires a DataLayout instance to be available, which, in turn,
requires the insertion point to be set. In-tree tests detected only one case
where the function was called without setting an insertion point; it was
changed to create a constant expression directly.
DeltaFile
+22-0llvm/unittests/IR/IRBuilderTest.cpp
+8-2llvm/include/llvm/IR/IRBuilder.h
+2-3llvm/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
+32-53 files

LLVM/project 6950e52clang/lib/CodeGen ItaniumCXXABI.cpp, llvm/include/llvm/IR Constants.h

Use DL in ConstantExpr::getPtrAdd() / ConstantExpr::getInBoundsPtrAdd()
DeltaFile
+9-9llvm/lib/Analysis/ScalarEvolution.cpp
+9-9llvm/include/llvm/IR/Constants.h
+9-0llvm/lib/IR/Constants.cpp
+5-3llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+2-1clang/lib/CodeGen/ItaniumCXXABI.cpp
+2-1llvm/unittests/IR/PatternMatch.cpp
+36-237 files not shown
+43-3013 files

LLVM/project 2b27589llvm/lib/Transforms/Utils SimplifyLibCalls.cpp, llvm/test/Transforms/InstCombine/SimplifyLibCalls memcpy-b16.ll memset-b16.ll

[SimplifyLibCalls] Add initial support for non-8-bit bytes

The patch makes the CharWidth argument of `getStringLength` mandatory
and ensures the correct values are passed in most cases.
This is *not* complete support for unusual byte widths in
SimplifyLibCalls, since `getConstantStringInfo` returns false for those.
The code guarded by `getConstantStringInfo` returning true is unchanged
because the changes are currently not testable.
DeltaFile
+126-67llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
+69-0llvm/test/Transforms/InstCombine/SimplifyLibCalls/memcpy-b16.ll
+66-0llvm/test/Transforms/InstCombine/SimplifyLibCalls/memset-b16.ll
+50-0llvm/test/Transforms/InstCombine/SimplifyLibCalls/strcmp-b32.ll
+47-0llvm/test/Transforms/InstCombine/SimplifyLibCalls/stpncpy-b16.ll
+45-0llvm/test/Transforms/InstCombine/SimplifyLibCalls/strchr-b16.ll
+403-6729 files not shown
+932-10435 files

LLVM/project 5d7713bllvm/lib/Analysis ValueTracking.cpp

[ValueTracking] Make isBytewiseValue byte width agnostic

This is a simple change to show how easy it can be to support unusual
byte widths in the middle end.
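
As an illustration of what "byte width agnostic" can mean for a constant integer, the sketch below checks whether a value is a repetition of a single byte of a given width. It is an assumption-laden model (unsigned Python ints, a hypothetical `bytewise_value` helper), not the code from the patch.

```python
from typing import Optional


def bytewise_value(value: int, bit_width: int, byte_width: int) -> Optional[int]:
    """Return the repeated byte if `value` is a splat of one byte, else None.

    `value` is treated as an unsigned integer of `bit_width` bits.
    """
    if byte_width <= 0 or bit_width <= 0 or bit_width % byte_width != 0:
        return None
    mask = (1 << byte_width) - 1
    first = value & mask
    # Every byte-sized chunk must equal the lowest one.
    for shift in range(0, bit_width, byte_width):
        if (value >> shift) & mask != first:
            return None
    return first
```

The only width-dependent pieces are the mask and the step, which is why replacing a literal 8 with the data layout's byte width is a small change.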
DeltaFile
+14-13llvm/lib/Analysis/ValueTracking.cpp
+14-131 files

LLVM/project 37f8c63llvm/include/llvm/IR Intrinsics.td Intrinsics.h, llvm/lib/IR Intrinsics.cpp Core.cpp

[IR] Make @llvm.memset prototype byte width dependent

This patch changes the type of the value argument of @llvm.memset and
similar intrinsics from i8 to iN, where N is the byte width specified
in the data layout string.
Note that the argument still has a fixed type (not overloaded), but the
type checker will complain if the type does not match the byte width.

Ideally, the type of the argument would be dependent on the address
space of the pointer argument. It is easy to do this (and I did it
downstream as a PoC), but since data layout string doesn't currently
allow different byte widths for different address spaces, I refrained
from doing it now.
DeltaFile
+37-25llvm/lib/IR/Intrinsics.cpp
+9-4llvm/include/llvm/IR/Intrinsics.td
+6-3llvm/include/llvm/IR/Intrinsics.h
+2-2llvm/lib/IR/Core.cpp
+3-1llvm/lib/IR/Function.cpp
+2-1llvm/lib/IR/AutoUpgrade.cpp
+59-366 files not shown
+68-4212 files

LLVM/project 20f765cllvm/include/llvm/IR DataLayout.h, llvm/lib/IR DataLayout.cpp

[DataLayout] Add byte specification

This patch adds byte specification to data layout string.
The specification is `b:<size>`, where `<size>` is the size of a byte
in bits (later referred to as "byte width").

Limitations:
* The only values allowed for byte width are 8, 16, and 32.
16-bit bytes are popular, and my downstream target has 32-bit bytes.
These are the widths I'm going to add tests for in follow-up patches,
so this restriction only exists because other widths are untested.
* It is assumed that bytes are the same in all address spaces.
Supporting different byte widths in different address spaces would
require adding an address space argument to all DataLayout methods
that query ABI / preferred alignments because they return *byte*
alignments, and those will be different for different address spaces.
This is too much effort for now, but it can be done in the future if the
need arises; the specification reserves an address space number before ':'.
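
The `b:<size>` specification and its width restriction can be sketched as a small parser. This is illustrative only: `parse_byte_width` is a hypothetical helper, the default of 8 when no byte specification is present is an assumption, and the reserved address-space form is simply rejected here.

```python
ALLOWED_BYTE_WIDTHS = (8, 16, 32)
DEFAULT_BYTE_WIDTH = 8  # assumption: used when no "b:" spec is present


def parse_byte_width(layout: str) -> int:
    """Extract the byte width from a '-'-separated data layout string."""
    width = DEFAULT_BYTE_WIDTH
    for spec in layout.split("-"):
        if not spec.startswith("b"):
            continue  # other specifications are ignored in this sketch
        key, sep, size = spec.partition(":")
        # Only the plain "b:<size>" form is handled; an address space number
        # before ':' is reserved by the specification and rejected here.
        if key != "b" or not sep:
            raise ValueError(f"unsupported byte specification: {spec!r}")
        if not size.isdigit() or int(size) not in ALLOWED_BYTE_WIDTHS:
            raise ValueError(f"byte width must be 8, 16, or 32: {spec!r}")
        width = int(size)
    return width
```

For example, `parse_byte_width("e-b:16-p:32:32")` yields 16, while `b:12` is rejected because only 8, 16, and 32 are allowed.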
DeltaFile
+68-19llvm/lib/IR/DataLayout.cpp
+16-6llvm/include/llvm/IR/DataLayout.h
+84-252 files

LLVM/project 3823341llvm/lib/Transforms/Vectorize VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize reduction-inloop-uf4.ll consecutive-ptr-uniforms.ll

Reapply "[VPlan] Run removeDeadRecipes early." (#195325) (#195445)

This reverts commit 2a9699ccd128d7f94372d18c97229e1934b8506e.

Recommit contains a small fix for skipping dead recipes when finding
induction casts.

Original message:
The initial simplifyRecipes run can leave dead recipes, which
removeDeadRecipes can clean up; the same applies to dead instructions
in the input.

PR: https://github.com/llvm/llvm-project/pull/190191
DeltaFile
+8-25llvm/test/Transforms/LoopVectorize/VPlan/predicator.ll
+15-10llvm/test/Transforms/LoopVectorize/reduction-inloop-uf4.ll
+10-6llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory-with-wide-ops-chained.ll
+10-5llvm/test/Transforms/LoopVectorize/consecutive-ptr-uniforms.ll
+13-1llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+6-6llvm/test/Transforms/LoopVectorize/tail-folding-optimize-vector-induction-width.ll
+62-5330 files not shown
+132-13136 files

LLVM/project 82f9618libc/src/__support/math CMakeLists.txt llrintf16.h, libc/src/math/generic CMakeLists.txt

[libc][math] Refactor lrint_lround family to header-only (#195441)

Refactors the lrint_lround math family to be header-only.

part of: #147386

Target Functions:
  - llrint
  - llrintbf16
  - llrintf
  - llrintf128
  - llrintf16
  - llrintl
  - llround
  - llroundbf16
  - llroundf
  - llroundf128
  - llroundf16
  - llroundl

    [11 lines not shown]
DeltaFile
+352-12utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+228-0libc/src/__support/math/CMakeLists.txt
+24-42libc/src/math/generic/CMakeLists.txt
+48-0libc/test/shared/CMakeLists.txt
+33-0libc/src/__support/math/llrintf16.h
+33-0libc/src/__support/math/llrintf128.h
+718-5473 files not shown
+2,045-19979 files

LLVM/project 516a9d4mlir/docs/Traits _index.md, mlir/include/mlir/IR OpDefinition.h OpBase.td

[MLIR] Add HasAncestor op trait

Add HasAncestor/AncestorOneOf traits that verify an operation has a
specific ancestor anywhere in the parent chain, unlike HasParent which
only checks the immediate parent. This enables declarative verification
for ops that can be nested arbitrarily deep inside a required ancestor.
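
The difference between the two checks can be sketched with a toy operation tree. The `Op` class and both helper functions are hypothetical and only model the parent chain, not MLIR's trait machinery.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Op:
    name: str
    parent: Optional["Op"] = None


def has_parent(op: Op, allowed: Set[str]) -> bool:
    """HasParent-style check: only the immediate parent is inspected."""
    return op.parent is not None and op.parent.name in allowed


def has_ancestor(op: Op, allowed: Set[str]) -> bool:
    """HasAncestor-style check: walk the whole parent chain."""
    current = op.parent
    while current is not None:
        if current.name in allowed:
            return True
        current = current.parent
    return False
```

An op nested inside a loop inside a function passes `has_ancestor(op, {"func.func"})` but fails `has_parent(op, {"func.func"})`, which is the case the new trait is meant to cover.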
DeltaFile
+79-0mlir/test/IR/traits.mlir
+26-0mlir/include/mlir/IR/OpDefinition.h
+10-0mlir/docs/Traits/_index.md
+9-0mlir/test/lib/Dialect/Test/TestOps.td
+8-0mlir/include/mlir/IR/OpBase.td
+132-05 files

LLVM/project 1ed1761llvm/lib/Transforms/Utils CloneFunction.cpp

[NFC][LLVM] Simplify `PruningFunctionCloner::cloneInstruction` (#195389)

Add early returns and decrease indentation of the code that
implements calls to constrained intrinsics.
DeltaFile
+57-62llvm/lib/Transforms/Utils/CloneFunction.cpp
+57-621 files

LLVM/project 7b026acllvm/lib/Transforms/Vectorize VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize/X86 replicating-load-store-costs.ll

[VPlan] Simplify extract-lane of all single-scalars (#194838)

Checking against vputils::isSingleScalar is sufficient for both
correctness and profitability.
DeltaFile
+3-5llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+0-1llvm/test/Transforms/LoopVectorize/X86/replicating-load-store-costs.ll
+3-62 files

LLVM/project 482d0f0llvm/test/Transforms/LoopVectorize cast-induction.ll

[LV] Add test for crash with VPlan-based DCE (NFC) (#195438)

Add a test with a dead cast, which caused the revert in 2a9699ccd12 due to
https://lab.llvm.org/buildbot/#/builders/67/builds/3821.
DeltaFile
+152-0llvm/test/Transforms/LoopVectorize/cast-induction.ll
+152-01 files

LLVM/project 2447939clang-tools-extra/clang-tidy/misc UnusedParametersCheck.cpp UnusedParametersCheck.h, clang-tools-extra/docs ReleaseNotes.rst

[Clang-Tidy] Skip `misc-unused-parameters` in macro. (#194999)

The new parameter allows skipping the check in cases where a macro must
be used but there is no immediate way to fix the macro itself. This
recently came up during a gradual adoption of clang-tidy, where not all
parts of the code could be fixed at once.

Simply enabling the check via `clang-tidy-diff` is not enough:
`misc-unused-parameters` would report false positives, since the given
diff did not introduce the unused parameters and they might not be easy
to change.

The new parameter allows for better incremental adoption.

---------

Co-authored-by: Dmitrii Kuragin <dkuragin at adobe.com>
Co-authored-by: EugeneZelenko <eugene.zelenko at gmail.com>
DeltaFile
+21-0clang-tools-extra/test/clang-tidy/checkers/misc/unused-parameters-macro.cpp
+8-0clang-tools-extra/docs/clang-tidy/checks/misc/unused-parameters.rst
+5-1clang-tools-extra/clang-tidy/misc/UnusedParametersCheck.cpp
+5-0clang-tools-extra/docs/ReleaseNotes.rst
+1-0clang-tools-extra/test/clang-tidy/infrastructure/dump-config-filtering.cpp
+1-0clang-tools-extra/clang-tidy/misc/UnusedParametersCheck.h
+41-16 files

LLVM/project 1b1e14flibc/src/__support/math f16addl.h f16addf128.h, libc/test/shared shared_math_constexpr_test.cpp CMakeLists.txt

[libc][math] Qualify f16add functions to constexpr (#195429)

Signed-off-by: udaykiriti <udaykiriti624 at gmail.com>
DeltaFile
+11-0libc/test/shared/shared_math_constexpr_test.cpp
+4-0libc/test/shared/CMakeLists.txt
+1-1libc/src/__support/math/f16addl.h
+1-1libc/src/__support/math/f16addf128.h
+1-1libc/src/__support/math/f16addf.h
+1-1libc/src/__support/math/f16add.h
+19-46 files

LLVM/project 34a80c9llvm/lib/Transforms/InstCombine InstructionCombining.cpp, llvm/test/Transforms/InstCombine allocsite-removable-few-users.ll

[InstCombine] Add user-count bailout to isAllocSiteRemovable (#190347)

isAllocSiteRemovable() walks all transitive users of an alloc site, but
sites with many users are almost never removable. Profiling on
real-world codegen workloads (73,943 alloc sites) showed:

- 89 removable sites, max 1,392 users walked
- 73,854 non-removable sites, avg 31,305 users walked
- 2.31B total wasted user visits (~400s wall-clock on a 35-min build)

Skip the removability analysis when direct user count exceeds a
configurable threshold (default 2048, tunable via hidden cl::opt
-instcombine-max-allocsite-removable-users).

Also defer WeakTrackingVH conversion: collect into Instruction* first
and convert only when the site is actually removable.
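
The bailout can be sketched as a guard in front of the transitive walk. This is a simplified model: the `User` class, its `removable` flag, and `is_alloc_site_removable` are hypothetical, and the walk follows every user, whereas the real analysis is more selective.

```python
from dataclasses import dataclass, field
from typing import List

MAX_USERS = 2048  # default threshold quoted in the commit message


@dataclass
class User:
    removable: bool = True
    users: List["User"] = field(default_factory=list)


def is_alloc_site_removable(direct_users: List[User],
                            max_users: int = MAX_USERS) -> bool:
    # Cheap early exit: sites with many direct users are almost never
    # removable, so skip the expensive transitive walk entirely.
    if len(direct_users) > max_users:
        return False
    seen = set()
    worklist = list(direct_users)
    while worklist:
        user = worklist.pop()
        if id(user) in seen:
            continue
        seen.add(id(user))
        if not user.removable:
            return False
        worklist.extend(user.users)
    return True
```

The guard trades a small chance of missing a removable site for avoiding the walks that, per the profile above, dominate the cost.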
DeltaFile
+31-0llvm/test/Transforms/InstCombine/allocsite-removable-few-users.ll
+15-3llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+46-32 files

LLVM/project 0ec4b22libc/src/__support/math fmul.h

[libc][math][NFC] fix fmul build (#195437)
DeltaFile
+1-1libc/src/__support/math/fmul.h
+1-11 files

LLVM/project 11dd2c1libc/src/__support/math fmul.h

[libc][math][NFC] fix fmul build
DeltaFile
+1-1libc/src/__support/math/fmul.h
+1-11 files

LLVM/project a7f6a6flibc/src/__support/math fmul.h CMakeLists.txt, libc/src/math/generic fmul.cpp

[libc][math] Refactor fmul-fsub-frexp family to header-only (#195431)

Refactors the fmul-fsub-frexp math family to be header-only.

part of: #147386

Target Functions:
  - fmul
  - fmulf128
  - fmull
  - fsub
  - fsubf128
  - fsubl
  - frexp
  - frexpbf16
  - frexpl

Co-authored-by: Muhammad Bassiouni <60100307+bassiounix at users.noreply.github.com>
DeltaFile
+131-8utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+126-0libc/src/__support/math/fmul.h
+4-105libc/src/math/generic/fmul.cpp
+87-0libc/src/__support/math/CMakeLists.txt
+31-0libc/src/__support/math/fsubf128.h
+31-0libc/src/__support/math/fmulf128.h
+410-11329 files not shown
+873-16535 files

LLVM/project d22b41dllvm/lib/Transforms/Utils AssumeBundleBuilder.cpp, llvm/test/Transforms/Util assume-builder-atomics.ll

[llvm] Add support for atomicrmw and cmpxchg in AssumeBundleBuilder (#194630)

The assume builder currently only preserves dereferenceable, nonnull,
and alignment knowledge for regular load/store instructions and calls.
Atomic memory accessing instructions (atomicrmw and cmpxchg) also
dereference their pointer operands, but were previously skipped, causing
useful knowledge to be lost across these operations.

Add handling for AtomicRMWInst and AtomicCmpXchgInst in
AssumeBuilderState::addInstruction(), using the same addAccessedPtr()
path as loads and stores. The accessed type is taken from the value
operand (atomicrmw) or compare operand (cmpxchg), which corresponds to
the in-memory element type, and the alignment is taken from the
instruction's explicit alignment.

Add a test to verify that assume bundles are correctly generated before
atomicrmw and cmpxchg instructions.

---------

Co-authored-by: Nikita Popov <github at npopov.com>
DeltaFile
+17-0llvm/test/Transforms/Util/assume-builder-atomics.ll
+7-1llvm/lib/Transforms/Utils/AssumeBundleBuilder.cpp
+24-12 files

LLVM/project 6ef6713llvm/lib/Target/RISCV RISCVISelLowering.cpp RISCVInstrInfoZvzip.td, llvm/test/CodeGen/RISCV/rvv vector-interleave.ll fixed-vectors-shuffle-int-interleave.ll

[RISCV][CodeGen] Add initial vzip codegen support (#194548)

Add initial support for the vzip instruction, which is part of the zvzip
extension. It is used to lower VECTOR_SHUFFLE with an interleave pattern
and VECTOR_INTERLEAVE.
DeltaFile
+61-143llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll
+23-63llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-int-interleave.ll
+55-2llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+11-25llvm/test/CodeGen/RISCV/rvv/vector-interleave-fixed.ll
+25-0llvm/lib/Target/RISCV/RISCVInstrInfoZvzip.td
+4-8llvm/test/CodeGen/RISCV/rvv/vector-deinterleave.ll
+179-2416 files

LLVM/project ef4a720llvm/lib/Transforms/IPO Instrumentor.cpp

Fix format
DeltaFile
+2-1llvm/lib/Transforms/IPO/Instrumentor.cpp
+2-11 files

LLVM/project 329853dclang/lib/CIR/CodeGen CIRGenExpr.cpp CIRGenValue.h, clang/test/CIR/CodeGen vector-ext-element.cpp

[CIR] Implement ExtVectorElementExpr with non simple base (#195165)

Implement support for ExtVectorElementExpr with a non-simple base.

Issue #192311
DeltaFile
+24-0clang/test/CIR/CodeGen/vector-ext-element.cpp
+13-3clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+3-1clang/lib/CIR/CodeGen/CIRGenValue.h
+40-43 files

LLVM/project ce49712llvm/lib/Transforms/IPO Instrumentor.cpp

Indicate when constant int should be signed
DeltaFile
+3-3llvm/lib/Transforms/IPO/Instrumentor.cpp
+3-31 files

LLVM/project 797cad2llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

Merge branch 'main' into users/kevinsala/instrumentor-base-pr
DeltaFile
+158,755-173,230llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+50,477-50,088llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+92,827-0llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+42,349-42,348llvm/test/MC/AMDGPU/gfx8_asm_vop3.s
+764,188-265,66644,684 files not shown
+6,747,763-3,384,96444,690 files

LLVM/project d4ae620clang/lib/CIR/Dialect/IR CIRDialect.cpp, clang/unittests/CIR ControlFlowTest.cpp CMakeLists.txt

[CIR] Add RegionBranchOpInterface unit tests and fix control flow bugs

Add unit tests for RegionBranchOpInterface implementations across CIR
control flow operations: IfOp, ScopeOp, TernaryOp, SwitchOp, WhileOp,
ForOp, DoWhileOp, and TryOp. The tests verify successor regions,
terminator successors, loop detection, repetitive region marking, and
op/terminator successor consistency.

Fix a missing return in ConditionOp::getSuccessorRegions that caused
fallthrough from the loop case to an unconditional cast<AwaitOp>,
crashing when the parent is a loop operation.

Fix IfOp::getSuccessorRegions to report parent exit as a successor
when the else region is absent, correctly modeling the case where the
condition is false.
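
The IfOp fix can be pictured with a toy model of the successors reachable when control enters the op from its parent. The function name and the string labels are hypothetical, and this does not reproduce the RegionBranchOpInterface API.

```python
from typing import List


def if_op_entry_successors(has_else_region: bool) -> List[str]:
    """Successors when control enters an if-like op from its parent."""
    successors = ["then"]
    if has_else_region:
        successors.append("else")
    else:
        # With no else region, a false condition skips the op entirely,
        # so the parent (op exit) must be reported as a successor.
        successors.append("parent")
    return successors
```

Omitting the `"parent"` entry in the no-else case would tell a dataflow analysis that the then region always executes.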
DeltaFile
+458-0clang/unittests/CIR/ControlFlowTest.cpp
+3-3clang/lib/CIR/Dialect/IR/CIRDialect.cpp
+2-0clang/unittests/CIR/CMakeLists.txt
+463-33 files

LLVM/project 0d27ddalibc/src/__support/math CMakeLists.txt scalbnf16.h, libc/src/math/generic CMakeLists.txt

[libc][math] Refactor scalbln-scalbn-ldexp family to header-only (#195423)

Refactors the scalbln-scalbn-ldexp math family to be header-only.

part of: #147386

Target Functions:
  - ldexp
  - ldexpbf16
  - ldexpl
  - scalbln
  - scalblnbf16
  - scalblnf
  - scalblnf128
  - scalblnf16
  - scalblnl
  - scalbn
  - scalbnbf16
  - scalbnf

    [2 lines not shown]
DeltaFile
+209-12utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+156-2libc/src/__support/math/CMakeLists.txt
+15-43libc/src/math/generic/CMakeLists.txt
+36-0libc/src/__support/math/scalbnf16.h
+36-0libc/src/__support/math/scalbnf128.h
+36-0libc/src/__support/math/scalblnf16.h
+488-5746 files not shown
+1,326-18252 files

LLVM/project 39d1203libc/src/__support/math CMakeLists.txt remainderf16.h, libc/test/shared shared_math_constexpr_test.cpp shared_math_test.cpp

[libc][math] Refactor remainder-remquo family to header-only (#195421)

Refactors the remainder-remquo math family to be header-only.

part of: #147386

Target Functions:
  - remainder
  - remainderbf16
  - remainderf
  - remainderf128
  - remainderf16
  - remainderl
  - remquo
  - remquobf16
  - remquof
  - remquof128
  - remquof16
  - remquol
DeltaFile
+176-6utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+114-0libc/src/__support/math/CMakeLists.txt
+34-0libc/test/shared/shared_math_constexpr_test.cpp
+33-0libc/test/shared/shared_math_test.cpp
+32-0libc/src/__support/math/remainderf16.h
+32-0libc/src/__support/math/remainderf128.h
+421-639 files not shown
+1,078-9645 files