LLVM/project 481da94 bolt/include/bolt/Passes PAuthGadgetScanner.h, bolt/include/bolt/Utils CommandLineOpts.h

[BOLT] Gadget scanner: implement finer-grained --scanners=... argument (#176135)

Add separate options to enable each of the available gadget detectors.
Furthermore, add two meta-options enabling all PtrAuth scanners and all
available scanners of any type (which is only PtrAuth for now, though).

This commit renames the `pacret` option to `ptrauth-pac-ret` and the
`pauth` option to `ptrauth-all`.
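
A minimal sketch of how a comma-separated scanner list with meta-options
could be turned into a bitmask. Only the names `ptrauth-pac-ret` and
`ptrauth-all` come from the commit message; the `ptrauth-other` and `all`
names, the enum, and `parseScanners` are hypothetical and are not BOLT's
actual implementation.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical bitmask of detectors (illustration only).
enum Scanner : uint32_t {
  PtrAuthPacRet = 1u << 0,
  PtrAuthOther = 1u << 1, // placeholder for any other PtrAuth detector
  PtrAuthAll = PtrAuthPacRet | PtrAuthOther,
  All = PtrAuthAll, // only PtrAuth scanners exist for now
};

// Parses a comma-separated list into Mask. Returns false on an unknown name.
bool parseScanners(const std::string &Arg, uint32_t &Mask) {
  Mask = 0;
  std::stringstream SS(Arg);
  std::string Name;
  while (std::getline(SS, Name, ',')) {
    if (Name == "ptrauth-pac-ret")
      Mask |= PtrAuthPacRet;
    else if (Name == "ptrauth-other") // hypothetical option name
      Mask |= PtrAuthOther;
    else if (Name == "ptrauth-all")
      Mask |= PtrAuthAll;
    else if (Name == "all") // hypothetical option name
      Mask |= All;
    else
      return false;
  }
  return true;
}
```

In this sketch the old spelling `pacret` is simply rejected as unknown; how
the real option handles old names is not stated in the commit message.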
Delta File
+130-0 bolt/test/binary-analysis/AArch64/gs-pauth-scanners.s
+40-15 bolt/lib/Passes/PAuthGadgetScanner.cpp
+28-17 bolt/lib/Rewrite/RewriteInstance.cpp
+20-2 bolt/include/bolt/Utils/CommandLineOpts.h
+7-8 bolt/include/bolt/Passes/PAuthGadgetScanner.h
+9-4 bolt/test/binary-analysis/AArch64/cmdline-args.test
+234-46 10 files not shown
+251-74 16 files

LLVM/project d87ac5b utils/bazel/llvm-project-overlay/libc BUILD.bazel

Fix bazel build for #179251 (#186407)
Delta File
+3-0 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+3-0 1 files

LLVM/project 579aca8 llvm/lib/Transforms/Vectorize VPlanValue.h VPlanTransforms.cpp, llvm/unittests/Transforms/Vectorize VPlanTest.cpp

[VPlan] Prevent uses of materialized VPSymbolicValues. (NFC) (#182318)

After VPSymbolicValues (like VF and VFxUF) are materialized via
replaceAllUsesWith, they should not be accessed again. This patch:

1. Tracks materialization state in VPSymbolicValue.

2. Asserts if the materialized VPValue is used again. Currently it
   adds asserts to various member functions, preventing calling them
   on materialized symbolic values.

Note that this still allows some uses (e.g. comparing VPSymbolicValue
references or pointers), but this should be relatively harmless given
that it is impossible to (re-)add any users. If we want to further
tighten the checks, we could add asserts to the accessors or override
operator&, but that would require more changes without adding many
extra guards, I think.
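
The two steps above can be sketched as follows. This is a simplified
stand-in, not the real VPSymbolicValue: the class name, the `int *` user
slots, and the member names are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for a symbolic value (hypothetical, not VPlan code).
class SymbolicValue {
  std::vector<int *> Users; // slots that currently read this value
  bool Materialized = false;

public:
  bool isMaterialized() const { return Materialized; }

  void addUser(int *Slot) {
    assert(!Materialized && "cannot add users to a materialized value");
    Users.push_back(Slot);
  }

  std::size_t getNumUsers() const {
    assert(!Materialized && "accessing a materialized value");
    return Users.size();
  }

  // Rewrites every user to the concrete value, then records that this value
  // has been materialized so later accesses trip the asserts above.
  void replaceAllUsesWith(int Concrete) {
    assert(!Materialized && "value was already materialized");
    for (int *Slot : Users)
      *Slot = Concrete;
    Users.clear();
    Materialized = true;
  }
};
```

As in the commit message, taking a reference or pointer to the object is
still possible after materialization; only the guarded member functions
assert.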

Depends on https://github.com/llvm/llvm-project/pull/182146 to fix a

    [2 lines not shown]
Delta File
+51-6 llvm/lib/Transforms/Vectorize/VPlanValue.h
+37-0 llvm/unittests/Transforms/Vectorize/VPlanTest.cpp
+11-4 llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+4-4 llvm/lib/Transforms/Vectorize/VPlan.h
+2-1 llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
+3-0 llvm/lib/Transforms/Vectorize/VPlan.cpp
+108-15 6 files

LLVM/project b3bc1f5 lldb/source/Core Module.cpp

[lldb][Module][NFC] Use early-return style in LoadScriptingResourceInTarget (#186392)

Planning on adding more to this function/loop soon. Making it
early-return style (as suggested by the LLVM style guide) makes those
changes easier to reason about.

Drive-by:
* Reduced the indentation of the loop by doing an early-continue if the
`FileSpec` is invalid or doesn't exist
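
A small sketch of the early-continue shape described above. `FileRef` and
`collectLoadable` are hypothetical names used for illustration; this is not
the code from `Module.cpp`.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a file reference; not lldb's FileSpec.
struct FileRef {
  std::string Path;
  bool Exists;
};

// Collects the paths of files that are worth loading.
std::vector<std::string> collectLoadable(const std::vector<FileRef> &Files) {
  std::vector<std::string> Result;
  for (const FileRef &F : Files) {
    // Early-continue: reject invalid or missing entries first, so the rest
    // of the loop body stays at one indentation level.
    if (F.Path.empty() || !F.Exists)
      continue;
    Result.push_back(F.Path);
  }
  return Result;
}
```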
Delta File
+43-44 lldb/source/Core/Module.cpp
+43-44 1 files

LLVM/project c5e85ca llvm/lib/Target/AArch64 AArch64TargetTransformInfo.cpp, llvm/test/Analysis/CostModel/AArch64 pow-special.ll

[AArch64] Improve pow(x,y) cost model for some constant values of y (#185607)

Some optimisations of pow(x, y) calls only occur during codegen,
e.g. pow(x, 0.25) -> sqrt(sqrt(x)) and at the IR level we don't
currently reflect this in the cost of calls to the llvm.pow
intrinsic. This patch attempts to fix that in cases where we know
the intrinsic can in general be legally lowered to libcalls. For
scalable vector variants of llvm.pow we need to be cautious, since
without a math library this cannot be scalarised and there is
always a small risk that the optimisation will not happen during
codegen.
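
To illustrate why a call such as pow(x, 0.25) can be much cheaper than a
general pow libcall, here is a sketch of the sqrt(sqrt(x)) form next to the
reference. The function names are made up for this example, and the
comparison is only meaningful for positive finite inputs; edge cases such as
negative zero behave differently between the two forms.

```cpp
#include <cmath>

// Computes x^(1/4) with two square roots instead of a pow call.
double powQuarterViaSqrt(double X) { return std::sqrt(std::sqrt(X)); }

// Relative difference between the two forms, for positive finite X.
double relativeError(double X) {
  double Ref = std::pow(X, 0.25);
  return std::fabs(powQuarterViaSqrt(X) - Ref) / Ref;
}
```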
Delta File
+119-0 llvm/test/Analysis/CostModel/AArch64/pow-special.ll
+36-1 llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+155-1 2 files

LLVM/project 55db9cb llvm/lib/Analysis IVDescriptors.cpp, llvm/test/Transforms/LoopVectorize/AArch64 conditional-scalar-assignment.ll

[IVDescriptors] Remove single-use constraint from FindLast comparisons (#186096)

Just relaxing some minor constraints for FindLast recurrence detection.
Delta File
+69-25 llvm/test/Transforms/LoopVectorize/AArch64/conditional-scalar-assignment.ll
+52-15 llvm/test/Transforms/LoopVectorize/X86/conditional-scalar-assignment.ll
+2-6 llvm/lib/Analysis/IVDescriptors.cpp
+123-46 3 files

LLVM/project c5e305c libc/include wctype.yaml, libc/src/wctype iswupper.cpp iswupper.h

[libc] Implement entrypoint and test of iswupper function (#185215)

Implement entrypoint and test of iswupper function (#185136)
Delta File
+25-0 libc/test/src/wctype/iswupper_test.cpp
+21-0 libc/src/wctype/iswupper.cpp
+21-0 libc/src/wctype/iswupper.h
+12-0 libc/src/wctype/CMakeLists.txt
+10-0 libc/test/src/wctype/CMakeLists.txt
+6-0 libc/include/wctype.yaml
+95-0 10 files not shown
+103-3 16 files

LLVM/project ea61110 llvm/lib/Target/AArch64 AArch64ExpandPseudoInsts.cpp AArch64SVEInstrInfo.td, llvm/test/CodeGen/AArch64 sve2-bitsel-pseudos-expansion.mir sve2-bsl.ll

[AArch64][SVE2] Allow commuting two-input NBSL/BSL2N idioms. (#184847)

Specifically, EON, NAND and NOR are commutable operations that lack
dedicated SVE2 instructions, but we support them via NBSL/BSL2N.

However, as NBSL/BSL2N have tied operands, sometimes we generate a COPY
even if one of the operands could be clobbered.

This patch defines custom expansion for these operations to allow using
their commuted forms or, if still necessary, using MOVPRFX for the COPY.

Should help with
https://github.com/llvm/llvm-project/pull/176194#discussion_r2889564685.
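
The commutability claim can be checked with a scalar bit-level model. The
definitions of NBSL and BSL2N in the comments below are an assumption here
and should be verified against the Arm architecture reference; the operand
choices for NAND, NOR and EON are one possible encoding, not necessarily the
one the patch uses.

```cpp
#include <cstdint>

// Assumed bitwise models of the SVE2 selects:
//   NBSL : ~((Zdn & Zk) | (Zm & ~Zk))
//   BSL2N:  (Zdn & Zk) | (~Zm & ~Zk)
constexpr uint64_t nbsl(uint64_t Zdn, uint64_t Zm, uint64_t Zk) {
  return ~((Zdn & Zk) | (Zm & ~Zk));
}
constexpr uint64_t bsl2n(uint64_t Zdn, uint64_t Zm, uint64_t Zk) {
  return (Zdn & Zk) | (~Zm & ~Zk);
}

// Two-input idioms built from the selects (illustrative encodings).
constexpr uint64_t nandVia(uint64_t A, uint64_t B) { return nbsl(A, B, B); }
constexpr uint64_t norVia(uint64_t A, uint64_t B) { return nbsl(A, B, A); }
constexpr uint64_t eonVia(uint64_t A, uint64_t B) { return bsl2n(A, A, B); }
```

Because each idiom gives the same result with A and B swapped, either input
may occupy the tied destination operand, which is what makes it possible to
avoid a COPY when one input can be clobbered.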
Delta File
+329-0 llvm/test/CodeGen/AArch64/sve2-bitsel-pseudos-expansion.mir
+74-0 llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
+55-0 llvm/test/CodeGen/AArch64/sve2-bsl.ll
+17-6 llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+2-2 llvm/lib/Target/AArch64/AArch64SchedOlympus.td
+2-1 llvm/lib/Target/AArch64/AArch64SchedNeoverseV2.td
+479-9 4 files not shown
+487-13 10 files

LLVM/project 7855812 libc/shared/math erff16.h, libc/src/__support/math erff16.h

[libc][math] Implement C23 half precision erf function (#179251)

The implementation reuses the approach in `erff`.

Closes #133112
Delta File
+181-0 libc/src/__support/math/erff16.h
+51-0 libc/test/src/math/smoke/erff16_test.cpp
+43-0 libc/test/src/math/erff16_test.cpp
+27-0 libc/shared/math/erff16.h
+21-0 libc/src/math/erff16.h
+20-0 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+343-0 15 files not shown
+420-0 21 files

LLVM/project ad05482 libc/shared/math hypotbf16.h, libc/src/__support/math hypotbf16.h

[libc][math][c23] Add hypotbf16 function (#183460)

This PR adds the hypotbf16 higher math function for the BFloat16 type,
along with tests.
Delta File
+72-0 libc/test/src/math/exhaustive/hypotbf16_test.cpp
+29-0 libc/src/__support/math/hypotbf16.h
+26-0 libc/shared/math/hypotbf16.h
+22-0 libc/test/src/math/hypotbf16_test.cpp
+21-0 libc/src/math/hypotbf16.h
+18-0 libc/test/src/math/smoke/hypotbf16_test.cpp
+188-0 25 files not shown
+313-2 31 files

LLVM/project cf54aca llvm/utils/gn/secondary/clang-tools-extra/clang-doc/tool BUILD.gn

[gn] port b80248a0ea35df more (clang-doc md templates) (#186401)

The previous version misspelled the name of comments-partial.mustache,
and it put the md files in the wrong output directory.
Delta File
+12-5 llvm/utils/gn/secondary/clang-tools-extra/clang-doc/tool/BUILD.gn
+12-5 1 files

LLVM/project c765566 libc/config/linux/riscv entrypoints.txt, libc/src/unistd/linux chown.cpp CMakeLists.txt

[libc] Add support for chown on platforms that don't define SYS_chown (#186167)

Some platforms (such as RISC-V) don't define SYS_chown, so this PR adds a
fallback that calls SYS_fchownat instead.
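
A rough sketch of such a fallback on Linux, written against the public
`syscall()` wrapper rather than libc internals. The function name
`chownCompat` is hypothetical and this is not the code added to
`chown.cpp`.

```cpp
#include <fcntl.h>       // AT_FDCWD
#include <sys/syscall.h> // SYS_chown, SYS_fchownat
#include <sys/types.h>   // uid_t, gid_t
#include <unistd.h>      // syscall

// Changes the owner and group of Path, preferring SYS_chown when the
// platform defines it and falling back to SYS_fchownat otherwise.
inline int chownCompat(const char *Path, uid_t Owner, gid_t Group) {
#ifdef SYS_chown
  return static_cast<int>(syscall(SYS_chown, Path, Owner, Group));
#elif defined(SYS_fchownat)
  // AT_FDCWD resolves relative paths against the current directory, and a
  // flags value of 0 follows symlinks, which matches chown semantics.
  return static_cast<int>(
      syscall(SYS_fchownat, AT_FDCWD, Path, Owner, Group, 0));
#else
#error "no chown-like syscall available"
#endif
}
```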
Delta File
+9-0 libc/src/unistd/linux/chown.cpp
+1-0 libc/config/linux/riscv/entrypoints.txt
+1-0 libc/src/unistd/linux/CMakeLists.txt
+11-0 3 files

LLVM/project a7d1a87 libc/src/__support/math log10p1f16.h CMakeLists.txt, libc/src/math log10p1f16.h

[libc][math][c23] Add log10p1f16 C23 math function (#184739)

Closes #133202

---------

Signed-off-by: Shikhar Soni <shikharish05 at gmail.com>
Delta File
+207-0 libc/src/__support/math/log10p1f16.h
+49-0 libc/test/src/math/log10p1f16_test.cpp
+48-0 libc/test/src/math/smoke/log10p1f16_test.cpp
+21-0 libc/src/math/log10p1f16.h
+19-0 libc/src/__support/math/CMakeLists.txt
+18-0 libc/src/math/generic/log10p1f16.cpp
+362-0 19 files not shown
+435-1 25 files

LLVM/project 1b9a4a0 offload/plugins-nextgen/level_zero/src L0Device.cpp

[Offload][L0] clear completed events from a wait list (#186379)

The queue's WaitEvent collection wasn't being cleared after the events
were synchronized and reset. This led to hangs on subsequent host
synchronizations if they were not preceded by any other operation.
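
A toy model of the failure mode and the kind of fix described. `Event`,
`Queue`, and `synchronize` are hypothetical types written for this sketch;
they are not the Level Zero plugin's classes, and a real implementation
would block instead of returning false.

```cpp
#include <vector>

// Hypothetical event type.
struct Event {
  bool Signaled = false;
  void reset() { Signaled = false; }
};

// Hypothetical queue holding the events the host must wait on.
struct Queue {
  std::vector<Event *> WaitEvents;

  // Checks every pending event, resets it for reuse, and then clears the
  // list. Without the clear(), a second call would wait on events that were
  // just reset and will not signal again.
  bool synchronize() {
    for (Event *E : WaitEvents)
      if (!E->Signaled)
        return false; // a real implementation would block here
    for (Event *E : WaitEvents)
      E->reset();
    WaitEvents.clear(); // the step corresponding to the described fix
    return true;
  }
};
```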
Delta File
+3-0 offload/plugins-nextgen/level_zero/src/L0Device.cpp
+3-0 1 files

LLVM/project d4418f1 llvm/lib/CodeGen MIRPrinter.cpp, llvm/lib/CodeGen/MIRParser MIParser.cpp

[MIR] Support symbolic inline asm tiedto constraints

Co-Authored-By: Claude Opus 4.6 <noreply at anthropic.com>
Delta File
+19-0 llvm/lib/CodeGen/MIRParser/MIParser.cpp
+7-7 llvm/test/CodeGen/AMDGPU/subreg-undef-def-with-other-subreg-defs.mir
+11-0 llvm/test/CodeGen/MIR/Generic/inline-asm-tiedto-missing-dollar.mir
+11-0 llvm/test/CodeGen/MIR/Generic/inline-asm-tiedto-missing-colon.mir
+11-0 llvm/test/CodeGen/MIR/Generic/inline-asm-tiedto-bad-operand-number.mir
+4-0 llvm/lib/CodeGen/MIRPrinter.cpp
+63-7 6 files

LLVM/project c513ed1 llvm/lib/IR Verifier.cpp, llvm/test/DebugInfo/MIR/X86 live-debug-values-reg-copy.mir

[DebugInfo] Add Verifier check for duplicate arg indices in SP's retainedNodes list (#186225)

DwarfFile asserts if two arguments of the same subprogram with the same
index are present in a DISubprogram scope:
https://github.com/llvm/llvm-project/blob/5d7a502a9d923784abe4382ec479ee1c0667d743/llvm/lib/CodeGen/AsmPrinter/DwarfFile.cpp#L110

This patch adds a check to the Verifier to detect such invalid IR
earlier. It can be helpful for finding reproducers for bugs like
https://issues.chromium.org/issues/40288032.
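
The idea of the check can be sketched without any LLVM types. `LocalVar`
and `hasUniqueArgIndices` are hypothetical names; the sketch assumes the
usual debug-info convention that an argument index of 0 means "not a
parameter", and it is not the code added to `Verifier.cpp`.

```cpp
#include <set>
#include <vector>

// Hypothetical stand-in for a local variable's debug metadata. Arg is the
// 1-based argument index, or 0 when the variable is not a parameter.
struct LocalVar {
  unsigned Arg;
};

// Returns true if no two parameters in the list share an argument index.
bool hasUniqueArgIndices(const std::vector<LocalVar> &RetainedNodes) {
  std::set<unsigned> Seen;
  for (const LocalVar &V : RetainedNodes) {
    if (V.Arg == 0)
      continue; // ordinary locals may repeat freely
    if (!Seen.insert(V.Arg).second)
      return false; // duplicate index: report it before codegen asserts
  }
  return true;
}
```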

The incorrect args field of DILocalVariable in
llvm/test/DebugInfo/MIR/X86/live-debug-values-reg-copy.mir is fixed.
Delta File
+13-0 llvm/lib/IR/Verifier.cpp
+1-1 llvm/test/DebugInfo/MIR/X86/live-debug-values-reg-copy.mir
+14-1 2 files

LLVM/project 991fd93 llvm/lib/CodeGen MIRPrinter.cpp, llvm/lib/CodeGen/MIRParser MIParser.cpp

[MIR] Support symbolic inline asm operands (#185893)

Support parsing and printing inline assembly operands in MIR using the
symbolic form instead of numeric register class IDs, thus removing the
need to update tests when the numbers change.

The numeric form remains supported.

---------

Co-authored-by: Claude Opus 4.6 <noreply at anthropic.com>
Delta File
+99-3 llvm/lib/CodeGen/MIRParser/MIParser.cpp
+16-16 llvm/test/CodeGen/AMDGPU/dst-sel-hazard.mir
+26-0 llvm/lib/CodeGen/MIRPrinter.cpp
+11-0 llvm/test/CodeGen/MIR/Generic/inline-asm-no-constraint.mir
+11-0 llvm/test/CodeGen/MIR/Generic/inline-asm-bad-mem-constraint.mir
+11-0 llvm/test/CodeGen/MIR/Generic/inline-asm-bad-regclass.mir
+174-19 6 files

LLVM/project 7ba7d76 libcxx/test/benchmarks/containers/associative associative_container_benchmarks.h

[libc++] Make the associative container query benchmarks more representative (#183036)

Currently the query benchmarks are training the branch predictor
incredibly well, which isn't representative of the real world. This
change causes the branch misses to go from <1% to ~50% with the current
implementation of `__tree::__find_end`.

This patch also removes the `non-existent` benchmarks, since it'd be
non-trivial to write a representative benchmark for that case, and the
benchmark would be relatively low value. We're already searching to leaf
nodes ~50% of the time (since half the nodes are leaves) with the
current benchmark. So we'd only additionally cover a relatively trivial
failure branch that is only taken once per function call. The loop is
already covered through benchmarking with keys existing in the
container.
Delta File
+15-57 libcxx/test/benchmarks/containers/associative/associative_container_benchmarks.h
+15-57 1 files

LLVM/project 94da403 llvm/lib/Analysis InlineCost.cpp ScalarEvolution.cpp, llvm/unittests/Analysis MemorySSATest.cpp LoopInfoTest.cpp

[Analysis][NFC] Drop use of BranchInst (#186374)

Largely straightforward replacement.
Delta File
+24-24 llvm/unittests/Analysis/MemorySSATest.cpp
+22-22 llvm/lib/Analysis/InlineCost.cpp
+18-22 llvm/lib/Analysis/ScalarEvolution.cpp
+16-16 llvm/unittests/Analysis/LoopInfoTest.cpp
+12-16 llvm/lib/Analysis/IRSimilarityIdentifier.cpp
+13-13 llvm/unittests/Analysis/ScalarEvolutionTest.cpp
+105-113 27 files not shown
+210-240 33 files

LLVM/project 446c552 llvm/test/Analysis/DependenceAnalysis exact-siv-large-btc.ll

[DA] Add test for the Exact SIV test misses dependency (NFC)
Delta File
+53-0 llvm/test/Analysis/DependenceAnalysis/exact-siv-large-btc.ll
+53-0 1 files

LLVM/project 95050a2 llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis exact-siv-large-btc.ll rdiv-large-btc.ll

[DA] Add precondition `0 <=s UB` to function `inferAffineDomain`
Delta File
+23-12 llvm/lib/Analysis/DependenceAnalysis.cpp
+16-9 llvm/test/Analysis/DependenceAnalysis/exact-siv-large-btc.ll
+1-1 llvm/test/Analysis/DependenceAnalysis/rdiv-large-btc.ll
+40-22 3 files

LLVM/project 77bcad4 llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis strong-siv-addrec-wrap.ll exact-siv-addrec-wrap.ll

[DA] Remove calls to the GCD MIV test from `testSIV`
Delta File
+9-19 llvm/test/Analysis/DependenceAnalysis/strong-siv-addrec-wrap.ll
+9-19 llvm/test/Analysis/DependenceAnalysis/exact-siv-addrec-wrap.ll
+9-16 llvm/test/Analysis/DependenceAnalysis/infer_affine_domain_ovlf.ll
+12-12 llvm/test/Analysis/DependenceAnalysis/run-specific-dependence-test.ll
+4-8 llvm/lib/Analysis/DependenceAnalysis.cpp
+2-2 llvm/test/Analysis/DependenceAnalysis/exact-siv-overflow.ll
+45-76 6 files

LLVM/project 9e59354 mlir/lib/Dialect/Tosa/Transforms TosaProfileCompliance.cpp, mlir/test/Dialect/Tosa invalid_extension.mlir tosa-validation-version-1p1-valid.mlir

[mlir][tosa] Allow integer gather/scatter ops in fp profile (#183342)

This commit updates profile compliance to allow integer gather and
scatter operations to be used with the floating point profile. This
update aligns with the specification change:
https://github.com/arm/tosa-specification/pull/35.
Delta File
+76-66 mlir/lib/Dialect/Tosa/Transforms/TosaProfileCompliance.cpp
+57-57 mlir/test/Dialect/Tosa/invalid_extension.mlir
+96-0 mlir/test/Dialect/Tosa/tosa-validation-version-1p1-valid.mlir
+47-47 mlir/test/Dialect/Tosa/profile_pro_fp_unsupported.mlir
+43-44 mlir/test/Dialect/Tosa/profile_all_unsupported.mlir
+37-37 mlir/test/Dialect/Tosa/profile_pro_int_unsupported.mlir
+356-251 6 files not shown
+435-274 12 files

LLVM/project 8238ae2 clang/lib/CIR/CodeGen CIRGenExprConstant.cpp, clang/test/CIR/CodeGenCXX zero_init_bases.cpp

[CIR] Implement zero-init-bases lowering (#186230)

This showed up in a test suite. A zero-initializer for a whole struct
seems completely sensible, as long as the type is zero-initializable.

This patch doesn't change the non-zero-init behavior (I am working on a
patch to do so, but it is a massive scope), so this is limited to JUST
classes with bases.
Delta File
+56-0 clang/test/CIR/CodeGenCXX/zero_init_bases.cpp
+0-6 clang/lib/CIR/CodeGen/CIRGenExprConstant.cpp
+56-6 2 files

LLVM/project 0baa7ba clang/docs ReleaseNotes.rst, clang/lib/Sema SemaTemplateInstantiateDecl.cpp

[Clang][Sema] Only call PerformDependentDiagnostics for dependent contexts (#177452)
Delta File
+26-0 clang/test/SemaTemplate/GH176155.cpp
+2-1 clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
+1-0 clang/docs/ReleaseNotes.rst
+29-1 3 files

LLVM/project 9d20e75 llvm/lib/Transforms/Vectorize VectorCombine.cpp, llvm/test/Transforms/VectorCombine/X86 shuffle-of-selects.ll

[VectorCombine] Fix crash in foldShuffleOfSelects for single-element shuffle result (#185713)

In foldShuffleOfSelects, if the shuffle result has a single element, the
resulting type may be a scalar rather than a vector. The later code in
foldShuffleOfSelects assumes the result is a vector and performs
cast<FixedVectorType>, which triggers an assertion.

Fixes #183625
Delta File
+15-0 llvm/test/Transforms/VectorCombine/X86/shuffle-of-selects.ll
+2-2 llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+17-2 2 files

LLVM/project d7a388c llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp

[AMDGPU] Pass MF into the SIInsertWaitcnts constructor. NFC. (#186369)

Pass MF into the SIInsertWaitcnts constructor instead of the run method.
This is more natural now that SIInsertWaitcnts is constructed once per
MachineFunction and enables future cleanup by initializing more fields
in the constructor that depend on MF.
Delta File
+7-6 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+7-6 1 files

LLVM/project 8c84b3c llvm/lib/Target/AMDGPU SIInstrInfo.h

[AMDGPU][NFC] Add missing isFLAT check to isVMEM. (#186321)

This was missed in #137148
Delta File
+2-1 llvm/lib/Target/AMDGPU/SIInstrInfo.h
+2-1 1 files

LLVM/project d540855 mlir/lib/Bytecode/Reader BytecodeReader.cpp, mlir/test/Bytecode bytecode_callback_with_spirv_and_custom_attr.mlir

[mlir][Bytecode] Fix stale deferred worklist entries in attribute callback fallthrough (#186150)

When parseCustomEntry() calls a user attribute/type callback that
internally reads sub-attributes/types via the bytecode reader, the
reader may add entries to the deferredWorklist if the depth limit is
exceeded. If the callback then returns success with an empty entry
(falling through to the regular dialect reader), the reader position is
reset but deferredWorklist retains stale entries from the failed partial
read.

This causes an assert(deferredWorklist.empty()) failure in debug builds
when the fallback dialect reader successfully parses the attribute.

Fix by saving and restoring deferredWorklist.size() around each callback
invocation, discarding any stale entries added during a callback's
partial read when the reader position is rolled back.
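
The save-and-restore approach can be sketched as follows. `Reader`,
`Callback`, and `tryCallbacks` are hypothetical simplifications written for
this example; they are not MLIR's bytecode reader API.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical reader state.
struct Reader {
  std::size_t Position = 0;
  std::vector<int> DeferredWorklist;
};

// A callback may advance the reader and defer work. It returns true only if
// it fully handled the entry.
using Callback = std::function<bool(Reader &)>;

// Tries each callback; if one does not handle the entry, rolls back both the
// reader position and any worklist entries it added during its partial read.
bool tryCallbacks(Reader &R, const std::vector<Callback> &Callbacks) {
  for (const Callback &CB : Callbacks) {
    const std::size_t SavedPos = R.Position;
    const std::size_t SavedSize = R.DeferredWorklist.size();
    if (CB(R))
      return true;
    R.Position = SavedPos;
    R.DeferredWorklist.resize(SavedSize); // drop stale entries
  }
  return false; // caller falls through to the regular dialect reader
}
```

Restoring the size rather than clearing the list keeps any entries that
were legitimately queued before the callback ran.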

Fixes #163337

Assisted-by: Claude Code
Delta File
+30-0 mlir/test/Bytecode/bytecode_callback_with_spirv_and_custom_attr.mlir
+10-4 mlir/lib/Bytecode/Reader/BytecodeReader.cpp
+40-4 2 files

LLVM/project 0572ad6 mlir/lib/Dialect/Shape/IR ShapeCanonicalization.td, mlir/test/Dialect/Shape canonicalize.mlir

[mlir][shape] Fix crash when folding tensor.extract(shape_of(memref)) (#186270)

The `ExtractFromShapeOfExtentTensor` canonicalization pattern was
unconditionally rewriting:

  tensor.extract(shape.shape_of(%arg), %idx) -> tensor.dim(%arg, %idx)

even when `%arg` is a memref. This produced an invalid `tensor.dim`
(whose source operand must be a tensor), which then caused an assertion
failure in `DimOp::getSource()` when subsequent canonicalization
patterns tried to match the op:

Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"'
  failed.  [To = TypedValue<TensorType>, From = Value]

Fix: add an `IsTensorType` constraint to
`ExtractFromShapeOfExtentTensor` in `ShapeCanonicalization.td` so the
pattern only fires when `%arg` is a tensor type. The memref case is
intentionally left unfolded (the correct lowering to `memref.dim` would

    [8 lines not shown]
Delta File
+31-0 mlir/test/Dialect/Shape/canonicalize.mlir
+9-3 mlir/lib/Dialect/Shape/IR/ShapeCanonicalization.td
+40-3 2 files