LLVM/project e8ac71elibc/src/__support/CPP iterator.h

fix iterator
DeltaFile
+1-3libc/src/__support/CPP/iterator.h
+1-31 files

LLVM/project 172c0bbclang-tools-extra/clang-tidy/tool check_alphabetical_order_test.py check_alphabetical_order.py

[clang-tidy] Fix alphabetical order check for multiline doc entries and whitespace handling (#186950)

The `check_alphabetical_order.py` script previously only scanned the
first line of each bullet point in `ReleaseNotes.rst`, causing sorting
failures when a `:doc:` tag was split across multiple lines.

Also, when sorting the last entry of a section, the script inserted
unnecessary whitespace.

This PR fixes these two problems.
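
The multi-line case described above can be illustrated with a minimal Python sketch. It is not the code from `check_alphabetical_order.py`: the helper names (`split_bullets`, `sort_key`, `sort_section`) and the check names in the usage note are hypothetical, and the sketch assumes each entry starts with `- ` and that the `:doc:` label is the sort key. The point is that the key is searched across the whole bullet, and that the output ends with a single newline.

```python
import re

# Hypothetical pattern: capture the label of a :doc:`label <target>` role.
DOC_LABEL = re.compile(r":doc:`([^`<]+)")


def split_bullets(text):
    """Group lines into bullet entries; continuation lines stay with their bullet."""
    entries = []
    for line in text.splitlines():
        if line.startswith("- "):
            entries.append([line])
        elif entries:
            entries[-1].append(line)
    # rstrip drops the blank separator lines collected at the end of an entry.
    return ["\n".join(entry).rstrip() for entry in entries]


def sort_key(entry):
    """Search the whole entry, not just its first line, for the :doc: label."""
    joined = " ".join(part.strip() for part in entry.splitlines())
    match = DOC_LABEL.search(joined)
    return match.group(1).strip() if match else joined


def sort_section(text):
    """Return the entries sorted by key, separated by one blank line."""
    return "\n\n".join(sorted(split_bullets(text), key=sort_key)) + "\n"
```

For example, an entry whose `:doc:` tag begins on its second line (say, a made-up `zeta-check`) still sorts after an entry for a made-up `alpha-check`, and the last entry is followed by exactly one newline.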
DeltaFile
+43-3clang-tools-extra/clang-tidy/tool/check_alphabetical_order_test.py
+14-6clang-tools-extra/clang-tidy/tool/check_alphabetical_order.py
+57-92 files

LLVM/project 03cd306libc/utils/wctype_utils gen.py

remove flag
DeltaFile
+1-1libc/utils/wctype_utils/gen.py
+1-11 files

LLVM/project 66bc565utils/bazel/llvm-project-overlay/mlir/python BUILD.bazel

[BAZEL] Add missing affine python enum gen (#187669)
DeltaFile
+10-4utils/bazel/llvm-project-overlay/mlir/python/BUILD.bazel
+10-41 files

LLVM/project df3b24alibc/utils/wctype_utils gen.py, libc/utils/wctype_utils/conversion hex_writer.py

format
DeltaFile
+4-5libc/utils/wctype_utils/gen.py
+1-2libc/utils/wctype_utils/conversion/hex_writer.py
+5-72 files

LLVM/project 21f439fllvm/lib/Passes PassBuilder.cpp PassRegistry.def, llvm/lib/Transforms/Scalar LoopRotation.cpp

[LoopRotate] Use SCEV exit counts to improve rotation profitability (#187483)

Most loop transformations, like unrolling and vectorization, expect the
latch branch to be countable. Allow rotation, if it turns the latch from
uncountable to countable.

This uses SCEV to check for countable exits if CheckExitCount is set.
Currently it is not set for the LPM1 run (where SCEV is not used by
other passes), only in LPM.

With that, compile-time impact is mostly neutral:

https://llvm-compile-time-tracker.com/compare.php?from=eba342d0ba930a404a026c80aada51c43974f0db&to=2e676337b45fae63ce9498116d8e6e43772363c5&stat=instructions:u

ClamAV is consistently slower (~+0.15%) and 7zip is faster in most cases
(~-0.13%).

Across a large test set based on C/C++ workloads, this rotates ~0.8%
more loops with ~2.68M rotated loops.

    [16 lines not shown]
DeltaFile
+164-0llvm/test/Transforms/PhaseOrdering/AArch64/loop-rotate-to-enable-unrolling-and-vectorization.ll
+155-0llvm/test/Transforms/LoopRotate/rotate-exitcount.ll
+20-9llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
+12-6llvm/lib/Transforms/Scalar/LoopRotation.cpp
+12-4llvm/lib/Passes/PassBuilder.cpp
+5-3llvm/lib/Passes/PassRegistry.def
+368-224 files not shown
+376-2810 files

LLVM/project 1083efclibc/utils/wctype_utils gen.py, libc/utils/wctype_utils/conversion hex_writer.py

format
DeltaFile
+2-1libc/utils/wctype_utils/conversion/hex_writer.py
+2-1libc/utils/wctype_utils/gen.py
+4-22 files

LLVM/project 14de6dallvm/lib/Target/SPIRV SPIRVAsmPrinter.cpp SPIRVModuleAnalysis.h, llvm/test/CodeGen/SPIRV/transcoding GlobalVarAnnotate.ll

[SPIR-V] Support global variable annotations in llvm.global.annotations (#187241)

The SPIR-V backend previously supported only function annotations in
llvm.global.annotations and crashed with a fatal error when encountering
global variable entries.
DeltaFile
+23-0llvm/test/CodeGen/SPIRV/transcoding/GlobalVarAnnotate.ll
+6-8llvm/lib/Target/SPIRV/SPIRVAsmPrinter.cpp
+5-5llvm/lib/Target/SPIRV/SPIRVModuleAnalysis.h
+5-4llvm/lib/Target/SPIRV/SPIRVMCInstLower.cpp
+6-3llvm/lib/Target/SPIRV/SPIRVModuleAnalysis.cpp
+45-205 files

LLVM/project 658bed5libc/src/__support/wctype perfect_hash_map.h lower_to_upper.h, libc/utils/wctype_utils/conversion hex_writer.py

[libc][wctype] Add perfect hash map for conversion functions
DeltaFile
+876-0libc/src/__support/wctype/perfect_hash_map.h
+568-0libc/src/__support/wctype/lower_to_upper.h
+553-0libc/src/__support/wctype/upper_to_lower.h
+0-400libc/src/__support/wctype/lower_to_upper.inc
+0-390libc/src/__support/wctype/upper_to_lower.inc
+71-1libc/utils/wctype_utils/conversion/hex_writer.py
+2,068-7918 files not shown
+2,256-79714 files

LLVM/project 7c28aaelibc/src/__support/CPP bit.h

reapply static
DeltaFile
+1-1libc/src/__support/CPP/bit.h
+1-11 files

LLVM/project f2e0e48libc/src/__support/math ceill.h, libc/test/shared shared_math_test.cpp

link issue
DeltaFile
+2-13libc/src/__support/math/ceill.h
+3-5libc/test/shared/shared_math_test.cpp
+5-182 files

LLVM/project 1d03351libc/src/__support/FPUtil bfloat16.h NearestIntegerOperations.h, libc/src/__support/FPUtil/generic add_sub.h

[libc][math] Qualify ceil functions to constexpr
DeltaFile
+59-7libc/test/shared/shared_math_test.cpp
+13-13libc/src/__support/FPUtil/generic/add_sub.h
+11-11libc/src/__support/FPUtil/bfloat16.h
+8-8libc/src/__support/FPUtil/NearestIntegerOperations.h
+13-1libc/src/__support/math/ceill.h
+7-7libc/src/__support/FPUtil/comparison_operations.h
+111-479 files not shown
+141-7215 files

LLVM/project 2b74b14libc/src/__support/FPUtil PolyEval.h

misc
DeltaFile
+4-4libc/src/__support/FPUtil/PolyEval.h
+4-41 files

LLVM/project e6789f9llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp AMDGPU.td, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp AMDGPUBaseInfo.h

[AMDGPU] Introduce ASYNC_CNT on GFX1250 (#185810)

Async operations transfer data between global memory and LDS. Their
progress is tracked by the ASYNC_CNT counter on GFX1250 and later
architectures. This change introduces the representation of that counter
in SIInsertWaitCnts. For now, the programmer must manually insert
s_wait_asyncnt instructions. Later changes will add compiler assistance
for generating the waits by including this counter in the asyncmark
instructions.

Assisted-by: Claude Sonnet 4.5

This is part of a stack:

- #185813
- #185810
DeltaFile
+24-9llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+10-0llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+8-1llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+6-0llvm/lib/Target/AMDGPU/AMDGPU.td
+48-104 files

LLVM/project 895c281llvm/lib/Target/AArch64/GISel AArch64RegisterBankInfo.cpp, llvm/test/CodeGen/AArch64 arm64-int-neon.ll

[AArch64][GlobalISel] Remove fallback for scalar usqadd/suqadd intrinsics (#187513)

Previously, GlobalISel failed to select these intrinsics when given
scalar operands, because RegBankSelect placed them on GPR banks. Fixing
this lets GlobalISel lower them correctly, since during instruction
selection the intrinsic matches the SIMD patterns in AArch64InstrInfo.td.
DeltaFile
+1-5llvm/test/CodeGen/AArch64/arm64-int-neon.ll
+2-0llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
+3-52 files

LLVM/project 4376bf2clang-tools-extra/clang-tidy/performance FasterStringFindCheck.cpp, clang-tools-extra/test/clang-tidy/checkers/performance faster-string-find.cpp

[clang-tidy] Fix "effective" -> "efficient". (#187536)

"Effective" is the wrong word: Both overloads are effective; they do
what they're supposed to do. But the character overload does less work.
DeltaFile
+1-1clang-tools-extra/test/clang-tidy/checkers/performance/faster-string-find.cpp
+1-1clang-tools-extra/clang-tidy/performance/FasterStringFindCheck.cpp
+2-22 files

LLVM/project 4b17135llvm/lib/Transforms/Vectorize VPlanTransforms.cpp VPlanPatternMatch.h, llvm/test/Transforms/LoopVectorize/AArch64 partial-reduce-fdot-product.ll

[LV] Simplify `matchExtendedReductionOperand()` (NFCI) (#185821)

This updates `matchExtendedReductionOperand` so the simple case of
`UpdateR(PrevValue, ext(...))` is matched first as an early exit. The
binop matching is then flattened to remove the extra layer of the
`MatchExtends` lambda.
DeltaFile
+63-75llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+58-0llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-fdot-product.ll
+4-0llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
+125-753 files

LLVM/project 78f267fclang/lib/AST/ByteCode InterpFrame.h InterpFrame.cpp

Reapply "[clang][bytecode] Allocate local variables in `InterpFrame` … (#187644)

…tail storage" (#187410)

This reverts commit bf1db77fc87ce9d2ca7744565321b09a5d23692f.

Avoid using an `InterpFrame` member after calling its destructor this
time. I hope that was the only problem.
DeltaFile
+41-15clang/lib/AST/ByteCode/InterpFrame.h
+23-21clang/lib/AST/ByteCode/InterpFrame.cpp
+13-15clang/lib/AST/ByteCode/Function.h
+9-15clang/lib/AST/ByteCode/Compiler.cpp
+15-7clang/lib/AST/ByteCode/Context.cpp
+13-6clang/lib/AST/ByteCode/Interp.cpp
+114-7910 files not shown
+146-11616 files

LLVM/project 68e3556llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp AMDGPU.td, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp AMDGPUBaseInfo.h

[AMDGPU] Introduce ASYNC_CNT on GFX1250

Async operations transfer data between global memory and LDS. Their progress is
tracked by the ASYNC_CNT counter on GFX1250 and later architectures. This change
introduces the representation of that counter in SIInsertWaitCnts. For now, the
programmer must manually insert s_wait_asyncnt instructions. Later changes will
add compiler assistance for generating the waits by including this counter in
the asyncmark instructions.

Assisted-by: Claude Sonnet 4.5
DeltaFile
+24-9llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+10-0llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+8-1llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+6-0llvm/lib/Target/AMDGPU/AMDGPU.td
+48-104 files

LLVM/project ab28384llvm/include/llvm/CodeGen ExpandMemCmp.h, llvm/include/llvm/Passes CodeGenPassBuilder.h

[ExpandMemCmp] Remove unused TM/TLI dependency (#187660)

This pass does not actually use TargetMachine/TargetLoweringInfo.
DeltaFile
+15-24llvm/lib/CodeGen/ExpandMemCmp.cpp
+0-4llvm/include/llvm/CodeGen/ExpandMemCmp.h
+1-1llvm/include/llvm/Passes/CodeGenPassBuilder.h
+1-1llvm/lib/Passes/PassRegistry.def
+0-1llvm/test/tools/opt/no-target-machine.ll
+17-315 files

LLVM/project da11265llvm/include/llvm/ADT GenericUniformityImpl.h, llvm/lib/Analysis UniformityAnalysis.cpp

add VH callback support for value deletion in uniformity
DeltaFile
+83-0llvm/unittests/Target/AMDGPU/UniformityAnalysisCallbackVHTest.cpp
+46-0llvm/lib/Analysis/UniformityAnalysis.cpp
+14-0llvm/include/llvm/ADT/GenericUniformityImpl.h
+4-0llvm/lib/IR/SSAContext.cpp
+4-0llvm/lib/CodeGen/MachineUniformityAnalysis.cpp
+2-0llvm/lib/CodeGen/MachineSSAContext.cpp
+153-02 files not shown
+155-08 files

LLVM/project f18c8aellvm/lib/CodeGen MachineSSAContext.cpp, llvm/lib/IR SSAContext.cpp

review: fix isNeverDivergent and separate VH callback for other follow-up
DeltaFile
+0-4llvm/lib/IR/SSAContext.cpp
+0-2llvm/lib/CodeGen/MachineSSAContext.cpp
+0-62 files

LLVM/project c912af8llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp

handle inside isFLAT; add missing comment for getBitWidth
DeltaFile
+5-5llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+1-0llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+6-52 files

LLVM/project 5d503a3llvm/include/llvm/ADT GenericUniformityImpl.h GenericSSAContext.h, llvm/lib/Analysis UniformityAnalysis.cpp

review: add comment in isNeverDivergent and separate VH callback for other follow-up
DeltaFile
+0-83llvm/unittests/Target/AMDGPU/UniformityAnalysisCallbackVHTest.cpp
+0-46llvm/lib/Analysis/UniformityAnalysis.cpp
+4-14llvm/include/llvm/ADT/GenericUniformityImpl.h
+5-0llvm/include/llvm/ADT/GenericSSAContext.h
+0-4llvm/lib/CodeGen/MachineUniformityAnalysis.cpp
+0-1llvm/unittests/Target/AMDGPU/CMakeLists.txt
+9-1481 files not shown
+9-1497 files

LLVM/project d97adc4llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 bit-manip-i512.ll bit-manip-i256.ll

[X86] Perform i128/i256/i512 BITREVERSE on the FPU (#187502)

Bitcast the large scalar integer to a vXi64 vector, reverse the elements,
and then perform a per-element vXi64 bitreverse.

If we have SSSE3 or later, BITREVERSE expansion using PSHUFB is always
more efficient than performing it as a scalar sequence (no need for
mayFoldIntoVector check).
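
The decomposition described above (split into 64-bit elements, reverse the element order, then bit-reverse each element) can be checked with a small Python model. This is only an arithmetic sketch of why the three steps equal a full-width bit reversal; it does not model the X86 lowering or PSHUFB, and the function names are hypothetical.

```python
def bitreverse(value, width):
    """Reference: reverse the low `width` bits of `value`, one bit at a time."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bitreverse_via_lanes(value, width, lane_bits=64):
    """Split into lanes, reverse the lane order, then bit-reverse each lane."""
    assert width % lane_bits == 0
    mask = (1 << lane_bits) - 1
    # lanes[0] holds the least significant bits, like a little-endian bitcast.
    lanes = [(value >> (i * lane_bits)) & mask for i in range(width // lane_bits)]
    lanes.reverse()
    result = 0
    for i, lane in enumerate(lanes):
        result |= bitreverse(lane, lane_bits) << (i * lane_bits)
    return result
```

Bit `64*j + k` of the input ends up in lane `N-1-j` at offset `63-k`, which is position `width - 1 - (64*j + k)`, exactly where a full-width reversal puts it.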

Fixes #187353
DeltaFile
+780-2,395llvm/test/CodeGen/X86/bit-manip-i512.ll
+450-1,161llvm/test/CodeGen/X86/bit-manip-i256.ll
+228-452llvm/test/CodeGen/X86/bitreverse.ll
+324-317llvm/test/CodeGen/X86/bit-manip-i128.ll
+28-5llvm/lib/Target/X86/X86ISelLowering.cpp
+1,810-4,3305 files

LLVM/project ef75891clang/include/clang/AST Decl.h

clang-format
DeltaFile
+1-3clang/include/clang/AST/Decl.h
+1-31 files

LLVM/project 689afb5llvm/utils/release build_llvm_release.bat

Windows release build: Add checksum verification for downloaded source archives (#187113)

Add checksum verification for libxml2, zlib, and zstd source archives
via `cmake -E *sum` and `cmake -E compare_files` commands.

This also adds the following minor changes:
* Factor out libxml2 version into variable.
* Check `tar` return code.
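
The verification step can be sketched in Python as follows. This is not the batch script's logic, which uses the `cmake -E` commands named above; it only shows the same idea of hashing a downloaded archive and refusing to continue on a mismatch. The choice of SHA-256 and the function names are assumptions made for the illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 16):
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected_hex):
    """Raise ValueError if the file's digest differs from the expected one."""
    actual = sha256_of(path)
    if actual != expected_hex.lower():
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
```

A caller would run `verify_archive` right after the download and before extraction, so that a corrupted or tampered archive stops the build early.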
DeltaFile
+24-5llvm/utils/release/build_llvm_release.bat
+24-51 files

LLVM/project 69cd746clang/tools/clang-fuzzer/handle-llvm handle_llvm.cpp, llvm/docs/CommandGuide llc.rst

[llc] Add -mtune option (#186998)

This patch adds a Clang-compatible -mtune option to llc to enable
decoupled ISA and microarchitecture targeting, which is especially
important for backend development. For example, it makes it easy to
test the effects of a subtarget feature or scheduling model on codegen
across a variety of workloads on the IR corpus benchmark:
https://github.com/dtcxzyw/llvm-codegen-benchmark.

The implementation adds an isolated generic codegen flag, to establish a
base for wider usage - the plan is to add it to `opt` as well in a
followup patch. Then `llc` consumes it, and sets `tune-cpu` attributes
for functions, which are further consumed by the backend.
DeltaFile
+69-0llvm/test/tools/llc/mtune.ll
+31-11llvm/lib/CodeGen/CommandFlags.cpp
+17-7llvm/include/llvm/CodeGen/CommandFlags.h
+15-9llvm/tools/llc/llc.cpp
+11-0llvm/docs/CommandGuide/llc.rst
+2-2clang/tools/clang-fuzzer/handle-llvm/handle_llvm.cpp
+145-293 files not shown
+149-329 files

LLVM/project 4df2967lldb/include/lldb/Utility Stream.h, lldb/source/Core UserSettingsController.cpp

[lldb] Implement llvm::formatv overload for Stream::operator << (#187462)

This will allow us to more conveniently use llvm::formatv in the
codebase.
DeltaFile
+13-1lldb/include/lldb/Utility/Stream.h
+9-0lldb/unittests/Utility/StreamTest.cpp
+3-5lldb/source/Interpreter/OptionValueProperties.cpp
+4-3lldb/source/Target/TraceDumper.cpp
+6-0lldb/source/Utility/Stream.cpp
+1-1lldb/source/Core/UserSettingsController.cpp
+36-106 files

LLVM/project d6d2289llvm/include/llvm/ADT GenericUniformityImpl.h GenericUniformityInfo.h, llvm/lib/Analysis UniformityAnalysis.cpp

add VH callback support for value deletion in uniformity
DeltaFile
+83-0llvm/unittests/Target/AMDGPU/UniformityAnalysisCallbackVHTest.cpp
+46-0llvm/lib/Analysis/UniformityAnalysis.cpp
+14-0llvm/include/llvm/ADT/GenericUniformityImpl.h
+4-0llvm/lib/CodeGen/MachineUniformityAnalysis.cpp
+1-0llvm/include/llvm/ADT/GenericUniformityInfo.h
+1-0llvm/unittests/Target/AMDGPU/CMakeLists.txt
+149-06 files