LLVM/project d52daealibc/test/shared shared_math_test.cpp

[libc] Fix the remaining long double issue in shared_math_test.cpp. (#190098)
DeltaFile
+5-6libc/test/shared/shared_math_test.cpp
+5-61 files

LLVM/project c8c7186llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 gfni-rotates.ll vector-fshr-rot-512.ll

[X86] LowerRotate - expand vXi8 non-uniform variable rotates using uniform constant rotates (#189986)

We expand vXi8 non-uniform variable rotates as a sequence of uniform
constant rotates along with a SELECT depending on whether the original
rotate amount needs it.

This patch removes the premature expansion of uniform constant rotates
into OR(SHL,SRL) sequences, which allows GFNI targets to use a single
VGF2P8AFFINEQB instruction per rotate.
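
A scalar illustration of the expansion described in this message (a sketch only, not the code in X86ISelLowering.cpp): a variable rotate of one 8-bit lane can be built from fixed rotates by 4, 2 and 1, each kept or discarded by a select on the matching bit of the rotate amount. The function names `rotlConst` and `rotlVariable` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Rotate an 8-bit value left by a fixed amount in 1..7.
// Written here in the generic OR(SHL, SRL) form; the commit message says
// GFNI targets can instead use VGF2P8AFFINEQB for such constant rotates.
static inline uint8_t rotlConst(uint8_t v, unsigned amt) {
  assert(amt > 0 && amt < 8);
  return static_cast<uint8_t>((v << amt) | (v >> (8 - amt)));
}

// Variable rotate built from three constant rotates plus selects.
// Each bit of the amount decides whether the matching rotate is applied.
uint8_t rotlVariable(uint8_t v, uint8_t amt) {
  amt &= 7;                             // rotates are modulo the lane width
  v = (amt & 4) ? rotlConst(v, 4) : v;  // select on bit 2
  v = (amt & 2) ? rotlConst(v, 2) : v;  // select on bit 1
  v = (amt & 1) ? rotlConst(v, 1) : v;  // select on bit 0
  return v;
}
```

In the vector form, every lane runs the same three constant rotates and only the select masks differ per lane.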
DeltaFile
+301-623llvm/test/CodeGen/X86/gfni-rotates.ll
+30-30llvm/test/CodeGen/X86/vector-fshr-rot-512.ll
+9-20llvm/lib/Target/X86/X86ISelLowering.cpp
+12-12llvm/test/CodeGen/X86/vector-fshr-rot-256.ll
+352-6854 files

LLVM/project 8daaa26lld/test/ELF merge-piece-oob.s, llvm/include/llvm/Support Parallel.h

[Support] Support nested parallel TaskGroup via work-stealing (#189293)

Nested TaskGroups run serially to prevent deadlock, as documented by
https://reviews.llvm.org/D61115 and refined by
https://reviews.llvm.org/D148984 to use threadIndex.

Enable nested parallelism by having worker threads actively execute
tasks from the work queue while waiting (work-stealing), instead of
just blocking. Root-level TaskGroups (main thread) keep the efficient
blocking Latch::sync(), so there is no overhead for the common
non-nested case.

In lld, https://reviews.llvm.org/D131247 worked around the limitation
by passing a single root TaskGroup into OutputSection::writeTo and
spawning 4MB-chunked tasks into it. However, SyntheticSection::writeTo
calls with internal parallelism (e.g. GdbIndexSection,
MergeNoTailSection) still ran serially on worker threads. With this
change, their internal parallelFor/parallelForEach calls parallelize
automatically via helpSync work-stealing.
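
The waiting strategy can be modelled in a few lines. The following is a single-threaded sketch under stated assumptions, not the implementation in Parallel.cpp: `WorkQueue`, this toy `TaskGroup`, and `runNested` are hypothetical, and only the name `helpSync` comes from the message above. It shows why running queued tasks while waiting lets a nested group finish instead of deadlocking.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical shared queue of pending tasks.
struct WorkQueue {
  std::deque<std::function<void()>> tasks;
};

// Toy task group: counts its outstanding tasks and, while waiting,
// executes whatever is at the front of the shared queue.
class TaskGroup {
  WorkQueue &queue;
  int pending = 0;

public:
  explicit TaskGroup(WorkQueue &q) : queue(q) {}

  void spawn(std::function<void()> fn) {
    ++pending;
    queue.tasks.push_back([this, fn = std::move(fn)] {
      fn();
      --pending;
    });
  }

  // Instead of blocking until pending == 0, keep draining the queue.
  // The task that is run may belong to this group or to any other group.
  void helpSync() {
    while (pending > 0) {
      assert(!queue.tasks.empty());
      std::function<void()> task = std::move(queue.tasks.front());
      queue.tasks.pop_front();
      task();
    }
  }
};

// Outer tasks each create a nested group; all of them complete.
int runNested() {
  WorkQueue q;
  int sum = 0;
  TaskGroup outer(q);
  for (int i = 0; i < 3; ++i)
    outer.spawn([&q, &sum, i] {
      TaskGroup inner(q); // nested group created inside a task
      for (int j = 0; j < 4; ++j)
        inner.spawn([&sum, i, j] { sum += i * 4 + j; });
      inner.helpSync();
    });
  outer.helpSync();
  return sum;
}
```

A real thread pool additionally needs locking and a way to sleep when the queue is empty; the sketch leaves both out to keep the control flow visible.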

    [3 lines not shown]
DeltaFile
+16-59llvm/unittests/Support/ParallelTest.cpp
+27-7llvm/lib/Support/Parallel.cpp
+2-4llvm/include/llvm/Support/Parallel.h
+2-2lld/test/ELF/merge-piece-oob.s
+47-724 files

LLVM/project dee982dllvm/lib/Target/AArch64 AArch64PostCoalescerPass.cpp AArch64.h, llvm/test/CodeGen/AArch64 aarch64-post-coalescer.mir

[NewPM] Adds a port for AArch64PostCoalescerPass (#189520)

Adds a standard port of AArch64PostCoalescer to NewPM.
DeltaFile
+69-52llvm/lib/Target/AArch64/AArch64PostCoalescerPass.cpp
+8-1llvm/lib/Target/AArch64/AArch64.h
+2-1llvm/test/CodeGen/AArch64/aarch64-post-coalescer.mir
+1-1llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+1-0llvm/lib/Target/AArch64/AArch64PassRegistry.def
+81-555 files

LLVM/project e27e7e4llvm/lib/Target/AArch64/GISel AArch64PreLegalizerCombiner.cpp

[NFC][AArch64] Remove PreLegalizerCombiner pass dependency on TargetPassConfig (#190073)

This will enable NewPM porting.

Replaced with the definition in
[AArch64PassConfig::getCSEConfig](https://github.com/llvm/llvm-project/blob/1d549d9a777a6faef6d425cb6482ab1fa6b91bb7/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp#L614)
DeltaFile
+2-6llvm/lib/Target/AArch64/GISel/AArch64PreLegalizerCombiner.cpp
+2-61 files

LLVM/project c97e08eclang/include/clang/AST DeclBase.h DeclContextInternals.h, clang/lib/AST DeclBase.cpp Decl.cpp

[C++20] [Modules] Add VisiblePromoted module ownership kind (#189903)

This patch adds a new ModuleOwnershipKind::VisiblePromoted to handle
declarations that are not visible to the current TU but are promoted to
be visible to avoid re-parsing.

Originally we set the visibility to visible directly in such cases. But
https://github.com/llvm/llvm-project/issues/188853 shows such decls may
be excluded later if we import, #include, and then import. So we have to
introduce a new visibility kind to express that the visibility of the
decl is intentionally promoted.

Close https://github.com/llvm/llvm-project/issues/188853
DeltaFile
+47-0clang/test/Modules/include-between-imports-enums.cppm
+14-1clang/include/clang/AST/DeclBase.h
+3-1clang/include/clang/AST/DeclContextInternals.h
+2-1clang/lib/AST/DeclBase.cpp
+1-1clang/lib/Sema/SemaLookup.cpp
+1-0clang/lib/AST/Decl.cpp
+68-41 files not shown
+69-47 files

LLVM/project dbb1002clang/include/clang/ScalableStaticAnalysisFramework/Analyses EntityPointerLevel.h, clang/include/clang/ScalableStaticAnalysisFramework/Analyses/EntityPointerLevel EntityPointerLevel.h EntityPointerLevelFormat.h

rebase
DeltaFile
+0-332clang/lib/ScalableStaticAnalysisFramework/Analyses/EntityPointerLevel.cpp
+292-0clang/lib/ScalableStaticAnalysisFramework/Analyses/EntityPointerLevel/EntityPointerLevel.cpp
+0-134clang/include/clang/ScalableStaticAnalysisFramework/Analyses/EntityPointerLevel.h
+125-0clang/include/clang/ScalableStaticAnalysisFramework/Analyses/EntityPointerLevel/EntityPointerLevel.h
+67-0clang/include/clang/ScalableStaticAnalysisFramework/Analyses/EntityPointerLevel/EntityPointerLevelFormat.h
+10-23clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.cpp
+494-4895 files not shown
+510-49711 files

LLVM/project 096f9d0libc/cmake/modules LLVMLibCArchitectures.cmake, libc/config/linux/power entrypoints.txt config.json

[libc] Initial support so that libc-shared-tests can be built with ppc64le (#188882)
DeltaFile
+14-0libc/config/linux/power/entrypoints.txt
+7-2libc/src/__support/FPUtil/generic/sqrt.h
+7-0libc/config/linux/power/config.json
+2-0libc/cmake/modules/LLVMLibCArchitectures.cmake
+1-0libc/config/linux/power/headers.txt
+31-25 files

LLVM/project fd609e5lld/ELF Driver.cpp, lld/MachO Driver.cpp

[lld] Glob-based BP compression sort groups (#185661)

Add
--bp-compression-sort-section=<glob>[=<layout_priority>[=<match_priority>]]
to let users split input sections into multiple compression groups, run
balanced partitioning independently per group, and leave out sections
that are poor candidates for BP. This replaces the old coarse
--bp-compression-sort with a more explicit, user-controlled one.

In ELF, the glob matches input section names (.text.unlikely.cold1). In
Mach-O, it matches the concatenated segment+section name (__TEXT__text).

layout_priority controls group placement in the final layout.
match_priority resolves conflicts when multiple globs match the same
section: explicit priority beats positional matching, and among
positional specs the last match wins.
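
The conflict rule in the paragraph above can be sketched as follows. This is an illustration under assumptions, not lld's code: `SortSpec`, `globMatch` (which supports only `*`), and `pickSpec` are hypothetical names, and because the message does not say how two explicit priorities compare, the sketch simply keeps the earlier explicit spec in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parsed form of one --bp-compression-sort-section value.
struct SortSpec {
  std::string glob;
  int layoutPriority;     // placement of the group in the final layout
  bool hasMatchPriority;  // false means the spec is purely positional
};

// Minimal glob matcher that understands only '*'.
bool globMatch(const std::string &pat, const std::string &s,
               std::size_t p = 0, std::size_t i = 0) {
  if (p == pat.size())
    return i == s.size();
  if (pat[p] == '*') {
    for (std::size_t k = i; k <= s.size(); ++k)
      if (globMatch(pat, s, p + 1, k))
        return true;
    return false;
  }
  return i < s.size() && pat[p] == s[i] && globMatch(pat, s, p + 1, i + 1);
}

// Returns the index of the winning spec for a section name, or -1.
int pickSpec(const std::vector<SortSpec> &specs, const std::string &section) {
  int best = -1;
  for (int idx = 0; idx < static_cast<int>(specs.size()); ++idx) {
    const SortSpec &cand = specs[idx];
    if (!globMatch(cand.glob, section))
      continue;
    if (best < 0) {
      best = idx;
      continue;
    }
    const SortSpec &cur = specs[best];
    if (cand.hasMatchPriority && !cur.hasMatchPriority)
      best = idx; // explicit priority beats positional matching
    else if (!cand.hasMatchPriority && !cur.hasMatchPriority)
      best = idx; // among positional specs the last match wins
    // explicit vs explicit: not specified in the message; keep the earlier one
  }
  return best;
}
```

With the positional specs `.text.*` followed by `.text.unlikely.*`, the section `.text.unlikely.cold1` lands in the second group because it is the last positional match.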

A CRTP hook getCompressionSubgroupKey() allows backends to further
subdivide glob groups into independent BP instances. This allows Mach-O

    [3 lines not shown]
DeltaFile
+131-84lld/include/lld/Common/BPSectionOrdererBase.inc
+208-0lld/test/ELF/bp-section-orderer-cold.s
+112-0lld/test/MachO/compression-order-sections.s
+48-0lld/ELF/Driver.cpp
+48-0lld/include/lld/Common/BPSectionOrdererBase.h
+44-0lld/MachO/Driver.cpp
+591-8413 files not shown
+687-11019 files

LLVM/project 6d7c957clang/include/clang/ScalableStaticAnalysisFramework/Analyses EntityPointerLevel.h, clang/include/clang/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage UnsafeBufferUsage.h UnsafeBufferUsageExtractor.h

[NFC][SSAF][UnsafeBufferUsage] Separate EntityPointerLevel and UnsafeBufferUsage

EntityPointerLevel as a common data structure will later be shared by
UnsafeBufferUsage and pointer assignments analysis. So this commit
makes them separate:
- EntityPointerLevel provides the data structure and translation
- UnsafeBufferUsage uses EntityPointerLevel to translate unsafe pointers to EPLs.
DeltaFile
+332-0clang/lib/ScalableStaticAnalysisFramework/Analyses/EntityPointerLevel.cpp
+9-216clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsageExtractor.cpp
+134-0clang/include/clang/ScalableStaticAnalysisFramework/Analyses/EntityPointerLevel.h
+3-73clang/include/clang/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.h
+2-4clang/unittests/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsageTest.cpp
+0-5clang/include/clang/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsageExtractor.h
+480-2981 files not shown
+481-2987 files

LLVM/project 3d7eedcllvm/lib/Target/RISCV RISCVAsmPrinter.cpp, llvm/test/CodeGen/RISCV rv64-stackmap-nops.ll

[RISCV] Fix stackmap shadow trimming NOP size for compressed targets (#189774)

The shadow trimming loop in LowerSTACKMAP hardcoded a 4-byte decrement
per instruction, but when Zca is enabled NOPs are 2 bytes. Use NOPBytes
instead of the hardcoded 4 so the shadow is correctly trimmed on
compressed targets.
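
A simplified model of the trimming arithmetic, based only on the description above and not on the actual LowerSTACKMAP code: `nopsToEmit` and its parameters are hypothetical, and `followingInsts` stands for the instructions after the stackmap that the loop allows to count toward the shadow.

```cpp
#include <cassert>

// Returns how many NOPs still have to be emitted for a stackmap shadow.
// shadowBytes:    requested shadow size in bytes
// followingInsts: later instructions that may fill part of the shadow
// nopBytes:       size of one NOP (2 with Zca, 4 otherwise)
unsigned nopsToEmit(unsigned shadowBytes, unsigned followingInsts,
                    unsigned nopBytes) {
  assert(nopBytes == 2 || nopBytes == 4);
  assert(shadowBytes % nopBytes == 0);
  unsigned remaining = shadowBytes;
  while (remaining > 0 && followingInsts > 0) {
    --followingInsts;
    // Before the fix this step was a hardcoded 4. With 2-byte NOPs the
    // unsigned counter could then skip past zero instead of reaching it.
    remaining -= nopBytes;
  }
  return remaining / nopBytes;
}
```

For example, a 6-byte shadow followed by one usable instruction on a compressed target leaves two 2-byte NOPs to emit.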

Co-authored-by: Claude Opus 4.6 <noreply at anthropic.com>
DeltaFile
+14-2llvm/test/CodeGen/RISCV/rv64-stackmap-nops.ll
+1-1llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
+15-32 files

LLVM/project b9e01c2llvm/lib/Target/RISCV RISCVVectorPeephole.cpp, llvm/test/CodeGen/RISCV/rvv rvv-peephole-vmerge-to-vmv.mir

[RISCV] Relax VL constraint in convertSameMaskVMergeToVMv (#189797)

When converting a PseudoVMERGE_VVM to PseudoVMV_V_V, we previously
required MIVL <= TrueVL to avoid losing False elements in the tail.

Relax this constraint when the vmerge's False operand equals its
Passthru operand and the True instruction's tail policy is TU
(tail undisturbed). In this case, True's tail lanes preserve its
passthru value (which equals False and Passthru), so the conversion
is safe even when MIVL > TrueVL.
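
The relaxed condition can be written as a small predicate. This is a sketch that assumes both VLs are known constants; `canConvertVMergeToVMv` and its parameter names are hypothetical and do not appear in RISCVVectorPeephole.cpp.

```cpp
#include <cassert>

// Decide whether a vmerge can be turned into a vmv.v.v.
// mivl:                VL of the vmerge
// trueVL:              VL of the instruction defining the True operand
// falseEqualsPassthru: the vmerge's False operand is also its Passthru
// trueTailUndisturbed: the True instruction uses the TU tail policy
bool canConvertVMergeToVMv(unsigned mivl, unsigned trueVL,
                           bool falseEqualsPassthru,
                           bool trueTailUndisturbed) {
  // Original rule: True defines every lane the vmerge reads.
  if (mivl <= trueVL)
    return true;
  // Relaxed rule: lanes in [trueVL, mivl) of True still hold its passthru
  // value, which equals False, so no False element is lost.
  return falseEqualsPassthru && trueTailUndisturbed;
}
```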

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply at anthropic.com>
DeltaFile
+72-0llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-to-vmv.mir
+14-4llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp
+86-42 files

LLVM/project 7c260d3compiler-rt/lib/scudo/standalone combined.h

[scudo] Fix reallocate for MTE. (#190086)

For MTE, we can't use the whole size or we might trigger a segfault.
Therefore, use the exact size when MTE is enabled or the exact usable
size parameter is true.

Also, optimize out the call to getUsableSize and use a simpler
calculation.
DeltaFile
+10-1compiler-rt/lib/scudo/standalone/combined.h
+10-11 files

LLVM/project 54e480eclang/include/clang/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage UnsafeBufferUsageTest.h UnsafeBufferUsage.h, clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage UnsafeBufferUsage.cpp

address comments
DeltaFile
+27-0clang/include/clang/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsageTest.h
+1-7clang/unittests/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsageTest.cpp
+4-3clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.cpp
+4-1clang/include/clang/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.h
+36-114 files

LLVM/project 2939132llvm/lib/Target/WebAssembly/GISel WebAssemblyCallLowering.cpp, llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator args.ll args-simd.ll

[WebAssembly][GlobalISel] CallLowering `lowerFormalArguments` (#180263)

Implements `WebAssemblyCallLowering::lowerFormalArguments`

Split from #157161
DeltaFile
+233-3llvm/lib/Target/WebAssembly/GISel/WebAssemblyCallLowering.cpp
+209-0llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator/args.ll
+171-0llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator/args-simd.ll
+73-0llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator/args-swiftcc.ll
+686-34 files

LLVM/project 4d8f738llvm/test/CodeGen/AMDGPU memory-legalizer-private-wavefront.ll memory-legalizer-private-workgroup.ll

Merge branch 'main' into users/ziqingluo/eng/PR-171920065
DeltaFile
+8,544-1,366llvm/test/CodeGen/AMDGPU/memory-legalizer-private-wavefront.ll
+8,544-1,366llvm/test/CodeGen/AMDGPU/memory-legalizer-private-workgroup.ll
+8,544-1,366llvm/test/CodeGen/AMDGPU/memory-legalizer-private-singlethread.ll
+8,449-1,355llvm/test/CodeGen/AMDGPU/memory-legalizer-private-agent.ll
+8,449-1,355llvm/test/CodeGen/AMDGPU/memory-legalizer-private-cluster.ll
+8,069-1,315llvm/test/CodeGen/AMDGPU/memory-legalizer-private-system.ll
+50,599-8,1233,260 files not shown
+337,211-92,5133,266 files

LLVM/project 52fb23elibc/src/__support/math log1pf.h

[libc][math] Remove static from log1pf implementation (#190042)

Reflecting changes according to
https://github.com/llvm/llvm-project/commit/823e3e001724ca2e93ce410a675f3b538f8a74b3
DeltaFile
+3-3libc/src/__support/math/log1pf.h
+3-31 files

LLVM/project e87ea84libc/config config.json, libc/src/__support CMakeLists.txt

Reapply "[libc] Finetune libc.src.__support.OSUtil.osutil dependency." (#190033) (#190065)

This reverts commit 84f23eb3113f2e75d1a2e45db1b5c570a5d2f4c5 and fixes
GPU builds.
DeltaFile
+17-9libc/src/__support/CMakeLists.txt
+19-5libc/test/UnitTest/CMakeLists.txt
+21-2libc/test/UnitTest/TestLogger.cpp
+6-0libc/src/unistd/CMakeLists.txt
+6-0libc/src/time/linux/CMakeLists.txt
+6-0libc/config/config.json
+75-163 files not shown
+83-179 files

LLVM/project e5a7a9aclang/cmake/caches Fuchsia-stage2.cmake

[Fuchsia] Cortex-m33 runtime libraries hard float ABI (#190023)

Make the cortex-m33 runtime libraries build with the hard float ABI
instead of softfp.
DeltaFile
+2-2clang/cmake/caches/Fuchsia-stage2.cmake
+2-21 files

LLVM/project b3ca423mlir/lib/Dialect/Vector/Transforms VectorUnroll.cpp, mlir/test/Dialect/Vector vector-unroll-options.mlir

[MLIR][Vector] Enhance vector.multi_reduction unrolling to handle scalar result (#188633)

Previously, UnrollMultiReductionPattern bailed out when all the
dimensions were reduced to a scalar. This PR adds support for this case
by tiling the source vector and chaining partial reductions through the
accumulator operand.
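
The chaining idea can be shown on a flattened array. This is a plain C++ analogy under stated assumptions, not the MLIR pattern itself: `reduceTile` stands in for one unrolled `vector.multi_reduction` on a tile, addition is the assumed combining kind, and both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reduce the elements in [begin, end) into the incoming accumulator.
float reduceTile(const std::vector<float> &src, std::size_t begin,
                 std::size_t end, float acc) {
  for (std::size_t i = begin; i < end; ++i)
    acc += src[i];
  return acc;
}

// Full reduction to a scalar: walk the source tile by tile and feed each
// partial result into the next tile's accumulator operand.
float tiledReduceAdd(const std::vector<float> &src, std::size_t tile,
                     float acc) {
  assert(tile > 0);
  for (std::size_t b = 0; b < src.size(); b += tile) {
    std::size_t e = std::min(b + tile, src.size());
    acc = reduceTile(src, b, e, acc);
  }
  return acc;
}
```

Because every tile consumes the previous accumulator, no separate step is needed to combine the partial results at the end.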
DeltaFile
+25-6mlir/lib/Dialect/Vector/Transforms/VectorUnroll.cpp
+8-6mlir/test/Dialect/Vector/vector-unroll-options.mlir
+33-122 files

LLVM/project 1a1fbf9mlir/lib/Dialect/XeGPU/Transforms XeGPUWgToSgDistribute.cpp, mlir/test/Dialect/XeGPU xegpu-wg-to-sg-unify-ops-rr.mlir xegpu-wg-to-sg-rr.mlir

[MLIR][XeGPU] Support round-robin layout for constant and broadcast in wg-to-sg distribution (#189798)

As title.
DeltaFile
+26-0mlir/test/Dialect/XeGPU/xegpu-wg-to-sg-unify-ops-rr.mlir
+17-7mlir/lib/Dialect/XeGPU/Transforms/XeGPUWgToSgDistribute.cpp
+1-1mlir/test/Dialect/XeGPU/xegpu-wg-to-sg-rr.mlir
+44-83 files

LLVM/project 7d24b17llvm/lib/Target/RISCV RISCVOptWInstrs.cpp, llvm/test/CodeGen/RISCV opt-w-instrs-p-ext.mir

[RISCV] Add SATI_RV64/USATI_RV64 to RISCVOptWInstrs. (#190030)

Note that the immediates for these two instructions in their MachineInstr
representations both use the type width. The SATI_RV64 binary encoding
and the RISCVISD::SATI encoding use the type width minus one.

Assisted-by: Claude Sonnet 4.5
DeltaFile
+128-0llvm/test/CodeGen/RISCV/opt-w-instrs-p-ext.mir
+8-0llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
+136-02 files

LLVM/project a9df7c7llvm/lib/Target/AMDGPU SIInstructions.td, llvm/test/CodeGen/AMDGPU bf16-math.ll

[AMDGPU] True16 support for bf16 clamp pattern on gfx1250 (#190036)
DeltaFile
+174-55llvm/test/CodeGen/AMDGPU/bf16-math.ll
+9-1llvm/lib/Target/AMDGPU/SIInstructions.td
+183-562 files

LLVM/project c6669c4llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/AArch64 fma-conversion-multi-use-guard.ll

[SLP] Guard FMulAdd conversion to require single-use/non-reordered FMul operands

The FMulAdd (CombinedVectorize) transformation in transformNodes() marks
an FMul child entry with zero cost, assuming it is fully absorbed into
the fmuladd intrinsic. However, when any FMul scalar has multiple uses
(e.g., also stored separately), the FMul must survive as a separate
node.

Reviewers: hiraditya, RKSimon, bababuck

Pull Request: https://github.com/llvm/llvm-project/pull/189692
DeltaFile
+6-14llvm/test/Transforms/SLPVectorizer/AArch64/fma-conversion-multi-use-guard.ll
+16-0llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+22-142 files

LLVM/project 6c92374llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor/AMDGPU nofpclass-amdgcn-fract.ll

ValueTracking: llvm.amdgcn.fract cannot introduce overflow (#189002)

This returns a value with an absolute value less than 1.
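
A rough numeric model of why the result cannot overflow, for finite inputs only. `fractModel` is a hypothetical stand-in and not the intrinsic's definition; in particular, the clamp just below 1.0 is an assumption of this model, added so that rounding in `x - floor(x)` cannot produce exactly 1.0.

```cpp
#include <cassert>
#include <cmath>

// Fractional part of a finite float, kept strictly below 1.0.
float fractModel(float x) {
  assert(std::isfinite(x));
  float r = x - std::floor(x);
  // Largest float below 1.0; tiny negative inputs would otherwise round to 1.0.
  const float belowOne = std::nextafter(1.0f, 0.0f);
  return r < belowOne ? r : belowOne;
}
```

Since the magnitude of the result stays below 1, the operation can never turn a finite input into an infinity.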
DeltaFile
+26-0llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-fract.ll
+2-1llvm/lib/Analysis/ValueTracking.cpp
+28-12 files

LLVM/project 478a6ablldb/packages/Python/lldbsuite/test/make Makefile.rules

[lldb/test] Codesign executables built with custom Makefile rules (#189902)

Tests with custom a.out targets in their Makefile (e.g.
`TestBSDArchives.py`) bypass the standard Makefile.rules linking step
where `CODESIGN` is applied. This leaves the binary unsigned, causing
the process to get killed on remote Darwin devices.

This adds a codesigning step to the `all` target in Makefile.rules that
signs both $(EXE) and a.out if they exist. This ensures all test
binaries are signed regardless of how they were built.

rdar://173840592

Signed-off-by: Med Ismail Bennani <ismail at bennani.ma>
DeltaFile
+7-0lldb/packages/Python/lldbsuite/test/make/Makefile.rules
+7-01 files

LLVM/project b75bf1eclang/docs ReleaseNotes.rst, clang/lib/Analysis ThreadSafety.cpp

Revert "Thread Safety Analysis: Drop call-based alias invalidation (#187691)" (#190041)

This reverts commit 873d6bc3b415f1c2d942bbf4e4219c4bdcd4f2f8.

This causes the Linux kernel build to fail because it relied on
alias-invalidation in kernel/sched/core.c.
DeltaFile
+52-0clang/lib/Analysis/ThreadSafety.cpp
+11-26clang/test/SemaCXX/warn-thread-safety-analysis.cpp
+0-5clang/docs/ReleaseNotes.rst
+63-313 files

LLVM/project 9f50004mlir/lib/Dialect/XeGPU/Transforms XeGPUPeepHoleOptimizer.cpp, mlir/test/Dialect/XeGPU peephole-optimize.mlir

[MLIR][XeGPU] Enhance the peephole optimization to remove the convert_layout after multi-reduction rewrite (#188849)
DeltaFile
+57-28mlir/test/Dialect/XeGPU/peephole-optimize.mlir
+25-0mlir/lib/Dialect/XeGPU/Transforms/XeGPUPeepHoleOptimizer.cpp
+82-282 files

LLVM/project 09264aeoffload CMakeLists.txt

Merge commit '61a43720f3e31357ff3842a02d5460e71e4062a6' into HEAD
DeltaFile
+0-114offload/CMakeLists.txt
+0-1141 files

LLVM/project 61a4372offload CMakeLists.txt

Merge commit '1e19b4364dd3f827e4110b0bc14ec31bf5bbaf59' into HEAD
DeltaFile
+0-115offload/CMakeLists.txt
+0-1151 files