LLVM/project 02402bellvm/include/llvm/Transforms/Vectorize/SandboxVectorizer VecUtils.h, llvm/lib/Transforms/Vectorize/SandboxVectorizer/Passes BottomUpVec.cpp

[SandboxVec][VecUtils] Lane Enumerator (#188355)

This patch introduces an iterator that helps us iterate over lane-value
pairs in a range. For example, given a container `(i32 %v0, <2 x i32>
%v1, i32 %v2)` we get:
```
Lane Value
  0   %v0
  1   %v1
  3   %v2
```

We use this iterator to replace the lane counting logic in
BottomUpVec.cpp.
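
The lane numbering in the table above can be reproduced with a running sum of lane counts. The following is a minimal sketch, not the iterator added in VecUtils.h: the `Value` struct and `enumerateLanes` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR value: a name plus the number of lanes it
// occupies (1 for a scalar, N for <N x T>).
struct Value {
  std::string Name;
  unsigned NumLanes;
};

// Returns (lane, value) pairs. The lane of each value is the sum of the lane
// counts of all values that precede it.
std::vector<std::pair<unsigned, const Value *>>
enumerateLanes(const std::vector<Value> &Vals) {
  std::vector<std::pair<unsigned, const Value *>> Result;
  unsigned Lane = 0;
  for (const Value &V : Vals) {
    assert(V.NumLanes > 0 && "a value must occupy at least one lane");
    Result.emplace_back(Lane, &V);
    Lane += V.NumLanes;
  }
  return Result;
}
```

With the container from the example, `%v1` occupies lanes 1 and 2, which is why `%v2` lands on lane 3.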
DeltaFile
+60-0llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/VecUtils.h
+31-0llvm/unittests/Transforms/Vectorize/SandboxVectorizer/VecUtilsTest.cpp
+2-11llvm/lib/Transforms/Vectorize/SandboxVectorizer/Passes/BottomUpVec.cpp
+93-113 files

LLVM/project 18e6958clang/lib/Sema SemaAMDGPU.cpp, clang/test/CodeGenCUDA builtins-spirv-amdgcn.cu

[SPIRV][AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (#134016)

This change adds two builtins for AMDGPU:

- `__builtin_amdgcn_processor_is`, which is similar in observable
behaviour to `__builtin_cpu_is`, except that it is never "evaluated"
at run time;
- `__builtin_amdgcn_is_invocable`, which is behaviourally similar to
`__has_builtin`, except that it is not a macro (i.e. not evaluated at
preprocessing time).

Neither of these is `constexpr`, even though when compiling for
concrete (i.e. `gfxXXX` / `gfxXXX-generic`) targets they get evaluated
in Clang, so they shouldn't tear the AST too badly / at all for
multi-pass compilation cases like HIP. They can only be used in specific
contexts (as args to control structures).

The motivation for adding these is two-fold:


    [18 lines not shown]
DeltaFile
+304-0clang/lib/Sema/SemaAMDGPU.cpp
+281-0llvm/test/CodeGen/SPIRV/SpecConstants/amdgcnspirv-feature-predicate-specconstant.ll
+222-10clang/test/CodeGenCUDA/builtins-spirv-amdgcn.cu
+111-79clang/test/CodeGenOpenCL/builtins-amdgcn-vi.cl
+153-0llvm/test/CodeGen/SPIRV/passes/SPIRVPrepareGlobals-predicate-id-string.ll
+115-0clang/test/SemaHIP/amdgpu-feature-predicates-guard-use.hip
+1,186-8941 files not shown
+2,099-16147 files

LLVM/project 5b00cdflldb/source/Plugins/Platform/MacOSX PlatformDarwinDevice.cpp PlatformDarwin.cpp, lldb/unittests/Platform PlatformMacOSXTest.cpp

[lldb][macOS] Recognize new layouts for DeviceSupport directories (#188646)

When debugging a remote Darwin device (iOS, macOS, etc), lldb needs to
find a local copy of all the system libraries (the system's shared
cache) so we don't need to read them over gdb-remote serial protocol at
the start of every debug session.

Xcode etc normally creates these expanded shared caches in
~/Library/Developer/Xcode/<OS> DeviceSupport/<OS VER> (<OS
BUILD>)/Symbols

So when lldb sees a file like /usr/lib/libSystem.B.dylib, it may find a
copy in
~/L/D/Xcode/iOS DeviceSupport/26.2
(23B87)/Symbols/usr/lib/libSystem.B.dylib

There may be multiple expanded shared caches in these DeviceSupport
directories, so we try to parse the "os version" and "os build" out of
the filepath name, and look in a directory that matches the target

    [23 lines not shown]
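
As an illustration of the `<OS VER> (<OS BUILD>)` directory naming described above, here is a minimal sketch of splitting such a name into its two parts. It is not lldb's implementation and does not cover the new layouts this commit recognizes; `DeviceSupportName` and `parseDeviceSupportName` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type; not lldb's actual data structure.
struct DeviceSupportName {
  std::string Version; // e.g. "26.2"
  std::string Build;   // e.g. "23B87"
};

// Parses a directory name of the form "<OS VER> (<OS BUILD>)" into Out.
// Returns false, leaving Out untouched, if the name does not match.
bool parseDeviceSupportName(const std::string &Name, DeviceSupportName &Out) {
  const std::size_t Open = Name.find(" (");
  if (Open == std::string::npos || Open == 0 || Name.back() != ')')
    return false;
  const std::size_t BuildStart = Open + 2;
  // The trailing ')' sits after " (", so this subtraction cannot underflow.
  assert(BuildStart <= Name.size() - 1);
  const std::size_t BuildLen = Name.size() - 1 - BuildStart;
  if (BuildLen == 0)
    return false;
  Out.Version = Name.substr(0, Open);
  Out.Build = Name.substr(BuildStart, BuildLen);
  return true;
}
```

For the path in the example, the directory name `26.2 (23B87)` would split into version `26.2` and build `23B87`.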
DeltaFile
+65-76lldb/source/Plugins/Platform/MacOSX/PlatformDarwinDevice.cpp
+29-0lldb/unittests/Platform/PlatformMacOSXTest.cpp
+20-8lldb/source/Plugins/Platform/MacOSX/PlatformDarwin.cpp
+16-5lldb/source/Plugins/Platform/MacOSX/PlatformDarwinDevice.h
+5-3lldb/source/Plugins/Platform/MacOSX/PlatformRemoteDarwinDevice.cpp
+135-925 files

LLVM/project 74c4243libc/include search.yaml, libc/src/search CMakeLists.txt twalk_r.cpp

[libc][tsearch] add tsearch functions (#172625)
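
For readers unfamiliar with the POSIX `tsearch` family, the following is a minimal usage sketch written against the standard `<search.h>` interface (`tsearch`, `twalk`, `tdelete`). It says nothing about how LLVM libc implements these functions; `compareInts`, `collectInOrder` and `sortedUnique` are hypothetical helper names.

```cpp
#include <cassert>
#include <search.h>
#include <vector>

// Orders keys by the int they point to, as the tsearch family requires.
static int compareInts(const void *A, const void *B) {
  const int L = *static_cast<const int *>(A);
  const int R = *static_cast<const int *>(B);
  return (L > R) - (L < R);
}

// twalk has no user-data argument, so this sketch collects into a global.
static std::vector<int> Visited;

static void collectInOrder(const void *Node, VISIT Which, int /*Depth*/) {
  // Internal nodes are visited three times; `postorder` is the in-order
  // visit. Leaves are visited once, as `leaf`.
  if (Which == postorder || Which == leaf)
    Visited.push_back(**static_cast<const int *const *>(Node));
}

// Inserts Keys into a tree, walks it, and returns the distinct keys sorted.
std::vector<int> sortedUnique(const std::vector<int> &Keys) {
  void *Root = nullptr;
  for (const int &K : Keys) {
    void *Node = tsearch(&K, &Root, compareInts);
    assert(Node && "tsearch returns null only on allocation failure");
    (void)Node;
  }
  Visited.clear();
  twalk(Root, collectInOrder);
  // Free the nodes one root at a time; the keys are owned by `Keys`.
  while (Root) {
    const int *Key = *static_cast<const int *const *>(Root);
    tdelete(Key, &Root, compareInts);
  }
  return Visited;
}
```

Because `tsearch` returns the existing node when an equal key is already present, duplicates in the input collapse to a single entry.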
DeltaFile
+457-0libc/test/src/search/tsearch_test.cpp
+71-0libc/src/search/CMakeLists.txt
+48-0libc/include/search.yaml
+37-0libc/src/search/twalk_r.cpp
+37-0libc/src/search/tdelete.cpp
+36-0libc/src/search/twalk.cpp
+686-024 files not shown
+1,107-1030 files

LLVM/project d74f098llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp, llvm/test/CodeGen/AMDGPU clamp.ll mad-mix-lo.ll

[DAG] isKnownNeverNaN - fallback to computeKnownFPClass check (#189476)

Remove ConstantFPSDNode handling from isKnownNeverNaN and fall back to
using computeKnownFPClass if there are no opcode matches in
isKnownNeverNaN.

The test check changes are due to isKnownNeverNaN not handling
UNDEF/POISON, whereas computeKnownFPClass does (POISON in particular now
returns isKnownNeverNaN == true, preventing an ISD::FCANONICALIZE call in
expandFMINNUM_FMAXNUM).
DeltaFile
+5-8llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+6-6llvm/test/CodeGen/AMDGPU/clamp.ll
+4-8llvm/test/CodeGen/AMDGPU/mad-mix-lo.ll
+15-223 files

LLVM/project 8b21fe6lld/test/COFF lto-libcall-archive-bitcode.test, lld/test/ELF/lto libcall-archive-bitcode.test

[LTO][LLD] Prevent invalid LTO libfunc transforms (#164916)

In LTO, part of LLVM's middle-end runs after linking has finished. LTO's
semantics depend on the complete set of extracted bitcode files being
known at this time. If the middle-end inserts new calls to library
functions (libfuncs) that are implemented in bitcode, this could extract
new bitcode object files into the link. These cannot be compiled,
leading to undefined symbol references.

Additionally, the middle-end in LTO may reason that such library
functions have no references, and it may internalize them, then
manipulate their API or even delete them. Afterwards, it may emit a call
to them, again producing undefined symbol references.

This patch resolves the former issue by ensuring that the middle end
emits no new references to symbols defined in bitcode, and it resolves
the latter issue by ensuring that extracted bitcode for libfuncs is
considered external, since new calls may be emitted to them at any time.


    [8 lines not shown]
DeltaFile
+52-19llvm/lib/LTO/LTO.cpp
+56-0lld/test/wasm/lto/libcall-archive-bitcode.ll
+54-0lld/test/ELF/lto/libcall-archive-bitcode.test
+51-0lld/test/COFF/lto-libcall-archive-bitcode.test
+35-0llvm/test/LTO/Resolution/X86/libcall-in-thin-link.ll
+34-0llvm/test/LTO/Resolution/X86/libcall-in-tu.ll
+282-1919 files not shown
+488-5225 files

LLVM/project 878214fllvm/lib/Target/AMDGPU GCNSchedStrategy.cpp GCNSchedStrategy.h, llvm/test/CodeGen/AMDGPU sched_mfma_rewrite_copies.mir misched-remat-revert.ll

[AMDGPU][Scheduler] Use MIR-level rematerializer in rematerialization stage

This makes the scheduler's rematerialization stage use the
target-independent rematerializer. Previously duplicated logic is
deleted, and restrictions are put in place in the stage so that the
same constraints as before apply on rematerializable registers (as the
rematerializer is able to expose many more rematerialization
opportunities than what the stage can track at the moment).
Consequently it is not expected that this change improves performance
overall, but it is a first step toward being able to use the
rematerializer's more advanced capabilities during scheduling.

This is *not* an NFC for 2 reasons.

- Score equalities between two rematerialization candidates with
  otherwise equivalent score are decided by their corresponding
  register's index handle in the rematerializer (previously the pointer
  to their state object's value). This is determined by the
  rematerializer's register collection order, which is different from

    [10 lines not shown]
DeltaFile
+551-551llvm/test/CodeGen/AMDGPU/sched_mfma_rewrite_copies.mir
+0-577llvm/test/CodeGen/AMDGPU/misched-remat-revert.ll
+100-291llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+46-71llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+36-36llvm/test/CodeGen/AMDGPU/sched_mfma_rewrite_cost.mir
+19-19llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-attr.mir
+752-1,5452 files not shown
+775-1,5688 files

LLVM/project fcc69c3llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp GCNSchedStrategy.h

Format
DeltaFile
+5-5llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+1-1llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+6-62 files

LLVM/project 7a3b7f1clang/include/clang/CIR MissingFeatures.h, clang/lib/CIR/CodeGen CIRGenCleanup.cpp CIRGenCleanup.h

[CIR] Implement handling of cleanups with active flag (#187389)

This implements handling of cleanup scopes in cases where a flag is
needed to indicate whether or not the cleanup is active. This happens in
cases where a cleanup is no longer required, but it isn't at the top of
the cleanup stack so it can't be popped. A temporary variable is used to
set the cleanup to an inactive state when it is no longer needed.
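
The idea of an active flag can be pictured with a plain C++ analogy. This is only a sketch of the concept, not CIR's cleanup machinery: `CleanupStack`, `push`, `deactivate` and `popAll` are hypothetical names, and the `Active` field stands in for the temporary variable mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical model of a cleanup stack; an analogy, not CIR's API.
class CleanupStack {
  struct Entry {
    std::function<void()> Action;
    bool Active; // plays the role of the "active flag" temporary
  };
  std::vector<Entry> Entries;

public:
  // Pushes a cleanup and returns a handle that can later deactivate it.
  std::size_t push(std::function<void()> Action) {
    Entries.push_back({std::move(Action), true});
    return Entries.size() - 1;
  }

  // Marks a cleanup as no longer needed. It stays on the stack because
  // entries above it still have to run in LIFO order.
  void deactivate(std::size_t Handle) {
    assert(Handle < Entries.size() && "invalid cleanup handle");
    Entries[Handle].Active = false;
  }

  // Pops every entry in LIFO order, running only the active ones.
  void popAll() {
    while (!Entries.empty()) {
      Entry E = std::move(Entries.back());
      Entries.pop_back();
      if (E.Active)
        E.Action();
    }
  }
};
```

A cleanup buried under another entry cannot be popped early, so it is deactivated instead and skipped when the stack unwinds.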

Assisted-by: Cursor / claude-4.6-opus-high (implementation)
Assisted-by: Cursor / gpt-5.3-codex (tests)
DeltaFile
+374-0clang/test/CIR/CodeGen/new-delete-deactivation.cpp
+95-8clang/lib/CIR/CodeGen/CIRGenCleanup.cpp
+20-0clang/lib/CIR/CodeGen/CIRGenCleanup.h
+0-1clang/include/clang/CIR/MissingFeatures.h
+489-94 files

LLVM/project 54b7230mlir/lib/Dialect/Affine/IR AffineOps.cpp, mlir/lib/Dialect/Affine/Transforms AffineExpandIndexOpsAsAffine.cpp

[MLIR][Affine] Add vector support to affine.linearize_index and affine.delinearize_index (#188369)

Allow `affine.delinearize_index` and `affine.linearize_index` to operate
on `vector<...x index>` types in addition to scalar indices.
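
To make the arithmetic concrete, here is a minimal scalar model of delinearizing and linearizing an index over a basis, plus a lane-wise wrapper that mimics the vector case. It is a sketch that assumes non-negative, in-bounds indices and is not the MLIR implementation; all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Splits a flat index into one coordinate per basis element, innermost last.
std::vector<int64_t> delinearize(int64_t Index,
                                 const std::vector<int64_t> &Basis) {
  std::vector<int64_t> Coords(Basis.size());
  for (std::size_t I = Basis.size(); I-- > 0;) {
    assert(Basis[I] > 0 && "basis elements must be positive");
    Coords[I] = Index % Basis[I];
    Index /= Basis[I];
  }
  return Coords;
}

// Inverse of delinearize for in-bounds coordinates.
int64_t linearize(const std::vector<int64_t> &Coords,
                  const std::vector<int64_t> &Basis) {
  assert(Coords.size() == Basis.size() && "one coordinate per basis element");
  int64_t Index = 0;
  for (std::size_t I = 0; I < Coords.size(); ++I)
    Index = Index * Basis[I] + Coords[I];
  return Index;
}

// Lane-wise model of the vector form: Results[K][Lane] is the K-th
// coordinate of Indices[Lane], giving one result vector per basis element.
std::vector<std::vector<int64_t>>
delinearizeLanes(const std::vector<int64_t> &Indices,
                 const std::vector<int64_t> &Basis) {
  std::vector<std::vector<int64_t>> Results(
      Basis.size(), std::vector<int64_t>(Indices.size()));
  for (std::size_t Lane = 0; Lane < Indices.size(); ++Lane) {
    const std::vector<int64_t> Coords = delinearize(Indices[Lane], Basis);
    for (std::size_t K = 0; K < Basis.size(); ++K)
      Results[K][Lane] = Coords[K];
  }
  return Results;
}
```

For example, with basis (2, 3, 4) the flat index 11 splits into (0, 2, 3), and linearizing (0, 2, 3) gives 11 again.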

---------

Signed-off-by: Keshav Vinayak Jha <keshavvinayakjha at gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply at anthropic.com>
DeltaFile
+167-0mlir/test/Dialect/Affine/canonicalize.mlir
+96-18mlir/lib/Dialect/Affine/Transforms/AffineExpandIndexOpsAsAffine.cpp
+76-0mlir/test/Dialect/Affine/affine-expand-index-ops-as-affine.mlir
+47-0mlir/test/Dialect/Affine/affine-expand-index-ops.mlir
+42-0mlir/test/Dialect/Affine/ops.mlir
+27-11mlir/lib/Dialect/Affine/IR/AffineOps.cpp
+455-293 files not shown
+506-379 files

LLVM/project 6e6dd04llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp GCNSchedStrategy.h, llvm/test/CodeGen/AMDGPU sched_mfma_rewrite_copies.mir misched-remat-revert.ll

[AMDGPU][Scheduler] Use MIR-level rematerializer in rematerialization stage

This makes the scheduler's rematerialization stage use the
target-independent rematerializer. Previously duplicated logic is
deleted, and restrictions are put in place in the stage so that the
same constraints as before apply on rematerializable registers (as the
rematerializer is able to expose many more rematerialization
opportunities than what the stage can track at the moment).
Consequently it is not expected that this change improves performance
overall, but it is a first step toward being able to use the
rematerializer's more advanced capabilities during scheduling.

This is *not* an NFC for 2 reasons.

- Score equalities between two rematerialization candidates with
  otherwise equivalent score are decided by their corresponding
  register's index handle in the rematerializer (previously the pointer
  to their state object's value). This is determined by the
  rematerializer's register collection order, which is different from

    [10 lines not shown]
DeltaFile
+551-551llvm/test/CodeGen/AMDGPU/sched_mfma_rewrite_copies.mir
+0-577llvm/test/CodeGen/AMDGPU/misched-remat-revert.ll
+103-294llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+47-72llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+36-36llvm/test/CodeGen/AMDGPU/sched_mfma_rewrite_cost.mir
+19-19llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-attr.mir
+756-1,5492 files not shown
+779-1,5728 files

LLVM/project ae835demlir/include/mlir/Dialect/AMDGPU/IR AMDGPUOps.td, mlir/lib/Conversion/AMDGPUToROCDL AMDGPUToROCDL.cpp

[mlir][amdgpu] implement amdgpu.global_load_async_to_lds for gfx1250 (#189279)

This patch introduces an amdgpu wrapper for
`rocdl.global.load.async.to.lds.bN` intrinsics, which were introduced in
gfx1250.

Assisted-by: Claude

---------

Signed-off-by: Eric Feng <Eric.Feng at amd.com>
DeltaFile
+73-0mlir/test/Conversion/AMDGPUToROCDL/gfx1250.mlir
+68-2mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+46-0mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPUOps.td
+35-0mlir/test/Dialect/AMDGPU/invalid.mlir
+31-0mlir/lib/Dialect/AMDGPU/IR/AMDGPUOps.cpp
+26-0mlir/test/Dialect/AMDGPU/ops.mlir
+279-26 files

LLVM/project 76f5c5dclang-tools-extra/clang-tidy/readability ImplicitBoolConversionCheck.cpp ImplicitBoolConversionCheck.h, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Add AllowLogicalOperatorConversion option to implicit-bool-conversion (#189149)

Fixes https://github.com/llvm/llvm-project/issues/176889.
DeltaFile
+94-0clang-tools-extra/test/clang-tidy/checkers/readability/implicit-bool-conversion-allow-logical-operators.c
+19-0clang-tools-extra/clang-tidy/readability/ImplicitBoolConversionCheck.cpp
+7-0clang-tools-extra/docs/clang-tidy/checks/readability/implicit-bool-conversion.rst
+4-0clang-tools-extra/docs/ReleaseNotes.rst
+1-0clang-tools-extra/clang-tidy/readability/ImplicitBoolConversionCheck.h
+125-05 files

LLVM/project 5f99854llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/IR AutoUpgrade.cpp

[AMDGPU] Drop A and B neg modifier from amdgcn_wmma_bf16_16x16x32_bf16 (#189468)

Fixes: LCOMPILER-1673
DeltaFile
+6-46llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imod.gfx1250.w32.ll
+10-10llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx1250-w32.mir
+13-0llvm/test/Bitcode/amdgpu-wmma-drop-ab-mods-upgrade.ll
+7-3llvm/lib/IR/AutoUpgrade.cpp
+3-5llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+3-5mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
+42-6910 files not shown
+60-8816 files

LLVM/project e50f08bmlir/lib/Dialect/XeGPU/Transforms XeGPUSgToWiDistributeExperimental.cpp, mlir/test/Dialect/XeGPU sg-to-wi-experimental-unit.mlir

[MLIR] [XeGPU] Add distribution patterns for vector transpose, bitcast & mask ops in sg to wi pass (#187392)

This PR adds patterns for the following vector ops in the new sg-to-wi pass:

1. Transpose
2. BitCast
3. CreateMask
4. ConstantMask
DeltaFile
+178-10mlir/lib/Dialect/XeGPU/Transforms/XeGPUSgToWiDistributeExperimental.cpp
+108-0mlir/test/Dialect/XeGPU/sg-to-wi-experimental-unit.mlir
+286-102 files

LLVM/project 19caff4utils/bazel/llvm-project-overlay/mlir BUILD.bazel, utils/bazel/llvm-project-overlay/mlir/unittests BUILD.bazel

[Bazel] Fixes b6e4d27 (#189473)

This fixes b6e4d27c485af711214b3dafc96fa287e2fe33f6.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+13-0utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+2-0utils/bazel/llvm-project-overlay/mlir/unittests/BUILD.bazel
+15-02 files

LLVM/project 77bc575llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp GCNSchedStrategy.h

[AMDGPU][Scheduler] Prepare remat. stage for rematerializer integration (NFC)

This NFC prepares the scheduler's rematerialization stage for
integration with the target-independent rematerializer. It brings
various small design changes and optimizations to the stage's internal
state to make the not-exactly-NFC rematerializer integration as small as
possible.

The main changes are, in no particular order:

- Sort and pick useful rematerialization candidates by their index in
  the vector of candidates instead of directly sorting objects within
  the candidate vector. This reduces the amount of data movement and
  simplifies the candidate selection logic.
- Move some data members from `PreRARematStage::RematReg` to
  `PreRARematStage::ScoredRemat`. This makes the former a simplified
  version of the rematerializer's own internal register representation
  (`Rematerializer::Reg`), which can be cleanly deleted during
  integration.

    [8 lines not shown]
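
The first item in the list above, sorting candidate indices rather than the candidate objects, can be sketched as follows. This is an illustration only: `Candidate` and `sortedCandidateIndices` are hypothetical names, and breaking ties by ascending index is an assumption made here to keep the order deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical candidate record; the real one carries far more state.
struct Candidate {
  int Score;
};

// Returns candidate indices ordered by descending score. Equal scores are
// ordered by ascending index. The candidate vector itself is never moved.
std::vector<std::size_t>
sortedCandidateIndices(const std::vector<Candidate> &Cands) {
  std::vector<std::size_t> Order(Cands.size());
  std::iota(Order.begin(), Order.end(), std::size_t{0});
  std::sort(Order.begin(), Order.end(), [&](std::size_t A, std::size_t B) {
    assert(A < Cands.size() && B < Cands.size());
    if (Cands[A].Score != Cands[B].Score)
      return Cands[A].Score > Cands[B].Score;
    return A < B;
  });
  return Order;
}
```

Only the small index vector is permuted, so large candidate objects stay in place and any reference to them by index remains valid.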
DeltaFile
+154-141llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+58-51llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+11-0llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+4-0llvm/lib/Target/AMDGPU/GCNRegPressure.h
+227-1924 files

LLVM/project 9d3079allvm/lib/CodeGen InlineAsmPrepare.cpp

[NFC][CodeGen] Prepare for expansion of InlineAsmPrepare (#189469)

Move some functions around so that the CallBrInst processing is
contained. The 'static' functions don't need to be declared at the top;
just place them before the calls. Fix the naming to use lower-case for
the first letter of function names.
DeltaFile
+151-137llvm/lib/CodeGen/InlineAsmPrepare.cpp
+151-1371 files

LLVM/project a0ffdf2clang/lib/CIR/CodeGen CIRGenModule.cpp, clang/test/CIR/CodeGen ctor-alias-prev-decl.cpp dtor-alias-prev-decl.cpp

[CIR] Allow replacement of a structor declaration with an alias (#188320)

We had an errorNYI diagnostic to trigger when we generated an alias for
a ctor or dtor that had an existing declaration. Because functions are
used via flat symbol references, all that is needed is to erase the old
declaration. This change does that.
DeltaFile
+43-0clang/test/CIR/CodeGen/ctor-alias-prev-decl.cpp
+42-0clang/test/CIR/CodeGen/dtor-alias-prev-decl.cpp
+8-2clang/lib/CIR/CodeGen/CIRGenModule.cpp
+93-23 files

LLVM/project f732918clang/docs ClangIRCleanupAndEHDesign.md, clang/lib/CIR/Dialect/Transforms EHABILowering.cpp

[CIR] Handle throwing calls inside EH cleanup (#188341)

This implements handling for throwing calls inside an EH cleanup
handler. When such a call occurs, the CFG flattening pass replaces it
with a cir.try_call op that unwinds to a terminate block.

A new CIR operation, cir.eh.terminate, is added to facilitate this
handling, and the design document is updated to describe the new
behavior.

Assisted-by: Cursor / claude-4.6-opus-high
DeltaFile
+166-0clang/test/CIR/Transforms/flatten-throwing-in-cleanup.cir
+0-120clang/test/CIR/Transforms/flatten-cleanup-scope-nyi.cir
+118-0clang/test/CIR/CodeGen/cleanup-throwing-dtor.cpp
+91-4clang/docs/ClangIRCleanupAndEHDesign.md
+72-0clang/lib/CIR/Dialect/Transforms/EHABILowering.cpp
+57-0clang/test/CIR/Transforms/eh-abi-lowering-itanium.cir
+504-1242 files not shown
+579-1358 files

LLVM/project 8573b5eclang-tools-extra/test/clang-doc enum.cpp, llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

Merge branch 'main' into users/amehsan/weakc-delta-overflow
DeltaFile
+464-226clang-tools-extra/test/clang-doc/enum.cpp
+380-248llvm/test/Analysis/CostModel/AMDGPU/log10.ll
+380-248llvm/test/Analysis/CostModel/AMDGPU/log.ll
+606-0llvm/test/CodeGen/AMDGPU/coexec-scheduler.ll
+423-23llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+218-218llvm/test/Analysis/CostModel/AMDGPU/sqrt.ll
+2,471-963285 files not shown
+8,300-2,896291 files

LLVM/project b6e4d27mlir/include/mlir/Interfaces/Utils MemorySlotUtils.h, mlir/lib/Dialect/SCF/IR MemorySlot.cpp

[MLIR][Mem2Reg] Extract shared utilities for PromotableRegionOpInterface (#188514)

The `PromotableRegionOpInterface` implementations use two helpers that
are likely useful for other dialects implementing this interface as
well:
- `updateTerminator`: Appends the reaching definition as an operand to a
block's terminator, falling back to a default when the block has no
entry (e.g. dead code).
- `replaceWithNewResults`: Clones an operation with additional result
types while preserving its regions, then replaces the original.

This PR extracts them into a common utility header so that downstream
dialects can reuse them directly.
I'm open to discussion about the location of these utilities.
DeltaFile
+188-0mlir/unittests/Interfaces/MemorySlotUtilsTest.cpp
+21-63mlir/lib/Dialect/SCF/IR/MemorySlot.cpp
+51-0mlir/lib/Interfaces/Utils/MemorySlotUtils.cpp
+36-0mlir/include/mlir/Interfaces/Utils/MemorySlotUtils.h
+15-0mlir/lib/Interfaces/Utils/CMakeLists.txt
+7-0mlir/unittests/Interfaces/CMakeLists.txt
+318-631 files not shown
+319-637 files

LLVM/project 2cf73afllvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis weak-crossing-siv-addrec-wrap.ll weak-crossing-siv-delta-signed-min.ll

fix testcases after rebase and address conflict resolution issue
DeltaFile
+0-24llvm/lib/Analysis/DependenceAnalysis.cpp
+7-17llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-addrec-wrap.ll
+2-2llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-delta-signed-min.ll
+1-1llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-overflow.ll
+1-1llvm/test/Analysis/DependenceAnalysis/WeakCrossingSIV.ll
+1-1llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-large-btc.ll
+12-466 files

LLVM/project 06725d7llvm/lib/CodeGen/GlobalISel LegalizerHelper.cpp, llvm/lib/CodeGen/SelectionDAG LegalizeIntegerTypes.cpp

[GISel] Keep non-negative info in SUB(CTLZ) (#189314)

Implement non-negative value tracking for SUB-CTLZ chains in GlobalISel,
matching the behavior previously added to SelectionDAG.

Additionally, refactor the SelectionDAG implementation from the previous
patch to improve performance and code density.

Related to https://github.com/llvm/llvm-project/issues/136516 and
https://github.com/llvm/llvm-project/pull/186338#discussion_r2980420174
DeltaFile
+12-28llvm/test/CodeGen/AArch64/cls.ll
+4-4llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ctlz.mir
+4-4llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-ctlz-rv64.mir
+4-4llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-ctlz-rv32.mir
+6-1llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+3-4llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
+33-459 files not shown
+49-6215 files

LLVM/project e61d016llvm/lib/IR AutoUpgrade.cpp, llvm/test/Bitcode amdgpu-wmma-drop-ab-mods-upgrade.ll

[AMDGPU] Drop A and B neg modifier from amdgcn_wmma_bf16_16x16x32_bf16

Fixes: LCOMPILER-1673
DeltaFile
+6-46llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imod.gfx1250.w32.ll
+10-10llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx1250-w32.mir
+13-0llvm/test/Bitcode/amdgpu-wmma-drop-ab-mods-upgrade.ll
+7-3llvm/lib/IR/AutoUpgrade.cpp
+4-4llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imm.gfx1250.w32.ll
+3-5mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
+43-6810 files not shown
+60-8816 files

LLVM/project 26e0d15llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/PhaseOrdering/X86 avg.ll

[SLP] Prefer to trim equal-cost alternate-shuffle subtrees

If the trimming candidate subtree is rooted at an alternate-shuffle node
with binary ops, and this subtree has the same cost as the buildvector
node, it is better to stick with the buildvector node to avoid runtime
perf regressions from shuffle/extra-operation overhead that the cost
model may underestimate. Skip trimming if the subtree contains
ExtractElement nodes, since those operate on already-materialized
vectors, which may reduce vector-to-scalar code movement and give better
perf.

Reviewers: hiraditya, bababuck, fhahn, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/188272
DeltaFile
+21-3llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+10-12llvm/test/Transforms/SLPVectorizer/AArch64/unprofitable-alternate-subtree.ll
+7-7llvm/test/Transforms/PhaseOrdering/X86/avg.ll
+38-223 files

LLVM/project 55f79faclang/test/OpenMP target_teams_distribute_parallel_for_simd_schedule_codegen.cpp teams_distribute_parallel_for_simd_schedule_codegen.cpp, libc/AOR_v20.02/math/test/traces sincosf.txt exp.txt

Merge branch 'main' into users/amehsan/weakc-delta-overflow
DeltaFile
+0-31,999libc/AOR_v20.02/math/test/traces/sincosf.txt
+0-16,000libc/AOR_v20.02/math/test/traces/exp.txt
+6,911-6,946llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
+6,432-6,562llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-7.ll
+5,294-4,814clang/test/OpenMP/target_teams_distribute_parallel_for_simd_schedule_codegen.cpp
+5,238-4,758clang/test/OpenMP/teams_distribute_parallel_for_simd_schedule_codegen.cpp
+23,875-71,07910,718 files not shown
+673,313-424,81210,724 files

LLVM/project 804ece6llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis WeakCrossingSIV.ll weak-crossing-siv-addrec-wrap.ll

[DA] Require `nsw` for AddRecs in the WeakCrossing SIV test (#185041)

Before the start of the algorithm in the weak crossing SIV test, we need
to ensure both addrecs are `nsw`.
DeltaFile
+2-2llvm/test/Analysis/DependenceAnalysis/WeakCrossingSIV.ll
+1-3llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-addrec-wrap.ll
+3-0llvm/lib/Analysis/DependenceAnalysis.cpp
+6-53 files

LLVM/project 6021270utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[Bazel] Fixes 04785ad (#189456)

This fixes 04785adec34ddf9a6ec47f10da5b2b7fe8c9f9c8.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+1-0utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+1-01 files

LLVM/project e77104bclang/lib/Driver/ToolChains Clang.cpp

clang: Use MakeArgStringRef more often

Avoid an intermediate copy by using MakeArgStringRef. Also
make better use of Twine with MakeArgString.
DeltaFile
+9-12clang/lib/Driver/ToolChains/Clang.cpp
+9-121 files