LLVM/project e438a90 lldb/source/Target TargetProperties.td, lldb/test/Shell/Settings TestChildDepthTruncation.test

[lldb] Increment max-children-depth to 5 (#178717)

`max-children-depth` was [originally 6][1], which produced too much
output. It was then [reduced to 4][2], which some people find too low.
This change tries 5 as the default.

Originally upstreamed in
https://github.com/llvm/llvm-project/pull/149282

[1]:
https://github.com/swiftlang/llvm-project/pull/4280/changes/ee0782bf6b2e9705e261c5a82147ce0e45a8d753
[2]: https://github.com/swiftlang/llvm-project/pull/10683
Delta File
+8-5 lldb/test/Shell/Settings/TestChildDepthTruncation.test
+1-1 lldb/source/Target/TargetProperties.td
+9-6 2 files

LLVM/project de931a2 llvm/lib/Transforms/Vectorize VectorCombine.cpp, llvm/test/Transforms/VectorCombine load-shufflevector.ll

Revert "[VectorCombine] Trim low end of loads used in shufflevector rebroadca…"

This reverts commit 6c8d9d0c4da51c7f9e7671902be3ad9b65d56c84.
Delta File
+23-32 llvm/test/Transforms/VectorCombine/load-shufflevector.ll
+8-18 llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+31-50 2 files

LLVM/project c686002 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel llvm.amdgcn.s.sleep.ll

[AMDGPU][GlobalISel] Add RegBankLegalize rules for amdgcn_s_sleep (#178838)

Delta File
+2-2 llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.s.sleep.ll
+2-0 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+4-2 2 files

LLVM/project c586271 llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp

[AMDGPU][SIInsertWaitcnts][NFC] Introduce WaitEventSet container for events (#178511)

Before this patch WaitEventType events used to be collected in unsigned
integers that were used as small bit vectors.

This patch introduces a WaitEventSet container class to replace the
integer bit vectors with a class that hides the implementation of common
operations like insertion, removal, union, intersection etc. from the
user.

The WaitEventSet API matches that of a set and not a vector because we
don't care about the order of its contents. Internally though it is
still a bit vector that uses an unsigned integer as its storage, just
like the original implementation.

This patch should not change the functionality.
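
The idea of a set-style API over integer bit storage can be sketched as follows. This is a minimal illustration, not the class from the patch: the `Event` enumerators, the `EventSet` name, and its member names are all hypothetical.

```cpp
#include <cassert>

// Hypothetical event kinds; the real WaitEventType enumerators differ.
enum class Event : unsigned { MemAccess, LdsAccess, ExportDone };

// A set of Event values stored as bits of one unsigned integer.
// Callers use set operations and never touch the mask directly.
class EventSet {
  unsigned Mask = 0;

  static unsigned bit(Event E) { return 1u << static_cast<unsigned>(E); }
  explicit EventSet(unsigned M) : Mask(M) {}

public:
  EventSet() = default;

  void insert(Event E) { Mask |= bit(E); }
  void remove(Event E) { Mask &= ~bit(E); }
  bool contains(Event E) const { return (Mask & bit(E)) != 0; }
  bool empty() const { return Mask == 0; }

  // Union and intersection map directly onto bitwise OR and AND.
  EventSet operator|(EventSet O) const { return EventSet(Mask | O.Mask); }
  EventSet operator&(EventSet O) const { return EventSet(Mask & O.Mask); }
  bool operator==(EventSet O) const { return Mask == O.Mask; }
};
```

Because the storage is still a single integer, each operation compiles down to the same bit manipulation the open-coded version would use; only the call sites become more readable.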
Delta File
+149-65 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+149-65 1 files

LLVM/project 494079a llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp

Cast to wider type
Delta File
+3-3 llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+3-3 1 files

LLVM/project baf1d46 llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp GCNSchedStrategy.h, llvm/test/CodeGen/AMDGPU debug-value-scheduler.mir sema-v-unsched-bundle.ll

Squashed changes
Delta File
+43-50 llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+6-2 llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+2-2 llvm/test/CodeGen/AMDGPU/debug-value-scheduler.mir
+1-1 llvm/test/CodeGen/AMDGPU/sema-v-unsched-bundle.ll
+52-55 4 files

LLVM/project bf8f6d8 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.fma.legacy.ll

[AMDGPU][GlobalISel] Add RegBankLegalize rules for fma_legacy (#178759)

Delta File
+27-3 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fma.legacy.ll
+4-0 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+31-3 2 files

LLVM/project b73122d llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 minbw-bitcast-to-fp.ll

[SLP]Cast incoming value to a proper type for int nodes, bitcasted to fp

Before casting the value to an FP type, check whether the type was
reduced during minbitwidth analysis. If it was, restore the original
source type so that a correct bitcast operation is generated.

Fixes #178884
Delta File
+52-0 llvm/test/Transforms/SLPVectorizer/X86/minbw-bitcast-to-fp.ll
+7-0 llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+59-0 2 files

LLVM/project 5d01a0a clang/lib/DependencyScanning DependencyScannerImpl.cpp, clang/lib/Tooling DependencyScanningTool.cpp

[clang][Modules] Fixing Incorrect Diagnostics Issued during By-name Dependency Scanning (#178542)

The by-name lookup API uses the same diagnostics engine and consumer for
multiple lookups. When multiple lookups fail, the diagnostics could be
incorrect for all but the first failing lookup. All the subsequent
failing lookups inherit the diagnostics from the first failing lookup.

This PR resets the diagnostics consumer's buffer and the
CompilerInstance's diagnostics engine for each by-name lookup, so each
lookup can produce the correct diagnostics.

Part of work for rdar://136303612.
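
The failure mode and the fix described above can be sketched with a shared buffer that is cleared before every lookup. This is only an illustration under stated assumptions: `DiagBuffer`, `lookupModule`, and `scanAll` are hypothetical names and do not correspond to the clang classes touched by the PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a buffering diagnostics consumer.
struct DiagBuffer {
  std::vector<std::string> Messages;
  void report(const std::string &Msg) { Messages.push_back(Msg); }
  void reset() { Messages.clear(); }
};

// Hypothetical by-name lookup: fails for any name not in the known list.
inline bool lookupModule(const std::string &Name, DiagBuffer &Diags) {
  static const std::vector<std::string> Known = {"A", "B"};
  for (const std::string &K : Known)
    if (K == Name)
      return true;
  Diags.report("module '" + Name + "' not found");
  return false;
}

// Runs several lookups against one shared buffer and collects the first
// diagnostic of each failing lookup.
inline std::vector<std::string>
scanAll(const std::vector<std::string> &Names) {
  DiagBuffer Shared;
  std::vector<std::string> Reported;
  for (const std::string &Name : Names) {
    // Without this reset, a later failure would still see the message
    // left behind by the first failing lookup.
    Shared.reset();
    if (!lookupModule(Name, Shared))
      Reported.push_back(Shared.Messages.front());
  }
  return Reported;
}
```

With the reset in place, each failing name reports its own message; removing the `reset()` call reproduces the symptom where every later failure repeats the first one.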
Delta File
+52-0 clang/test/ClangScanDeps/modules-full-by-mult-mod-names-diagnostics.c
+5-0 clang/lib/DependencyScanning/DependencyScannerImpl.cpp
+3-0 clang/lib/Tooling/DependencyScanningTool.cpp
+0-1 clang/tools/clang-scan-deps/ClangScanDeps.cpp
+60-1 4 files

LLVM/project 8fa695a llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.prng.ll

[AMDGPU][GlobalISel] Add RegBankLegalize rules for amdgcn_prng_b32 (#178741)

Delta File
+4-0 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+2-2 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.prng.ll
+6-2 2 files

LLVM/project c329074 llvm/lib/MC/MCParser MasmParser.cpp

[perf] Replace copy-assign by move-assign in llvm/lib/MC/* (#178176)

Delta File
+1-1 llvm/lib/MC/MCParser/MasmParser.cpp
+1-1 1 files

LLVM/project 8029699 llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp GCNSchedStrategy.h

[AMDGPU][Scheduler] Make `finalizeGCNRegion` an overridable hook (NFC) (#177199)

This allows individual stages to make decisions after re-scheduling
individual regions.
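
A rough sketch of what an overridable per-region hook looks like is given below. Only the name `finalizeGCNRegion` comes from the commit; the `SchedStage` and `CountingStage` classes, the `run` driver, and the integer region IDs are hypothetical simplifications.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stage base class; regions are modeled as plain integers.
class SchedStage {
public:
  virtual ~SchedStage() = default;

  // Re-schedules every region and calls the hook after each one.
  void run(const std::vector<int> &Regions) {
    for (int Region : Regions) {
      scheduleRegion(Region);
      finalizeGCNRegion(Region);
    }
  }

protected:
  virtual void scheduleRegion(int) {}
  // In this sketch the default hook does nothing; stages may override it.
  virtual void finalizeGCNRegion(int) {}
};

// A stage that uses the hook to record how many regions were finalized.
class CountingStage : public SchedStage {
public:
  int Finalized = 0;

protected:
  void finalizeGCNRegion(int) override { ++Finalized; }
};
```

Making the hook virtual keeps the driver loop unchanged while letting one stage add its own post-region decision logic.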
Delta File
+2-4 llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+3-3 llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+5-7 2 files

LLVM/project bc73157 llvm/lib/Transforms/InstCombine InstCombineCalls.cpp, llvm/test/Transforms/InstCombine assume.ll assume-loop-align.ll

Revert "[InstCombine] Always fold alignment assumptions into operand bundles (#177597)"

This reverts commit b74e1bca6d77b3de5c05822d1631006ce2a30cc6.
Makes clang assert:
https://github.com/llvm/llvm-project/pull/177597#issuecomment-3824553291
Delta File
+48-16 llvm/test/Transforms/InstCombine/assume.ll
+8-2 llvm/test/Transforms/InstCombine/assume-loop-align.ll
+4-1 llvm/test/Transforms/InstCombine/assume_inevitable.ll
+2-1 llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+62-20 4 files

LLVM/project 1ccb4a2 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel llvm.amdgcn.wwm.ll regbankselect-amdgcn.wwm.mir

[AMDGPU][GlobalISel] Add RegBankLegalize rules for amdgcn_wwm/strict_wwm (#178615)

Delta File
+23-23 llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.wwm.ll
+14-0 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+2-4 llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.wwm.mir
+39-27 3 files

LLVM/project 45e9de6 llvm/test/CodeGen/AMDGPU llvm.amdgcn.class.ll llvm.amdgcn.class.f16.ll

[AMDGPU][NFC] Update test to use update_llc_test_checks (#178826)

Also add global-isel run line to the two amdgcn.class tests.
Delta File
+974-460 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.class.ll
+222-58 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.class.f16.ll
+1,196-518 2 files

LLVM/project e62182b llvm/include/llvm/Support KnownFPClass.h, llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp

InstCombine: Handle multiple use copysign (#176917)

Handle multiple use copysign in SimplifyDemandedFPClass
Delta File
+38-6 llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+7-7 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+7-0 llvm/include/llvm/Support/KnownFPClass.h
+52-13 3 files

LLVM/project 88478ab llvm/docs DTLTO.rst

[DOC][DTLTO] Update DTLTO documentation for the LLVM 22 release (#177368)

This change updates the documentation to reflect work completed during
the LLVM 22 timeframe, including support for the ThinLTO cache and
static libraries/archives.

It also clarifies that the goal of DTLTO is to support distribution of
ThinLTO backend compilations for any in-process ThinLTO invocation.

SIE Internal Tracker: TOOLCHAIN-21016
Delta File
+15-11 llvm/docs/DTLTO.rst
+15-11 1 files

LLVM/project abfd562 llvm/lib/Transforms/Vectorize VPlanRecipes.cpp, llvm/test/Transforms/LoopVectorize/AArch64 conditional-branches-cost.ll store-costs-sve.ll

[VPlan] Mark VPActiveLaneMaskPHIRecipe as readnone. (#177886)

VPActiveLaneMaskPHIRecipe does not have side-effects and also does
not access memory. Mark accordingly. This allows hoisting of some
invariant loads out of loops and also removing unused phi recipes in the
future.

In
llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll,
the hoisting makes vectorization profitable.

PR: https://github.com/llvm/llvm-project/pull/177886
Delta File
+76-22 llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll
+5-5 llvm/test/Transforms/LoopVectorize/AArch64/store-costs-sve.ll
+4-4 llvm/test/Transforms/LoopVectorize/AArch64/predicated-costs.ll
+2-0 llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+87-31 4 files

LLVM/project 914a233 flang/lib/Semantics check-omp-structure.cpp, flang/test/Semantics/OpenMP lastprivate-intent-in-pointer.f90

[Flang][OpenMP] Reject INTENT(IN) pointers in LASTPRIVATE clause (#178845)

`LASTPRIVATE` clause requires the list item to be definable since the
value from the last iteration is assigned back to the original variable.
For pointers, this assignment occurs "as if by pointer assignment"
(OpenMP 5.2 Section 5.4.5).

An `INTENT(IN)` pointer dummy argument is not a valid target for pointer
assignment, therefore it should not be permitted in a `LASTPRIVATE`
clause.

This patch adds the `CheckIntentInPointer()` call to the `LASTPRIVATE`
clause handler, consistent with other data-sharing clauses like
`PRIVATE`, `COPYPRIVATE`, and `REDUCTION`.

Fixes [#178398](https://github.com/llvm/llvm-project/issues/178398)
Delta File
+94-0 flang/test/Semantics/OpenMP/lastprivate-intent-in-pointer.f90
+1-0 flang/lib/Semantics/check-omp-structure.cpp
+95-0 2 files

LLVM/project c475b78 llvm/lib/Target/AArch64 AArch64SystemOperands.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

[AArch64][llvm] Gate some `tlbip` insns with +tlbid or +d128

Change the gating of `tlbip` instructions containing `*E1IS*`, `*E1OS*`,
`*E2IS*` or `*E2OS*` to be used with `+tlbid` or `+d128`. This is because
the 2025 Armv9.7-A MemSys specification says:

```
  All TLBIP *E1IS*, TLBIP*E1OS*, TLBIP*E2IS* and TLBIP*E2OS* instructions
  that are currently dependent on FEAT_D128 are updated to be dependent
  on FEAT_D128 or FEAT_TLBID
```
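
The resulting gating rule can be read as a simple predicate. The sketch below is an illustration only: `Features`, `isRelaxedTlbipOp`, and `tlbipAllowed` are hypothetical names, the operand strings are used purely as examples, and the real change lives in the TableGen definitions and the assembly parser.

```cpp
#include <array>
#include <cassert>
#include <string>

// Hypothetical feature flags standing in for +d128 and +tlbid.
struct Features {
  bool D128 = false;
  bool TLBID = false;
};

// True if the operand name matches *E1IS*, *E1OS*, *E2IS* or *E2OS*.
inline bool isRelaxedTlbipOp(const std::string &Op) {
  static const std::array<const char *, 4> Patterns = {"E1IS", "E1OS",
                                                       "E2IS", "E2OS"};
  for (const char *P : Patterns)
    if (Op.find(P) != std::string::npos)
      return true;
  return false;
}

// +d128 permits the instruction; +tlbid alone permits only the relaxed
// operands. Other operands are assumed here to keep requiring +d128.
inline bool tlbipAllowed(const std::string &Op, const Features &F) {
  if (F.D128)
    return true;
  return F.TLBID && isRelaxedTlbipOp(Op);
}
```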
Delta File
+259-0 llvm/test/MC/AArch64/tlbip-tlbid-or-d128.s
+66-66 llvm/test/MC/AArch64/armv9a-sysp.s
+14-6 llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+20-0 llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
+11-2 llvm/lib/Target/AArch64/AArch64SystemOperands.td
+370-74 5 files

LLVM/project 51f8929 mlir/lib/Dialect/Transform/IR TransformOps.cpp, mlir/test/Dialect/Transform ops-invalid.mlir test-interpreter.mlir

[mlir] Verify children interface in transform named sequence (#178881)

Application of sequence blocks in the transform interpreter assumes that
all operations (except for the terminator) in the sequence block have
the `TransformOpInterface`. For `SequenceOp`, this was already verified,
but not for `NamedSequenceOp`, causing assertion failures if the
assumption doesn't hold.

This change adds verification that all operations in the block except
for the terminator have the `TransformOpInterface`.
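
The shape of such a check, independent of MLIR, might look like the sketch below. It is a simplified model under stated assumptions: `Op`, `HasTransformInterface`, and `verifyBody` are hypothetical, and the last operation in the list is taken to be the terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of an operation inside a sequence block.
struct Op {
  std::string Name;
  bool HasTransformInterface;
};

// Returns the name of the first non-terminator operation that lacks the
// interface, or an empty string if the body verifies. The final element
// is treated as the terminator and is skipped.
inline std::string verifyBody(const std::vector<Op> &Body) {
  for (std::size_t I = 0; I + 1 < Body.size(); ++I)
    if (!Body[I].HasTransformInterface)
      return Body[I].Name;
  return "";
}
```

Rejecting the block at verification time turns what was an assertion failure in the interpreter into an ordinary diagnostic on the offending operation.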

Signed-off-by: Lukas Sommer <lukas.sommer at amd.com>
Delta File
+12-1 mlir/test/Dialect/Transform/ops-invalid.mlir
+11-0 mlir/lib/Dialect/Transform/IR/TransformOps.cpp
+1-1 mlir/test/Dialect/Transform/test-interpreter.mlir
+24-2 3 files

LLVM/project 25fcc87 mlir/lib/Dialect/OpenACC/Utils OpenACCUtilsLoop.cpp, mlir/unittests/Dialect/OpenACC OpenACCUtilsLoopTest.cpp

[acc] Fix acc.loop to scf utilities (#178809)

Fixes a problem encountered with enabling coalesceLoops when bounds were
constructed inside expanded loops. Additionally, ensures that all loop
utilities use rewriter instead of their own builders for proper
tracking.
Delta File
+32-22 mlir/lib/Dialect/OpenACC/Utils/OpenACCUtilsLoop.cpp
+30-0 mlir/unittests/Dialect/OpenACC/OpenACCUtilsLoopTest.cpp
+62-22 2 files

LLVM/project e9677d1 clang/lib/Driver/ToolChains Linux.cpp MSVC.cpp, clang/test/Driver hip-runtime-libs-msvc.hip hip-runtime-libs-linux.hip

[HIP] Make `--no-offloadlib` not link HIP's RT (#177677)

Summary:
Right now we have `--no-hip-rt` to suppress the implicit linking of the
HIP runtime. We also already have `--no-offloadlib`, which seems to
imply the same thing but currently only applies to the device-side
library. More targets will likely use this soon, so it would be nice to
unify the behavior here.

The impact of this change is that `-nogpulib` which is commonly used to
suppress the ROCm device libraries will now also suppress this, and
`--no-hip-rt` will suppress the ROCm device libraries. This is a
functional change, but I'm not sure if anyone truly relies on this
distinction in the wild. Functionally, one turns off the host runtime,
the other the device. This PR makes both do both at the same time. Since
these are libraries we should be able to just get users to pass them
manually if needed.
Delta File
+4-4 clang/test/Driver/hip-runtime-libs-msvc.hip
+5-2 clang/test/Driver/hip-runtime-libs-linux.hip
+3-1 clang/lib/Driver/ToolChains/Linux.cpp
+3-1 clang/lib/Driver/ToolChains/MSVC.cpp
+15-8 4 files

LLVM/project 2a06846 llvm/lib/Target/AArch64 AArch64InstrInfo.td AArch64SystemOperands.td, llvm/test/MC/AArch64 armv9a-sysp.s armv9-mrrs.s

[AArch64][llvm] Remove `+d128` gating on `sysp`, `msrr` and `mrrs` instructions

Remove `+d128` gating on `sysp`, `msrr` and `mrrs` instructions.

We previously removed gating for `sys`, `msr` and `mrs` instructions,
on the basis that it doesn't add value, as it doesn't indicate that
any particular system registers or system instructions are available.

Therefore, remove `+d128` gating for these too.

(In an upcoming change, some `tlbip` instructions, which are `sysp` aliases,
are allowed to be used with either `+d128` or `+tlbid`. If we don't remove
this gating, then it would require some ugly work-arounds in the code to
support the relaxation mandated by the 2025 MemSys specification.

In this change, retain `+d128` gating for all `tlbip` instructions, which
will then be loosened to either `+d128` or `+tlbid` in a subsequent change)
Delta File
+122-196 llvm/test/MC/AArch64/armv9a-sysp.s
+7-97 llvm/test/MC/AArch64/armv9-mrrs.s
+42-46 llvm/lib/Target/AArch64/AArch64InstrInfo.td
+7-53 llvm/test/MC/AArch64/armv9-msrr.s
+4-2 llvm/lib/Target/AArch64/AArch64SystemOperands.td
+2-3 llvm/test/MC/AArch64/directive-arch_extension-negative.s
+184-397 3 files not shown
+190-398 9 files

LLVM/project f91da0e clang/include/clang/Analysis/Analyses/LifetimeSafety MovedLoans.h Facts.h, clang/lib/Analysis/LifetimeSafety MovedLoans.cpp Facts.cpp

Revisit handling moved origins
Delta File
+108-0 clang/lib/Analysis/LifetimeSafety/MovedLoans.cpp
+66-5 clang/lib/Analysis/LifetimeSafety/Facts.cpp
+32-24 clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+37-17 clang/test/Sema/warn-lifetime-safety.cpp
+44-0 clang/include/clang/Analysis/Analyses/LifetimeSafety/MovedLoans.h
+36-6 clang/include/clang/Analysis/Analyses/LifetimeSafety/Facts.h
+323-52 15 files not shown
+480-117 21 files

LLVM/project 0c04a64 llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp

Fix using Known as input
Delta File
+2-3 llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+2-3 1 files

LLVM/project 87cbb3f llvm/include/llvm/Support KnownFPClass.h, llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp

InstCombine: Handle multiple use copysign

Handle multiple use copysign in SimplifyDemandedFPClass
Delta File
+36-3 llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+7-7 llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+7-0 llvm/include/llvm/Support/KnownFPClass.h
+50-10 3 files

LLVM/project 6c8d9d0 llvm/lib/Transforms/Vectorize VectorCombine.cpp, llvm/test/Transforms/VectorCombine load-shufflevector.ll

[VectorCombine] Trim low end of loads used in shufflevector rebroadcasts. (#149093)

Following on from #128938, trim the low end of loads where only some of
the incoming lanes are used for rebroadcasts in shufflevector
instructions.

---------

Co-authored-by: Leon Clark <leoclark at amd.com>
Co-authored-by: Simon Pilgrim <llvm-dev at redking.me.uk>
Delta File
+32-23 llvm/test/Transforms/VectorCombine/load-shufflevector.ll
+18-8 llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+50-31 2 files

LLVM/project b4b8d4e llvm/lib/Target/AMDGPU GCNVOPDUtils.cpp, llvm/test/CodeGen/AMDGPU atomic_optimizations_struct_buffer.ll vopd-combine-gfx1250.mir

[AMDGPU] Fix VOPD checks for commuting OpX and OpY (#178772)

We need to check that OpX does not write the sources of OpY, but if we
swap OpX and OpY with respect to program order, the check was not
swapped correctly.

The checks on gfx1250 can be relaxed slightly, that is planned for a
future patch.

---------

Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
Delta File
+54-113 llvm/test/CodeGen/AMDGPU/atomic_optimizations_struct_buffer.ll
+107-0 llvm/test/CodeGen/AMDGPU/vopd-combine-gfx1250.mir
+30-28 llvm/lib/Target/AMDGPU/GCNVOPDUtils.cpp
+25-20 llvm/test/CodeGen/AMDGPU/bf16.ll
+12-24 llvm/test/CodeGen/AMDGPU/GlobalISel/shl.ll
+16-16 llvm/test/CodeGen/AMDGPU/expand-waitcnt-profiling.ll
+244-201 14 files not shown
+340-259 20 files

LLVM/project 2eaaaf1 clang/include/clang/Basic CodeGenOptions.h, clang/include/clang/Options Options.td

NFC: Rename CodeGenOptions::StackUsageOutput to StackUsageFile (#178898)

Preparation for #178005.

"Output" has too many different interpretations: it could be an
enabled/disabled flag, a file format, etc. Clarify that it's the destination
file.
Delta File
+1-1 clang/include/clang/Basic/CodeGenOptions.h
+1-1 clang/include/clang/Options/Options.td
+1-1 clang/lib/CodeGen/BackendUtil.cpp
+1-1 clang/lib/Frontend/CompilerInvocation.cpp
+1-1 llvm/include/llvm/Target/TargetOptions.h
+1-1 llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+6-6 6 files