LLVM/project 9e857c8 llvm/test/CodeGen/AMDGPU/GlobalISel regbankselect-mui.ll, llvm/test/CodeGen/X86 atomic-load-store.ll

Merge branch 'main' into users/kparzysz/single-check
Delta File
+12,991 -3,310 llvm/test/MC/AMDGPU/gfx13_asm_vop3_dpp16.s
+11,856 -3,719 llvm/test/MC/AMDGPU/gfx12_asm_vop3_dpp16.s
+0 -8,306 llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_dpp16.txt
+5,672 -0 llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_dpp16-fake.txt
+0 -643 llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-mui.ll
+302 -335 llvm/test/CodeGen/X86/atomic-load-store.ll
+30,821 -16,313 1,241 files not shown
+43,903 -23,283 1,247 files

LLVM/project ca89329 flang/lib/Semantics check-omp-atomic.cpp

Remove unused variable
Delta File
+0 -1 flang/lib/Semantics/check-omp-atomic.cpp
+0 -1 1 files

LLVM/project 559d930 flang/lib/Semantics check-omp-structure.cpp

Add expected directive checks when popping context
Delta File
+9 -1 flang/lib/Semantics/check-omp-structure.cpp
+9 -1 1 files

LLVM/project d2ebba8 flang/lib/Semantics check-omp-structure.cpp

Use CurrentDirectiveIsNested and update comment
Delta File
+2 -6 flang/lib/Semantics/check-omp-structure.cpp
+2 -6 1 files

LLVM/project 83ef81e flang/lib/Semantics check-omp-atomic.cpp check-omp-structure.h

Remove push/pop from atomic
Delta File
+0 -5 flang/lib/Semantics/check-omp-atomic.cpp
+0 -1 flang/lib/Semantics/check-omp-structure.h
+0 -6 2 files

LLVM/project a7e3eda llvm/include/llvm/Support Allocator.h, llvm/unittests/Support AllocatorTest.cpp

Revert "[Allocator] Keep bump pointer at a minimum alignment" (#205091)

Arithmetic on nullptr is UB and gets flagged by UBSan.

Reverts llvm/llvm-project#203718
Delta File
+27 -51 llvm/include/llvm/Support/Allocator.h
+0 -19 llvm/unittests/Support/AllocatorTest.cpp
+27 -70 2 files

LLVM/project b0c2017 mlir/lib/Dialect/X86/Transforms VectorContractToAMXDotProduct.cpp, mlir/test/Dialect/X86/AMX vector-contract-to-tiled-dp.mlir

[mlir][x86] Fail on missing read source operation (#205077)

Adds an extra check to AMX lowering to fail gracefully when a source
operation for contraction input data is not found.
Delta File
+69 -0 mlir/test/Dialect/X86/AMX/vector-contract-to-tiled-dp.mlir
+4 -0 mlir/lib/Dialect/X86/Transforms/VectorContractToAMXDotProduct.cpp
+73 -0 2 files

LLVM/project 06dcba6 flang/include/flang/Optimizer/Transforms Passes.td, flang/lib/Optimizer/Passes Pipelines.cpp

[flang][debug] Add fake use ops for dynamic array dimension variables (#200061)

In cases where the upper or lower bounds of a dynamic array are not
explicitly referenced in code, flang can optimize away the internal
variables that represent these values. This causes missing values to
appear in the debugger when examining the dynamic array's type. Adding
an llvm.fake.use op for each bound preserves it for use by a debugger,
similar to the fix for #185432.

Resolves #119474
Delta File
+135 -0 flang/test/Transforms/debug-fake-use-multiple-dimensions.fir
+116 -0 flang/test/Transforms/debug-fake-use-multiple-returns.fir
+48 -15 flang/lib/Optimizer/Transforms/AddDebugInfo.cpp
+42 -3 flang/test/Transforms/debug-fake-use.fir
+2 -2 flang/include/flang/Optimizer/Transforms/Passes.td
+1 -1 flang/lib/Optimizer/Passes/Pipelines.cpp
+344 -21 6 files

LLVM/project a780818 llvm/test/MC/AMDGPU gfx13_asm_vop3_dpp16.s

[AMDGPU][NFC] Templatise and roundtrip gfx13_asm_vop3_dpp16.s (#204849)

Again, this is based on the templatised version of
gfx12_asm_vop3_dpp16.s with the GFX13-specific changes re-applied on top
of it.

gfx13_dasm_vop3_dpp16.txt was never upstreamed, so no changes for the
disassembler side.
Delta File
+12,991 -3,310 llvm/test/MC/AMDGPU/gfx13_asm_vop3_dpp16.s
+12,991 -3,310 1 files

LLVM/project 27af208 llvm/test/MC/AMDGPU gfx12_asm_vop3_dpp16.s, llvm/test/MC/Disassembler/AMDGPU gfx12_dasm_vop3_dpp16.txt gfx12_dasm_vop3_dpp16-fake.txt

[AMDGPU][NFC] Templatise and roundtrip gfx12_asm_vop3_dpp16.s (#203953)

This is effectively the changes between the non-template versions of
gfx11/12_asm_vop3_dpp16.s applied on top of the templatised
gfx11_asm_vop3_dpp16.s.
Delta File
+11,856 -3,719 llvm/test/MC/AMDGPU/gfx12_asm_vop3_dpp16.s
+0 -8,306 llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_dpp16.txt
+5,672 -0 llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3_dpp16-fake.txt
+17,528 -12,025 3 files

LLVM/project 8616fad lldb/source/Target Memory.cpp

[lldb] Avoid tautological copying of a newly created object (NFC) (#204998)
Delta File
+1 -2 lldb/source/Target/Memory.cpp
+1 -2 1 files

LLVM/project 4f5c444 orc-rt/include/orc-rt QueueingRunner.h, orc-rt/unittests QueueingRunnerTest.cpp SessionTest.cpp

[orc-rt] Default QueueingRunner to a synchronized queue. (#205088)

Adds orc_rt::detail::SynchronizedDeque<T>, a mutex-protected deque whose
pop_front / pop_back return std::optional<T> (std::nullopt indicates the
queue is empty), and makes it QueueingRunner's default WorkQueue type.

Using a synchronized queue type allows QueueingRunner to be used in
multi-threaded contexts.

Updates SessionTest and InProcessControllerAccessTest to use the new
default, and extends QueueingRunnerTest to cover the new contract.
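
The contract described above (a mutex-protected deque whose pops return std::optional<T>) can be sketched roughly as follows. This is a minimal illustration, not the orc-rt implementation: the class name SynchronizedDequeSketch and its member names are hypothetical, and the real orc_rt::detail::SynchronizedDeque<T> may differ in interface and detail.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

// Hypothetical stand-in for the queue described in the commit message.
// Every operation takes the mutex, so the queue can be shared by threads.
template <typename T> class SynchronizedDequeSketch {
public:
  void push_back(T Value) {
    std::lock_guard<std::mutex> Lock(M);
    Q.push_back(std::move(Value));
  }

  // Returns std::nullopt when the queue is empty instead of requiring
  // the caller to check emptiness first (which would race with other threads).
  std::optional<T> pop_front() {
    std::lock_guard<std::mutex> Lock(M);
    if (Q.empty())
      return std::nullopt;
    std::optional<T> Result(std::move(Q.front()));
    Q.pop_front();
    return Result;
  }

  // Same contract as pop_front, but takes from the other end.
  std::optional<T> pop_back() {
    std::lock_guard<std::mutex> Lock(M);
    if (Q.empty())
      return std::nullopt;
    std::optional<T> Result(std::move(Q.back()));
    Q.pop_back();
    return Result;
  }

private:
  std::mutex M;
  std::deque<T> Q;
};
```

Folding the emptiness check into the pop itself is what makes the queue usable from several threads: a separate empty() followed by pop_front() could be interleaved with another thread's pop.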
Delta File
+54 -19 orc-rt/include/orc-rt/QueueingRunner.h
+54 -13 orc-rt/unittests/QueueingRunnerTest.cpp
+14 -17 orc-rt/unittests/SessionTest.cpp
+2 -4 orc-rt/unittests/InProcessControllerAccessTest.cpp
+124 -53 4 files

LLVM/project 6019657 offload/ci openmp-offload-amdgpu-libc-runtime.py

Reapply "[AMDGPU] Add compiler-rt checks for the GPU runtime" (#204898)

The original issue should've been solved by
https://github.com/llvm/llvm-project/pull/204694
Reverts llvm/llvm-project#204370
Delta File
+7 -0 offload/ci/openmp-offload-amdgpu-libc-runtime.py
+7 -0 1 files

LLVM/project f8cb6be clang/lib/Driver/ToolChains Clang.cpp

clang: Use the effective triple string for offload jobs (#205065)

Track the future effective triple for the job, rather than
the toolchain's default triple. In the future this will
change the result when amdgpu starts adjusting the triples
to contain subarches.
Delta File
+12 -7 clang/lib/Driver/ToolChains/Clang.cpp
+12 -7 1 files

LLVM/project c5f4abf llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer revec.ll

[SLP][Revec] Fix SLP crash when trying to fold trailing scalars into the reduced value (#203477)

The cost-modelling change introduced by commit f15666d enables revec of
the test shown in the linked issue. This introduces a crash because the
reduced value (a scalar) is added to a vector tail value.

This patch tries to address that.

Fixes #203195
Delta File
+60 -0 llvm/test/Transforms/SLPVectorizer/X86/revec-ordered-reductions.ll
+38 -1 llvm/test/Transforms/SLPVectorizer/revec.ll
+15 -10 llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+113 -11 3 files

LLVM/project cbefabc llvm/test/CodeGen/AMDGPU/GlobalISel legalize-load-global.mir legalize-load-private.mir

AMDGPU/GlobalISel: Switch to extended LLTs

The switch is made with minimal changes. Most notably, because of changes to
the jump table in isel, GIM_SwitchType requires explicit integer/float types
and does not match scalar. In most places the change is in lowering, to use
LLT::integer or LLT::float.

Other changes:
- replaceRegWith can also change the type of the Dst register, which can cause
  CSE data corruption (the fix is to notify the observer)
- mixed i32/f32 in G_MERGE_VALUES/G_UNMERGE_VALUES, common when legalizing
  ray tracing and image intrinsics
- extra bitcasts between i32/f32 are needed in some places
Delta File
+7,957 -7,957 llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir
+6,802 -6,774 llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir
+6,489 -6,465 llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
+5,732 -5,732 llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.a16.ll
+5,645 -5,645 llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-store-global.mir
+3,852 -3,852 llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.dim.a16.ll
+36,477 -36,425 588 files not shown
+100,715 -94,889 594 files

LLVM/project b302211 clang/lib/Driver Driver.cpp, clang/lib/Driver/ToolChains CommonArgs.cpp AMDGPU.cpp

clang/AMDGPU: Use effective triple instead of raw toolchain triple (#205054)

Start using the effective triple instead of the raw toolchain triple.
For the moment this is NFC, but will change when new uses of the subarch
field are introduced.
Delta File
+3 -2 clang/lib/Driver/ToolChains/CommonArgs.cpp
+2 -2 clang/lib/Driver/ToolChains/AMDGPU.cpp
+1 -1 clang/lib/Driver/ToolChains/HIPAMD.cpp
+1 -1 clang/lib/Driver/Driver.cpp
+7 -6 4 files

LLVM/project 7b0b025 llvm/include/llvm/Support Allocator.h, llvm/unittests/Support AllocatorTest.cpp

Revert "[Allocator] Keep bump pointer at a minimum alignment (#203718)"

This reverts commit b95e1e890c025ab4eee3583b7b1e2497991145db.
Delta File
+27 -51 llvm/include/llvm/Support/Allocator.h
+0 -19 llvm/unittests/Support/AllocatorTest.cpp
+27 -70 2 files

LLVM/project b6a0f6c clang/lib/Analysis/LifetimeSafety Checker.cpp, clang/test/Sema/LifetimeSafety lifetimebound-violation.cpp

[LifetimeSafety] resolved lifetimeBound violation in constructor (#204797)

Fix https://github.com/llvm/llvm-project/issues/203839.

A constructor body does not produce a `ReturnEscapeFact`, but a constructor
parameter marked [[clang::lifetimebound]] may still be valid if it
escapes into a field of the constructed object.

Update the `LifetimeBound` logic to accept `FieldEscapeFact` for
constructors, and add a test case for this pattern.
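
For illustration, the pattern in question looks roughly like the sketch below. It is not taken from the patch or its test file: the type NameView and the LIFETIMEBOUND_SKETCH macro are hypothetical, and the macro only expands to the attribute when compiling with Clang so the snippet stays portable.

```cpp
#include <cassert>
#include <string>

// Assumption for portability: apply the attribute only under Clang.
#if defined(__clang__)
#define LIFETIMEBOUND_SKETCH [[clang::lifetimebound]]
#else
#define LIFETIMEBOUND_SKETCH
#endif

// Hypothetical view type. The constructor returns nothing, so its parameter
// cannot escape through a return value; it escapes into the field `Name`.
struct NameView {
  explicit NameView(const std::string &S LIFETIMEBOUND_SKETCH) : Name(&S) {}

  const std::string *Name; // Points at the caller's string; does not own it.
};
```

Here the annotation is meaningful because the constructed object keeps a pointer to the argument, which is the field-escape case the commit message describes.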
Delta File
+4 -2 clang/lib/Analysis/LifetimeSafety/Checker.cpp
+5 -0 clang/test/Sema/LifetimeSafety/lifetimebound-violation.cpp
+9 -2 2 files

LLVM/project 8b72350 clang/lib/StaticAnalyzer/Core ExprEngine.cpp

[NFC][analyzer] Eliminate some simple node builders (#204187)

Replace four `NodeBuilder`s with direct use of the `makeNode` method
family. This is part of my commit series that simplifies the engine by
gradually eliminating the class `NodeBuilder`.
Delta File
+20 -32 clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+20 -32 1 files

LLVM/project 4fd4640 llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis find-gcd-overflow.ll

[DA] Add overflow check before calculating absolute value (#201964)

In findGCD, we call APInt::abs, which can overflow when the value is the
signed minimum. Such an overflow can lead to miscompilation, so this patch
adds an overflow check for absolute value calculations and bails out
early if the calculation would overflow.
Fix #201559.
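
The overflow being guarded against can be shown with a fixed-width integer. The sketch below uses plain int64_t rather than APInt, and the helper name checkedAbs is hypothetical; it only illustrates the "check before taking the absolute value, bail out otherwise" idea and is not the code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: returns |Value|, or std::nullopt when the result is
// not representable. For a two's-complement int64_t the only such input is
// the signed minimum, whose magnitude is one larger than the signed maximum.
inline std::optional<std::int64_t> checkedAbs(std::int64_t Value) {
  if (Value == std::numeric_limits<std::int64_t>::min())
    return std::nullopt; // Caller should bail out instead of using a wrong value.
  return Value < 0 ? -Value : Value;
}
```

A caller would treat std::nullopt as "give up on this analysis" rather than continue with a wrapped value.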
Delta File
+59 -0 llvm/test/Analysis/DependenceAnalysis/find-gcd-overflow.ll
+14 -3 llvm/lib/Analysis/DependenceAnalysis.cpp
+73 -3 2 files

LLVM/project f3fb6fe clang/lib/AST/ByteCode Interp.cpp, clang/test/AST/ByteCode new-delete.cpp

[clang][bytecode] Check for block pointers in Free() (#205043)

We need a block pointer here for the following operations, and non-block
pointers aren't valid anyway.
Delta File
+13 -0 clang/test/AST/ByteCode/new-delete.cpp
+3 -0 clang/lib/AST/ByteCode/Interp.cpp
+16 -0 2 files

LLVM/project 63f9955 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU packed-u64.ll

AMDGPU/GlobalISel: RegBankLegalize rules for pk_u64 add and sub
Delta File
+23 -7 llvm/test/CodeGen/AMDGPU/packed-u64.ll
+3 -1 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+26 -8 2 files

LLVM/project c56e892 llvm/test/CodeGen/AMDGPU vector-reduce-umin.ll vector-reduce-smax.ll, llvm/test/CodeGen/AMDGPU/GlobalISel fdiv.f32.ll

AMDGPU/GlobalISel: Remove -new-reg-bank-select option (#203929)

AMDGPU's -global-isel pipeline that uses AMDGPURegBankSelect and
AMDGPURegBankLegalize, previously -global-isel -new-reg-bank-select,
is now the default -global-isel pipeline.

Remove -new-reg-bank-select option from the compiler.
Remove -new-reg-bank-select from all llvm regression tests.
Edit a couple comments to reference RegBankLegalize instead of
-new-reg-bank-select.
Delta File
+12 -12 llvm/test/CodeGen/AMDGPU/vector-reduce-umin.ll
+12 -12 llvm/test/CodeGen/AMDGPU/vector-reduce-smax.ll
+12 -12 llvm/test/CodeGen/AMDGPU/vector-reduce-umax.ll
+12 -12 llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f32.ll
+12 -12 llvm/test/CodeGen/AMDGPU/integer-mad-patterns.ll
+11 -11 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wave.shuffle.ll
+71 -71 893 files not shown
+2,533 -2,541 899 files

LLVM/project b95e1e8 llvm/include/llvm/Support Allocator.h, llvm/unittests/Support AllocatorTest.cpp

[Allocator] Keep bump pointer at a minimum alignment (#203718)

Add a `MinAlign` template parameter (default 8, sizeof(size_t) on 64-bit
platforms) so that the common case `Alignment <= MinAlign` can skip
realigning `CurPtr`.

This is achieved by rounding each allocation's size up to MinAlign, so
the bump pointer stays MinAlign-aligned between allocations.
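
The size-rounding idea can be sketched with a toy allocator over a fixed buffer. This is an illustration under stated assumptions, not LLVM's BumpPtrAllocator: the names roundUpTo and ToyBumpAllocator are hypothetical, the buffer is assumed to start MinAlign-aligned, and only requests with alignment no greater than MinAlign are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round Size up to the next multiple of Align (Align must be a power of two).
constexpr std::size_t roundUpTo(std::size_t Size, std::size_t Align) {
  return (Size + Align - 1) & ~(Align - 1);
}

// Hypothetical toy bump allocator over a caller-provided buffer.
template <std::size_t MinAlign = 8> class ToyBumpAllocator {
  static_assert(MinAlign != 0 && (MinAlign & (MinAlign - 1)) == 0,
                "MinAlign must be a power of two");

public:
  // Assumption: BufBegin is already MinAlign-aligned.
  ToyBumpAllocator(char *BufBegin, char *BufEnd) : Cur(BufBegin), End(BufEnd) {}

  // Because every size is padded to a multiple of MinAlign, Cur stays
  // MinAlign-aligned and no realignment step is needed before handing it out.
  void *allocate(std::size_t Size) {
    std::size_t Remaining = static_cast<std::size_t>(End - Cur);
    if (Size > Remaining)
      return nullptr; // Also keeps the rounding below from wrapping around.
    std::size_t Padded = roundUpTo(Size, MinAlign);
    if (Padded > Remaining)
      return nullptr;
    void *Result = Cur;
    Cur += Padded;
    return Result;
  }

private:
  char *Cur;
  char *End;
};
```

The trade-off is a few bytes of padding per allocation in exchange for skipping the alignment arithmetic in the common case.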

SpecificBumpPtrAllocator::DestroyAll() walks objects at a fixed
sizeof(T) stride and needs tight packing, so it uses MinAlign=1. (alignof(T) would
pack just as tightly and reuse the default instantiation, but T may be
incomplete here, e.g. `SpecificBumpPtrAllocator<MCSectionELF>`.)

Its `Allocate` still skips the realign: the slab is max_align_t-aligned
and every size is a multiple of alignof(T), so the bump pointer stays
alignof(T)-aligned and we can just request alignment 1. Over-aligned
types (alignof(T) > alignof(max_align_t)) keep requesting alignof(T).


    [5 lines not shown]
Delta File
+51 -27 llvm/include/llvm/Support/Allocator.h
+19 -0 llvm/unittests/Support/AllocatorTest.cpp
+70 -27 2 files

LLVM/project 2a1f306 clang/lib/AST/ByteCode Interp.cpp, clang/test/AST/ByteCode dynamic-cast.cpp

[clang][bytecode] Add more sanity checks for pointers used in `dynamic_cast` (#205070)

Make sure it's initialized and that it points to a record.
Delta File
+16 -0 clang/test/AST/ByteCode/dynamic-cast.cpp
+4 -2 clang/lib/AST/ByteCode/Interp.cpp
+20 -2 2 files

LLVM/project 05d84fd llvm/test/CodeGen/AMDGPU/GlobalISel dropped_debug_info_assert.ll

[AMDGPU] Run update script on test. NFC (#204570)

There's some bogus whitespace in the generated CHECKs that changes when
touching the test.
Delta File
+37 -37 llvm/test/CodeGen/AMDGPU/GlobalISel/dropped_debug_info_assert.ll
+37 -37 1 files

LLVM/project 12d0fcf llvm/test/CodeGen/AMDGPU vector-reduce-umin.ll vector-reduce-smax.ll, llvm/test/CodeGen/AMDGPU/GlobalISel fdiv.f32.ll

AMDGPU/GlobalISel: Remove -new-reg-bank-select option

AMDGPU's -global-isel pipeline that uses AMDGPURegBankSelect and
AMDGPURegBankLegalize, previously -global-isel -new-reg-bank-select,
is now the default -global-isel pipeline.

Remove -new-reg-bank-select option from the compiler.
Remove -new-reg-bank-select from all llvm regression tests.
Edit a couple comments to reference RegBankLegalize instead of
-new-reg-bank-select.
Delta File
+12 -12 llvm/test/CodeGen/AMDGPU/vector-reduce-umin.ll
+12 -12 llvm/test/CodeGen/AMDGPU/vector-reduce-smax.ll
+12 -12 llvm/test/CodeGen/AMDGPU/vector-reduce-umax.ll
+12 -12 llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f32.ll
+12 -12 llvm/test/CodeGen/AMDGPU/integer-mad-patterns.ll
+11 -11 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wave.shuffle.ll
+71 -71 893 files not shown
+2,533 -2,541 899 files

LLVM/project a2289b7 lldb/test/API/macosx/deny-attach main.c TestDenyAttach.py, lldb/tools/debugserver/source/MacOSX MachProcess.mm

Revert "[lldb] Survive ptrace(PT_DENY_ATTACH) when attaching" (#205075)

Reverts llvm/llvm-project#204688

This breaks green dragon where the error message is `error: attach
failed: this is a non-interactive debug session, cannot get permission
to debug processes.`
Delta File
+5 -87 lldb/tools/debugserver/source/MacOSX/MachProcess.mm
+0 -60 lldb/test/API/macosx/deny-attach/main.c
+0 -36 lldb/test/API/macosx/deny-attach/TestDenyAttach.py
+0 -3 lldb/test/API/macosx/deny-attach/Makefile
+5 -186 4 files

LLVM/project 192ef55 llvm/lib/Target/AMDGPU AMDGPUTargetMachine.cpp, llvm/test/CodeGen/AMDGPU maximumnum.ll minimumnum.ll

AMDGPU/GlobalISel: Use AMDGPURegBankSelect + AMDGPURegBankLegalize by default (#203928)

Change AMDGPU's default -global-isel pipeline to use AMDGPURegBankSelect
and AMDGPURegBankLegalize (previously -global-isel -new-reg-bank-select)
instead of RegBankSelect, which uses AMDGPURegisterBankInfo.

The -global-isel pipeline that used RegBankSelect/AMDGPURegisterBankInfo is
now deprecated, since it could not generate functionally correct code in
some cases involving divergent control flow and phis.

The -new-reg-bank-select option now does nothing and will be removed in a
follow-up patch.

Delete regbankselect-mui.ll and regbankselect-mui-salu-float.ll, which
existed to compare -global-isel against -global-isel -new-reg-bank-select.

Temporarily disable a couple of tests that are missing AMDGPURegBankLegalize
support.
Delta File
+0 -643 llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-mui.ll
+0 -52 llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-mui-salu-float.ll
+13 -13 llvm/test/CodeGen/AMDGPU/maximumnum.ll
+13 -13 llvm/test/CodeGen/AMDGPU/minimumnum.ll
+5 -9 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+3 -3 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wqm.vote.ll
+34 -733 2 files not shown
+36 -735 8 files