LLVM/project 03cfa6fopenmp/runtime/src kmp_adt.h, openmp/runtime/unittests/ADT TestVector.cpp CMakeLists.txt

[libomp] Add kmp_vector (ADT 2/2)

See rationale in the commit adding kmp_str_ref.

This commit introduces kmp_vector, a class intended primarily for small
vectors. It currently only includes methods I need at the moment, but
it's easily extensible.
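
For readers unfamiliar with the idea, here is a minimal sketch of a small vector with inline storage. It is not kmp_vector: the name small_vec, its methods, and the doubling growth policy are assumptions for illustration, and it only handles trivially copyable element types. It uses malloc/free rather than C++ library containers, in keeping with the constraint that libomp does not link against the C++ stdlib.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>

// Hypothetical small vector; the real kmp_vector interface may differ.
template <typename T, std::size_t N> class small_vec {
  static_assert(N > 0, "inline capacity must be non-zero");
  static_assert(std::is_trivially_copyable<T>::value,
                "sketch only handles trivially copyable types");

  T inline_buf[N];       // storage used while the size fits in N
  T *data_ = inline_buf; // points at inline_buf or at a malloc'ed block
  std::size_t size_ = 0;
  std::size_t cap_ = N;

  // Double the capacity and move the elements to the heap.
  void grow() {
    std::size_t new_cap = cap_ * 2;
    T *p = static_cast<T *>(std::malloc(new_cap * sizeof(T)));
    assert(p && "allocation failed");
    std::memcpy(p, data_, size_ * sizeof(T));
    if (data_ != inline_buf)
      std::free(data_);
    data_ = p;
    cap_ = new_cap;
  }

public:
  small_vec() = default;
  small_vec(const small_vec &) = delete;
  small_vec &operator=(const small_vec &) = delete;
  ~small_vec() {
    if (data_ != inline_buf)
      std::free(data_);
  }

  void push_back(const T &v) {
    if (size_ == cap_)
      grow();
    data_[size_++] = v;
  }
  T &operator[](std::size_t i) {
    assert(i < size_);
    return data_[i];
  }
  std::size_t size() const { return size_; }
  bool is_inline() const { return data_ == inline_buf; }
};
```

As long as no more than N elements are pushed, the sketch never touches the heap, which is the property that makes this layout attractive for small vectors.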
DeltaFile
+627-0openmp/runtime/unittests/ADT/TestVector.cpp
+196-0openmp/runtime/src/kmp_adt.h
+1-0openmp/runtime/unittests/ADT/CMakeLists.txt
+824-03 files

LLVM/project 862d176llvm/lib/Target/AMDGPU GCNVOPDUtils.cpp VOP3PInstructions.td, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp

AMDGPU: Reland: Codegen for v_dual_dot2acc_f32_f16/bf16 from VOP3

For V_DOT2_F32_F16 and V_DOT2_F32_BF16 add their VOPDName and mark
them with usesCustomInserter which will be used to add pre-RA register
allocation hints to preferably assign dst and src2 to the same physical
register. When the hint is satisfied, canMapVOP3PToVOPD recognises the
instruction as eligible for VOPD pairing by checking that it is VOP2-like:
dst==src2, no source modifiers, no clamp, and src1 is a register.
Mark both instructions as commutable to allow a literal in src1 to be
moved to src0, since VOPD only permits a literal in src0.
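
The VOP2-like conditions listed above can be sketched as a single predicate. This is an illustration only: Dot2Summary and is_vop2_like are hypothetical names, and the real canMapVOP3PToVOPD works on machine instruction operands, which this sketch does not model.

```cpp
#include <cassert>

// Hypothetical summary of a V_DOT2 instruction's operands.
struct Dot2Summary {
  unsigned dst_reg;       // physical register assigned to dst
  unsigned src2_reg;      // physical register assigned to src2
  bool src1_is_reg;       // false if src1 is a literal or immediate
  bool has_src_modifiers; // any source modifier set
  bool has_clamp;         // clamp bit set
};

// True when the instruction meets the four conditions from the commit
// message: dst==src2, no source modifiers, no clamp, src1 is a register.
bool is_vop2_like(const Dot2Summary &mi) {
  return mi.dst_reg == mi.src2_reg && !mi.has_src_modifiers &&
         !mi.has_clamp && mi.src1_is_reg;
}
```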

The original patch had a bug: it did not check whether physical src
registers match the register class of the corresponding operand in full
VOPD instructions. That check is now done via isValidVOPDSrc.
DeltaFile
+442-520llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fdot2.ll
+166-69llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fdot2.f32.bf16.ll
+34-1llvm/lib/Target/AMDGPU/GCNVOPDUtils.cpp
+8-5llvm/lib/Target/AMDGPU/VOP3PInstructions.td
+8-0llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+6-0llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+664-5951 files not shown
+666-5977 files

LLVM/project d14b908lld/test/ELF aarch64-long-thunk-converge.s aarch64-thunk-reuse2.s

[AArch64][LLD] Update tests to reduce file size [NFC] (#202547)

Remove large object and executable files after running test. Some tests
need to run within a single OutputSection and cannot use a Linker Script
to increase distance without a large object and corresponding executable
file.

Fixes AArch64 part of #202261
DeltaFile
+7-7lld/test/ELF/aarch64-long-thunk-converge.s
+1-0lld/test/ELF/aarch64-thunk-reuse2.s
+1-0lld/test/ELF/aarch64-cortex-a53-843419-thunk-relocation-crash.s
+9-73 files

LLVM/project dd05657lld/test/ELF arm-thunk-largesection.s arm-thunk-nosuitable.s

[ARM][LLD] Remove large files at end of test [NFC] (#202548)

Some range extension and erratum fix thunks can't easily use a linker
script to make gaps that don't result in a large output. Explicitly
remove the large object and linker output files to reduce storage usage.

Related to #202261
DeltaFile
+2-0lld/test/ELF/arm-thunk-largesection.s
+2-0lld/test/ELF/arm-thunk-nosuitable.s
+2-0lld/test/ELF/arm-thunk-section-too-large.s
+2-0lld/test/ELF/arm-thunk-toolargesection.s
+2-0lld/test/ELF/arm-fix-cortex-a8-toolarge.s
+2-0lld/test/ELF/arm-thunk-multipass-plt.s
+12-09 files not shown
+21-015 files

LLVM/project 9e0508dlld/test/ELF arm-thunk-linkerscript-dotexpr.s arm-thumb-thunk-v6m.s

[ARM][LLD] Reduce thunk test case size, linkerscript changes [NFC] (#202549)

These changes either refactor the test to use split-file and then
delete the outputs, as the size saving is not large, or adapt the
linker script to reduce the size by introducing sparse program segments.

All these cases are fairly simple changes, and have made minimal changes
to the CHECK lines.

Related to #202261
DeltaFile
+27-15lld/test/ELF/arm-thunk-linkerscript-dotexpr.s
+22-10lld/test/ELF/arm-thumb-thunk-v6m.s
+19-13lld/test/ELF/arm-thunk-linkerscript-large.s
+17-13lld/test/ELF/arm-thunk-linkerscript.s
+17-9lld/test/ELF/arm-fix-cortex-a8-thunk.s
+14-9lld/test/ELF/arm-thunk-many-passes.s
+116-696 files not shown
+185-10812 files

LLVM/project cd2b669llvm/lib/Target/SPIRV SPIRVUtils.h SPIRVUtils.cpp

[SPIR-V] Take ArrayRef instead of owning containers in selection helpers (NFC) (#203908)

Avoid per-call heap allocations where call sites pass braced-list
temporaries.
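
A rough illustration of why a non-owning parameter helps, assuming the previous parameters were std::vector-like owning containers (the commit message does not say which). array_view is a hypothetical stand-in for llvm::ArrayRef so the sketch stays self-contained; contains_by_value and contains_by_view are made-up helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical minimal stand-in for a non-owning array view.
template <typename T> class array_view {
  const T *ptr = nullptr;
  std::size_t len = 0;

public:
  // Views the backing array of a braced list; valid for the duration of
  // the call when the list is passed directly as an argument.
  array_view(std::initializer_list<T> il) : ptr(il.begin()), len(il.size()) {}
  array_view(const std::vector<T> &v) : ptr(v.data()), len(v.size()) {}
  const T *begin() const { return ptr; }
  const T *end() const { return ptr + len; }
};

// Owning parameter: a braced list at the call site builds a std::vector,
// which allocates on every call.
bool contains_by_value(std::vector<int> haystack, int needle) {
  for (int v : haystack)
    if (v == needle)
      return true;
  return false;
}

// Non-owning parameter: the same braced list is only viewed, not copied.
bool contains_by_view(array_view<int> haystack, int needle) {
  for (int v : haystack)
    if (v == needle)
      return true;
  return false;
}
```

Both helpers accept a call such as contains_by_view({1, 2, 3}, 2); only the by-value version pays for a heap allocation.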
DeltaFile
+4-8llvm/lib/Target/SPIRV/SPIRVUtils.h
+5-7llvm/lib/Target/SPIRV/SPIRVUtils.cpp
+3-5llvm/lib/Target/SPIRV/SPIRVCombinerHelper.cpp
+4-4llvm/lib/Target/SPIRV/SPIRVCombinerHelper.h
+3-3llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+2-1llvm/lib/Target/SPIRV/SPIRVCommandLine.h
+21-281 files not shown
+22-297 files

LLVM/project bcae138llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp

Format

Change-Id: I395a6d065e9b843d4dc33ee75786adbf7e03d9fc
DeltaFile
+9-12llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+9-121 files

LLVM/project b79e696clang/test/AST/ByteCode dynamic-cast.cpp, lldb/include/lldb lldb-enumerations.h

Merge branch 'users/kparzysz/m01-misparsed-call' into users/kparzysz/m02-locator-frontend
DeltaFile
+464-793llvm/test/CodeGen/AMDGPU/fcanonicalize.ll
+1,079-0llvm/test/CodeGen/AMDGPU/usubsat.ll
+422-323lldb/include/lldb/lldb-enumerations.h
+736-0openmp/runtime/unittests/ADT/TestStringRef.cpp
+339-2llvm/test/Transforms/LoopVectorize/RISCV/strided-accesses.ll
+296-0clang/test/AST/ByteCode/dynamic-cast.cpp
+3,336-1,118164 files not shown
+5,819-1,984170 files

LLVM/project 62f85declang/test/AST/ByteCode dynamic-cast.cpp, lldb/include/lldb lldb-enumerations.h

Merge branch 'main' into users/kparzysz/m01-misparsed-call
DeltaFile
+464-793llvm/test/CodeGen/AMDGPU/fcanonicalize.ll
+1,079-0llvm/test/CodeGen/AMDGPU/usubsat.ll
+422-323lldb/include/lldb/lldb-enumerations.h
+736-0openmp/runtime/unittests/ADT/TestStringRef.cpp
+339-2llvm/test/Transforms/LoopVectorize/RISCV/strided-accesses.ll
+296-0clang/test/AST/ByteCode/dynamic-cast.cpp
+3,336-1,118164 files not shown
+5,819-1,984170 files

LLVM/project 1df0924

Restart build
DeltaFile
+0-00 files

LLVM/project a5ffec8llvm/lib/Target/AMDGPU GCNVOPDUtils.cpp

AMDGPU: Validate VOPD/VOPD3 physical source registers against operand RC

Replace isVGPR checks with isValidVOPDSrc that validates physical source
registers against the actual combined VOPD/VOPD3 instruction's operand
register classes. Now we also validate operands for VOPD instructions.
DeltaFile
+40-7llvm/lib/Target/AMDGPU/GCNVOPDUtils.cpp
+40-71 files

LLVM/project 16391bfllvm/lib/Target/AMDGPU GCNVOPDUtils.cpp

AMDGPU: Refactor checkVOPDRegConstraints
DeltaFile
+28-41llvm/lib/Target/AMDGPU/GCNVOPDUtils.cpp
+28-411 files

LLVM/project d4ca116lldb/include/lldb lldb-enumerations.h

[lldb] Reformat doxygen comments in lldb-enumerations.h (NFC) (#203079)

Convert doxygen comments to precede the enumerator to which they apply
(using `///`). This placement of documentation is more consistent with
how functions and classes are documented. Additionally, with the column
limit, the documentation was quite crammed as it was. Lastly, comments
have been reflowed so that they make full use of horizontal space.
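
To illustrate the placement change, here is a sketch on a made-up enum. The enumerators are hypothetical and not taken from lldb-enumerations.h, and the "before" form assumes trailing `///<` comments were in use.

```cpp
#include <cassert>

// Before (assumed): trailing comments squeezed into the remaining columns.
enum OldStyle {
  eOldStopped = 0, ///< Process is stopped.
  eOldRunning      ///< Process is running.
};

// After: each comment precedes the enumerator it documents.
enum NewStyle {
  /// Process is stopped.
  eNewStopped = 0,
  /// Process is running.
  eNewRunning
};
```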

Assisted-by: claude
DeltaFile
+422-323lldb/include/lldb/lldb-enumerations.h
+422-3231 files

LLVM/project ae80984llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp, llvm/test/CodeGen/AMDGPU vgpr-excess-threshold-percent.ll vgpr-excess-threshold-percent-invalid.ll

Review comments + change handling of VGPRCriticalLimit

Change-Id: I9aabbf6e40ad59e78fddae46c6a773630f6c54b6
DeltaFile
+67-67llvm/test/CodeGen/AMDGPU/vgpr-excess-threshold-percent.ll
+14-16llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+5-5llvm/test/CodeGen/AMDGPU/vgpr-excess-threshold-percent-invalid.ll
+86-883 files

LLVM/project 132adf7llvm/lib/Target/SPIRV SPIRVInstructionSelector.cpp, llvm/test/CodeGen/SPIRV/llvm-intrinsics fp-intrinsics.ll frexp.ll

[SPIR-V] Skip dead second result load for frexp and modf (#201891)

Skip the load of the second result when it is not used.
DeltaFile
+36-0llvm/test/CodeGen/SPIRV/llvm-intrinsics/fp-intrinsics.ll
+15-10llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+6-6llvm/test/CodeGen/SPIRV/llvm-intrinsics/frexp.ll
+57-163 files

LLVM/project 33674aeflang/lib/Lower Bridge.cpp PFTBuilder.cpp, flang/test/Lower do_loop_execute_region_wrap.f90

[Lower] Wrap unstructured constructs in scf.execute_region

For each unstructured DO construct whose body only exits via the
construct's lexical exit (no GOTOs to outer labels), move the loop's
blocks into the region of a freshly-created scf.execute_region op
marked `no_inline`. The op sits in the outer CFG followed by a
cf.br to what used to be the construct-exit block; in-loop edges to
that block become scf.yield in a single yield block inside the region.

Co-Authored-By: Claude Sonnet 4.6 <noreply at anthropic.com>
DeltaFile
+254-0flang/lib/Lower/Bridge.cpp
+85-0flang/test/Lower/do_loop_execute_region_wrap.f90
+52-2flang/lib/Lower/PFTBuilder.cpp
+8-1flang/lib/Lower/Runtime.cpp
+399-34 files

LLVM/project 8af74a0offload/plugins-nextgen/level_zero/include L0Options.h

[OFFLOAD][L0] Switch to use inorder queues by default (#203897)

Now that other pieces are in place we can switch to using Level Zero
inorder queues by default.
DeltaFile
+1-3offload/plugins-nextgen/level_zero/include/L0Options.h
+1-31 files

LLVM/project 0919b1elldb/include/lldb/Core Statusline.h, lldb/source/Core Statusline.cpp

[lldb] Recompute the statusline on resize without clearing the screen (#202691)

On a terminal resize the statusline cleared the whole screen (ESC[2J)
and redrew, because recomputing in place was buggy: the statusline
wrapped and duplicated. The clear also wiped the visible scrollback on
every resize. I got lots of feedback that this wasn't a great user
experience so I spent some time taking another stab at this.

This PR reverts back to recomputing the statusline. After a resize the
terminal still shows the old statusline:

- Making the terminal smaller (horizontally) reflows the full-width line
into ceil(prev_width / width) rows at the bottom
- Making the terminal larger (vertically) leaves it stranded at its old
row
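
The row count in the first case is a ceiling division. A minimal sketch, assuming widths are measured in terminal columns; stale_rows is a hypothetical helper and not a function from Statusline.cpp:

```cpp
#include <cassert>
#include <cstddef>

// Number of rows a line that was prev_width columns wide occupies after
// the terminal reflows it to width columns: ceil(prev_width / width).
std::size_t stale_rows(std::size_t prev_width, std::size_t width) {
  if (width == 0 || prev_width == 0)
    return 0;
  return (prev_width + width - 1) / width;
}
```

For example, a 120-column statusline reflowed into an 80-column terminal occupies two rows, so two rows would need clearing before the redraw.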

Clear only the rows it can still occupy and redraw, preserving the
scrollback above. Disable autowrap while drawing so a line briefly wider
than the terminal is clipped at the margin rather than wrapping onto the

    [7 lines not shown]
DeltaFile
+60-0lldb/test/API/functionalities/statusline/TestStatusline.py
+49-6lldb/source/Core/Statusline.cpp
+11-1lldb/source/Host/common/Editline.cpp
+5-2lldb/include/lldb/Core/Statusline.h
+125-94 files

LLVM/project fff4896utils/bazel/llvm-project-overlay/clang/unittests BUILD.bazel

[Bazel] Fixes 63c9264 (#203875)

This fixes 63c9264c770b4ffb7b5b3816547db45969026c3f.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+1-0utils/bazel/llvm-project-overlay/clang/unittests/BUILD.bazel
+1-01 files

LLVM/project 53e36fbclang/lib/Driver Driver.cpp, clang/test/Preprocessor zos-target.c

[z/OS][Clang] Raise minimum supported z/OS target level to 3.1 (#203580)

Update z/OS target validation and tests to reject targets older than
z/OS 3.1
DeltaFile
+6-6clang/test/Preprocessor/zos-target.c
+2-4clang/lib/Driver/Driver.cpp
+8-102 files

LLVM/project edb7ad8llvm/lib/Target/SPIRV SPIRVEmitIntrinsics.cpp, llvm/test/CodeGen/SPIRV/pointers type-deduce-global-array-poison.ll

[SPIR-V] Deduce pointee type for all global variables, not only initialized ones (#202047)

SPIRVEmitIntrinsics::processGlobalValue only recorded a global variable's
element type when hasInitializer() was true. It now records it unconditionally.

Assisted-by: Claude (Anthropic)
DeltaFile
+29-0llvm/test/CodeGen/SPIRV/pointers/type-deduce-global-array-poison.ll
+6-4llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp
+35-42 files

LLVM/project b32a43dllvm/test/CodeGen/AMDGPU integer-mad-patterns.ll vector-reduce-smax.ll, llvm/test/CodeGen/AMDGPU/GlobalISel fdiv.f32.ll

AMDGPU/GlobalISel: Remove -new-reg-bank-select option

AMDGPU's -global-isel pipeline that uses AMDGPURegBankSelect and
AMDGPURegBankLegalize, previously -global-isel -new-reg-bank-select,
is now the default -global-isel pipeline.

Remove -new-reg-bank-select option from the compiler.
Remove -new-reg-bank-select from all llvm regression tests.
Edit a couple comments to reference RegBankLegalize instead of
-new-reg-bank-select.
DeltaFile
+12-12llvm/test/CodeGen/AMDGPU/integer-mad-patterns.ll
+12-12llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f32.ll
+12-12llvm/test/CodeGen/AMDGPU/vector-reduce-smax.ll
+12-12llvm/test/CodeGen/AMDGPU/vector-reduce-umin.ll
+12-12llvm/test/CodeGen/AMDGPU/vector-reduce-umax.ll
+11-11llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wave.shuffle.ll
+71-71885 files not shown
+2,517-2,525891 files

LLVM/project e4bacc6clang/include/clang/Basic DiagnosticParseKinds.td, clang/lib/Parse ParseExprCXX.cpp ParseStmt.cpp

address code review
DeltaFile
+18-2clang/test/C/C2y/n3267.c
+9-6clang/lib/Parse/ParseExprCXX.cpp
+6-0clang/lib/Parse/ParseStmt.cpp
+1-0clang/include/clang/Basic/DiagnosticParseKinds.td
+34-84 files

LLVM/project 5385418flang-rt/lib/runtime io-api.cpp, flang-rt/unittests/Runtime ExternalIOTest.cpp

[Flang][Runtime] Fix write endfile abort (#191633)

A WRITE after ENDFILE with ERR= or IOSTAT= was crashing instead of
handling the error properly. Earlier, the program was crashing because
the error was triggered too early (before error handling was ready).

---------

Co-authored-by: Jay Satish Kumar Patel <kumarpat at hpe.com>
DeltaFile
+54-0flang-rt/unittests/Runtime/ExternalIOTest.cpp
+5-0flang-rt/lib/runtime/io-api.cpp
+59-02 files

LLVM/project e57ebfdflang/test/Integration/OpenMP map-types-and-sizes.f90, flang/test/Lower/OpenMP optional-argument-map-2.f90 target-map-complex.f90

[Flang][OpenMP] remove enable-delayed-privatization-staging to support target first private default (#203626)

This commit follows the decision in #182356 to remove the "not yet
implemented" error for delayed privatization of firstprivate and private
in `omp target` regions in flang.


Fixes #182356

Assisted with Opus
DeltaFile
+52-0offload/test/offloading/fortran/target-firstprivate.f90
+1-35flang/test/Lower/OpenMP/optional-argument-map-2.f90
+33-0flang/test/Lower/OpenMP/DelayedPrivatization/target-firstprivate.f90
+6-10flang/test/Lower/OpenMP/target-map-complex.f90
+4-8flang/test/Integration/OpenMP/map-types-and-sizes.f90
+1-10flang/test/Lower/OpenMP/target.f90
+97-6327 files not shown
+126-13233 files

LLVM/project 2d8a394openmp/runtime/src kmp_adt.h kmp_adt.cpp, openmp/runtime/unittests/ADT TestStringRef.cpp CMakeLists.txt

[libomp] Add kmp_str_ref (ADT 1/2) (#176162)

libomp currently has two limitations:
1) although it's C++, it doesn't link against the C++ stdlib
2) it cannot link against the implementation of LLVM ADTs

These limitations shall not be altered at the moment.

As a result, this commit introduces kmp_str_ref, which is similar to
LLVM's StringRef. It currently only includes methods I need at the
moment, but it's easily extensible.
DeltaFile
+736-0openmp/runtime/unittests/ADT/TestStringRef.cpp
+142-0openmp/runtime/src/kmp_adt.h
+63-0openmp/runtime/src/kmp_adt.cpp
+27-0openmp/runtime/unittests/String/TestKmpStr.cpp
+2-2openmp/runtime/src/kmp_str.cpp
+4-0openmp/runtime/unittests/ADT/CMakeLists.txt
+974-23 files not shown
+978-39 files

LLVM/project 5fbc81allvm/lib/Transforms/Scalar JumpThreading.cpp

[JumpThreading] Use isGuaranteedToTransferExecutionToSuccessor() with range (#203918)

Use the overload that accepts a range of instructions. This is not NFC
because the scan is now subject to ScanLimit.
DeltaFile
+3-3llvm/lib/Transforms/Scalar/JumpThreading.cpp
+3-31 files

LLVM/project 4802718llvm/lib/CodeGen TargetPassConfig.cpp, llvm/lib/Target/AMDGPU AMDGPUTargetMachine.cpp

AMDGPU/GlobalISel: Use AMDGPURegBankSelect + AMDGPURegBankLegalize by default

Change AMDGPU's default -global-isel pipeline to use AMDGPURegBankSelect
and AMDGPURegBankLegalize (previously -global-isel -new-reg-bank-select)
by default instead of RegBankSelect which uses AMDGPURegisterBankInfo.

The -global-isel pipeline that used RegBankSelect/AMDGPURegisterBankInfo is
now deprecated, since it could not generate functionally correct code in
some cases involving divergent control flow and phis.

The -new-reg-bank-select option now does nothing and will be removed in a
follow-up patch.

Delete regbankselect-mui.ll and regbankselect-mui-salu-float.ll, which
existed to compare the -global-isel vs -global-isel -new-reg-bank-select.

Temporarily disable a couple of tests that are missing AMDGPURegBankLegalize
support.
DeltaFile
+0-643llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-mui.ll
+0-52llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-mui-salu-float.ll
+13-13llvm/test/CodeGen/AMDGPU/maximumnum.ll
+13-13llvm/test/CodeGen/AMDGPU/minimumnum.ll
+7-11llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+6-1llvm/lib/CodeGen/TargetPassConfig.cpp
+39-73310 files not shown
+56-74716 files

LLVM/project f32b9ebllvm/lib/Transforms/Scalar ConstantHoisting.cpp, llvm/test/Transforms/ConstantHoisting/AArch64 consthoist-unreachable-phi-edge.ll

[ConstantHoisting] Skip PHI edges from unreachable blocks (#203892)

When collecting constant candidates, skip incoming PHI edges from blocks
that are unreachable from entry. This avoids assertion failures when the
pass later tries to find insertion points or update users for constants
that only appear on unreachable edges.
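
The idea can be sketched on a toy CFG. This is not the code in ConstantHoisting.cpp: the block numbering, PhiIncoming, reachable_from_entry, and collect_candidates are all assumptions for illustration, and the pass itself operates on LLVM IR rather than on plain vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical PHI incoming edge: predecessor block index and constant value.
struct PhiIncoming {
  int pred_block;
  long constant;
};

// Mark every block reachable from block 0 (the entry) with a worklist walk.
// succs[b] lists the successor block indices of block b.
std::vector<bool>
reachable_from_entry(const std::vector<std::vector<int>> &succs) {
  std::vector<bool> seen(succs.size(), false);
  if (succs.empty())
    return seen;
  std::vector<int> work{0};
  seen[0] = true;
  while (!work.empty()) {
    int b = work.back();
    work.pop_back();
    for (int s : succs[b]) {
      if (!seen[s]) {
        seen[s] = true;
        work.push_back(s);
      }
    }
  }
  return seen;
}

// Collect constants only from incoming edges whose predecessor is reachable.
std::vector<long> collect_candidates(const std::vector<PhiIncoming> &phi,
                                     const std::vector<bool> &reachable) {
  std::vector<long> out;
  for (const PhiIncoming &in : phi) {
    assert(in.pred_block >= 0 &&
           static_cast<std::size_t>(in.pred_block) < reachable.size());
    if (reachable[in.pred_block])
      out.push_back(in.constant);
  }
  return out;
}
```

With a CFG where block 2 branches to block 1 but nothing branches to block 2, a constant that arrives only on the edge from block 2 never becomes a candidate, so later stages have no unreachable use to handle.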

While touching this part of the code, also remove an older XFAIL from
test/Transforms/ConstantHoisting/X86/pr52689-not-all-uses-rebased.ll.
That test case also triggered the same assert once upon a time, but it
has been set to XFAIL for some time since the reproducer no longer
triggered the bug. This patch turns it into a normal test case instead
of an XFAIL test. As far as I can tell, the original problem may have
been the same: PHI nodes with edges from unreachable blocks. One
difference compared to the new AArch64 test is that here the involved
constants are GEPs and not simple scalars.

Closes https://github.com/llvm/llvm-project/issues/52689
DeltaFile
+58-0llvm/test/Transforms/ConstantHoisting/AArch64/consthoist-unreachable-phi-edge.ll
+24-14llvm/test/Transforms/ConstantHoisting/X86/pr52689-not-all-uses-rebased.ll
+6-0llvm/lib/Transforms/Scalar/ConstantHoisting.cpp
+88-143 files

LLVM/project 99a9ca4clang/test/Sema warn-lifetime-safety.cpp, clang/test/Sema/LifetimeSafety safety.cpp

Merge branch 'main' into users/c8ef/atomic_minmax
DeltaFile
+3,204-3,450llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-7.ll
+1,905-2,037llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-6.ll
+3,721-0clang/test/Sema/LifetimeSafety/safety.cpp
+0-3,653clang/test/Sema/warn-lifetime-safety.cpp
+1,825-1,328llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll
+1,813-654llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
+12,468-11,1221,722 files not shown
+98,650-43,1741,728 files