LLVM/project 3c89e07utils/bazel/llvm-project-overlay/mlir/unittests BUILD.bazel

[Bazel] Fixes 5c6c424 (#198787)

This fixes 5c6c424a50b840a39a6410a490af668a26d3a97a.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+1-0utils/bazel/llvm-project-overlay/mlir/unittests/BUILD.bazel
+1-01 files

LLVM/project 8ef411amlir/lib/Dialect/XeGPU/Transforms XeGPUUnroll.cpp, mlir/test/Dialect/XeGPU xegpu-unroll-patterns.mlir xegpu-blocking.mlir

[MLIR][XeGPU] Avoid chained-reductions in multi_reduction unrolling (#198307)

The PR adds a new unrolling pattern for `vector.multi_reduction` to the
`xegpu-blocking` pass. In comparison with [the upstream reduction
unrolling](https://github.com/llvm/llvm-project/blob/2da84a8307e4ef729458d990b221650a5da22639/mlir/lib/Dialect/Vector/Transforms/VectorUnroll.cpp#L372),
the new pattern performs partial row-wise reductions via elementwise
ops, instead of generating a chain of several multi-reduction ops:
```mlir
// reduction to unroll:
// tile-shape: [8x16]
vector.multi_reduction <add>, %vec, %cst [1] : vector<8x48xf32> to vector<8xf32>

// upstream unrolling:
%3 = vector.multi_reduction <add>, %tile_0, %cst [1] : vector<8x16xf32> to vector<8xf32>
%4 = vector.multi_reduction <add>, %tile_1, %3 [1] : vector<8x16xf32> to vector<8xf32>
%5 = vector.multi_reduction <add>, %tile_2, %4 [1] : vector<8x16xf32> to vector<8xf32>

// new xegpu-unrolling
%3 = arith.addf %tile_0, %tile_1 : vector<8x16xf32>

    [6 lines not shown]
```
DeltaFile
+146-5mlir/lib/Dialect/XeGPU/Transforms/XeGPUUnroll.cpp
+134-0mlir/test/Dialect/XeGPU/xegpu-unroll-patterns.mlir
+13-0mlir/test/lib/Dialect/XeGPU/TestXeGPUTransforms.cpp
+4-2mlir/test/Dialect/XeGPU/xegpu-blocking.mlir
+297-74 files
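
The difference between the two unrolling strategies in #198307 can be modeled outside MLIR. The following is a minimal Python sketch, not the pass itself: the helper names are hypothetical, and because the tail of the commit's example is not shown in this digest, the sketch assumes the new pattern finishes with a single row-wise reduction over the elementwise-combined tile. It uses integers so both strategies agree exactly; with floating-point values the two orders of addition may round differently.

```python
# Toy model of two ways to unroll a row-wise <add> reduction of an 8x48 input
# into 8x16 tiles. All function names here are hypothetical illustrations.

def reduce_rows(tile, acc):
    """Model of a row-wise add reduction with an accumulator (one value per row)."""
    return [a + sum(row) for row, a in zip(tile, acc)]

def add_tiles(lhs, rhs):
    """Model of an elementwise add on two equally shaped tiles."""
    return [[x + y for x, y in zip(r0, r1)] for r0, r1 in zip(lhs, rhs)]

def split_columns(vec, width):
    """Split an MxN matrix into M x width tiles along the column dimension."""
    ncols = len(vec[0])
    return [[row[c:c + width] for row in vec] for c in range(0, ncols, width)]

def chained(vec, acc, width):
    """Each tile is reduced in turn; every reduction feeds the next one."""
    for tile in split_columns(vec, width):
        acc = reduce_rows(tile, acc)
    return acc

def elementwise_then_reduce(vec, acc, width):
    """Tiles are first combined elementwise; one reduction runs at the end (assumed)."""
    tiles = split_columns(vec, width)
    partial = tiles[0]
    for tile in tiles[1:]:
        partial = add_tiles(partial, tile)
    return reduce_rows(partial, acc)

vec = [[r * 48 + c for c in range(48)] for r in range(8)]
acc = [0] * 8
chained_result = chained(vec, acc, 16)
elementwise_result = elementwise_then_reduce(vec, acc, 16)
```

In this model the chained form performs three dependent reductions, while the elementwise form performs two tile additions followed by one reduction.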

LLVM/project fbf6f2emlir/docs PrivateNameObfuscation.md, mlir/include/mlir/TableGen PrivateName.h

strip op and pass names
DeltaFile
+200-0mlir/docs/PrivateNameObfuscation.md
+174-0mlir/test/mlir-tblgen/private-name-obfuscation.td
+160-0mlir/tools/mlir-tblgen/PrivateName.cpp
+81-0mlir/utils/private-name-obfuscator-example.py
+79-0mlir/include/mlir/TableGen/PrivateName.h
+79-0mlir/test/mlir-tblgen/private-pass-obfuscation.td
+773-014 files not shown
+983-3620 files

LLVM/project 9d61990mlir CMakeLists.txt, mlir/docs PrivateNameObfuscation.md

strip op and pass names
DeltaFile
+193-0mlir/docs/PrivateNameObfuscation.md
+174-0mlir/test/mlir-tblgen/private-name-obfuscation.td
+160-0mlir/tools/mlir-tblgen/PrivateName.cpp
+79-0mlir/include/mlir/TableGen/PrivateName.h
+79-0mlir/test/mlir-tblgen/private-pass-obfuscation.td
+53-0mlir/CMakeLists.txt
+738-013 files not shown
+895-3619 files

LLVM/project 5c6c424mlir/lib/Dialect/MemRef/Utils MemRefUtils.cpp, mlir/unittests/Dialect/MemRef MemRefUtilsTest.cpp CMakeLists.txt

[mlir/memref] handle rank-0 contiguous check (#198541)
DeltaFile
+33-0mlir/unittests/Dialect/MemRef/MemRefUtilsTest.cpp
+8-3mlir/lib/Dialect/MemRef/Utils/MemRefUtils.cpp
+2-0mlir/unittests/Dialect/MemRef/CMakeLists.txt
+43-33 files

LLVM/project 7200655offload/plugins-nextgen/level_zero/src L0Device.cpp

[offload][l0] Clear delayed copy lists after the copy (#198786)
DeltaFile
+7-0offload/plugins-nextgen/level_zero/src/L0Device.cpp
+7-01 files

LLVM/project 4551535libc/test/src/stdlib qsort_r_test.cpp

Fix clang-format
DeltaFile
+1-1libc/test/src/stdlib/qsort_r_test.cpp
+1-11 files

LLVM/project 52a7731llvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU load-atomic-global.ll load-atomic-flat.ll

[AMDGPU] Only support unordered and monotonic atomic misalign
DeltaFile
+40-46llvm/test/CodeGen/AMDGPU/load-atomic-global.ll
+40-46llvm/test/CodeGen/AMDGPU/load-atomic-flat.ll
+24-24llvm/test/CodeGen/AMDGPU/store-atomic-flat.ll
+24-23llvm/test/CodeGen/AMDGPU/store-atomic-global.ll
+18-0llvm/test/Transforms/AtomicExpand/AMDGPU/unaligned-atomic.ll
+12-0llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+158-1396 files

LLVM/project 8e868c5llvm/lib/Transforms/Vectorize VPlanTransforms.cpp

[VPlan] Fix convoluted logic in simpl ext-last-lane (#196355)

Checking the users is unnecessary; if it is single-scalar, it means the
same value is splatted across all lanes. Also, the transformation does
not depend on the Plan being unrolled.
DeltaFile
+7-12llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+7-121 files

LLVM/project 11b4f57libc/src/stdlib qsort_util.h qsort_r.cpp, libc/test/src/stdlib QsortReentrantTest.h qsort_r_test.cpp

[libc] Refactor qsort code

This patch makes the following changes:
 - Refactor the internal sorting functions to reduce code duplication.
 - Move the test machinery written for `qsort_r` to a shared place.

These changes anticipate the introduction of Annex K's `qsort_s`. That
function shares most of its semantics with `qsort_r`, so most of the
testing logic can be shared between the two. Besides, `qsort`, `qsort_r`
and `qsort_s` are all very similar, so we can reduce duplication a bit
more.
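
One way to picture the duplication being removed is to have a context-free sort and a context-taking sort share a single core. The sketch below is a Python model of that structure only, not the libc code: every name in it is hypothetical, and unlike the C functions it returns a new list instead of sorting in place.

```python
from functools import cmp_to_key

def sort_with_context(items, compare, context):
    """Shared core: every comparison receives the caller's context."""
    return sorted(items, key=cmp_to_key(lambda a, b: compare(a, b, context)))

def qsort_like(items, compare):
    """Context-free entry point: adapts the comparator and passes no context."""
    return sort_with_context(items, lambda a, b, _ctx: compare(a, b), None)

def qsort_r_like(items, compare, context):
    """Context-taking entry point: forwards directly to the shared core."""
    return sort_with_context(items, compare, context)

def by_distance(a, b, origin):
    """Example comparator that orders values by distance from `origin`."""
    return abs(a - origin) - abs(b - origin)

plain = qsort_like([3, 1, 2], lambda a, b: a - b)
nearest = qsort_r_like([10, 1, 6, 4], by_distance, 5)
```

A third entry point with extra argument checks could wrap the same core, which is why shared test machinery for the context-taking variants is plausible.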
DeltaFile
+150-0libc/test/src/stdlib/QsortReentrantTest.h
+2-134libc/test/src/stdlib/qsort_r_test.cpp
+22-3libc/src/stdlib/qsort_util.h
+1-6libc/src/stdlib/qsort_r.cpp
+1-5libc/src/stdlib/qsort.cpp
+2-1libc/test/src/stdlib/CMakeLists.txt
+178-1496 files

LLVM/project f958514clang/lib/CIR/CodeGen CIRGenExpr.cpp, clang/test/CIR/CodeGen call.cpp

[CIR] Lower calling through a variable (#198672)

We managed to miss a condition when lowering emitCallee, where a
DeclRefExpr referenced a function object. This patch adds that
condition, which will result in these being lowered properly as an
indirect call.
DeltaFile
+21-0clang/test/CIR/CodeGen/call.cpp
+2-2clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+23-22 files

LLVM/project 2de7fe0flang/lib/Semantics resolve-names.cpp, flang/test/Semantics/OpenMP affected-loops.f90

[flang][OpenMP] Limit scope creation to constructs with data environment

Identify specific constructs that require data environments, and only
create scopes for them. This avoids scopes for loop-transformation
constructs, for example.

This isn't a correctness fix, but a clarification and a simplification
of the name-resolution code for OpenMP.
DeltaFile
+43-61flang/lib/Semantics/resolve-names.cpp
+2-3flang/test/Semantics/OpenMP/affected-loops.f90
+45-642 files

LLVM/project e83f597lldb/test/API/tools/lldb-dap/launch TestDAP_launch_stdio_redirection_and_console.py, lldb/test/API/tools/lldb-dap/runInTerminal TestDAP_runInTerminal.py

[lldb-dap][windows] skip runInTerminal related tests (#198764)

The following tests fail at desk. This is likely a regression introduced
by
https://github.com/llvm/llvm-project/commit/a614cd391a402c8682c7b4781121eab07da09ec7.
Skip the tests on Windows to unblock the bots.

https://github.com/llvm/llvm-project/issues/198763
DeltaFile
+1-0lldb/test/API/tools/lldb-dap/launch/TestDAP_launch_stdio_redirection_and_console.py
+1-0lldb/test/API/tools/lldb-dap/runInTerminal/TestDAP_runInTerminal.py
+2-02 files

LLVM/project 960d0e4llvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp GCNSubtarget.h, llvm/lib/Target/AMDGPU/MCTargetDesc AMDGPUMCExpr.cpp

Reland "[AMDGPU] Account for inline asm size in inst_pref_size calculation" (#197227)

This relands commit 7ddee0b619f658cef905a69427ef9531fd1d229d (PR
#192306) which was reverted in 70a70e0ed664 (#197070) due to a missing
MC assembler parser case for the `instprefsize` MCExpr, breaking text
assembly roundtrip tests.

Fix:

- Add `"instprefsize"` to the `StringSwitch` in
`AMDGPUAsmParser::parsePrimaryExpr` so the MC assembler can parse
`instprefsize(...)` expressions emitted by `llc` in text assembly mode.
- Add roundtrip lit tests (`llc -filetype=asm | llvm-mc -filetype=obj |
llvm-objdump`) for both GFX11 and GFX12 to prevent regressions.

Confirmed by compiling the new lit test with the original commit: it
failed there and passes now.

_Original PR description_

    [15 lines not shown]
DeltaFile
+159-0llvm/test/CodeGen/AMDGPU/inst-prefetch-inline-asm.ll
+42-41llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+46-9llvm/test/CodeGen/AMDGPU/inst-prefetch-hint.ll
+45-0llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp
+18-0llvm/lib/Target/AMDGPU/GCNSubtarget.h
+3-14llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
+313-645 files not shown
+335-6811 files

LLVM/project e7a388fflang/lib/Lower/OpenMP OpenMP.cpp, flang/lib/Optimizer/OpenMP DoConcurrentConversion.cpp

[Flang][OpenMP] Add combined construct information

This patch adds the `omp.combined` attribute to OpenMP dialect
operations following changes to the `ComposableOpInterface`.

This attribute is added to operations representing non-innermost leaf
constructs of a combined construct and to standalone block-associated
constructs that can be combined with their parent construct.

Changes are made to the OpenMP lowering logic, as well as the
do-concurrent, workshare and workdistribute transformation passes.
DeltaFile
+1,094-0flang/test/Lower/OpenMP/compound.f90
+56-20flang/lib/Lower/OpenMP/OpenMP.cpp
+6-6flang/test/Transforms/DoConcurrent/use_loop_bounds_in_body.f90
+5-5flang/test/Transforms/DoConcurrent/local_device.mlir
+4-4flang/test/Transforms/DoConcurrent/reduce_device.mlir
+6-2flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
+1,171-3727 files not shown
+1,225-7133 files

LLVM/project 51279e7clang/lib/StaticAnalyzer/Checkers CStringChecker.cpp, clang/test/Analysis bstring_UninitRead.c

[analyzer] Fix false positive in CStringChecker for offset buffer arguments (#198346)

CStringChecker::checkInit() was checking the wrong array elements when
the buffer argument pointed into the middle of an array (e.g.,
memcpy(dst, &arr[i], size)). It was called with BufEnd instead of
BufStart, making the ElementRegion index off by (size-1), and the
element lookups were relative to array index 0 instead of the actual
buffer start offset.
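
The effect of looking up elements relative to index 0 instead of the buffer's start offset can be shown with a small model. This is a simplified Python sketch, not the analyzer's code: it tracks initialization per array element rather than per byte, and both function names are hypothetical.

```python
def first_uninit_read(initialized, start, count):
    """Return the index of the first uninitialized element in the range read.

    The range begins at `start` (the buffer's offset into the array), not at
    index 0. Returns None when every element in the range is initialized.
    """
    for index in range(start, start + count):
        if not initialized[index]:
            return index
    return None

def first_uninit_from_zero(initialized, count):
    """The kind of lookup described as wrong: it ignores the start offset."""
    return first_uninit_read(initialized, 0, count)

# arr[0] and arr[1] are uninitialized; arr[2] through arr[5] are initialized.
initialized = [False, False, True, True, True, True]

# Model of memcpy(dst, &arr[2], 4 elements): only arr[2..5] are read.
correct = first_uninit_read(initialized, 2, 4)
spurious = first_uninit_from_zero(initialized, 4)
```

Here the offset-aware check finds nothing to report, while the index-0 lookup flags `arr[0]`, which the copy never reads.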
DeltaFile
+27-5clang/test/Analysis/bstring_UninitRead.c
+7-9clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+34-142 files

LLVM/project 068c6c5mlir/include/mlir/Dialect/Tosa/IR TosaOps.h

[mlir][tosa] Remove unused `MulOperandsAndResultElementType` trait (#197968)

Removes an unused trait implementation.
DeltaFile
+0-47mlir/include/mlir/Dialect/Tosa/IR/TosaOps.h
+0-471 files

LLVM/project 05deb27clang/include/clang/Serialization ASTRecordReader.h, clang/lib/AST ASTContext.cpp Type.cpp

trivial changes
DeltaFile
+20-14clang/lib/Sema/SemaOpenMP.cpp
+18-14clang/lib/AST/ASTContext.cpp
+16-15clang/lib/Sema/SemaTemplate.cpp
+14-11clang/lib/AST/Type.cpp
+14-8clang/lib/AST/ASTDiagnostic.cpp
+11-6clang/include/clang/Serialization/ASTRecordReader.h
+93-6833 files not shown
+202-15239 files

LLVM/project 5739010clang/include/clang/AST ASTContext.h, clang/lib/AST ASTContext.cpp ItaniumMangle.cpp

[clang] implement CWG2064: ignore value dependence for decltype

The 'decltype' of a value-dependent (but non-type-dependent) expression should
be known, so this patch makes them non-opaque instead.

This patch also implements what's necessary to allow overloading
on pure differences in instantiation dependence, making `std::void_t`
usable for SFINAE purposes.

This also re-adds a few test cases from da98651, which was a previous attempt
at resolving CWG2064.

Fixes #8740
Fixes #61818
Fixes #190388
DeltaFile
+888-161clang/lib/AST/ASTContext.cpp
+328-12clang/test/SemaTemplate/instantiation-dependence.cpp
+176-96clang/lib/AST/ItaniumMangle.cpp
+100-98clang/lib/Sema/SemaCXXScopeSpec.cpp
+62-57clang/lib/AST/Type.cpp
+88-11clang/include/clang/AST/ASTContext.h
+1,642-43570 files not shown
+2,398-78976 files

LLVM/project 694e4c7llvm/test/Transforms/EarlyCSE/AArch64 intrinsics-1xN.ll

fixup! [AArch64][TTI][EarlyCSE] Add support for ld1xN and st1xN intrinsics
DeltaFile
+24-24llvm/test/Transforms/EarlyCSE/AArch64/intrinsics-1xN.ll
+24-241 files

LLVM/project 545b328llvm/lib/Transforms/Vectorize VPlan.h LoopVectorize.cpp

[VPlan] Strip VPRecipeBase::isScalarCast (NFC) (#197695)

This is done in preparation to consolidate more recipes into
VPInstruction.
DeltaFile
+6-7llvm/lib/Transforms/Vectorize/VPlan.h
+4-5llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+2-7llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+12-193 files

LLVM/project 3e15a19offload/test CMakeLists.txt, offload/test/unit lit.cfg.py lit.site.cfg.in

[Offload] fix OffloadAPI unittests discovery (#198750)

Commit 3383f0d repointed LIBOMPTARGET_LIBRARY_DIR to a different
runtimes lib dir, but the unit lit config still derived the unittest
binary path from it. Pass the unittest directory explicitly instead.
DeltaFile
+1-1offload/test/unit/lit.cfg.py
+1-0offload/test/CMakeLists.txt
+1-0offload/test/unit/lit.site.cfg.in
+3-13 files

LLVM/project a8c6535mlir/docs PrivateNameObfuscation.md, mlir/include/mlir/TableGen PrivateName.h

strip op and pass names
DeltaFile
+175-0mlir/test/mlir-tblgen/private-name-obfuscation.td
+174-0mlir/docs/PrivateNameObfuscation.md
+139-0mlir/tools/mlir-tblgen/PrivateName.cpp
+74-0mlir/test/mlir-tblgen/private-pass-obfuscation.td
+59-0mlir/include/mlir/TableGen/PrivateName.h
+39-8mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
+660-817 files not shown
+813-3623 files

LLVM/project 08abd96llvm/test/CodeGen/X86 sad.ll sad_variations.ll, llvm/test/Transforms/PhaseOrdering/X86 sad.ll sad_variations.ll

[X86] Update PSADBW tests to more closely match middle-end vector.reduce.add codegen (#198760)

The middle-end will detect vector.reduce.add patterns, so update the
CodeGen tests to use the intrinsics directly and add PhaseOrdering tests
to ensure the vector.reduce.add intrinsics are created.
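
For context, the computation these tests exercise is a sum of absolute differences expressed as an add-reduction over per-element absolute differences. A minimal Python sketch of that shape follows; the function names and the sample byte values are illustrative assumptions, not taken from the tests.

```python
def reduce_add(values):
    """Model of an add-reduction over all lanes of a vector."""
    return sum(values)

def sum_abs_diff(a, b):
    """Sum of absolute differences between two equally long byte sequences."""
    return reduce_add(abs(x - y) for x, y in zip(a, b))

# Eight unsigned byte lanes per operand (sample values, chosen arbitrarily).
a = [10, 200, 3, 40, 5, 60, 7, 80]
b = [12, 190, 3, 45, 0, 66, 7, 70]
sad = sum_abs_diff(a, b)
```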
DeltaFile
+658-0llvm/test/Transforms/PhaseOrdering/X86/sad.ll
+259-0llvm/test/Transforms/PhaseOrdering/X86/sad_variations.ll
+14-106llvm/test/CodeGen/X86/sad.ll
+7-49llvm/test/CodeGen/X86/sad_variations.ll
+938-1554 files

LLVM/project 898dd90clang/include/clang/Serialization ASTRecordReader.h, clang/lib/AST ASTContext.cpp Type.cpp

trivial changes
DeltaFile
+20-14clang/lib/Sema/SemaOpenMP.cpp
+18-14clang/lib/AST/ASTContext.cpp
+16-15clang/lib/Sema/SemaTemplate.cpp
+14-11clang/lib/AST/Type.cpp
+14-8clang/lib/AST/ASTDiagnostic.cpp
+11-6clang/include/clang/Serialization/ASTRecordReader.h
+93-6833 files not shown
+202-15239 files

LLVM/project 4256608clang/include/clang/AST ASTContext.h, clang/lib/AST ASTContext.cpp ItaniumMangle.cpp

[clang] implement CWG2064: ignore value dependence for decltype

The 'decltype' of a value-dependent (but non-type-dependent) expression should
be known, so this patch makes them non-opaque instead.

This patch also implements what's necessary to allow overloading
on pure differences in instantiation dependence, making `std::void_t`
usable for SFINAE purposes.

This also re-adds a few test cases from da98651, which was a previous attempt
at resolving CWG2064.

Fixes #8740
Fixes #61818
Fixes #190388
DeltaFile
+888-161clang/lib/AST/ASTContext.cpp
+328-12clang/test/SemaTemplate/instantiation-dependence.cpp
+176-96clang/lib/AST/ItaniumMangle.cpp
+100-98clang/lib/Sema/SemaCXXScopeSpec.cpp
+62-57clang/lib/AST/Type.cpp
+88-11clang/include/clang/AST/ASTContext.h
+1,642-43570 files not shown
+2,396-78776 files

LLVM/project b588ad8flang/test/Integration/OpenMP atomic-compare.f90

[Flang][tests] Add a missing REQUIRES. (#198753)

A newly added test uses `x86_64-unknown-linux-gnu` as a triple without
a `REQUIRES: x86-registered-target` line, so it fails in builds of LLVM
configured only for other architectures.
DeltaFile
+1-0flang/test/Integration/OpenMP/atomic-compare.f90
+1-01 files

LLVM/project 77cdd6cllvm/lib/Target/AArch64 AArch64SchedA64FX.td, llvm/test/tools/llvm-mca/AArch64/A64FX A64FX-sve-instructions.s

[AArch64] Fix fmaxv/fminv/fmaxnmv/fminnmv/lasta/lastb sched info in A64FX (#198483)

I've been experimenting with a new TableGen warning on unused defs and
it found a couple of bugs in the A64FX scheduling model [1]:

llvm/lib/Target/AArch64/AArch64SchedA64FX.td:2288:5: warning: def
'A64FXWrite_FMAXVD' appears to be unused
llvm/lib/Target/AArch64/AArch64SchedA64FX.td:2334:5: warning: def
'A64FXWrite_LAST_R' appears to be unused

It looks like similarly named defs were used where they should have been
and the microarchitecture manual [2] seems to confirm it.

[1] https://raw.githubusercontent.com/c-rhodes/llvm-project/860eb23fae9bd40b36bcc56534f3d43b36522173/tblgen-unused-defs-warnings.unique.txt
[2] https://github.com/fujitsu/A64FX/blob/master/doc/A64FX_Microarchitecture_Manual_en_1.8.pdf
DeltaFile
+17-17llvm/test/tools/llvm-mca/AArch64/A64FX/A64FX-sve-instructions.s
+2-2llvm/lib/Target/AArch64/AArch64SchedA64FX.td
+19-192 files

LLVM/project 0425d1ellvm/lib/Target/AArch64 AArch64TargetTransformInfo.cpp AArch64TargetTransformInfo.h, llvm/test/Transforms/EarlyCSE/AArch64 intrinsics-1xN.ll

[AArch64][TTI][EarlyCSE] Add support for ld1xN and st1xN intrinsics

Handle ld1x2, ld1x3, ld1x4, st1x2, st1x3, st1x4 in:
- AArch64TTIImpl::getTgtMemIntrinsic
- AArch64TTIImpl::getOrCreateResultFromMemIntrinsic

This enables EarlyCSE to optimize these NEON load/store intrinsics.

To test the changes, a new testcase (intrinsics-1xN.ll) derived from
llvm/test/Transforms/EarlyCSE/AArch64/intrinsics.ll is added.
DeltaFile
+365-0llvm/test/Transforms/EarlyCSE/AArch64/intrinsics-1xN.ll
+28-3llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+0-6llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
+393-93 files

LLVM/project 8736eeamlir/docs PrivateNameObfuscation.md, mlir/include/mlir/TableGen PrivateName.h

strip op and pass names
DeltaFile
+175-0mlir/test/mlir-tblgen/private-name-obfuscation.td
+170-0mlir/docs/PrivateNameObfuscation.md
+140-0mlir/tools/mlir-tblgen/PrivateName.cpp
+74-0mlir/test/mlir-tblgen/private-pass-obfuscation.td
+59-0mlir/include/mlir/TableGen/PrivateName.h
+39-8mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
+657-817 files not shown
+808-3623 files