LLVM/project 0a39d1fllvm/utils/gn/secondary/llvm/lib/Target/NVPTX BUILD.gn

[gn build] Port 1bada0af22d8
DeltaFile
+1-0llvm/utils/gn/secondary/llvm/lib/Target/NVPTX/BUILD.gn
+1-01 files

LLVM/project 1dacdbeclang/include/clang/Basic LangOptions.h, clang/lib/Sema SemaDeclCXX.cpp

[Clang] Export inline move constructors in dllexport-ed template instantiations on non-MSVC targets (#168170)

Previously, even when MSVC compatibility was not requested, inline move
constructors in dllexport-ed templates were not exported, which was
seemingly unintended.
On non-MSVC targets (MinGW, Cygwin, and PS), such move constructors
should be exported consistently with copy constructors and with the
behavior of modern MSVC.
DeltaFile
+17-0clang/test/CodeGenCXX/mingw-template-dllexport.cpp
+6-4clang/test/CodeGenCXX/dllimport.cpp
+2-0clang/include/clang/Basic/LangOptions.h
+1-0clang/test/CodeGenCXX/dllexport.cpp
+1-0clang/lib/Sema/SemaDeclCXX.cpp
+27-45 files

LLVM/project 0aa8b82lldb/docs/use map.rst

[lldb][docs] Fix plaintext markers in command map

RST tries to resolve single backticks to a reference.
Double backticks mean plaintext.

Fixes these warnings:
map.rst:803: WARNING: 'any' reference target not found: target.prefer-dynamic-value
map.rst:814: WARNING: 'any' reference target not found: expr
DeltaFile
+2-2lldb/docs/use/map.rst
+2-21 files

LLVM/project ab8208flldb/docs/resources build.rst

[lldb][docs] Fix Visual Studio link in build doc

Fixes warning:
build.rst:107: WARNING: 'any' reference target not found: https://visualstudio.microsoft.com
DeltaFile
+1-1lldb/docs/resources/build.rst
+1-11 files

LLVM/project fa60765clang/lib/CIR/CodeGen CIRGenBuiltinX86.cpp, clang/test/CIR/CodeGenBuiltins/X86 avx512vlvbmi2-builtins.c avx512vl-builtins.c

[CIR][CIRGen][Builtin][X86] Masked compress Intrinsics (#169582)

Added masked compress builtin in CIR.
Note: This is my first PR to llvm. Looking forward to corrections

---------

Co-authored-by: bhuvan1527 <balabhuvanvarma at gmail.com>
DeltaFile
+171-0clang/test/CIR/CodeGenBuiltins/X86/avx512vlvbmi2-builtins.c
+33-0clang/test/CIR/CodeGenBuiltins/X86/avx512vl-builtins.c
+25-6clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp
+229-63 files

LLVM/project 04a5ee6clang/include/clang/Basic BuiltinsAMDGPU.def

[AMDGPU] Modifies builtin def to take _Float16('x') for both HIP/C++ and for OpenCL (#167652)

For extended image insts amdgcn_image_sample_*_/gather4_* builtins,
using 'x' in the builtin def so that it will take _Float16 for both
HIP/C++ and OpenCL.
DeltaFile
+17-17clang/include/clang/Basic/BuiltinsAMDGPU.def
+17-171 files

LLVM/project ef47462llvm/lib/Target/SPIRV SPIRVGlobalRegistry.cpp SPIRVCommandLine.cpp, llvm/test/CodeGen/SPIRV/extensions/SPV_ALTERA_arbitrary_precision_integers i128-addsub.ll i128-arith.ll

[SPIRV] Start adding support for `int128` (#170798)

LLVM has pretty thorough support for `int128`, and it has started seeing
some use. Even though we already have support for the
`SPV_ALTERA_arbitrary_precision_integers` extension, the BE was oddly
capping integer width to 64-bits. This patch adds partial support for
lowering 128-bit integers to `OpTypeInt 128`. Some work remains to be
done around legalisation support and validating constant uses (e.g.
cases that get lowered to `OpSpecConstantOp`).
DeltaFile
+67-0llvm/test/CodeGen/SPIRV/extensions/SPV_ALTERA_arbitrary_precision_integers/i128-addsub.ll
+27-0llvm/test/CodeGen/SPIRV/extensions/SPV_ALTERA_arbitrary_precision_integers/i128-arith.ll
+27-0llvm/test/CodeGen/SPIRV/extensions/SPV_ALTERA_arbitrary_precision_integers/i128-switch-lower.ll
+10-10llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.cpp
+17-2llvm/lib/Target/SPIRV/SPIRVCommandLine.cpp
+15-0llvm/lib/Target/SPIRV/SPIRVModuleAnalysis.cpp
+163-124 files not shown
+181-1310 files

LLVM/project 4f79552clang/lib/Headers avxvnniint16intrin.h, llvm/lib/IR AutoUpgrade.cpp

[x86][AVX-VNNI] Fix VPDPWXXD Argument Types (#169456)

Fixed the argument types of the following intrinsics to match with the
ISA:
 - vpdpwssd_128, vpdpwssd_256, vpdpwssd_512,
 - vpdpwssds_128, vpdpwssds_256, vpdpwssds_512
 - vpdpwsud_128, vpdpwsud_256, vpdpwsud_512
 - vpdpwsuds_128, vpdpwsuds_256, vpdpwsuds_512
 - vpdpwusd_128, vpdpwusd_256, vpdpwusd_512
 - vpdpwusds_128, vpdpwusds_256, vpdpwusds_512
 - vpdpwuud_128, vpdpwuud_256, vpdpwuud_512
 - vpdpwuuds_128, vpdpwuuds_256, vpdpwuuds_512

Fixes #97271. Note that this is the last PR for the issue.
DeltaFile
+260-116llvm/test/Instrumentation/MemorySanitizer/X86/avxvnniint16-intrinsics.ll
+252-108llvm/test/Instrumentation/MemorySanitizer/X86/avx10_2ni-intrinsics.ll
+199-91llvm/test/Instrumentation/MemorySanitizer/X86/avx10_2_512ni-intrinsics.ll
+118-99clang/lib/Headers/avxvnniint16intrin.h
+185-0llvm/test/CodeGen/X86/avxvnniint16-intrinsics-upgrade.ll
+153-30llvm/lib/IR/AutoUpgrade.cpp
+1,167-44429 files not shown
+1,870-90035 files

LLVM/project 1bada0allvm/lib/Target/NVPTX NVPTXIRPeephole.cpp NVPTXTargetMachine.cpp, llvm/test/CodeGen/NVPTX nvptx-fold-fma.ll

[NVPTX] Add IR pass for FMA transformation in the llc pipeline (#154735)

This change introduces a new IR pass in the llc pipeline for NVPTX that
transforms sequences of FMUL followed by FADD or FSUB into a single FMA
instruction.

Currently, all FMA folding for NVPTX occurs at the DAGCombine stage,
which is too late for any IR-level passes that might want to optimize or
analyze FMAs. By moving this transformation earlier into the IR phase,
we enable more opportunities for FMA folding, including across basic
blocks.

Additionally, this new pass relies on the contract instruction level
fast-math flag to perform these transformations, rather than depending
on the -fp-contract=fast or -enable-unsafe-fp-math options passed to
llc.
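
The folding described above can be illustrated with a toy model. This is only a sketch and not the code in NVPTXIRPeephole.cpp: the `Inst` record and `fold_fma` helper are hypothetical, only the FADD case is handled, and it assumes that both instructions must carry the contract flag and that the FMUL has a single use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    name: str                # name of the value this instruction defines
    op: str                  # "fmul", "fadd" or "fma" in this toy model
    operands: tuple          # names of the operand values
    contract: bool = False   # stands in for the per-instruction contract flag


def fold_fma(insts):
    """Fold `fmul` feeding a single `fadd` into `fma` when both allow contraction."""
    defs = {inst.name: inst for inst in insts}
    uses = {}
    for inst in insts:
        for operand in inst.operands:
            uses[operand] = uses.get(operand, 0) + 1

    out = []
    dead = set()  # multiplies that were absorbed into an fma
    for inst in insts:
        folded = None
        if inst.op == "fadd" and inst.contract:
            for idx, operand in enumerate(inst.operands):
                mul = defs.get(operand)
                if mul and mul.op == "fmul" and mul.contract and uses[operand] == 1:
                    addend = inst.operands[1 - idx]
                    folded = Inst(inst.name, "fma", mul.operands + (addend,), True)
                    dead.add(mul.name)
                    break
        out.append(folded or inst)
    return [inst for inst in out if inst.name not in dead]
```

Because the decision looks only at flags stored on the instructions, the same input folds or does not fold regardless of any global option, which is the property the message above describes.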
DeltaFile
+247-0llvm/test/CodeGen/NVPTX/nvptx-fold-fma.ll
+167-0llvm/lib/Target/NVPTX/NVPTXIRPeephole.cpp
+10-0llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
+6-0llvm/lib/Target/NVPTX/NVPTX.h
+1-0llvm/lib/Target/NVPTX/CMakeLists.txt
+1-0llvm/lib/Target/NVPTX/NVPTXPassRegistry.def
+432-06 files

LLVM/project 9b12f8flldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/shared_ptr TestDataFormatterStdSharedPtr.py, lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/unique_ptr TestDataFormatterStdUniquePtr.py

[LLDB] Run MSVC STL smart pointer tests with PDB (#166946)

Runs the `std::shared/unique_ptr` tests with PDB with two changes:

- PDB uses the "full" name, so `std::string` is `std::basic_string<char,
std::char_traits<char>, std::allocator<char>>`
- The type of the pointer inside the shared/unique_ptr isn't the
`element_type` typedef
DeltaFile
+12-2lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/shared_ptr/TestDataFormatterStdSharedPtr.py
+11-1lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/unique_ptr/TestDataFormatterStdUniquePtr.py
+23-32 files

LLVM/project e6145e8clang/lib/CIR/CodeGen CIRGenExprScalar.cpp

[CIR][NFC] Add stubs for missing visitors in ScalarExprEmitter (#171222)

This adds stubs that issue NYI errors for any visitor that is present in
the ClangIR incubator but missing in the upstream implementation. This
will make it easier to find the correct locations to implement missing
functionality.
DeltaFile
+169-1clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
+169-11 files

LLVM/project b3b033bclang/lib/CIR/CodeGen CIRGenStmt.cpp

[CIR][NFC] Fix bad switch fallthroughs in emitStmt (#171224)

This moves a couple of statement emitters that were incorrectly
implemented in the middle of a switch statement where all cases in the
final group are intended to fall through to a handler that emits an NYI
error message. The placement of these implementations was causing some
statement types that should have emitted the NYI error to instead go to
a handler for a different statement type.
DeltaFile
+4-4clang/lib/CIR/CodeGen/CIRGenStmt.cpp
+4-41 files

LLVM/project 632cbeellvm/lib/Target/RISCV RISCVISelLowering.cpp

[RISCV] Use VM and VMNoV0 for "vr" and "vd" inline asm constraints with mask type. (#171235)

The inline assembly handling in SelectionDAG uses the first type
for the register class as the type at the input/output of the
inline assembly. If this isn't the type for the surrounding DAG,
it needs to be converted.

nxv8i8 is the first type for the VR and VRNoV0 register classes.
So we currently generate insert/extract_subvector and bitcasts to
convert to/from nxv8i8.

I believe some of the special casing we have for this in
splitValueIntoRegisterParts and joinRegisterPartsIntoValue is causing
us to also generate incorrect code for arguments with nxv16i4 types
that should be any extended to nxv16i8. Instead we widen them to nxv32i4
and bitcast to nxv16i8.
    
This patch uses VM and VMNoV0 for masks which have nxv64i1 as their
first type. This means we will only emit an insert/extract_subvector

    [5 lines not shown]
DeltaFile
+17-15llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+17-151 files

LLVM/project 2e16f24llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/rvv select.mir, llvm/test/CodeGen/RISCV/rvv vmv.v.v-peephole.mir vleff-vlseg2ff-output.ll

[RISCV] Add VMNoV0 register class with only the VMaskVTs. (#171231)

I plan to use this for inline assembly "vd" constraints with mask types
in a follow up patch. Due to the test changes I wanted to post this
separately.
DeltaFile
+10-10llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/rvv/select.mir
+3-3llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.mir
+2-2llvm/test/CodeGen/RISCV/rvv/vleff-vlseg2ff-output.ll
+2-2llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-to-vmv.mir
+2-2llvm/test/CodeGen/RISCV/rvv/vmerge-peephole.mir
+2-2llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.mir
+21-213 files not shown
+24-239 files

LLVM/project 0c0ed39lldb/source/Plugins/LanguageRuntime/ObjC/AppleObjCRuntime AppleObjCClassDescriptorV2.cpp AppleObjCClassDescriptorV2.h

[lldb] Don't read firstSubclass and nextSiblingClass from class_rw_t (#171213)

We're considering modifying the ObjC runtime's class_rw_t structure to
remove the firstSubclass and nextSiblingClass fields in some cases. LLDB
is currently reading those but not actually using them. Stop doing that
to avoid issues if they are removed by the runtime.

rdar://166084122
DeltaFile
+1-4lldb/source/Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCClassDescriptorV2.cpp
+0-3lldb/source/Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCClassDescriptorV2.h
+1-72 files

LLVM/project 5236af8mlir/lib/Dialect/XeGPU/IR XeGPUDialect.cpp, mlir/lib/Dialect/XeGPU/Transforms XeGPUSubgroupDistribute.cpp XeGPUPropagateLayout.cpp

[MLIR][XeGPU] Extend propagation and sg_to_lane distribution passes to support broadcast with low-rank and scalar source input (#170409)

This PR extends XeGPU layout propagation and distribution for
vector.broadcast operation.
It relaxes the restriction of layout propagation to allow low-rank and
scalar source input, and adds a pattern in sg-to-wi distribution to
support the lowering.
DeltaFile
+161-2mlir/lib/Dialect/XeGPU/Transforms/XeGPUSubgroupDistribute.cpp
+143-0mlir/lib/Dialect/XeGPU/IR/XeGPUDialect.cpp
+65-0mlir/test/Dialect/XeGPU/subgroup-distribute-unit.mlir
+61-0mlir/test/Dialect/XeGPU/subgroup-distribute.mlir
+58-0mlir/test/Dialect/XeGPU/propagate-layout.mlir
+31-15mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+519-172 files not shown
+552-208 files

LLVM/project 94ebcfdmlir/lib/Dialect/Vector/Transforms VectorTransforms.cpp, mlir/test/Dialect/Vector vector-sink.mlir

[mlir][vector] Fix crash in ReorderCastOpsOnBroadcast with non-vector result (#170985)

Fixes a crash in `ReorderCastOpsOnBroadcast` by ensuring the cast result
is a `VectorType` before applying the pattern.
A regression test has been added to
mlir/test/Dialect/Vector/vector-sink.mlir.

Fixes: #126371
DeltaFile
+15-0mlir/test/Dialect/Vector/vector-sink.mlir
+2-0mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
+17-02 files

LLVM/project a033183compiler-rt/test/sanitizer_common/TestCases/Linux soft_rss_limit_mb_test.cpp

[compiler-rt] Try bumping soft_rss_limit again (#171469)

This is still failing on some of the bots. Try bumping the limit again
to see if this fixes things.
DeltaFile
+3-3compiler-rt/test/sanitizer_common/TestCases/Linux/soft_rss_limit_mb_test.cpp
+3-31 files

LLVM/project 76cffd3.ci generate_test_report_lib_test.py generate_test_report_lib.py

[CI] Tweak wording for builds with passing tests and build errors (#171436)

"All tests passed" is too easily interpreted as every possible test was
run and was fine. A lot of the time it means all the tests that didn't
fail to build ran and were fine.

Maybe the wording is still too subtle but at least it hints at the idea
that the tests run might be fewer than if the build had no compilation
errors.
DeltaFile
+4-4.ci/generate_test_report_lib_test.py
+2-2.ci/generate_test_report_lib.py
+6-62 files

LLVM/project 26283a6llvm/include/llvm/CodeGen TargetLoweringObjectFileImpl.h, llvm/include/llvm/MC MCSectionGOFF.h

[SystemZ] Implement ctor/dtor emission via @@SQINIT and .xtor sections

This patch implements support for constructors/destructors by introducing the
@@SQINIT section and emitting .xtor.<priority> sections within the SystemZ
AsmPrinter and in the GOFF object lowering layer. Improvements to ADA descriptor
handling are also made within this change.
DeltaFile
+79-10llvm/lib/Target/SystemZ/SystemZAsmPrinter.cpp
+49-0llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
+36-0llvm/test/CodeGen/SystemZ/zos_sinit.ll
+9-5llvm/lib/Target/SystemZ/SystemZAsmPrinter.h
+12-0llvm/include/llvm/MC/MCSectionGOFF.h
+4-0llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h
+189-151 files not shown
+190-157 files

LLVM/project c61a481llvm/lib/Transforms/Vectorize VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize hoist-predicated-loads-with-predicated-stores.ll

[VPlan] Use SCEV to prove non-aliasing for stores at different offsets. (#170347)

Extend the logic added in https://github.com/llvm/llvm-project/pull/168771
to also allow sinking stores past stores in the same noalias set by
checking if we can prove no-alias via the distance between accesses,
checked via SCEV.
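
The distance argument can be sketched with plain integers. This is an illustration only and not the VPlan code: `provably_disjoint` is a hypothetical helper, and it assumes both accesses share a base pointer and that their offsets and sizes in bytes are known constants, where the real patch derives the distance via SCEV.

```python
def provably_disjoint(offset_a, size_a, offset_b, size_b):
    """Return True if the byte ranges [offset, offset + size) cannot overlap.

    Both offsets are assumed to be relative to the same base pointer.
    """
    a_ends_before_b = offset_a + size_a <= offset_b
    b_ends_before_a = offset_b + size_b <= offset_a
    return a_ends_before_b or b_ends_before_a
```

For example, two 4-byte stores at offsets 0 and 4 are disjoint, while an 8-byte store at offset 0 overlaps a 4-byte store at offset 4.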

PR: https://github.com/llvm/llvm-project/pull/170347
DeltaFile
+81-15llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+18-30llvm/test/Transforms/LoopVectorize/hoist-predicated-loads-with-predicated-stores.ll
+99-452 files

LLVM/project 1a66474clang/test/CodeGen/AArch64 fmv-explicit-priority.c

[clang][FMV][AArch64] Remove O3 from failing test (#171457)

This fixes the buildbot failures from
https://github.com/llvm/llvm-project/pull/150267.

I could not reproduce them locally but my intuition suggests that the
-O3 option on the RUN line behaves inconsistently on different hosts
judging from the error logs.

My intention was to run an integration test which will use llvm's
globalopt pass, but there's no need actually. We have unittests in place
for it.
DeltaFile
+76-134clang/test/CodeGen/AArch64/fmv-explicit-priority.c
+76-1341 files

LLVM/project d922bd8flang/test/Lower/OpenMP do-simd-firstprivate-lastprivate.f90, llvm/lib/Target/Hexagon HexagonVectorCombine.cpp

Merge branch 'main' into users/kparzysz/dims-modifier
DeltaFile
+261-0llvm/test/Transforms/InstCombine/AArch64/tbl.ll
+215-0llvm/test/Transforms/InstCombine/ARM/tbl.ll
+113-23llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+89-0flang/test/Lower/OpenMP/do-simd-firstprivate-lastprivate.f90
+0-65llvm/test/Transforms/InstCombine/AArch64/tbl1.ll
+12-43llvm/lib/Target/Hexagon/HexagonVectorCombine.cpp
+690-13123 files not shown
+801-25329 files

LLVM/project 7f2bbballvm/lib/Transforms/InstCombine InstCombineCalls.cpp, llvm/test/Transforms/InstCombine/AArch64 tbl.ll tbl1.ll

[AArch64][ARM] Optimize more `tbl`/`tbx` calls into `shufflevector` (#169748)

Resolves #169701.

This PR extends the existing InstCombine operation which folds `tbl1`
intrinsics to `shufflevector` if the mask operand is constant. Before
this change, it only handled 64-bit `tbl1` intrinsics with no
out-of-bounds indices. I've extended it to support both 64-bit and
128-bit vectors, and it now handles the full range of `tbl1`-`tbl4` and
`tbx1`-`tbx4`, as long as at most two of the input operands are actually
indexed into.

For the purposes of `tbl`, we need a dummy vector of zeroes if there are
any out-of-bounds indices, and for the purposes of `tbx`, we use the
"fallback" operand. Both of those take up an operand for the purposes of
`shufflevector`.
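
A small model of the single-table case shows why the zero vector takes up an operand. This is a sketch over Python lists and not the InstCombine implementation: all three function names are hypothetical, and indices are assumed to be non-negative, as they are for byte lanes.

```python
def tbl1(table, indices):
    """Reference lookup: an out-of-range index produces 0."""
    return [table[i] if i < len(table) else 0 for i in indices]


def shufflevector(a, b, mask):
    """Pick lanes from the concatenation of two equally sized vectors."""
    joined = a + b
    return [joined[m] for m in mask]


def tbl1_as_shuffle(table, indices):
    """Express the lookup as a two-operand shuffle with a constant mask."""
    n = len(table)
    zeros = [0] * n
    # Redirect every out-of-range index to lane n, the first lane of `zeros`.
    mask = [i if i < n else n for i in indices]
    return shufflevector(table, zeros, mask)
```

With the zero vector occupying the second shuffle operand, only one slot is left for a real table, which matches the limit on indexed operands described above.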

This works a lot like https://github.com/llvm/llvm-project/pull/169110,
with some added complexity because we need to handle multiple operands.

    [11 lines not shown]
DeltaFile
+261-0llvm/test/Transforms/InstCombine/AArch64/tbl.ll
+215-0llvm/test/Transforms/InstCombine/ARM/tbl.ll
+113-23llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+0-65llvm/test/Transforms/InstCombine/AArch64/tbl1.ll
+0-35llvm/test/Transforms/InstCombine/ARM/tbl1.ll
+589-1235 files

LLVM/project c66eb25llvm/bindings/ocaml/llvm llvm_ocaml.c

[OCaml] Fix build

Fix a mistake introduced in https://github.com/llvm/llvm-project/pull/163979:

We should stick with the deprecated LLVMGetGlobalContext() API
in this file, as getGlobalContextForCAPI() is a C++ API that is
not available here.
DeltaFile
+1-1llvm/bindings/ocaml/llvm/llvm_ocaml.c
+1-11 files

LLVM/project b3a5870llvm/docs ReleaseNotes.md

[llvm][docs] Add a release note for LLDB "version -v"

Added by #170772.
DeltaFile
+4-0llvm/docs/ReleaseNotes.md
+4-01 files

LLVM/project 6b58449clang/utils/ClangVisualizers clang.natvis

Update the NATVIS file

ElaboratedType is no longer a thing.
DeltaFile
+0-14clang/utils/ClangVisualizers/clang.natvis
+0-141 files

LLVM/project b2ddb90libcxx/src/include refstring.h

[libc++] Don't try to be compatible with libstdc++ in __libcpp_refstring on iOS (#170816)

iOS doesn't provide a libstdc++ dylib anymore, so we can remove the
compatibility check whether we can load the dylib.
DeltaFile
+2-2libcxx/src/include/refstring.h
+2-21 files

LLVM/project 51d928fbolt/test/runtime/AArch64 pacret-synchronous-unwind.cpp

[BOLT] Fix pacret-synchronous-unwind.cpp test (#171395)

The test case builds a binary from C++, and checks for the number of
functions the PointerAuthCFIFixup pass runs on.
This can change based on the platform. To account for this, the patch
changes the number to a regex.
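
The idea of matching any count instead of a fixed one can be sketched as follows. The log line format here is made up for illustration and is not the actual BOLT output, and `reports_function_count` is a hypothetical helper.

```python
import re

# Hypothetical log format; the real test checks BOLT's own output.
COUNT_LINE = re.compile(r"pass ran on (\d+) functions")


def reports_function_count(line):
    """Accept any decimal function count rather than one exact number."""
    return COUNT_LINE.search(line) is not None
```

A check written this way passes whether a platform's toolchain yields 3 functions or 30, while still failing if the line is missing entirely.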

The test failed when running on RHEL 9.
DeltaFile
+9-6bolt/test/runtime/AArch64/pacret-synchronous-unwind.cpp
+9-61 files

LLVM/project 1007280bolt/test/runtime/AArch64 pacret-synchronous-unwind.cpp

format
DeltaFile
+6-4bolt/test/runtime/AArch64/pacret-synchronous-unwind.cpp
+6-41 files