LLVM/project 7927040 llvm/lib/CodeGen/AsmPrinter DwarfCompileUnit.cpp, llvm/test/DebugInfo/X86 dwarf-call-target-clobbered.mir dwarf-call-target-mem-loc.mir

[DebugInfo][DWARF] Use DW_AT_call_target_clobbered for exprs with volatile regs (#172167)

Without this patch DW_AT_call_target is used for all indirect call address
location expressions. The DWARF spec says:

    For indirect calls or jumps where the address is not computable without use
    of registers or memory locations that might be clobbered by the call the
    DW_AT_call_target_clobbered attribute is used instead of the
    DW_AT_call_target attribute.

This patch implements that behaviour.
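
As a rough illustration of the rule quoted above (not the code in
DwarfCompileUnit.cpp), the choice between the two attributes can be sketched
as a check over the registers that the call-target expression reads. The
function name, the string register names, and the callee-saved set are all
hypothetical; memory operands are not modelled here.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a target's set of callee-saved registers.
using RegSet = std::set<std::string>;

// Pick the attribute for an indirect call whose target address is computed
// from the registers in ExprRegs. If any of them is not callee-saved, the
// call may clobber it, so the "clobbered" variant is chosen.
std::string pickCallTargetAttr(const std::vector<std::string> &ExprRegs,
                               const RegSet &CalleeSaved) {
  for (const std::string &Reg : ExprRegs)
    if (CalleeSaved.count(Reg) == 0)
      return "DW_AT_call_target_clobbered";
  return "DW_AT_call_target";
}
```

With a callee-saved set of {rbx, r12}, an expression that reads only rbx maps
to DW_AT_call_target, while one that also reads rax maps to
DW_AT_call_target_clobbered.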
Delta  File
+96-0  llvm/test/DebugInfo/X86/dwarf-call-target-clobbered.mir
+12-5  llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+3-2  llvm/test/DebugInfo/X86/dwarf-call-target-mem-loc.mir
+2-2  llvm/test/DebugInfo/X86/dwarf-callsite-related-attrs-indirect.ll
+113-9  4 files

LLVM/project 2f9bf3f llvm/include/llvm/CodeGenTypes LowLevelType.h, llvm/lib/CodeGen/GlobalISel LegalizerHelper.cpp

[GlobalISel](NFC) Refactor construction of LLTs in `LegalizerHelper` (#170664)

I spotted a number of places where we're duplicating logic provided by
the `LLT` class inline in `LegalizerHelper`. This PR tidies up these
spots.
Delta  File
+22-27  llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+15-1  llvm/include/llvm/CodeGenTypes/LowLevelType.h
+37-28  2 files

LLVM/project 88f602c llvm/test/CodeGen/WinEH wineh-no-demotion.ll

Update llvm/test/CodeGen/WinEH/wineh-no-demotion.ll
Delta  File
+1-1  llvm/test/CodeGen/WinEH/wineh-no-demotion.ll
+1-1  1 files

LLVM/project ea60b34 llvm/lib/Target/LoongArch LoongArchISelLowering.cpp

Use divideCeil
Delta  File
+1-1  llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+1-1  1 files

LLVM/project 25e3c8f llvm/lib/Target/LoongArch LoongArchISelLowering.cpp LoongArchISelLowering.h, llvm/test/CodeGen/LoongArch musttail.ll

Replace addTokenForArgument with getStackArgumentTokenFactor
Delta  File
+24-24  llvm/test/CodeGen/LoongArch/musttail.ll
+11-34  llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+0-6  llvm/lib/Target/LoongArch/LoongArchISelLowering.h
+35-64  3 files

LLVM/project 46ff53d clang/include/clang/Basic BuiltinsAMDGPU.def, clang/test/SemaHIP amdgpu-ds-atomic-fadd.hip

[Clang] Remove 't' from __builtin_amdgcn_ds_atomic_fadd_f32/f64
Delta  File
+6-7  clang/test/SemaHIP/amdgpu-ds-atomic-fadd.hip
+2-2  clang/include/clang/Basic/BuiltinsAMDGPU.def
+8-9  2 files

LLVM/project 8511600 clang/test/SemaHIP amdgpu-ds-atomic-fadd.hip

[NFC][Clang] Add HIP Sema test for __builtin_amdgcn_ds_atomic_fadd_f32/f64
Delta  File
+34-0  clang/test/SemaHIP/amdgpu-ds-atomic-fadd.hip
+34-0  1 files

LLVM/project 9e4907c clang/test/Headers __clang_hip_math.hip

Update clang/test/Headers/__clang_hip_math.hip
Delta  File
+22-22  clang/test/Headers/__clang_hip_math.hip
+22-22  1 files

LLVM/project c19692b llvm/test/Examples/IRTransforms/SimplifyCFG tut-simplify-cfg5-del-phis-for-dead-block.ll, llvm/test/Transforms/LoopUnroll/AArch64 unrolling-multi-exit.ll apple-unrolling-multi-exit.ll

Update non-X86 and Example tests
Delta  File
+7-7  llvm/test/Transforms/SimplifyCFG/RISCV/switch-of-powers-of-two.ll
+3-3  llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll
+2-2  llvm/test/Transforms/LoopUnroll/AArch64/unrolling-multi-exit.ll
+2-2  llvm/test/Transforms/LoopUnroll/AArch64/apple-unrolling-multi-exit.ll
+1-1  llvm/test/Transforms/PhaseOrdering/AArch64/std-find.ll
+1-1  llvm/test/Examples/IRTransforms/SimplifyCFG/tut-simplify-cfg5-del-phis-for-dead-block.ll
+16-16  3 files not shown
+19-19  9 files

LLVM/project f66226d llvm/test/Transforms/DFAJumpThreading dfa-unfold-select.ll

update llvm/test/Transforms/DFAJumpThreading/dfa-unfold-select.ll
Delta  File
+77-84  llvm/test/Transforms/DFAJumpThreading/dfa-unfold-select.ll
+77-84  1 files

LLVM/project 087fff3 llvm/test/Transforms/DFAJumpThreading dfa-jump-threading-transform.ll

update llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll
Delta  File
+8-10  llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll
+8-10  1 files

LLVM/project 26f0d18 llvm/lib/IR Instructions.cpp, llvm/test/Transforms/LoopUnroll runtime-loop-multiple-exits.ll

[IR] Optimize `PHINode::removeIncomingValue()` by swapping with the last incoming value.

Add an optional argument `KeepIncomingOrder` that defaults to false. When `KeepIncomingOrder` is false,
the new implementation simply moves the last incoming value and block into the position of the element being removed.

This improves compile time for PHI nodes with many predecessors.
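
The swap-with-last idea can be sketched on a plain vector. This is only an
illustration under assumed names (`Incoming`, `removeIncoming`), not the code
in Instructions.cpp: the unordered path overwrites the removed slot with the
last element and pops the back, which is O(1), while the order-preserving path
shifts every later element down, which is O(n).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one (value, block) pair of a PHI node.
struct Incoming {
  int Value;
  int Block;
};

// Remove the incoming entry at Idx. When KeepIncomingOrder is false, the
// last entry is moved into the hole instead of shifting the tail.
void removeIncoming(std::vector<Incoming> &Ops, std::size_t Idx,
                    bool KeepIncomingOrder = false) {
  assert(Idx < Ops.size() && "index out of range");
  if (KeepIncomingOrder) {
    // Order-preserving removal: linear in the number of later entries.
    Ops.erase(Ops.begin() + static_cast<std::ptrdiff_t>(Idx));
    return;
  }
  // Constant-time removal: the relative order of entries may change.
  Ops[Idx] = Ops.back();
  Ops.pop_back();
}
```

Removing index 1 from values [1, 2, 3, 4] with the default argument leaves
[1, 4, 3]; with `KeepIncomingOrder` set to true it would leave [1, 3, 4].
The changed order in the default path is consistent with the large number of
test updates listed for this commit.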
Delta  File
+18-18  llvm/test/Transforms/LoopVectorize/single_early_exit_live_outs.ll
+12-12  llvm/test/Transforms/SimplifyCFG/UnreachableEliminate.ll
+11-11  llvm/test/Transforms/PGOProfile/chr.ll
+13-9  llvm/lib/IR/Instructions.cpp
+10-10  llvm/test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll
+10-10  llvm/test/Transforms/SimpleLoopUnswitch/inject-invariant-conditions.ll
+74-70  57 files not shown
+173-166  63 files

LLVM/project 72f3995 llvm/lib/Transforms/Utils CodeExtractor.cpp

[CodeExtractor] Optimize PHI incoming value removal using removeIncomingValueIf() (NFC) (#171956)

Delta  File
+4-4  llvm/lib/Transforms/Utils/CodeExtractor.cpp
+4-4  1 files

LLVM/project c9c46a0 llvm/lib/Transforms/Utils CloneFunction.cpp

[CloneFunction] Optimize PHI incoming value removal using reverse iteration (NFC) (#171955)

Delta  File
+4-5  llvm/lib/Transforms/Utils/CloneFunction.cpp
+4-5  1 files

LLVM/project fd6fb04 llvm/docs MemProf.rst

Actually fix formatting and warning.
Delta  File
+3-5  llvm/docs/MemProf.rst
+3-5  1 files

LLVM/project 9f176e3 libcxx/docs VendorDocumentation.rst

[libcxx][docs] Fix bootstrapping build configure command (#172015)

If I take the command from the page and add my triple like so:

$ cmake -G Ninja -S llvm -B build \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DLLVM_ENABLE_PROJECTS="clang" \ # Configure
-DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind;compiler-rt" \
        -DLLVM_RUNTIME_TARGETS="aarch64-unknown-linux-gnu"
CMake Warning:
  Ignoring extra path from command line:

   " "
<...>
-- Build files have been written to: /home/david.spickett/llvm-project/build
-bash: -DLLVM_ENABLE_RUNTIMES=libcxx;libcxxabi;libunwind;compiler-rt: command not found


    [7 lines not shown]
Delta  File
+5-4  libcxx/docs/VendorDocumentation.rst
+5-4  1 files

LLVM/project b225907 llvm/lib/Target/AArch64 AArch64Features.td AArch64TargetTransformInfo.h, llvm/test/CodeGen/AArch64 aggressive-interleaving.ll

[AArch64] Enable aggressive interleaving for A320 (#169825)

This patch makes use of aggressive interleaving options for the A320
subtarget. This is done by adding a new local parameter to the
AArch64Subtarget class. With this enabled we see an aggregate uplift of
0.7% on internal benchmark suites with up to 51% uplift on individual
benchmark workloads.
Delta  File
+324-0  llvm/test/CodeGen/AArch64/aggressive-interleaving.ll
+4-0  llvm/lib/Target/AArch64/AArch64Features.td
+4-0  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
+2-1  llvm/lib/Target/AArch64/AArch64Processors.td
+2-0  llvm/lib/Target/AArch64/AArch64Subtarget.h
+336-1  5 files

LLVM/project 10767aa llvm/examples/OrcV2Examples/LLJITWithRemoteDebugging CMakeLists.txt, llvm/examples/SpeculativeJIT CMakeLists.txt

[llvm][examples] Disable some JIT examples when threading is disabled (#172282)

This fixes an error on our Armv8 bot:
```
<...>/RemoteJITUtils.cpp:132:24: error: use of undeclared identifier 'DynamicThreadPoolTaskDispatcher'
  132 |       std::make_unique<DynamicThreadPoolTaskDispatcher>(std::nullopt),
      |                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```

These examples require LLVM_ENABLE_THREADS to be ON, and cannot run
otherwise. As a comment says elsewhere:
```
  // Out of process mode using SimpleRemoteEPC depends on threads.
```
Delta  File
+4-2  llvm/examples/SpeculativeJIT/CMakeLists.txt
+1-1  llvm/examples/OrcV2Examples/LLJITWithRemoteDebugging/CMakeLists.txt
+1-1  llvm/test/CMakeLists.txt
+1-1  llvm/test/Examples/OrcV2Examples/lljit-with-remote-debugging.test
+7-5  4 files

LLVM/project 515c3bd llvm/lib/Target/AMDGPU SIInstrInfo.cpp

[AMDGPU] Stop handling soft waitcnts in pseudoToMCOpcode. NFC. (#172278)

Since #87539 all soft waitcnts should have been promoted by
SIInsertWaitcnts.
Delta  File
+2-1  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+2-1  1 files

LLVM/project 57aab63 libcxx/include __tree, libcxx/test/std/algorithms/alg.nonmodifying/alg.foreach ranges.for_each.associative.pass.cpp ranges.for_each.associative.pass copy.cpp

[libc++] Fix std::for_each(associative-container) not using std::invoke and projections (#171984)

#164405 added specializations of `for_each` that didn't do the ranges
call shenanigans, but instead just did what the classic algorithms have
to do. This updates the calls to work for the ranges overloads as well.
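
To illustrate what the ranges overloads require beyond the classic algorithm,
here is a minimal sketch: both the projection and the callable go through
`std::invoke`, so pointers to members work for either. `forEachProjected` and
`Widget` are hypothetical names for illustration; this is not the libc++
`__tree` code.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

// Hypothetical element type with a member function to call on each value.
struct Widget {
  int Hits = 0;
  void bump() { ++Hits; }
};

// Apply F to the projection of every element. Routing both calls through
// std::invoke lets F and P be lambdas, function pointers, or member pointers.
template <class Range, class Fn, class Proj>
void forEachProjected(Range &&R, Fn F, Proj P) {
  for (auto &&Elem : R)
    std::invoke(F, std::invoke(P, Elem));
}
```

For a `std::map<int, Widget> M`, the call
`forEachProjected(M, &Widget::bump, &std::pair<const int, Widget>::second)`
projects each node onto its mapped value and then calls `bump()` on it. A
classic `for_each` that writes `f(*it)` directly would accept neither the
member-function pointer nor the projection.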
Delta  File
+238-0  libcxx/test/std/algorithms/alg.nonmodifying/alg.foreach/ranges.for_each.associative.pass.cpp
+0-168  libcxx/test/std/algorithms/alg.nonmodifying/alg.foreach/ranges.for_each.associative.pass copy.cpp
+2-2  libcxx/include/__tree
+240-170  3 files

LLVM/project 7d08651 clang/test/CodeGen builtins-nvptx.c, llvm/include/llvm/IR IntrinsicsNVVM.td

[clang][NVPTX] Add support for mixed-precision FP arithmetic (#168359)

This change adds support for mixed precision floating point 
arithmetic for `f16` and `bf16` where the following patterns:
```
%fh = fpext half %h to float
%resfh = fp-operation(%fh, ...)
...
%fb = fpext bfloat %b to float
%resfb = fp-operation(%fb, ...)

where the fp-operation can be any of:
- fadd
- fsub
- llvm.fma.f32
- llvm.nvvm.add(/fma).*
```
are lowered to the corresponding mixed precision instructions which 
combine the conversion and operation into one instruction from 

    [18 lines not shown]
Delta  File
+443-0  llvm/test/CodeGen/NVPTX/mixed-precision-fp.ll
+115-0  llvm/test/CodeGen/NVPTX/fp-arith-sat.ll
+111-0  llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+70-0  llvm/test/CodeGen/NVPTX/fp-fold-sub.ll
+39-0  clang/test/CodeGen/builtins-nvptx.c
+19-15  llvm/include/llvm/IR/IntrinsicsNVVM.td
+797-15  1 files not shown
+813-15  7 files

LLVM/project 0636225 llvm/test/Transforms/LoopVectorize/AArch64 scalable-strict-fadd.ll partial-reduce-dot-product.ll

[VPlan] Directly unroll VectorPointerRecipe (#168886)

In an effort to get rid of VPUnrollPartAccessor and directly unroll
recipes, start by directly unrolling VectorPointerRecipe, allowing for
VPlan-based simplifications and simplification of the corresponding
execute.
Delta  File
+394-457  llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll
+42-72  llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-dot-product.ll
+40-60  llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vect.ll
+45-51  llvm/test/Transforms/LoopVectorize/AArch64/uniform-args-call-variants.ll
+24-48  llvm/test/Transforms/LoopVectorize/AArch64/sve-wide-lane-mask.ll
+28-43  llvm/test/Transforms/LoopVectorize/AArch64/masked-call.ll
+573-731  45 files not shown
+901-1,169  51 files

LLVM/project b3ec8be mlir/include/mlir/Target/LLVM ModuleToObject.h, mlir/include/mlir/Target/LLVM/ROCDL Utils.h

[mlir][gpu] Expose some utility functions from `gpu-to-binary` infra (#172205)

For people who do not want to use a single monolithic pass.
Delta  File
+48-48  mlir/lib/Target/LLVM/ROCDL/Target.cpp
+7-5  mlir/include/mlir/Target/LLVM/ModuleToObject.h
+9-3  mlir/include/mlir/Target/LLVM/ROCDL/Utils.h
+5-4  mlir/lib/Target/LLVM/NVVM/Target.cpp
+4-4  mlir/lib/Target/LLVM/ModuleToObject.cpp
+4-3  mlir/lib/Target/LLVM/XeVM/Target.cpp
+77-67  6 files

LLVM/project 4e95718 libcxx/include __tree map

[libc++] Remove unused __parent_pointer alias from __tree and map (#172185)

The `__parent_pointer` type alias was marked to be removed in
d163ab3323495560eb0255ac807da2bf24d3c629.
At that time, <map> still had uses of `__parent_pointer` as a local
variable type in operator[] and at().

Those uses were removed in 4a2dd31f16d60b65a46696a909efad5c11b18c19,
which refactored `__find_equal` to return a pair instead of using an out
parameter.

However, the typedef in <map> and the alias in __tree were left behind.

This patch removes the unused typedef from <map> and the
`__parent_pointer` alias from __tree.

Signed-off-by: Krechals <topala.andrei at gmail.com>
Delta  File
+0-2  libcxx/include/__tree
+0-1  libcxx/include/map
+0-3  2 files

LLVM/project ed79fd7 clang/include/clang/Basic BuiltinsX86.td, clang/lib/AST ExprConstant.cpp

[Clang][x86]: allow PCLMULQDQ intrinsics to be used in constexpr (#169214)

Resolves #168741
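
For context, PCLMULQDQ computes a carry-less (polynomial over GF(2)) product
of two 64-bit values, giving a 128-bit result. A software model of that
operation which is itself usable in constant expressions might look like the
sketch below. `U128` and `clmul64` are hypothetical names, and this is not the
code added to ExprConstant.cpp or InterpBuiltin.cpp.

```cpp
#include <cstdint>

// Hypothetical 128-bit result type split into two 64-bit halves.
struct U128 {
  std::uint64_t Lo;
  std::uint64_t Hi;
};

// Carry-less multiply: for every set bit I of B, XOR (A << I) into the
// 128-bit accumulator. XOR replaces addition, so no carries propagate.
constexpr U128 clmul64(std::uint64_t A, std::uint64_t B) {
  U128 R{0, 0};
  for (int I = 0; I < 64; ++I) {
    if ((B >> I) & 1) {
      R.Lo ^= A << I;
      if (I != 0) // Avoid an out-of-range shift by 64 when I == 0.
        R.Hi ^= A >> (64 - I);
    }
  }
  return R;
}

// (x + 1) * (x + 1) = x^2 + 1 over GF(2), i.e. 3 clmul 3 == 5.
static_assert(clmul64(3, 3).Lo == 5, "carry-less square of x + 1");
```

Because the function is `constexpr`, the check runs at compile time, which is
the kind of use the commit enables for the real intrinsics.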
Delta  File
+78-0  clang/test/CodeGen/X86/vpclmulqdq-builtins.c
+64-0  clang/lib/AST/ByteCode/InterpBuiltin.cpp
+55-0  clang/lib/AST/ExprConstant.cpp
+34-1  clang/test/CodeGen/X86/pclmul-builtins.c
+6-3  clang/include/clang/Basic/BuiltinsX86.td
+237-4  5 files

LLVM/project 92c6db1 mlir/lib/Dialect/Affine/IR AffineOps.cpp

Comment out inline-breaking affine
Delta  File
+1-1  mlir/lib/Dialect/Affine/IR/AffineOps.cpp
+1-1  1 files

LLVM/project f024026 llvm/test/CodeGen/AMDGPU vector-reduce-xor.ll vector-reduce-and.ll, llvm/test/CodeGen/AMDGPU/GlobalISel add.vni16.ll xnor.ll

AMDGPU/GlobalISel: Regbanklegalize for G_CONCAT_VECTORS (#171471)

RegBankLegalize using trivial mapping helper, assigns same reg bank
to all operands, vgpr or sgpr.
Uncovers multiple codegen and regbank combiner regressions related to
looking through sgpr to vgpr copies.
Skip regbankselect-concat-vector.mir since agprs are not yet supported.
Delta  File
+86-55  llvm/test/CodeGen/AMDGPU/GlobalISel/add.vni16.ll
+70-56  llvm/test/CodeGen/AMDGPU/vector-reduce-xor.ll
+70-56  llvm/test/CodeGen/AMDGPU/vector-reduce-and.ll
+70-56  llvm/test/CodeGen/AMDGPU/vector-reduce-or.ll
+80-41  llvm/test/CodeGen/AMDGPU/freeze.ll
+43-61  llvm/test/CodeGen/AMDGPU/GlobalISel/xnor.ll
+419-325  4 files not shown
+525-382  10 files

LLVM/project f3e508c

[mlir:bazel] Fix missing dependency introduced in #171727. (#172267)

That PR added an include to `LLVMOps.td` without adding a target
providing that file. Curiously, this does not break the official builds
but it *does* break my bazel build.

Signed-off-by: Ingo Müller <ingomueller at google.com>
Delta  File
+0-0  0 files

LLVM/project 90783f5 lldb/source/Plugins/LanguageRuntime/ObjC/AppleObjCRuntime AppleObjCDeclVendor.cpp

[lldb][AppleObjCDeclVendor] Fix format specifiers when printing log (#172263)

This was causing a crash when enabling the expression log:
```
4   LLDB                                       0x1376d68d0 llvm::formatv_object_base::parseFormatString(llvm::StringRef, unsigned long, bool) + 532
5   LLDB                                       0x13776d838 llvm::formatv_object_base::format(llvm::raw_ostream&) const + 84
6   LLDB                                       0x13776d7d4 llvm::raw_ostream::operator<<(llvm::formatv_object_base const&) + 36
7   LLDB                                       0x1375f4980 lldb_private::Log::Format(llvm::StringRef, llvm::StringRef, llvm::formatv_object_base const&) + 164
8   LLDB                                       0x12f7b39f0 lldb_private::AppleObjCExternalASTSource::CompleteType(clang::TagDecl*) + 416
9   LLDB                                       0x12fa038dc lldb_private::ClangASTSource::FindExternalLexicalDecls(clang::DeclContext const*, llvm::function_ref<bool (clang::Decl::Kind)>, llvm::SmallVectorImpl<clang::Decl*>&) + 1132
10  LLDB                                       0x135d94838 clang::ExternalASTSource::FindExternalLexicalDecls(clang::DeclContext const*, llvm::SmallVectorImpl<clang::Decl*>&) + 92
11  LLDB                                       0x135d94690 clang::DeclContext::LoadLexicalDeclsFromExternalStorage() const + 204
12  LLDB                                       0x135d95ca0 clang::DeclContext::buildLookup() + 308
13  LLDB                                       0x135d964b8 clang::DeclContext::lookupImpl(clang::DeclarationName, clang::DeclContext const*) const + 824
14  LLDB                                       0x135d96168 clang::DeclContext::lookup(clang::DeclarationName) const + 124
15  LLDB                                       0x134f093d4 clang::Sema::CheckImplicitSpecialMemberDeclaration(clang::Scope*, clang::FunctionDecl*) + 128
16  LLDB                                       0x134efb488 clang::Sema::DeclareImplicitDestructor(clang::CXXRecordDecl*) + 932
17  LLDB                                       0x1352ddf24 clang::Sema::LookupSpecialMember(clang::CXXRecordDecl*, clang::CXXSpecialMemberKind, bool, bool, bool, bool, bool)::$_0::operator()() const + 36
```
Delta  File
+2-2  lldb/source/Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCDeclVendor.cpp
+2-2  1 files

LLVM/project 96881c1 llvm/include/llvm/CGData CodeGenDataReader.h, llvm/tools/llvm-cgdata llvm-cgdata.cpp

llvm: Export IndexedCodeGenDataLazyLoading (#169563)

This is needed so the llvm-cgdata tool properly builds with
`LLVM_BUILD_LLVM_DYLIB` so LLVM can be built as a DLL on Windows.

This effort is tracked in #109483.
Delta  File
+0-4  llvm/tools/llvm-cgdata/llvm-cgdata.cpp
+3-0  llvm/include/llvm/CGData/CodeGenDataReader.h
+3-4  2 files