LLVM/project 8b879d1 mlir/lib/ExecutionEngine CMakeLists.txt, mlir/lib/ExecutionEngine/SparseTensor CMakeLists.txt

Revert "[mlir] Link libraries that aren't included in libMLIR to libMLIR (#123477)"

This reverts commit af6616676fb7f9dd4898290ea684ee0c90f1701d.  It broke
builds with `-DBUILD_SHARED_LIBS=ON`.
Delta File
+2 -5 mlir/lib/ExecutionEngine/CMakeLists.txt
+2 -2 mlir/lib/ExecutionEngine/SparseTensor/CMakeLists.txt
+1 -3 mlir/test/lib/Analysis/CMakeLists.txt
+2 -2 mlir/test/lib/Conversion/ConvertToSPIRV/CMakeLists.txt
+1 -3 mlir/test/lib/Conversion/FuncToLLVM/CMakeLists.txt
+2 -2 mlir/test/lib/Conversion/MathToVCIX/CMakeLists.txt
+10 -17 37 files not shown
+77 -98 43 files

LLVM/project 8424bf2 clang/lib/Headers vecintrin.h, clang/test/CodeGen/SystemZ zvector.c

[SystemZ] Add support for new cpu architecture - arch15

This patch adds support for the next-generation arch15
CPU architecture to the SystemZ backend.

This includes:
- Basic support for the new processor and its features.
- Detection of arch15 as host processor.
- Assembler/disassembler support for new instructions.
- Exploitation of new instructions for code generation.
- New vector (signed|unsigned|bool) __int128 data types.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10305.

Note: No currently available Z system supports the arch15
architecture.  Once new systems become available, the
official system name will be added as a supported -march name.
Delta File
+3,896 -0 llvm/test/CodeGen/SystemZ/vec-eval.ll
+1,741 -41 clang/lib/Headers/vecintrin.h
+1,753 -0 llvm/test/MC/Disassembler/SystemZ/insns-arch15.txt
+1,348 -0 llvm/test/MC/SystemZ/insn-good-arch15.s
+849 -80 clang/test/CodeGen/SystemZ/zvector.c
+541 -0 llvm/test/CodeGen/SystemZ/vec-intrinsics-05.ll
+10,128 -121 78 files not shown
+16,455 -280 84 files

LLVM/project 64edde6 clang/include/clang/Basic AttrDocs.td

[clang] Improve the documentation for the init_priority attribute (#123098)

The documentation wasn't very clear about whether ordering is controlled
within or across TUs, and same for dylibs. Clarify that, and also add
mentions for the state of support on Mach-O platforms.
Delta File
+12 -5 clang/include/clang/Basic/AttrDocs.td
+12 -5 1 files

LLVM/project 818d6e5 llvm/utils/TableGen/Common CodeGenSchedule.cpp

[TableGen] Avoid repeated hash lookups (NFC) (#123562)

Delta File
+5 -9 llvm/utils/TableGen/Common/CodeGenSchedule.cpp
+5 -9 1 files
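
The idea behind this family of "avoid repeated hash lookups" changes can be sketched as follows. This is a generic illustration using `std::unordered_map`, not the code from CodeGenSchedule.cpp; the names `IdMap`, `internTwoLookups`, and `internOneLookup` are hypothetical.

```cpp
#include <string>
#include <unordered_map>

using IdMap = std::unordered_map<std::string, int>;

// On a miss this hashes Name twice: once in find(), once in operator[].
int internTwoLookups(IdMap &Ids, const std::string &Name) {
  auto It = Ids.find(Name);
  if (It != Ids.end())
    return It->second;
  int Id = static_cast<int>(Ids.size());
  Ids[Name] = Id;
  return Id;
}

// Hashes Name once: try_emplace inserts a new entry or returns the
// existing one, so no separate find() is needed.
int internOneLookup(IdMap &Ids, const std::string &Name) {
  return Ids.try_emplace(Name, static_cast<int>(Ids.size())).first->second;
}
```

Both functions assign the same IDs; the second simply performs a single lookup per call, which is why such changes are marked NFC.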

LLVM/project efae9f3 llvm/lib/CodeGen/MIRParser MIParser.cpp

[MIRParser] Avoid repeated map lookups (NFC) (#123561)

Delta File
+3 -2 llvm/lib/CodeGen/MIRParser/MIParser.cpp
+3 -2 1 files

LLVM/project 7fa1936 llvm/lib/Transforms/InstCombine InstCombineCalls.cpp

[InstCombine] Avoid repeated hash lookups (NFC) (#123559)

Delta File
+4 -3 llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+4 -3 1 files

LLVM/project 64749fb llvm/lib/Target/AMDGPU AMDGPULowerBufferFatPointers.cpp, llvm/test/CodeGen/AMDGPU buffer-fat-pointers-contents-legalization.ll lower-buffer-fat-pointers-contents-legalization.ll

Revert "[AMDGPU] Handle natively unsupported types in addrspace(7) lowering" (#123657)

Reverts llvm/llvm-project#110572

Seems to have broken a buildbot; the cause is not yet known:
https://lab.llvm.org/buildbot/#/builders/108/builds/8346
Delta File
+0 -3,998 llvm/test/CodeGen/AMDGPU/buffer-fat-pointers-contents-legalization.ll
+386 -912 llvm/test/CodeGen/AMDGPU/lower-buffer-fat-pointers-contents-legalization.ll
+3 -562 llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp
+0 -11 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.raw.ptr.buffer.store.nxv2i32.fail.ll
+1 -6 llvm/test/CodeGen/AMDGPU/lower-buffer-fat-pointers-calls.ll
+1 -6 llvm/test/CodeGen/AMDGPU/lower-buffer-fat-pointers-unoptimized-debug-data.ll
+391 -5,495 6 files

LLVM/project f355a44 llvm/lib/Transforms/HipStdPar HipStdPar.cpp

[HipStdPar] Avoid repeated hash lookups (NFC) (#123558)

Delta File
+3 -2 llvm/lib/Transforms/HipStdPar/HipStdPar.cpp
+3 -2 1 files

LLVM/project bc1e699 llvm/lib/CodeGen MachineCopyPropagation.cpp

[CodeGen] Avoid repeated hash lookups (NFC) (#123557)

Delta File
+4 -4 llvm/lib/CodeGen/MachineCopyPropagation.cpp
+4 -4 1 files

LLVM/project cac3f5e llvm/lib/Transforms/IPO MemProfContextDisambiguation.cpp

[memprof] Add simplify_type (NFC) (#123556)

IndexCall is a simple wrapper around:

  PointerUnion<CallsiteInfo *, AllocInfo *>

Now, because we don't have CastInfo for IndexCall, we would have to
use getBase like so:

  dyn_cast_if_present<CallsiteInfo *>(Call.getBase())

This patch adds simplify_type<IndexCall>, which in turn enables
CastInfo for IndexCall, so we can drop getBase like so:

  dyn_cast_if_present<CallsiteInfo *>(Call)
Delta File
+27 -19 llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
+27 -19 1 files
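
The mechanism described above can be illustrated with a small standalone analog. This is a sketch only: it uses `std::variant` in place of `llvm::PointerUnion`, and `SimplifyType` and `dynCastIfPresent` are hypothetical stand-ins for LLVM's `simplify_type` and `dyn_cast_if_present`, not the real implementations.

```cpp
#include <variant>

struct CallsiteInfo { int Id; };
struct AllocInfo { int Id; };

// Stand-in for PointerUnion<CallsiteInfo *, AllocInfo *>.
using Base = std::variant<std::monostate, CallsiteInfo *, AllocInfo *>;

// Stand-in for IndexCall: a thin wrapper around the union.
struct IndexCall {
  Base B;
  const Base &getBase() const { return B; }
};

// Hypothetical trait: maps a type to the simpler type that casts operate on.
// The primary template is the identity mapping.
template <typename T> struct SimplifyType {
  static const T &get(const T &V) { return V; }
};

// Specializing the trait for the wrapper lets casts see through it.
template <> struct SimplifyType<IndexCall> {
  static const Base &get(const IndexCall &C) { return C.getBase(); }
};

// Returns the held pointer if the value holds a To, otherwise nullptr.
template <typename To, typename From> To dynCastIfPresent(const From &V) {
  const Base &Simple = SimplifyType<From>::get(V);
  if (const To *P = std::get_if<To>(&Simple))
    return *P;
  return nullptr;
}
```

With the specialization in place, `dynCastIfPresent<CallsiteInfo *>(Call)` and `dynCastIfPresent<CallsiteInfo *>(Call.getBase())` give the same result, which mirrors the simplification the commit message describes.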

LLVM/project bba7783 clang/include/clang/Basic BuiltinsX86.td, llvm/test/CodeGen/RISCV/rvv vfma-vp.ll

Rebase

Created using spr 1.3.5
Delta File
+4,839 -5,345 llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll
+3,056 -4,201 llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-8.ll
+233 -6,579 llvm/test/DebugInfo/NVPTX/debug-info.ll
+6,647 -0 llvm/test/MC/AMDGPU/gfx12_asm_vop3c_dpp16-fake16.s
+5,378 -4 clang/include/clang/Basic/BuiltinsX86.td
+2,667 -2,586 llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vopc.txt
+22,820 -18,715 5,948 files not shown
+312,261 -154,884 5,954 files

LLVM/project 14c52a8 libcxx/include/__chrono utc_clock.h, libcxx/test/libcxx/time/time.clock/time.clock.utc get_leap_second_info.pass.cpp

[libc++][chrono] implements UTC clock.

While implementing this feature and its associated LWG issues, it turned out that
- LWG3316 Correctly define epoch for utc_clock / utc_timepoint
only added non-normative wording to the standard.

Implements parts of:
- P0355 Extending <chrono> to Calendars and Time Zones
- P1361 Integration of chrono with text formatting
- LWG3359 <chrono> leap second support should allow for negative leap seconds
Delta File
+1,004 -0 libcxx/test/std/time/time.syn/formatter.utc_time.pass.cpp
+245 -0 libcxx/test/std/time/time.clock/time.clock.utc/time.clock.utc.members/from_sys.pass.cpp
+241 -0 libcxx/test/std/time/time.clock/time.clock.utc/time.clock.utc.members/to_sys.pass.cpp
+165 -0 libcxx/test/std/time/time.clock/time.clock.utc/utc_time.ostream.pass.cpp
+164 -0 libcxx/include/__chrono/utc_clock.h
+147 -0 libcxx/test/libcxx/time/time.clock/time.clock.utc/get_leap_second_info.pass.cpp
+1,966 -0 18 files not shown
+2,630 -5 24 files

LLVM/project 0fbec1e llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/Passes BottomUpVec.h

[SandboxVec][BottomUpVec][NFC] Add comments
Delta File
+5 -0 llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/Passes/BottomUpVec.h
+5 -0 1 files

LLVM/project 3805355 llvm/lib/Target/AMDGPU AMDGPULowerBufferFatPointers.cpp, llvm/test/CodeGen/AMDGPU buffer-fat-pointers-contents-legalization.ll lower-buffer-fat-pointers-contents-legalization.ll

[AMDGPU] Handle natively unsupported types in addrspace(7) lowering (#110572)

The current lowering for ptr addrspace(7) assumed that the instruction
selector can handle arbitrary LLVM types, which is not the case. Code
generation can't deal with
- Values that aren't 8, 16, 32, 64, 96, or 128 bits long
- Aggregates (this commit only handles arrays of scalars, more may come)
- Vectors of more than one byte
- 3-word values that aren't a vector of 3 32-bit values (for example, a
<6 x half>)

This commit adds a buffer contents type legalizer that adds the
bitcasts, zero-extensions, and splits into subcomponents needed to
convert a load or store operation into one that can be successfully
lowered through code generation.

In the long run, some of the involved bitcasts (though potentially not
the buffer operation splitting) ought to be handled by the instruction
legalizer, but SelectionDAG makes this difficult.

    [6 lines not shown]
Delta File
+3,998 -0 llvm/test/CodeGen/AMDGPU/buffer-fat-pointers-contents-legalization.ll
+912 -386 llvm/test/CodeGen/AMDGPU/lower-buffer-fat-pointers-contents-legalization.ll
+562 -3 llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp
+11 -0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.raw.ptr.buffer.store.nxv2i32.fail.ll
+6 -1 llvm/test/CodeGen/AMDGPU/lower-buffer-fat-pointers-unoptimized-debug-data.ll
+6 -1 llvm/test/CodeGen/AMDGPU/lower-buffer-fat-pointers-calls.ll
+5,495 -391 6 files

LLVM/project 9cbd705 llvm/tools/llvm-cgdata llvm-cgdata.cpp

[NFC] llvm-cgdata use StringRef in exitWithError to reduce construction (#120771)

In `static void exitWithError(Twine Message, std::string Whence = "",
std::string Hint = "")`, replace the std::string parameters with
StringRef so that a std::string is not constructed and passed by value
on every call.

Fixes: #100065
Delta File
+5 -5 llvm/tools/llvm-cgdata/llvm-cgdata.cpp
+5 -5 1 files
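
A minimal sketch of the parameter change, under stated assumptions: `std::string_view` stands in for `llvm::StringRef`, and the hypothetical `formatError` returns the message instead of printing it and exiting, so that it can be checked. It is not the actual llvm-cgdata code.

```cpp
#include <string>
#include <string_view>

// Taking string_view parameters means a call with string literals or the
// defaulted arguments does not construct a temporary std::string per
// parameter, as a by-value std::string parameter would.
std::string formatError(std::string_view Message, std::string_view Whence = "",
                        std::string_view Hint = "") {
  std::string Out = "error: ";
  if (!Whence.empty()) {
    Out += Whence;
    Out += ": ";
  }
  Out += Message;
  if (!Hint.empty()) {
    Out += "\nhint: ";
    Out += Hint;
  }
  return Out;
}
```

The usual caveat for non-owning views applies: the caller must keep the underlying string alive for the duration of the call.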

LLVM/project af66166 mlir/lib/ExecutionEngine CMakeLists.txt, mlir/test/lib/Dialect/Test CMakeLists.txt

[mlir] Link libraries that aren't included in libMLIR to libMLIR (#123477)

Use `mlir_target_link_libraries()` to link dependencies of libraries
that are not included in libMLIR, to ensure that they link to the dylib
when they are used in Flang. Otherwise, they implicitly pull in all
their static dependencies, effectively causing Flang binaries to
simultaneously link to the dylib and to static libraries, which is never
a good idea.

I have only covered the libraries that are used by Flang. If you wish, I
can extend this approach to all non-libMLIR libraries in MLIR, making
MLIR itself also link to the dylib consistently.
Delta File
+5 -2 mlir/lib/ExecutionEngine/CMakeLists.txt
+2 -2 mlir/test/lib/Dialect/Test/CMakeLists.txt
+2 -2 mlir/test/lib/Dialect/Tosa/CMakeLists.txt
+2 -2 mlir/test/lib/Dialect/Transform/CMakeLists.txt
+2 -2 mlir/test/lib/Dialect/Vector/CMakeLists.txt
+2 -2 mlir/test/lib/IR/CMakeLists.txt
+15 -12 37 files not shown
+98 -77 43 files

LLVM/project 5810f15 llvm/lib/Target/SPIRV SPIRVCallLowering.cpp SPIRVEmitIntrinsics.cpp

[SPIR-V] Fix SPIRVEmitIntrinsics undefined behavior (#123625)

Before this change InstrSet in SPIRVEmitIntrinsics was uninitialized
before running runOnFunction. This change adds a new function
getPreferredInstructionSet in SPIRVSubtarget.
Delta File
+3 -11 llvm/lib/Target/SPIRV/SPIRVCallLowering.cpp
+6 -7 llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp
+8 -0 llvm/lib/Target/SPIRV/SPIRVSubtarget.cpp
+1 -0 llvm/lib/Target/SPIRV/SPIRVSubtarget.h
+18 -18 4 files

LLVM/project 7084110 llvm/lib/Target/X86 X86ISelLowering.cpp

X86ISelLowering.cpp - remove unused variable missed in #123617
Delta File
+0 -1 llvm/lib/Target/X86/X86ISelLowering.cpp
+0 -1 1 files

LLVM/project 8ff195c llvm/lib/Target/AMDGPU SIISelLowering.cpp

SIISelLowering.cpp - remove unused variable missed in #123617
Delta File
+0 -2 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+0 -2 1 files

LLVM/project 19bd2d6 llvm/lib/Analysis ConstantFolding.cpp, llvm/test/Transforms/InstSimplify pr122582.ll

[ConstantFolding] Add ilogb in isMathLibCallNoop (#122582)

The ilogb libcall was not being constant folded correctly. This patch adds
an ilogb case to isMathLibCallNoop with the correct error condition.

Fixes #101873 
Delta File
+213 -0 llvm/test/Transforms/InstSimplify/pr122582.ll
+3 -0 llvm/lib/Analysis/ConstantFolding.cpp
+216 -0 2 files

LLVM/project 3606876 llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp, llvm/test/CodeGen/NVPTX addrspacecast-cse.ll

[SDAG] Fix CSE for ADDRSPACECAST nodes (#122912)

Correct CSE in SelectionDAG can make DAG combining more effective and
reduce the size of the DAG, and thus should improve compile time.
Delta File
+23 -0 llvm/test/CodeGen/NVPTX/addrspacecast-cse.ll
+6 -0 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+29 -0 2 files

LLVM/project 0fa0545 libcxx/include/__algorithm make_projected.h radix_sort.h, libcxx/include/__functional bind.h function.h

[libc++] Define an internal API for std::invoke and friends (#116637)

Currently we're using quite different internal names for the
`std::invoke` family of type traits. This adds a layer around the
current implementation to make it easier to understand when it is used
and makes it easier to define multiple implementations of it.
Delta File
+61 -15 libcxx/include/__type_traits/invoke.h
+10 -12 libcxx/include/__functional/bind.h
+6 -6 libcxx/include/__algorithm/make_projected.h
+4 -6 libcxx/include/__functional/function.h
+4 -4 libcxx/include/__algorithm/radix_sort.h
+3 -3 libcxx/test/libcxx/containers/associative/non_const_comparator.verify.cpp
+88 -46 11 files not shown
+109 -66 17 files

LLVM/project c248fc1 clang/docs LanguageExtensions.rst

[Clang] Document some of the implementation-defined keywords (#84591)

Delta File
+108 -11 clang/docs/LanguageExtensions.rst
+108 -11 1 files

LLVM/project 7abf440 llvm/lib/Target/X86/MCTargetDesc X86MCTargetDesc.h

Add missing include to X86MCTargetDesc.h (#123320)

In gcc-15, explicit includes of `<cstdint>` are required when fixed-size
integers are used. In this file, this include only happened as a side
effect of including SmallVector.h

Although llvm compiles fine, the root-project would benefit from
explicitly including it here, so we can backport the patch.

Maybe interesting for @hahnjo and @vgvassilev
Delta File
+1 -0 llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.h
+1 -0 1 files
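
The kind of fix described above can be sketched in a few lines. The helper `lowByte` is hypothetical and is not taken from X86MCTargetDesc.h; the point is only that a header using fixed-width integer types includes `<cstdint>` itself.

```cpp
#include <cstdint> // Explicit include: do not rely on another header pulling it in.

// Uses fixed-width types, so this code depends on <cstdint> directly.
inline std::uint8_t lowByte(std::uint64_t Value) {
  return static_cast<std::uint8_t>(Value & 0xFF);
}
```

Without the explicit include, such code compiles only as long as some other included header happens to provide the types transitively.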

LLVM/project b95ed30 llvm/lib/Transforms/Instrumentation DataFlowSanitizer.cpp, llvm/lib/Transforms/Utils ModuleUtils.cpp

[IR] Remove unused variables from #123617

Failed to notice them when landing that patch - apologies!
Delta File
+0 -1 llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
+0 -1 llvm/lib/Transforms/Utils/ModuleUtils.cpp
+0 -2 2 files

LLVM/project e8674af clang/lib/AST/ByteCode Interp.h, clang/test/AST/ByteCode constexpr.c

[clang][bytecode] Diagnose IntegralToPointer casts to non-void (#123619)

But keep evaluating. This is what the current interpreter does as well.
Delta File
+4 -0 clang/lib/AST/ByteCode/Interp.h
+2 -2 clang/test/AST/ByteCode/constexpr.c
+0 -1 clang/test/SemaCXX/builtin-assume-aligned.cpp
+6 -3 3 files

LLVM/project f33e3d4 llvm/lib/Target/AMDGPU AMDGPUISelDAGToDAG.cpp

[AMDGPU] Fix DAG types for V_MAD_I64_I32 and V_MAD_U64_U32. NFC. (#123629)

These instructions return a 64-bit result and a 1-bit carry, unlike
smul_lohi and umul_lohi which return a pair of 32-bit results.

This does not appear to make any difference in practice because the DAG
types are not used for anything before these nodes are converted to
MachineInstrs.
Delta File
+2 -1 llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+2 -1 1 files
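
The difference in result shapes can be made concrete with a scalar model. This is a sketch under the assumption that V_MAD_U64_U32 computes a 32x32-bit product plus a 64-bit addend; the function names `madU64U32` and `umulLoHi` are hypothetical and model only the result types, not the DAG nodes themselves.

```cpp
#include <cstdint>
#include <utility>

// Models a 64-bit result plus a 1-bit carry: A * B + C.
std::pair<std::uint64_t, bool> madU64U32(std::uint32_t A, std::uint32_t B,
                                         std::uint64_t C) {
  // A 32x32-bit product always fits in 64 bits.
  std::uint64_t Product = static_cast<std::uint64_t>(A) * B;
  // The addition wraps modulo 2^64; a wrapped sum is smaller than Product.
  std::uint64_t Sum = Product + C;
  return {Sum, Sum < Product};
}

// Models umul_lohi: a pair of 32-bit results (low half, high half).
std::pair<std::uint32_t, std::uint32_t> umulLoHi(std::uint32_t A,
                                                 std::uint32_t B) {
  std::uint64_t Product = static_cast<std::uint64_t>(A) * B;
  return {static_cast<std::uint32_t>(Product),
          static_cast<std::uint32_t>(Product >> 32)};
}
```

The first function returns one 64-bit value and a flag, while the second returns two 32-bit values, which is the type distinction the commit corrects.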

LLVM/project c8eb865 libclc/generic/lib/math sincos_helpers.cl clc_pow.cl

[libclc] Move mad to the CLC library (#123607)

All targets build `__clc_mad` -- even SPIR-V targets -- since it
compiles to the optimal `llvm.fmuladd` intrinsic. There is no change to
the bytecode generated for non-SPIR-V targets.

The `mix` builtin, which is implemented as a wrapper around `mad`, is
left as an OpenCL-layer wrapper of `__clc_mad`. I don't know if it's
worth having a specific CLC version of `mix`.

The changes to the other CLC files/functions are moving uses of `mad` to
`__clc_mad`, and reformatting. There is an additional instance of
`trunc` becoming `__clc_trunc`, which was missed before.
Delta File
+439 -409 libclc/generic/lib/math/sincos_helpers.cl
+284 -266 libclc/generic/lib/math/clc_pow.cl
+267 -253 libclc/generic/lib/math/clc_powr.cl
+252 -238 libclc/generic/lib/math/clc_pown.cl
+251 -238 libclc/generic/lib/math/clc_rootn.cl
+80 -68 libclc/generic/lib/math/clc_exp10.cl
+1,573 -1,472 16 files not shown
+1,684 -1,510 22 files

LLVM/project ea3aa97 flang/test/Lower zero_init_default_init.f90 zero_init.f90

Avoid module name clashes by choosing unique names
Delta File
+6 -6 flang/test/Lower/zero_init_default_init.f90
+4 -4 flang/test/Lower/zero_init.f90
+10 -10 2 files

LLVM/project b3902db bolt/include/bolt/Core BinarySection.h Relocation.h, bolt/lib/Core Relocation.cpp BinarySection.cpp

[BOLT] Skip flushing optional out-of-range pending relocs

When a pending relocation is created it is also marked whether it is
optional or not. It can be optional when such relocation is added as
part of an optimization (i.e., `scanExternalRefs`).

When BOLT tries to `flushPendingRelocations`, it safely skips any
optional relocations that cannot be encoded.

Background:
BOLT, as part of scanExternalRefs, identifies external references from
calls and creates some pending relocations for them. When flushed, those
relocations update references to point to the optimized functions. This
optimization can be disabled using `--no-scan`.

BOLT can assert if any of these pending relocations cannot be encoded.

This patch does not disable this optimization but instead selectively
applies it given that a pending relocation is optional.
Delta File
+33 -0 bolt/lib/Core/Relocation.cpp
+15 -10 bolt/unittests/Core/BinaryContext.cpp
+13 -1 bolt/lib/Core/BinarySection.cpp
+8 -5 bolt/include/bolt/Core/BinarySection.h
+4 -1 bolt/include/bolt/Core/Relocation.h
+1 -1 bolt/lib/Core/BinaryFunction.cpp
+74 -18 6 files
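
The skipping behavior described above can be sketched as follows. Everything here is a hypothetical model rather than BOLT's code: `PendingReloc`, `canEncode`, and `flushPending` are illustrative names, and the encoding limit (a signed 32-bit value) is an assumption chosen for the example.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

struct PendingReloc {
  std::uint64_t Offset; // Where the relocation applies.
  std::int64_t Value;   // Value to encode.
  bool Optional;        // True if added as part of an optimization.
};

// Assumed limit for illustration: the value must fit in a signed 32-bit field.
bool canEncode(const PendingReloc &R) {
  return R.Value >= std::numeric_limits<std::int32_t>::min() &&
         R.Value <= std::numeric_limits<std::int32_t>::max();
}

// Returns the relocations that would be applied. Optional relocations that
// cannot be encoded are skipped; a mandatory one that cannot be encoded is
// still an error.
std::vector<PendingReloc> flushPending(const std::vector<PendingReloc> &Pending) {
  std::vector<PendingReloc> Applied;
  for (const PendingReloc &R : Pending) {
    if (!canEncode(R)) {
      if (R.Optional)
        continue; // Safe to drop: the original reference stays valid.
      throw std::range_error("mandatory relocation out of range");
    }
    Applied.push_back(R);
  }
  return Applied;
}
```

The design point is that the optimization stays enabled and is only dropped per relocation, rather than being disabled globally as `--no-scan` would do.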