LLVM/project 1262acf llvm/lib/CodeGen/AsmPrinter DwarfUnit.cpp DwarfUnit.h

Introduce DwarfUnit::addBlock helper method (#168446)

This patch is just a small cleanup that unifies the various spots that
add a DWARF expression to the output.
Delta File
+21 -65 llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp
+3 -0 llvm/lib/CodeGen/AsmPrinter/DwarfUnit.h
+24 -65 2 files

LLVM/project 5e80358 llvm/lib/Target/ARM/MCTargetDesc ARMAsmBackend.cpp, llvm/test/MC/ARM arm-movt-movw-absolute-pass.s

[llvm][ARM] Allow MOVT and MOVW on the offset between two labels (#168072)

In this case, the value is a constant, not an addend to a relocation.
So the "Relocation Not In Range" error must not be triggered.

Regression from PR #112877
Fixes #132322
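
As a rough illustration of why no range check applies here: the difference between two labels in the same section is already a plain number at assembly time, and MOVW/MOVT simply carry its lower and upper 16 bits. The sketch below shows only that arithmetic. `MovwMovtImm` and `splitLabelDifference` are hypothetical names for this example and are not taken from `ARMAsmBackend.cpp`.

```cpp
#include <cstdint>

// Hypothetical illustration only; not the ARMAsmBackend implementation.
struct MovwMovtImm {
  std::uint16_t Movw; // immediate carried by MOVW (bits 0..15)
  std::uint16_t Movt; // immediate carried by MOVT (bits 16..31)
};

// The difference of two label addresses is a constant, so it can be split
// into two 16-bit halves directly; there is no relocation addend involved.
constexpr MovwMovtImm splitLabelDifference(std::uint32_t EndAddr,
                                           std::uint32_t StartAddr) {
  std::uint32_t Delta = EndAddr - StartAddr; // wraps modulo 2^32
  return {static_cast<std::uint16_t>(Delta & 0xFFFFu),
          static_cast<std::uint16_t>(Delta >> 16)};
}
```

With a difference of 0x123000, for example, MOVW would carry 0x3000 and MOVT would carry 0x0012.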
Delta File
+9 -0 llvm/test/MC/ARM/arm-movt-movw-absolute-pass.s
+1 -1 llvm/lib/Target/ARM/MCTargetDesc/ARMAsmBackend.cpp
+10 -1 2 files

LLVM/project 5f66203 llvm/include/llvm/CodeGen SDPatternMatch.h

DAG: Reorder SDPatternMatch combinators earlier

Split out from #168288
Delta File
+65 -65 llvm/include/llvm/CodeGen/SDPatternMatch.h
+65 -65 1 file

LLVM/project 3e499e9 clang/lib/CIR/CodeGen CIRGenModule.cpp, clang/test/CIR/CodeGen no-common.c

[CIR] Add support for common linkage (#168613)

Add support for marking global variables with common linkage.
Delta File
+103 -0 clang/test/CIR/CodeGen/no-common.c
+12 -5 clang/lib/CIR/CodeGen/CIRGenModule.cpp
+115 -5 2 files

LLVM/project 6b61559 llvm/lib/CodeGen/SelectionDAG LegalizeVectorTypes.cpp, llvm/test/CodeGen/AArch64 sve-extract-scalable-vector.ll

DAG: Use poison for some vector result widening
Delta File
+216 -218 llvm/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll
+137 -137 llvm/test/CodeGen/PowerPC/vector-constrained-fp-intrinsics.ll
+38 -36 llvm/test/CodeGen/X86/matrix-multiply.ll
+12 -12 llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+0 -7 llvm/test/CodeGen/AArch64/sve-extract-scalable-vector.ll
+403 -410 5 files

LLVM/project f122995 llvm/lib/CodeGen/SelectionDAG LegalizeVectorTypes.cpp

DAG: Use poison when splitting vector_shuffle results
Delta File
+1 -1 llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+1 -1 1 file

LLVM/project ed78ab7 orc-rt/include/orc-rt TaskDispatcher.h ThreadPoolTaskDispatcher.h, orc-rt/lib/executor ThreadPoolTaskDispatcher.cpp Session.cpp

[orc-rt] Introduce Task and TaskDispatcher APIs and implementations. (#168514)

Introduces the Task and TaskDispatcher interfaces (TaskDispatcher.h),
ThreadPoolTaskDispatcher implementation (ThreadPoolTaskDispatcher.h), and
updates Session to include a TaskDispatcher instance that can be used to
run tasks.

TaskDispatcher's introduction is motivated by the need to handle calls
to JIT'd code initiated from the controller process: Incoming calls will
be wrapped in Tasks and dispatched. Session shutdown will wait on
TaskDispatcher shutdown, ensuring that all Tasks are run or destroyed
prior to the Session being destroyed.
Delta File
+110 -0 orc-rt/unittests/ThreadPoolTaskDispatcherTest.cpp
+90 -4 orc-rt/unittests/SessionTest.cpp
+70 -0 orc-rt/lib/executor/ThreadPoolTaskDispatcher.cpp
+64 -0 orc-rt/include/orc-rt/TaskDispatcher.h
+40 -18 orc-rt/lib/executor/Session.cpp
+48 -0 orc-rt/include/orc-rt/ThreadPoolTaskDispatcher.h
+422 -22 5 files not shown
+467 -25 11 files

LLVM/project 5738199 llvm/lib/CodeGen/SelectionDAG SelectionDAGBuilder.cpp

Really remove getTargetTransformInfo calls
Delta File
+8 -13 llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+8 -13 1 file

LLVM/project 33e1a55 cmake/Modules LLVMVersion.cmake, libcxx/include __config

Bump version to 21.1.7.
Delta File
+1 -1 llvm/utils/gn/secondary/llvm/version.gni
+1 -1 llvm/utils/lit/lit/__init__.py
+1 -1 llvm/utils/mlgo-utils/mlgo/__init__.py
+1 -1 cmake/Modules/LLVMVersion.cmake
+1 -1 libcxx/include/__config
+5 -5 5 files

LLVM/project 7cd747d llvm/include/llvm/CodeGen LibcallLoweringInfo.h, llvm/lib/Analysis RuntimeLibcallInfo.cpp

CodeGen: Add LibcallLoweringInfo analysis pass

The libcall lowering decisions should be program dependent,
depending on the current module's RuntimeLibcallInfo. We need
another related analysis derived from that plus the current
function's subtarget to provide concrete lowering decisions.

This takes on a somewhat unusual form. It's a Module analysis,
with a lookup keyed on the subtarget. This is a separate module
analysis from RuntimeLibraryAnalysis to avoid that depending on
codegen. It's not a function pass to avoid depending on any
particular function, to avoid repeated subtarget map lookups in
most of the use passes, and to avoid any recomputation in the
common case of one subtarget (and keeps it reusable across
repeated compilations).

This also switches ExpandFp and PreISelIntrinsicLowering as
a sample function and module pass. Note this is not yet wired
up to SelectionDAG, which is still using the LibcallLoweringInfo
constructed inside of TargetLowering.
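
The "module-level result with a lookup keyed on the subtarget" shape might be pictured as below. This is a simplified sketch under assumptions, not the pass in `LibcallLoweringInfo.h`: every type and member name is hypothetical, and the single `DivideImpl` field stands in for a full table of lowering decisions.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins; not the LLVM classes.
struct SketchSubtarget {
  bool HasHardwareDivide;
};

struct SketchLibcallLowering {
  std::string DivideImpl; // empty means "no libcall needed"
};

// One result object per module. Lowering decisions are computed lazily per
// subtarget and cached, so the common single-subtarget case computes once.
class SketchModuleLibcallResult {
public:
  const SketchLibcallLowering &getLowering(const SketchSubtarget &ST) {
    auto It = Cache.find(&ST);
    if (It == Cache.end()) {
      ++NumComputed;
      SketchLibcallLowering L;
      L.DivideImpl = ST.HasHardwareDivide ? "" : "__divsi3";
      It = Cache.emplace(&ST, std::move(L)).first;
    }
    return It->second;
  }

  unsigned numComputed() const { return NumComputed; }

private:
  std::map<const SketchSubtarget *, SketchLibcallLowering> Cache;
  unsigned NumComputed = 0;
};
```

A function pass in this picture would fetch the module-level result once and then ask it for the entry matching its function's subtarget, rather than recomputing anything per function.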
Delta File
+68 -0 llvm/include/llvm/CodeGen/LibcallLoweringInfo.h
+36 -17 llvm/lib/CodeGen/PreISelIntrinsicLowering.cpp
+42 -0 llvm/lib/CodeGen/LibcallLoweringInfo.cpp
+26 -5 llvm/lib/CodeGen/ExpandFp.cpp
+17 -6 llvm/tools/opt/NewPMDriver.cpp
+16 -0 llvm/lib/Analysis/RuntimeLibcallInfo.cpp
+205 -28 30 files not shown
+303 -61 36 files

LLVM/project 2921bb9 llvm/lib/Target/ARM ARMSubtarget.cpp ARMISelLowering.cpp, llvm/lib/Target/MSP430 MSP430Subtarget.cpp MSP430ISelLowering.cpp

CodeGen: Move libcall lowering configuration to subtarget

Previously libcall lowering decisions were made directly
in the TargetLowering constructor. Pull these into the subtarget
to facilitate turning LibcallLoweringInfo into a separate analysis
in the future.
Delta File
+70 -0 llvm/lib/Target/ARM/ARMSubtarget.cpp
+0 -68 llvm/lib/Target/ARM/ARMISelLowering.cpp
+64 -0 llvm/lib/Target/MSP430/MSP430Subtarget.cpp
+0 -62 llvm/lib/Target/MSP430/MSP430ISelLowering.cpp
+40 -0 llvm/lib/Target/Sparc/SparcSubtarget.cpp
+1 -36 llvm/lib/Target/Sparc/SparcISelLowering.cpp
+175 -166 11 files not shown
+239 -205 17 files

LLVM/project 1fd957f llvm/include/llvm/CodeGen TargetLowering.h, llvm/lib/CodeGen/SelectionDAG TargetLowering.cpp

CodeGen: Add subtarget to TargetLoweringBase constructor

Currently LibcallLoweringInfo is defined inside of TargetLowering,
which is owned by the subtarget. Pass in the subtarget so we can
construct LibcallLoweringInfo with the subtarget. This is a temporary
step that should be revertable in the future, after LibcallLoweringInfo
is moved out of TargetLowering.
Delta File
+16 -14 llvm/unittests/Target/AArch64/AArch64SelectionDAGTest.cpp
+4 -2 llvm/include/llvm/CodeGen/TargetLowering.h
+3 -2 llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+3 -2 llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+3 -2 llvm/unittests/CodeGen/MFCommon.inc
+4 -0 llvm/lib/Target/SPIRV/SPIRVISelLowering.cpp
+33 -22 25 files not shown
+62 -49 31 files

LLVM/project deb2094 llvm/include/llvm/CodeGen SelectionDAGISel.h, llvm/lib/CodeGen/SelectionDAG SelectionDAGISel.cpp SelectionDAGBuilder.cpp

DAG: Fix constructing a temporary TargetTransformInfo instance

TTI is managed by the pass manager, and should use the pre-existing
analysis result. Also fixes some noise where we were only conditionally
querying TTI.
Delta File
+1 -5 llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+3 -2 llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+1 -3 llvm/include/llvm/CodeGen/SelectionDAGISel.h
+2 -1 llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
+7 -11 4 files

LLVM/project e47e9f3 llvm/lib/Target/NVPTX NVPTXISelLowering.h NVPTXISelLowering.cpp

[NVPTX] TableGen-erate SDNode descriptions (#168367)

This allows SDNodes to be validated against their expected type profiles
and reduces the number of changes required to add a new node.

The verification functionality detected a few issues. Two of them were
fixed (a missing `SDNPMemOperand` property on `TCGEN05_MMA` nodes and an
extra glue operand/result on `CallPrototype`); the remaining one is with
the `ProxyReg` node, see `NVPTXSelectionDAGInfo::verifyTargetNode()`.

Part of #119709.

Pull Request: https://github.com/llvm/llvm-project/pull/168367
Delta File
+0 -114 llvm/lib/Target/NVPTX/NVPTXISelLowering.h
+8 -98 llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+51 -3 llvm/lib/Target/NVPTX/NVPTXSelectionDAGInfo.cpp
+42 -2 llvm/lib/Target/NVPTX/NVPTXSelectionDAGInfo.h
+11 -2 llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
+1 -1 llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+113 -220 2 files not shown
+115 -220 8 files

LLVM/project db71cc5 libc/src/sys/mman/linux CMakeLists.txt pkey_mprotect.cpp, libc/src/sys/mman/linux/x86_64 pkey_common.h

[libc] Implement pkey_alloc/free/get/set/mprotect for x86_64 linux (#162362)

This patch provides definitions for `pkey_*` functions for linux x86_64.

`pkey_alloc`, `pkey_free`, and `pkey_mprotect` are simple syscall
wrappers. `pkey_set` and `pkey_get` modify architecture-specific
registers. The logic for these lives in architecture-specific
directories:

* `libc/src/sys/mman/linux/x86_64/pkey_common.h` has a real
implementation
* `libc/src/sys/mman/linux/generic/pkey_common.h` contains stubs that
just return `ENOSYS`.
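
For orientation, on x86_64 the register in question is PKRU, which holds two bits per protection key: an access-disable bit and a write-disable bit. The sketch below models only the bit arithmetic on a plain integer, assuming that layout. The function and constant names are hypothetical and are not taken from `pkey_common.h`; a real implementation would read and write the register itself instead of taking the value as a parameter.

```cpp
#include <cstdint>

// Hypothetical model of the PKRU layout: key K owns bits 2K (access-disable)
// and 2K+1 (write-disable). Not the libc implementation.
constexpr std::uint32_t kBitsPerKey = 2;
constexpr std::uint32_t kRightsMask = 0x3;
constexpr int kNumKeys = 16;

// Returns the two rights bits for Key, or -1 if Key is out of range.
constexpr int sketchPkeyGet(std::uint32_t Pkru, int Key) {
  if (Key < 0 || Key >= kNumKeys)
    return -1;
  return static_cast<int>((Pkru >> (Key * kBitsPerKey)) & kRightsMask);
}

// Returns a PKRU value with the rights bits for Key replaced by Rights.
// Out-of-range keys leave the value unchanged in this sketch.
constexpr std::uint32_t sketchPkruWithRights(std::uint32_t Pkru, int Key,
                                             std::uint32_t Rights) {
  if (Key < 0 || Key >= kNumKeys)
    return Pkru;
  std::uint32_t Shift = static_cast<std::uint32_t>(Key) * kBitsPerKey;
  return (Pkru & ~(kRightsMask << Shift)) | ((Rights & kRightsMask) << Shift);
}
```

Setting write-disable (value 0x2) for key 1 on an all-zero register, for example, yields 0x8 in this model.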
Delta File
+241 -0 libc/test/src/sys/mman/linux/pkey_test.cpp
+95 -0 libc/src/sys/mman/linux/CMakeLists.txt
+91 -0 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+61 -0 libc/src/sys/mman/linux/x86_64/pkey_common.h
+58 -0 libc/src/sys/mman/linux/pkey_mprotect.cpp
+38 -0 libc/src/sys/mman/linux/mprotect_common.h
+584 -0 20 files not shown
+1,007 -12 26 files

LLVM/project 6665642 llvm/lib/Target/AMDGPU SIFoldOperands.cpp GCNSubtarget.h, llvm/test/CodeGen/AMDGPU bug-pk-f32-imm-fold.mir packed-fp32.ll

[AMDGPU] Don't fold an i64 immediate value if it can't be replicated from its lower 32-bit (#168458)

On some targets, a packed f32 instruction can only read 32 bits from a
scalar operand (SGPR or literal) and replicates those bits to both
channels. In this case, we should not fold an immediate value if it
can't be replicated from its lower 32 bits.

Fixes SWDEV-567139.
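
One way to read the condition: if the hardware copies the low 32 bits into both channels, a 64-bit immediate is only safe to fold when its upper half already equals its lower half. The check below is a minimal sketch of that reading. `isReplicatedFromLow32` is a hypothetical name and is not the helper used in `SIFoldOperands.cpp`.

```cpp
#include <cstdint>

// Hypothetical helper: true when broadcasting the low 32 bits to both
// 32-bit channels reproduces the full 64-bit immediate.
constexpr bool isReplicatedFromLow32(std::uint64_t Imm) {
  return static_cast<std::uint32_t>(Imm) ==
         static_cast<std::uint32_t>(Imm >> 32);
}
```

Under this check, 0x3F8000003F800000 (1.0f in both halves) could be folded, while 0x000000003F800000 could not, since broadcasting the low half would change the high channel.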
Delta File
+64 -0 llvm/test/CodeGen/AMDGPU/bug-pk-f32-imm-fold.mir
+41 -0 llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+5 -4 llvm/test/CodeGen/AMDGPU/packed-fp32.ll
+7 -0 llvm/lib/Target/AMDGPU/GCNSubtarget.h
+117 -4 4 files

LLVM/project 1e3ea03 llvm/lib/Transforms/Vectorize VPlan.h VPlanRecipes.cpp, llvm/test/Transforms/LoopVectorize vplan-printing.ll

[VPlan] VPIRFlags kind for FCmp with predicate + fast-math flags (NFCI).

FCmp instructions have both a predicate and fast-math flags. Introduce a
new FCmp kind, that combines both to model this correctly in the current
system.

This should be NFC modulo VPlan printing which now includes the correct
fast-math flags.
Delta File
+55 -16 llvm/lib/Transforms/Vectorize/VPlan.h
+26 -17 llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+2 -2 llvm/test/Transforms/LoopVectorize/ARM/mve-icmpcost.ll
+1 -1 llvm/test/Transforms/LoopVectorize/vplan-printing.ll
+84 -36 4 files

LLVM/project 5cde345 runtimes CMakeLists.txt

[runtimes] Remove pstl from the list of supported runtimes (#168414)

The pstl top-level directory was removed, but we forgot to remove pstl
from the list of valid subdirectories.
Delta File
+1 -1 runtimes/CMakeLists.txt
+1 -1 1 file

LLVM/project c4898f3 clang/lib/CodeGen HLSLBufferLayoutBuilder.cpp CGHLSLRuntime.cpp, clang/test/CodeGenHLSL/resources cbuffer.hlsl

[HLSL][DirectX] Use a padding type for HLSL buffers. (#167404)

This change drops the use of the "Layout" type and instead uses explicit
padding throughout the compiler to represent types in HLSL buffers.

There are a few parts to this, though it's difficult to split them up as
they're very interdependent:

1. Refactor HLSLBufferLayoutBuilder to allow us to calculate the padding
of arbitrary types.
2. Teach Clang CodeGen to use HLSL specific paths for cbuffers when
generating aggregate copies, array accesses, and structure accesses.
3. Simplify DXILCBufferAccesses such that it directly replaces accesses
with dx.resource.getpointer rather than recalculating the layout.
4. Basic infrastructure for SPIR-V handling, but the implementation
itself will need work in follow ups.

Fixes several issues, including #138996, #144573, and #156084.
Resolves #147352.
Delta File
+27 -281 llvm/lib/Target/DirectX/DXILCBufferAccess.cpp
+79 -218 clang/lib/CodeGen/HLSLBufferLayoutBuilder.cpp
+266 -23 clang/lib/CodeGen/CGHLSLRuntime.cpp
+164 -85 clang/test/CodeGenHLSL/resources/cbuffer.hlsl
+0 -216 llvm/test/CodeGen/DirectX/CBufferAccess/memcpy.ll
+29 -97 llvm/test/CodeGen/DirectX/CBufferAccess/arrays.ll
+565 -920 33 files not shown
+988 -1,303 39 files

LLVM/project 31ec633 clang-tools-extra/clang-tidy/misc CoroutineHostileRAIICheck.cpp, clang-tools-extra/test/clang-tidy/checkers/misc coroutine-hostile-raii.cpp

[clang-tidy] Fix bugs in misc-coroutine-hostile-raii check (#167947)

1. Handle transformed awaitables for `AllowedCallees`, which generate
temporaries and weren't being handled by #167778.

2. Fix name mismatches in `storeOptions`.
Delta File
+9 -5 clang-tools-extra/clang-tidy/misc/CoroutineHostileRAIICheck.cpp
+10 -2 clang-tools-extra/test/clang-tidy/checkers/misc/coroutine-hostile-raii.cpp
+19 -7 2 files

LLVM/project 8fce476 llvm CMakeLists.txt, llvm/include/llvm/Support/SystemZ zOSSupport.h

Implement a more seamless way to provide missing functions on z/OS (#167703)

In this PR I'm changing the way we provide the missing functions like
strnlen() on z/OS from the separate header file to a wrapper around the
system headers that declare these functions. This will be less
intrusive.

---------

Co-authored-by: Zibi Sarbinowski <zibi at ca.ibm.com>
Delta File
+82 -0 llvm/lib/Support/zOSLibFunctions.cpp
+0 -47 llvm/include/llvm/Support/SystemZ/zOSSupport.h
+35 -0 llvm/include/llvm/Support/SystemZ/zos_wrappers/string.h
+3 -4 llvm/lib/Support/Unix/Program.inc
+5 -0 llvm/CMakeLists.txt
+0 -1 llvm/tools/obj2yaml/macho2yaml.cpp
+125 -52 11 files not shown
+126 -62 17 files

LLVM/project 507f236 llvm/lib/Transforms/Vectorize VPlanUtils.h, llvm/test/Transforms/LoopVectorize induction-wrapflags.ll

[VPlan] Fix OpType-mismatch in getFlagsFromIndDesc (#168560)

Follow up on a CSE OpType-mismatch crash reported due to ef023cae388d
(Reland [VPlan] Expand WidenInt inductions with nuw/nsw), setting the
OpType correctly when returning from getFlagsFromIndDesc.
Delta File
+70 -0 llvm/test/Transforms/LoopVectorize/induction-wrapflags.ll
+4 -1 llvm/lib/Transforms/Vectorize/VPlanUtils.h
+74 -1 2 files

LLVM/project d3c2973 lldb/source/Plugins/Instruction/ARM64 EmulateInstructionARM64.cpp, lldb/unittests/UnwindAssembly/ARM64 TestArm64InstEmulation.cpp

[lldb/aarch64] Add STR/LDR instructions for FP registers to Emulator (#168187)

A function prologue can begin with a pre-index STR instruction for a
floating-point register. To construct an unwind plan from assembly
correctly, the instruction emulator must support such instructions.
Delta File
+108 -0 lldb/unittests/UnwindAssembly/ARM64/TestArm64InstEmulation.cpp
+32 -11 lldb/source/Plugins/Instruction/ARM64/EmulateInstructionARM64.cpp
+140 -11 2 files

LLVM/project 3e8dc4d clang/lib/Tooling/DependencyScanning DependencyScannerImpl.cpp

[clang][deps] NFC: Use qualified names for function definitions (#168586)

The compiler doesn't emit a diagnostic when the signature of a function
defined in a namespace gets out-of-sync with its declaration. Let's use
qualified names for function definitions instead of nesting them in a
namespace so that mismatches are diagnosed by the compiler rather than
by the (less understandable) linker.
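
A small example of the two styles, using a made-up namespace and function (`sketch::scaled` is hypothetical and unrelated to `DependencyScannerImpl.cpp`):

```cpp
// Declaration, as it would appear in a header.
namespace sketch {
int scaled(int Value, int Factor);
} // namespace sketch

// Nested style: if the definition drifts from the declaration, e.g.
//   namespace sketch { int scaled(long Value, int Factor) { ... } }
// the compiler accepts it as a brand-new overload, and callers of the
// declared function only fail later, at link time.

// Qualified style: the definition must match a declaration that already
// exists in `sketch`, otherwise compilation fails at this line.
int sketch::scaled(int Value, int Factor) { return Value * Factor; }
```

Changing the first parameter of the qualified definition to `long` would be rejected by the compiler, because no matching declaration exists in the namespace.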
Delta File
+23 -21 clang/lib/Tooling/DependencyScanning/DependencyScannerImpl.cpp
+23 -21 1 file

LLVM/project bb03359 llvm/lib/Target/AMDGPU SIFoldOperands.cpp

further name
Delta File
+4 -3 llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+4 -3 1 file

LLVM/project a3e8aa6 llvm/lib/Target/AMDGPU SIFoldOperands.cpp

use `getEffectiveImmVal` instead of adding a new API function
Delta File
+1 -6 llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+1 -6 1 file

LLVM/project 67f5f1f llvm/lib/Target/AMDGPU SIFoldOperands.cpp AMDGPU.td, llvm/test/CodeGen/AMDGPU bug-pk-f32-imm-fold.mir packed-fp32.ll

[AMDGPU] Don't fold an i64 immediate value if it can't be replicated from its lower 32-bit

On some targets, a packed f32 instruction can only read 32 bits from a scalar operand (SGPR or literal) and replicates those bits to both channels. In this case, we should not fold an immediate value if it can't be replicated from its lower 32 bits.
Delta File
+64 -0 llvm/test/CodeGen/AMDGPU/bug-pk-f32-imm-fold.mir
+41 -0 llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+5 -4 llvm/test/CodeGen/AMDGPU/packed-fp32.ll
+8 -0 llvm/lib/Target/AMDGPU/AMDGPU.td
+6 -0 llvm/lib/Target/AMDGPU/GCNSubtarget.h
+124 -4 5 files

LLVM/project d679d57 llvm/lib/Target/AMDGPU AMDGPU.td GCNSubtarget.h

remove target feature
Delta File
+0 -8 llvm/lib/Target/AMDGPU/AMDGPU.td
+3 -2 llvm/lib/Target/AMDGPU/GCNSubtarget.h
+1 -1 llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+4 -11 3 files

LLVM/project 58aa511 llvm/lib/Target/AMDGPU SIFoldOperands.cpp

update helper function name
Delta File
+7 -3 llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+7 -3 1 file

LLVM/project 1157a22 llvm/lib/CodeGen/GlobalISel LegalizerHelper.cpp

[GISel] Use getScalarSizeInBits in LegalizerHelper::lowerBitCount (#168584)

For vectors, CTLZ, CTTZ, CTPOP all operate on individual elements. The
lowering should be based on the element width.

I noticed this by inspection. No tests in tree are currently affected,
but I thought it would be good to fix so someone doesn't have to debug
it in the future.
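
To see why the element width matters, consider a well-known bit-twiddling expansion of population count, where both the masks and the final shift depend on the width being counted. The sketch below parameterizes that expansion on an unsigned element type. It is a generic illustration with hypothetical names (`splatByte`, `popCountElt`), not the code in `LegalizerHelper::lowerBitCount`.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Repeat one byte pattern across every byte of the element type T.
template <typename T> constexpr T splatByte(std::uint8_t Byte) {
  static_assert(std::is_unsigned<T>::value, "element type must be unsigned");
  return static_cast<T>(std::numeric_limits<T>::max() / 0xFF * Byte);
}

// Population count of a single element. The masks and the final shift are
// derived from the element width, never from the width of a whole vector.
template <typename T> constexpr unsigned popCountElt(T X) {
  constexpr unsigned Bits = std::numeric_limits<T>::digits; // element width
  X = static_cast<T>(X - ((X >> 1) & splatByte<T>(0x55)));   // 2-bit sums
  X = static_cast<T>((X & splatByte<T>(0x33)) +
                     ((X >> 2) & splatByte<T>(0x33)));       // 4-bit sums
  X = static_cast<T>((X + (X >> 4)) & splatByte<T>(0x0F));   // 8-bit sums
  // Multiplying by 0x01..01 accumulates all byte sums into the top byte.
  return static_cast<unsigned>(
      static_cast<T>(X * splatByte<T>(0x01)) >> (Bits - 8));
}
```

For a vector of four 8-bit elements, each lane would use `popCountElt<std::uint8_t>`; building the masks and shift for 32 bits instead would mix bits across lanes.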
Delta File
+3 -3 llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+3 -3 1 file