LLVM/project 245887e llvm/test/CodeGen/X86 veclib-llvm.sincos.ll

[X86] Added sincos vector lib codegen test coverage (#183702)

Added veclib-llvm.sincos.ll tests for amdlibm and libmvec

Fixes #182847
DeltaFile
+122 -0  llvm/test/CodeGen/X86/veclib-llvm.sincos.ll
+122 -0  1 files

LLVM/project 0b36d42 llvm/lib/CodeGen TargetLoweringBase.cpp, llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp LegalizeVectorOps.cpp

[AArch64] Add vector expansion support for ISD::FCBRT when using ArmPL (#183750)

This patch teaches the backend how to lower the FCBRT DAG node to the
vector math library function when using ArmPL. This is similar to what
we already do for llvm.pow/FPOW, however the only way to expose this is
via a DAG combine that converts

  FPOW(<2 x double> %x, <2 x double> <double 1.0/3.0, double 1.0/3.0>)

into

  FCBRT(<2 x double> %x)

when the appropriate fast math flags are present on the node. I've
updated the DAG combine to handle vector types and only perform the
transformation if there exists a vector library variant of cbrt.
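
The conditions described above can be modelled outside LLVM as a small predicate. This is only a sketch: `isOneThirdSplat`, `shouldRewritePowToCbrt`, and both boolean parameters are hypothetical stand-ins rather than the DAGCombiner code, and the exact set of fast-math flags is deliberately left abstract. `powAndCbrtDiffer` illustrates one reason flags are needed at all: for a negative input, `pow(x, 1/3)` yields NaN while `cbrt(x)` is well defined.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// True when every lane of the exponent vector equals 1.0/3.0 (a "splat").
bool isOneThirdSplat(const std::vector<double> &exponent) {
  if (exponent.empty())
    return false;
  for (double e : exponent)
    if (e != 1.0 / 3.0)
      return false;
  return true;
}

// Hypothetical model of the guard: rewrite FPOW(x, 1/3) to FCBRT(x) only if
// the fast-math flags permit it and a vector library cbrt variant exists.
bool shouldRewritePowToCbrt(const std::vector<double> &exponent,
                            bool fastMathFlagsAllow,
                            bool hasVectorCbrtVariant) {
  return isOneThirdSplat(exponent) && fastMathFlagsAllow &&
         hasVectorCbrtVariant;
}

// pow with a negative base and a non-integer exponent is NaN; cbrt is not.
bool powAndCbrtDiffer(double x) {
  return std::isnan(std::pow(x, 1.0 / 3.0)) && !std::isnan(std::cbrt(x));
}
```

With these definitions, a splat of 1/3 is rewritten only when both booleans are true, and `powAndCbrtDiffer(-8.0)` holds while `powAndCbrtDiffer(8.0)` does not.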
DeltaFile
+45 -17  llvm/lib/IR/RuntimeLibcalls.cpp
+39 -2  llvm/test/CodeGen/AArch64/veclib-llvm.pow.ll
+23 -0  llvm/lib/CodeGen/TargetLoweringBase.cpp
+15 -1  llvm/test/CodeGen/ARM/pow.ll
+9 -3  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+10 -0  llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
+141 -23  3 files not shown
+152 -24  9 files

LLVM/project 03a9ebc llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp, llvm/test/CodeGen/X86 known-never-zero.ll

[DAG] isKnownNeverZero - add ISD::UADDSAT/UMAX/UMIN DemandedElts handling and tests (#183992)

Fixes #183038

Adds `isKnownNeverZero` support for `UADDSAT`, `UMAX`, and `UMIN`. This
allows the compiler to prove a vector result is _non-zero_ by analyzing
only the demanded lanes of its operands.
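
The reasoning can be sketched with a per-lane model. This is a simplified illustration, not the SelectionDAG code: `Opcode`, `knownNeverZero`, and `resultKnownNeverZero` are hypothetical names, each operand is reduced to a vector of "lane is proven non-zero" bits, and all three vectors are assumed to have the same length. The rules used are that an unsigned max or saturating add is non-zero if either operand is, and an unsigned min is non-zero only if both are.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Opcode { UAddSat, UMax, UMin };

// laneNonZero[i] says lane i of an operand is proven non-zero.
// Only lanes marked in `demanded` have to be proven.
bool knownNeverZero(const std::vector<bool> &laneNonZero,
                    const std::vector<bool> &demanded) {
  for (std::size_t i = 0; i < demanded.size(); ++i)
    if (demanded[i] && !laneNonZero[i])
      return false;
  return true;
}

bool resultKnownNeverZero(Opcode op, const std::vector<bool> &lhs,
                          const std::vector<bool> &rhs,
                          const std::vector<bool> &demanded) {
  const bool l = knownNeverZero(lhs, demanded);
  const bool r = knownNeverZero(rhs, demanded);
  switch (op) {
  case Opcode::UAddSat: // a saturating add cannot wrap back to zero
  case Opcode::UMax:    // the max is at least the non-zero operand
    return l || r;
  case Opcode::UMin:    // the min is one of the two operands
    return l && r;
  }
  return false;
}
```

For example, if only lane 0 of the left operand is known non-zero, a `UMax` result is provably non-zero when only lane 0 is demanded, but not when both lanes are.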
DeltaFile
+7 -5  llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+3 -6  llvm/test/CodeGen/X86/known-never-zero.ll
+10 -11  2 files

LLVM/project 5e814e2 mlir/lib/Dialect/LLVMIR/Transforms InlinerInterfaceImpl.cpp, mlir/test/Dialect/LLVMIR inlining.mlir

[mlir][llvm] Fix crash in LLVM inliner when callee has no recognized terminator (#183949)

When the callee of an llvm.call has a body block ending with an
unregistered op (rather than a recognized LLVM terminator like
llvm.return), the LLVM inliner's handleTerminator method was called with
that unregistered op and crashed via a cast<LLVM::ReturnOp>() assertion
or use-after-erase due to unresolved call result uses.

The root cause is that the generic MLIR verifier conservatively treats
unregistered ops as potential terminators (using mightHaveTrait), so
malformed IR of this shape passes verification. The inliner, however,
assumes that the callee's terminator is a recognized LLVM op.

Fix by adding a guard in LLVMInlinerInterface::isLegalToInline() that
refuses to inline a callee containing any block whose last operation
does not have the IsTerminator trait. This prevents the crash and leaves
the call site intact without any IR mutation.
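
The gap between the verifier and the inliner can be shown with a toy model. Everything here is hypothetical (`OpModel`, `BlockModel`, and both functions are illustrative stand-ins, not MLIR classes): the verifier-style check accepts an unregistered last op because it might be a terminator, while the inliner-style guard requires a known terminator in every block.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model of an operation; not the MLIR classes.
struct OpModel {
  bool registered;   // false for an unregistered op
  bool isTerminator; // meaningful only when the op is registered
};

using BlockModel = std::vector<OpModel>;

// Verifier-style check: an unregistered op *might* be a terminator.
bool mightEndWithTerminator(const BlockModel &block) {
  return !block.empty() &&
         (!block.back().registered || block.back().isTerminator);
}

// Inliner-style guard: every block must end in a known terminator.
bool isLegalToInlineModel(const std::vector<BlockModel> &callee) {
  for (const BlockModel &block : callee)
    if (block.empty() || !block.back().registered ||
        !block.back().isTerminator)
      return false;
  return true;
}
```

A block ending in an unregistered op passes the first check and fails the second, which is the case the new guard rejects.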

Fixes #108363
Fixes #118766
DeltaFile
+39 -0  mlir/test/Dialect/LLVMIR/inlining.mlir
+14 -0  mlir/lib/Dialect/LLVMIR/Transforms/InlinerInterfaceImpl.cpp
+53 -0  2 files

LLVM/project d1c563b lldb/unittests/Target CMakeLists.txt

[lldb] Don't link TestingSupport as a component (#184310)

This doesn't work with dylib builds, because TestingSupport is not part
of the dylib. Instead, we should link it via LINK_LIBS, like other tests
already do.
DeltaFile
+1 -1  lldb/unittests/Target/CMakeLists.txt
+1 -1  1 files

LLVM/project 4d3bdc0 lldb/source/Commands CommandObjectWatchpoint.cpp

[lldb] Use AppendMessageWithFormatv in CommandObjectWatchpoint (#184128)

All of the AppendMessage... methods of CommandReturnObject automatically
add a newline, apart from AppendMessageWithFormat. This gets very
confusing when reviewing changes to commands.

While there are use cases for building a message as you go, controlling
when the newline is emitted, a lot of calls to AppendMessageWithFormat
include a newline at the end of the format string.

The watchpoint commands are one such case, so I've converted them to
equivalent AppendMessageWithFormatv calls so that:
* They have the less surprising behaviour re. newlines.
* They are in many cases more readable than the printf style notation.
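
The newline difference can be illustrated with a small stand-in class. This is a hypothetical model, not lldb's CommandReturnObject: `ReturnObjectModel` only mimics the behaviour described above (printf-style appends verbatim, formatv-style appends a newline), its placeholder handling is a minimal substitute for `llvm::formatv`, and the message text is made up for the example.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in; only the newline handling is modelled.
struct ReturnObjectModel {
  std::string output;

  // printf-style: appends exactly what the format produces, no newline added.
  void appendMessageWithFormat(const char *format, ...) {
    char buffer[256];
    va_list args;
    va_start(args, format);
    std::vsnprintf(buffer, sizeof(buffer), format, args);
    va_end(args);
    output += buffer;
  }

  // formatv-style: replaces "{0}", "{1}", ... and always appends a newline.
  template <typename... Args>
  void appendMessageWithFormatv(std::string format, const Args &...args) {
    std::vector<std::string> rendered;
    int unused[] = {0, (rendered.push_back(toString(args)), 0)...};
    (void)unused;
    for (std::size_t i = 0; i < rendered.size(); ++i) {
      const std::string placeholder = "{" + std::to_string(i) + "}";
      const std::size_t pos = format.find(placeholder);
      if (pos != std::string::npos)
        format.replace(pos, placeholder.size(), rendered[i]);
    }
    output += format + "\n";
  }

private:
  template <typename T> static std::string toString(const T &value) {
    std::ostringstream stream;
    stream << value;
    return stream.str();
  }
};
```

In this model, `appendMessageWithFormat("%d watchpoints deleted.\n", 2)` and `appendMessageWithFormatv("{0} watchpoints deleted.", 2)` produce the same output, and only the first needs the newline spelled out at the call site.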
DeltaFile
+19 -19  lldb/source/Commands/CommandObjectWatchpoint.cpp
+19 -19  1 files

LLVM/project 91e73b9 mlir/lib/Dialect/XeGPU/Transforms XeGPUPropagateLayout.cpp, mlir/test/Dialect/XeGPU resolve-layout-conflicts.mlir

[MLIR][XeGPU] Allow uniform vectors in layout conflict resolution (#183756)

DeltaFile
+12 -0  mlir/test/Dialect/XeGPU/resolve-layout-conflicts.mlir
+6 -3  mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+18 -3  2 files

LLVM/project 8879ff1 llvm/lib/CodeGen/MIRParser MIRParser.cpp, llvm/test/CodeGen/MIR/Generic machine-function-empty-name.mir machine-function-empty-name-no-matching-ir.mir

Support unnamed functions in MIR parser (#183018)

In this PR, unnamed machine functions in an MIR file are associated with
anonymous functions in the embedded LLVM IR according to the order in
which they are specified. If there are more unnamed machine functions
than there are LLVM IR functions, parsing fails and reports the
original error message: `function '' isn't defined in the provided
LLVM IR`.
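
The ordering rule can be sketched as follows. This is an illustrative model rather than the MIRParser implementation: `matchMachineFunctions` is a hypothetical helper that works on plain name lists, treats an empty string as "unnamed", and signals the failure case with a `false` return instead of a diagnostic.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// For each machine function, records the index of the IR function it is
// associated with. Returns false when no matching IR function exists.
bool matchMachineFunctions(const std::vector<std::string> &machineNames,
                           const std::vector<std::string> &irNames,
                           std::vector<std::size_t> &irIndexOut) {
  irIndexOut.clear();
  std::size_t nextAnonymous = 0; // search cursor for unnamed IR functions
  for (const std::string &name : machineNames) {
    std::size_t found = irNames.size();
    if (name.empty()) {
      // Take the next not-yet-used unnamed IR function, in file order.
      while (nextAnonymous < irNames.size() && !irNames[nextAnonymous].empty())
        ++nextAnonymous;
      if (nextAnonymous < irNames.size())
        found = nextAnonymous++;
    } else {
      for (std::size_t i = 0; i < irNames.size(); ++i)
        if (irNames[i] == name) {
          found = i;
          break;
        }
    }
    if (found == irNames.size())
      return false; // corresponds to the "isn't defined" error above
    irIndexOut.push_back(found);
  }
  return true;
}
```

Two unnamed machine functions paired with two unnamed IR functions match in order; a second unnamed machine function with only one unnamed IR function available fails.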

Closes #36511
DeltaFile
+31 -0  llvm/test/CodeGen/MIR/Generic/machine-function-empty-name.mir
+19 -5  llvm/lib/CodeGen/MIRParser/MIRParser.cpp
+19 -0  llvm/test/CodeGen/MIR/Generic/machine-function-empty-name-no-matching-ir.mir
+8 -0  llvm/test/CodeGen/MIR/Generic/machine-function-empty-name-no-ir-section.mir
+77 -5  4 files

LLVM/project b4fffcd llvm/docs NVPTXUsage.rst

[NFC][Docs] Add documentation for NVPTX conversion intrinsics (#175536)

This change adds documentation for the NVPTX narrow floating-point
conversion intrinsics.

PTX ISA Reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cvt
DeltaFile
+177 -0  llvm/docs/NVPTXUsage.rst
+177 -0  1 files

LLVM/project b2c46f9 llvm/lib/Target/AMDGPU GCNSubtarget.cpp

init FlatOffsetBitWidth to 13 in initializeSubtargetDependencies
DeltaFile
+3 -0  llvm/lib/Target/AMDGPU/GCNSubtarget.cpp
+3 -0  1 files

LLVM/project be3eb43 llvm/lib/Target/AMDGPU AMDGPU.td AMDGPUFeatures.td

Revert "Option 2 - add two boolean AMDGPUSubtargetFeatures"

This reverts commit fff8d9ba3dab81f72bee954875b8fd40e7c7d90d.
DeltaFile
+0 -12  llvm/lib/Target/AMDGPU/AMDGPU.td
+10 -0  llvm/lib/Target/AMDGPU/AMDGPUFeatures.td
+1 -0  llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
+11 -12  3 files

LLVM/project 5d8c6c1 llvm/docs LangRef.rst

[LangRef] Mention allocation elision (#177592)

allockind / alloc-family enable allocation elision, but this was not
previously mentioned by LangRef.

Related discussion:
https://discourse.llvm.org/t/rfc-clarifying-semantic-assumptions-for-custom-allocators/89469

The semantics here are specified in terms of allowed transforms.
Making the semantics operational is tracked in #184102.
DeltaFile
+13 -0  llvm/docs/LangRef.rst
+13 -0  1 files

LLVM/project b4743b2 llvm/lib/Transforms/Vectorize VPlanTransforms.cpp VPlan.h

[VPlan] Introduce VPlan::get(Zero|AllOnes) (NFC) (#184085)

DeltaFile
+9 -8  llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+8 -0  llvm/lib/Transforms/Vectorize/VPlan.h
+17 -8  2 files

LLVM/project 39f2740 llvm/lib/Target/AMDGPU AMDGPUIGroupLP.cpp

[AMDGPU] IGroupLP: Avoid repeating reachability checks in greedy algorithm (#182463)

In the greedy pipeline solver, the group cost is found using the
addEdges function and the edges must be removed from the DAG after
processing each group. The best group edges are then reinserted using
the same function. This repeats the costly reachability checks inside
the function which become problematic for pipelines with many
SchedGroups.

The algorithm is changed to remember the best group edges instead of
recomputing them. Additionally, SchedGroup::tryAddEdge is refactored to
avoid a redundant cycle check which is already performed by DAG->addEdge.
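
The "remember instead of recompute" idea can be sketched generically. This is not the IGroupLP code: `Edge`, `Candidate`, and `pickBestGroupEdges` are hypothetical names, `tryGroup` stands for the expensive step that adds edges with reachability checks, and lower cost is assumed to be better.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;

struct Candidate {
  int cost;
  std::vector<Edge> edges;
};

// Calls the expensive `tryGroup` exactly once per group and keeps the edges
// of the cheapest candidate, so the winner never has to be recomputed.
std::vector<Edge>
pickBestGroupEdges(std::size_t numGroups,
                   const std::function<Candidate(std::size_t)> &tryGroup) {
  bool haveBest = false;
  Candidate best{0, {}};
  for (std::size_t g = 0; g < numGroups; ++g) {
    Candidate current = tryGroup(g);
    if (!haveBest || current.cost < best.cost) {
      best = std::move(current);
      haveBest = true;
    }
  }
  return best.edges;
}
```

With three groups, the expensive callback runs three times in total, whereas recomputing the winning group's edges afterwards would need a fourth call.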
DeltaFile
+38 -28  llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
+38 -28  1 files

LLVM/project e8d6c40 clang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp

Address PR comments
DeltaFile
+18 -13  clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+18 -13  1 files

LLVM/project 09217ba lldb/test/API/lang/cpp/template TestTemplateArgs.py, lldb/test/API/python_api/event TestEvents.py

[lldb] Disable shared build for TestTemplateArgs,TestEvents,TestTypeList (#184304)

See https://github.com/llvm/llvm-project/pull/181720
DeltaFile
+1 -0  lldb/test/API/lang/cpp/template/TestTemplateArgs.py
+1 -0  lldb/test/API/python_api/event/TestEvents.py
+1 -0  lldb/test/API/python_api/type/TestTypeList.py
+3 -0  3 files

LLVM/project c4e2f79 llvm/lib/CodeGen/GlobalISel CombinerHelper.cpp, llvm/test/CodeGen/AArch64 srem-vec-crash.ll

[AArch64][GlobalISel] Limit srem by const of small sizes. (#184066)

The code in SignedDivisionByConstantInfo::get can only handle bitwidths
>= 3. This adds a check for bitwidth==1 for urem too, although it will
already have been simplified.
DeltaFile
+28 -3  llvm/test/CodeGen/AArch64/srem-vec-crash.ll
+4 -2  llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+32 -5  2 files

LLVM/project 92bd6ee libc/src/stdio/baremetal file_internal.cpp CMakeLists.txt

[libc] Reland add getc, ungetc, fflush to enable libc++ iostream on baremetal (#183556)

After https://github.com/llvm/llvm-project/pull/168931 landed, getc,
ungetc and fflush are still missing at link time when trying to make
libc++ std::cout work with LLVM libc on baremetal.

The ungetc implementation is deliberately minimal: it only covers the
current standard streams implementation from the patch above.

The original PR https://github.com/llvm/llvm-project/pull/175530 caused
a build failure on Windows because the command line in the generated
*.bat file was too long. This was fixed by
https://github.com/llvm/llvm-project/issues/182374
DeltaFile
+52 -0  libc/src/stdio/baremetal/file_internal.cpp
+37 -1  libc/src/stdio/baremetal/CMakeLists.txt
+31 -0  libc/src/stdio/baremetal/getc.cpp
+27 -3  libc/src/stdio/baremetal/file_internal.h
+22 -0  libc/src/stdio/baremetal/fflush.cpp
+20 -0  libc/src/stdio/baremetal/ungetc.cpp
+189 -4  3 files not shown
+198 -6  9 files

LLVM/project 0933b63 llvm/lib/Target/AMDGPU AMDGPUIGroupLP.cpp

[AMDGPU] IGroupLP: Refactor SchedGroup::initSchedGroup (NFC) (#184122)

There are three overloaded SchedGroup::initSchedGroup functions, two of
which are only used for specific types of SchedGroups, namely
SCHED_BARRIER and SCHED_GROUP_BARRIER. This seems to have led to some
confusion since the different functions perform checks which are not
needed for their intended restricted use cases. Furthermore, there are
several wrong comments surrounding those functions.

Simplify the functions and inline the actual initialization parts of the
SCHED_BARRIER and SCHED_GROUP_BARRIER variants at their only call sites.
Extract a function that finds the candidate SUnits for a given
SchedGroup and use this instead of initSchedGroup. Fix comments.
DeltaFile
+65 -94  llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
+65 -94  1 files

LLVM/project eb1e808 llvm/test/CodeGen/X86/AMX amx-low-intrinsics.ll, llvm/test/Transforms/SLPVectorizer reduction-gather-non-scheduled-extracts.ll

[IR] Mark reduction intrinsics as nocreateundeforpoison (#184173)

While investigating #156233, it came up that select folds such as
https://alive2.llvm.org/ce/z/Y6jzj6 cannot be carried out, or easily
fixed for now, because integer reductions do not propagate noundef, even
if their arguments are noundef. This patch adds this propagation.
DeltaFile
+22 -1  llvm/unittests/Analysis/ValueTrackingTest.cpp
+1 -2  llvm/test/Transforms/SLPVectorizer/reduction-gather-non-scheduled-extracts.ll
+1 -2  llvm/test/Transforms/SLPVectorizer/X86/extracts-non-extendable.ll
+1 -1  llvm/test/CodeGen/X86/AMX/amx-low-intrinsics.ll
+1 -1  llvm/test/Transforms/SLPVectorizer/X86/non-load-reduced-as-part-of-bv.ll
+1 -1  llvm/test/Transforms/SLPVectorizer/X86/reduction-logical.ll
+27 -8  3 files not shown
+30 -11  9 files

LLVM/project e68867a lldb/packages/Python/lldbsuite/test rvvutil.py, lldb/test/API/riscv/rvv-consistency TestRVVConsistency.py

[lldb][RISCV][test] Add RVV API tests

Support RISC-V vector register context (3/3)

Add API tests for RISC-V vector extension support, covering:
- Register availability detection
- VCSR register consistency checks
- Register access
DeltaFile
+115 -0  lldb/test/API/riscv/rvv-consistency/TestRVVConsistency.py
+88 -0  lldb/test/API/riscv/rvv-side-effects/TestRVVSideEffects.py
+84 -0  lldb/test/API/riscv/rvv-vcsr-consistency/TestRVVConsistencyVCSR.py
+79 -0  lldb/test/API/riscv/rvv-vcsr-consistency/main.cpp
+69 -0  lldb/packages/Python/lldbsuite/test/rvvutil.py
+54 -0  lldb/test/API/riscv/rvv-printout/TestRVVPrintout.py
+489 -0  17 files not shown
+821 -0  23 files

LLVM/project c5dbaa8 lldb/source/Plugins/Process/Linux NativeRegisterContextLinux_riscv64.cpp NativeRegisterContextLinux_riscv64.h, lldb/source/Plugins/Process/Utility RegisterInfoPOSIX_riscv64.cpp RegisterInfoPOSIX_riscv64.h

[lldb][RISCV] Support RVV register access

Support RISC-V vector register context (2/3)

Add support for reading and writing RISC-V vector (RVV) registers
through the native register context on Linux. This enables LLDB to
access all 32 vector registers (v0–v31) and the vector CSR registers
during debugging sessions.
DeltaFile
+131 -4  lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_riscv64.cpp
+47 -1  lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_riscv64.cpp
+34 -4  lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_riscv64.h
+13 -0  lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_riscv64.h
+4 -0  lldb/source/Plugins/Process/Utility/RegisterContextPOSIX_riscv64.cpp
+2 -0  lldb/source/Plugins/Process/Utility/RegisterContextPOSIX_riscv64.h
+231 -9  6 files

LLVM/project 7c5306f lldb/source/Plugins/Process/Utility RegisterInfos_riscv64.h lldb-riscv-register-enums.h

[lldb][RISCV] Add vector VCSR register definitions

Support RISC-V vector register context (1/3)

Add definitions for RISC-V vector CSRs to support RVV debugging.
This includes the vstart, vl, vtype, vcsr, and vlenb registers,
which control the vector operation state and behavior.
DeltaFile
+49 -16  lldb/source/Plugins/Process/Utility/RegisterInfos_riscv64.h
+9 -1  lldb/source/Plugins/Process/Utility/lldb-riscv-register-enums.h
+1 -0  lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_riscv64.cpp
+59 -17  3 files

LLVM/project fff8d9b llvm/lib/Target/AMDGPU AMDGPU.td AMDGPUFeatures.td

Option 2 - add two boolean AMDGPUSubtargetFeatures
DeltaFile
+12 -0  llvm/lib/Target/AMDGPU/AMDGPU.td
+0 -10  llvm/lib/Target/AMDGPU/AMDGPUFeatures.td
+0 -1  llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
+12 -11  3 files

LLVM/project 9cda407 clang/docs ReleaseNotes.rst, clang/lib/Sema SemaCoroutine.cpp

[clang][Sema] Fix initialization of GRO when GRO-return type mismatches (CWG2563) (#179156)

This patch implements one piece of the proposed solution to
[CWG2563](https://cplusplus.github.io/CWG/issues/2563.html):

> get-return-object-invocation is as follows:
> ...
> otherwise, get-return-object-invocation initializes a variable with
> the exposition-only name gro as if by
> decltype(auto) gro = promise.get_return_object();

Close #98744
DeltaFile
+53 -0  clang/test/CodeGenCoroutines/coro-gro3.cpp
+2 -1  clang/lib/Sema/SemaCoroutine.cpp
+2 -0  clang/docs/ReleaseNotes.rst
+57 -1  3 files

LLVM/project b675369 clang/include/clang/Basic TargetCXXABI.h, clang/include/clang/StaticAnalyzer/Core AnalyzerOptions.h

[Clang][NFCI] Make unchanged global state const (#183478)

To avoid modifications to global state that does not currently need to
be modified, this patch makes a selection of trivial cases const. This
aims to help preserve the intention of these variables and reduce the
potential mutable global state surface of clang.

---------

Signed-off-by: Steffen Holst Larsen <sholstla at amd.com>
DeltaFile
+4 -4  clang/lib/CodeGen/CGBuiltin.cpp
+2 -2  clang/lib/AST/TypePrinter.cpp
+2 -2  clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
+2 -2  clang/include/clang/Basic/TargetCXXABI.h
+1 -1  clang/lib/CodeGen/ModuleBuilder.cpp
+1 -1  clang/lib/Options/DriverOptions.cpp
+12 -12  6 files not shown
+18 -18  12 files

LLVM/project ecb694d clang/lib/Analysis UnsafeBufferUsage.cpp

[Clang][NFCI] Initialize PredefinedNames immediately (#183295)

In isPredefinedUnsafeLibcFunc the set of predefined names is initialized
lazily. However, this pattern is redundant as function-scope static
variables are initialized on first pass through the control flow. This
commit makes the variable constant, makes it a non-heap object, and
initializes it immediately. This has the following benefits:
- The initialization pattern is cleaner and potentially easier for the
compiler to optimize.
- Making the variable const avoids it being used as mutable global
state.
- Having immediate initialization removes a potential race condition.
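
The two shapes can be contrasted in a small sketch. This is not the code from UnsafeBufferUsage.cpp: the function names, the container type, and the three listed names are illustrative assumptions. What it relies on is standard C++ behaviour: a function-scope static is initialized the first time control passes through its declaration, and since C++11 that initialization is thread-safe.

```cpp
#include <cassert>
#include <set>
#include <string>

// Lazy shape (roughly the "before"): a heap object behind a null check.
// The check-then-assign is unsynchronized, so two threads could race.
bool isPredefinedLazy(const std::string &name) {
  static std::set<std::string> *names = nullptr;
  if (!names)
    names = new std::set<std::string>{"atoi", "memcpy", "strcpy"};
  return names->count(name) != 0;
}

// Immediate shape (roughly the "after"): a const, non-heap static that the
// language initializes once, on first pass through this declaration.
bool isPredefinedImmediate(const std::string &name) {
  static const std::set<std::string> names = {"atoi", "memcpy", "strcpy"};
  return names.count(name) != 0;
}
```

Both functions answer lookups identically; the second simply lets the language handle the one-time initialization and keeps the set immutable.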

Signed-off-by: Steffen Holst Larsen <sholstla at amd.com>
DeltaFile
+72 -75  clang/lib/Analysis/UnsafeBufferUsage.cpp
+72 -75  1 files

LLVM/project 5da653d llvm/lib/Target/AMDGPU AMDGPUSubtarget.h

Option 1 - set FlatOffsetBitWidth to 0 by default
DeltaFile
+1 -1  llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
+1 -1  1 files

LLVM/project a631c3f mlir/test/Dialect/SPIRV/IR tosa-ops-verification.mlir

[mlir][spirv] Expand verifier testing for spirv.Tosa ops (#184112)

Also keep test order aligned with op definition order.

Signed-off-by: Davide Grohmann <davide.grohmann at arm.com>
DeltaFile
+163 -58  mlir/test/Dialect/SPIRV/IR/tosa-ops-verification.mlir
+163 -58  1 files

LLVM/project 7fb5a02 clang/include/clang/AST pch.h, clang/lib/AST CMakeLists.txt

[CMake][AST] Add PCH (#183358)

Add frequently used expensive headers from clang/AST to a PCH.

Results in a 13% stage2-clang build time improvement.
DeltaFile
+32 -0  clang/include/clang/AST/pch.h
+3 -0  clang/lib/AST/CMakeLists.txt
+35 -0  2 files