LLVM/project 3034c09 clang/lib/Format UnwrappedLineParser.cpp, clang/unittests/Format FormatTest.cpp

[clang-format] bugfix: Whitesmiths with IndentAccessModifiers (#182432)

Due to special handling of Whitesmiths when parsing, the additional
level(s) needed for the block, when used with IndentAccessModifiers,
were not being applied. Consequently, when calculating the access
modifier indent offset, the modifiers were being placed at the class
level.

This change ensures that the additional level(s) are not omitted for
Whitesmiths.
Delta  File
+17 -0  clang/unittests/Format/FormatTest.cpp
+9 -5  clang/lib/Format/UnwrappedLineParser.cpp
+26 -5  2 files

LLVM/project e61b516 clang/test/CodeGenOpenCL cl-uniform-wg-size.cl amdgpu-enqueue-kernel.cl, llvm/lib/IR AutoUpgrade.cpp

[AMDGPU] Make uniform-work-group-size a valueless attribute

The "uniform-work-group-size" function attribute previously took a
string value of "true" or "false". Since presence alone can convey
the "true" semantics and absence can convey "false", the value is
unnecessary.

This patch converts it to a valueless string attribute: presence
indicates true, absence indicates false. For backward compatibility,
auto-upgrade logic is added in both UpgradeAttributes (bitcode) and
UpgradeFunctionAttributes: if the old value is "true", the attribute
is kept without a value; if "false", the attribute is removed.

All setters (Clang CodeGen, OMPIRBuilder, AMDGPUAttributor, ROCDL
translation) and readers (AMDGPUAttributor, AMDGPULowerKernelAttributes,
AMDGPUHSAMetadataStreamer) are updated accordingly. The attribute is
also documented in the AMDGPU LLVM IR Attributes table where it was
previously missing.
Delta  File
+43 -17  clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl
+24 -26  clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
+21 -0  llvm/test/Bitcode/upgrade-uniform-work-group-size.ll
+21 -0  llvm/lib/IR/AutoUpgrade.cpp
+4 -9  llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
+5 -6  llvm/test/CodeGen/AMDGPU/uniform-work-group-propagate-attribute.ll
+118 -58  45 files not shown
+196 -138  51 files
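
The auto-upgrade rule described in the commit message above ("true" keeps the attribute without a value, "false" removes it) can be sketched outside LLVM as follows. This is a minimal model, not the code in AutoUpgrade.cpp: `StringAttrs`, `upgradeUniformWorkGroupSize` and `hasUniformWorkGroupSize` are hypothetical names, and leaving any other value untouched is an assumption, since the message does not say how such values are handled.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a function's string attributes:
// key -> optional value (std::nullopt models a valueless attribute).
using StringAttrs = std::map<std::string, std::optional<std::string>>;

// Sketch of the upgrade rule from the commit message.
void upgradeUniformWorkGroupSize(StringAttrs &attrs) {
  auto it = attrs.find("uniform-work-group-size");
  if (it == attrs.end() || !it->second.has_value())
    return; // absent or already valueless: nothing to upgrade
  if (*it->second == "true")
    it->second.reset(); // keep the attribute, drop the value
  else if (*it->second == "false")
    attrs.erase(it); // absence now conveys "false"
  // Assumption: any other value is left untouched in this sketch.
}

// Reader side after the change: presence alone conveys "true".
bool hasUniformWorkGroupSize(const StringAttrs &attrs) {
  return attrs.count("uniform-work-group-size") != 0;
}
```

With this model, readers no longer compare a string value; they only test for presence.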

LLVM/project 4d724c0 llvm/test/CodeGen/X86 known-never-zero.ll

[X86] known-never-zero.ll - add tests showing failure to handle ISD::EXTRACT_VECTOR_ELT nodes (#183934)

Delta  File
+70 -0  llvm/test/CodeGen/X86/known-never-zero.ll
+70 -0  1 files

LLVM/project 1909e43 mlir/lib/Dialect/GPU/IR GPUDialect.cpp, mlir/test/Dialect/GPU invalid.mlir

[mlir][GPU] Fix crash in WarpExecuteOnLane0Op::verify with wrong terminator (#183930)

WarpExecuteOnLane0Op::verify() called getTerminator() which performed an
unconditional cast<gpu::YieldOp> on the block's last operation. When the
op body was written with a different terminator (e.g. affine.yield), the
cast asserted immediately instead of emitting a verifier diagnostic.

Fix by using dyn_cast in verify() before calling getTerminator(), and
emitting a proper error message when the terminator is not gpu.yield.

Add a regression test to invalid.mlir.

Fixes #181450
Delta  File
+17 -0  mlir/test/Dialect/GPU/invalid.mlir
+3 -1  mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
+20 -1  2 files
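
The difference between an asserting cast and a checked cast in a verifier, as described in the commit message above, can be illustrated with plain C++ `dynamic_cast` standing in for `dyn_cast`. The types `Operation`, `YieldOp`, `OtherTerminator`, `Block` and the function `verifyTerminator` are hypothetical stand-ins, and the diagnostic strings are invented for illustration; they are not the messages emitted by GPUDialect.cpp.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified operation hierarchy.
struct Operation {
  virtual ~Operation() = default;
  virtual std::string name() const = 0;
};
struct YieldOp : Operation {
  std::string name() const override { return "gpu.yield"; }
};
struct OtherTerminator : Operation {
  std::string name() const override { return "affine.yield"; }
};

struct Block {
  std::vector<std::unique_ptr<Operation>> ops;
};

// Returns an error message when verification fails, std::nullopt on success.
std::optional<std::string> verifyTerminator(const Block &block) {
  if (block.ops.empty())
    return "expected a non-empty body";
  // dynamic_cast plays the role of dyn_cast here: it yields a null pointer
  // on a type mismatch instead of asserting, so a diagnostic can be returned.
  if (dynamic_cast<const YieldOp *>(block.ops.back().get()) == nullptr)
    return "expected body to terminate with gpu.yield, got " +
           block.ops.back()->name();
  return std::nullopt;
}
```

Only after such a check succeeds is it safe to call an accessor that casts unconditionally.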

LLVM/project 889714a clang/test/CodeGenOpenCL cl-uniform-wg-size.cl amdgpu-enqueue-kernel.cl, llvm/lib/IR AutoUpgrade.cpp

[AMDGPU] Make uniform-work-group-size a valueless attribute

The "uniform-work-group-size" function attribute previously took a
string value of "true" or "false". Since presence alone can convey
the "true" semantics and absence can convey "false", the value is
unnecessary.

This patch converts it to a valueless string attribute: presence
indicates true, absence indicates false. For backward compatibility,
auto-upgrade logic is added in both UpgradeAttributes (bitcode) and
UpgradeFunctionAttributes: if the old value is "true", the attribute
is kept without a value; if "false", the attribute is removed.

All setters (Clang CodeGen, OMPIRBuilder, AMDGPUAttributor, ROCDL
translation) and readers (AMDGPUAttributor, AMDGPULowerKernelAttributes,
AMDGPUHSAMetadataStreamer) are updated accordingly. The attribute is
also documented in the AMDGPU LLVM IR Attributes table where it was
previously missing.
Delta  File
+41 -17  clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl
+24 -26  clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
+21 -0  llvm/lib/IR/AutoUpgrade.cpp
+21 -0  llvm/test/Bitcode/upgrade-uniform-work-group-size.ll
+4 -9  llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
+5 -6  llvm/test/CodeGen/AMDGPU/uniform-work-group-propagate-attribute.ll
+116 -58  45 files not shown
+194 -138  51 files

LLVM/project e27bbd7 clang/test/CodeGenOpenCL cl-uniform-wg-size.cl

[NFC][Clang] Auto generate check lines for `clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl`
Delta  File
+31 -14  clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl
+31 -14  1 files

LLVM/project 2430410 lldb/docs index.rst, lldb/source/Plugins/Process/FreeBSD-Kernel-Core RegisterContextFreeBSDKernelCore_ppc64le.cpp RegisterContextFreeBSDKernelCore_ppc64le.h

[lldb][Process/FreeBSDKernelCore] Add ppc64le support (#180669)

This is the LLDB version of
https://cgit.freebsd.org/ports/tree/devel/gdb/files/kgdb/ppcfbsd-kern.c.
It enables selecting ppc64le and reading registers from the PCB structure
for core dumps and live kernel debugging. FPU registers aren't supported
yet due to a PCB structure issue, but this change still achieves feature
parity with KGDB. Trapframe unwinding support will be implemented in the
future. Test files using a core dump from ppc64le will be added once
other kernel debugging improvements are done.

---------

Signed-off-by: Minsoo Choo <minsoochoo0122 at proton.me>
Delta  File
+95 -0  lldb/source/Plugins/Process/FreeBSD-Kernel-Core/RegisterContextFreeBSDKernelCore_ppc64le.cpp
+33 -0  lldb/source/Plugins/Process/FreeBSD-Kernel-Core/RegisterContextFreeBSDKernelCore_ppc64le.h
+7 -0  lldb/source/Plugins/Process/FreeBSD-Kernel-Core/ThreadFreeBSDKernelCore.cpp
+1 -1  lldb/docs/index.rst
+1 -1  llvm/docs/ReleaseNotes.md
+1 -0  lldb/source/Plugins/Process/FreeBSD-Kernel-Core/CMakeLists.txt
+138 -2  6 files

LLVM/project 4a93b9a llvm/lib/Target/ARM ARMISelLowering.cpp, llvm/test/CodeGen/ARM fp-intrinsics-vector-v8.ll

[ARM] Lower strictfp vector fp16 rounding operations similar to default mode (#183700)

Previously the strictfp rounding nodes were lowered by unrolling to
scalar operations, which hurts performance. This issue was partially
fixed in #180480; this change continues that work and implements
optimized lowering for v4f16 and v8f16.
Delta  File
+10 -220  llvm/test/CodeGen/ARM/fp-intrinsics-vector-v8.ll
+7 -12  llvm/lib/Target/ARM/ARMISelLowering.cpp
+17 -232  2 files

LLVM/project a6ceae4 llvm/lib/Target/AMDGPU AMDGPUPromoteAlloca.cpp

[AMDGPU] Assert non-array alloca does have a size (#183834)

Refs
https://github.com/llvm/llvm-project/pull/179523/changes#r2851952141
Delta  File
+1 -2  llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+1 -2  1 files

LLVM/project f898469 clang/test/CodeGenOpenCL cl-uniform-wg-size.cl amdgpu-enqueue-kernel.cl, llvm/lib/IR AutoUpgrade.cpp

[AMDGPU] Make uniform-work-group-size a valueless attribute

The "uniform-work-group-size" function attribute previously took a
string value of "true" or "false". Since presence alone can convey
the "true" semantics and absence can convey "false", the value is
unnecessary.

This patch converts it to a valueless string attribute: presence
indicates true, absence indicates false. For backward compatibility,
auto-upgrade logic is added in both UpgradeAttributes (bitcode) and
UpgradeFunctionAttributes: if the old value is "true", the attribute
is kept without a value; if "false", the attribute is removed.

All setters (Clang CodeGen, OMPIRBuilder, AMDGPUAttributor, ROCDL
translation) and readers (AMDGPUAttributor, AMDGPULowerKernelAttributes,
AMDGPUHSAMetadataStreamer) are updated accordingly. The attribute is
also documented in the AMDGPU LLVM IR Attributes table where it was
previously missing.
Delta  File
+43 -17  clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl
+24 -26  clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
+21 -0  llvm/lib/IR/AutoUpgrade.cpp
+21 -0  llvm/test/Bitcode/upgrade-uniform-work-group-size.ll
+4 -9  llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
+5 -6  llvm/test/CodeGen/AMDGPU/uniform-work-group-propagate-attribute.ll
+118 -58  45 files not shown
+196 -138  51 files

LLVM/project 5d6410f clang/test/CodeGenOpenCL cl-uniform-wg-size.cl

[NFC][Clang] Auto generate check lines for `clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl`
Delta  File
+46 -14  clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl
+46 -14  1 files

LLVM/project 3d086f5 clang/lib/CIR/CodeGen CIRGenExprComplex.cpp, clang/test/CIR/CodeGen implicit-value-init-expr.cpp

[CIR] Implement ImplicitValueInitExpr for ComplexType (#183836)

Implement ImplicitValueInitExpr for ComplexType
Delta  File
+25 -0  clang/test/CIR/CodeGen/implicit-value-init-expr.cpp
+3 -3  clang/lib/CIR/CodeGen/CIRGenExprComplex.cpp
+28 -3  2 files

LLVM/project 7585ab0 llvm/lib/Target/AMDGPU GCNSubtarget.h, llvm/test/CodeGen/AMDGPU hazard-shift64.mir

[AMDGPU] Enable shift64 hazard recognition for gfx9 (#183839)

Enable shift64 hazard recognition for gfx9 cores.

---------

Signed-off-by: John Lu <John.Lu at amd.com>
Delta  File
+1 -3  llvm/lib/Target/AMDGPU/GCNSubtarget.h
+2 -0  llvm/test/CodeGen/AMDGPU/hazard-shift64.mir
+3 -3  2 files

LLVM/project d5a8f1e llvm/test/CodeGen/X86 known-pow2.ll

[X86] known-pow2.ll - add tests showing failure to handle ISD::EXTRACT_VECTOR_ELT nodes (#183918)

Delta  File
+49 -0  llvm/test/CodeGen/X86/known-pow2.ll
+49 -0  1 files

LLVM/project ddfbf52 llvm/lib/Target/ARM ARMISelLowering.cpp, llvm/test/CodeGen/ARM fp-intrinsics-vector-v8.ll

Lower strictfp vector rounding operations similar to default mode

Previously the strictfp rounding nodes were lowered by unrolling to
scalar operations, which hurts performance. This issue was partially
fixed in #180480; this change continues that work and implements
optimized lowering for v4f16 and v8f16.
Delta  File
+10 -220  llvm/test/CodeGen/ARM/fp-intrinsics-vector-v8.ll
+7 -12  llvm/lib/Target/ARM/ARMISelLowering.cpp
+17 -232  2 files

LLVM/project 86df0ee mlir/include/mlir/Support InterfaceSupport.h

Experiment: do not use fold expression
Delta  File
+18 -5  mlir/include/mlir/Support/InterfaceSupport.h
+18 -5  1 files

LLVM/project f9150cd mlir/include/mlir/IR BuiltinTypeInterfaces.td BuiltinAttributes.td, mlir/lib/AsmParser AttributeParser.cpp

[mlir][IR] Generalize `DenseElementsAttr` to custom element types (#179122)

`DenseElementsAttr` supports only a hard-coded list of element types:
`int`, `index`, `float`, `complex`. This commit generalizes the
`DenseElementsAttr` infrastructure: it now supports arbitrary element
types, as long as they implement the new `DenseElementTypeInterface`.

The `DenseElementTypeInterface` has the following helper functions:
- `getDenseElementBitSize`: Query the size of an element in bits. (When
storing an element in memory, each element is padded to a full byte.
This is an existing limitation of the `DenseElementsAttr`; with an
exception for `i1`.)
- `convertToAttribute`: Attribute factory / deserializer. Converts bytes
into an MLIR attribute. The attribute provides the assembly format /
printer for a single element.
- `convertFromAttribute`: Serializer. Converts an MLIR attribute into
bytes.

Note: `convertToAttribute` / `convertFromAttribute` are mainly for

    [23 lines not shown]
Delta  File
+124 -1  mlir/lib/AsmParser/AttributeParser.cpp
+25 -92  mlir/lib/IR/BuiltinAttributes.cpp
+87 -0  mlir/lib/IR/BuiltinTypes.cpp
+83 -0  mlir/test/IR/dense-elements-type-interface.mlir
+74 -1  mlir/include/mlir/IR/BuiltinTypeInterfaces.td
+32 -13  mlir/include/mlir/IR/BuiltinAttributes.td
+425 -107  8 files not shown
+579 -119  14 files
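
The byte-padding rule mentioned for `getDenseElementBitSize` in the commit message above (each stored element is padded to a full byte) and the serializer/deserializer pairing can be sketched as below. Everything here is an assumption made for illustration: `paddedByteSize`, `serializeElement` and `deserializeElement` are hypothetical helpers, the little-endian layout is chosen arbitrarily, the `i1` exception is not modeled, and elements are limited to 32 bits. It is not the MLIR interface itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of bytes needed to store one element of the given bit width,
// rounding up to a whole byte.
constexpr std::size_t paddedByteSize(std::size_t bitSize) {
  return (bitSize + 7) / 8;
}

// Hypothetical serializer: writes the value into its padded bytes,
// least significant byte first (an assumed layout).
std::vector<std::uint8_t> serializeElement(std::uint32_t value,
                                           std::size_t bitSize) {
  assert(bitSize >= 1 && bitSize <= 32 && "sketch handles 1..32 bits only");
  std::vector<std::uint8_t> bytes(paddedByteSize(bitSize));
  for (std::size_t i = 0; i < bytes.size(); ++i)
    bytes[i] = static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF);
  return bytes;
}

// Hypothetical deserializer: the inverse of serializeElement.
std::uint32_t deserializeElement(const std::vector<std::uint8_t> &bytes) {
  assert(bytes.size() <= 4 && "sketch handles at most 4 bytes");
  std::uint32_t value = 0;
  for (std::size_t i = 0; i < bytes.size(); ++i)
    value |= static_cast<std::uint32_t>(bytes[i]) << (8 * i);
  return value;
}
```

Under these assumptions a 12-bit element occupies two bytes, and a round trip through the two helpers returns the original value.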

LLVM/project 5b64aeb mlir/include/mlir/IR BuiltinTypeInterfaces.td BuiltinAttributes.td, mlir/lib/AsmParser AttributeParser.cpp

Revert "[mlir][IR] Generalize `DenseElementsAttr` to custom element types" (#183917)

Reverts llvm/llvm-project#183891

Reverting a second time. The build bot failure seems to be
non-deterministic.
Delta  File
+1 -124  mlir/lib/AsmParser/AttributeParser.cpp
+92 -25  mlir/lib/IR/BuiltinAttributes.cpp
+0 -87  mlir/lib/IR/BuiltinTypes.cpp
+0 -83  mlir/test/IR/dense-elements-type-interface.mlir
+1 -74  mlir/include/mlir/IR/BuiltinTypeInterfaces.td
+13 -32  mlir/include/mlir/IR/BuiltinAttributes.td
+107 -425  8 files not shown
+119 -579  14 files

LLVM/project 2342db0 lld/tools/lld CMakeLists.txt, llvm/cmake/modules LLVM-Config.cmake

[CMake] Use keyword signature in two additional callsites (#183889)

Fix-forward for https://github.com/llvm/llvm-project/pull/183541.
Two callsites to target_link_libraries were not migrated to the
keyword signature.

Signed-off-by: Itay Bookstein <itay.bookstein at nextsilicon.com>
Delta  File
+4 -1  llvm/cmake/modules/LLVM-Config.cmake
+1 -1  lld/tools/lld/CMakeLists.txt
+5 -2  2 files

LLVM/project be4a51d mlir/include/mlir/IR BuiltinTypeInterfaces.td BuiltinAttributes.td, mlir/lib/AsmParser AttributeParser.cpp

Revert "[mlir][IR] Generalize `DenseElementsAttr` to custom element types (#1…"

This reverts commit e655c36c16c118e3f8ae0c95854f33119218a4bf.
Delta  File
+1 -124  mlir/lib/AsmParser/AttributeParser.cpp
+92 -25  mlir/lib/IR/BuiltinAttributes.cpp
+0 -87  mlir/lib/IR/BuiltinTypes.cpp
+0 -83  mlir/test/IR/dense-elements-type-interface.mlir
+1 -74  mlir/include/mlir/IR/BuiltinTypeInterfaces.td
+13 -32  mlir/include/mlir/IR/BuiltinAttributes.td
+107 -425  8 files not shown
+119 -579  14 files

LLVM/project 225b56e mlir/lib/Conversion/VectorToLLVM ConvertVectorToLLVM.cpp, mlir/test/Conversion/VectorToLLVM vector-to-llvm-interface.mlir

[mlir][VectorToLLVM] Fix crash in VectorInsertOpConversion with dynamic index (#183783)

VectorInsertOpConversion crashes with an assertion failure when
inserting a sub-vector at a dynamic position into a multi-dimensional
vector. The pattern calls getAsIntegers() on the position, which asserts
that all fold results are compile-time constant attributes.

The existing guard (checking llvm::IsaPred<Attribute>) only covered the
case where a scalar is inserted into the innermost dimension (the
extractvalue path). The guard was missing for the insertvalue path when
inserting a sub-vector at a dynamic position into a nested aggregate.

Fix: add the same guard before the llvm.insertvalue creation to return
failure() gracefully when any position index is dynamic, matching the
behavior of VectorExtractOpConversion.

Fixes #177829
Delta  File
+14 -0  mlir/test/Conversion/VectorToLLVM/vector-to-llvm-interface.mlir
+5 -0  mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
+19 -0  2 files
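
The guard described in the commit message above (bail out before extracting integer positions when any index is dynamic) follows a general pattern that can be sketched without MLIR. In this sketch `DynamicValue`, `Position` and `staticPositions` are hypothetical names; `std::variant` stands in for a fold result that is either a constant attribute or an SSA value, and `std::nullopt` stands in for returning `failure()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a dynamic (runtime) index value.
struct DynamicValue {
  std::string name;
};

// A position is either a compile-time constant or a dynamic value.
using Position = std::variant<std::int64_t, DynamicValue>;

// Returns the constant indices, or std::nullopt if any index is dynamic.
// Checking first avoids the unconditional extraction that would otherwise
// fail on a dynamic entry.
std::optional<std::vector<std::int64_t>>
staticPositions(const std::vector<Position> &positions) {
  const bool allStatic =
      std::all_of(positions.begin(), positions.end(), [](const Position &p) {
        return std::holds_alternative<std::int64_t>(p);
      });
  if (!allStatic)
    return std::nullopt;

  std::vector<std::int64_t> result;
  result.reserve(positions.size());
  for (const Position &p : positions)
    result.push_back(std::get<std::int64_t>(p));
  return result;
}
```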

LLVM/project 2f7c947 llvm/test/CodeGen/ARM fp-intrinsics-vector-v8.ll

Precommit tests: strictfp rounding vector f16 intrinsics (#183699)

Delta  File
+361 -1  llvm/test/CodeGen/ARM/fp-intrinsics-vector-v8.ll
+361 -1  1 files

LLVM/project e655c36 mlir/include/mlir/IR BuiltinTypeInterfaces.td BuiltinAttributes.td, mlir/lib/AsmParser AttributeParser.cpp

[mlir][IR] Generalize `DenseElementsAttr` to custom element types (#183891)

`DenseElementsAttr` supports only a hard-coded list of element types:
`int`, `index`, `float`, `complex`. This commit generalizes the
`DenseElementsAttr` infrastructure: it now supports arbitrary element
types, as long as they implement the new `DenseElementTypeInterface`.

The `DenseElementTypeInterface` has the following helper functions:
- `getDenseElementBitSize`: Query the size of an element in bits. (When
storing an element in memory, each element is padded to a full byte.
This is an existing limitation of the `DenseElementsAttr`; with an
exception for `i1`.)
- `convertToAttribute`: Attribute factory / deserializer. Converts bytes
into an MLIR attribute. The attribute provides the assembly format /
printer for a single element.
- `convertFromAttribute`: Serializer. Converts an MLIR attribute into
bytes.

Note: `convertToAttribute` / `convertFromAttribute` are mainly for

    [26 lines not shown]
Delta  File
+124 -1  mlir/lib/AsmParser/AttributeParser.cpp
+25 -92  mlir/lib/IR/BuiltinAttributes.cpp
+87 -0  mlir/lib/IR/BuiltinTypes.cpp
+83 -0  mlir/test/IR/dense-elements-type-interface.mlir
+74 -1  mlir/include/mlir/IR/BuiltinTypeInterfaces.td
+32 -13  mlir/include/mlir/IR/BuiltinAttributes.td
+425 -107  8 files not shown
+579 -119  14 files

LLVM/project 72525fb llvm/lib/Transforms/Vectorize VPlanTransforms.cpp VPlanUnroll.cpp

[VPlan] Materialize UF after unrolling (NFCI).

Move materialization of the symbolic UF directly to unrollByUF. At this
point, unrolling materializes the decision and it is natural to also
materialize the symbolic UF here.
Delta  File
+3 -5  llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+4 -1  llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
+7 -6  2 files

LLVM/project 94ebc8a llvm/test/Transforms/LoopVectorize find-last-iv-sinkable-load.ll

[LV] Remove duplicated IV expression sinking tests. (NFC)

Remove duplicated tests already covered by
llvm/test/Transforms/LoopVectorize/find-last-iv-sinkable-expr.ll.
Delta  File
+0 -334  llvm/test/Transforms/LoopVectorize/find-last-iv-sinkable-load.ll
+0 -334  1 files

LLVM/project 0b61f15 llvm/test/CodeGen/AArch64 fcvt-i256.ll

[AArch64] Add fcvt-i256 test cases. NFC
Delta  File
+2,157 -0  llvm/test/CodeGen/AArch64/fcvt-i256.ll
+2,157 -0  1 files

LLVM/project 903acc2 clang/lib/CodeGen ItaniumCXXABI.cpp, clang/test/DebugInfo/CXX ptrauth-member-function-pointer-debuglocs.cpp

[AArch64][PAC] Emit `!dbg` locations in `*_vfpthunk_` functions (#179688)

The usage of pointers to member functions with Pointer Authentication
requires generation of `*_vfpthunk_` functions. These thunk functions
can be later inlined and optimized by replacing the indirect call
instruction with a direct one and then inlining that function call.

In the absence of `!dbg` metadata attached to the original call instruction,
such inlining ultimately results in the assertion "!dbg attachment points
at wrong subprogram for function" in assertions-enabled builds. Manually
running `opt` with the `-verify-each` option on the LLVM IR produced by the
frontend exposes the actual issue: "inlinable function call in a function
with debug info must have a !dbg location", reported after the indirect
call instruction is replaced with the direct one.

This commit fixes the issue by attaching artificial `!dbg` locations to
the original call instruction (as well as most other instructions in
`*_vfpthunk_` function) the same way it is done for other
compiler-generated helper functions.
Delta  File
+39 -0  clang/test/DebugInfo/CXX/ptrauth-member-function-pointer-debuglocs.cpp
+4 -0  clang/lib/CodeGen/ItaniumCXXABI.cpp
+43 -0  2 files

LLVM/project b3be782 mlir/lib/Dialect/Affine/IR AffineOps.cpp, mlir/test/Dialect/Affine canonicalize.mlir

[mlir][affine] Fix crash in linearize_index fold when multi-index is ub.poison (#183816)

`AffineLinearizeIndexOp::fold` guarded the constant-folding path with
`llvm::is_contained(adaptor.getMultiIndex(), nullptr)`, which only
catches operands that have not been evaluated at all. When an operand
folds to `ub.PoisonAttr`, the attribute is non-null so the guard passed,
and the subsequent `cast<IntegerAttr>(indexAttr)` call crashed with an
assertion failure.

Fix by replacing the null-only check with one that requires every
multi-index attribute to be a concrete `IntegerAttr`, returning
`nullptr` for any other attribute (including null and PoisonAttr).

Fixes #178204
Delta  File
+14 -0  mlir/test/Dialect/Affine/canonicalize.mlir
+6 -1  mlir/lib/Dialect/Affine/IR/AffineOps.cpp
+20 -1  2 files
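
The commit message above distinguishes a null-only guard from one that requires every operand to be a concrete integer. A standalone sketch of that distinction follows. `NotFolded`, `Poison`, `FoldedAttr`, `oldGuardPasses` and `foldLinearize` are hypothetical names, and the row-major linearization formula is an assumption used for illustration; this is not the code in AffineOps.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <variant>
#include <vector>

struct NotFolded {}; // models a null attribute (operand not evaluated)
struct Poison {};    // models an operand that folded to a poison attribute
using FoldedAttr = std::variant<NotFolded, Poison, std::int64_t>;

// Null-only guard: rejects unevaluated operands but lets poison through.
bool oldGuardPasses(const std::vector<FoldedAttr> &multiIndex) {
  for (const FoldedAttr &attr : multiIndex)
    if (std::holds_alternative<NotFolded>(attr))
      return false;
  return true;
}

// Stricter guard plus fold: every operand must be a concrete integer,
// otherwise no fold happens (std::nullopt models returning nullptr).
// Assumed formula: for indices (i0, i1, i2) and basis (b0, b1, b2) the
// result is i0*b1*b2 + i1*b2 + i2.
std::optional<std::int64_t>
foldLinearize(const std::vector<FoldedAttr> &multiIndex,
              const std::vector<std::int64_t> &basis) {
  if (multiIndex.size() != basis.size())
    return std::nullopt;
  for (const FoldedAttr &attr : multiIndex)
    if (!std::holds_alternative<std::int64_t>(attr))
      return std::nullopt;

  std::int64_t result = 0;
  std::int64_t stride = 1;
  // Walk from the innermost index outwards, accumulating the stride.
  for (std::size_t i = multiIndex.size(); i-- > 0;) {
    result += std::get<std::int64_t>(multiIndex[i]) * stride;
    stride *= basis[i];
  }
  return result;
}
```

A poison operand passes `oldGuardPasses` yet makes `foldLinearize` decline to fold, which is the gap the fix closes.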

LLVM/project f05b705 mlir/test/IR visitors.mlir

[mlir] Fix crash in testNoSkipErasureCallbacks on empty blocks (#183757)

The `noSkipBlockErasure` callback in `testNoSkipErasureCallbacks` called
`block->front().getParentRegion()` to get the parent region of a block.
This dereferences the ilist sentinel node when the block has no
operations, triggering an assertion failure.

Use `block->getParent()` instead, which directly returns the region
containing the block without requiring any operations to be present.

Fixes #183511
Delta  File
+10 -0  mlir/test/IR/visitors.mlir
+10 -0  1 files
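
The commit message above explains why reaching a block's parent through its first operation fails on an empty block. A simplified model: `Region`, `Operation`, `Block` and `parentViaFront` here are hypothetical stand-ins that only mirror the shape of the problem, not MLIR's classes.

```cpp
#include <cassert>
#include <list>

struct Region {
  int id;
};

struct Block;

struct Operation {
  Block *parent; // the block that contains this operation
};

struct Block {
  Region *parent = nullptr;
  std::list<Operation> ops;

  // Direct lookup: valid whether or not the block has operations.
  Region *getParent() const { return parent; }
};

// Indirect lookup through the first operation. Calling front() on an
// empty list is undefined behaviour, so this is only valid for
// non-empty blocks; the assert makes that precondition explicit.
Region *parentViaFront(const Block &block) {
  assert(!block.ops.empty() && "front() on an empty block is invalid");
  return block.ops.front().parent->getParent();
}
```

For a non-empty block both lookups agree; for an empty block only `getParent()` is safe to call.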

LLVM/project 2456214 llvm/lib/ProfileData/Coverage CoverageMapping.cpp, llvm/test/tools/llvm-cov mcdc-macro.test

Restore #125407, Make covmap tolerant of nested Decisions (#183073)

Change(s):

- Suppress range errors in CounterExpr
Delta  File
+144 -174  llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
+7 -7  llvm/test/tools/llvm-cov/mcdc-macro.test
+151 -181  2 files