LLVM/project e778083 llvm/utils/gn/secondary/libcxx/include BUILD.gn

[gn build] Port b285745dc1a4
Delta  File
+1 -0  llvm/utils/gn/secondary/libcxx/include/BUILD.gn
+1 -0  1 file

LLVM/project 0e4ce03 llvm/utils/gn/secondary/llvm/unittests/CodeGen BUILD.gn

[gn build] Port 23a051301541
Delta  File
+1 -0  llvm/utils/gn/secondary/llvm/unittests/CodeGen/BUILD.gn
+1 -0  1 file

LLVM/project f8e5908 llvm/utils/gn/secondary/clang/lib/StaticAnalyzer/Checkers BUILD.gn

[gn build] Port 2158b83f61a7
Delta  File
+1 -0  llvm/utils/gn/secondary/clang/lib/StaticAnalyzer/Checkers/BUILD.gn
+1 -0  1 file

LLVM/project fcd0ef0 llvm/lib/Target/AArch64 AArch64InstrInfo.td, llvm/test/CodeGen/AArch64 arm64-cvt-simd-fptoi.ll round-conv.ll

[AArch64] Fuse froundeven+convert into single instruction (#177800)

Stacked on https://github.com/llvm/llvm-project/pull/177799.

We're already able to fuse `fceil`, `ffloor`, `ftrunc`, and `fround`
followed by a float-to-int conversion into a single "rounded conversion"
instruction. However, we were not doing this for `froundeven`, even
though there's a "convert to integer, rounding to even" instruction
(`FCVTNS`/`FCVTNU`).
Delta  File
+40 -80  llvm/test/CodeGen/AArch64/arm64-cvt-simd-fptoi.ll
+8 -16  llvm/test/CodeGen/AArch64/round-conv.ll
+8 -16  llvm/test/CodeGen/AArch64/round-fptosi-sat-scalar.ll
+8 -16  llvm/test/CodeGen/AArch64/round-fptoui-sat-scalar.ll
+10 -8  llvm/lib/Target/AArch64/AArch64InstrInfo.td
+74 -136  5 files
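
To make the rounding behaviour behind the `froundeven` fusion concrete, here is a minimal Python sketch of the semantics only, not of the LLVM patterns. It assumes Python's built-in `round`, which rounds halfway cases to the nearest even integer, as a stand-in for the rounding step; the helper names `round_even`, `two_step`, and `fused` are hypothetical, and saturation or overflow of the hardware conversion is not modeled.

```python
def round_even(x: float) -> float:
    """Round to the nearest integer, ties to even, staying in floating point.

    Stand-in for the `froundeven` step; assumes x is finite.
    """
    return float(round(x))


def two_step(x: float) -> int:
    """Unfused form: round to even first, then convert to an integer."""
    return int(round_even(x))


def fused(x: float) -> int:
    """Fused form: a single rounding conversion (ties to even)."""
    return round(x)


# Halfway cases are the interesting ones: they go to the even neighbour.
samples = [0.5, 1.5, 2.5, -0.5, -1.5, 3.49, 3.51]
results = [(x, two_step(x), fused(x)) for x in samples]
```

Both forms agree on every sample, which is the property that lets a round-then-convert pair be replaced by one rounding conversion.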

LLVM/project e5eed4f mlir/lib/Bindings/Python DialectTransform.cpp, mlir/python/mlir/dialects ext.py

Formatting and minor load change
Delta  File
+3 -9  mlir/python/mlir/dialects/ext.py
+6 -3  mlir/lib/Bindings/Python/DialectTransform.cpp
+9 -12  2 files

LLVM/project 319b089 llvm/lib/Target/AMDGPU SIWholeQuadMode.cpp

Formatting fixed.
Delta  File
+2 -3  llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
+2 -3  1 file

LLVM/project 90cc5a4 clang/docs ClangFormatStyleOptions.rst ReleaseNotes.rst, clang/include/clang/Format Format.h

[clang-format] Add ObjCSpaceBeforeMethodDeclColon option to control space before Objective-C method return type (#170579)

This patch introduces the ObjCSpaceBeforeMethodDeclColon style option,
allowing users to add or remove a space between the '-'/'+' and the
return type in Objective-C method declarations (e.g., '- (void)method'
vs '-(void)method').

Includes documentation and unit tests.
Delta  File
+12 -0  clang/docs/ClangFormatStyleOptions.rst
+12 -0  clang/include/clang/Format/Format.h
+8 -0  clang/unittests/Format/FormatTest.cpp
+3 -0  clang/lib/Format/Format.cpp
+2 -0  clang/docs/ReleaseNotes.rst
+1 -1  clang/lib/Format/TokenAnnotator.cpp
+38 -1  1 file not shown
+39 -1  7 files

LLVM/project 72147b4 llvm/lib/Target/RISCV RISCVInstrInfoV.td RISCVInstrFormatsV.td

[RISCV] Remove RVInstV2. NFC (#177901)

It only has 2 users and RVInstV2 isn't a good name. We can use RVInstVX
and set vs2 to 0 at the definition.
Delta  File
+14 -6  llvm/lib/Target/RISCV/RISCVInstrInfoV.td
+0 -9  llvm/lib/Target/RISCV/RISCVInstrFormatsV.td
+14 -15  2 files

LLVM/project 94f8a32 utils/bazel/llvm-project-overlay/lldb BUILD.bazel, utils/bazel/llvm-project-overlay/lldb/source/Plugins BUILD.bazel

[bazel] Fix macOS lldb BUILD (#178274)

Missing dep + a layering_check violation in macOS only deps (not tested
on current CI)
Delta  File
+1 -0  utils/bazel/llvm-project-overlay/lldb/BUILD.bazel
+1 -0  utils/bazel/llvm-project-overlay/lldb/source/Plugins/BUILD.bazel
+2 -0  2 files

LLVM/project d1e588d libc/src/stdio snprintf_modular.cpp sprintf_modular.cpp, libc/src/stdio/baremetal vprintf_modular.cpp printf_modular.cpp

[libc] Modular printf option (float only)

This adds LIBC_CONF_PRINTF_MODULAR, which causes floating point support
(later, others) to be weakly linked into the implementation.
__printf_modular becomes the main entry point of the implementation, and
printf itself wraps __printf_modular. printf also contains a
BFD_RELOC_NONE relocation to bring in the float aspect.

See issue #146159 for context.
Delta  File
+67 -0  libc/src/stdio/baremetal/vprintf_modular.cpp
+45 -13  libc/src/stdio/printf_core/parser.h
+56 -0  libc/src/stdio/snprintf_modular.cpp
+56 -0  libc/src/stdio/printf_core/float_impl.cpp
+55 -0  libc/src/stdio/sprintf_modular.cpp
+55 -0  libc/src/stdio/baremetal/printf_modular.cpp
+334 -13  42 files not shown
+772 -33  48 files

LLVM/project 85812fd polly/lib/Transform ZoneAlgo.cpp

[Polly][DeLICM] Check for error state (#178281)

When the ISL max-operations is exceeded, `is_wrapping` will return an
error state. Propagate the error state to the caller.

Fixes #175953
Delta  File
+4 -1  polly/lib/Transform/ZoneAlgo.cpp
+4 -1  1 file
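
The idea of propagating an error state instead of folding it into true or false can be sketched in Python as below. This is an illustrative sketch, not the Polly code: `TriBool`, `any_wrapping`, and `fake_check` are hypothetical names, and the three-valued result is only loosely modeled on a boolean-with-error return value.

```python
from enum import Enum
from typing import Callable, Iterable


class TriBool(Enum):
    """Three-valued result: a plain bool plus an explicit error state."""
    ERROR = -1
    FALSE = 0
    TRUE = 1


def any_wrapping(items: Iterable[int],
                 is_wrapping: Callable[[int], TriBool]) -> TriBool:
    """Return TRUE if any item wraps, but surface ERROR to the caller."""
    for item in items:
        result = is_wrapping(item)
        if result is TriBool.ERROR:
            # Do not coerce the error into TRUE or FALSE; pass it on.
            return TriBool.ERROR
        if result is TriBool.TRUE:
            return TriBool.TRUE
    return TriBool.FALSE


def fake_check(n: int) -> TriBool:
    """Hypothetical predicate: a negative input stands for an exceeded budget."""
    if n < 0:
        return TriBool.ERROR
    return TriBool.TRUE if n > 100 else TriBool.FALSE
```

A caller can then distinguish "no wrapping found" from "the check could not be completed" and bail out in the latter case.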

LLVM/project 4239e85 llvm/lib/Target/AArch64 AArch64TargetTransformInfo.h AArch64InstrInfo.td, llvm/test/CodeGen/AArch64 nontemporal-load.ll

[AArch64] Align nontemporal store/load little-endian checks (#177468)

This patch aims to align all nontemporal store/load handling to
systematically enforce a little-endian target. This has been the
effective support LLVM had for NT store/load lowering (there has been no
effective support for big-endian, even with the inconsistencies).

The change in `llvm/lib/Target/AArch64/AArch64InstrInfo.td` is
effectively an NFC, because the only lowering of LDNP, in
`llvm/lib/Target/AArch64/AArch64ISelLowering.cpp`, has already checked
for `isLittleEndian`. The change in
`llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h` affects its
single caller,
`llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp`. The
previous logic was wrong, enabling vectorization of effectively
illegal nontemporal store/load instructions on big-endian.
Delta  File
+189 -189  llvm/test/CodeGen/AArch64/nontemporal-load.ll
+115 -65  llvm/test/Transforms/LoopVectorize/AArch64/nontemporal-load-store.ll
+19 -6  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
+14 -2  llvm/lib/Target/AArch64/AArch64InstrInfo.td
+10 -0  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+347 -262  5 files

LLVM/project 5ac945f llvm/lib/Target/AMDGPU SIWholeQuadMode.cpp, llvm/test/CodeGen/AMDGPU wqm-debug-instr.mir

[AMDGPU] Fix crash in SIWholeQuadMode with debug instructions.

The prepareInsertion function was crashing when debug instructions
appeared at positions being queried for slot indices. Debug instructions
don't have entries in the slot index map, so getInstructionIndex would
fail with an assertion.

Fixes SWDEV-480902.
Delta  File
+122 -0  llvm/test/CodeGen/AMDGPU/wqm-debug-instr.mir
+12 -6  llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
+134 -6  2 files
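
The failure mode described above (looking up a slot index for an instruction that has no entry) can be illustrated with a small Python model. This is a sketch under assumptions, not the actual patch: `Instr`, `build_slot_indexes`, and `index_at_or_after` are hypothetical, and skipping forward to the next non-debug instruction is just one plausible way to avoid the bad lookup.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Instr:
    name: str               # unique within the block in this toy model
    is_debug: bool = False


def build_slot_indexes(block: List[Instr]) -> Dict[str, int]:
    """Number only non-debug instructions; debug ones get no entry."""
    indexes: Dict[str, int] = {}
    slot = 0
    for instr in block:
        if not instr.is_debug:
            indexes[instr.name] = slot
            slot += 16
    return indexes


def index_at_or_after(block: List[Instr], pos: int,
                      indexes: Dict[str, int]) -> Optional[int]:
    """Slot index of the first non-debug instruction at or after `pos`."""
    for instr in block[pos:]:
        if not instr.is_debug:
            return indexes[instr.name]
    return None  # only debug instructions remain; the caller must handle this


block = [Instr("v_add"), Instr("dbg_value", is_debug=True), Instr("v_mul")]
indexes = build_slot_indexes(block)
```

Querying position 1 directly with `indexes["dbg_value"]` would raise a `KeyError`, the analogue of the assertion; `index_at_or_after(block, 1, indexes)` instead resolves to the index of `v_mul`.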

LLVM/project e470fa2 clang/test/CodeGen stack-protector-vars.cpp

Add -emit-llvm to RUN line for test added in #173311. (#178271)

Should fix the test failure on buildbots that do not build the x86
backend such as
https://lab.llvm.org/buildbot/#/builders/190/builds/35171 and
https://lab.llvm.org/buildbot/#/builders/154/builds/26976.
Delta  File
+1 -1  clang/test/CodeGen/stack-protector-vars.cpp
+1 -1  1 file

LLVM/project bc6aab2 llvm/lib/Target/AArch64 AArch64InstrInfo.td, llvm/test/CodeGen/AArch64 round-fptosi-sat-scalar.ll round-fptoui-sat-scalar.ll

[AArch64] Add missing GlobalISel patterns to round+convert multiclass (#177799)

This allows GlobalISel to fuse floating point round+convert operations
in the same way as SelectionDAG.
Delta  File
+70 -510  llvm/test/CodeGen/AArch64/round-fptosi-sat-scalar.ll
+70 -350  llvm/test/CodeGen/AArch64/round-fptoui-sat-scalar.ll
+15 -0  llvm/lib/Target/AArch64/AArch64InstrInfo.td
+155 -860  3 files

LLVM/project f2921e5 llvm/lib/Transforms/InstCombine InstCombineCompares.cpp, llvm/test/Transforms/InstCombine icmp-with-selects.ll

[InstCombine][profcheck] More fixes for missing branch data in InstCombineCompares.cpp (#178084)

Again, these fixes are trivial as we're creating new select instructions
with predicates from existing select instructions.

In this case, we create one select instruction from two existing select
instructions, but since both existing select instructions have the same
predicate, their profile data should be the same, so we can reuse the
profile data from either instruction. Therefore, we arbitrarily reuse
the profile data from the first select instruction.

Tracking issue: #147390
Delta  File
+20 -11  llvm/test/Transforms/InstCombine/icmp-with-selects.ll
+6 -2  llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+0 -1  llvm/utils/profcheck-xfail.txt
+26 -14  3 files

LLVM/project f7968da llvm/lib/Target/AMDGPU SIWholeQuadMode.cpp, llvm/test/CodeGen/AMDGPU wqm-debug-instr.mir

[AMDGPU] Fix crash in SIWholeQuadMode with debug instructions.

The prepareInsertion function was crashing when debug instructions
appeared at positions being queried for slot indices. Debug instructions
don't have entries in the slot index map, so getInstructionIndex would
fail with an assertion.
Delta  File
+122 -0  llvm/test/CodeGen/AMDGPU/wqm-debug-instr.mir
+12 -6  llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
+134 -6  2 files

LLVM/project 5780afb llvm/include/llvm/CodeGen LiveIntervals.h, llvm/lib/CodeGen LiveIntervals.cpp

Use yet another allocator for LiveRanges

Not sure it's worth it for these, there should never be all that
many. We could pre-allocate the maximum size up front.
Delta  File
+5 -3  llvm/include/llvm/CodeGen/LiveIntervals.h
+1 -2  llvm/lib/CodeGen/LiveIntervals.cpp
+6 -5  2 files

LLVM/project 6ba0245 llvm/include/llvm/CodeGen LiveIntervals.h, llvm/lib/CodeGen LiveIntervals.cpp

Use SpecificBumpPtrAllocator for LiveInterval

I didn't realize we used a singly linked list for storing subranges.
That seems bad and we should probably switch this to an array.
Delta  File
+5 -5  llvm/lib/CodeGen/LiveIntervals.cpp
+5 -1  llvm/include/llvm/CodeGen/LiveIntervals.h
+10 -6  2 files

LLVM/project 1545952 llvm/include/llvm/CodeGen LiveIntervals.h, llvm/lib/CodeGen LiveIntervals.cpp

LiveIntervals: Use BumpPtrAllocator
Delta  File
+8 -4  llvm/include/llvm/CodeGen/LiveIntervals.h
+5 -4  llvm/lib/CodeGen/LiveIntervals.cpp
+13 -8  2 files

LLVM/project 9ec6a12 llvm/lib/Target/RISCV RISCVInstrInfoZvk.td

[RISCV] Replace VPatBinaryV_VX_VROTATE with VPatBinaryV_VX. NFC (#178254)

VPatBinaryV_VX_VROTATE appeared to be an almost exact copy and paste of
VPatBinaryV_VX except it used 'XLenVT' instead of 'vti.Scalar'.
'vti.Scalar' is 'XLenVT' for integer vectors so this wasn't a real
difference.

This change allows VV_VX or VV_VX_VI combination classes to be used,
further reducing the code.

No tablegen outputs change with this patch.
Delta  File
+2 -9  llvm/lib/Target/RISCV/RISCVInstrInfoZvk.td
+2 -9  1 file

LLVM/project 2753e1d llvm/lib/Target/RISCV RISCVTargetTransformInfo.cpp, llvm/test/Analysis/CostModel/RISCV arith-int.ll

[RISCV] Set the reciprocal throughput cost for division to TTI::TCC_Expensive (#177516)

Fixes #176208. Scaled back version of #176515 that only affects the RISCV backend.

Only modifies the cost for cases when DIV is a legal operation.

Updates the cost for both Scalar and Vector types.

Used `TTI::TCC_Expensive` as suggested by
https://github.com/llvm/llvm-project/issues/176208#issuecomment-3760902537.

---------

Co-authored-by: Luke Lau <luke_lau at icloud.com>
Delta  File
+252 -252  llvm/test/Analysis/CostModel/RISCV/arith-int.ll
+20 -2  llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+5 -10  llvm/test/Transforms/SLPVectorizer/RISCV/vec3-base.ll
+2 -2  llvm/test/Transforms/SLPVectorizer/RISCV/load-binop-store.ll
+2 -2  llvm/test/Transforms/LoopVectorize/RISCV/divrem.ll
+281 -268  5 files

LLVM/project c183af3 clang/lib/CIR/CodeGen CIRGenCall.cpp CIRGenFunctionInfo.h, clang/lib/CIR/Lowering/DirectToLLVM LowerToLLVM.cpp

[CIR] Implement 'noreturn' attribute for functions/calls. (#177978)

This mirrors what LLVM does, and requires propagating into the LLVM
dialect: When the user specifies 'noreturn' we propagate this down
throughout the stack.

Note that the similar 'willreturn' is too strong a guarantee: the two are
not opposites of each other, as there is an 'unknown' state implied by all
other functions, so we cannot use it on non-noreturn functions.
Delta  File
+39 -8  clang/lib/CIR/CodeGen/CIRGenCall.cpp
+41 -0  clang/test/CIR/CodeGen/noreturn.cpp
+22 -4  clang/lib/CIR/CodeGen/CIRGenFunctionInfo.h
+25 -0  mlir/test/Target/LLVMIR/llvmir.mlir
+11 -2  clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+12 -0  mlir/test/Target/LLVMIR/Import/instructions.ll
+150 -14  12 files not shown
+186 -22  18 files
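
The reasoning about 'noreturn' versus 'willreturn' can be made concrete with a small Python sketch. It is illustrative only and not the CIR implementation: `function_attrs` is a hypothetical helper that models attributes as plain strings.

```python
from typing import FrozenSet


def function_attrs(user_noreturn: bool) -> FrozenSet[str]:
    """Attributes to attach, given only whether the user wrote 'noreturn'."""
    if user_noreturn:
        # The user's promise is passed down unchanged.
        return frozenset({"noreturn"})
    # Not being 'noreturn' does not prove the function always returns
    # (it may loop forever or exit), so 'willreturn' must not be inferred.
    return frozenset()
```

The point of the empty set in the second branch is that "unknown" is a third state, so the absence of one attribute never implies the other.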

LLVM/project 40f6923 mlir/test/python/dialects transform_op_interface.py

Add docstrings to tests
Delta  File
+23 -2  mlir/test/python/dialects/transform_op_interface.py
+23 -2  1 file

LLVM/project bdfe03b clang/lib/CodeGen CodeGenModule.cpp CGCall.cpp, clang/test/CodeGen/Mips musttail-pic.c musttail.c

[MIPS][ISel] Fix musttail (#161860)

Properly handle clang::musttail attribute on MIPS backend.

It fixes: https://github.com/llvm/llvm-project/issues/161193
Delta  File
+120 -0  llvm/test/CodeGen/Mips/musttail.ll
+36 -16  llvm/lib/Target/Mips/MipsISelLowering.cpp
+32 -9  clang/lib/CodeGen/CodeGenModule.cpp
+21 -0  clang/test/CodeGen/Mips/musttail-pic.c
+19 -0  clang/test/CodeGen/Mips/musttail.c
+17 -1  clang/lib/CodeGen/CGCall.cpp
+245 -26  5 files not shown
+296 -35  11 files

LLVM/project cd13943 clang/lib/CodeGen CGExpr.cpp

Fixup code comments regarding matrix element indexing
Delta  File
+2 -3  clang/lib/CodeGen/CGExpr.cpp
+2 -3  1 file

LLVM/project 98e55a9 clang/test/CodeGenHLSL/BasicFeatures MatrixToVectorCast.hlsl VectorElementwiseCast.hlsl

Move tests to existing VectorElementwiseCast test file
Delta  File
+0 -68  clang/test/CodeGenHLSL/BasicFeatures/MatrixToVectorCast.hlsl
+60 -1  clang/test/CodeGenHLSL/BasicFeatures/VectorElementwiseCast.hlsl
+60 -69  2 files

LLVM/project a2fb416 libcxx/test/benchmarks/algorithms lower_bound.bench.cpp, libcxx/test/benchmarks/algorithms/nonmodifying lower_upper_bound.bench.cpp

[libc++] Rewrite the std::lower_bound benchmark to be more efficient and add an upper_bound benchmark (#177180)

The current benchmark is incredibly slow to run. This patch refactors
the benchmark to be faster and also adds an equivalent benchmark for
`std::upper_bound`.

Fixes #177026
Delta  File
+86 -0  libcxx/test/benchmarks/algorithms/nonmodifying/lower_upper_bound.bench.cpp
+0 -42  libcxx/test/benchmarks/algorithms/lower_bound.bench.cpp
+86 -42  2 files

LLVM/project 18280cf flang/include/flang/Optimizer/OpenACC/Support FIROpenACCTypeInterfaces.h, flang/lib/Optimizer/OpenACC/Support FIROpenACCTypeInterfaces.cpp RegisterOpenACCExtensions.cpp

[flang][acc] Use ReducibleType interface on LogicalType (#178253)

Introduce a new ReducibleType type interface in the OpenACC dialect that
provides a type-aware mechanism for translating OpenACC reduction
operators to arith::AtomicRMWKind values. This interface should be
attached to value types that can participate in OpenACC reductions.

For FIR, implement this interface on fir::LogicalType to handle the
AccLand and AccLor reduction operators, which map to
arith::AtomicRMWKind::andi and ori respectively.
Delta  File
+93 -0  mlir/unittests/Dialect/OpenACC/OpenACCTypeInterfacesTest.cpp
+29 -0  mlir/include/mlir/Dialect/OpenACC/OpenACCTypeInterfaces.td
+19 -0  flang/lib/Optimizer/OpenACC/Support/FIROpenACCTypeInterfaces.cpp
+7 -0  flang/include/flang/Optimizer/OpenACC/Support/FIROpenACCTypeInterfaces.h
+2 -0  flang/lib/Optimizer/OpenACC/Support/RegisterOpenACCExtensions.cpp
+1 -0  mlir/include/mlir/Dialect/OpenACC/OpenACC.h
+151 -0  1 file not shown
+152 -0  7 files
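
The type-aware operator mapping described for ReducibleType can be sketched in Python as follows. This is a rough model, not the MLIR or FIR interface: the classes, the method name `atomic_kind_for`, and the use of `None` for unsupported operators are assumptions, and `ACC_ADD` is included only to show an unmapped case. Only the AccLand to andi and AccLor to ori pairs come from the commit message.

```python
from enum import Enum
from typing import Optional


class ReductionOperator(Enum):
    ACC_ADD = "AccAdd"    # assumed name, present only as an unmapped example
    ACC_LAND = "AccLand"
    ACC_LOR = "AccLor"


class AtomicRMWKind(Enum):
    ANDI = "andi"
    ORI = "ori"


class LogicalType:
    """Hypothetical stand-in for a logical (boolean-like) value type."""

    _TABLE = {
        ReductionOperator.ACC_LAND: AtomicRMWKind.ANDI,
        ReductionOperator.ACC_LOR: AtomicRMWKind.ORI,
    }

    def atomic_kind_for(self, op: ReductionOperator) -> Optional[AtomicRMWKind]:
        """Translate a reduction operator for this type, or None if unsupported."""
        return self._TABLE.get(op)
```

Keeping the table on the type rather than in one global switch is what makes the translation type-aware: another value type could map the same operator to a different kind.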

LLVM/project 4ccdd2f libcxx/include/__algorithm pstl.h, libcxx/include/__pstl backend_fwd.h

[libc++][pstl] Generic implementation of parallel std::is_sorted (#176129)

This PR implements a generic backend-agnostic parallel `std::is_sorted`
based on `std::transform_reduce`.

While this approach is suboptimal compared to a direct backend-specific
implementation, since it doesn't support early termination and requires
a reduction operation, it does show speedup when the dataset is large
enough and the comparator is not absolutely trivial.

Parent issue: #99938
Delta  File
+230 -0  libcxx/test/std/algorithms/alg.sorting/alg.sort/is.sorted/pstl.is_sorted_comp.pass.cpp
+229 -0  libcxx/test/std/algorithms/alg.sorting/alg.sort/is.sorted/pstl.is_sorted.pass.cpp
+32 -0  libcxx/include/__pstl/backends/default.h
+25 -0  libcxx/include/__algorithm/pstl.h
+8 -0  libcxx/test/std/algorithms/pstl.exception_handling.pass.cpp
+6 -0  libcxx/include/__pstl/backend_fwd.h
+530 -0  1 file not shown
+534 -0  7 files
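
The transform-then-reduce formulation of `is_sorted` can be sketched sequentially in Python as below. This only illustrates the shape of the algorithm and is not the libc++ code: `is_sorted_transform_reduce` is a hypothetical name, and the parallel execution a real backend would provide is not modeled.

```python
from functools import reduce
from operator import and_, lt
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def is_sorted_transform_reduce(values: Sequence[T],
                               comp: Callable[[T, T], bool] = lt) -> bool:
    """A range is sorted iff no adjacent pair (a, b) satisfies comp(b, a)."""
    # Transform: one boolean per adjacent pair. Each result is independent
    # of the others, which is what lets a backend compute them in parallel.
    in_order = (not comp(b, a) for a, b in zip(values, values[1:]))
    # Reduce: logical AND with identity True. Every pair is visited, so
    # there is no early exit at the first out-of-order pair.
    return reduce(and_, in_order, True)
```

For example, `is_sorted_transform_reduce([1, 3, 2])` is false because the pair (3, 2) is out of order, yet the reduction still consumes every pair, which matches the trade-off described in the commit message.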