LLVM/project 375f668llvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange reduction-extra-use-in-inner-loop.ll

[LoopInterchange] Reject if outer reduction value has extra user
DeltaFile
+13-23llvm/test/Transforms/LoopInterchange/reduction-extra-use-in-inner-loop.ll
+20-0llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+33-232 files

LLVM/project 8ccd62cllvm/test/Transforms/LoopInterchange reduction-extra-use-in-inner-loop.ll

[LoopInterchange] Add test for extra reduction use in inner loop (NFC)
DeltaFile
+281-0llvm/test/Transforms/LoopInterchange/reduction-extra-use-in-inner-loop.ll
+281-01 files

LLVM/project 8bb9b2ellvm/lib/Target/AArch64 AArch64TargetTransformInfo.cpp, llvm/test/Analysis/CostModel/AArch64 sve-vector-reduce-fp.ll sve-intrinsics.ll

[LLVM][CostModel][SVE] Return InvalidCost for bfloat scalable vector ordered arithmetic reductions. (#202569)
DeltaFile
+1,137-0llvm/test/Analysis/CostModel/AArch64/sve-vector-reduce-fp.ll
+0-76llvm/test/Analysis/CostModel/AArch64/sve-intrinsics.ll
+1-1llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+1,138-773 files

LLVM/project ed29c68libc/fuzzing CMakeLists.txt, libc/fuzzing/arpa CMakeLists.txt

[libc] Add a differential fuzzer for inet_aton (#200341)
DeltaFile
+50-0libc/fuzzing/arpa/inet/inet_aton_differential_fuzz.cpp
+9-0libc/fuzzing/arpa/inet/CMakeLists.txt
+1-0libc/fuzzing/arpa/CMakeLists.txt
+1-0libc/fuzzing/CMakeLists.txt
+61-04 files

LLVM/project 424e232llvm/lib/Target/AArch64 AArch64TargetTransformInfo.cpp, llvm/test/Transforms/InstCombine/AArch64 sve-intrinsic-mla-one.ll

[AArch64][SVE] add missing instcombine x+1 -> x

Split out from #198566
DeltaFile
+97-0llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-mla-one.ll
+25-0llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+122-02 files

LLVM/project 076a0a3flang/lib/Lower/OpenMP OpenMP.cpp Utils.cpp

[flang][OpenMP] Move TargetOMPContext to shared FlangOMPContext (NFC) (#202677)

Moving the class to shared code makes it available for reuse by
forthcoming DECLARE VARIANT lowering without any functional change to
existing metadirective lowering.
DeltaFile
+2-45flang/lib/Lower/OpenMP/OpenMP.cpp
+42-0flang/lib/Lower/OpenMP/Utils.cpp
+14-0flang/lib/Lower/OpenMP/Utils.h
+58-453 files

LLVM/project e0f2daacross-project-tests/debuginfo-tests/dexter/dex/debugger DAP.py

Prevent evaluating through invalid values/nullptrs
DeltaFile
+1-1cross-project-tests/debuginfo-tests/dexter/dex/debugger/DAP.py
+1-11 files

LLVM/project b01d034flang/lib/Semantics expression.cpp check-call.h, flang/test/Semantics call47.f90

[Flang] Reject keyword arguments in statement function calls (#198610)

**Problem**
Flang silently accepted keyword arguments in calls to statement
functions, violating F2018 C1535.


**Standard: F2018 §15.5.1 C1535**: In a reference to a procedure whose
interface is implicit at the point of the reference, the actual argument
shall not be a keyword argument.

Flang silently compiles the following code without reporting an error
such as `Keyword argument 'x' at (1) is invalid in a statement function`:
```
program test
  integer :: f1, x, c
  f1(x) = x / 2
  c = f1(x=10)  ! Should be an error

    [14 lines not shown]
DeltaFile
+26-0flang/test/Semantics/call47.f90
+14-0flang/lib/Semantics/expression.cpp
+5-0flang/lib/Semantics/check-call.h
+1-1flang/lib/Semantics/check-call.cpp
+46-14 files

LLVM/project f5bf584clang/lib/Driver/ToolChains PS4CPU.cpp, clang/test/Driver ps5-linker.c

[clang][PS5] Clang driver PS5 - pass the target CPU to lld. (#202924)

Forward the PS5 target CPU from the clang driver to lld as
`-plugin-opt=mcpu=znver2`, matching behavior of other platforms.

Most drivers call addLTOOptions to include LTO-related link options. That includes specifying mcpu. The PS5 driver doesn't yet call addLTOOptions. In time I hope we'll arrive at a point where we can refactor to use the same functionality. This is one step towards that.
---------

Co-authored-by: Edd Dawson <edd.dawson at sony.com>
DeltaFile
+6-0clang/test/Driver/ps5-linker.c
+4-0clang/lib/Driver/ToolChains/PS4CPU.cpp
+10-02 files

LLVM/project 8acfc36libc/include htons-family.yaml, libc/include/arpa inet.yaml

[libc] Add the htons function family to netinet/in.h (#203028)

As required by POSIX.

I've used the merge_yaml_files functionality to avoid duplication.

Assisted by Gemini.
DeltaFile
+2-24libc/include/arpa/inet.yaml
+25-0libc/include/htons-family.yaml
+9-0libc/utils/docgen/netinet/in.yaml
+2-0libc/include/netinet/in.yaml
+38-244 files

LLVM/project 03aafcfflang/lib/Lower OpenACC.cpp, flang/test/Lower/OpenACC acc-unstructured.f90 acc-cache.f90

[OpenACC][flang] Emit NYI when unstructured loops are associated with OpenACC directives

When an unstructured loop is associated with a loop or a combined
directive, we emit an unstructured CFG for the loop's logic nested
within the OpenACC op. This effectively serializes the nested loop on
the device, which is not desirable. For now, emit NYIs while working on
a longer-term solution.

The NYI is restricted to the cases where the loop will be lowered with
`independent` parallelism semantics for the default device_type -- i.e.,
the user has explicitly promised the loop is parallel. This covers:
- combined `acc parallel loop`,
- standalone `acc loop` inside `acc parallel`,
- orphan `acc loop` inside a non-`seq` acc routine,
- explicit `independent` clause.

For `auto` (`acc kernels loop` and `acc loop` inside `acc kernels`) and
for `seq` (`acc serial loop`, `acc loop` inside `acc serial`, explicit
`seq`, or orphan inside a `seq` routine), the user has not made a

    [4 lines not shown]
DeltaFile
+88-151flang/test/Lower/OpenACC/acc-unstructured.f90
+123-16flang/lib/Lower/OpenACC.cpp
+120-0flang/test/Lower/OpenACC/Todo/acc-unstructured-loop-construct.f90
+3-116flang/test/Lower/OpenACC/acc-cache.f90
+69-0flang/test/Lower/OpenACC/Todo/acc-unstructured-combined-construct.f90
+0-41flang/test/Lower/OpenACC/acc-loop-exit.f90
+403-3246 files

LLVM/project 9673aaeflang/lib/Lower PFTBuilder.cpp, flang/test/Lower/OpenACC acc-declare-interface-body.f90

[flang][OpenACC] Don't hoist declare directive out of interface bodies (#202806)

Example:
```fortran
program main
  real :: a(10, 60)
  interface
    subroutine compute(a)
      real :: a(10, 60)
!$acc declare present(a)
    end subroutine
  end interface
  call compute(a)
end program
```

In this code, the `!$acc declare` inside the interface body is hoisted
into the host program unit and lowered there, where its operand (the interface

    [12 lines not shown]
DeltaFile
+43-0flang/test/Lower/OpenACC/acc-declare-interface-body.f90
+15-0flang/lib/Lower/PFTBuilder.cpp
+58-02 files

LLVM/project 4b3deaellvm/unittests/DebugInfo/PDB CMakeLists.txt

Fix DebugInfo unittests shared library build (#202943)

Fixes: `PublicsStreamTest.cpp.o: undefined reference to symbol
'_ZN4llvm6object18GenericBinaryErrorC1ERKNS_5TwineENS0_12object_errorE'`
under `BUILD_SHARED_LIBS=1`.
DeltaFile
+1-0llvm/unittests/DebugInfo/PDB/CMakeLists.txt
+1-01 files

LLVM/project 0cce782llvm/lib/Target/SPIRV SPIRVEmitIntrinsics.cpp, llvm/test/CodeGen/SPIRV select-aggregate.ll select-composite-constant.ll

[SPIR-V] Lower `select` instructions with aggregate operands (#201417)

Context: `SPIRVEmitIntrinsics` represents aggregate (array/struct) SSA
values as i32 value-ids, keeping the real type on the side for SPIR-V
emission. `preprocessCompositeConstants()` rewrites composite constant
operands into those value-ids.

A `select` takes its result type from its operands, so rewriting one arm
leaves the select with an aggregate result type but an i32 operand,
which is invalid. The exact failure mode depends: a composite-constant
arm tripped the verifier ("Select values must have same type as select
instruction"), while a non-constant arm (say a load) only became a
value-id later, in the visitor pass, at which point
`replaceMemInstrUses()` found a `select` among its users and hit an
unreachable.

I pushed two commits fixing this, one limited to my use case, another
more general:


    [20 lines not shown]
DeltaFile
+80-0llvm/test/CodeGen/SPIRV/select-aggregate.ll
+23-33llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp
+41-0llvm/test/CodeGen/SPIRV/select-composite-constant.ll
+144-333 files

LLVM/project 8210a58libcxxabi/src/demangle Utility.h, llvm/include/llvm/Demangle DemangleConfig.h Utility.h

[Demangle] Guard DEMANGLE_ABI and add missing annotation (#202920)

This updates the DEMANGLE_ABI annotation to only be defined if it is not
already defined. This is required to parse the Demangle headers with the
ids-check script.
In addition, this adds one missing DEMANGLE_ABI annotation.

This effort is tracked in #109483.
DeltaFile
+22-17llvm/include/llvm/Demangle/DemangleConfig.h
+1-1llvm/include/llvm/Demangle/Utility.h
+1-1libcxxabi/src/demangle/Utility.h
+24-193 files

LLVM/project dd07243flang/lib/Lower/OpenMP OpenMP.cpp, flang/test/Lower/OpenMP target-inreduction.f90

[flang][OpenMP] Model target in_reduction through map entries

Model omp.target in_reduction so the target body uses the mapped
map_entries block argument instead of a separate in_reduction entry
block argument.

The in_reduction operands remain on the op for host-side translation.
For the host-fallback path, the matching map block argument is redirected
to the pointer returned by __kmpc_task_reduction_get_th_data, so the
target body accumulates into the task reduction-private storage.

Flang lowering now relies on the implicit address-preserving map for the
target body binding, while task and taskloop keep their existing
in_reduction block-argument behavior.

Offload/device compilation is still diagnosed as not yet implemented, and
each target in_reduction variable must have a matching map_entries entry.
DeltaFile
+67-31mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+40-26mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+17-8flang/lib/Lower/OpenMP/OpenMP.cpp
+11-9mlir/test/Target/LLVMIR/openmp-target-in-reduction.mlir
+8-6flang/test/Lower/OpenMP/target-inreduction.f90
+9-0mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+152-804 files not shown
+161-8910 files

LLVM/project 35ac9a4flang/lib/Lower/OpenMP OpenMP.cpp, flang/test/Lower/OpenMP target-inreduction.f90

[flang][OpenMP] Lower target in_reduction for host fallback

Teach Flang lowering and MLIR OpenMP translation to carry
in_reduction through omp.target for the host-fallback path.

The translation looks up task reduction-private storage with
__kmpc_task_reduction_get_th_data and binds the target region's
in_reduction block argument to that private pointer, so uses inside the
region do not keep referring to the original variable.

The patch also preserves in_reduction operands in the TargetOp builder
path and ensures target in_reduction list items are mapped into the
target region when needed.

The device/offload-entry path remains diagnosed as not yet implemented.
DeltaFile
+112-12mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+83-3mlir/test/Target/LLVMIR/openmp-todo.mlir
+62-18flang/lib/Lower/OpenMP/OpenMP.cpp
+60-0mlir/test/Dialect/OpenMP/invalid.mlir
+50-0mlir/test/Target/LLVMIR/openmp-target-in-reduction.mlir
+28-0flang/test/Lower/OpenMP/target-inreduction.f90
+395-333 files not shown
+432-539 files

LLVM/project b836063llvm/lib/Transforms/Scalar LoopFuse.cpp

[LoopFusion] Drop duplicate write-write dependence check (NFC) (#203173)

`dependencesAllowFusion()` re-tested every FC0-write vs FC1-write pair
in the second loop nest, duplicating the checks already done in the
first. Iterate only the remaining FC0-read vs FC1-write pairs; the set
of checked dependences (W0xW1, W0xR1, R0xW1) is unchanged.
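The claim that the checked set stays the same can be illustrated with a small sketch. This is only a reading of the description above, not the code in LoopFuse.cpp: the function names, the list-based access sets, and the exact shape of the two loop nests are assumptions made for illustration.

```python
from itertools import product


def pairs_before(w0, r0, w1, r1):
    """Pairs visited when the second nest also revisits FC0 writes (assumed old shape)."""
    visited = []
    # First nest: every FC0 write against every FC1 write and read.
    for a, b in product(w0, w1 + r1):
        visited.append((a, b))
    # Second nest: every FC1 write against every FC0 write and read.
    # The FC0-write part repeats the W0xW1 pairs from the first nest.
    for b, a in product(w1, w0 + r0):
        visited.append((a, b))
    return visited


def pairs_after(w0, r0, w1, r1):
    """Pairs visited when the second nest only covers FC0 reads (assumed new shape)."""
    visited = []
    # First nest is unchanged: W0xW1 and W0xR1.
    for a, b in product(w0, w1 + r1):
        visited.append((a, b))
    # Second nest: only the remaining R0xW1 pairs.
    for b, a in product(w1, r0):
        visited.append((a, b))
    return visited
```

With hypothetical access sets such as `w0 = ["A", "B"]`, `r0 = ["C"]`, `w1 = ["D"]`, `r1 = ["E", "F"]`, both functions cover the same set of pairs, and the second one visits `len(w0) * len(w1)` fewer pairs.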
DeltaFile
+4-7llvm/lib/Transforms/Scalar/LoopFuse.cpp
+4-71 files

LLVM/project 5e7ab8fflang/lib/Lower OpenACC.cpp, flang/test/Lower/OpenACC acc-unstructured.f90 acc-cache.f90

[OpenACC][flang] Emit NYI when unstructured loops are associated with OpenACC directives

When an unstructured loop is associated with a loop or a combined
directive, we emit an unstructured CFG for the loop's logic nested
within the OpenACC op. This effectively serializes the nested loop on
the device which is not desirable. For now, emit NYI's while working on
a longer-term solution.

The NYI is restricted to the cases where the loop will be lowered with
`independent` parallelism semantics for the default device_type -- i.e.,
the user has explicitly promised the loop is parallel. This covers:
- combined `acc parallel loop`,
- standalone `acc loop` inside `acc parallel`,
- orphan `acc loop` inside a non-`seq` acc routine,
- explicit `independent` clause.

For `auto` (`acc kernels loop` and `acc loop` inside `acc kernels`) and
for `seq` (`acc serial loop`, `acc loop` inside `acc serial`, explicit
`seq`, or orphan inside a `seq` routine), the user has not made a

    [4 lines not shown]
DeltaFile
+88-151flang/test/Lower/OpenACC/acc-unstructured.f90
+123-16flang/lib/Lower/OpenACC.cpp
+120-0flang/test/Lower/OpenACC/Todo/acc-unstructured-loop-construct.f90
+3-116flang/test/Lower/OpenACC/acc-cache.f90
+69-0flang/test/Lower/OpenACC/Todo/acc-unstructured-combined-construct.f90
+0-41flang/test/Lower/OpenACC/acc-loop-exit.f90
+403-3246 files

LLVM/project 700ff25libcxx/include thread

[libc++] Hoist <compare> outside the threads guard in <thread> (#202535)

The standard mandates [thread.syn] include <compare> as part of
<thread>'s synopsis. This is a standards-mandated dependency, not a
thread-feature dependency, so it should be visible regardless of
_LIBCPP_HAS_THREADS.

This matches how we handle standard-mandated includes elsewhere, see for
example #134877.
DeltaFile
+5-5libcxx/include/thread
+5-51 files

LLVM/project 9081432mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp, mlir/test/Target/LLVMIR openmp-taskloop-reduction.mlir openmp-todo.mlir

[mlir][OpenMP] Translate reductions on taskloop

Add LLVM IR translation for reduction and in_reduction clauses on omp.taskloop.context.

For taskloop reduction, emit the implicit taskgroup reduction setup and map each generated task to runtime-provided private reduction storage through __kmpc_task_reduction_get_th_data. For in_reduction, use the same runtime lookup path with a null descriptor to join an enclosing task reduction context.

Unsupported byref, cleanup, and two-argument initializer forms remain diagnosed.

Add MLIR translation tests for the supported taskloop reduction and in_reduction cases.
DeltaFile
+245-0mlir/test/Target/LLVMIR/openmp-taskloop-reduction.mlir
+221-22mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+92-10mlir/test/Target/LLVMIR/openmp-todo.mlir
+558-323 files

LLVM/project 046bd54llvm/test/CodeGen/RISCV xqccmp-push-pop-popret.ll qci-interrupt-attr.ll, llvm/test/CodeGen/RISCV/rvv nontemporal-vp-scalable.ll

[RISCV] Set CostPerUse to 1 only when optimizing for size (#201501)

We saw some regressions caused by poor register allocation, because
registers outside x8-x15 have a higher cost. This is why `DisableCostPerUse`
was added in https://github.com/llvm/llvm-project/issues/83320.

In this PR, we change it to set `CostPerUse=1` only when optimizing
for size.

Code size increases by less than 0.1% in llvm-test-suite.
DeltaFile
+904-904llvm/test/CodeGen/RISCV/xqccmp-push-pop-popret.ll
+870-870llvm/test/CodeGen/RISCV/rvv/nontemporal-vp-scalable.ll
+632-632llvm/test/CodeGen/RISCV/qci-interrupt-attr.ll
+600-600llvm/test/CodeGen/RISCV/push-pop-popret.ll
+288-288llvm/test/CodeGen/RISCV/qci-interrupt-attr-fpr.ll
+244-244llvm/test/CodeGen/RISCV/callee-saved-gprs.ll
+3,538-3,53811 files not shown
+4,469-4,46617 files

LLVM/project 5e7ec28clang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/test/CodeGen/AArch64 neon-misc.c v8.2a-neon-intrinsics.c

[clang][CIR][AArch64] Add lowering for conversion intrinsics (#199990)

This PR adds lowering for intrinsics from the following groups:
* https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#conversions
* https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#conversions-2

It continues the work started in #190961 and #193273. This PR implements
conversions from FP to integer types where the bit-width does not
change:
  * vcvt_s64_f64
  * vcvt_u64_f64
  * vcvt_s32_f32
  * vcvtq_s32_f32
  * vcvtq_s64_f64
  * vcvt_u32_f32
  * vcvtq_u32_f32
  * vcvtq_u64_f64
  * vcvt_s16_f16
  * vcvtq_s16_f16

    [10 lines not shown]
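For readers unfamiliar with these conversions, the per-lane behaviour of the signed variants can be modelled roughly as below. This is a sketch under assumptions, not the CIR lowering: `fcvtzs_lane` is a hypothetical helper name, Python floats are doubles rather than the lane's real element type, and the model covers only the signed round-toward-zero, saturating form.

```python
import math


def fcvtzs_lane(x: float, bits: int = 32) -> int:
    """Model one lane of a signed FP-to-integer conversion of equal bit width.

    Rounds toward zero, saturates at the integer range, and maps NaN to 0.
    """
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    if math.isnan(x):
        return 0
    if math.isinf(x):
        return hi if x > 0 else lo
    # math.trunc discards the fractional part, i.e. rounds toward zero.
    return max(lo, min(hi, math.trunc(x)))


def convert_vector(lanes, bits: int = 32):
    """Apply the lane model to every element of a vector given as a list."""
    return [fcvtzs_lane(x, bits) for x in lanes]
```

For example, `convert_vector([-1.7, 2.9])` truncates both lanes toward zero, and an out-of-range input such as `3e9` saturates to the largest 32-bit signed value.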
DeltaFile
+114-0clang/test/CodeGen/AArch64/neon/intrinsics.c
+87-0clang/test/CodeGen/AArch64/neon/conversion-fullfp16.c
+0-78clang/test/CodeGen/AArch64/neon-misc.c
+0-52clang/test/CodeGen/AArch64/v8.2a-neon-intrinsics.c
+0-28clang/test/CodeGen/AArch64/neon-intrinsics.c
+14-1clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+215-1596 files

LLVM/project 67d211amlir/lib/Conversion/ComplexToSPIRV ComplexToSPIRV.cpp, mlir/test/Conversion/ComplexToSPIRV complex-to-spirv.mlir

[mlir][SPIR-V] Convert complex.neg and complex.conj in ComplexToSPIRV (#202898)
DeltaFile
+36-2mlir/lib/Conversion/ComplexToSPIRV/ComplexToSPIRV.cpp
+31-0mlir/test/Conversion/ComplexToSPIRV/complex-to-spirv.mlir
+67-22 files

LLVM/project 123078cmlir/include/mlir/Conversion Passes.td, mlir/lib/Conversion/ConvertToEmitC ConvertToEmitCPass.cpp

Reland emitc lower multi return functions (#203026)

Reland #200659 reverted by #202911.

Fixed GCC 7 func-to-emitc build: Use the adaptor operand types
when creating the multi-return struct type instead of relying on an
implicit conversion from ValueRange to TypeRange.

Failed buildbot:
https://lab.llvm.org/buildbot/#/builders/116/builds/29302

Assisted-by: Copilot
DeltaFile
+236-25mlir/lib/Conversion/FuncToEmitC/FuncToEmitC.cpp
+96-2mlir/test/Conversion/FuncToEmitC/func-to-emitc.mlir
+87-1mlir/test/Conversion/FuncToEmitC/func-to-emitc-failed.mlir
+63-0mlir/test/Target/Cpp/func.mlir
+13-5mlir/lib/Conversion/ConvertToEmitC/ConvertToEmitCPass.cpp
+6-0mlir/include/mlir/Conversion/Passes.td
+501-337 files not shown
+512-3913 files

LLVM/project 0d1a5e7llvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange inner-induciton-step-is-not-invariant.ll

[LoopInterchange] Reject if inner loop IV has outer-variant step
DeltaFile
+20-48llvm/test/Transforms/LoopInterchange/inner-induciton-step-is-not-invariant.ll
+7-1llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+27-492 files

LLVM/project dd8c5c2llvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange reduction2mem-limitation.ll

[LoopInterchange] Consolidate induction and reduction vars check
DeltaFile
+72-95llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+4-10llvm/test/Transforms/LoopInterchange/reduction2mem-limitation.ll
+76-1052 files

LLVM/project 7e6f2b7llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.set.inactive.chain.arg.ll amdgpu-cs-chain-preserve-cc.ll

AMDGPU/GlobalISel: RegBankLegalize rules for set_inactive intrinsics (#203047)
DeltaFile
+103-119llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.set.inactive.ll
+31-27llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.f32-no-rtn.ll
+25-25llvm/test/CodeGen/AMDGPU/GlobalISel/global-atomic-fadd.f32-rtn.ll
+4-4llvm/test/CodeGen/AMDGPU/llvm.amdgcn.set.inactive.chain.arg.ll
+6-0llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+2-2llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-preserve-cc.ll
+171-1771 files not shown
+173-1797 files

LLVM/project d3d0927llvm/lib/Target/LoongArch LoongArchISelLowering.cpp, llvm/test/CodeGen/LoongArch crc.ll

[LoongArch] Propagate demanded bits for CRC[C].W.{B,H}.W

CRC byte and halfword instructions only use the low 8 or 16 bits of
their data operand. Propagate these demanded-bit requirements through
SimplifyDemandedBitsForTargetNode() so redundant masking operations can
be removed during DAG combining.
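Why the masking is redundant can be seen in a small software model. The sketch below is an illustration under assumptions, not the LoongArch instruction definition: `crc_update` is a hypothetical helper, and the reflected CRC-32 polynomial is used only as an example. The update reads only the low `width_bits` bits of its data operand, so an explicit mask in front of it cannot change the result.

```python
CRC32_POLY_REFLECTED = 0xEDB88320  # common CRC-32 polynomial, bit-reflected


def crc_update(crc: int, data: int, width_bits: int) -> int:
    """Fold the low `width_bits` bits of `data` into a 32-bit CRC value."""
    # Only the low bits of the data operand take part in the update.
    crc ^= data & ((1 << width_bits) - 1)
    for _ in range(width_bits):
        # Bitwise reflected CRC step: shift right, apply the polynomial
        # when the bit shifted out was set.
        crc = (crc >> 1) ^ (CRC32_POLY_REFLECTED if crc & 1 else 0)
    return crc & 0xFFFFFFFF
```

In this model `crc_update(acc, x & 0xFF, 8)` and `crc_update(acc, x, 8)` always return the same value, which is the property that lets the combiner drop the `and`.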
DeltaFile
+21-4llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+0-4llvm/test/CodeGen/LoongArch/crc.ll
+21-82 files

LLVM/project 9ebbc1ellvm/docs ReleaseNotes.md, llvm/include/llvm/ADT APInt.h

[APInt] Provide sqrtFloor (floor of square root) instead of sqrt (rounded) (#197406)

This simplifies both the implementation and the only in-tree user.

I changed the name to avoid silently changing the behaviour of an
existing function that might have out-of-tree users.
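The difference between the two roundings can be sketched with plain integers. This is not the APInt implementation: `sqrt_floor` and `sqrt_rounded` are hypothetical Python helpers, and the Newton iteration is just one common way to compute an integer square root.

```python
def sqrt_floor(n: int) -> int:
    """Return the largest r with r * r <= n, using integer arithmetic only."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n
    # Start from a power of two that is at least sqrt(n), then apply
    # Newton's iteration until it stops decreasing.
    x = 1 << ((n.bit_length() + 1) // 2)
    while True:
        y = (x + n // x) // 2
        if y >= x:
            return x
        x = y


def sqrt_rounded(n: int) -> int:
    """Return the square root rounded to the nearest integer."""
    r = sqrt_floor(n)
    # (r + 0.5)^2 = r*r + r + 0.25, so round up exactly when n > r*r + r.
    return r + 1 if n - r * r > r else r
```

For `n = 7` the floor variant gives 2 while the rounded variant gives 3, which is the kind of silent difference the rename is meant to make visible to callers.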
DeltaFile
+12-33llvm/lib/Support/APInt.cpp
+20-6llvm/unittests/ADT/APIntTest.cpp
+2-2llvm/include/llvm/ADT/APInt.h
+3-0llvm/docs/ReleaseNotes.md
+37-414 files