LLVM/project 3157758llvm/lib/Target/AArch64 AArch64TargetTransformInfo.cpp, llvm/lib/Transforms/Vectorize VPlanTransforms.cpp

[LV] Handle partial sub-reductions with sub in middle block. (#178919)

Sub-reductions can be implemented in two ways:
(1) negate the operand in the vector loop (the default way).
(2) subtract the reduced value from the init value in the middle block.

Note that both ways keep the reduction itself as an 'add' reduction,
which is necessary because only llvm.vector.partial.reduce.add exists.

The ISD nodes for partial reductions don't support folding the
sub/negation into its operands because the following is not a valid
transformation:
```
     sub(0, mul(ext(a), ext(b)))
  -> mul(ext(a), ext(sub(0, b)))
```
It can therefore be better to choose option (2) such that the partial
reduction is always positive (starting at '0') and to do a final
subtract in the middle block.
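
A minimal Python sketch, not taken from the patch, of why options (1) and (2) give the same result and why the fold shown above is invalid. It models i8 inputs that are extended to a wider type before the multiply; the helper names (`wrap_i8`, `zext_i8`, `sext_i8`, `folded_negation`) are made up for illustration, and Python's unbounded integers stand in for the wide accumulator.

```python
def wrap_i8(x):
    """Truncate a Python int to an 8-bit pattern (0..255)."""
    return x & 0xFF

def zext_i8(x):
    """Zero-extend an 8-bit pattern to a wide integer."""
    return wrap_i8(x)

def sext_i8(x):
    """Sign-extend an 8-bit pattern to a wide integer."""
    x = wrap_i8(x)
    return x - 256 if x >= 128 else x

def sub_reduce_negate_in_loop(init, a, b, ext):
    # Option (1): negate each product, keep an 'add' reduction.
    acc = init
    for x, y in zip(a, b):
        acc += 0 - ext(x) * ext(y)
    return acc

def sub_reduce_in_middle_block(init, a, b, ext):
    # Option (2): accumulate from 0, subtract once after the loop.
    acc = 0
    for x, y in zip(a, b):
        acc += ext(x) * ext(y)
    return init - acc

def folded_negation(x, y, ext):
    # The invalid rewrite: negate in the narrow type, then extend.
    return ext(x) * ext(wrap_i8(0 - y))

a = [3, 7, 100]
b = [1, 2, 128]  # 128 is the bit pattern of -128 when read as signed i8

# Both options agree for either kind of extension.
for ext in (zext_i8, sext_i8):
    assert (sub_reduce_negate_in_loop(1000, a, b, ext)
            == sub_reduce_in_middle_block(1000, a, b, ext))

# Zero-extension: -(3 * 1) is -3, but 3 * zext(i8 -1) is 3 * 255.
assert 0 - zext_i8(3) * zext_i8(1) == -3
assert folded_negation(3, 1, zext_i8) == 765

# Sign-extension: negating -128 in i8 wraps back to -128.
assert 0 - sext_i8(3) * sext_i8(128) == 384
assert folded_negation(3, 128, sext_i8) == -384
```

The counterexamples are the usual narrow-type edge cases: negation before a zero-extend turns small values into large ones, and negation of the minimum signed value wraps.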

    [9 lines not shown]
DeltaFile
+67-27llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+84-0llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-sub-sdot.ll
+9-12llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-sub.ll
+9-12llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-chained.ll
+5-0llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+174-515 files

LLVM/project 7756bfdllvm/lib/Passes PassBuilderPipelines.cpp

[NFC] Modify the comment of LoopRotate param (#180675)

The first param of LoopRotatePass is EnableHeaderDuplication. The value
'true' means 'enable the header duplication'.
`LoopRotatePass(bool EnableHeaderDuplication, bool PrepareForLTO)`

---------

Co-authored-by: Pengcheng Wang <wangpengcheng.pp at bytedance.com>
DeltaFile
+2-2llvm/lib/Passes/PassBuilderPipelines.cpp
+2-21 files

LLVM/project b62a752utils/bazel/llvm-project-overlay/mlir BUILD.bazel

[bazel] Port 9de8463
DeltaFile
+20-0utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+20-01 files

LLVM/project c975385compiler-rt/include CMakeLists.txt, compiler-rt/include/sanitizer tysan_interface.h

[TySan] Add skeleton for adding interface functions (#170859)

This PR contains the more straightforward changes from the initial interfaces
PR (https://github.com/llvm/llvm-project/pull/169023). Supporting
interfaces will also help me fix [this
issue](https://github.com/llvm/llvm-project/issues/169024), where we
don't test TySan with the sanitizer_common codebase.
DeltaFile
+29-0compiler-rt/include/sanitizer/tysan_interface.h
+1-0compiler-rt/include/CMakeLists.txt
+1-0compiler-rt/lib/sanitizer_common/tests/sanitizer_common_test.cpp
+1-0llvm/utils/gn/secondary/compiler-rt/include/BUILD.gn
+32-04 files

FreeBSD/ports 32d1833math/py-seaborn Makefile distinfo

math/py-seaborn: Update to 0.13.2

Adjust dependencies while here.

PR:             292971
Approved by:    maintainer
DeltaFile
+6-8math/py-seaborn/Makefile
+3-3math/py-seaborn/distinfo
+9-112 files

LLVM/project 9914ee6flang/include/flang/Optimizer/Transforms Passes.td, flang/lib/Optimizer/Passes Pipelines.cpp

[flang] Fix -debug crash from VScaleAttrPass (#180234)

This patch splits up the `vscaleRange` pass-option from the
`VScaleAttrPass` into `vscaleMin` and `vscaleMax` respectively, since a
`std::pair<>` cannot be used as a CLI option and crashes when running
`flang -march=rv64gcv -O3 file.f90 -mmlir -debug`.

Since the options can now be set individually I added some error
checking following the semantics described in the langref
https://llvm.org/docs/LangRef.html#function-attributes.
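
The checks the patch actually performs are not shown in this message. As a rough sketch only, the constraints LangRef states for `vscale_range(<min>[, <max>])` could be validated as below; the function name `check_vscale_range` and the error strings are hypothetical, and the rules are a paraphrase of LangRef rather than a copy of the flang code.

```python
def check_vscale_range(vscale_min, vscale_max):
    """Return None if the pair is acceptable, else an error string.

    Assumed rules (paraphrased from LangRef's vscale_range text):
    min is a power of two greater than 0; max is either 0, meaning
    no upper bound, or a power of two that is >= min.
    """
    def is_pow2(n):
        return n > 0 and (n & (n - 1)) == 0

    if not is_pow2(vscale_min):
        return "vscaleMin must be a power of two greater than 0"
    if vscale_max == 0:
        return None  # 0 means the maximum is unbounded
    if not is_pow2(vscale_max):
        return "vscaleMax must be a power of two or 0"
    if vscale_max < vscale_min:
        return "vscaleMax must be greater than or equal to vscaleMin"
    return None

assert check_vscale_range(1, 16) is None
assert check_vscale_range(2, 0) is None
assert check_vscale_range(0, 4) is not None
assert check_vscale_range(8, 4) is not None
```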

I also added tests since there were none for only this pass before.
DeltaFile
+20-5flang/lib/Optimizer/Transforms/VScaleAttr.cpp
+24-0flang/test/Transforms/vscale-attr.fir
+4-4flang/include/flang/Optimizer/Transforms/Passes.td
+1-1flang/lib/Optimizer/Passes/Pipelines.cpp
+49-104 files

LLVM/project 41aed21llvm/lib/Transforms/Coroutines CoroSplit.cpp, llvm/test/Transforms/Coroutines coro-retcon-continuation-scope.ll

[CoroSplit][DebugInfo] Fix scope of continuation funclets (#180523)

The heuristic for deciding which scope line to use for a continuation
funclet relies on iterating on the instructions of the first BB of the
continuation. Often, this contains a single unconditional branch, which
is skipped by the heuristic. However, in coro-retcon, two such
"jump-only" BBs are generated. This patch amends the heuristic to
account for that.
DeltaFile
+69-0llvm/test/Transforms/Coroutines/coro-retcon-continuation-scope.ll
+4-2llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+73-22 files

LLVM/project af74bc9clang/lib/AST/ByteCode Compiler.cpp, clang/test/AST/ByteCode invalid.cpp

[clang][bytecode] Improve rejecting UnaryExprOrTypeTraitExprs (#180710)

Some of them work just fine, even if the expression contains errors.
DeltaFile
+6-3clang/lib/AST/ByteCode/Compiler.cpp
+6-0clang/test/AST/ByteCode/invalid.cpp
+1-0clang/test/SemaCXX/alignof-sizeof-reference.cpp
+13-33 files

LLVM/project 25f5e97flang/lib/Optimizer/HLFIR/Transforms ScheduleOrderedAssignments.cpp ScheduleOrderedAssignments.h, flang/test/HLFIR/order_assignments where-array-sections.f90 where-scheduling.f90

[flang] optimize WHERE with identical and disjoint array sections (#180279)

Improve `ScheduleOrderedAssignments` to avoid creating temporary storage
for masks in `WHERE` constructs when the mask modification is "aligned"
with the assignment (e.g., `where(a(i)>0) a(i)=...`).

- Identify "aligned" conflicts (identical array elements accessed in
order) using the `ArraySectionAnalyzer` that is extracted from
OptimizedBufferization.
- Defer saving regions with aligned conflicts, allowing fusion if
possible.
- Implement retroactive saving: if a region was modified in a previous
run (fused via aligned conflict) but is needed by a later split run,
insert a `SaveEntity` action before the modifying run.
- Use `std::list` for the schedule to support stable iterators for run
insertion.
- Update tests to verify fewer temporaries and correct retroactive
saves.
- Update flang pipeline at O2 and more to try fusing assignments in

    [6 lines not shown]
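
To illustrate what an "aligned" conflict means, here is a small Python model, not flang code, that compares the reference WHERE semantics (evaluate the whole mask into a temporary, then assign) with a fused loop that tests the mask element just before each assignment. The function names and the two index mappings are hypothetical.

```python
def where_with_saved_mask(a, mask_index, update):
    """Reference semantics: evaluate the whole mask first, then assign."""
    n = len(a)
    mask = [a[mask_index(i, n)] > 0 for i in range(n)]  # temporary storage
    out = list(a)
    for i in range(n):
        if mask[i]:
            out[i] = update(out[i])
    return out

def where_fused(a, mask_index, update):
    """Fused loop: read the mask element right before each assignment."""
    n = len(a)
    out = list(a)
    for i in range(n):
        if out[mask_index(i, n)] > 0:
            out[i] = update(out[i])
    return out

def same(i, n):
    # Models: where (a(i) > 0) a(i) = ...
    return i

def reverse(i, n):
    # Models a mask read from the reversed section a(n:1:-1).
    return n - 1 - i

def negate(x):
    return -x

data = [1, -2, 3, 4]

# Aligned: each mask element is read before the same element is written,
# so the temporary mask is unnecessary.
assert where_fused(data, same, negate) == where_with_saved_mask(data, same, negate)

# Not aligned: later mask reads see already-modified elements,
# so the mask has to be saved.
assert where_fused(data, reverse, negate) != where_with_saved_mask(data, reverse, negate)
```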
DeltaFile
+423-108flang/lib/Optimizer/HLFIR/Transforms/ScheduleOrderedAssignments.cpp
+128-0flang/test/HLFIR/order_assignments/where-array-sections.f90
+43-15flang/test/HLFIR/order_assignments/where-scheduling.f90
+54-2flang/lib/Optimizer/HLFIR/Transforms/ScheduleOrderedAssignments.h
+2-53flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp
+6-2flang/test/HLFIR/order_assignments/inlined-stack-temp.fir
+656-1802 files not shown
+660-1828 files

LLVM/project bd6dd94llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel atomicrmw_minmax.ll regbankselect-atomicrmw-minmax-uminmax.mir

[AMDGPU] Add legalization rules for atomicrmw max/min ops (#180502)

Adds rules for G_ATOMICRMW_{MAX, MIN, UMAX, UMIN, UINC_WRAP, UDEC_WRAP}.
Each of these generic opcodes is supported for S32 and S64 types
on flat, global and local address spaces.
DeltaFile
+1,216-0llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_minmax.ll
+245-0llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-atomicrmw-minmax-uminmax.mir
+125-0llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-atomicrmw-uinc-udec-wrap.mir
+10-7llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+6-6llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_uinc_wrap.ll
+5-5llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_udec_wrap.ll
+1,607-186 files

OPNSense/plugins 85f1bb9www/web-proxy-sso/src/opnsense/mvc/app/models/OPNsense/ProxySSO ProxySSO.xml

www/web-proxy-sso: model style
DeltaFile
+2-6www/web-proxy-sso/src/opnsense/mvc/app/models/OPNsense/ProxySSO/ProxySSO.xml
+2-61 files

FreeNAS/freenas 636d620src/middlewared/middlewared/utils __init__.py web_app.py

More mypy fixes in utils
DeltaFile
+11-4src/middlewared/middlewared/utils/__init__.py
+8-5src/middlewared/middlewared/utils/web_app.py
+5-5src/middlewared/middlewared/utils/sed.py
+4-4src/middlewared/middlewared/utils/functools_.py
+6-1src/middlewared/middlewared/utils/pydantic_.py
+2-3src/middlewared/middlewared/utils/syslog.py
+36-2210 files not shown
+53-3716 files

OPNSense/plugins 703f9f2net/haproxy/src/opnsense/service/templates/OPNsense/HAProxy haproxy.conf, www/web-proxy-sso/src/opnsense/mvc/app/models/OPNsense/ProxySSO ProxySSO.xml

net/haproxy: sync with master
DeltaFile
+2-6www/web-proxy-sso/src/opnsense/mvc/app/models/OPNsense/ProxySSO/ProxySSO.xml
+2-0net/haproxy/src/opnsense/service/templates/OPNsense/HAProxy/haproxy.conf
+4-62 files

LLVM/project ceec2c7llvm/test/Analysis/ScalarEvolution ptrtoaddr.ll

[SCEV] Add ptrtoaddr tests with external state/unstable addrspaces.

Add ptrtoaddr tests with address spaces with unstable and external but
stable pointer representations.

Currently we incorrectly form ptrtoaddr for unstable pointers. See
discussion in https://github.com/llvm/llvm-project/pull/178861 for more
details.
DeltaFile
+76-1llvm/test/Analysis/ScalarEvolution/ptrtoaddr.ll
+76-11 files

OPNSense/core 9f156aeMk style.mk

make: fix nightly build issues et al

(cherry picked from commit 57f148201b60854ac9c2d278fd5bb80deaa968c6)
DeltaFile
+17-9Mk/style.mk
+17-91 files

FreeBSD/src f19cb3csys/modules/rtw89 Makefile

rtw89: module Makefile add USB bus attachments

Sponsored by:   The FreeBSD Foundation
MFC after:      3 days
DeltaFile
+30-13sys/modules/rtw89/Makefile
+30-131 files

FreeBSD/src 9e17556sys/modules/rtw88 Makefile

rtw88: Add bus attachments to the module Makefile

In addition to PCIe we will support USB and also prepare for SDIO (still
disabled locally).  The module SRCS are split up into a common part,
which we always add.  All three bus parts are guarded by a local
variable in the Makefile.
In addition the PCI parts require PCI to be compiled into the kernel.
We add that check in case of, e.g., SoCs with SDIO but no PCI, which
may not have PCI in the kernel config and thus the module would fail
to attach.
USB has no additional check as it is fully loadable and does not have
to be in a kernel config.
SDIO depends on an MMCCAM-enabled kernel but is otherwise loadable.

While we could, we are not splitting the various bus attachments into
individual modules as we generally do not do that in FreeBSD. [1]

Sponsored by:   The FreeBSD Foundation
MFC after:      3 days

    [3 lines not shown]
DeltaFile
+61-25sys/modules/rtw88/Makefile
+61-251 files

FreeBSD/src 7fc5c8dsys/contrib/dev/rtw89 pci.c debug.c

rtw89: harmonize all MODULE_DEPEND to rtw89

rtw89 was set up the same way as rtw88.  Because rtw88 was once split up,
rtw89 was modelled the same way.  Clean this up too.

Sponsored by:   The FreeBSD Foundation
MFC after:      3 days
DeltaFile
+0-8sys/contrib/dev/rtw89/pci.c
+5-0sys/contrib/dev/rtw89/debug.c
+5-0sys/contrib/dev/rtw89/core.c
+3-0sys/contrib/dev/rtw89/usb.c
+13-84 files

FreeBSD/src 57b8396sys/contrib/dev/rtw89 debug.c fw.h

rtw89: cleanup static_assert() calls

These days we can use static_assert() without trouble so remove the
FreeBSD-specific rtw89_static_assert implementation.  This reduces
the diff to upstream and will ease future driver updates.

Sponsored by:   The FreeBSD Foundation
MFC after:      3 days
DeltaFile
+0-12sys/contrib/dev/rtw89/debug.c
+0-8sys/contrib/dev/rtw89/fw.h
+0-8sys/contrib/dev/rtw89/phy.c
+0-6sys/contrib/dev/rtw89/rtw8851b.c
+0-6sys/contrib/dev/rtw89/core.h
+0-6sys/contrib/dev/rtw89/rtw8852c.c
+0-463 files not shown
+0-589 files

FreeBSD/src 49c1b38sys/contrib/dev/rtw88 pci.c usb.c

rtw88: harmonize all MODULE_DEPEND to rtw88

From the time when I split the driver into a core part and
bus attachment sub-drivers, the various bus attachments had their own
module names, but everything is "rtw88" now.

Core functionality depends on linuxkpi, linuxkpi_wlan, and for debug.c
lindebugfs.
Each bus attachment then depends on its own parent layer if needed:
PCI gets pulled in through linuxkpi, USB depends on [the future] linuxkpi_usb,
and SDIO depends on [the future] linuxkpi_sdio.

Sponsored by:   The FreeBSD Foundation
MFC after:      3 days
Differential Revision: https://reviews.freebsd.org/D55021
DeltaFile
+0-8sys/contrib/dev/rtw88/pci.c
+1-4sys/contrib/dev/rtw88/usb.c
+5-0sys/contrib/dev/rtw88/main.c
+4-0sys/contrib/dev/rtw88/debug.c
+10-124 files

LLVM/project 437566dmlir/include/mlir/Dialect/LLVMIR ROCDLOps.td, mlir/test/Dialect/LLVMIR rocdl.mlir

[ROCDL] Added workgroup cluster ids to ROCDL (#179897)

DeltaFile
+20-12mlir/test/Target/LLVMIR/rocdl.mlir
+13-7mlir/test/Dialect/LLVMIR/rocdl.mlir
+4-0mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
+37-193 files

LLVM/project e043195clang/test/CodeGen arm_acle.c builtins-arm64.c, llvm/lib/Target/AArch64 AArch64ISelLowering.cpp

[AArch64] Add support for intent to read prefetch intrinsic (#179709)

This patch adds support in Clang for the PRFM IR instruction, by adding
the following builtin:

  void __pldir(void const *addr);

This builtin is described in the following ACLE proposal:
https://github.com/ARM-software/acle/pull/406
DeltaFile
+13-0clang/test/CodeGen/arm_acle.c
+12-0llvm/test/CodeGen/AArch64/arm64-prefetch-ir.ll
+9-1llvm/test/MC/AArch64/armv9.6a-pcdphint.s
+6-0llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+5-0llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+5-0clang/test/CodeGen/builtins-arm64.c
+50-14 files not shown
+59-110 files

LLVM/project 8c92576clang/test/OpenMP task_codegen.cpp threadprivate_codegen.cpp, llvm/test/CodeGen/AMDGPU combine_andor_with_cmps_nnan.ll combine_andor_with_cmps.ll

Merge branch 'main' into users/jeanPerier/where_array_section
DeltaFile
+5,835-5,584llvm/test/tools/llvm-dwarfdump/X86/simplified-template-names.s
+3,458-2,041clang/test/OpenMP/task_codegen.cpp
+2,140-2,140clang/test/OpenMP/threadprivate_codegen.cpp
+1,449-0llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps_nnan.ll
+431-901llvm/test/CodeGen/X86/sse-minmax.ll
+396-789llvm/test/CodeGen/AMDGPU/combine_andor_with_cmps.ll
+13,709-11,4551,201 files not shown
+62,232-29,9611,207 files

LLVM/project 1a287bdflang/lib/Optimizer/HLFIR/Transforms ScheduleOrderedAssignments.cpp ScheduleOrderedAssignments.h

add and fix comments
DeltaFile
+16-1flang/lib/Optimizer/HLFIR/Transforms/ScheduleOrderedAssignments.cpp
+8-1flang/lib/Optimizer/HLFIR/Transforms/ScheduleOrderedAssignments.h
+24-22 files

OpenBSD/ports ZXTx4EOdatabases/freetds distinfo Makefile

   update to freetds-1.5.11
VersionDeltaFile
1.107+2-2databases/freetds/distinfo
1.173+1-1databases/freetds/Makefile
+3-32 files

OpenBSD/ports qN1zk6Kdatabases/py-limits distinfo Makefile, databases/py-limits/pkg PLIST

   update to py3-limits-5.8.0
VersionDeltaFile
1.2+3-3databases/py-limits/pkg/PLIST
1.3+2-2databases/py-limits/distinfo
1.3+1-1databases/py-limits/Makefile
+6-63 files

OpenBSD/ports FXDuswjdatabases/py-redis/patches patch-tests_conftest_py patch-tests_test_asyncio_conftest_py, databases/py-redis/pkg PLIST

   update to py3-redis-5.3.1 (last version with redis 6.x support)
VersionDeltaFile
1.1+191-0databases/py-redis/patches/patch-tests_conftest_py
1.1+188-0databases/py-redis/patches/patch-tests_test_asyncio_conftest_py
1.1+149-0databases/py-redis/patches/patch-tests_entraid_utils_py
1.1+53-0databases/py-redis/patches/patch-tests_test_credentials_py
1.1+49-0databases/py-redis/patches/patch-tests_test_asyncio_test_credentials_py
1.18+23-3databases/py-redis/pkg/PLIST
+653-34 files not shown
+657-710 files

OpenBSD/src 1DxRFTdusr.bin/tmux resize.c

   Fix clients_calculate_size for manual type when window is NULL. From
   Elias Oenal in GitHub issue 4849.
VersionDeltaFile
1.54+2-2usr.bin/tmux/resize.c
+2-21 files

LLVM/project f22a178llvm/lib/Analysis IVDescriptors.cpp, llvm/lib/Transforms/Vectorize VPlanConstruction.cpp

Reland "[LV] Support conditional scalar assignments of masked operations" (#180708)

This patch extends the support added in #158088 to loops where the
assignment is non-speculatable (e.g. a conditional load or divide).

For example, the following loop can now be vectorized:

```
int simple_csa_int_load(
  int* a, int* b, int default_val, int N, int threshold)
{
  int result = default_val;
  for (int i = 0; i < N; ++i)
    if (a[i] > threshold)
      result = b[i];
  return result;
}
```
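
One way to picture how such a loop can be processed in vector-sized chunks is sketched below in Python. This is an illustrative model under stated assumptions, not the VPlan the patch builds: it keeps the data and mask of the most recent chunk that had an active lane, reads `b` only in lanes where the condition holds (standing in for a masked load), and picks the last active lane after the loop. The name `simple_csa_chunked` and the `vf` parameter are made up for this example.

```python
def simple_csa_scalar(a, b, default_val, threshold):
    """Direct transcription of the scalar loop above."""
    result = default_val
    for x, y in zip(a, b):
        if x > threshold:
            result = y
    return result

def simple_csa_chunked(a, b, default_val, threshold, vf=4):
    """Process vf elements per step, as a vectorized loop would."""
    n = len(a)
    data = [default_val] * vf  # values from the last chunk with an active lane
    mask = [False] * vf        # which lanes of that chunk were active
    for start in range(0, n, vf):
        new_mask = [a[i] > threshold for i in range(start, min(start + vf, n))]
        new_mask += [False] * (vf - len(new_mask))  # inactive tail lanes
        if any(new_mask):
            # Masked load: b is only read in lanes where the condition holds.
            data = [b[start + lane] if new_mask[lane] else default_val
                    for lane in range(vf)]
            mask = new_mask
    # After the loop: extract the last active lane, else keep the default.
    for lane in reversed(range(vf)):
        if mask[lane]:
            return data[lane]
    return default_val

a = [1, 9, 2, 8, 0, 0, 7, 0, 0]
b = [10, 11, 12, 13, 14, 15, 16, 17, 18]
assert simple_csa_scalar(a, b, -1, 5) == 16
assert simple_csa_chunked(a, b, -1, 5) == 16
assert simple_csa_chunked(a, b, -1, 100) == -1  # condition never true
```

The point of the masked read is that `b[i]` is never touched when `a[i] > threshold` is false, which is what makes the non-speculatable case safe.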


    [9 lines not shown]
DeltaFile
+1,144-0llvm/test/Transforms/LoopVectorize/AArch64/conditional-scalar-assignment.ll
+100-0llvm/test/Transforms/LoopVectorize/conditional-scalar-assignment-vplan.ll
+79-0llvm/unittests/Analysis/IVDescriptorsTest.cpp
+49-4llvm/lib/Analysis/IVDescriptors.cpp
+25-2llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+0-9llvm/test/Transforms/LoopVectorize/X86/vectorization-remarks-missed.ll
+1,397-151 files not shown
+1,398-167 files

LLVM/project b4032dbmlir/test/Dialect/Vector transform-vector.mlir vector-transfer-flatten.mlir, mlir/test/Dialect/Vector/td flatten.mlir

[mlir][vector] Reuse vector TD op in vector.xfer flatten tests (#180606)

This change adds a `RUN` line in vector-transfer-flatten.mlir that will
use `vector.flatten_vector_transfer_ops` that was introduced in #178134.
It also removes a test added in the original PR whose coverage is
already provided by pre-existing tests.
DeltaFile
+0-32mlir/test/Dialect/Vector/transform-vector.mlir
+9-0mlir/test/Dialect/Vector/td/flatten.mlir
+3-0mlir/test/Dialect/Vector/vector-transfer-flatten.mlir
+12-323 files