LLVM/project 1324ea1llvm/lib/Target/WebAssembly WebAssemblyInstrSIMD.td, llvm/test/CodeGen/WebAssembly simd-reductions.ll simd-memcmp.ll

[WebAssembly] Fold any/alltrue SIMD boolean reductions with eqz (#184704)

Existing ISel patterns match setne/seteq following SIMD boolean reductions
any_true and all_true, and drop the ones that are redundant (because the
reductions always return 1 or 0). This adds patterns to also produce eqz
instructions instead of a comparison with a const.
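
A minimal scalar model of why the fold is valid, assuming only that the reductions return exactly 0 or 1 as the message states. The function names below are illustrative and are not the TableGen patterns from the patch:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V4 = std::array<int32_t, 4>;

// Scalar models of the Wasm boolean reductions; both return exactly 0 or 1.
int32_t anyTrue(const V4 &v) {
  for (int32_t lane : v)
    if (lane != 0)
      return 1;
  return 0;
}

int32_t allTrue(const V4 &v) {
  for (int32_t lane : v)
    if (lane == 0)
      return 0;
  return 1;
}

// Model of i32.eqz: 1 if the operand is zero, otherwise 0.
int32_t eqz(int32_t x) { return x == 0 ? 1 : 0; }

// Unfolded form: compare the reduction result against a constant 0.
int32_t noneSetViaCompare(const V4 &v) { return anyTrue(v) == 0 ? 1 : 0; }

// Folded form: a single eqz on the reduction result.
int32_t noneSetViaEqz(const V4 &v) { return eqz(anyTrue(v)); }
```

Both forms agree for every input, so the constant and the comparison can be replaced by one `eqz`.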
DeltaFile
+243-48llvm/test/CodeGen/WebAssembly/simd-reductions.ll
+1-2llvm/test/CodeGen/WebAssembly/simd-memcmp.ll
+2-0llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+246-503 files

LLVM/project 331a91cllvm/lib/Target/AMDGPU AMDGPULowerVGPREncoding.cpp

[NFC][AMDGPU] Add debug print to `AMDGPULowerVGPREncoding.cpp` (#185331)
DeltaFile
+91-3llvm/lib/Target/AMDGPU/AMDGPULowerVGPREncoding.cpp
+91-31 files

LLVM/project a8b726aflang-rt/lib/runtime execute.cpp, flang-rt/unittests/Runtime CommandTest.cpp

[flang-rt] Need to pad the output of execute_command_line(..., CMDMSG) (#185509)

Previously the error message was copied, but not padded for cases where
the message was shorter than the passed CMDMSG string. Add the padding
and also change the test case to test padding on all platforms.
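
As a rough sketch of the behavior described, assuming a fixed-length Fortran-style character buffer with no terminator: `copyAndPad` and `padInto` are hypothetical names and this is not the flang-rt code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper, not the flang-rt implementation: copy `msg` into a
// fixed-length character buffer and blank-pad whatever space is left.
// Without the padding, stale bytes after a short message stay visible.
void copyAndPad(char *dest, std::size_t destLen, const char *msg) {
  assert(dest != nullptr && msg != nullptr);
  std::size_t msgLen = std::strlen(msg);
  std::size_t toCopy = std::min(msgLen, destLen);
  std::memcpy(dest, msg, toCopy);                    // truncates a long message
  std::memset(dest + toCopy, ' ', destLen - toCopy); // pads a short message
}

// Convenience wrapper for trying the helper on a pre-filled buffer.
std::string padInto(std::string buffer, const char *msg) {
  copyAndPad(&buffer[0], buffer.size(), msg);
  return buffer;
}
```

For example, `padInto("XXXXXXXXXX", "bad")` yields `"bad"` followed by seven blanks rather than `"badXXXXXXX"`.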
DeltaFile
+12-5flang-rt/unittests/Runtime/CommandTest.cpp
+4-8flang-rt/lib/runtime/execute.cpp
+16-132 files

LLVM/project f2dc489llvm/test/Analysis/CostModel/AMDGPU exp.ll exp10.ll

[AMDGPU] Replace undef with poison in exp/exp2/exp10 cost tests NFC (#185527)
DeltaFile
+192-192llvm/test/Analysis/CostModel/AMDGPU/exp.ll
+192-192llvm/test/Analysis/CostModel/AMDGPU/exp10.ll
+192-192llvm/test/Analysis/CostModel/AMDGPU/exp2.ll
+576-5763 files

LLVM/project 9d35ee2llvm/lib/Target/AMDGPU AMDGPULowerVGPREncoding.cpp

change all
DeltaFile
+9-9llvm/lib/Target/AMDGPU/AMDGPULowerVGPREncoding.cpp
+9-91 files

LLVM/project a9e457aoffload/plugins-nextgen/amdgpu/src rtl.cpp, offload/plugins-nextgen/common/include PluginInterface.h

[Offload][AMDGPU] Fix RPC server on mixed w32 w64 workloads (#185496)

Summary:
This was a regression from the original LLVM-gpu-loader. We used to
handle `-mwavefrontsize64` correctly in the loader by over-allocating
memory and just leaving the upper 32-bits masked off. In order to handle
this in offload we need to scan loaded kernels to see how much memory we
need to allocate. This should be safe: the protocol is designed to
handle an arbitrary size, and in the worst case this just wastes space.
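
A hedged sketch of the scanning idea, not the plugin code: `KernelInfo`, `maxWavefrontSize`, and the sizing rule in `rpcBufferBytes` are all hypothetical and only illustrate sizing for the widest kernel seen.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record for a loaded kernel; not the plugin's real type.
struct KernelInfo {
  const char *name;
  uint32_t wavefrontSize; // 32 or 64
};

// Largest wavefront size among the loaded kernels, or `defaultSize` if the
// list is empty or every kernel is narrower than the default.
uint32_t maxWavefrontSize(const std::vector<KernelInfo> &kernels,
                          uint32_t defaultSize) {
  uint32_t result = defaultSize;
  for (const KernelInfo &k : kernels) {
    assert(k.wavefrontSize == 32 || k.wavefrontSize == 64);
    result = std::max(result, k.wavefrontSize);
  }
  return result;
}

// Hypothetical sizing rule: one slot per lane per port. Sizing for the
// widest kernel over-allocates for w32 kernels but never under-allocates.
std::size_t rpcBufferBytes(std::size_t numPorts, uint32_t laneCount,
                           std::size_t bytesPerLane) {
  return numPorts * laneCount * bytesPerLane;
}
```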
DeltaFile
+21-0offload/plugins-nextgen/amdgpu/src/rtl.cpp
+3-3offload/plugins-nextgen/common/src/RPC.cpp
+3-0offload/plugins-nextgen/common/include/PluginInterface.h
+27-33 files

LLVM/project 6a6564clibc/hdr/types Elf64_auxv_t.h Elf32_auxv_t.h, libc/include elf.yaml

[libc] Add more macro/type declarations to Elf headers. (#185348)

* Add several `AT_` macro values from `<sys/auxv.h>`. In particular,
this allows making internal Linux auxv header parsing more hermetic by
removing one of the Linux header includes.
* Add constants between `DT_ADDRNGLO` and `DT_ADDRNGHI`, in particular
`DT_GNU_HASH`, which is de-facto standard on many platforms.
* Add `Elf32_auxv_t` and `Elf64_auxv_t` types which define the auxv
entries and can be used by VDSO parsing code. Note that this PR doesn't
yet update libc's own Linux auxv header support (in
`src/__support/OSUtil/linux/auxv.h`).

This fixes some of the missing definitions when building code that works
with ELF files, such as Abseil's debugging support in
https://github.com/abseil/abseil-cpp/tree/master/absl/debugging/internal.
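
To illustrate how code might consume such entries, here is a small sketch that walks an auxiliary vector. `AuxvEntry64` is a stand-in with the conventional 64-bit layout, not the type added by this commit, and `findAuxv` is a hypothetical helper. The `AT_` values are the standard Linux ones.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in with the conventional layout of a 64-bit auxv entry.
struct AuxvEntry64 {
  uint64_t a_type;
  union {
    uint64_t a_val;
  } a_un;
};

constexpr uint64_t kAtNull = 0;         // AT_NULL terminates the vector
constexpr uint64_t kAtPagesz = 6;       // AT_PAGESZ
constexpr uint64_t kAtSysinfoEhdr = 33; // AT_SYSINFO_EHDR (vDSO base)

// Scans entries until AT_NULL. Returns true and stores the value in `value`
// if an entry of the requested type is found.
bool findAuxv(const AuxvEntry64 *auxv, uint64_t type, uint64_t &value) {
  assert(auxv != nullptr);
  for (; auxv->a_type != kAtNull; ++auxv) {
    if (auxv->a_type == type) {
      value = auxv->a_un.a_val;
      return true;
    }
  }
  return false;
}
```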
DeltaFile
+38-0libc/include/elf.yaml
+22-0libc/hdr/types/Elf64_auxv_t.h
+22-0libc/hdr/types/Elf32_auxv_t.h
+21-0libc/include/llvm-libc-types/Elf32_auxv_t.h
+21-0libc/include/llvm-libc-types/Elf64_auxv_t.h
+16-0libc/hdr/types/CMakeLists.txt
+140-07 files not shown
+154-113 files

LLVM/project 0559fe6clang-tools-extra/clang-doc CMakeLists.txt, clang-tools-extra/clang-doc/benchmarks CMakeLists.txt

[clang-doc] Cleanup CMake files and ensure benchmarks build (#185469)

There's some poor formatting, and ClangDocBenchmark references several
targets that are required, but only because they're required for clang-doc
itself. We can just get those requirements from the clangDoc target.

Additionally, we can make sure the benchmark builds as part of testing
when LLVM_INCLUDE_BENCHMARKS is set.
DeltaFile
+0-5clang-tools-extra/clang-doc/benchmarks/CMakeLists.txt
+4-0clang-tools-extra/test/clang-doc/CMakeLists.txt
+1-1clang-tools-extra/clang-doc/CMakeLists.txt
+5-63 files

LLVM/project d5685acllvm/lib/Target/AMDGPU AMDGPULowerKernelAttributes.cpp, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.h AMDGPUBaseInfo.cpp

Revert "AMDGPU: Annotate group size ABI loads with range metadata (#185420)"

This reverts commit 76daf31b4000623d5c9548348a859ea3ed8712e1.

Bot failure.
DeltaFile
+15-122llvm/test/CodeGen/AMDGPU/implicit-arg-v5-opt.ll
+19-48llvm/lib/Target/AMDGPU/AMDGPULowerKernelAttributes.cpp
+7-8llvm/test/CodeGen/AMDGPU/amdgpu-max-num-workgroups-load-annotate.ll
+7-8llvm/test/CodeGen/AMDGPU/implicit-arg-block-count.ll
+2-5llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+5-0llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+55-1912 files not shown
+57-1938 files

LLVM/project 90978e4llvm/lib/Target/AArch64 AArch64Arm64ECCallLowering.cpp, llvm/test/CodeGen/AArch64 arm64ec-entry-thunks.ll

[arm64ec] Fix missing sret return in Arm64EC entry thunks for large struct returns (#185452)

When an Arm64EC function returns a struct by value that is too large for
x64's `RAX` (>8 bytes), the entry thunk synthesizes a hidden sret
pointer parameter for the x64 side. However, this
parameter was never marked with the sret attribute, so ISel did not copy
its value into `x8` (the Arm64EC mapping of `RAX`) on return. This
caused the x64 caller to see a garbage pointer in `RAX` instead of the
return buffer address.

The change adds the sret attribute to the thunk's synthesized pointer
parameter, so that `LowerFormalArguments` saves it and `LowerReturn`
restores it to `x8` before the tail call to `__os_arm64x_dispatch_ret`.
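
A conceptual model of the sret convention involved, written as ordinary C++ rather than the thunk itself. `Big`, `makeBig`, and `makeBigLowered` are hypothetical names. The point is that the lowered form must hand the buffer address back to the caller, which is the step the missing attribute skipped.

```cpp
#include <cassert>
#include <cstdint>

// A struct too large to come back in a single 8-byte register.
struct Big {
  uint64_t a;
  uint64_t b;
  uint64_t c;
};
static_assert(sizeof(Big) > 8, "Big must not fit in one 64-bit register");

// Source-level view: the struct is returned by value.
Big makeBig(uint64_t seed) { return Big{seed, seed + 1, seed + 2}; }

// Hand-lowered view: the caller passes a hidden pointer to the return
// buffer, and the callee returns that same pointer in the return register.
Big *makeBigLowered(Big *sret, uint64_t seed) {
  assert(sret != nullptr);
  *sret = makeBig(seed);
  return sret; // dropping this is the analogue of the garbage-pointer bug
}
```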

Fixes #185390
DeltaFile
+5-0llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
+2-0llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
+7-02 files

LLVM/project 13f5238llvm/test/CodeGen/AArch64 movi64_sve.ll

[AArch64][GlobalISel] Add test coverage to movi64_sve.ll. NFC
DeltaFile
+404-183llvm/test/CodeGen/AArch64/movi64_sve.ll
+404-1831 files

LLVM/project aa64d7elibc/test/src/sys/socket/linux sendmsg_recvmsg_test.cpp send_recv_test.cpp, libc/test/src/time mktime_test.cpp

[libc] Fix implicit lossy conversion warnings in tests (NFC). (#185503)

Cast expected error values (-1) to ssize_t for send/recv family of
functions.
DeltaFile
+4-2libc/test/src/sys/socket/linux/sendmsg_recvmsg_test.cpp
+2-2libc/test/src/sys/socket/linux/send_recv_test.cpp
+2-2libc/test/src/sys/socket/linux/sendto_recvfrom_test.cpp
+2-1libc/test/src/time/mktime_test.cpp
+10-74 files

LLVM/project 4a75d33llvm/lib/Target/PowerPC PPCRegisterInfo.td

[PowerPC][NFC] fix indentation and spacing (#185500)
DeltaFile
+34-37llvm/lib/Target/PowerPC/PPCRegisterInfo.td
+34-371 files

LLVM/project 089d69dflang-rt/lib/cuda memory.cpp descriptor.cpp, flang/include/flang/Runtime/CUDA common.h

[flang][cuda][NFC] Add filename and line number in error reporting (#185516)

Some entry points carry over the filename and line number for error
reporting. Use this information when reporting CUDA errors.
DeltaFile
+14-9flang-rt/lib/cuda/memory.cpp
+11-0flang/include/flang/Runtime/CUDA/common.h
+5-3flang-rt/lib/cuda/descriptor.cpp
+30-123 files

LLVM/project 53cb236clang/test/CodeGenObjC expose-direct-method-linkedlist.m expose-direct-method-visibility-linkage.m

address comments
DeltaFile
+11-59clang/test/CodeGenObjC/expose-direct-method-linkedlist.m
+10-15clang/test/CodeGenObjC/expose-direct-method-visibility-linkage.m
+21-742 files

LLVM/project a815666libunwind/src libunwind.cpp, libunwind/test cfi_violating_handler.pass.cpp

[libunwind][PAC] Defang ptrauth's PC in valid CFI range abort

It turns out making the CFI check a release mode abort causes many,
if not the majority, of JITs to fail during unwinding as they do not
set up CFI sections for their generated code. As a result any JITs
that do nominally support unwinding (and catching) through their JIT
or assembly frames trip this abort.

rdar://170862047
DeltaFile
+54-0libunwind/test/cfi_violating_handler.pass.cpp
+11-17libunwind/src/libunwind.cpp
+65-172 files

LLVM/project 6e8e8eallvm/lib/Analysis InstructionSimplify.cpp, llvm/test/Transforms/InstSimplify and-or-implied-cond.ll

Revert "[InstSimplify] Simplify and/or of trunc nuw to i1 with op replacement"

This reverts commit dacb62989db8084cc2865d6b9ef85bbdf34e112d.
DeltaFile
+11-3llvm/test/Transforms/InstSimplify/and-or-implied-cond.ll
+2-6llvm/lib/Analysis/InstructionSimplify.cpp
+13-92 files

LLVM/project a89bb62clang/lib/Headers gpuintrin.h, clang/test/Headers gpuintrin.c

[Clang] Update the 'gpuintrin.h' lane scan handling (#185451)

Summary:
This patch uses a more efficient algorithm for the reduction rather than
a divergent branch. We also provide a prefix and a suffix version; the sum
is now just the first element of this.

This changes the names to the following, which is technically breaking, but
I don't think these were really used in practice, and it's a trivial change
based on the clang version if it's really needed:
```
__gpu_prefix_scan_sum_u32(...)
__gpu_suffix_scan_sum_u32(...)
```
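
The message does not name the algorithm. One common branch-free approach is a log-step scan, sketched here as a host-side simulation over an array of lane values. Taking "the first element" to mean the first element of the suffix scan is an assumption, and `prefixScanSum`, `suffixScanSum`, and `reduceSum` are illustrative names rather than the header's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Log-step inclusive prefix sum. Each round, lane i adds the value held by
// lane i - step (a shuffle on real hardware); all lanes run the same code.
std::vector<uint32_t> prefixScanSum(std::vector<uint32_t> lanes) {
  for (std::size_t step = 1; step < lanes.size(); step *= 2) {
    std::vector<uint32_t> prev = lanes; // all lanes read before any writes
    for (std::size_t i = step; i < lanes.size(); ++i)
      lanes[i] = prev[i] + prev[i - step];
  }
  return lanes;
}

// Mirror image: lane i ends up with the sum of lanes i through N-1.
std::vector<uint32_t> suffixScanSum(std::vector<uint32_t> lanes) {
  for (std::size_t step = 1; step < lanes.size(); step *= 2) {
    std::vector<uint32_t> prev = lanes;
    for (std::size_t i = 0; i + step < lanes.size(); ++i)
      lanes[i] = prev[i] + prev[i + step];
  }
  return lanes;
}

// Under the assumption above, the full sum is the suffix scan's first lane.
uint32_t reduceSum(const std::vector<uint32_t> &lanes) {
  assert(!lanes.empty());
  return suffixScanSum(lanes).front();
}
```

With lane values `{1, 2, 3, 4}`, the prefix scan gives `{1, 3, 6, 10}`, the suffix scan gives `{10, 9, 7, 4}`, and the sum is 10.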
DeltaFile
+322-0clang/test/Headers/gpuintrin.c
+45-57clang/lib/Headers/gpuintrin.h
+1-1libc/src/__support/GPU/utils.h
+368-583 files

LLVM/project 7030a34libc/docs index.rst conf.py, libc/docs/Helpers Styles.rst

[libc][docs] Furo theme, new landing page, cleanups (#184303)

Switch the libc documentation site from the alabaster theme to Furo,
which provides mobile-friendly layout, a collapsible sidebar with
caption-based section grouping, and built-in "Edit this page" links.

Changes by area:

conf.py
- Switch html_theme to "furo"
- Add myst_parser extension (already in llvm/docs/requirements.txt, used
by LLDB/Clang/LLVM docs) to allow Markdown alongside RST
- Accept both .rst and .md source suffixes
- Configure Furo source_repository/source_branch/source_directory for
"Edit this page" links pointing to GitHub
- Wire _static/copybutton.{js,css} for copy-to-clipboard buttons on code
blocks (no new pip dependency; can migrate to sphinx-copybutton later
once it's in requirements-hashed.txt)
- Exclude plan-docs.md and Helpers/ from Sphinx processing

    [31 lines not shown]
DeltaFile
+73-51libc/docs/index.rst
+93-0libc/docs/dev/building_docs.rst
+56-0libc/docs/_static/copybutton.js
+40-0libc/docs/Helpers/Styles.rst
+39-0libc/docs/_static/copybutton.css
+17-7libc/docs/conf.py
+318-5820 files not shown
+340-13926 files

LLVM/project 13b3943lldb/examples/python formatter_bytecode.py, lldb/test/Shell/ScriptInterpreter/Python python-bytecode.test

[lldb][bytecode] Add Python to formatter bytecode compiler (#113734)

A compiler from Python to the assembly syntax of the [lldb data
formatter
bytecode](https://discourse.llvm.org/t/a-bytecode-for-lldb-data-formatters/82696).

Assisted-by: claude
DeltaFile
+406-10lldb/examples/python/formatter_bytecode.py
+38-0lldb/test/Shell/ScriptInterpreter/Python/Inputs/FormatterBytecode/RigidArrayLLDBFormatter.txt
+29-0lldb/test/Shell/ScriptInterpreter/Python/python-bytecode.test
+473-103 files

LLVM/project 575267fllvm/lib/Target/AMDGPU AMDGPULowerVGPREncoding.cpp

use ' instead of " for single character
DeltaFile
+2-2llvm/lib/Target/AMDGPU/AMDGPULowerVGPREncoding.cpp
+2-21 files

LLVM/project dacb629llvm/lib/Analysis InstructionSimplify.cpp, llvm/test/Transforms/InstSimplify and-or-implied-cond.ll

[InstSimplify] Simplify and/or of trunc nuw to i1 with op replacement
DeltaFile
+3-11llvm/test/Transforms/InstSimplify/and-or-implied-cond.ll
+6-2llvm/lib/Analysis/InstructionSimplify.cpp
+9-132 files

LLVM/project fa23f91llvm/test/Transforms/InstSimplify and-or-implied-cond.ll

[InstSimplify] Test Simplify and/or of trunc nuw to i1 with op replacement (NFC)
DeltaFile
+60-0llvm/test/Transforms/InstSimplify/and-or-implied-cond.ll
+60-01 files

LLVM/project cc331callvm/lib/Target/AMDGPU AMDGPULowerVGPREncoding.cpp

[NFC][AMDGPU] Add debug print to `AMDGPULowerVGPREncoding.cpp`
DeltaFile
+91-3llvm/lib/Target/AMDGPU/AMDGPULowerVGPREncoding.cpp
+91-31 files

LLVM/project 91ef927llvm/test/CodeGen/MIR/Generic prefetch-targets-error.mir

[CodeGen] Fix 4f094816ef7d2811b36ee328bac3b418dfd021cc

Missed fixing some stuff up due to files still being left around in my
build directory.
DeltaFile
+4-4llvm/test/CodeGen/MIR/Generic/prefetch-targets-error.mir
+4-41 files

LLVM/project 215b905mlir/include/mlir/Dialect/SPIRV/IR SPIRVTypes.h, mlir/lib/Dialect/SPIRV/IR SPIRVTypes.cpp

[mlir][spirv] Make `MatrixType` type a `ShapedType` (#185470)

This will allow enforcing some of the type constraints in ODS using
builtin classes, e.g., `AllElementTypesMatch`. This is the first PR in a
series of PRs moving all verification for Matrix ops to ODS.
DeltaFile
+23-10mlir/lib/Dialect/SPIRV/IR/SPIRVTypes.cpp
+23-2mlir/include/mlir/Dialect/SPIRV/IR/SPIRVTypes.h
+46-122 files

LLVM/project 2bd2332utils/bazel/llvm-project-overlay/mlir/unittests BUILD.bazel

[Bazel] Fixes 48e6adc (#185507)

This fixes 48e6adc97edd0d833908b49e2c504afb4a90c61a.
DeltaFile
+2-0utils/bazel/llvm-project-overlay/mlir/unittests/BUILD.bazel
+2-01 files

LLVM/project 4f09481llvm/test/CodeGen/MIR/Generic prefetch-targets-error.mir

[CodeGen] Fix prefetch-targets-error.mir

#184194 introduced this test, which was failing in some configurations
because incorrectly specified -o flags made it try to write output to the
test directory.
DeltaFile
+4-4llvm/test/CodeGen/MIR/Generic/prefetch-targets-error.mir
+4-41 files

LLVM/project be5da92clang/lib/Sema SemaHLSL.cpp

clang-format
DeltaFile
+1-1clang/lib/Sema/SemaHLSL.cpp
+1-11 files

LLVM/project 2cde7aclibc/test/src/sys/time utimes_test.cpp

[libc] Use explicit cast to time_t in utimes_test. (#185307)

This fixes an error on RISCV-32 bot, where time_t is "long long" type
(64-bit, as required by POSIX), instead of "long".
DeltaFile
+2-2libc/test/src/sys/time/utimes_test.cpp
+2-21 files