LLVM/project afdbe7bmlir/lib/Conversion/MemRefToSPIRV MemRefToSPIRV.cpp, mlir/test/Conversion/MemRefToSPIRV bitwidth-emulation.mlir atomic.mlir

[mlir][SPIR-V] Combine storage class bit with atomic memory semantics (#195049)
DeltaFile
+42-14mlir/lib/Conversion/MemRefToSPIRV/MemRefToSPIRV.cpp
+14-14mlir/test/Conversion/MemRefToSPIRV/bitwidth-emulation.mlir
+9-9mlir/test/Conversion/MemRefToSPIRV/atomic.mlir
+2-2mlir/test/Conversion/MemRefToSPIRV/alloc.mlir
+67-394 files

LLVM/project 6f782eellvm/docs ReleaseNotes.md LangRef.rst, llvm/include/llvm/IR DataLayout.h

[DataLayout] Add null pointer value infrastructure

Add support for specifying the null pointer bit representation per address space
in DataLayout via new pointer spec flags:
- 'z': null pointer is all-zeros
- 'o': null pointer is all-ones

When neither flag is present, the address space inherits the default set by the
new 'N<null-value>' top-level specifier ('Nz' or 'No'). If that is also absent,
the null pointer value is zero.
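
The fallback order described above can be modelled as a small lookup. This is an illustrative sketch only, not the DataLayout implementation: the function name `resolve_null_value` and the dictionary representation are hypothetical, and only the 'z' and 'o' flag letters are taken from the commit message.

```python
from typing import Dict, Optional

# The flag letters come from the commit message; everything else in this
# model is an assumption made for illustration.
ALL_ZEROS = "z"
ALL_ONES = "o"


def resolve_null_value(
    addr_space: int,
    per_space_flags: Dict[int, str],
    top_level_default: Optional[str] = None,
) -> str:
    """Return 'z' or 'o' for the given address space (hypothetical helper)."""
    # 1. An explicit flag on the address space's pointer spec wins.
    flag = per_space_flags.get(addr_space)
    if flag in (ALL_ZEROS, ALL_ONES):
        return flag
    # 2. Otherwise inherit the top-level 'N' default, if one was given.
    if top_level_default in (ALL_ZEROS, ALL_ONES):
        return top_level_default
    # 3. Otherwise the null pointer value is zero.
    return ALL_ZEROS
```

With this model, an address space without its own flag picks up the top-level default, and falls back to all-zeros only when neither is specified.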

No target DataLayout strings are updated in this change. This is pure
infrastructure for a future ConstantPointerNull semantic change to support
targets with non-zero null pointers (e.g. AMDGPU).
DeltaFile
+75-1llvm/unittests/IR/DataLayoutTest.cpp
+24-6llvm/lib/IR/DataLayout.cpp
+8-1llvm/include/llvm/IR/DataLayout.h
+7-0llvm/docs/ReleaseNotes.md
+5-2llvm/docs/LangRef.rst
+119-105 files

LLVM/project b6130afflang/lib/Semantics check-omp-structure.cpp check-omp-structure.h

[flang][OpenMP] Remove deferredNonVariables_ from OmpStructureChecker… (#195100)

…, NFC

It was created to defer error messages about invalid argument types
until the end of the analysis of the construct. That is not necessary
since diagnostic messages are emitted in the order corresponding to
their location in the source, not the order they were generated.
DeltaFile
+2-6flang/lib/Semantics/check-omp-structure.cpp
+0-1flang/lib/Semantics/check-omp-structure.h
+2-72 files

LLVM/project 9dcb6f7llvm/test/tools/llubi intr_vector_manip.ll intr_vscale_poison.ll, llvm/tools/llubi llubi.cpp

[llubi] Vector manipulation intrinsics cleanup (#195004)

This PR fixes llvm.vector.insert and llvm.vector.extract by adding a
missing UB case and handling scalable vectors correctly.

See also #194345.
DeltaFile
+26-7llvm/tools/llubi/lib/Interpreter.cpp
+14-4llvm/test/tools/llubi/intr_vector_manip.ll
+10-0llvm/tools/llubi/llubi.cpp
+2-2llvm/test/tools/llubi/intr_vscale_poison.ll
+52-134 files

LLVM/project 3d47936llvm/lib/CodeGen/SelectionDAG TargetLowering.cpp, llvm/test/CodeGen/AArch64 aarch64-mulv.ll vecreduce-fmul.ll

[DAG] expandVecReduce - widen sub-legal vectors to not prematurely scalarize later reduction levels (#194672)

When repeatedly splitting the pow2 vector source, we currently begin to
scalarize as soon as the split ops drop below the legal vector op type.

This patch attempts to widen the source vectors back to legal op types
to avoid excess scalarization / additional vector element extractions.

Fixes #194655
DeltaFile
+172-260llvm/test/CodeGen/X86/vector-compress.ll
+59-43llvm/test/CodeGen/X86/vector-extract-last-active.ll
+32-48llvm/test/CodeGen/PowerPC/cttz-elts.ll
+48-30llvm/test/CodeGen/AArch64/aarch64-mulv.ll
+35-29llvm/test/CodeGen/AArch64/vecreduce-fmul.ll
+37-2llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+383-4122 files not shown
+406-4358 files

LLVM/project 76db420libc/docs CMakeLists.txt, libc/docs/headers index.rst

[libc][docs] Add nl_types.h POSIX header documentation (#122006) (#194373)

Add nl_types.h implementation-status docs to llvm-libc.

Depends on PR #194367. That change fixes docgen lookup for underscored
headers; without it, the implementation status of the nl_types.h macros
is not reported accurately.
DeltaFile
+13-0libc/utils/docgen/nl_types.yaml
+1-0libc/docs/CMakeLists.txt
+1-0libc/docs/headers/index.rst
+15-03 files

LLVM/project 62310b0llvm/lib/ProfileData CMakeLists.txt

[ProfileData] Use FORCE_ON for LLVM_ENABLE_OPENCSD (#194973)

Use FORCE_ON instead of ON to only report the error but proceed when the
dependency is not found.
DeltaFile
+3-3llvm/lib/ProfileData/CMakeLists.txt
+3-31 files

LLVM/project da66e6dflang/include/flang/Lower/Support ReductionProcessor.h, flang/lib/Lower/OpenMP OpenMP.cpp ClauseProcessor.cpp

[flang][openmp] Fix incorrect reduction for array section in OpenMP DO SIMD (#192394)

For "!omp do parallel simd reduction", ensure that the reduction for an
array section is done properly:
1) per-SIMD-lane reduction results are combined into the wsloop's
   thread-local copies.
2) wsloop thread-local copies are combined across threads by the wsloop
   reduction.
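
The two combining steps listed above can be pictured with a small simulation. This is a sketch under stated assumptions, not the flang lowering: it assumes a `+` reduction over integer array sections, and the helper names `reduce_lanes_into_thread_copy` and `reduce_across_threads` are hypothetical.

```python
from typing import List


def reduce_lanes_into_thread_copy(
    thread_copy: List[int], lane_results: List[List[int]]
) -> List[int]:
    """Step 1: fold each SIMD lane's partial array section into the thread-local copy."""
    for lane in lane_results:
        thread_copy = [a + b for a, b in zip(thread_copy, lane)]
    return thread_copy


def reduce_across_threads(thread_copies: List[List[int]]) -> List[int]:
    """Step 2: combine the thread-local copies element-wise across threads."""
    return [sum(column) for column in zip(*thread_copies)]


# Two simulated threads, each with two SIMD lanes and a 3-element section.
thread0 = reduce_lanes_into_thread_copy([0, 0, 0], [[1, 2, 3], [4, 5, 6]])
thread1 = reduce_lanes_into_thread_copy([0, 0, 0], [[1, 1, 1], [2, 2, 2]])
total = reduce_across_threads([thread0, thread1])
```

The point of the sketch is only the ordering: lane partials are folded into the per-thread copy first, and the per-thread copies are combined afterwards.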
   
Issue is in [192077](https://github.com/llvm/llvm-project/issues/192077)

---------

Co-authored-by: Sunil Kuravinakop <kuravina at pe31.hpc.amslabs.hpecorp.net>
DeltaFile
+27-4flang/lib/Lower/Support/ReductionProcessor.cpp
+22-8flang/lib/Lower/OpenMP/OpenMP.cpp
+19-0flang/test/Lower/OpenMP/wsloop-simd.f90
+9-1flang/include/flang/Lower/Support/ReductionProcessor.h
+4-2flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+3-1flang/lib/Lower/OpenMP/ClauseProcessor.h
+84-166 files

LLVM/project 497d850utils/bazel/llvm-project-overlay/libc BUILD.bazel

[Bazel] Fixes bcc9a55 (#195091)

This fixes bcc9a55bdb228661d98444f0d6c74b47ed0426bb.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+42-0utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+42-01 files

LLVM/project b97c960llvm/lib/Transforms/Utils LoopUnroll.cpp

Add missing comment
DeltaFile
+1-1llvm/lib/Transforms/Utils/LoopUnroll.cpp
+1-11 files

LLVM/project d729590llvm/docs LangRef.rst ReleaseNotes.md, llvm/include/llvm/IR DataLayout.h

[DataLayout] Add null pointer value infrastructure

Add support for specifying the null pointer bit representation per address space
in DataLayout via new pointer spec flags:
- 'z': null pointer is all-zeros
- 'o': null pointer is all-ones

When neither flag is present, the address space inherits the default set by the
new 'N<null-value>' top-level specifier ('Nz' or 'No'). If that is also absent,
the null pointer value is zero.

No target DataLayout strings are updated in this change. This is pure
infrastructure for a future ConstantPointerNull semantic change to support
targets with non-zero null pointers (e.g. AMDGPU).
DeltaFile
+75-1llvm/unittests/IR/DataLayoutTest.cpp
+28-6llvm/lib/IR/DataLayout.cpp
+5-2llvm/docs/LangRef.rst
+7-0llvm/docs/ReleaseNotes.md
+6-1llvm/include/llvm/IR/DataLayout.h
+121-105 files

LLVM/project cc04ed6libc/cmake/modules LLVMLibCArchitectures.cmake

[libc] Stop passing `--version` to compiler when detecting target (#176680)

This reverts c267501c155f9, and also adds a `-c` flag.

Both gcc and clang print the `Target:` line that we're trying to find
just fine with just `-v`.

When passing `--version`, gcc passes `--version` to the system linker,
and when using gcc on macOS, the system linker does not understand
`--version`. Since `--version` does not seem to be necessary, drop it.

Also, passing `-c` lets gcc not print linker details, so add that too,
as a belt-and-suspenders fix.
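
As a rough illustration of the detection step, the following sketch scans verbose compiler output for the `Target:` line. It is not the CMake logic from the patch: the function name `find_target_triple` is hypothetical, and the sample text is a made-up stand-in for what a compiler might print.

```python
import re
from typing import Optional


def find_target_triple(verbose_output: str) -> Optional[str]:
    """Return the value of the first 'Target:' line, or None if there is none."""
    for line in verbose_output.splitlines():
        match = re.match(r"\s*Target:\s*(\S+)", line)
        if match:
            return match.group(1)
    return None


# Hypothetical sample output, for illustration only.
sample = "Using built-in specs.\nTarget: x86_64-linux-gnu\nThread model: posix\n"
triple = find_target_triple(sample)
```

Because only the `Target:` line is consumed, any extra linker chatter in the output is irrelevant to the result, which is why dropping flags that provoke the linker is safe in this model.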

---

Makes `cmake` succeed for me on my mac with
`/Applications/CMake.app/Contents/bin/cmake ../llvm-project/llvm -G
Ninja -DLLVM_ENABLE_PROJECTS="libc" -DCMAKE_BUILD_TYPE=Release
-DCMAKE_C_COMPILER=gcc-12 -DCMAKE_CXX_COMPILER=g++-12` (with gcc-12 from
homebrew).
DeltaFile
+1-1libc/cmake/modules/LLVMLibCArchitectures.cmake
+1-11 files

LLVM/project 97e9716lldb/test/API/functionalities/breakpoint/delayed_breakpoints TestDelayedBreakpoint.py main.c

fixup! Add test
DeltaFile
+41-0lldb/test/API/functionalities/breakpoint/delayed_breakpoints/TestDelayedBreakpoint.py
+7-0lldb/test/API/functionalities/breakpoint/delayed_breakpoints/main.c
+3-0lldb/test/API/functionalities/breakpoint/delayed_breakpoints/Makefile
+51-03 files

LLVM/project a791695lldb/source/Target Process.cpp

fixup! Also delay initial breakpoint creation
DeltaFile
+8-1lldb/source/Target/Process.cpp
+8-11 files

LLVM/project b6aa86allvm/lib/Target/SPIRV SPIRVBuiltins.td, llvm/test/CodeGen/SPIRV iaddcarry-builtin.ll isubborrow-builtin.ll

[SPIR-V] Register __spirv_* arithmetic builtins for GLSL_std_450 (#195018)
DeltaFile
+5-0llvm/test/CodeGen/SPIRV/iaddcarry-builtin.ll
+5-0llvm/lib/Target/SPIRV/SPIRVBuiltins.td
+5-0llvm/test/CodeGen/SPIRV/isubborrow-builtin.ll
+0-3llvm/test/CodeGen/SPIRV/instructions/quantizeto16.ll
+2-0llvm/test/CodeGen/SPIRV/umulextended-builtin.ll
+2-0llvm/test/CodeGen/SPIRV/smulextended-builtin.ll
+19-36 files

LLVM/project 9a65e4alldb/source/Plugins/SymbolLocator/SymStore SymbolLocatorSymStore.cpp

[lldb] Change verbose logs into regular ones in SymbolLocatorSymStore (#195095)
DeltaFile
+5-6lldb/source/Plugins/SymbolLocator/SymStore/SymbolLocatorSymStore.cpp
+5-61 files

LLVM/project bc4aa89llvm/lib/Target/RISCV RISCVInstrInfo.cpp RISCVVLOptimizer.cpp, llvm/test/CodeGen/RISCV/rvv vl-opt.mir

[RISCV] Fix crash when tryReduceVL tries to sink to the end of the basic block. (#194706)

tryReduceVL may need to move an instruction to make the VL dominate. If
there is no instruction after the VL instruction, getNextNode will
return a nullptr.

Rewrite the code to use iterators so we will get an end iterator
instead. Replace the call to MachineInstr::moveBefore with the
equivalent MachineBasicBlock::splice which works on iterators.
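
The difference between the two styles can be shown with a loose analogy. The sketch below is not LLVM code: a Python list stands in for a basic block, `next_node` mimics a pointer-style lookup that has no answer at the end of the block, and `move_before_position` mimics an iterator-style move where the destination may be the end position. Both helper names are hypothetical.

```python
from typing import List, Optional


def next_node(block: List[str], index: int) -> Optional[str]:
    """Pointer-style lookup: None when no instruction follows `index`."""
    return block[index + 1] if index + 1 < len(block) else None


def move_before_position(block: List[str], src_index: int, dest_pos: int) -> None:
    """Iterator-style move: dest_pos may equal len(block), meaning 'end of block'."""
    instr = block.pop(src_index)
    # Removing an earlier element shifts the destination one slot to the left.
    if src_index < dest_pos:
        dest_pos -= 1
    block.insert(dest_pos, instr)


block = ["user", "other", "vl_def"]
# Nothing follows the VL definition, so a pointer-style lookup has nothing to return.
after_vl = next_node(block, 2)
# The position just past the VL definition is still a valid destination.
move_before_position(block, 0, 3)
```

In this model the "one past the last instruction" position is always representable, so sinking to the end of the block needs no special case.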
DeltaFile
+34-0llvm/test/CodeGen/RISCV/rvv/vl-opt.mir
+3-3llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+3-2llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
+2-1llvm/lib/Target/RISCV/RISCVInstrInfo.h
+42-64 files

LLVM/project 3279844clang/test/OpenMP nvptx_teams_reduction_codegen.cpp target_teams_reduction_codegen.cpp, openmp/device/src Reduction.cpp

[OpenMP][offload] Cross-team reductions with variable number of teams

This is the first patch in an upcoming series of patches that rework
OpenMP cross-team reductions.

This patch tries to be as minimal as possible and includes the following
changes:
1) Don't work through a larger number of teams in chunks. Allocate a
   suitably sized global buffer for the team values and launch them all
   at once. The last team that finishes uses a strided loop to reduce the
   team values from the global buffer.
2) Inline the new functions to reduce register usage, get rid of spills,
   and get rid of long switch-tables that codegen produced for the
   indirect callbacks that are passed to the parallel/xteam reduction.*
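
The scheme in item 1 can be sketched as a sequential simulation. This is only a model under stated assumptions, not the device runtime code: it assumes a sum reduction, it uses a plain counter to stand in for however the runtime detects the last team to finish, and the names `strided_reduce` and `run_teams` are hypothetical.

```python
from typing import List, Optional


def strided_reduce(buffer: List[int], num_threads: int) -> int:
    """Simulated thread t sums buffer[t], buffer[t + num_threads], and so on."""
    partials = [0] * num_threads
    for t in range(num_threads):
        for i in range(t, len(buffer), num_threads):
            partials[t] += buffer[i]
    # Combine the per-thread partial sums (a team-local step on a real device).
    return sum(partials)


def run_teams(team_values: List[int], threads_per_team: int) -> Optional[int]:
    """Every team stores its value; the team that finishes last reduces the buffer."""
    buffer = [0] * len(team_values)  # one slot per team, sized up front
    finished = 0                     # assumption: stands in for an atomic counter
    result = None
    for team_id, value in enumerate(team_values):
        buffer[team_id] = value
        finished += 1
        if finished == len(team_values):  # this team finished last
            result = strided_reduce(buffer, threads_per_team)
    return result


total = run_teams(list(range(1, 11)), threads_per_team=4)
```

Sizing the buffer to the number of teams is what lets all teams launch at once in this model, instead of being processed chunk by chunk.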

The performance benefits in comparison to the previous state are often
up to 5x-10x. I did not observe any performance regressions. Can be
reproduced using my benchmark suite https://github.com/ro-i/xteam-test
(6854b7abc8848702b5a2d9ce2ea02849b5dc590b). Set compiler paths in

    [14 lines not shown]
DeltaFile
+0-3,642clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
+2,297-0clang/test/OpenMP/target_teams_reduction_codegen.cpp
+143-157openmp/device/src/Reduction.cpp
+60-60clang/test/OpenMP/target_teams_distribute_parallel_for_simd_schedule_codegen.cpp
+60-60clang/test/OpenMP/target_teams_distribute_parallel_for_schedule_codegen.cpp
+60-60clang/test/OpenMP/teams_distribute_parallel_for_simd_schedule_codegen.cpp
+2,620-3,979161 files not shown
+4,067-5,424167 files

LLVM/project ca96b67flang/lib/Semantics check-omp-structure.cpp check-omp-structure.h

[flang][OpenMP] Remove deferredNonVariables_ from OmpStructureChecker, NFC

It was created to defer error messages about invalid argument types until
the end of the analysis of the construct. That is not necessary since
diagnostic messages are emitted in the order corresponding to their
location in the source, not the order they were generated.
DeltaFile
+2-6flang/lib/Semantics/check-omp-structure.cpp
+0-1flang/lib/Semantics/check-omp-structure.h
+2-72 files

LLVM/project 2ab9a9bcompiler-rt/lib/builtins/arm divdf3.S muldf3.S

Formatting changes (left shifts, change of base, flip #if)
DeltaFile
+12-12compiler-rt/lib/builtins/arm/divdf3.S
+2-2compiler-rt/lib/builtins/arm/muldf3.S
+14-142 files

LLVM/project 8b7dd15clang/docs ReleaseNotes.rst, clang/lib/Sema SemaDeclCXX.cpp

Fix memcpy-operator= generation with restrict parameters. (#194906)

The below issue (and #63884) both report that we reject (and also
assert, because the memcpy failed) the memcpy we're generating for a
restrict field of a type with an implicit copy constructor.

First, we shouldn't be rejecting it this late. If we wanted to reject it
(I contend we do not), we should do it at the same time we reject
const-members/make this a deleted operator. Second, of course, we
shouldn't fail.

This patch NOW works by just having us skip the premature 'memcpy'
optimization here. In the end, the memcpy is generally skipped by
`CodeGenFunction::EmitCXXMemberOrOperatorMemberCallExpr` in the example
(as this is a trivial type), but this reverts it to using a 'for' loop
for restrict, as it does for const, and volatile qualified values.

We perhaps might think about doing this for address-spaces/ptr-auth, but
at the moment, this fixes restrict version.

Fixes: #37979
DeltaFile
+16-0clang/test/SemaCXX/GH37979.cpp
+1-1clang/lib/Sema/SemaDeclCXX.cpp
+1-0clang/docs/ReleaseNotes.rst
+18-13 files

LLVM/project 3e08390llvm/docs AMDGPUUsage.rst, llvm/docs/AMDGPU DeveloperGuideline.rst

[NFC][AMDGPU][Doc] Add developer guideline

This guideline covers topics on top of the existing LLVM guidelines.
DeltaFile
+394-0llvm/docs/AMDGPU/DeveloperGuideline.rst
+1-0llvm/docs/AMDGPUUsage.rst
+395-02 files

LLVM/project fc77aa9llvm/lib/Target/AArch64 AArch64Combine.td

[AArch64][GlobalISel] Use generic matchinfo. NFC (#195094)

This removes some of the simple AArch64 matchinfo's, using the generic
alternatives instead.
DeltaFile
+4-8llvm/lib/Target/AArch64/AArch64Combine.td
+4-81 files

LLVM/project c9f6984lldb/test/API/functionalities/breakpoint/delayed_breakpoints TestDelayedBreakpoint.py main.c

fixup! Add test
DeltaFile
+41-0lldb/test/API/functionalities/breakpoint/delayed_breakpoints/TestDelayedBreakpoint.py
+7-0lldb/test/API/functionalities/breakpoint/delayed_breakpoints/main.c
+3-0lldb/test/API/functionalities/breakpoint/delayed_breakpoints/Makefile
+51-03 files

LLVM/project c72a01fmlir/lib/Dialect/XeGPU/Transforms XeGPULayoutImpl.cpp, mlir/test/Dialect/XeGPU propagate-layout-subgroup.mlir

[MLIR][XeGPU] Consider alignment in dpas sg_layout creation (#181141)
DeltaFile
+39-10mlir/test/Dialect/XeGPU/propagate-layout-subgroup.mlir
+11-0mlir/lib/Dialect/XeGPU/Transforms/XeGPULayoutImpl.cpp
+50-102 files

LLVM/project 327cfc5llvm/include/llvm/Target/GlobalISel Combine.td

[GlobalISel] Remove spaces at the ends of lines in Combine.td. NFC (#195086)

Some editors do this automatically. Clean up the file so that it doesn't
come up again and again.
DeltaFile
+14-14llvm/include/llvm/Target/GlobalISel/Combine.td
+14-141 files

LLVM/project d6410f3offload/plugins-nextgen/common/include ErrorReporting.h

[Offload] Remove use of raw Linux fd for error reporting (#195073)

Summary:
This is a blocker for building on Windows. We should be able to share the
common err stream that LLVM provides.
DeltaFile
+2-3offload/plugins-nextgen/common/include/ErrorReporting.h
+2-31 files

LLVM/project abc0093lldb/source/Plugins/Process/Windows/Common ProcessWindows.cpp

[lldb][windows] fix build issue (#195089)

Fix a build issue on Windows introduced by
https://github.com/llvm/llvm-project/pull/192964.
DeltaFile
+1-1lldb/source/Plugins/Process/Windows/Common/ProcessWindows.cpp
+1-11 files

LLVM/project bcc9a55libc/src/__support/OSUtil/linux sysinfo.h CMakeLists.txt, libc/src/__support/OSUtil/linux/syscall_wrappers sched_getaffinity.h CMakeLists.txt

[libc] add proc number parser and sysconf wrapper (#194159)

Add the functionality to detect number of processors with best effort.
Needed by STL to detect parallelism.

Assisted-by: Codex with gpt-5.4 high fast
DeltaFile
+191-0libc/src/__support/OSUtil/linux/sysinfo.h
+93-0libc/test/src/__support/OSUtil/linux/sysinfo_test.cpp
+36-0libc/src/__support/OSUtil/linux/syscall_wrappers/sched_getaffinity.h
+14-7libc/src/unistd/linux/sysconf.cpp
+17-0libc/src/__support/OSUtil/linux/CMakeLists.txt
+14-0libc/src/__support/OSUtil/linux/syscall_wrappers/CMakeLists.txt
+365-75 files not shown
+403-1011 files

LLVM/project 3575a57lldb/source/Target Process.cpp

fixup! Also delay initial breakpoint creation
DeltaFile
+2-1lldb/source/Target/Process.cpp
+2-11 files