LLVM/project 6a202b8clang/docs ReleaseNotes.rst, clang/include/clang/Basic DiagnosticLexKinds.td

Revert "[Clang] Implement P2843R3 - Preprocessing is never undefined (#192073)"

This reverts commit 42e0cdf2fcc3455d95be7d875302a0f7cb7c592d.
DeltaFile
+0-48clang/test/Preprocessor/p2843r3.cpp
+8-12clang/test/Lexer/cxx-features.cpp
+1-9clang/www/cxx_status.html
+0-6clang/docs/ReleaseNotes.rst
+1-3clang/lib/Lex/PPExpressions.cpp
+0-2clang/include/clang/Basic/DiagnosticLexKinds.td
+10-806 files

LLVM/project 2581cc8clang/include/clang/AST ASTContext.h, clang/lib/AST ASTContext.cpp ItaniumMangle.cpp

[clang] implement CWG2064: ignore value dependence for decltype

The 'decltype' of a value-dependent (but non-type-dependent) expression should
be known, so this patch makes such types non-opaque instead.
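
As a rough illustration of the kind of expression involved (a sketch, not taken from the patch or its tests): `sizeof(T)` is value-dependent, since its value varies with `T`, but it is not type-dependent, since its type is always `std::size_t`. The names `Probe` and `sizeTypeIsSizeT` below are hypothetical.

```cpp
#include <cstddef>
#include <type_traits>

// sizeof(T) is value-dependent (its value depends on T) but not
// type-dependent: its type is std::size_t for every T.
template <typename T>
struct Probe {
  using SizeType = decltype(sizeof(T));
};

// Hypothetical helper for illustration only: reports whether the
// decltype above names std::size_t for a given T.
template <typename T>
constexpr bool sizeTypeIsSizeT() {
  return std::is_same<typename Probe<T>::SizeType, std::size_t>::value;
}
```

This compiles the same way before and after the change; it only shows why the type of such a `decltype` can be determined without knowing `T`.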

This patch also implements what's necessary to allow overloading
on pure differences in instantiation dependence, making `std::void_t`
usable for SFINAE purposes.

This also readds a few test cases from da98651, which was a previous attempt
at resolving CWG2064.

Fixes #8740
Fixes #61818
Fixes #190388
DeltaFile
+889-175clang/lib/AST/ASTContext.cpp
+312-12clang/test/SemaTemplate/instantiation-dependence.cpp
+151-93clang/lib/AST/ItaniumMangle.cpp
+76-68clang/lib/AST/Type.cpp
+76-48clang/lib/Sema/SemaTemplate.cpp
+93-16clang/include/clang/AST/ASTContext.h
+1,597-41282 files not shown
+2,352-77588 files

LLVM/project 6ff9ca2llvm/lib/Target/RISCV RISCVISelDAGToDAG.cpp

[RISCV] Don't check isApplicableToPLI for simm12 constants. (#192522)

It won't match except when the constant is -1, for which we should use li.
This avoids an unnecessary call to hasAllWUsers in that case.
DeltaFile
+2-2llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+2-21 files

LLVM/project ce02e11llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

rebase

Created using spr 1.3.7
DeltaFile
+160,429-171,418llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+54,182-54,736llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+92,827-0llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+42,349-42,348llvm/test/MC/AMDGPU/gfx8_asm_vop3.s
+769,567-268,50232,932 files not shown
+5,732,585-2,797,52232,938 files

LLVM/project c195385utils/bazel MODULE.bazel.lock MODULE.bazel

[bazel] Update rules_python (#192518)

This pulls in this fix
https://github.com/bazel-contrib/rules_python/pull/3420
DeltaFile
+5-5utils/bazel/MODULE.bazel.lock
+1-1utils/bazel/MODULE.bazel
+6-62 files

LLVM/project 7039515clang/lib/CodeGen CGStmtOpenMP.cpp, clang/test/OpenMP metadirective_device_arch_codegen.cpp

[OpenMP] Fix convention of SPIRV outline functions (#192450)

When creating an outline function for device code, we're not setting the
right calling convention when the target is SPIRV. This results in the
calls to the function being removed by the InstCombine pass, as it thinks
the function is not callable.
DeltaFile
+6-0clang/lib/CodeGen/CGStmtOpenMP.cpp
+2-3clang/test/OpenMP/metadirective_device_arch_codegen.cpp
+0-1offload/test/offloading/ompx_coords.c
+8-43 files

LLVM/project 19ad75emlir/include/mlir/Dialect/OpenACC OpenACCCGOps.td, mlir/lib/Dialect/OpenACC/IR OpenACCCG.cpp

[mlir][acc] Ensure implicit declare hoisting works for compute_region (#192501)

Any hoisting across `acc.compute_region` needs to be wired through block
arguments as the region is `IsolatedFromAbove`. Thus update
`ACCImplicitDeclare` to do so by using new API
`wireHoistedValueThroughIns` which handles the value wiring after
hoisting.
DeltaFile
+178-0mlir/unittests/Dialect/OpenACC/OpenACCCGOpsTest.cpp
+21-0mlir/test/Dialect/OpenACC/acc-implicit-declare.mlir
+18-0mlir/lib/Dialect/OpenACC/IR/OpenACCCG.cpp
+10-0mlir/include/mlir/Dialect/OpenACC/OpenACCCGOps.td
+7-2mlir/lib/Dialect/OpenACC/Transforms/ACCImplicitDeclare.cpp
+1-0mlir/unittests/Dialect/OpenACC/CMakeLists.txt
+235-21 files not shown
+236-27 files

LLVM/project 981da65clang/test/ClangScanDeps prune-scanning-modules.m, llvm/test/tools/llvm-objcopy/ELF strip-preserve-atime.test

Invalidate tests using "touch -a" on Darwin (#192521)

Tests use 'touch -a', which is known to fail on macOS.
DeltaFile
+1-1llvm/test/tools/llvm-objcopy/ELF/strip-preserve-atime.test
+1-1clang/test/ClangScanDeps/prune-scanning-modules.m
+2-22 files

LLVM/project e210f22clang/docs MemorySanitizer.rst ThreadSanitizer.rst

[Clang][Docs] Fix malformed code-block directive in MSan and TSan docs (#190461)

The `code-block` directives in MemorySanitizer.rst and
ThreadSanitizer.rst were missing a leading period (`. code-block`
instead of `.. code-block`). This syntax error caused Sphinx to fail to
recognize the directives, resulting in the subsequent C code being
rendered as plain text rather than a syntax-highlighted block.

The currently broken rendering on the official docs can be seen
[here](https://clang.llvm.org/docs/MemorySanitizer.html#interaction-of-inlining-with-disabling-sanitizer-instrumentation)
and
[here](https://clang.llvm.org/docs/ThreadSanitizer.html#interaction-of-inlining-with-disabling-sanitizer-instrumentation).

Fixed the typos to ensure proper HTML rendering.
DeltaFile
+1-1clang/docs/MemorySanitizer.rst
+1-1clang/docs/ThreadSanitizer.rst
+2-22 files

LLVM/project f162be2llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV stack-clash-prologue.ll

[RISCV] Use unsigned comparison for stack clash probing loop (#192485)

The stack clash probing loop generated in `emitDynamicProbedAlloc` used
a signed comparison (`RISCV::COND_BLT`) to determine when the allocation
target had been reached.

In 32-bit mode, memory addresses above `0x80000000` have the sign bit
set. If the stack pointer lands in this region, treating the addresses
as signed integers causes the comparison logic to fail.

This patch changes the condition code to `RISCV::COND_BLTU` (Branch if
Less Than Unsigned), which generates an unsigned comparison. This
ensures that addresses are treated correctly as unsigned quantities on
all targets.
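
A minimal sketch of why the signed comparison misbehaves on RV32, assuming two's-complement registers. The helper names and the example addresses are illustrative and are not taken from the patch.

```cpp
#include <cstdint>

// Models a 32-bit BLT: the register values are compared as signed integers.
// The cast relies on two's-complement wraparound (guaranteed since C++20,
// and the behaviour of mainstream compilers before that).
bool signedLess(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::int32_t>(a) < static_cast<std::int32_t>(b);
}

// Models a 32-bit BLTU: the register values are compared as unsigned integers.
bool unsignedLess(std::uint32_t a, std::uint32_t b) {
  return a < b;
}
```

With a target address of `0x7FFFF000` and a stack pointer of `0x80001000`, `unsignedLess(target, sp)` is true, as expected for addresses, while `signedLess(target, sp)` is false because `0x80001000` is negative when read as a signed value. For two addresses on the same side of `0x80000000` the two comparisons agree.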

On 64-bit systems, this change has no practical effect on valid
user-space addresses because they do not use the sign bit (being
restricted to the lower half of the address space). However, using
unsigned comparison is the correct behavior for pointer arithmetic and

    [2 lines not shown]
DeltaFile
+12-12llvm/test/CodeGen/RISCV/rvv/stack-probing-dynamic.ll
+2-2llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+2-2llvm/test/CodeGen/RISCV/stack-clash-prologue.ll
+16-163 files

LLVM/project 0f45edelibc/test/src/__support/wctype CMakeLists.txt

revert constexpr steps
DeltaFile
+2-2libc/test/src/__support/wctype/CMakeLists.txt
+2-21 files

LLVM/project e60e400llvm/lib/Target/RISCV RISCVISelDAGToDAG.cpp, llvm/test/CodeGen/RISCV rv32p.ll rv64p.ll

[RISCV][P-ext] Use pli.b when only the lower 2 bytes are used. (#192400)

If the lower 2 bytes are the same and are the only bytes used we
can use pli.b instead of lui+addi.
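
The byte check described above might look roughly like the following. This is a sketch under the assumption that the caller has already established that only the low 16 bits of the constant are used; `lowBytesMatch` is a hypothetical name and is not the function in RISCVISelDAGToDAG.cpp.

```cpp
#include <cstdint>

// Hypothetical helper: returns true if the low two bytes of Imm are
// identical and stores that byte in Byte. Bits above bit 15 are ignored
// because the caller is assumed to have shown they are unused.
bool lowBytesMatch(std::uint64_t Imm, std::uint8_t &Byte) {
  std::uint8_t B0 = static_cast<std::uint8_t>(Imm & 0xFF);
  std::uint8_t B1 = static_cast<std::uint8_t>((Imm >> 8) & 0xFF);
  if (B0 != B1)
    return false;
  Byte = B0;
  return true;
}
```

For example, `0x12345A5A` passes the check with byte `0x5A`, while `0x1234` does not.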
DeltaFile
+24-10llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+16-6llvm/test/CodeGen/RISCV/rv32p.ll
+10-0llvm/test/CodeGen/RISCV/rv64p.ll
+50-163 files

LLVM/project bec1019clang/cmake/caches Fuchsia-stage2.cmake

Revert "[Fuchsia] Stack analysis flags for runtimes" (#192515)

Reverts llvm/llvm-project#175677

We noticed using -fexperimental-call-graph-section with Control Flow
Integrity causes link failures in certain situations. Reverting this
change that sets the call graph section flag until we investigate the
root cause of the problem and handle it in the compiler well.
DeltaFile
+5-5clang/cmake/caches/Fuchsia-stage2.cmake
+5-51 files

LLVM/project af4d33bllvm/docs AMDGPUUsage.rst, llvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp

[AMDGPU] Add `.amdgpu.info` section for per-function metadata

AMDGPU object linking requires the linker to propagate resource usage
(registers, stack, LDS) across translation units. To support this, the compiler
must emit per-function metadata and call graph edges in the relocatable object
so the linker can compute whole-program resource requirements.

This PR introduces a `.amdgpu.info` ELF section using a tagged, length-prefixed
binary format: each entry is encoded as:

```
[kind: u8] [len: u8] [payload: <len> bytes]
```

A function scope is opened by an `INFO_FUNC` entry (containing a symbol
reference), followed by per-function attributes (register counts, flags, private
segment size) and relational edges (direct calls, LDS uses, indirect call
signatures). String data such as function type signatures is stored in a
companion `.amdgpu.strtab` section.
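
To make the entry layout concrete, here is a minimal reader for a buffer of `[kind][len][payload]` records. It is a sketch of the general tagged, length-prefixed technique only: `InfoEntry` and `parseEntries` are hypothetical names, the kind values are left uninterpreted, and nothing here reflects the actual LLVM implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One decoded record: the kind tag and its raw payload bytes.
struct InfoEntry {
  std::uint8_t Kind;
  std::vector<std::uint8_t> Payload;
};

// Splits Data into [kind: u8][len: u8][payload: len bytes] records and
// appends them to Out. Returns false if the buffer ends in the middle of
// a header or payload.
bool parseEntries(const std::vector<std::uint8_t> &Data,
                  std::vector<InfoEntry> &Out) {
  std::size_t Pos = 0;
  while (Pos < Data.size()) {
    if (Data.size() - Pos < 2)
      return false; // truncated header
    std::uint8_t Kind = Data[Pos];
    std::size_t Len = Data[Pos + 1];
    Pos += 2;
    if (Data.size() - Pos < Len)
      return false; // truncated payload
    auto First = Data.begin() + static_cast<std::ptrdiff_t>(Pos);
    auto Last = First + static_cast<std::ptrdiff_t>(Len);
    Out.push_back({Kind, std::vector<std::uint8_t>(First, Last)});
    Pos += Len;
  }
  return true;
}
```

A reader of this shape can skip records whose kind it does not recognise, since the length byte says how far to advance.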

    [4 lines not shown]
DeltaFile
+196-0llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
+172-2llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+121-0llvm/test/MC/AMDGPU/amdgpu-info-roundtrip.s
+117-0llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+113-0llvm/docs/AMDGPUUsage.rst
+83-0llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-prototype.ll
+802-29 files not shown
+1,171-1215 files

LLVM/project c35e66elibc/test/src/__support/wctype CMakeLists.txt

reduce constexpr steps
DeltaFile
+2-2libc/test/src/__support/wctype/CMakeLists.txt
+2-21 files

LLVM/project 40fb302flang/test/Parser/OpenMP nonblock-do-nested-omp.f90

Remove OpenMP version from test
DeltaFile
+2-2flang/test/Parser/OpenMP/nonblock-do-nested-omp.f90
+2-21 files

LLVM/project 9931b78clang/lib/Driver/ToolChains Cuda.cpp AMDGPU.cpp, clang/test/Driver amdgpu-multilib.yaml nvptx-multilib.yaml

[Clang] Add multilib support for GPU targets (#192285)

Summary:
This PR uses the new, generic multilib support added in
https://github.com/llvm/llvm-project/pull/188584
to also function for GPU targets. This will allow toolchains to easily
provide variants of these GPU libraries (for debug or asan). In
practice, this will look something like this:

```console
  -DRUNTIMES_amdgcn-amd-amdhsa+debug_CMAKE_BUILD_TYPE=Debug \
  -DRUNTIMES_amdgcn-amd-amdhsa+debug_LIBOMPTARGET_ENABLE_DEBUG=ON \
  -DRUNTIMES_amdgcn-amd-amdhsa+debug_LLVM_ENABLE_RUNTIMES=openmp \
  -DLLVM_RUNTIME_MULTILIBS=debug \
  -DLLVM_RUNTIME_MULTILIB_debug_TARGETS="amdgcn-amd-amdhsa" \
```

This will then install it into the tree like this:
```

    [7 lines not shown]
DeltaFile
+80-0clang/test/Driver/amdgpu-multilib.yaml
+80-0clang/test/Driver/nvptx-multilib.yaml
+15-1clang/lib/Driver/ToolChains/Cuda.cpp
+14-0clang/lib/Driver/ToolChains/AMDGPU.cpp
+1-0clang/lib/Driver/ToolChains/Clang.cpp
+190-15 files

LLVM/project 02589e1flang/lib/Parser openmp-parsers.cpp, flang/test/Parser/OpenMP nonblock-do-nested-omp.f90

[flang][OpenMP] Get final label from nested constructs

Non-block DO loops can share termination statements. When parsing
a non-block DO loop, account for labels on terminating statements
from recursively parsed ExecutionPartConstructs.

Fixes https://github.com/llvm/llvm-project/issues/188892
DeltaFile
+88-0flang/test/Parser/OpenMP/nonblock-do-nested-omp.f90
+6-0flang/lib/Parser/openmp-parsers.cpp
+94-02 files

LLVM/project 8d8be91libc/src/__support/wctype perfect_hash_map.h, libc/test/src/__support/wctype wctype_perfect_hash_test.cpp

Apply code review
DeltaFile
+67-62libc/src/__support/wctype/perfect_hash_map.h
+6-0libc/test/src/__support/wctype/wctype_perfect_hash_test.cpp
+73-622 files

LLVM/project ca3bc44flang/lib/Optimizer/Builder IntrinsicCall.cpp, flang/test/Lower/Intrinsics transfer.f90

[flang] Inline scalar-to-scalar TRANSFER for same-size trivial types (#191589)

Inline the TRANSFER intrinsic for scalar-to-scalar cases where the
result is a trivial type (integer, real, etc.) and source and result
have the same storage size. Instead of calling _FortranATransfer, the
lowering now emits a fir.convert on the source address followed by a
fir.load, effectively performing a reinterpret cast.
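
As a C++ analogy for the reinterpret cast (not the FIR that flang emits), a same-size scalar transfer amounts to copying the source's storage into a result of the other type. The function name below is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Reinterprets the storage of a 4-byte float as a 32-bit unsigned integer,
// analogous to a scalar TRANSFER between same-size trivial types.
std::uint32_t transferFloatToU32(float Source) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "source and result must have the same storage size");
  std::uint32_t Result;
  std::memcpy(&Result, &Source, sizeof Result);
  return Result;
}
```

On a platform with IEEE 754 single-precision floats, `transferFloatToU32(1.0f)` yields `0x3F800000`.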
DeltaFile
+105-8flang/test/Lower/Intrinsics/transfer.f90
+33-2flang/lib/Optimizer/Builder/IntrinsicCall.cpp
+138-102 files

LLVM/project 38f3d0bllvm/lib/CodeGen MachineCopyPropagation.cpp, llvm/test/CodeGen/X86 machine-copy-prop.mir

[MCP] Never eliminate frame-setup/destroy instructions (#186237)

Presumably targets only insert frame instructions which are significant,
and there may be effects MCP doesn't model. Similar to reserved registers this
is probably overly conservative, but as this causes no codegen change in
any lit test I think it is benign.

The motivation is just to clean up #183149 for AMDGPU, as we can spill
to physical registers, and currently have to spill the EXEC mask purely
to enable debug-info.

Change-Id: I9ea4a09b34464c43322edd2900361bf635efd9f7
DeltaFile
+22-0llvm/test/CodeGen/X86/machine-copy-prop.mir
+11-5llvm/lib/CodeGen/MachineCopyPropagation.cpp
+33-52 files

LLVM/project 2086b87clang/docs ReleaseNotes.rst, clang/include/clang/AST CommentSema.h

[clang] Fix false positive with -Wdocumentation and explicit instantiations (#178223)

Solves a use case listed in #64087.
DeltaFile
+19-0clang/lib/AST/CommentSema.cpp
+10-0clang/test/Sema/warn-documentation.cpp
+3-0clang/docs/ReleaseNotes.rst
+1-0clang/include/clang/AST/CommentSema.h
+33-04 files

LLVM/project 45dd0d7clang/cmake/caches Fuchsia-stage2.cmake

Revert "[Fuchsia] Stack analysis flags for runtimes (#175677)"

This reverts commit aacda8da6bc66287f45712c7b334ed552f315fcc.
DeltaFile
+5-5clang/cmake/caches/Fuchsia-stage2.cmake
+5-51 files

LLVM/project 23dcca9llvm/lib/Target/AMDGPU AMDGPUInstCombineIntrinsic.cpp, llvm/test/Transforms/InstCombine/AMDGPU llvm.amdgcn.cluster.id.ll llvm.amdgcn.workitem.id.ll

Revert "[AMDGPU] InstCombine: fold invalid calls to amdgcn intrinsics into poison values" (#192514)

Reverts llvm/llvm-project#191904
DeltaFile
+0-80llvm/test/Transforms/InstCombine/AMDGPU/llvm.amdgcn.cluster.id.ll
+0-57llvm/test/Transforms/InstCombine/AMDGPU/llvm.amdgcn.workitem.id.ll
+0-57llvm/test/Transforms/InstCombine/AMDGPU/llvm.amdgcn.workgroup.id.ll
+0-55llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+0-48llvm/test/Transforms/InstCombine/AMDGPU/llvm.amdgcn.dispatch.ptr.ll
+0-48llvm/test/Transforms/InstCombine/AMDGPU/llvm.amdgcn.queue.ptr.ll
+0-3452 files not shown
+0-4098 files

LLVM/project 59aab2allvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU lds-link-time-codegen.ll lds-link-time-codegen-named-barrier.ll

[AMDGPU] Emit the relocation symbol for LDS and named barrier when object linking is enabled
DeltaFile
+50-0llvm/test/CodeGen/AMDGPU/lds-link-time-codegen.ll
+35-0llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-named-barrier.ll
+12-3llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+12-0llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+109-34 files

LLVM/project 5aea02aoffload/test/api omp_indirect_func_struct.c omp_indirect_call_table_manual.c, openmp/device/src Misc.cpp

[OpenMP][Device] Fix __llvm_omp_indirect_call_lookup function pointer types (#192502)

`__llvm_omp_indirect_call_lookup` takes in and returns a function
pointer, so make sure the types are correct, which includes the correct
address space.

The FE was recently changed to generate the correct code
[here](https://github.com/llvm/llvm-project/pull/192470).

With this change, three function pointer tests start passing.

Signed-off-by: Nick Sarnie <nick.sarnie at intel.com>
DeltaFile
+4-4openmp/device/src/Misc.cpp
+0-1offload/test/api/omp_indirect_func_struct.c
+0-1offload/test/api/omp_indirect_call_table_manual.c
+0-1offload/test/api/omp_indirect_func_array.c
+4-74 files

LLVM/project 6b054fdclang/lib/CIR/CodeGen CIRGenDecl.cpp CIRGenFunction.h, clang/test/CIR/CodeGen field-init-eh.cpp

[CIR] Implement EH handling for field initializers (#192360)

This implements the handling to call the dtor for any previously
initialized fields of destructed type if an exception is thrown later in
the initialization of the containing class.

The basic infrastructure to handle this was already in place. We just
needed a function to push an EH-only destroy cleanup on the EH stack and
a call to that function.
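
The language behaviour being implemented can be seen in a small standalone program. This is an illustrative sketch with hypothetical type names, not the test added in field-init-eh.cpp: when a later field's initialization throws, fields that were already initialized are destroyed, and fields that were never reached are not.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Records constructor and destructor calls in order.
std::vector<std::string> EventLog;

struct Tracked {
  std::string Name;
  explicit Tracked(std::string N) : Name(std::move(N)) {
    EventLog.push_back("ctor " + Name);
  }
  ~Tracked() { EventLog.push_back("dtor " + Name); }
};

struct Throws {
  Throws() { throw std::runtime_error("init failed"); }
};

struct Outer {
  Tracked First{"first"};  // initialized before the exception
  Throws Second;           // throws while Outer is being constructed
  Tracked Third{"third"};  // never constructed
};

// Returns true if Outer was constructed, false if its construction threw.
bool constructOuter() {
  try {
    Outer O;
    (void)O;
    return true;
  } catch (const std::runtime_error &) {
    return false;
  }
}
```

After `constructOuter()` returns false, `EventLog` holds `ctor first` followed by `dtor first`, and nothing for `third`.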
DeltaFile
+83-0clang/test/CIR/CodeGen/field-init-eh.cpp
+11-0clang/lib/CIR/CodeGen/CIRGenDecl.cpp
+3-0clang/lib/CIR/CodeGen/CIRGenFunction.h
+1-2clang/lib/CIR/CodeGen/CIRGenClass.cpp
+98-24 files

LLVM/project 0bbfddfllvm/test/Transforms/SLPVectorizer/AArch64 spillcost-call-between-operands.ll

[SLP][NFC]Add a test with the incorrect spill cost calculation between operands



Pull Request: https://github.com/llvm/llvm-project/pull/192509
DeltaFile
+45-0llvm/test/Transforms/SLPVectorizer/AArch64/spillcost-call-between-operands.ll
+45-01 files

LLVM/project 2427dc4llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

[SLP][NFC] Remove unused PtrN parameter from analyzeConstantStrideCandidate() (#191567)
DeltaFile
+6-6llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+6-61 files

LLVM/project 95389ddllvm/lib/Target/AMDGPU AMDGPUMCInstLower.cpp SIInstrInfo.cpp

AMDGPU: Implement getInstSizeVerifyMode

Replace the custom instruction size check.
DeltaFile
+0-22llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
+7-0llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+3-0llvm/lib/Target/AMDGPU/SIInstrInfo.h
+10-223 files