LLVM/project da28d01 mlir/include/mlir/Dialect/OpenACC OpenACCCGOps.td OpenACCOpsTypes.td, mlir/test/Dialect/OpenACC ops-cg-privatization.mlir

[mlir][acc] Introduce privatization operations for codegen (#195273)

This change adds codegen-oriented operations for representing
private-variable storage and materializing the storage that a particular
parallel execution actually uses.

The two operations are meant to be used together:
- acc.privatize introduces an abstract handle for the privatized
  storage, including the parallel levels that determine the ultimate
  size of the storage needed. Which parallel levels apply can be stated
  when that structure is known, or omitted so the same representation
  can be refined later as launch and loop parallelism are decided.
- acc.private_local takes that handle and yields the concrete storage
  for the current execution context (for example, the slice that
  corresponds to this gang or worker).
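
The relationship between the handle and the per-context storage can be modelled in plain C++. This is only a conceptual sketch, not the MLIR ops: `PrivateHandle`, `totalElems`, and `localOffset` are hypothetical names, and it assumes one private copy per combination of parallel-level indices, laid out contiguously.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the abstract handle: each private copy holds
// `elemsPerCopy` elements, and every known parallel level adds an extent.
struct PrivateHandle {
  std::size_t elemsPerCopy;
  // Extents of the levels that apply, e.g. {numGangs, numWorkers}.
  // Left empty while the parallel structure is still undecided.
  std::vector<std::size_t> levelExtents;
};

// Assumed sizing rule: one copy per combination of level indices.
std::size_t totalElems(const PrivateHandle &h) {
  std::size_t copies = 1;
  for (std::size_t extent : h.levelExtents)
    copies *= extent;
  return copies * h.elemsPerCopy;
}

// Element offset of the copy owned by the context with the given ids
// (row-major over the levels); stands in for the slice selection.
std::size_t localOffset(const PrivateHandle &h,
                        const std::vector<std::size_t> &ids) {
  assert(ids.size() == h.levelExtents.size());
  std::size_t linear = 0;
  for (std::size_t i = 0; i < ids.size(); ++i)
    linear = linear * h.levelExtents[i] + ids[i];
  return linear * h.elemsPerCopy;
}
```

With no levels recorded, the model needs storage for a single copy; adding extents later grows the requirement without changing how a context asks for its slice.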
Delta File
+116 -0 mlir/test/Dialect/OpenACC/ops-cg-privatization.mlir
+65 -0 mlir/include/mlir/Dialect/OpenACC/OpenACCCGOps.td
+11 -0 mlir/include/mlir/Dialect/OpenACC/OpenACCOpsTypes.td
+192 -0 3 files

LLVM/project ffa3e2b llvm/test/CodeGen/X86 vector-reduce-smin.ll vector-reduce-smax.ll

rebase

Created using spr 1.3.7
Delta File
+2,928 -1,388 llvm/test/CodeGen/X86/vector-reduce-smin.ll
+2,924 -1,389 llvm/test/CodeGen/X86/vector-reduce-smax.ll
+2,677 -1,279 llvm/test/CodeGen/X86/vector-reduce-umax.ll
+2,628 -1,271 llvm/test/CodeGen/X86/vector-reduce-umin.ll
+1,491 -563 llvm/test/CodeGen/X86/vector-reduce-or-cmp.ll
+1,334 -623 llvm/test/CodeGen/X86/vector-reduce-and-bool.ll
+13,982 -6,513 819 files not shown
+40,137 -21,094 825 files

LLVM/project 84fe3a2 llvm/test/CodeGen/X86 vector-reduce-smin.ll vector-reduce-smax.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.7

[skip ci]
Delta File
+2,928 -1,388 llvm/test/CodeGen/X86/vector-reduce-smin.ll
+2,924 -1,389 llvm/test/CodeGen/X86/vector-reduce-smax.ll
+2,677 -1,279 llvm/test/CodeGen/X86/vector-reduce-umax.ll
+2,628 -1,271 llvm/test/CodeGen/X86/vector-reduce-umin.ll
+1,491 -563 llvm/test/CodeGen/X86/vector-reduce-or-cmp.ll
+1,334 -623 llvm/test/CodeGen/X86/vector-reduce-and-bool.ll
+13,982 -6,513 819 files not shown
+40,137 -21,094 825 files

LLVM/project fa48d10 clang/lib/Analysis/LifetimeSafety FactsGenerator.cpp, clang/test/Sema warn-lifetime-safety-invalidations.cpp

Revert "[LifetimeSafety] Detect iterator invalidation through container alias…"

This reverts commit b561bdbedd7bb59112cbb3eeafda70e3493555f4.
Delta File
+21 -65 clang/test/Sema/warn-lifetime-safety-invalidations.cpp
+3 -5 clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+24 -70 2 files

LLVM/project 0d1bd85 llvm/include/llvm/ADT ArrayRef.h

[NFC][LLVM][ADT] Fix indentation for ArrayRef.h (#195522)

Remove extra indentation in ArrayRef.h, in conformance with
https://llvm.org/docs/CodingStandards.html#namespace-indentation
Delta File
+474 -486 llvm/include/llvm/ADT/ArrayRef.h
+474 -486 1 files

LLVM/project 40a97b5 clang-tools-extra/clang-tidy/modernize UseStringViewCheck.cpp UseStringViewCheck.h, clang-tools-extra/test/clang-tidy/checkers/modernize use-string-view-overloaded.cpp use-string-view.cpp

Revert "[clang-tidy] An option for conditional skipping overloaded functions …"

This reverts commit c859a273b516b5b50ab0967c966a913401dd47eb.
Delta File
+0 -146 clang-tools-extra/test/clang-tidy/checkers/modernize/use-string-view-overloaded.cpp
+91 -0 clang-tools-extra/test/clang-tidy/checkers/modernize/use-string-view.cpp
+3 -7 clang-tools-extra/clang-tidy/modernize/UseStringViewCheck.cpp
+0 -4 clang-tools-extra/clang-tidy/modernize/UseStringViewCheck.h
+94 -157 4 files

LLVM/project a0330b3 flang/lib/Optimizer/CodeGen CodeGen.cpp, flang/test/Fir convert-to-llvm-access-group.fir

[flang] Fix missed access group attribute when converting FIR to LLVM dialect. (#195376)

Apply the access group attribute to the memcpy generated when lowering
fir.load/fir.store of a box, if the original FIR operation had it.
Delta File
+109 -0 flang/test/Fir/convert-to-llvm-access-group.fir
+10 -2 flang/lib/Optimizer/CodeGen/CodeGen.cpp
+119 -2 2 files

LLVM/project c738bfa compiler-rt/lib/asan asan_errors.cpp

[asan] Change error to note when poison record is not found (#195669)

When `CheckPoisonRecords` fails to find a record, it's often due to the
history buffer being too small rather than a functional error in the
logic.
Delta File
+1 -1 compiler-rt/lib/asan/asan_errors.cpp
+1 -1 1 files

LLVM/project 4e32fa9 llvm/test/CodeGen/AArch64 rem-by-const.ll mul_pow2.ll, llvm/test/CodeGen/AArch64/GlobalISel combine-sub-of-mul-const.ll

[GIsel] Add combine (sub a, (mul x, C)) -> (add a, (mul x, -C)) (#194282)

Copy this canonicalization from InstCombine so it can run on
post-legalized expansions. This is especially useful if the sub is a
neg.
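
The rewrite is value-preserving because fixed-width integer arithmetic wraps modulo 2^n, so `a - x*C` and `a + x*(-C)` agree for every input. A minimal check in plain C++, assuming 32-bit unsigned values stand in for the generic machine integers (the helper names are illustrative and not part of GlobalISel):

```cpp
#include <cassert>
#include <cstdint>

// Original form: (sub a, (mul x, C)).
constexpr uint32_t subOfMul(uint32_t a, uint32_t x, uint32_t c) {
  return a - x * c;
}

// Canonical form: (add a, (mul x, -C)). Unsigned negation wraps modulo 2^32.
constexpr uint32_t addOfMulNeg(uint32_t a, uint32_t x, uint32_t c) {
  return a + x * (0u - c);
}

// With a == 0 the sub is a neg, and the canonical form is one multiply by -C.
static_assert(subOfMul(0, 7, 3) == addOfMulNeg(0, 7, 3),
              "neg of mul matches mul by negated constant");
// The identity also holds when the product overflows.
static_assert(subOfMul(10, 0xFFFFFFFFu, 0x80000000u) ==
                  addOfMulNeg(10, 0xFFFFFFFFu, 0x80000000u),
              "identity holds under wraparound");
```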
Delta File
+370 -389 llvm/test/CodeGen/AMDGPU/GlobalISel/fshl.ll
+160 -172 llvm/test/CodeGen/AMDGPU/GlobalISel/fshr.ll
+142 -75 llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll
+98 -95 llvm/test/CodeGen/AArch64/rem-by-const.ll
+91 -0 llvm/test/CodeGen/AArch64/GlobalISel/combine-sub-of-mul-const.ll
+57 -25 llvm/test/CodeGen/AArch64/mul_pow2.ll
+918 -756 7 files not shown
+1,036 -818 13 files

LLVM/project ba728ce llvm/test/CodeGen/X86 vector-reduce-smin.ll vector-reduce-smax.ll

rebase

Created using spr 1.3.7
Delta File
+2,928 -1,388 llvm/test/CodeGen/X86/vector-reduce-smin.ll
+2,924 -1,389 llvm/test/CodeGen/X86/vector-reduce-smax.ll
+2,677 -1,279 llvm/test/CodeGen/X86/vector-reduce-umax.ll
+2,628 -1,271 llvm/test/CodeGen/X86/vector-reduce-umin.ll
+1,491 -563 llvm/test/CodeGen/X86/vector-reduce-or-cmp.ll
+1,334 -623 llvm/test/CodeGen/X86/vector-reduce-and-bool.ll
+13,982 -6,513 805 files not shown
+38,538 -19,828 811 files

LLVM/project 97dec52 llvm/test/CodeGen/X86 vector-reduce-smin.ll vector-reduce-smax.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.7

[skip ci]
Delta File
+2,928 -1,388 llvm/test/CodeGen/X86/vector-reduce-smin.ll
+2,924 -1,389 llvm/test/CodeGen/X86/vector-reduce-smax.ll
+2,677 -1,279 llvm/test/CodeGen/X86/vector-reduce-umax.ll
+2,628 -1,271 llvm/test/CodeGen/X86/vector-reduce-umin.ll
+1,491 -563 llvm/test/CodeGen/X86/vector-reduce-or-cmp.ll
+1,334 -623 llvm/test/CodeGen/X86/vector-reduce-and-bool.ll
+13,982 -6,513 805 files not shown
+38,538 -19,828 811 files

LLVM/project 8ebe9d5 llvm/test/CodeGen/X86 vector-reduce-smin.ll vector-reduce-smax.ll

rebase

Created using spr 1.3.7
Delta File
+2,928 -1,388 llvm/test/CodeGen/X86/vector-reduce-smin.ll
+2,924 -1,389 llvm/test/CodeGen/X86/vector-reduce-smax.ll
+2,677 -1,279 llvm/test/CodeGen/X86/vector-reduce-umax.ll
+2,628 -1,271 llvm/test/CodeGen/X86/vector-reduce-umin.ll
+1,491 -563 llvm/test/CodeGen/X86/vector-reduce-or-cmp.ll
+1,334 -623 llvm/test/CodeGen/X86/vector-reduce-and-bool.ll
+13,982 -6,513 803 files not shown
+38,523 -19,803 809 files

LLVM/project a5313a2 llvm/test/CodeGen/X86 vector-reduce-smin.ll vector-reduce-smax.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.7

[skip ci]
Delta File
+2,928 -1,388 llvm/test/CodeGen/X86/vector-reduce-smin.ll
+2,924 -1,389 llvm/test/CodeGen/X86/vector-reduce-smax.ll
+2,677 -1,279 llvm/test/CodeGen/X86/vector-reduce-umax.ll
+2,628 -1,271 llvm/test/CodeGen/X86/vector-reduce-umin.ll
+1,491 -563 llvm/test/CodeGen/X86/vector-reduce-or-cmp.ll
+1,334 -623 llvm/test/CodeGen/X86/vector-reduce-and-bool.ll
+13,982 -6,513 803 files not shown
+38,523 -19,803 809 files

LLVM/project b8142ec compiler-rt/lib/asan asan_errors.cpp

[asan] Improve manual poison reporting (#195666)

Always print the thread ID that poisoned the memory, even if the
stack trace is unavailable.
Delta File
+2 -3 compiler-rt/lib/asan/asan_errors.cpp
+2 -3 1 files

LLVM/project bda0016 mlir/include/mlir/Dialect/AMDGPU/IR AMDGPUOps.td, mlir/lib/Conversion/AMDGPUToROCDL AMDGPUToROCDL.cpp

[MLIR][AMDGPU] Add amdgpu.global_transpose_load op for gfx1200+ global memory transpose loads (#195287)

Adds a new `amdgpu.global_transpose_load` op to the AMDGPU dialect that
wraps the `global_load_tr` family of instructions introduced in RDNA4
(gfx1250+). Each thread reads a column of a matrix from global memory
and receives the corresponding transposed row in its result register.

The op is kept separate from the existing `amdgpu.transpose_load` (which
targets LDS via `ds_read_tr` on gfx950+) because the two variants target
different GPU architecture families, have different chipset
requirements, and differ in their valid (element size, num elements)
combinations — in particular the 16-bit case produces a 128-bit
(8-element) result via `global_load_tr.b128` rather than the 64-bit
(4-element) result from `ds_read_tr16.b64`.

Adds lowering to the existing ROCDL `global.load.tr{4,6,.}.b{64,96,128}`
intrinsics for gfx1200+.

---------

    [2 lines not shown]
Delta File
+81 -1 mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+57 -0 mlir/test/Conversion/AMDGPUToROCDL/global_transpose_load.mlir
+49 -0 mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPUOps.td
+37 -0 mlir/lib/Dialect/AMDGPU/IR/AMDGPUOps.cpp
+10 -0 mlir/test/Dialect/AMDGPU/invalid.mlir
+9 -0 mlir/test/Dialect/AMDGPU/ops.mlir
+243 -1 6 files

LLVM/project b057c78 mlir/lib/Conversion/MathToLLVM MathToLLVM.cpp, mlir/test/Conversion/MathToLLVM math-to-llvm.mlir

[mlir][MathToLLVM] Fix vector type checks in math.absi lowering. (#195360)

For vector types, the lowered type is LLVMArrayType, not VectorType. We
should use the original result type to decide whether the lowering can
be done for vectors.

Signed-off-by: hanhanW <hanhan0912 at gmail.com>
Delta File
+11 -0 mlir/test/Conversion/MathToLLVM/math-to-llvm.mlir
+1 -1 mlir/lib/Conversion/MathToLLVM/MathToLLVM.cpp
+12 -1 2 files

LLVM/project d27d0f0 mlir/include/mlir/Dialect/SPIRV/IR SPIRVBarrierOps.td SPIRVTypes.h, mlir/lib/Dialect/SPIRV/IR SPIRVTypes.cpp SPIRVDialect.cpp

[mlir][SPIRV] Add named-barrier type and OpNamedBarrierInitialize / OpMemoryNamedBarrier (#195664)

Adds the SPIR-V named-barrier object (TypeNamedBarrier) along with
NamedBarrierInitialize and MemoryNamedBarrier ops, gated on the
NamedBarrier capability and SPIR-V 1.1+.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply at anthropic.com>
Delta File
+101 -0 mlir/include/mlir/Dialect/SPIRV/IR/SPIRVBarrierOps.td
+34 -0 mlir/test/Dialect/SPIRV/IR/barrier-ops.mlir
+17 -4 mlir/lib/Dialect/SPIRV/IR/SPIRVTypes.cpp
+15 -1 mlir/test/Target/SPIRV/barrier-ops.mlir
+11 -0 mlir/include/mlir/Dialect/SPIRV/IR/SPIRVTypes.h
+9 -2 mlir/lib/Dialect/SPIRV/IR/SPIRVDialect.cpp
+187 -7 6 files not shown
+222 -7 12 files

LLVM/project c6bf92e clang/include/clang/Options FlangOptions.td, flang/docs Extensions.md

[flang][semantics] Add a flag to relax some of the semantic constraints on C_LOC (#195112)

This PR adds a flag that downgrades some of the semantic constraints on
C_LOC so that it can be used more like LOC. Without the flag, behavior
is unmodified; with the flag, the constraint that the address be an
object pointer or target is removed. There are other constraints we
might consider relaxing, but I think this is a start.
Delta File
+99 -0 flang/test/Semantics/c_loc01-relaxed.f90
+37 -14 flang/lib/Evaluate/intrinsics.cpp
+11 -0 clang/include/clang/Options/FlangOptions.td
+6 -0 flang/docs/Extensions.md
+5 -0 flang/lib/Frontend/CompilerInvocation.cpp
+2 -2 flang/include/flang/Support/Fortran-features.h
+160 -16 2 files not shown
+164 -16 8 files

LLVM/project 8a26ecc clang/test/CXX/drs cwg27xx.cpp, clang/www cxx_dr_status.html

[clang][NFC] Mark CWG2785 as implemented and add a test (#195547)

[CWG2785](https://wg21.link/cwg2785) clarifies that a
*requires-expression* is never type-dependent: it always has type
`bool`. That means that in a snippet like this:
```cpp
void g(void *);

template <typename T>
void f() {
  g(requires { T(); });
}
```
The call to `g` should be diagnosed as invalid (`bool` is not
convertible to `void *`) even if the template is never instantiated.
Clang does the right thing since version 10:
https://godbolt.org/z/s61rEbsfz
Delta File
+13 -0 clang/test/CXX/drs/cwg27xx.cpp
+1 -1 clang/www/cxx_dr_status.html
+14 -1 2 files

LLVM/project 55124c4 clang/lib/CIR/CodeGen CIRGenBuiltinNVPTX.cpp, clang/test/CIR/CodeGenBuiltins/NVPTX builtins-nvptx-sync.cu builtins-sm90.cu

[CIR][NVPTX] Implement sync and cluster barrier builtins (#195217)
Delta File
+28 -33 clang/lib/CIR/CodeGen/CIRGenBuiltinNVPTX.cpp
+45 -0 clang/test/CIR/CodeGenBuiltins/NVPTX/builtins-nvptx-sync.cu
+44 -0 clang/test/CIR/CodeGenBuiltins/NVPTX/builtins-sm90.cu
+117 -33 3 files

LLVM/project c19b9cf flang/lib/Semantics resolve-names.cpp, flang/test/Lower/CUDA cuda-gpu-managed-without-fcuda.f90

[flang][CUDA] Only apply implicit managed attribute when CUDA Fortran is enabled (#195353)

The implicit-managed tagging added in #175648 was intended for CUDA
Fortran allocatables. However, the gate was just
LanguageFeature::CudaManaged, so the tagging also fires on
non-CUDA-Fortran translation units when -gpu=mem:managed is in effect.

This patch adds a LanguageFeature::CUDA check so the implicit tagging
only fires for CUDA Fortran TUs (driver-set -fcuda or .cuf/.CUF source).
It also adds a regression test checking that bbc -gpu=managed without
-fcuda on a .f90 source does not produce any cuf.* ops or
#cuf.cuda<managed> attributes.
Delta File
+59 -0 flang/test/Lower/CUDA/cuda-gpu-managed-without-fcuda.f90
+6 -1 flang/lib/Semantics/resolve-names.cpp
+65 -1 2 files

LLVM/project 6a8e7e4 lldb/test/API/functionalities/data-formatter/builtin-formats TestBuiltinFormats.py

[lldb] Make TestBuiltinFormats.py work on arm64e (#195163)

Co-authored-by: Med Ismail Bennani <ismail at bennani.ma>
Delta File
+1 -1 lldb/test/API/functionalities/data-formatter/builtin-formats/TestBuiltinFormats.py
+1 -1 1 files

LLVM/project fec54af llvm/include/llvm/SandboxIR Region.h, llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer RegionWithScore.h

[SandboxIR][SandboxVec] Remove score tracking from Region, add RegionWithScore (#190293)

Up until now the `Region` class contained a `ScoreBoard` and was
tracking instruction costs by default. However, design-wise the Region
is a generic IR-level structure and should be independent from score
tracking.

So this patch removes the score tracking capability from the base
`Region` class and creates a separate `RegionWithScore` derived class
for that. The new class is placed in the vectorizer directory because
the score tracking is meant for the vectorizer.
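
The shape of this split can be sketched in standalone C++. Only the class names `Region` and `RegionWithScore` come from the commit; the `Instr` record, the `add`/`size`/`getScore` members, and the integer cost are hypothetical simplifications and do not reflect the actual SandboxIR interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction record carrying a precomputed cost.
struct Instr {
  int cost;
};

// Generic region: tracks membership only, with no notion of cost.
class Region {
public:
  virtual ~Region() = default;
  virtual void add(const Instr &instr) { instrs.push_back(instr); }
  std::size_t size() const { return instrs.size(); }

private:
  std::vector<Instr> instrs;
};

// Vectorizer-side extension: keeps a running score next to the membership.
class RegionWithScore : public Region {
public:
  void add(const Instr &instr) override {
    Region::add(instr);  // membership stays the base class's job
    score += instr.cost; // score tracking lives only in the derived class
  }
  int getScore() const { return score; }

private:
  int score = 0;
};
```

Code that only needs a generic region keeps working against `Region`, while vectorizer code that needs costs asks for `RegionWithScore` explicitly.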

Should be NFC.
Delta File
+102 -0 llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/RegionWithScore.h
+20 -77 llvm/unittests/SandboxIR/RegionTest.cpp
+84 -0 llvm/unittests/Transforms/Vectorize/SandboxVectorizer/RegionWithScoreTest.cpp
+29 -51 llvm/include/llvm/SandboxIR/Region.h
+7 -43 llvm/lib/SandboxIR/Region.cpp
+34 -0 llvm/lib/Transforms/Vectorize/SandboxVectorizer/RegionWithScore.cpp
+276 -171 7 files not shown
+291 -182 13 files

LLVM/project c61be37 clang/lib/CodeGen BackendUtil.cpp, clang/test/DebugInfo/Generic codeview-buildinfo.c

[clang][CodeView] Prevent the input name from appearing in LF_BUILDINFO (#194140)

The implicit contract of an `LF_BUILDINFO` record (represented in LLVM
by
[`BuildInfoRecord`](https://github.com/llvm/llvm-project/blob/6f0b55ec55f3e5e1ccc0d6b0d04a307479218768/llvm/include/llvm/DebugInfo/CodeView/TypeRecord.h#L667))
is that its `CommandLine` field should not contain the input source file
— a separate `SourceFile` field is reserved for that.

When the command-line flattening was moved from `llvm/` to `clang/` in
#106369, the comparison value used to identify and strip the source
positional was switched from `MainSourceFile->getFilename()` (the full
input path resolved by clang) to `CodeGenOpts.MainFileName` (just the
basename, set via `-main-file-name`). As a result, when the driver is
invoked with an absolute source path the cc1 positional is that absolute
path and no longer matches `MainFileName`, so the source filename leaks
into `CommandLine` as a trailing positional cc1 argument.
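
The mismatch described above can be reproduced with a small standalone sketch. `flattenCommandLine` is a hypothetical helper, not the clang implementation; it only illustrates how stripping the source positional depends on which string it is compared against.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: join the arguments with spaces, skipping any argument
// equal to `sourceArg` (the source file has its own SourceFile field).
std::string flattenCommandLine(const std::vector<std::string> &args,
                               const std::string &sourceArg) {
  std::string out;
  for (const std::string &arg : args) {
    if (arg == sourceArg)
      continue; // treated as the input file, so left out of CommandLine
    if (!out.empty())
      out += ' ';
    out += arg;
  }
  return out;
}
```

With `args = {"-cc1", "-O2", "/src/a.c"}`, comparing against the basename `a.c` leaves `/src/a.c` in the flattened string, whereas comparing against the full path `/src/a.c` yields `-cc1 -O2`.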

This is a regression in Clang 20. It breaks downstream tooling such as
Live++, whose unity-splitting feature relies on the embedded command

    [12 lines not shown]
Delta File
+33 -0 clang/test/DebugInfo/Generic/codeview-buildinfo.c
+19 -3 clang/lib/CodeGen/BackendUtil.cpp
+52 -3 2 files

LLVM/project a4aca5a libunwind/src AddressSpace.hpp DwarfParser.hpp

[libunwind] fix build errors on x32 and mips n32 (#194310)

(cherry picked from commit 06ddfcf0ca9cdb1481fff3cff6f73d5c26d45ffe)
Delta File
+1 -1 libunwind/src/AddressSpace.hpp
+1 -1 libunwind/src/DwarfParser.hpp
+2 -2 2 files

LLVM/project f306525 compiler-rt/test/xray/TestCases/Posix basic-filtering.cpp

Fix flaky test xray/basic-filtering.cpp (#186611)

Increase time thresholds and sleep time to decrease the probability of
failure.

Closes: #175866
Delta File
+3 -3 compiler-rt/test/xray/TestCases/Posix/basic-filtering.cpp
+3 -3 1 files

LLVM/project 4a15b84 llvm/lib/Transforms/Vectorize LoopVectorize.cpp

[LV] Strip an outdated TODO about runPass (NFC) (#195228)
Delta File
+0 -2 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+0 -2 1 files

LLVM/project 943c007 bolt/docs BinaryAnalysis.md, bolt/include/bolt/Utils CommandLineOpts.h

[BOLT] Gadget scanner: add less strict version of tail call checker

During a tail call, it may be worth making sure the link register is as
trusted as during a regular call, though it may require the compiler to
insert expensive checking code.

On the other hand, with pac-ret hardening enabled, there should be no
reason not to protect tail-calling functions at least as well as those
exited via a regular return instruction.

This commit splits the tail call checker into two versions: the basic
one, which is suitable for making sure regular `PAC*` + `AUT*` are
emitted as needed, and the strict one, which additionally ensures that
the authentication (if any) succeeded.
Delta File
+90 -87 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s
+31 -9 bolt/docs/BinaryAnalysis.md
+22 -16 bolt/test/binary-analysis/AArch64/gs-pauth-scanners.s
+27 -6 bolt/lib/Passes/PAuthGadgetScanner.cpp
+15 -8 bolt/include/bolt/Utils/CommandLineOpts.h
+9 -7 bolt/test/binary-analysis/AArch64/cmdline-args.test
+194 -133 2 files not shown
+216 -142 8 files

LLVM/project 3f8d899 bolt/test/binary-analysis/AArch64 gs-pauth-tail-calls.s

gs-pauth-tail-calls.s: make check lines exhaustive; rename FPAC/NOFPAC
Delta File
+44 -40 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s
+44 -40 1 files

LLVM/project 4b19ac1 bolt/docs BinaryAnalysis.md

Address the review comments. Misc cleanups
Delta File
+43 -28 bolt/docs/BinaryAnalysis.md
+43 -28 1 files