LLVM/project a14ac01llvm/include/llvm/Support SourceMgr.h, llvm/lib/MC/MCParser MasmParser.cpp AsmParser.cpp

clean up approach, extend tests

Created using spr 1.3.8-beta.1
DeltaFile
+34-11llvm/test/MC/AsmParser/macro-unknown-directive.s
+8-23llvm/lib/MC/MCParser/MasmParser.cpp
+13-17llvm/include/llvm/Support/SourceMgr.h
+7-19llvm/lib/MC/MCParser/AsmParser.cpp
+9-17llvm/lib/Support/SourceMgr.cpp
+5-5llvm/test/MC/AsmParser/macros-darwin.s
+76-924 files not shown
+90-10310 files

LLVM/project c679d99llvm/lib/Transforms/IPO InstrumentorStubPrinter.cpp Instrumentor.cpp, llvm/test/Instrumentation/Instrumentor default_rt.h rt_config.json

[Instrumentor] Improve stub printer (for C/C++ and value packs)

The stub printer now emits a helper header to deal with value packs (in
C and C++). We also make the files C/C++ compatible and use the proper
format strings for int32_t and int64_t.
DeltaFile
+410-5llvm/lib/Transforms/IPO/InstrumentorStubPrinter.cpp
+264-0llvm/test/Instrumentation/Instrumentor/default_rt.h
+190-2llvm/test/Instrumentation/Instrumentor/rt_config.json
+124-0llvm/test/Instrumentation/Instrumentor/default_rt.c
+0-37llvm/test/Instrumentation/Instrumentor/default_rt
+7-6llvm/lib/Transforms/IPO/Instrumentor.cpp
+995-504 files not shown
+1,004-5110 files

LLVM/project f089949llvm/utils instrumentor-config-wizard.py

[Instrumentor] Improve the config wizard script

This makes the config wizard script more generic as we grow
instrumentation opportunities. Better output, e.g., clear paths, is
also displayed now.

Prepared with Claude (AI) and tested by me afterwards.
DeltaFile
+279-153llvm/utils/instrumentor-config-wizard.py
+279-1531 files

LLVM/project c32de3e.github/workflows libc-shared-tests.yml

[libc] Switch libc-shared-tests precommit CI to use docker image. (#197962)
DeltaFile
+15-9.github/workflows/libc-shared-tests.yml
+15-91 files

LLVM/project 38489af.github/workflows libcxx-run-benchmarks.yml

workflows/libcxx-run-benchmarks: Only run job for people with commit access (#199087)

This job checks out untrusted code from a PR in a trusted context
(issue_comment trigger), so we need to limit it to people with commit
access to avoid possible privilege escalation.
DeltaFile
+23-4.github/workflows/libcxx-run-benchmarks.yml
+23-41 files

LLVM/project ff83218clang/cmake/caches HLSL.cmake

[HLSL] Fix improper parsing of IN_LIST within if condition (#199276)

CMake does not properly parse IN_LIST within the if condition and
treats it as a plain token, which is not the desired behavior.
The CMP0057 policy supports the new
[if() IN_LIST](https://cmake.org/cmake/help/latest/command/if.html#command:if)
operator.
Enable this policy to resolve the build error.


Fixes https://github.com/llvm/llvm-project/issues/199282
Assisted by: GitHub Copilot
DeltaFile
+4-0clang/cmake/caches/HLSL.cmake
+4-01 files

LLVM/project 56bf985clang/include/clang/CIR/Dialect/IR CIRTypes.td, clang/lib/CIR/Dialect/IR CIRTypes.cpp

[CIR] Include union tail pad in getTypeSizeInBits (#198361)

Padded CIR unions (e.g. libstdc++ `std::string` SSO layout) carry a
trailing byte-array member so the record matches the AST layout size.
`RecordType::getTypeSizeInBits` was returning only the largest-aligned
member and ignored that tail, so the CIR view of the union was 8 bytes
smaller than what `LowerToLLVM` emits.  Parent structs then picked up
a spurious trailing pad via `insertPadding`, arrays of those structs
used the wrong stride, and heap allocations could be overrun (Eigen's
`array_of_string` hits this directly).

The fix adds the padding member's size when the union is marked
`padded`, so struct size, GEP strides, and `new T[n]` allocation sizes
match OGCG.  Regression test models the SSO-shaped record and checks
the 96-byte `new` for three elements.
DeltaFile
+32-0clang/test/CIR/CodeGen/record-with-padded-union.cpp
+26-1clang/lib/CIR/Dialect/IR/CIRTypes.cpp
+5-0clang/include/clang/CIR/Dialect/IR/CIRTypes.td
+63-13 files
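
The size rule described in this commit can be sketched as a small standalone model. This is not the CIR implementation: `Member`, `unionSizeInBits`, and the convention that the padding member is stored last are hypothetical, and the sizes in the usage note assume a 64-bit target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a record member: size and alignment in bits.
struct Member {
  uint64_t sizeBits;
  uint64_t alignBits;
};

// Sketch: a union's size is the size of its largest-aligned member plus,
// when `padded` is set, the size of a trailing padding member.
// Assumption: in a padded union the padding member is the last entry.
uint64_t unionSizeInBits(const std::vector<Member> &members, bool padded) {
  if (members.empty())
    return 0;
  // Exclude the trailing padding member from the "largest-aligned" search.
  std::size_t count = members.size() - (padded ? 1 : 0);
  const Member *largest = nullptr;
  for (std::size_t i = 0; i < count; ++i) {
    const Member &m = members[i];
    if (!largest || m.alignBits > largest->alignBits ||
        (m.alignBits == largest->alignBits && m.sizeBits > largest->sizeBits))
      largest = &m;
  }
  uint64_t size = largest ? largest->sizeBits : 0;
  if (padded)
    size += members.back().sizeBits; // include the tail pad
  return size;
}
```

For an SSO-shaped union with a 16-byte character buffer, an 8-byte capacity field, and an 8-byte tail pad, the model yields 128 bits (16 bytes) once the pad is counted and 64 bits when it is ignored, which is the 8-byte discrepancy the commit message describes. With an 8-byte pointer and an 8-byte length ahead of the union, the record is 32 bytes, so three elements need 96 bytes.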

LLVM/project 65e49a6clang/lib/Sema OpenCLBuiltins.td, clang/test/SemaOpenCL intel-subgroup-local-block-io-builtins.cl intel-subgroup-buffer-prefetch-builtins.cl

[OpenCL] Add Intel subgroup buffer prefetch and local block I/O builtins (#199258)

Add cl_intel_subgroup_buffer_prefetch and cl_intel_subgroup_local_block_io
declarations to OpenCLBuiltins.td and cover them with header-free SPIR
tests.

This keeps the generated OpenCL builtins in sync with opencl-c.h for the
Intel subgroup buffer prefetch and local block I/O extensions.

Per the cl_intel_subgroup_local_block_io specification, the _ui local
aliases (intel_sub_group_block_read_ui*, intel_sub_group_block_write_ui*
with __local pointer) are declared under FuncExtIntelSubgroupLocalBlockIO
alone, without a char/short/long prerequisite.  A dedicated test
(intel-subgroup-local-block-io-ui-without-char-short-long.cl) verifies
that they resolve when only cl_intel_subgroup_local_block_io is active.


    [6 lines not shown]
DeltaFile
+165-0clang/test/SemaOpenCL/intel-subgroup-local-block-io-builtins.cl
+101-0clang/lib/Sema/OpenCLBuiltins.td
+47-0clang/test/SemaOpenCL/intel-subgroup-buffer-prefetch-builtins.cl
+40-0clang/test/SemaOpenCL/intel-subgroup-local-block-io-ui-without-char-short-long.cl
+353-04 files

LLVM/project 83c752fclang/lib/Headers opencl-c.h, clang/lib/Sema OpenCLBuiltins.td

[OpenCL] Fix image2d_t qualifier for intel_sub_group_block_write_ui (#199232)

The intel_sub_group_block_write_ui[2,4,8] overloads for image2d_t were
declared with a read_only qualifier, both in opencl-c.h and in
OpenCLBuiltins.td. A write operation cannot target a read_only image,
and the base intel_sub_group_block_write together with the analogous
_us, _uc and _ul aliases all correctly use write_only image2d_t.

Per the cl_intel_subgroups_short [1], cl_intel_subgroups_char [2] and
cl_intel_subgroups_long [3] specifications, the _ui aliases are added
"for naming consistency [...] There is no change to the description or
behavior of these functions" relative to the cl_intel_subgroups base,
which uses write_only image2d_t for writes.

The typo was introduced in b833bf6ae14f and preserved across all

    [18 lines not shown]
DeltaFile
+16-16clang/lib/Headers/opencl-c.h
+4-4clang/lib/Sema/OpenCLBuiltins.td
+1-1clang/test/SemaOpenCL/intel-subgroups-builtins.cl
+21-213 files

LLVM/project cc92693offload/unittests/OffloadAPI/kernel olLaunchKernel.cpp

[offload] Use device memory for the multithreaded kernel launch test (#199132)

This commit modifies the multithreaded kernel launch test to use device
memory instead of managed memory. The test is reported to fail
intermittently on systems where concurrent managed memory access is
not supported. This is the case for NVIDIA devices that do not support
CU_DEVICE_ATTRIBUTE_CONCURRENT_MANAGED_ACCESS.

The concept of concurrent and coherent managed memory access should
be exposed to liboffload users somehow, e.g., by adding it as a device
property, so it is clear what execution patterns are allowed with managed
memory. However, this test only exercises concurrent kernel launches. This
commit fixes it until we decide how to proceed with the guarantees for that
type of allocation.
DeltaFile
+12-7offload/unittests/OffloadAPI/kernel/olLaunchKernel.cpp
+12-71 files

LLVM/project d755b04llvm/lib/Analysis ScalarEvolution.cpp, llvm/test/Analysis/LoopAccessAnalysis depend_diff_types.ll

[SCEV] Fold zext(C+A)<nsw> -> (sext(C) + zext(A))<nsw> if possible. (#142599)

Simplify zext(C+A)<nsw> -> (sext(C) + zext(A))<nsw> if
 * zext (C + A)<nsw> >=s 0 and
 * A >=s V.

For now this is limited to cases where the first operand is a constant,
so the SExt can be folded to a new constant. This can be relaxed in the
future.

The initial version checks for non-negative manually to limit compile-time,
supporting only A = smax(C2, ..) where C2 >= abs(C).

Alive2 proof of the general pattern and the test changes in zext-nuw.ll
(times out in the online instance but verifies locally)

https://alive2.llvm.org/ce/z/_BtyGy

PR: github.com/llvm/llvm-project/pull/142599
DeltaFile
+38-8llvm/test/Transforms/LoopUnroll/peel-last-iteration-with-guards.ll
+10-10llvm/test/Transforms/LoopVectorize/reduction.ll
+14-0llvm/lib/Analysis/ScalarEvolution.cpp
+1-7llvm/test/Analysis/LoopAccessAnalysis/depend_diff_types.ll
+2-2llvm/test/Transforms/LoopVectorize/AArch64/predicated-costs.ll
+2-2llvm/test/Analysis/ScalarEvolution/zext-add-nsw-fold.ll
+67-291 files not shown
+68-307 files
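
The fold described in this commit can be sanity-checked by brute force on a narrow type. The sketch below is not SCEV code: `countFoldMismatches` is a hypothetical helper, the widths (8 bits widened to 16) are chosen only to make exhaustive enumeration cheap, and the precondition `A >=s 0` is an assumption made for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Counts (C, A) pairs of 8-bit values for which
//   zext(C + A) != sext(C) + zext(A)   (computed in 16 bits),
// restricted to pairs where C + A does not wrap in the signed sense (<nsw>)
// and C + A >=s 0. When requireNonNegA is true, A >=s 0 is also required.
int countFoldMismatches(bool requireNonNegA) {
  int mismatches = 0;
  for (int c = -128; c <= 127; ++c) {
    for (int a = -128; a <= 127; ++a) {
      int sum = c + a;
      if (sum < -128 || sum > 127)
        continue; // signed wrap: <nsw> would not hold
      if (sum < 0)
        continue; // require C + A >=s 0
      if (requireNonNegA && a < 0)
        continue; // assumed precondition on A
      // zext i8 -> i16 of the 8-bit sum.
      uint16_t lhs = static_cast<uint16_t>(static_cast<uint8_t>(sum));
      // sext i8 -> i16 of C, plus zext i8 -> i16 of A, modulo 2^16.
      uint16_t sextC = static_cast<uint16_t>(static_cast<int16_t>(c));
      uint16_t zextA = static_cast<uint16_t>(static_cast<uint8_t>(a));
      uint16_t rhs = static_cast<uint16_t>(sextC + zextA);
      if (lhs != rhs)
        ++mismatches;
    }
  }
  return mismatches;
}
```

Calling `countFoldMismatches(true)` finds no counterexample, while `countFoldMismatches(false)` does: for instance C = 1 and A = -1 give zext(0) = 0 on the left but 1 + 255 = 256 on the right. The restricted form A = smax(C2, ..) with C2 >= abs(C) implies both A >=s 0 and C + A >=s 0, so it falls inside the checked region.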

LLVM/project 90d4ed5clang-tools-extra/clang-doc YAMLGenerator.cpp

[clang-doc][nfc] Use static declarations to enforce internal linkage (#198072)
DeltaFile
+4-4clang-tools-extra/clang-doc/YAMLGenerator.cpp
+4-41 files

LLVM/project c61c880clang-tools-extra/clang-doc Serialize.cpp

[clang-doc][nfc] Silence tidy warning about anonymous namespace (#198071)

clang-tidy complains that we should prefer static over the anonymous
namespace, despite the API being static in addition to being in the
anonymous namespace. We can silence the diagnostic by simply removing
the namespace declaration.
DeltaFile
+0-2clang-tools-extra/clang-doc/Serialize.cpp
+0-21 files
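
The linkage point behind the two clang-doc commits above can be illustrated with a minimal sketch. The function names are hypothetical and unrelated to the clang-doc sources; the block only shows that `static`, an anonymous namespace, and the redundant combination of both all give a function internal linkage.

```cpp
#include <cassert>

// Internal linkage via `static`: visible only in this translation unit.
static int doubleIt(int x) { return 2 * x; }

namespace {
// Internal linkage via an anonymous namespace.
int tripleIt(int x) { return 3 * x; }
} // namespace

namespace {
// Redundant: already internal because of the namespace, `static` adds nothing.
// This is the shape the second commit message describes.
static int addOne(int x) { return x + 1; }
} // namespace

// Externally visible entry point that uses all three helpers.
int combine(int x) { return doubleIt(x) + tripleIt(x) + addOne(x); }
```

Dropping the namespace around `addOne` and keeping `static` leaves its linkage unchanged, which is why removing the namespace declaration is a non-functional change.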

LLVM/project f839261mlir/include/mlir/Dialect/NVGPU/IR CMakeLists.txt

[MLIR] Fix mlir-doc build, add missing "-dialect nvgpu" (#199279)

Was broken with

> when more than 1 dialect is present, one must be selected via '-dialect'
DeltaFile
+1-1mlir/include/mlir/Dialect/NVGPU/IR/CMakeLists.txt
+1-11 files

LLVM/project f10e1a8clang/include/clang/CIR MissingFeatures.h, clang/lib/CIR/Dialect/Transforms LoweringPrepare.cpp

[CIR][CUDA] Emit global var registration
DeltaFile
+92-5clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp
+43-6clang/test/CIR/CodeGenCUDA/device-stub.cu
+0-1clang/include/clang/CIR/MissingFeatures.h
+135-123 files

LLVM/project 39d3090clang/test/Headers __clang_hip_math.hip, llvm/lib/Transforms/Utils SimplifyCFG.cpp

Reapply [SimplifyCFG] Extend jump-threading to allow live local defs (#197850)

Restore "Extend jump-threading to allow live local defs" #135079. Long
compilation time with reduce.cu in hipcub/warp was partially addressed
in #195744. Compilation time for reduce.cu with this PR (after #195744)
is 6 minutes 40 seconds. Without #195744, compilation time was several
hours.

Long compilation time in reduce.cu was only exposed by jump-threading.
In my view the primary causes were due to inlining, SROA tripling the IR
code size, and SSA updating 26K phi-nodes resulting in an O(N^2) search
for duplicates. #195744 limits phi search times.

This reverts commit a76750e6de6aba2223097dc505578556ec245d50.

---------

Signed-off-by: John Lu <John.Lu at amd.com>
DeltaFile
+647-736clang/test/Headers/__clang_hip_math.hip
+195-0llvm/test/Transforms/SimplifyCFG/jump-threading-live-on-exit.ll
+167-0llvm/test/CodeGen/AMDGPU/sroa-phi-nodes.ll
+95-0llvm/test/Transforms/SimplifyCFG/jump-threading-max-jump-threading-live-blocks.ll
+61-10llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+16-20llvm/test/CodeGen/AArch64/avoid-free-ext-promotion.ll
+1,181-7661 files not shown
+1,190-7767 files

LLVM/project e47f8deflang/lib/Optimizer/Analysis AliasAnalysis.cpp, flang/test/Analysis/AliasAnalysis alias-analysis-acc.mlir

[flang] Fixed FIR AA's getSource() for box loads inside acc.compute_region. (#199157)

This patch fixes a regression caused by #198635: when we call getSource()
for a `fir.load` of a box, we have to handle an input value that might be
a `BlockArgument` and pass through it.
DeltaFile
+32-4flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
+26-0flang/test/Analysis/AliasAnalysis/alias-analysis-acc.mlir
+58-42 files

LLVM/project d7dc3fcclang/include/clang/CIR MissingFeatures.h, clang/lib/CIR/Dialect/Transforms LoweringPrepare.cpp

[CIR][CUDA] Emit global var registration
DeltaFile
+93-6clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp
+43-6clang/test/CIR/CodeGenCUDA/device-stub.cu
+0-1clang/include/clang/CIR/MissingFeatures.h
+136-133 files

LLVM/project 913ba80llvm/docs LangRef.rst

add warning

Created using spr 1.3.6-beta.1
DeltaFile
+9-0llvm/docs/LangRef.rst
+9-01 files

LLVM/project 6a84676llvm/test/Transforms/LoopVectorize/WebAssembly memory-interleave.ll

[NFC] Remove fractional part of Estimated cost per lane in memory-interleave.ll (#198666)

In the memory-interleave.ll test, some of the CHECK lines are failing on
z/OS due to a difference in rounding behaviour when printing the
Estimated cost per lane. Resolve this by removing the fractional part,
similar to what was done in the past with
https://github.com/llvm/llvm-project/commit/e8556ff6b664df6e595f8aed175eff3a27a4a020
and
https://github.com/llvm/llvm-project/commit/aeb88f6778756ea889918308241a2b34bd7f64e2.
DeltaFile
+117-117llvm/test/Transforms/LoopVectorize/WebAssembly/memory-interleave.ll
+117-1171 files

LLVM/project ef4e882llvm/tools/dsymutil dsymutil.cpp DebugMap.cpp, llvm/tools/llvm-gsymutil llvm-gsymutil.cpp

Open yaml, etc as text files (#199253)

These tests were failing on z/OS because the text input files were being
opened as binary.

```
FAIL: LLVM :: tools/dsymutil/AArch64/typedef-different-types.test
FAIL: LLVM :: tools/dsymutil/X86/mismatch.m
FAIL: LLVM :: tools/dsymutil/embed-resource.test
FAIL: LLVM :: tools/llvm-gsymutil/X86/elf-symtab-file.yaml
```
Open the files as text to solve the problems.
DeltaFile
+3-2llvm/tools/llvm-gsymutil/llvm-gsymutil.cpp
+1-1llvm/tools/dsymutil/dsymutil.cpp
+1-1llvm/tools/dsymutil/DebugMap.cpp
+5-43 files

LLVM/project 7b0b6a0lldb/tools/lldb-dap/extension package-lock.json, llvm/lib/Support UnicodeNameToCodepointGenerated.cpp

Merge branch 'filecheck-test-braced-search-ranges' into filecheck-braced-search-ranges
DeltaFile
+23,873-20,923llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp
+12,365-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.load.b128.ll
+1,243-8,768llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+1,381-2,562llvm/test/CodeGen/X86/avx512-calling-conv.ll
+3,903-0llvm/test/CodeGen/NVPTX/machine-cse-predicate-inversion.ll
+2,504-1,285lldb/tools/lldb-dap/extension/package-lock.json
+45,269-33,5383,111 files not shown
+155,671-96,9413,117 files

LLVM/project 980d9b4lldb/tools/lldb-dap/extension package-lock.json, llvm/lib/Support UnicodeNameToCodepointGenerated.cpp

Merge branch 'filecheck-refactor-dump-input-tests' into filecheck-test-braced-search-ranges
DeltaFile
+23,873-20,923llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp
+12,365-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.load.b128.ll
+1,243-8,768llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+1,381-2,562llvm/test/CodeGen/X86/avx512-calling-conv.ll
+3,903-0llvm/test/CodeGen/NVPTX/machine-cse-predicate-inversion.ll
+2,504-1,285lldb/tools/lldb-dap/extension/package-lock.json
+45,269-33,5383,111 files not shown
+155,671-96,9413,117 files

LLVM/project 3d34338clang/include/clang/Basic AttrDocs.td, clang/include/clang/CIR/Dialect/IR CIRCUDAAttrs.td

Merge branch 'main' into filecheck-refactor-dump-input-tests
DeltaFile
+163-0clang/lib/CIR/CodeGen/CIRGenCUDANV.cpp
+96-16clang/test/CIR/CodeGenCUDA/address-spaces.cu
+73-0clang/lib/CIR/Dialect/IR/CIRAttrs.cpp
+55-2clang/include/clang/Basic/AttrDocs.td
+27-5clang/lib/CIR/CodeGen/CIRGenModule.cpp
+29-1clang/include/clang/CIR/Dialect/IR/CIRCUDAAttrs.td
+443-247 files not shown
+486-2513 files

LLVM/project 939d325llvm/test/FileCheck/dump-input annotations.txt, llvm/utils/FileCheck FileCheck.cpp

[FileCheck] Refactor -dump-input test (#198137)

This PR is stacked on PR #198136.

This patch refactors `llvm/test/FileCheck/dump-input/annotations.txt` to
improve maintainability and coverage and to prepare for the upcoming
implementation of search range annotations.

Lit substitutions
=================

The test repeats the same basic set of RUN lines *many* times. This
patch encapsulates those in lit substitutions to improve
maintainability. By doing so, it also helps to ensure more consistent
coverage of all cases and thus slightly expands coverage.

-strict-whitespace
==================


    [25 lines not shown]
DeltaFile
+591-509llvm/test/FileCheck/dump-input/annotations.txt
+11-0llvm/utils/FileCheck/FileCheck.cpp
+602-5092 files

LLVM/project 1a264a9clang/docs ReleaseNotes.rst LifetimeSafety.rst, clang/include/clang/Basic AttrDocs.td

[docs] update noescape semantics to disallow free (#195973)

This changes the documented semantics of the `noescape` attribute to
disallow freeing the pointer, and allow escapes of the integer value of
the memory address, as discussed in

https://discourse.llvm.org/t/rfc-updating-the-semantics-of-the-noescape-attribute/90326.

It also clarifies that the attribute may only be used to annotate the
outermost pointer level of nested pointer parameters.
DeltaFile
+55-2clang/include/clang/Basic/AttrDocs.td
+7-0clang/docs/ReleaseNotes.rst
+2-0clang/docs/LifetimeSafety.rst
+64-23 files
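
A minimal sketch of what the updated wording permits, based only on the summary in this commit message. The `NOESCAPE` macro and both function names are hypothetical; the attribute spelling `__attribute__((noescape))` is Clang's, and the macro expands to nothing elsewhere as an assumption for portability.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__clang__)
#define NOESCAPE __attribute__((noescape))
#else
#define NOESCAPE // assumption: compile without the annotation on other compilers
#endif

// Reads through `values` but neither stores the pointer nor frees it.
int sumValues(NOESCAPE const int *values, int count) {
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += values[i];
  return total;
}

// Lets only the integer value of the address leave the function, which the
// commit message says the documented semantics now allow.
std::uintptr_t addressValue(NOESCAPE const int *p) {
  return reinterpret_cast<std::uintptr_t>(p);
}
```

Under the semantics summarized above, a callee that called `free` or `delete` on an annotated parameter would no longer conform, and the annotation applies to the outermost pointer level only.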

LLVM/project 1e14fd0flang-rt/lib/runtime transformational.cpp

[flang-rt] Added missing RT_API_ATTRS for CheckBoundaryType(). (#199244)
DeltaFile
+1-1flang-rt/lib/runtime/transformational.cpp
+1-11 files

LLVM/project 7831822lldb/tools/lldb-dap/extension package-lock.json, llvm/lib/Support UnicodeNameToCodepointGenerated.cpp

Merge branch 'filecheck-resurrect-overflow-tests' into filecheck-refactor-dump-input-tests
DeltaFile
+23,873-20,923llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp
+12,365-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.av.load.b128.ll
+1,243-8,768llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+1,381-2,562llvm/test/CodeGen/X86/avx512-calling-conv.ll
+3,903-0llvm/test/CodeGen/NVPTX/machine-cse-predicate-inversion.ll
+2,504-1,285lldb/tools/lldb-dap/extension/package-lock.json
+45,269-33,5383,102 files not shown
+155,184-96,9143,108 files

LLVM/project c1148e7llvm/test/FileCheck/dump-input annotations.txt

Use -COUNT-2 to make it clearer duplicate is intended
DeltaFile
+1-2llvm/test/FileCheck/dump-input/annotations.txt
+1-21 files

LLVM/project 1627671clang/lib/AST ASTContext.cpp

Convert the key before cache lookup to prevent encoding differences
DeltaFile
+9-9clang/lib/AST/ASTContext.cpp
+9-91 files