LLVM/project ce1b12e lldb/source/Core SearchFilter.cpp, lldb/test/Shell/Breakpoint source-regex-missing-source.test

[lldb] Iterate over a copy of the ModuleList in SearchFilter (#189009)

Avoid a potential deadlock caused by the search filter callback
acquiring the target's module lock by iterating over a copy of the list.
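
The pattern described above can be illustrated with a toy model. This is only a sketch, not LLDB's code: the `ModuleList` class, its non-reentrant lock, and the `search` and `callback` helpers are hypothetical stand-ins chosen to show why iterating over a copy lets the callback take the lock safely.

```python
import threading


class ModuleList:
    """Toy stand-in for a module list guarded by a non-reentrant lock."""

    def __init__(self, modules):
        self._lock = threading.Lock()
        self._modules = list(modules)

    def snapshot(self):
        # Hold the lock only long enough to copy the list.
        with self._lock:
            return list(self._modules)

    def size(self):
        # Every accessor takes the lock, as a callback querying the list would.
        with self._lock:
            return len(self._modules)


def search(module_list, callback):
    # Iterate over the copy: the lock is not held while callbacks run,
    # so a callback that calls back into module_list cannot block on it.
    return [callback(module_list, m) for m in module_list.snapshot()]


def callback(module_list, module):
    return (module, module_list.size())


modules = ModuleList(["a.out", "libc.so"])
result = search(modules, callback)
```

Had `search` held the lock for the whole loop, the call to `size()` inside the callback would never return in this toy model.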

Fixes #188766
Delta  File
+14 -13  lldb/source/Core/SearchFilter.cpp
+11 -0  lldb/test/Shell/Breakpoint/source-regex-missing-source.test
+1 -0  lldb/test/Shell/Breakpoint/Inputs/main.c
+26 -13  3 files

LLVM/project eb2ff71 llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Mark variable only used in assert as maybe_unused (#189100)

Fix 00aebbff71ff4e348538708064ba2e033ccd6b2a.
Delta  File
+1 -1  llvm/lib/Analysis/DependenceAnalysis.cpp
+1 -1  1 files

LLVM/project a9f5f93 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp AMDGPUCoExecSchedStrategy.h, llvm/test/CodeGen/AMDGPU coexec-scheduler.ll coexec-sched-effective-stall.mir

[AMDGPU] Add HWUI pressure heuristics to coexec strategy (#184929)

Adds basic support for new heuristics for the CoExecSchedStrategy.

InstructionFlavor provides a way to map instructions to different
"Flavors". These "Flavors" all have special scheduling considerations --
either they map to different HardwareUnits, or have unique scheduling
properties like fences.

HardwareUnitInfo provides a way to track and analyze the usage of some
hardware resource across the current scheduling region.

CandidateHeuristics holds the state for new heuristics, as well as the
implementations.

In addition, this adds new heuristics to use the various support pieces
listed above. tryCriticalResource attempts to schedule instructions that
use the most demanded HardwareUnit. If no such instructions are ready to
be scheduled, tryCriticalResourceDependency attempts to schedule

    [4 lines not shown]
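
The idea behind tryCriticalResource, as far as the visible part of the message describes it, might be sketched as follows. This is a simplified model and not the patch's implementation: the function names, the unit names ("MFMA", "LDS", "VALU"), and the choice to measure demand by counting instructions per unit are all assumptions made for illustration. The elided part of the message is not modeled.

```python
from collections import Counter


def most_demanded_unit(region):
    # region: list of (instruction name, hardware unit) pairs in the
    # current scheduling region. Demand is modeled as a simple count.
    demand = Counter(unit for _, unit in region)
    return demand.most_common(1)[0][0]


def try_critical_resource(region, ready):
    # Prefer a ready instruction that uses the most demanded unit.
    critical = most_demanded_unit(region)
    for name, unit in region:
        if unit == critical and name in ready:
            return name
    return None  # no ready instruction uses the critical unit


region = [
    ("v_mfma_0", "MFMA"),
    ("ds_read_0", "LDS"),
    ("v_mfma_1", "MFMA"),
    ("v_add_0", "VALU"),
]
```

With `ready = {"ds_read_0", "v_mfma_1"}` the sketch picks `v_mfma_1`; when nothing on the critical unit is ready it returns `None`, which is the point where a fallback heuristic would take over.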
Delta  File
+606 -0  llvm/test/CodeGen/AMDGPU/coexec-scheduler.ll
+412 -23  llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+285 -2  llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.h
+5 -5  llvm/test/CodeGen/AMDGPU/coexec-sched-effective-stall.mir
+1,308 -30  4 files

LLVM/project 560b8c9 .ci premerge_advisor_explain.py

[CI] Make AArch64 Premerge Job Fail on Errors (#188801)

Right now we report the errors, but the job does not actually fail. This
patch fixes that.
Delta  File
+1 -1  .ci/premerge_advisor_explain.py
+1 -1  1 files

LLVM/project 1788345 llvm/test/CodeGen/AMDGPU memmove-param-combinations.ll, llvm/test/MC/AMDGPU gfx10_unsupported.s gfx7_unsupported.s

Merge remote-tracking branch 'upstream/main' into rewrite-hlsl-intrinsics-to-tablegen
Delta  File
+2,210 -1,106  llvm/test/MC/AMDGPU/gfx10_unsupported.s
+863 -863  llvm/test/MC/AMDGPU/gfx7_unsupported.s
+601 -1,016  llvm/test/CodeGen/AMDGPU/memmove-param-combinations.ll
+1,185 -397  llvm/test/MC/AMDGPU/gfx950_asm_features.s
+691 -691  llvm/test/MC/AMDGPU/gfx11_unsupported.s
+613 -613  llvm/test/MC/AMDGPU/gfx8_unsupported.s
+6,163 -4,686  2,156 files not shown
+62,177 -31,890  2,162 files

LLVM/project 502b5e0 llvm/lib/Transforms/Instrumentation MemProfUse.cpp, llvm/test/Transforms/PGOProfile memprof-inline-call-stacks.ll

[MemProf] Dump inline call stacks as optimization remarks (#188678)

This patch teaches the MemProf matching pass to dump inline call
stacks as analysis remarks like so:

frame: 704e4117e6a62739 main:10:5
frame: 273929e54b9f1234 foo:2:12
inline call stack: 704e4117e6a62739,273929e54b9f1234

The output consists of two types of remarks:

- "frame": Acts as a dictionary mapping a unique MD5-based FrameID
  to source information (function name, line offset, and column).

- "inline call stack": Provides the full call stack for a call site
  as a sequence of FrameIDs.

Both types of remarks are deduplicated to reduce the output size.

This patch is intended to be a debugging aid.
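
The two remark kinds and their deduplication can be modeled in a few lines. This is a sketch only: the `frame_id` hashing scheme below (the first 8 bytes of an MD5 digest over a `function:line:column` string) is a hypothetical stand-in and will not reproduce the real FrameIDs, and `emit_remarks` is not a function in the pass.

```python
import hashlib


def frame_id(function, line, column):
    # Hypothetical ID scheme: 16 hex digits taken from an MD5 digest.
    digest = hashlib.md5(f"{function}:{line}:{column}".encode()).digest()
    return digest[:8].hex()


def emit_remarks(call_stacks):
    seen_frames, seen_stacks, lines = set(), set(), []
    for stack in call_stacks:
        ids = []
        for function, line, column in stack:
            fid = frame_id(function, line, column)
            ids.append(fid)
            if fid not in seen_frames:  # emit each frame once
                seen_frames.add(fid)
                lines.append(f"frame: {fid} {function}:{line}:{column}")
        key = ",".join(ids)
        if key not in seen_stacks:  # emit each call stack once
            seen_stacks.add(key)
            lines.append(f"inline call stack: {key}")
    return lines


stacks = [
    [("main", 10, 5), ("foo", 2, 12)],
    [("main", 10, 5), ("foo", 2, 12)],  # exact duplicate, emits nothing
    [("main", 10, 5), ("bar", 3, 1)],   # shares the "main" frame
]
remarks = emit_remarks(stacks)
```

A reader of the output resolves each ID in an "inline call stack" line through the "frame" dictionary lines, which is why frames only need to be printed once.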
Delta  File
+65 -2  llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
+38 -0  llvm/test/Transforms/PGOProfile/memprof-inline-call-stacks.ll
+103 -2  2 files

LLVM/project 4537293 llvm/lib/Target/AMDGPU AMDGPUCodeGenPrepare.cpp, llvm/test/CodeGen/AMDGPU fract-match.ll

AMDGPU: Match fract from compare and select and minimum

Implementing this with any of the minnum variants is overconstraining
for the actual use. Existing patterns use fmin, then have to manually
clamp nan inputs to get nan propagating behavior. It's cleaner to express
this with a nan propagating operation to start with.
Delta  File
+197 -264  llvm/test/CodeGen/AMDGPU/fract-match.ll
+124 -85  llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
+321 -349  2 files

LLVM/project 0cfea9c llvm/lib/Target/AMDGPU AMDGPUCodeGenPrepare.cpp, llvm/test/CodeGen/AMDGPU fract-match.ll

AMDGPU: Match fract pattern with swapped edge case check

A fract implementation can equivalently be written as
  r = fmin(x - floor(x));
  r = isnan(x) ? x : r;
  r = isinf(x) ? 0.0 : r;

or:
  r = fmin(x - floor(x));
  r = isinf(x) ? 0.0 : r;
  r = isnan(x) ? x : r;

Previously this only matched the first form. Match
the case where the isinf check is the inner clamp. There are
a few more ways to write this pattern (e.g., move the clamp of
infinity to the input) but I haven't encountered that in the wild.

The existing code seems to be trying too hard to match noncanonical
variants of the pattern. Only handle the result that all 4 permutations
of compare and select produce out of instcombine.
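
The equivalence of the two orderings can be checked with a small model. This is a sketch under stated assumptions: the message elides the second `fmin` operand, so the upper bound used here (the largest double below 1.0) is an assumption, and `c_floor` and `c_fmin` are hypothetical helpers that emulate C's `floor` and `fmin` on non-finite inputs, which Python's `math.floor` and `min` do not.

```python
import math

ONE_MINUS_ULP = 1.0 - 2.0**-53  # largest double below 1.0 (assumed bound)


def c_floor(x):
    # C floor() returns NaN and infinities unchanged.
    return x if not math.isfinite(x) else float(math.floor(x))


def c_fmin(a, b):
    # C fmin(): if exactly one operand is NaN, return the other one.
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


def fract_nan_then_inf(x):
    r = c_fmin(x - c_floor(x), ONE_MINUS_ULP)
    r = x if math.isnan(x) else r
    r = 0.0 if math.isinf(x) else r
    return r


def fract_inf_then_nan(x):
    r = c_fmin(x - c_floor(x), ONE_MINUS_ULP)
    r = 0.0 if math.isinf(x) else r
    r = x if math.isnan(x) else r
    return r
```

Because an input cannot be both NaN and infinite, the two selects never compete for the same value, so the order in which they are applied does not change the result.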
Delta  File
+328 -349  llvm/test/CodeGen/AMDGPU/fract-match.ll
+47 -17  llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
+375 -366  2 files

LLVM/project 28f24b5 llvm/test/CodeGen/AMDGPU fract-match.ll

AMDGPU: Add baseline tests for more fract patterns (#189092)
Delta  File
+2,235 -0  llvm/test/CodeGen/AMDGPU/fract-match.ll
+2,235 -0  1 files

LLVM/project 871d675 compiler-rt/lib/profile CMakeLists.txt

[compiler-rt] Add PTX feature specifically when CUDA is not available (#189083)

Summary:
People need to be able to build this without a CUDA installation.

Long term we should bump up the minimum version as I'm pretty sure every
architecture before this has been deprecated by NVIDIA.
Delta  File
+2 -0  compiler-rt/lib/profile/CMakeLists.txt
+2 -0  1 files

LLVM/project df6d6c9 compiler-rt/lib/scudo/standalone/tests combined_test.cpp

[Scudo] Disable ScudoCombinedTests.NewType (#189070)

This is failing in some configurations on AArch64 Linux. Given there are
a lot of follow-up commits that make this hard to revert, just disable
it for now pending future investigation.
Delta  File
+1 -1  compiler-rt/lib/scudo/standalone/tests/combined_test.cpp
+1 -1  1 files

LLVM/project ba44df4 clang/tools/clang-format git-clang-format

[clang-format] Add pre-commit CI env var support to git-clang-format (#188816)

When git-clang-format is invoked with no explicit commit arguments and
both PRE_COMMIT_FROM_REF and PRE_COMMIT_TO_REF are set, the script
automatically uses those refs as the diff range and implies --diff. If
the variables are absent, existing behavior is fully preserved.

This allows projects to use `git-clang-format` directly inside CI
pipelines via the [pre-commit](https://pre-commit.com/) framework
without any wrapper scripts or extra configuration.


Closes: #188813

No existing lit test suite for this script. Verified manually that env
vars activate two-commit diff mode, existing behavior is preserved
without them, and explicit CLI args always override them.
Delta  File
+15 -0  clang/tools/clang-format/git-clang-format
+15 -0  1 files

LLVM/project 354f742 clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage UnsafeBufferUsage.cpp

Update clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.cpp

Replace "const char * const" with "llvm::StringLiteral"

Co-authored-by: Balázs Benics <benicsbalazs at gmail.com>
Delta  File
+1 -3  clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.cpp
+1 -3  1 files

LLVM/project 1611a23 offload/test/offloading back2back_distribute.c bug49021.cpp, openmp/device/src Synchronization.cpp

[OFFLOAD] Add spirv implementation for named barrier (#180393)

This change adds an implementation of named barriers for the SPIRV backend.
Since there is no built-in API or intrinsic for named barriers in SPIRV,
the implementation loosely follows the implementation for AMD.
Delta  File
+22 -9  openmp/device/src/Synchronization.cpp
+2 -1  offload/test/offloading/back2back_distribute.c
+2 -1  offload/test/offloading/bug49021.cpp
+2 -1  offload/test/offloading/atomic-compare-signedness.c
+2 -1  offload/test/offloading/bug51781.c
+2 -1  offload/test/offloading/bug51982.c
+32 -14  82 files not shown
+56 -96  88 files

LLVM/project de65a73 llvm/lib/Analysis ValueTracking.cpp

Rename function to show nan doesn't matter
Delta  File
+4 -4  llvm/lib/Analysis/ValueTracking.cpp
+4 -4  1 files

LLVM/project f8a2e0e clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage UnsafeBufferUsage.cpp

Update clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.cpp

Remove "#include SSAFForceLinker.h"

Co-authored-by: Balázs Benics <benicsbalazs at gmail.com>
Delta  File
+0 -1  clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.cpp
+0 -1  1 files

LLVM/project 8420612 clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage UnsafeBufferUsage.cpp

Update clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.cpp

adjust the position of the file title

Co-authored-by: Balázs Benics <benicsbalazs at gmail.com>
Delta  File
+1 -1  clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.cpp
+1 -1  1 files

LLVM/project a609bff llvm/test/CodeGen/AMDGPU fract-match.ll

AMDGPU: Add baseline tests for more fract patterns
Delta  File
+2,235 -0  llvm/test/CodeGen/AMDGPU/fract-match.ll
+2,235 -0  1 files

LLVM/project 3c625a1 llvm/test/MC/AMDGPU gfx10_unsupported.s gfx7_unsupported.s

[AMDGPU][MC] Improving assembler error message for unsupported instructions (#185778)

The updated error message shows both the instruction name and the GPU
target name.
Delta  File
+2,210 -1,106  llvm/test/MC/AMDGPU/gfx10_unsupported.s
+863 -863  llvm/test/MC/AMDGPU/gfx7_unsupported.s
+1,185 -397  llvm/test/MC/AMDGPU/gfx950_asm_features.s
+691 -691  llvm/test/MC/AMDGPU/gfx11_unsupported.s
+613 -613  llvm/test/MC/AMDGPU/gfx8_unsupported.s
+376 -376  llvm/test/MC/AMDGPU/gfx1250_asm_wmma_w32.s
+5,938 -4,046  52 files not shown
+10,005 -7,666  58 files

LLVM/project 7b5c33d flang/lib/Optimizer/OpenMP LowerWorkdistribute.cpp, mlir/include/mlir/Dialect/OpenMP OpenMPClauses.td OpenMPOps.td

[mlir][OpenMP] Add iterator support to depend clause

Extend the depend clause to support `!omp.iterated<Ty>` handles
alongside plain depend vars, so the IR can represent both forms.

Assisted with copilot
Delta  File
+107 -58  mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+35 -2  mlir/test/Dialect/OpenMP/ops.mlir
+24 -4  mlir/test/Dialect/OpenMP/invalid.mlir
+11 -5  mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+3 -3  mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+4 -0  flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
+184 -72  1 files not shown
+186 -72  7 files

LLVM/project 07f63da clang/test/SemaHLSL Texture2D-mips-errors.hlsl

[HLSL] Fix up Texture2D-mips-errors test

The Texture2D-mips-errors test was supposed to test for an error when the mips
types are used as templates. It was initially disabled because of a
crash. On further investigation, the crash was related to int2(0,0), and
not the mips type.

Follow-up issue for the int2(0,0) crash: #189086

Fixes #188556
Delta  File
+5 -7  clang/test/SemaHLSL/Texture2D-mips-errors.hlsl
+5 -7  1 files

LLVM/project 55f15ad clang/lib/Headers/hlsl hlsl_alias_intrinsics.h, clang/test/CodeGenHLSL/builtins fma.hlsl

Merge branch 'main' into users/amehsan/weakc-nsw
Delta  File
+0 -220  llvm/test/CodeGen/AMDGPU/frame-index-disjoint-s-or-b32.ll
+0 -161  llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir
+138 -0  clang/test/CodeGenHLSL/builtins/fma.hlsl
+113 -0  clang/test/SemaHLSL/BuiltIns/fma-errors.hlsl
+54 -0  clang/lib/Headers/hlsl/hlsl_alias_intrinsics.h
+53 -0  llvm/test/CodeGen/DirectX/fma.ll
+358 -381  11 files not shown
+481 -415  17 files

LLVM/project 509f181 flang/test/Transforms debug-imported-entity.fir, mlir/test/Dialect/LLVMIR bytecode.mlir

[MLIR][TableGen] Fix ArrayRefParameter in struct format roundtrip (#189065)

When an ArrayRefParameter (or OptionalArrayRefParameter) appears in a
non-last position within a struct() assembly format directive, the
printed output is ambiguous: the comma-separated array elements are
indistinguishable from the struct-level commas separating key-value
pairs.

Fix this by wrapping such parameters in square brackets in both the
generated printer and parser. The printer emits '[' before and ']' after
the array value; the parser calls parseLSquare()/parseRSquare() around
the FieldParser call. Parameters with a custom printer or parser are
unaffected (the user controls the format in that case).
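
The ambiguity and the bracket fix can be reproduced with a toy printer and splitter. This is only a model of the textual problem, not the generated MLIR printer or parser: `print_struct`, `split_top_level`, and the field names `dims` and `rank` are hypothetical.

```python
def print_struct(fields, bracket_arrays):
    # Print key-value pairs separated by commas, as a struct() directive does.
    parts = []
    for key, value in fields:
        if isinstance(value, list):
            body = ", ".join(str(v) for v in value)
            text = f"[{body}]" if bracket_arrays else body
        else:
            text = str(value)
        parts.append(f"{key} = {text}")
    return ", ".join(parts)


def split_top_level(text):
    # Split on commas that are not nested inside square brackets.
    pieces, depth, current = [], 0, ""
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            pieces.append(current.strip())
            current = ""
        else:
            current += ch
    pieces.append(current.strip())
    return pieces


fields = [("dims", [1, 2, 3]), ("rank", 4)]
ambiguous = print_struct(fields, bracket_arrays=False)
bracketed = print_struct(fields, bracket_arrays=True)
```

Without brackets the two-field struct splits into four comma-separated pieces, so a parser cannot tell where `dims` ends; with brackets it splits back into exactly two key-value pairs.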

Fixes #156623

Assisted-by: Claude Code
Delta  File
+62 -9  mlir/tools/mlir-tblgen/AttrOrTypeFormatGen.cpp
+68 -0  mlir/test/mlir-tblgen/attr-or-type-format.td
+33 -0  mlir/test/lib/Dialect/Test/TestAttrDefs.td
+18 -1  mlir/test/mlir-tblgen/attr-or-type-format-roundtrip.mlir
+1 -1  mlir/test/Dialect/LLVMIR/bytecode.mlir
+1 -1  flang/test/Transforms/debug-imported-entity.fir
+183 -12  2 files not shown
+185 -14  8 files

LLVM/project 0760a72 flang/lib/Optimizer/OpenMP LowerWorkdistribute.cpp, mlir/include/mlir/Dialect/OpenMP OpenMPClauses.td OpenMPOps.td

[mlir][OpenMP] Add iterator support to depend clause

Extend the depend clause to support `!omp.iterated<Ty>` handles
alongside plain depend vars, so the IR can represent both forms.
Delta  File
+102 -50  mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+35 -2  mlir/test/Dialect/OpenMP/ops.mlir
+11 -5  mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+6 -6  mlir/test/Dialect/OpenMP/invalid.mlir
+3 -3  mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+4 -0  flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
+161 -66  1 files not shown
+163 -66  7 files

LLVM/project a996f2a llvm/lib/Target/AMDGPU SIRegisterInfo.cpp, llvm/test/CodeGen/AMDGPU frame-index-disjoint-s-or-b32.ll eliminate-frame-index-scalar-bit-ops.mir

Revert "AMDGPU: Fold frame indexes into disjoint s_or_b32" (#189074)

Reverts llvm/llvm-project#102345

unblock bot: https://lab.llvm.org/buildbot/#/builders/10/builds/25403
Delta  File
+0 -220  llvm/test/CodeGen/AMDGPU/frame-index-disjoint-s-or-b32.ll
+0 -161  llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir
+2 -6  llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+2 -387  3 files

LLVM/project 3405dc4 libclc/clc/lib/generic/math clc_fract.inc

libclc: Simplify fract implementation

This is nan propagating, so it's unnatural to implement it
in terms of the nan avoiding fmin. Implement with compare and
select, which is the least constrained way to implement the clamp.
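
The difference between the two clamps on a NaN input can be shown with a short model. This is a sketch, not the libclc source: `c_fmin` emulates C's NaN-avoiding `fmin`, `clamp_select` is a hypothetical compare-and-select clamp, and the `LIMIT` constant (the largest double below 1.0) is an assumed upper bound.

```python
import math

LIMIT = 1.0 - 2.0**-53  # largest double below 1.0 (assumed bound)


def c_fmin(a, b):
    # C fmin(): NaN-avoiding, returns the other operand if one is NaN.
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


def clamp_select(v, limit):
    # Compare and select: a NaN v fails the comparison, so v is selected
    # and the NaN propagates to the result.
    return limit if v >= limit else v


nan = float("nan")
from_fmin = c_fmin(nan, LIMIT)          # NaN is dropped
from_select = clamp_select(nan, LIMIT)  # NaN is kept
```

For ordinary inputs both clamps agree; they differ only on NaN, which is why a NaN-propagating operation needs no extra fix-up after the select form.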
Delta  File
+2 -2  libclc/clc/lib/generic/math/clc_fract.inc
+2 -2  1 files

LLVM/project ac1863e llvm/lib/Target/AMDGPU AMDGPUCodeGenPrepare.cpp, llvm/test/CodeGen/AMDGPU fract-match.ll

AMDGPU: Match fract from compare and select and minimum

Implementing this with any of the minnum variants is overconstraining
for the actual use. Existing patterns use fmin, then have to manually
clamp nan inputs to get nan propagating behavior. It's cleaner to express
this with a nan propagating operation to start with.
Delta  File
+780 -30  llvm/test/CodeGen/AMDGPU/fract-match.ll
+124 -85  llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
+904 -115  2 files

LLVM/project 6f23cbd llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor nofpclass-fmul.ll

ValueTracking: x - floor(x) cannot introduce overflow

This returns a value with an absolute value less than 1 so it
should be possible to propagate no-infs.
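
The bound can be spot-checked numerically for finite inputs. This is a sanity check under stated assumptions, not a proof and not the ValueTracking logic: `frac_part` is a hypothetical helper and the sample values are arbitrary. In double arithmetic the difference can round up to exactly 1.0 for tiny negative inputs, but it stays finite, which is the property needed for no-infs.

```python
import math


def frac_part(x):
    # x - floor(x) for a finite double x.
    return x - float(math.floor(x))


samples = [0.0, 1.25, -0.75, 1e308, -1e308, 5e-324, -1e-20]
values = [frac_part(x) for x in samples]
all_bounded = all(math.isfinite(v) and 0.0 <= v <= 1.0 for v in values)
```

Even at the extremes of the double range the subtraction cannot overflow, because `floor(x)` has the same sign as `x` and is never further from zero than one unit below it.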
Delta  File
+42 -0  llvm/test/Transforms/Attributor/nofpclass-fmul.ll
+9 -1  llvm/lib/Analysis/ValueTracking.cpp
+51 -1  2 files

LLVM/project 6587af1 llvm/lib/Target/AMDGPU AMDGPUCodeGenPrepare.cpp, llvm/test/CodeGen/AMDGPU fract-match.ll

AMDGPU: Match fract pattern with swapped edge case check

A fract implementation can equivalently be written as
  r = fmin(x - floor(x));
  r = isnan(x) ? x : r;
  r = isinf(x) ? 0.0 : r;

or:
  r = fmin(x - floor(x));
  r = isinf(x) ? 0.0 : r;
  r = isnan(x) ? x : r;

Previously this only matched the first form. Match
the case where the isinf check is the inner clamp. There are
a few more ways to write this pattern (e.g., move the clamp of
infinity to the input) but I haven't encountered that in the wild.

The existing code seems to be trying too hard to match noncanonical
variants of the pattern. Only handle the result that all 4 permutations
of compare and select produce out of instcombine.
Delta  File
+1,401 -1  llvm/test/CodeGen/AMDGPU/fract-match.ll
+47 -17  llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
+1,448 -18  2 files

LLVM/project ba823d0 llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor/AMDGPU nofpclass-amdgcn-fract.ll

ValueTracking: llvm.amdgcn.fract cannot introduce overflow

This returns a value with an absolute value less than 1.
Delta  File
+26 -0  llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-fract.ll
+2 -1  llvm/lib/Analysis/ValueTracking.cpp
+28 -1  2 files