LLVM/project a03c35f libcxx/include/__format unicode.h

[libc++][NFC] Avoid checking that string::iterator is a contiguous iterator (#178636)

`__is_continuation` is only used in contexts where we already know that
the argument is a contiguous iterator. However, due to the context in
which it is used, we check it as soon as the header is included. The
`contiguous_iterator` check is quite expensive (~12ms on my system), so
avoiding it reduces compile times for quite a few headers, including
`<vector>`.
Delta File
+1 -1 libcxx/include/__format/unicode.h
+1 -1 1 files

LLVM/project 04ae88a llvm/include/llvm/ADT GenericUniformityImpl.h, llvm/include/llvm/CodeGen TargetInstrInfo.h

Implement per-output machine uniformity analysis
Delta File
+76 -14 llvm/lib/CodeGen/MachineUniformityAnalysis.cpp
+27 -11 llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+16 -5 llvm/include/llvm/ADT/GenericUniformityImpl.h
+8 -9 llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/per-output-uniformity.mir
+4 -4 llvm/lib/Target/AMDGPU/SIInstrInfo.h
+4 -3 llvm/include/llvm/CodeGen/TargetInstrInfo.h
+135 -46 2 files not shown
+140 -49 8 files

LLVM/project ec86761 lldb/source/Core CoreProperties.td, lldb/source/Plugins/SymbolFile/PDB SymbolFilePDBProperties.td

[lldb] Fix typos in property descriptions. (#178757)

Delta File
+3 -3 lldb/source/Core/CoreProperties.td
+1 -1 lldb/source/Plugins/SymbolFile/PDB/SymbolFilePDBProperties.td
+4 -4 2 files

LLVM/project d473a87 clang/docs ReleaseNotes.rst, clang/include/clang/Sema Sema.h

[Clang] speed up -Wassign-enum via enumerator caching (#176560)

Fixes #176454

---

This patch addresses a performance issue in `-Wassign-enum` where
enumerator values were repeatedly rebuilt and sorted for each assignment
check, leading to excessive compile-time overhead for large enums.
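
As a rough illustration of the caching idea described above, the following sketch builds the sorted enumerator list once per enum and reuses it for every later check. It is not the code from the patch: `EnumDecl`, `AssignEnumChecker`, `isValidValue`, and `buildCount` are hypothetical names invented for this example, and the real implementation lives in Clang's Sema.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an enum declaration: an ID plus its enumerator values.
struct EnumDecl {
  int id;
  std::vector<std::int64_t> enumerators;
};

// Hypothetical checker that caches the sorted enumerator values per enum.
class AssignEnumChecker {
public:
  // Returns true if `value` equals one of the enumerators of `decl`.
  bool isValidValue(const EnumDecl &decl, std::int64_t value) {
    const std::vector<std::int64_t> &sorted = sortedValues(decl);
    return std::binary_search(sorted.begin(), sorted.end(), value);
  }

  // How many times a sorted list was actually built (for illustration only).
  int buildCount() const { return builds_; }

private:
  // Builds the sorted, de-duplicated list on first use, then serves it from the cache.
  const std::vector<std::int64_t> &sortedValues(const EnumDecl &decl) {
    auto it = cache_.find(decl.id);
    if (it == cache_.end()) {
      std::vector<std::int64_t> values = decl.enumerators;
      std::sort(values.begin(), values.end());
      values.erase(std::unique(values.begin(), values.end()), values.end());
      assert(std::is_sorted(values.begin(), values.end()));
      it = cache_.emplace(decl.id, std::move(values)).first;
      ++builds_;
    }
    return it->second;
  }

  std::unordered_map<int, std::vector<std::int64_t>> cache_;
  int builds_ = 0;
};
```

With this shape, the first check against an enum pays for the sort, and each later check is a single binary search.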


The following charts show the benchmark results before and after
caching enumerator values:

Before
<img width="640" height="480" alt="before_enum_assign"
src="https://github.com/user-attachments/assets/cbc9de29-32cd-452e-84e9-383dcf334bac"
/>


    [3 lines not shown]
Delta File
+20 -20 clang/lib/Sema/SemaStmt.cpp
+12 -1 clang/test/Sema/warn-outof-range-assign-enum.c
+4 -0 clang/include/clang/Sema/Sema.h
+2 -0 clang/docs/ReleaseNotes.rst
+38 -21 4 files

LLVM/project 5bd8dad clang/test/C/C2y n3605.c n3605_1.c, clang/www c_status.html

clang: add test for C2y n3605 (#178479)

Add a test for N3605: Generic replacement (v. 2 of quasi-literals)

The paper clarifies the existing behavior of _Generic selection and
parentheses. This PR adds tests along the same lines and marks the
feature as supported.
Delta File
+57 -0 clang/test/C/C2y/n3605.c
+19 -0 clang/test/C/C2y/n3605_1.c
+4 -0 clang/test/C/C2y/n3605_2.c
+1 -1 clang/www/c_status.html
+81 -1 4 files

LLVM/project 8bfe65c clang/www hacking.html

clang: improve lit testing docs (#178244)

The LLVM Integrated Tester now generates an "easy to use" script to run
clang tests, so it is no longer necessary to pass all of the command-line
arguments to it. This PR simplifies the documentation a little by
removing the unneeded command-line arguments and adding a link to the lit
man page.
Delta File
+32 -39 clang/www/hacking.html
+32 -39 1 files

LLVM/project 481146e llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR per-output-uniformity.mir

[AMDGPU] Add test for amdgcn.if/else per-output uniformity (NFC)

Add a test to document the current behavior of uniformity analysis for
amdgcn.if and amdgcn.else intrinsics. Currently both outputs are marked
divergent regardless of input uniformity.
Delta File
+60 -0 llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/per-output-uniformity.mir
+60 -0 1 files

LLVM/project 0551c8e llvm/include/llvm-c Core.h, llvm/lib/IR Core.cpp

IR: Add stub LLVMCreateDenormalFPEnvAttribute to C API

This is a staging commit for #174293 to avoid an intermediate
break of the mesa build when committed. Implement just enough
of the API in terms of the old attributes to avoid breaking
the mesa use. #174293 will then implement the full API in terms
of the new attribute.
Delta File
+34 -0 llvm/include/llvm-c/Core.h
+18 -0 llvm/unittests/IR/AttributesTest.cpp
+13 -0 llvm/lib/IR/Core.cpp
+65 -0 3 files

LLVM/project e9e8b38 llvm/docs AMDGPUUsage.rst

[AMDGPU] Update documentation for wave reduction intrinsics (#175132)

Delta File
+76 -4 llvm/docs/AMDGPUUsage.rst
+76 -4 1 files

LLVM/project 7a62033 llvm/lib/Target/AMDGPU SOPInstructions.td, llvm/test/MC/AMDGPU gfx13_asm_sop2.s gfx13_asm_sop2_alias.s

[AMDGPU] Add SOP2 support for gfx13 (#178848)

Co-authored-by: Jay Foad <jay.foad at amd.com>
Delta File
+4,716 -0 llvm/test/MC/AMDGPU/gfx13_asm_sop2.s
+123 -90 llvm/lib/Target/AMDGPU/SOPInstructions.td
+51 -0 llvm/test/MC/AMDGPU/gfx13_asm_sop2_alias.s
+4,890 -90 3 files

LLVM/project f190477 clang/include/clang/Basic BuiltinsAMDGPU.td, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Add builtins for wave reduction intrinsics (#170813)

Delta File
+86 -0 clang/test/Sema/wave-reduce-builtins-validate-amdgpu.cl
+84 -0 clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+26 -22 clang/include/clang/Basic/BuiltinsAMDGPU.td
+8 -0 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+204 -22 4 files

LLVM/project 4ded7e0 llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.fadd.ll llvm.amdgcn.reduce.fsub.ll

[AMDGPU] Add wave reduce intrinsics for double types - 2 (#170812)

Supported Ops: `add`, `sub`
Delta File
+1,115 -0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fadd.ll
+1,102 -0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fsub.ll
+91 -19 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2 -0 llvm/lib/Target/AMDGPU/SIInstructions.td
+2,310 -19 4 files

LLVM/project d1817c4 llvm/docs AMDGPUUsage.rst

[AMDGPU] Update documentation for wave reduction intrinsics
Delta File
+76 -4 llvm/docs/AMDGPUUsage.rst
+76 -4 1 files

LLVM/project 9e7919a llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use getRegClass() API
Delta File
+1 -2 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+1 -2 1 files

LLVM/project 88ec5a0 clang/include/clang/Basic BuiltinsAMDGPU.td, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Add builtins for wave reduction intrinsics
Delta File
+86 -0 clang/test/Sema/wave-reduce-builtins-validate-amdgpu.cl
+84 -0 clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+26 -22 clang/include/clang/Basic/BuiltinsAMDGPU.td
+8 -0 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+204 -22 4 files

LLVM/project 61d5361 llvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.fsub.ll llvm.amdgcn.reduce.fadd.ll

Use _pseudo instead of _gfx12 encoding, plus minor code cleanup
Delta File
+19 -14 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+4 -4 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fsub.ll
+4 -4 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fadd.ll
+27 -22 3 files

LLVM/project daefbef llvm/lib/Target/AMDGPU SIISelLowering.cpp

Refactor code and add some comments
Delta File
+8 -5 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+8 -5 1 files

LLVM/project 66042a3 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Avoid generation check in callee function
Delta File
+17 -7 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+17 -7 1 files

LLVM/project 0ee4dd0 llvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.fadd.ll

use `v_mul_f64_pseudo_e64`
Delta File
+3 -3 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fadd.ll
+1 -1 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+4 -4 2 files

LLVM/project 7794646 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use `WAVE_REDUCE_FSUB_PSEUDO_F64` in switch statements
Delta File
+17 -13 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+17 -13 1 files

LLVM/project eab54ba llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use pseudo opcode for switch statements
Delta File
+9 -9 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+9 -9 1 files

LLVM/project 6cf1198 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Don't use the pseudo as a case label.
Delta File
+17 -23 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+17 -23 1 files

LLVM/project cedbeae llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use enum values for source modifiers
Delta File
+3 -3 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+3 -3 1 files

LLVM/project 49be7ba llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use `e32` encoding as placeholder
Delta File
+9 -9 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+9 -9 1 files

LLVM/project 70226db llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use enum values for src modifiers.
Delta File
+8 -8 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+8 -8 1 files

LLVM/project 929f55d llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.fadd.ll llvm.amdgcn.reduce.fsub.ll

[AMDGPU] Add wave reduce intrinsics for double types - 2

Supported Ops: `add`, `sub`
Delta File
+1,115 -0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fadd.ll
+1,102 -0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fsub.ll
+76 -19 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2 -0 llvm/lib/Target/AMDGPU/SIInstructions.td
+2,295 -19 4 files

LLVM/project cd91d31 llvm/lib/Transforms/Utils EntryExitInstrumenter.cpp

[EntryExitInstrumenter] Mark CFG as preserved (#178875)

This pass does not change the CFG, so mark all CFG analyses as
preserved, instead of DT in particular. This matches what the NewPM
implementation does.

(This currently has no direct benefit as nearby passes end up
invalidating things anyway.)
Delta File
+1 -1 llvm/lib/Transforms/Utils/EntryExitInstrumenter.cpp
+1 -1 1 files

LLVM/project 41f453e llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/AArch64 sve-streaming-mode-fixed-length-int-mulh.ll sve-streaming-mode-fixed-length-fp-rounding.ll

Revert "[DAG] Enable bitcast STLF for Constant/Undef" (#178872)

Reverts llvm/llvm-project#172523

As explained in
https://github.com/llvm/llvm-project/pull/172523#issuecomment-3823234270
(along with a reproducer), this causes compiler crashes when building
llvm-test-suite for RVV targets.
Delta File
+0 -71 llvm/test/CodeGen/X86/dag-stlf-mismatch.ll
+22 -8 llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-mulh.ll
+3 -26 llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+14 -0 llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-rounding.ll
+8 -4 llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-to-int.ll
+0 -12 llvm/test/CodeGen/RISCV/rvv/stlf.ll
+47 -121 36 files not shown
+172 -187 42 files

LLVM/project c88dd45 llvm/test/DebugInfo/Generic debug-label-mi.ll, llvm/test/Feature optnone-llc.ll

[AArch64][GlobalISel] Exclude arm64 from failing tests (#178849)

As pointed out by the buildbots and in
https://github.com/llvm/llvm-project/pull/174746#issuecomment-3821237987,
some of the tests modified by
https://github.com/llvm/llvm-project/pull/174746 are missing exclusions
for arm64 target triples, causing the builds to fail.

I have added these exclusions here.
Delta File
+1 -1 llvm/test/DebugInfo/Generic/debug-label-mi.ll
+1 -1 llvm/test/Feature/optnone-llc.ll
+2 -2 2 files

LLVM/project fbffdaa mlir/include/mlir/Dialect/GPU/IR CompilationInterfaces.h, mlir/lib/Target/LLVM/NVVM Target.cpp

[MLIR][GPU] Update serializeToObject to use SerializedObject wrapper and include ISA compiler logs (#176697)

This PR makes the compilation log from the ISA compiler available to users
by returning it as part of the `gpu::ObjectAttr` properties, following
the existing pattern of `LLVMIRToISATimeInMs`.

Currently, the compiler log (which contains useful information such as
spill statistics when --verbose is passed) is only accessible in debug
builds via `LLVM_DEBUG`. However, there are good reasons to make this
information available in release builds as well:

1. Both `ptxas` and `libnvptxcompiler` are publicly available
tools/libraries distributed with the CUDA Toolkit. The `--verbose` flag
and its output are documented public features, not internal debug
information.
2. The verbose output provides valuable insights for users.

A new `SerializedObject` class is used to carry the metadata alongside
the binary when returning from `serializeToObject`.
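
To make the wrapper idea concrete, here is a minimal self-contained sketch of a type that carries a binary payload together with string metadata such as a compiler log. It is only an illustration under stated assumptions: `SerializedObjectSketch`, `serializeSketch`, the `"ISACompilerLog"` key, and the placeholder log text are all hypothetical and do not reflect the actual MLIR class or its interface.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical wrapper: a binary payload plus key/value metadata.
class SerializedObjectSketch {
public:
  explicit SerializedObjectSketch(std::vector<char> object,
                                  std::map<std::string, std::string> metadata = {})
      : object_(std::move(object)), metadata_(std::move(metadata)) {}

  const std::vector<char> &getObject() const { return object_; }
  const std::map<std::string, std::string> &getMetadata() const { return metadata_; }

private:
  std::vector<char> object_;
  std::map<std::string, std::string> metadata_;
};

// Hypothetical serializer: copies the ISA text as the "binary" and, when
// `verbose` is set, attaches a placeholder log under an assumed metadata key.
SerializedObjectSketch serializeSketch(const std::string &isa, bool verbose) {
  assert(!isa.empty() && "nothing to serialize");
  std::vector<char> binary(isa.begin(), isa.end());
  std::map<std::string, std::string> metadata;
  if (verbose)
    metadata["ISACompilerLog"] =
        "placeholder log: " + std::to_string(binary.size()) + " bytes";
  return SerializedObjectSketch(std::move(binary), std::move(metadata));
}
```

Returning one object rather than a bare byte buffer lets callers that only need the binary ignore the metadata, while callers that want the log can look it up by key.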
Delta File
+57 -28 mlir/lib/Target/LLVM/NVVM/Target.cpp
+22 -18 mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp
+21 -18 mlir/unittests/Target/LLVM/SerializeToLLVMBitcode.cpp
+18 -14 mlir/unittests/Target/LLVM/SerializeNVVMTarget.cpp
+12 -8 mlir/lib/Target/LLVM/ROCDL/Target.cpp
+19 -0 mlir/include/mlir/Dialect/GPU/IR/CompilationInterfaces.h
+149 -86 8 files not shown
+191 -101 14 files