LLVM/project ed26a4b clang/include/clang/Basic BuiltinsAMDGPU.td, clang/test/Sema wave-reduce-builtins-validate-amdgpu.cl

Mark strategy argument as constant
Delta File
+86 -0 clang/test/Sema/wave-reduce-builtins-validate-amdgpu.cl
+26 -26 clang/include/clang/Basic/BuiltinsAMDGPU.td
+112 -26 2 files

LLVM/project b3c8562 clang/include/clang/Basic BuiltinsAMDGPU.td, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Add builtins for wave reduction intrinsics
Delta File
+84 -0 clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+8 -0 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+4 -0 clang/include/clang/Basic/BuiltinsAMDGPU.td
+96 -0 3 files

LLVM/project fff45e7 llvm/docs AMDGPUUsage.rst

Modelled fmin/fmax similar to llvm.minimumnum/maximumnum
Delta File
+8 -2 llvm/docs/AMDGPUUsage.rst
+8 -2 1 files

LLVM/project 9a27389 llvm/docs AMDGPUUsage.rst

[AMDGPU] Update documentation for wave reduction intrinsics
Delta File
+70 -4 llvm/docs/AMDGPUUsage.rst
+70 -4 1 files

LLVM/project 18fd307 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use getRegClass() API
Delta File
+1 -2 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+1 -2 1 files

LLVM/project 6bba170 lld/ELF Relocations.cpp

[ELF] Remove unneeded -z ifunc-noplt check. NFC

The `isIfunc && zIfuncNoplt` code path does not use the RelExpr, so we
don't need to adjust it.
Delta File
+1 -1 lld/ELF/Relocations.cpp
+1 -1 1 files

LLVM/project c4528aa llvm/lib/Target/AMDGPU SIISelLowering.cpp

Don't use the pseudo as a case label.
Delta File
+17 -23 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+17 -23 1 files

LLVM/project 6f3a197 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Refactor code and add some comments
Delta File
+8 -5 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+8 -5 1 files

LLVM/project b0ef64d llvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.fadd.ll llvm.amdgcn.reduce.fsub.ll

Use _pseudo instead of _gfx12 encoding, plus minor code cleanup
Delta File
+19 -14 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+4 -4 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fadd.ll
+4 -4 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fsub.ll
+27 -22 3 files

LLVM/project 0ed2fd7 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use pseudo opcode for switch statements
Delta File
+9 -9 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+9 -9 1 files

LLVM/project f278902 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Avoid generation check in callee function
Delta File
+17 -7 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+17 -7 1 files

LLVM/project 29b0208 llvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.fadd.ll

use `v_mul_f64_pseudo_e64`
Delta File
+3 -3 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fadd.ll
+1 -1 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+4 -4 2 files

LLVM/project ac878ed llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use `WAVE_REDUCE_FSUB_PSEUDO_F64` in switch statements
Delta File
+17 -13 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+17 -13 1 files

LLVM/project 5d335d7 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use enum values for source modifiers
Delta File
+3 -3 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+3 -3 1 files

LLVM/project 5f82668 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use `e32` encoding as placeholder
Delta File
+9 -9 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+9 -9 1 files

LLVM/project 9aec66d llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.fadd.ll llvm.amdgcn.reduce.fsub.ll

[AMDGPU] Add wave reduce intrinsics for double types - 2

Supported Ops: `add`, `sub`
Delta File
+1,115 -0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fadd.ll
+1,102 -0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fsub.ll
+76 -19 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2 -0 llvm/lib/Target/AMDGPU/SIInstructions.td
+2,295 -19 4 files

LLVM/project ddc5fa3 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use enum values for src modifiers.
Delta File
+8 -8 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+8 -8 1 files

LLVM/project cbec464 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Running clang format
Delta File
+1 -2 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+1 -2 1 files

LLVM/project ac47d8c llvm/lib/Transforms/Instrumentation MemorySanitizer.cpp, llvm/test/Instrumentation/MemorySanitizer/AArch64 aarch64-bf16-dotprod-intrinsics.ll

[msan] Handle Arm NEON BFloat16 multiply-add to single-precision (#178510)

aarch64.neon.bfmlalb/t perform dot-products after zeroing out the
odd/even-indexed values. We handle these by generalizing
handleVectorDotProductIntrinsic() and (mis-)using getPclmulMask().
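
As a rough illustration of the arithmetic described above (not MSan's shadow propagation and not the code in this patch), the sketch below models the bottom/top multiply-add as a two-element dot product in which the odd- or even-indexed inputs are zeroed first. Plain `float` stands in for BFloat16, and `zeroLanes`, `pairwiseDot`, `bfmlalbModel`, and `bfmlaltModel` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: float stands in for BFloat16 elements.
using Vec = std::vector<float>;

// Zero every element whose index has the given parity (0 = even, 1 = odd).
Vec zeroLanes(Vec v, std::size_t parity) {
  for (std::size_t i = parity; i < v.size(); i += 2)
    v[i] = 0.0f;
  return v;
}

// Two-element dot product per output lane:
//   out[i] = acc[i] + a[2i] * b[2i] + a[2i+1] * b[2i+1]
Vec pairwiseDot(const Vec &acc, const Vec &a, const Vec &b) {
  assert(a.size() == b.size() && a.size() == 2 * acc.size());
  Vec out(acc);
  for (std::size_t i = 0; i < acc.size(); ++i)
    out[i] += a[2 * i] * b[2 * i] + a[2 * i + 1] * b[2 * i + 1];
  return out;
}

// "Bottom" variant: only even-indexed inputs contribute, so zero the odd ones.
// Zeroing one operand is enough, since the product with zero vanishes.
Vec bfmlalbModel(const Vec &acc, const Vec &a, const Vec &b) {
  return pairwiseDot(acc, zeroLanes(a, 1), b);
}

// "Top" variant: only odd-indexed inputs contribute, so zero the even ones.
Vec bfmlaltModel(const Vec &acc, const Vec &a, const Vec &b) {
  return pairwiseDot(acc, zeroLanes(a, 0), b);
}
```

The model ignores BFloat16 rounding and NaN/infinity behaviour; it is only meant to show why a generalized dot-product handler can cover both variants once the unused lanes are masked.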
Delta File
+72 -90 llvm/test/Instrumentation/MemorySanitizer/AArch64/aarch64-bf16-dotprod-intrinsics.ll
+67 -9 llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+139 -99 2 files

LLVM/project 1079273 clang-tools-extra/clang-tidy/cppcoreguidelines ProBoundsArrayToPointerDecayCheck.cpp

[clang-tidy] Speed up `cppcoreguidelines-pro-bounds-array-to-pointer-decay` (#178775)

By just changing the order of some conditions, the check goes from
fairly expensive to very cheap:

```txt
                    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
Status quo:         0.7812 (  1.7%)   0.0469 (  0.7%)   0.8281 (  1.6%)   0.5585 (  1.1%)  cppcoreguidelines-pro-bounds-array-to-pointer-decay
With this change:   0.0312 (  0.1%)   0.0000 (  0.0%)   0.0312 (  0.1%)   0.0598 (  0.1%)  cppcoreguidelines-pro-bounds-array-to-pointer-decay
```
`hicpp-no-array-decay` is an alias of this check and so benefits too.
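
The general effect of reordering conditions can be sketched as follows. This is not the actual diff: `Node`, `cheapTest`, and `expensiveTest` are hypothetical stand-ins, and the call counter exists only to make the difference visible. Because `&&` short-circuits, putting the cheap and selective test first means the costly test runs only on the few nodes that pass it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an AST node; not a Clang type.
struct Node {
  bool isDecay; // cheap property, true for few nodes
  int payload;  // input to the costly predicate
};

// Counts calls so the two orderings can be compared.
static std::size_t expensiveCalls = 0;

// Stand-in for a costly test (for example, one that walks ancestors).
bool expensiveTest(const Node &n) {
  ++expensiveCalls;
  return n.payload % 2 == 0;
}

// Cheap test: a single field read.
bool cheapTest(const Node &n) { return n.isDecay; }

// Costly predicate first: it runs on every node.
std::size_t countExpensiveFirst(const std::vector<Node> &nodes) {
  std::size_t hits = 0;
  for (const Node &n : nodes)
    if (expensiveTest(n) && cheapTest(n))
      ++hits;
  return hits;
}

// Cheap predicate first: the costly one runs only on nodes that pass it.
std::size_t countCheapFirst(const std::vector<Node> &nodes) {
  std::size_t hits = 0;
  for (const Node &n : nodes)
    if (cheapTest(n) && expensiveTest(n))
      ++hits;
  return hits;
}
```

Both functions return the same count; only the number of calls to the costly predicate differs.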
Delta File
+1 -3 clang-tools-extra/clang-tidy/cppcoreguidelines/ProBoundsArrayToPointerDecayCheck.cpp
+1 -3 1 files

LLVM/project 44e0811 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.class.ll llvm.amdgcn.class.f16.ll

[AMDGPU][GlobalISel] Add RegBankLegalize rules for amdgcn.class
Delta File
+212 -101 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.class.ll
+57 -15 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.class.f16.ll
+8 -0 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+1 -2 llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.class.mir
+278 -118 4 files

LLVM/project b52591b compiler-rt CMakeLists.txt

compiler-rt: add atomic to SANITIZER_COMMON_LINK_LIBS for MIPS (#178819)

atomic is needed explicitly for MIPS.
Delta File
+4 -0 compiler-rt/CMakeLists.txt
+4 -0 1 files

LLVM/project a17bc05 libc/shared/math sincosf.h, libc/src/__support/math sincosf.h CMakeLists.txt

[libc][math] Refactor sincosf implementation to header only (#177523)

Part of #147386

in preparation for:
https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450

Closes #177640
Delta File
+226 -0 libc/src/__support/math/sincosf.h
+2 -196 libc/src/math/generic/sincosf.cpp
+20 -6 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+23 -0 libc/shared/math/sincosf.h
+16 -0 libc/src/__support/math/CMakeLists.txt
+6 -5 libc/src/__support/math/sincosf_utils.h
+293 -207 4 files not shown
+301 -216 10 files

LLVM/project a00f5a6 llvm/test/CodeGen/AMDGPU llvm.amdgcn.class.ll llvm.amdgcn.class.f16.ll

[AMDGPU][NFC] Update test to use update_llc_test_checks
Delta File
+974 -460 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.class.ll
+222 -58 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.class.f16.ll
+1,196 -518 2 files

LLVM/project 947df33 libc/shared/math f16sqrt.h, libc/src/__support/math f16sqrt.h CMakeLists.txt

[libc][math] Refactor f16sqrt to Header Only (#177167)

Fixes #175330
Part of https://github.com/llvm/llvm-project/issues/147386

in preparation for:
https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
Delta File
+32 -0 libc/src/__support/math/f16sqrt.h
+29 -0 libc/shared/math/f16sqrt.h
+12 -1 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+10 -0 libc/src/__support/math/CMakeLists.txt
+2 -6 libc/src/math/generic/f16sqrt.cpp
+1 -2 libc/src/math/generic/CMakeLists.txt
+86 -9 3 files not shown
+90 -9 9 files

LLVM/project 3441623 llvm/lib/CodeGen/SelectionDAG TargetLowering.cpp

address wangleiat's comment
Delta File
+1 -1 llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+1 -1 1 files

LLVM/project b6e6837 lldb/source/Commands CommandObjectDWIMPrint.cpp

[lldb] Make `print` delegate to synthetic frames.

This patch is more of a proposal in that it's a pretty dramatic change to the way that `print` works. It completely delegates getting values to the frame if the frame is synthetic, and does not redirect at all if the frame fails.

For this patch, the main goal was to allow the synthetic frame to bubble up its own errors in expression evaluation, rather than having errors come back with an extra "could not find identifier <blah>" or worse, simply get swallowed. If there's a better way to handle this, I'm more than happy to change this as long as the core goals of 'delegate variable/value extraction to the synthetic frame', and 'allow the synthetic frame to give back errors that are displayed to the user' can be met.

stack-info: PR: https://github.com/llvm/llvm-project/pull/178602, branch: users/bzcheeseman/stack/7
Delta File
+23 -2 lldb/source/Commands/CommandObjectDWIMPrint.cpp
+23 -2 1 files

LLVM/project 5d52734 lldb/include/lldb/Interpreter/Interfaces ScriptedFrameInterface.h, lldb/source/Plugins/Process/scripted ScriptedFrame.cpp ScriptedFrame.h

[lldb] Add support for ScriptedFrame to provide values/variables.

This patch adds plumbing to support the implementations of StackFrame::Get{*}Variable{*} on ScriptedFrame. The major pieces required are:
- A modification to ScriptedFrameInterface, so that we can actually call the python methods.
- A corresponding update to the python implementation to call the python methods.
- An implementation in ScriptedFrame that can get the variable list on construction inside ScriptedFrame::Create, and pass that list into the ScriptedFrame so it can get those values on request.

There is a major caveat, which is that if the values from the python side don't have variables attached, right now, they won't be passed into the scripted frame to be stored in the variable list. Future discussions around adding support for 'extended variables' when printing frame variables may create a reason to change the VariableListSP into a ValueObjectListSP, and generate the VariableListSP on the fly, but that should be addressed at a later time.

This patch also adds tests to the frame provider test suite to prove these changes all plumb together correctly.

Related radar: rdar://165708771

stack-info: PR: https://github.com/llvm/llvm-project/pull/178575, branch: users/bzcheeseman/stack/6
Delta File
+82 -0 lldb/test/API/functionalities/scripted_frame_provider/test_frame_providers.py
+66 -0 lldb/source/Plugins/Process/scripted/ScriptedFrame.cpp
+53 -0 lldb/test/API/functionalities/scripted_frame_provider/TestScriptedFrameProvider.py
+28 -0 lldb/source/Plugins/ScriptInterpreter/Python/Interfaces/ScriptedFramePythonInterface.cpp
+21 -0 lldb/source/Plugins/Process/scripted/ScriptedFrame.h
+9 -0 lldb/include/lldb/Interpreter/Interfaces/ScriptedFrameInterface.h
+259 -0 2 files not shown
+269 -0 8 files

LLVM/project 4953ebf llvm/lib/Target/LoongArch LoongArchISelLowering.cpp LoongArchRegisterInfo.cpp, llvm/test/CodeGen/LoongArch preserve_nonecc_call.ll preserve_nonecc_varargs.ll

[LoongArch] Support `preserve_none` calling convention (#178566)

Add support for the `preserve_none` calling convention on LoongArch.
Registers `R4-R20` and `R23-R31` are treated as caller-saved and may be
used for argument passing, except for `R31`.
Delta File
+455 -0 llvm/test/CodeGen/LoongArch/preserve_nonecc_call.ll
+196 -0 llvm/test/CodeGen/LoongArch/preserve_nonecc_varargs.ll
+113 -0 llvm/test/CodeGen/LoongArch/preserve_nonecc.ll
+39 -5 llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+11 -0 llvm/test/CodeGen/LoongArch/preserve_nonecc_musttail.ll
+7 -2 llvm/lib/Target/LoongArch/LoongArchRegisterInfo.cpp
+821 -7 2 files not shown
+824 -8 8 files

LLVM/project 0568797 llvm/lib/Target/LoongArch LoongArchInstrInfo.cpp, llvm/test/CodeGen/LoongArch disable-reloc-sched.ll

[LoongArch] Add option to disable scheduling of instructions with target flags

This patch adds a hidden command-line option
-loongarch-disable-reloc-sched. When enabled, isSafeToMove returns
false for instructions that have operands with target flags.

This effectively prevents code motion for instructions involved in
relocations, which is useful for debugging code generation issues
related to relocation sequences or scheduling boundaries.
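
The gating described above might look roughly like the following self-contained sketch. It does not use LLVM's real classes or the patch's code: `Operand`, `Instr`, `Options`, and `hasTargetFlagOperand` are hypothetical miniatures, and `genericSafe` stands in for whatever the default safety check would otherwise return.

```cpp
#include <vector>

// Hypothetical miniature of a machine operand; not LLVM's MachineOperand.
struct Operand {
  unsigned targetFlags = 0; // non-zero marks a relocation-related operand
};

// Hypothetical miniature of a machine instruction; not LLVM's MachineInstr.
struct Instr {
  std::vector<Operand> operands;
};

// Stand-in for the hidden option -loongarch-disable-reloc-sched.
struct Options {
  bool disableRelocSched = false;
};

// True if any operand carries target flags.
bool hasTargetFlagOperand(const Instr &mi) {
  for (const Operand &op : mi.operands)
    if (op.targetFlags != 0)
      return true;
  return false;
}

// Sketch of the gating: with the option on, flagged instructions are never
// considered safe to move; otherwise the default answer is passed through.
bool isSafeToMove(const Instr &mi, const Options &opts, bool genericSafe) {
  if (opts.disableRelocSched && hasTargetFlagOperand(mi))
    return false;
  return genericSafe;
}
```

With the option off, the function simply forwards the default answer, so the flag only ever makes code motion more conservative.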

Reviewers: heiher, SixWeining

Pull Request: https://github.com/llvm/llvm-project/pull/178639
Delta File
+48 -0 llvm/test/CodeGen/LoongArch/disable-reloc-sched.ll
+12 -0 llvm/lib/Target/LoongArch/LoongArchInstrInfo.cpp
+60 -0 2 files