LLVM/project 1006009 llvm/lib/Transforms/Vectorize VPlanRecipes.cpp VPlan.h, llvm/test/Transforms/LoopVectorize cast-costs.ll vscale-cost.ll

Revert "[LV] Add initial costs for VPInstructionWithType::computeCost (#198291)" (#202933)

This reverts commit 690b0b0c63125aaf6b517df9d528789bb8c9c08a.

Fixes buildbot failure:
https://lab.llvm.org/buildbot/#/builders/132/builds/6656
Delta File
+49 -24 llvm/test/Transforms/LoopVectorize/RISCV/gather-scatter-cost.ll
+0 -25 llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+6 -6 llvm/test/Transforms/LoopVectorize/cast-costs.ll
+5 -7 llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-gather-scatter.ll
+4 -1 llvm/lib/Transforms/Vectorize/VPlan.h
+2 -2 llvm/test/Transforms/LoopVectorize/vscale-cost.ll
+66 -65 1 files not shown
+68 -67 7 files

LLVM/project c838b5d libc/shared math_check_exceptions.h, libc/shared/math/check exp.h

[libc][math] Add shared functions to check exceptions for exp* functions. (#202503)

To be used inside LLVM and other projects.
Delta File
+84 -0 libc/src/__support/math/check/exp_exceptions.h
+62 -0 libc/test/shared/shared_math_check_exp_test.cpp
+25 -0 libc/shared/math/check/exp.h
+16 -0 libc/shared/math_check_exceptions.h
+12 -0 libc/test/shared/CMakeLists.txt
+199 -0 5 files

LLVM/project 14a9660 llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 pr53842.ll

[X86] combineConcatVectorOps - add 512-bit PCMPEQ/PCMPGT handling (#202928)

If we can freely concatenate both operands, then it's worth replacing them
with a VPCMP+VPMOVM2 pair.

Managed to notice this while triaging #198162 - and the AVX512DQ SGT
test shows another vpmovq2m+vpmovm2q pair codegen issue :(
Delta File
+74 -47 llvm/test/CodeGen/X86/pr53842.ll
+13 -1 llvm/lib/Target/X86/X86ISelLowering.cpp
+87 -48 2 files

LLVM/project 49affe5 clang/include/clang/Basic AttrDocs.td Attr.td

Document the warn_unused attribute (#201881)

Basically, this attribute is useful for getting -Wunused-variable
diagnostics from class types with a nontrivial constructor or
destructor.
Delta File
+34 -0 clang/include/clang/Basic/AttrDocs.td
+1 -1 clang/include/clang/Basic/Attr.td
+35 -1 2 files
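
As a rough illustration of the case the warn_unused commit above describes, the sketch below marks a class with a nontrivial constructor and destructor as `warn_unused`. `ScopedCounter` and `depth_inside_scope` are hypothetical names that do not come from the patch, and the GNU-style `__attribute__` spelling is an assumption about how the attribute would typically be written.

```cpp
#include <cassert>

// Hypothetical RAII-style type. Its constructor and destructor are nontrivial,
// so the compiler normally assumes that creating one has a purpose and stays
// silent about locals that are never referenced. The attribute opts the type
// back in to -Wunused-variable.
struct __attribute__((warn_unused)) ScopedCounter {
  explicit ScopedCounter(int &c) : counter(c) { ++counter; }
  ~ScopedCounter() {
    assert(counter > 0); // every destructor call pairs with one constructor call
    --counter;
  }
  int depth() const { return counter; }
  int &counter;
};

int depth_inside_scope() {
  int live = 0;
  ScopedCounter used(live); // referenced below, so no diagnostic is expected
  // ScopedCounter unused(live); // never referenced: the kind of local that
  //                             // the attribute lets -Wunused-variable flag
  return used.depth();
}
```

The unused declaration is left commented out so the sketch builds cleanly even when warnings are treated as errors.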

LLVM/project 3e470fc clang/lib/CIR/CodeGen CIRGenBuiltinAMDGPU.cpp, clang/test/CIR/CodeGenHIP builtins-amdgcn-vi-f16.hip builtins-amdgcn.hip

[CIR][AMDGPU] Add support for AMDGCN div_fixup builtins (#197468)

Adds codegen for the following AMDGCN division fixup builtins:

- __builtin_amdgcn_div_fixup (double)
- __builtin_amdgcn_div_fixupf (float)
- __builtin_amdgcn_div_fixuph (half)

These are lowered to the corresponding `llvm.amdgcn.div.fixup` intrinsic.
Delta File
+65 -0 clang/test/CIR/CodeGenHIP/builtins-amdgcn-vi-f16.hip
+18 -2 clang/test/CIR/CodeGenHIP/builtins-amdgcn.hip
+6 -4 clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp
+89 -6 3 files

LLVM/project 7f5b575 llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp, llvm/test/CodeGen/AMDGPU waitcnt-debug.mir

[RFC][AMDGPU] Remove DebugCounter-based WaitCnt debugging

It's 8 years old, only used by a handful of tests, and has not been updated
in a while except for maintenance as far as I can see.

I don't mind keeping it in if there are users of it, but right now it
looks like a dead feature. If we want some more elaborate waitcnt debugging,
we should have a modern, generic system that works on any waitcnt, not
something specific to 3 GFX9 counters.
Delta File
+1 -50 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+0 -44 llvm/test/CodeGen/AMDGPU/waitcnt-debug.mir
+1 -94 2 files

LLVM/project 64d4596 llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp AMDGPUWaitcntUtils.cpp, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp AMDGPUBaseInfo.h

[NFC][AMDGPU][InsertWaitCnts] Move some simple functions into Utils

Move really trivial functions into helpers to declutter InsertWaitCnt a bit more.
I had to move HardwareLimits into a different header but it's only used in InsertWaitCnt so it doesn't matter.
Delta File
+21 -86 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+75 -0 llvm/lib/Target/AMDGPU/AMDGPUWaitcntUtils.cpp
+32 -0 llvm/lib/Target/AMDGPU/AMDGPUWaitcntUtils.h
+0 -20 llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+0 -20 llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+128 -126 5 files

LLVM/project d58808a llvm/include/llvm/ADT GenericUniformityImpl.h, llvm/lib/Analysis UniformityAnalysis.cpp

review
Delta File
+1 -28 llvm/lib/CodeGen/MachineUniformityAnalysis.cpp
+3 -12 llvm/include/llvm/ADT/GenericUniformityImpl.h
+10 -1 llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+2 -2 llvm/lib/Analysis/UniformityAnalysis.cpp
+16 -43 4 files

LLVM/project c655905 llvm/lib/Target/AMDGPU/Utils AMDGPUHWEvents.cpp

Comments
Delta File
+86 -70 llvm/lib/Target/AMDGPU/Utils/AMDGPUHWEvents.cpp
+86 -70 1 files

LLVM/project 7b1dc59 llvm/lib/Transforms/Vectorize VPlanRecipes.cpp VPlan.h, llvm/test/Transforms/LoopVectorize widen-gep-all-indices-invariant.ll narrow-to-single-scalar-widen-gep-scalable.ll

[VPlan] Simplify WidenGEP::execute (#193543)

WidenGEP::execute currently depends on whether a given operand is
defined outside loop regions, but loop-invariant operands are not
guaranteed to be hoisted outside the loop, and single-scalar operands
are not guaranteed to be maximally narrowed to single-scalars. Use the
vputils::isSingleScalar helper to analyze the single-scalar status of
each operand and of the result instead, which simplifies execute and
also leads to some improvements.
Delta File
+6 -54 llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+22 -24 llvm/test/Transforms/LoopVectorize/widen-gep-all-indices-invariant.ll
+0 -8 llvm/lib/Transforms/Vectorize/VPlan.h
+2 -2 llvm/test/Transforms/PhaseOrdering/ARM/arm_var_q31.ll
+1 -3 llvm/test/Transforms/LoopVectorize/narrow-to-single-scalar-widen-gep-scalable.ll
+1 -1 llvm/test/Transforms/LoopVectorize/VPlan/vplan-printing.ll
+32 -92 1 files not shown
+33 -93 7 files

LLVM/project 80460f1 mlir/lib/Dialect/SPIRV/IR CastOps.cpp, mlir/test/Dialect/SPIRV/IR cast-ops.mlir

[mlir][SPIR-V] Fix ConvertUToPtr verifier error message (NFC) (#202899)
Delta File
+10 -0 mlir/test/Dialect/SPIRV/IR/cast-ops.mlir
+1 -1 mlir/lib/Dialect/SPIRV/IR/CastOps.cpp
+11 -1 2 files

LLVM/project e63f9be clang/lib/Sema SemaType.cpp, clang/test/CXX/dcl.dcl/dcl.spec/dcl.type/dcl.spec.auto p3-generic-lambda-1y.cpp p5.cpp

[Clang] Accept auto parameters pre-C++20 as an extension (#200670)

GCC already accepts auto parameters as an extension.
Delta File
+25 -26 clang/test/SemaCXX/deduced-return-type-cxx14.cpp
+32 -18 clang/test/CXX/dcl/dcl.fct/p17.cpp
+11 -6 clang/lib/Sema/SemaType.cpp
+5 -5 clang/test/CXX/dcl.dcl/dcl.spec/dcl.type/dcl.spec.auto/p3-generic-lambda-1y.cpp
+3 -3 clang/test/CXX/dcl.dcl/dcl.spec/dcl.type/dcl.spec.auto/p5.cpp
+2 -2 clang/test/SemaCXX/crash-GH173943.cpp
+78 -60 5 files not shown
+85 -63 11 files
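
To make the auto-parameter commit above concrete, here is a minimal sketch of what an `auto` parameter is shorthand for. The abbreviated spelling appears only in a comment so that the block builds on compilers that predate the change; `twice` and `check_twice` are hypothetical names, not taken from the patch or its tests.

```cpp
#include <cassert>
#include <string>

// Abbreviated function template, standard since C++20. Per the commit above,
// Clang also accepts this spelling in earlier language modes as an extension:
//
//   auto twice(auto x) { return x + x; }
//
// Explicit template that the abbreviated form is shorthand for. This version
// is valid from C++14 on, since it only relies on a deduced return type.
template <typename T>
auto twice(T x) {
  return x + x;
}

// Small usage check: each call deduces T from its argument.
inline void check_twice() {
  assert(twice(21) == 42);
  assert(twice(std::string("ab")) == "abab");
}
```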

LLVM/project b68444e llvm/lib/Target/AArch64 SVEInstrFormats.td SMEInstrFormats.td

fixup! Tighten code some more
Delta File
+13 -13 llvm/lib/Target/AArch64/SVEInstrFormats.td
+2 -6 llvm/lib/Target/AArch64/SMEInstrFormats.td
+3 -0 llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+18 -19 3 files

LLVM/project 53ae585 llvm/test/MC/AMDGPU gfx11_asm_vop3_dpp16.s, llvm/test/MC/Disassembler/AMDGPU gfx11_dasm_vop3_dpp16.txt gfx11_dasm_vop3_dpp16-fake16.txt

[AMDGPU][NFC] Templatise and roundtrip gfx11_asm_vop3_dpp16.s (#202721)

I tried to make sure this covers all important cases from asm/disasm
tests here upstream and the true16 branch downstream.

This will resolve ~4k lines of differences vs the true16 branch.
Delta File
+8,554 -3,250 llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp16.s
+0 -6,200 llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp16.txt
+4,380 -0 llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp16-fake16.txt
+12,934 -9,450 3 files

LLVM/project e518b41 llvm/lib/Transforms/InstCombine InstCombineCalls.cpp

[InstCombine][NFC] Don't try non-bundle folds on assumes with bundles (#202914)
Delta File
+24 -19 llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+24 -19 1 files

LLVM/project 284681e lldb/source/Plugins/Process/Windows/Common TargetThreadWindows.cpp TargetThreadWindows.h

[NFC][lldb][Windows] Clean up TargetThreadWindows (#202722)

- Drop dead `//#include "ForwardDecl.h"` and stale `class HostThread;`
forward declaration.
- Remove redundant `m_thread_reg_ctx_sp()` default-init in the
constructor initializer list.
Delta File
+1 -1 lldb/source/Plugins/Process/Windows/Common/TargetThreadWindows.cpp
+0 -2 lldb/source/Plugins/Process/Windows/Common/TargetThreadWindows.h
+1 -3 2 files
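
The second cleanup item in the TargetThreadWindows commit above can be pictured with a small stand-alone sketch. It assumes `m_thread_reg_ctx_sp` is a `std::shared_ptr`-like member; `RegisterContext`, `ThreadBefore`, `ThreadAfter`, and `check_equivalent` are hypothetical stand-ins and not the LLDB classes.

```cpp
#include <cassert>
#include <memory>

struct RegisterContext {}; // hypothetical stand-in for the register context type

// Before: the smart-pointer member is explicitly value-initialized in the
// initializer list, which yields the same empty pointer as omitting it.
class ThreadBefore {
public:
  ThreadBefore() : m_thread_reg_ctx_sp() {}
  bool has_context() const { return m_thread_reg_ctx_sp != nullptr; }

private:
  std::shared_ptr<RegisterContext> m_thread_reg_ctx_sp;
};

// After: std::shared_ptr's default constructor already produces an empty
// pointer, so the initializer can be dropped without changing behavior.
class ThreadAfter {
public:
  ThreadAfter() {}
  bool has_context() const { return m_thread_reg_ctx_sp != nullptr; }

private:
  std::shared_ptr<RegisterContext> m_thread_reg_ctx_sp;
};

// Both variants start out without a register context.
inline void check_equivalent() {
  assert(!ThreadBefore().has_context());
  assert(!ThreadAfter().has_context());
}
```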

LLVM/project d1f15d0 lldb/source/Plugins/Process/Windows/Common NativeThreadWindows.cpp

[NFC][lldb][Windows] Clean up NativeThreadWindows (#202723)

- Drop unused #includes lldb/Target/Process.h and lldb/lldb-forward.h.
- Inline the one-shot NativeProcessProtocol& local in DoResume and
modernize GetStopReason's stale legacy log->Printf idiom to LLDB_LOGF.
Delta File
+3 -9 lldb/source/Plugins/Process/Windows/Common/NativeThreadWindows.cpp
+3 -9 1 files

LLVM/project 84285e4 lldb/source/Plugins/Process/Windows/Common DebuggerThread.cpp DebuggerThread.h

[NFC][lldb][Windows] Clean up DebuggerThread (#202719)

- Fix typos in a llvm_unreachable string and a local variable name.
- Replace a C-style downcast to HostProcessWindows with static_cast.
- Drop redundant braces around a single-statement if and add the
namespace-closer comment in the header.
Delta File
+6 -6 lldb/source/Plugins/Process/Windows/Common/DebuggerThread.cpp
+1 -1 lldb/source/Plugins/Process/Windows/Common/DebuggerThread.h
+7 -7 2 files
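
The cast change listed in the DebuggerThread commit above follows a common pattern, sketched below under stated assumptions. `HostProcessBase`, `HostProcessWin`, and `exit_code_of` are hypothetical stand-ins; the real LLDB class hierarchy and call site are not reproduced here.

```cpp
#include <cassert>

// Hypothetical stand-ins for a generic host-process base class and its
// Windows-specific subclass.
struct HostProcessBase {
  virtual ~HostProcessBase() = default;
};
struct HostProcessWin : HostProcessBase {
  int exit_code = 7;
};

int exit_code_of(HostProcessBase &process) {
  // Before: a C-style cast, which would also silently act as a
  // reinterpret_cast or strip const if the types did not line up:
  //   return ((HostProcessWin &)process).exit_code;
  // After: static_cast only compiles for related types and cannot drop const.
  // The caller still has to guarantee the dynamic type is HostProcessWin.
  return static_cast<HostProcessWin &>(process).exit_code;
}
```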

LLVM/project 5adc2d1 llvm/lib/Target/AMDGPU SIISelLowering.cpp AMDGPULegalizerInfo.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.fmax.ll llvm.amdgcn.reduce.fmin.ll

[AMDGPU] Support Wave Reduction intrinsics for half types

Supported Ops: `fmin`, `fmax`, `fadd`, `fsub`.
Delta File
+941 -264 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fmax.ll
+941 -264 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fmin.ll
+902 -160 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fsub.ll
+899 -160 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fadd.ll
+18 -5 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+15 -3 llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+3,716 -856 6 files

LLVM/project 38e73fc clang/include/clang/Basic arm_sve.td, clang/test/CodeGen/AArch64/sve2p3-intrinsics acle_sve2p3_luti6.c

fixup! Adjust definitions after ACLE updates from @rockdreamer
Delta File
+12 -12 clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2p3_luti6.c
+4 -4 clang/test/Sema/aarch64-sve2p3-intrinsics/acle_sve2p3_target.c
+1 -1 clang/include/clang/Basic/arm_sve.td
+17 -17 3 files

LLVM/project 71290ad clang/include/clang/Basic arm_sve.td, clang/test/Sema/aarch64-sme2p3-intrinsics acle_sme2p3_target_lane.c acle_sme2p3_target.c

fixup! Address more PR comments
Delta File
+15 -9 clang/test/Sema/aarch64-sve2p3-intrinsics/acle_sve2p3_target_lane.c
+0 -16 clang/test/Sema/aarch64-sme2p3-intrinsics/acle_sme2p3_target_lane.c
+5 -5 llvm/test/CodeGen/AArch64/sme2p3-intrinsics-luti6.ll
+3 -3 clang/test/Sema/aarch64-sme2p3-intrinsics/acle_sme2p3_target.c
+1 -4 llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+4 -0 clang/include/clang/Basic/arm_sve.td
+28 -37 1 files not shown
+28 -38 7 files

LLVM/project 72945f2 clang/include/clang/Basic arm_sve.td, clang/test/CodeGen/AArch64/sme2p3-intrinsics acle_sme2p3_luti6.c

fixup! Reformat classes to make more sense, and other CR updates
Delta File
+27 -27 llvm/lib/Target/AArch64/SVEInstrFormats.td
+45 -3 llvm/test/CodeGen/AArch64/sme2p3-intrinsics-luti6.ll
+22 -23 llvm/lib/Target/AArch64/SMEInstrFormats.td
+26 -18 llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+32 -8 clang/test/CodeGen/AArch64/sme2p3-intrinsics/acle_sme2p3_luti6.c
+2 -2 clang/include/clang/Basic/arm_sve.td
+154 -81 3 files not shown
+156 -86 9 files

LLVM/project 1b713ae clang/test/Sema/AArch64 arm_sve_streaming_only_sme_AND_sme2p3.c arm_sve_feature_dependent_sve_AND_sve2p3___sme_AND_LP_sve2p3_OR_sme2p3_RP.c

fixup! Run `clang/utils/aarch64_builtins_test_generator.py`
Delta File
+118 -0 clang/test/Sema/AArch64/arm_sve_streaming_only_sme_AND_sme2p3.c
+77 -0 clang/test/Sema/AArch64/arm_sve_feature_dependent_sve_AND_sve2p3___sme_AND_LP_sve2p3_OR_sme2p3_RP.c
+62 -0 clang/test/Sema/AArch64/arm_sve_non_streaming_only_sve_AND_sve2p3.c
+56 -0 clang/test/Sema/AArch64/arm_sme_streaming_only_sme_AND_sme2p3.c
+313 -0 4 files

LLVM/project 0050b3e llvm/lib/Target/AArch64 AArch64ISelDAGToDAG.cpp

fixup! Don't modify SelectMultiVectorLutiLane
Delta File
+38 -45 llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+38 -45 1 files

LLVM/project 50c38e8 clang/include/clang/Basic arm_sve.td, clang/test/CodeGen/AArch64/sme2p3-intrinsics acle_sme2p3_luti6.c

fixup! Adjust after ACLE changes to svluti6_lane_s16_x4
Delta File
+48 -4 clang/test/CodeGen/AArch64/sme2p3-intrinsics/acle_sme2p3_luti6.c
+15 -6 llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+12 -4 llvm/lib/Target/AArch64/SMEInstrFormats.td
+8 -8 clang/test/Sema/aarch64-sme2p3-intrinsics/acle_sme2p3_imm.c
+4 -0 llvm/include/llvm/IR/IntrinsicsAArch64.td
+2 -1 clang/include/clang/Basic/arm_sve.td
+89 -23 2 files not shown
+92 -24 8 files

LLVM/project 9b13d4e clang/test/Sema/aarch64-sve2p3-intrinsics acle_sve2p3_target_lane.c acle_sve2p3_imm.cpp, llvm/test/CodeGen/AArch64 sve2p3-intrinsics-luti6.ll

fixup! Add some more _bf16 tests
Delta File
+27 -0 clang/test/Sema/aarch64-sve2p3-intrinsics/acle_sve2p3_target_lane.c
+11 -0 llvm/test/CodeGen/AArch64/sve2p3-intrinsics-luti6.ll
+3 -0 clang/test/Sema/aarch64-sve2p3-intrinsics/acle_sve2p3_imm.cpp
+41 -0 3 files

LLVM/project 6af8c68 clang/include/clang/Basic arm_sve.td arm_sme.td, clang/test/CodeGen/AArch64/sme2p3-intrinsics acle_sme2p3_luti6.c

fixup! Adjust `def`s and split out tests
Delta File
+0 -158 clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2p3_luti6.c
+138 -0 clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2p3_luti6_lane_x2.c
+5 -5 clang/test/CodeGen/AArch64/sme2p3-intrinsics/acle_sme2p3_luti6.c
+0 -4 clang/include/clang/Basic/arm_sve.td
+1 -0 clang/include/clang/Basic/arm_sme.td
+144 -167 5 files

LLVM/project 2eff7b0 llvm/include/llvm/IR IntrinsicsAArch64.td, llvm/lib/Target/AArch64 SVEInstrFormats.td AArch64ISelDAGToDAG.cpp

fixup! Amend after PR comments
Delta File
+11 -8 llvm/include/llvm/IR/IntrinsicsAArch64.td
+4 -4 llvm/lib/Target/AArch64/SVEInstrFormats.td
+2 -3 llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+1 -1 llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+18 -16 4 files

LLVM/project e875707 clang/test/Sema/aarch64-sve2p3-intrinsics acle_sve2p3_target_lane.c acle_sve2p3_target.c, llvm/lib/Target/AArch64 AArch64InstrInfo.td

fixup! Move tests
Delta File
+0 -54 clang/test/Sema/aarch64-sve2p3-intrinsics/acle_sve2p3_target_lane.c
+38 -3 clang/test/Sema/aarch64-sve2p3-intrinsics/acle_sve2p3_target.c
+1 -0 llvm/lib/Target/AArch64/AArch64InstrInfo.td
+39 -57 3 files

LLVM/project ae19a18 clang/include/clang/Basic arm_sme.td arm_sve.td, clang/test/CodeGen/AArch64/sme2p3-intrinsics acle_sme2p3_luti6.c

fixup! Fix more PR comments
Delta File
+10 -13 llvm/include/llvm/IR/IntrinsicsAArch64.td
+4 -4 clang/test/CodeGen/AArch64/sme2p3-intrinsics/acle_sme2p3_luti6.c
+2 -2 clang/include/clang/Basic/arm_sme.td
+1 -1 clang/include/clang/Basic/arm_sve.td
+17 -20 4 files