LLVM/project d94caea libc/shared/math ffmal.h, libc/src/__support/math ffmal.h CMakeLists.txt

[libc][math] Refactor ffmal to Header Only. (#179069)

closes #175326

Part of #147386
Delta File
+28 -0 libc/src/__support/math/ffmal.h
+25 -0 libc/shared/math/ffmal.h
+11 -1 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+10 -0 libc/src/__support/math/CMakeLists.txt
+2 -4 libc/src/math/generic/ffmal.cpp
+1 -1 libc/src/math/generic/CMakeLists.txt
+77 -6 3 files not shown
+80 -6 9 files

LLVM/project 15832a6 libc/shared/math sqrtf128.h, libc/src/__support/math sqrtf128.h CMakeLists.txt

[libc][math] Refactor sqrtf128 to header only (#177760)

Closes #177652 
Delta File
+454 -0 libc/src/__support/math/sqrtf128.h
+2 -422 libc/src/math/generic/sqrtf128.cpp
+29 -0 libc/shared/math/sqrtf128.h
+16 -1 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+15 -0 libc/src/__support/math/CMakeLists.txt
+1 -7 libc/src/math/generic/CMakeLists.txt
+517 -430 3 files not shown
+521 -430 9 files

LLVM/project e890821 llvm/lib/Transforms/Scalar LoopStrengthReduce.cpp, llvm/test/Transforms/LoopStrengthReduce/X86 debuginfo-scev-salvage-ptrtoaddr.ll

[LSR] Support SCEVPtrToAddr in SCEVDbgValueBuilder.

Allow SCEVPtrToAddr as cast in assertion in SCEVDbgValueBuilder.
SCEVPtrToAddr is handled similarly to SCEVPtrToInt.

Fixes a crash with debug info after bd40d1de9c9ee, which started to
generate ptrtoaddr instead of ptrtoint expressions.
Delta File
+61 -0 llvm/test/Transforms/LoopStrengthReduce/X86/debuginfo-scev-salvage-ptrtoaddr.ll
+2 -1 llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
+63 -1 2 files

LLVM/project f6219e8 mlir/include/mlir/Interfaces ExecutionProgressOpInterface.td ExecutionProgressOpInterface.h, mlir/lib/Dialect/SCF/IR SCF.cpp

[mlir][Interfaces] Add `ExecutionProgressOpInterface` + folding pattern (#179039)

Add the `ExecutionProgressOpInterface` with an interface method to check
if an operation "must progress". Add `mustProgress` attributes to
`scf.for` and `scf.while` (default value is "true").

`mustProgress` corresponds to the [`llvm.loop.mustprogress`
metadata](https://llvm.org/docs/LangRef.html#langref-llvm-loop-mustprogress).

Also add a canonicalization pattern to erase `RegionBranchOpInterface`
ops that must progress but loop infinitely (and are non-side-effecting).
This canonicalization pattern is enabled for `scf.for` and `scf.while`.

RFC: https://discourse.llvm.org/t/infinite-loops-and-dead-code/89530

[mlir] Fix build after #179039 (#179180)

Fix build after #179039.
Delta File
+73 -30 mlir/lib/Interfaces/ControlFlowInterfaces.cpp
+51 -0 mlir/test/Dialect/SCF/canonicalize.mlir
+45 -3 mlir/lib/Dialect/SCF/IR/SCF.cpp
+48 -0 mlir/include/mlir/Interfaces/ExecutionProgressOpInterface.td
+39 -0 mlir/lib/Dialect/UB/IR/UBOps.cpp
+29 -0 mlir/include/mlir/Interfaces/ExecutionProgressOpInterface.h
+285 -33 13 files not shown
+368 -44 19 files

LLVM/project 7054a4b llvm/include/llvm/Analysis ValueTracking.h, llvm/lib/Analysis ValueTracking.cpp

[ValueTracking] Propagate sign information out of loop (#175590)

LLVM converts a sqrt libcall to an intrinsic call if the argument is
known to be in range (greater than or equal to 0.0). In this case the
compiler was not able to deduce the non-negativity on its own. Extend
ValueTracking to understand such loops.

Fixes llvm/llvm-project#174813
Delta File
+220 -0 llvm/test/Transforms/AggressiveInstCombine/X86/pr175590.ll
+90 -0 llvm/lib/Analysis/ValueTracking.cpp
+18 -0 llvm/include/llvm/Analysis/ValueTracking.h
+328 -0 3 files

LLVM/project 0fd4ad2 clang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/CodeGen CIRGenAtomic.cpp

[CIR] Scoped atomic exchange (#173781)

This patch adds support for scoped atomic exchange operations in CIR.
Delta File
+23 -23 clang/test/CIR/CodeGen/atomic.c
+32 -0 clang/test/CIR/CodeGen/atomic-scoped.c
+12 -12 clang/test/CIR/IR/atomic.cir
+7 -4 clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+5 -2 clang/lib/CIR/CodeGen/CIRGenAtomic.cpp
+4 -1 clang/include/clang/CIR/Dialect/IR/CIROps.td
+83 -42 6 files

LLVM/project dc1cb33 libcxx/modules/std atomic.inc

guard modules
Delta File
+2 -0 libcxx/modules/std/atomic.inc
+2 -0 1 files

LLVM/project c381180 mlir/include/mlir/Dialect/AMDGPU/IR AMDGPUOps.td, mlir/lib/Dialect/AMDGPU/IR AMDGPUOps.cpp

[mlir][AMDGPU] Avoid verifier crash in DPPOp on vector operand types (#178887)

### what's the problem
mlir-opt could crash while verifying amdgpu.dpp when its operands had
vector types, such as ARM SME tile vectors produced by arm_sme.get_tile.
The crash occurred during IR verification, before any lowering or passes
ran.

### why it happens
DPPOp::verify() called Type::getIntOrFloatBitWidth() on the operand
type. When the operand was a VectorType, this hit an assertion because
only scalar integer and float types have a bitwidth.

### what's the fix
Query the bitwidth on the element type using getElementTypeOrSelf()
instead of

    [5 lines not shown]
Delta File
+13 -2 mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPUOps.td
+8 -0 mlir/test/Dialect/AMDGPU/invalid.mlir
+7 -0 mlir/test/Dialect/AMDGPU/ops.mlir
+0 -6 mlir/lib/Dialect/AMDGPU/IR/AMDGPUOps.cpp
+28 -8 4 files

LLVM/project 9029744 libcxx/test/std/re/re.alg/re.alg.search grep.pass.cpp

[libcxx] Add missing header to re/re.alg/re.alg.search/grep.pass.cpp (#180024)

This test can't be compiled with GCC without this fix.
Delta File
+2 -1 libcxx/test/std/re/re.alg/re.alg.search/grep.pass.cpp
+2 -1 1 files

LLVM/project ee7d2b7 clang/test/DebugInfo/CXX ptrauth-member-function-pointer-debuglocs.cpp

Address review comments
Delta File
+11 -3 clang/test/DebugInfo/CXX/ptrauth-member-function-pointer-debuglocs.cpp
+11 -3 1 files

LLVM/project 46257ef libcxx/include/__atomic atomic_ref.h

remove unused include
Delta File
+0 -1 libcxx/include/__atomic/atomic_ref.h
+0 -1 1 files

LLVM/project 5eeeeea libcxx/include/__atomic/support gcc.h

fix gcc
Delta File
+4 -4 libcxx/include/__atomic/support/gcc.h
+4 -4 1 files

LLVM/project 9e8caa7 llvm/lib/CodeGen/SelectionDAG SelectionDAGBuilder.cpp, llvm/test/CodeGen/X86 selectiondag-dbgvalue-null-crash.ll

[SelectionDAG] Fix null pointer dereference in resolveDanglingDebugInfo (#174341)

## Summary
Fix null pointer dereference in
`SelectionDAGBuilder::resolveDanglingDebugInfo`.

## Problem
`Val.getNode()->getIROrder()` is called before checking if
`Val.getNode()` is null, causing crashes when compiling code with debug
info that contains aggregate constants with nested empty structs.

## Solution
Move the `ValSDNodeOrder` declaration inside the `if (Val.getNode())`
block.
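
A minimal sketch of the guard pattern described above, using hypothetical stand-in types. `ToyNode`, `ToyValue`, and `orderOr` are illustrative names only and are not the real `SDNode`/`SDValue` API; the actual change is the one-line move in `SelectionDAGBuilder.cpp`.

```cpp
#include <cassert>

// Hypothetical stand-in for a DAG node; not the real SDNode class.
struct ToyNode {
  unsigned Order;
  unsigned getIROrder() const { return Order; }
};

// Hypothetical stand-in for a value that may have no node attached.
struct ToyValue {
  ToyNode *Node;
  ToyNode *getNode() const { return Node; }
};

// Guarded form: the order is read only after the null check succeeds,
// mirroring the idea of declaring ValSDNodeOrder inside the if block.
unsigned orderOr(const ToyValue &Val, unsigned Fallback) {
  if (ToyNode *N = Val.getNode()) {
    unsigned ValSDNodeOrder = N->getIROrder();
    return ValSDNodeOrder;
  }
  return Fallback; // no node: nothing is dereferenced
}
```

Because the order is computed inside the guarded branch, a value without a node is skipped rather than dereferenced.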

## Test Case
Reproduces with aggregate types containing nested empty structs:
```llvm
%3 = insertvalue { { i1, {} }, ptr, { { {} }, { {} } }, i64 } 

    [47 lines not shown]
Delta File
+46 -0 llvm/test/CodeGen/X86/selectiondag-dbgvalue-null-crash.ll
+1 -1 llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+47 -1 2 files

LLVM/project a64208d libcxx/include/__atomic atomic_ref.h, libcxx/include/__atomic/support gcc.h

fix gcc
Delta File
+12 -2 libcxx/include/__atomic/atomic_ref.h
+4 -4 libcxx/include/__atomic/support/gcc.h
+16 -6 2 files

LLVM/project 8d20783 llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine add2.ll rem-mul-shl.ll

[InstCombine] Shrink added constant using LHS known zeros (#174380)

Previously, `SimplifyDemandedUseBits` for `add` instructions only
used known zeros from the RHS to simplify the LHS. It failed to
handle the symmetric case where the LHS has known zeros and the
result does not demand the low bits.

This patch implements this missing optimization, allowing the RHS
constant to be shrunk when the LHS low bits are known zero and unused.
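
One way to read the arithmetic behind this, as a sketch rather than the InstCombine code itself: if the low `k` bits of `x` are zero, adding `c` cannot carry out of those low bits, so clearing the low `k` bits of `c` leaves every higher bit of the sum unchanged. The helper names `shrinkConstant` and `highBitsAgree` below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Clear the low k bits of the constant. Hypothetical helper, not LLVM code.
uint32_t shrinkConstant(uint32_t c, unsigned k) {
  assert(k < 32 && "shift amount must be smaller than the bit width");
  uint32_t lowMask = (uint32_t{1} << k) - 1;
  return c & ~lowMask;
}

// Returns true when x + c and x + shrinkConstant(c, k) agree on every bit
// above the low k, for each 16-bit pattern placed above the low k bits of x.
bool highBitsAgree(uint32_t c, unsigned k) {
  uint32_t shrunk = shrinkConstant(c, k);
  for (uint32_t hi = 0; hi < (uint32_t{1} << 16); ++hi) {
    uint32_t x = hi << k; // low k bits of x are zero by construction
    if (((x + c) >> k) != ((x + shrunk) >> k))
      return false;
  }
  return true;
}
```

For example, with `k = 3` the constant `0x17` can be replaced by `0x10` whenever only bits 3 and above of the sum are used.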

Proof: https://alive2.llvm.org/ce/z/6v9iFY
Fixed: https://github.com/llvm/llvm-project/issues/135411
Delta File
+75 -0 llvm/test/Transforms/InstCombine/add2.ll
+6 -0 llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+3 -3 llvm/test/Transforms/InstCombine/rem-mul-shl.ll
+2 -2 llvm/test/Transforms/InstCombine/add.ll
+86 -5 4 files

LLVM/project 906a95e libcxx/include/__atomic atomic_ref.h

fix gcc atomic_ref
Delta File
+3 -2 libcxx/include/__atomic/atomic_ref.h
+3 -2 1 files

LLVM/project 806ad88 compiler-rt/lib/ubsan CMakeLists.txt

[compiler-rt] [ubsan] Fix missing include directory (#180341)

Fixes missing `-I` path that broke standalone builds in #179011. Matches
`include_directories()` in other compiler-rt libraries.

Signed-off-by: Michał Górny <mgorny at gentoo.org>
Delta File
+1 -0 compiler-rt/lib/ubsan/CMakeLists.txt
+1 -0 1 files

LLVM/project fb39fd5 libcxx/include/__atomic/support gcc.h

fix gcc build
Delta File
+24 -4 libcxx/include/__atomic/support/gcc.h
+24 -4 1 files

LLVM/project a14bc2f mlir/lib/AsmParser AttributeParser.cpp, mlir/lib/IR AsmPrinter.cpp

simplify parser
Delta File
+27 -36 mlir/lib/AsmParser/AttributeParser.cpp
+47 -1 mlir/test/IR/dense-elements-type-interface.mlir
+1 -1 mlir/lib/IR/AsmPrinter.cpp
+75 -38 3 files

LLVM/project b342b40 mlir/lib/AsmParser AttributeParser.cpp, mlir/test/IR dense-elements-type-interface.mlir

simplify parser
Delta File
+27 -36 mlir/lib/AsmParser/AttributeParser.cpp
+47 -1 mlir/test/IR/dense-elements-type-interface.mlir
+74 -37 2 files

LLVM/project 7637618 libcxx/test/benchmarks stop_token.bench.cpp

[libc++] Reduce the number of runs on the stop_token benchmarks (#179914)

Testing a bunch of sizes has relatively little value. This reduces the
number of benchmarks so we can run them on a regular basis.

Fixes #179697
Delta File
+4 -4 libcxx/test/benchmarks/stop_token.bench.cpp
+4 -4 1 files

LLVM/project 3463c3f mlir/lib/AsmParser AttributeParser.cpp, mlir/test/IR dense-elements-type-interface.mlir

simplify parser
Delta File
+27 -40 mlir/lib/AsmParser/AttributeParser.cpp
+47 -1 mlir/test/IR/dense-elements-type-interface.mlir
+74 -41 2 files

LLVM/project 269fda1 llvm/test/CodeGen/AMDGPU mad-mix.ll mad-mix-bf16.ll, llvm/test/CodeGen/AMDGPU/GlobalISel fdiv.f16.ll

[AMDGPU] Fix pattern selecting fmul to v_fma_mix_f32 (#180210)

This needs to use an addend of -0.0 to get the correct result when the
result should be -0.0.
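
The signed-zero behavior can be sketched with the standard `std::fma`, assuming IEEE 754 arithmetic in the default round-to-nearest mode and no fast-math flags. The function names are hypothetical and this is host C++, not the AMDGPU selection pattern: a product of -0.0 plus +0.0 rounds to +0.0, while adding -0.0 preserves the sign.

```cpp
#include <cassert>
#include <cmath>

// Multiply written as an FMA with a +0.0 addend.
// A -0.0 product becomes +0.0, because (-0.0) + (+0.0) is +0.0.
float mulViaFmaPlusZero(float a, float b) { return std::fma(a, b, 0.0f); }

// Multiply written as an FMA with a -0.0 addend.
// A -0.0 product stays -0.0, and a +0.0 product stays +0.0.
float mulViaFmaMinusZero(float a, float b) { return std::fma(a, b, -0.0f); }
```

With inputs `-1.0f` and `0.0f`, only the second form returns a value whose sign bit is set, matching a plain multiply.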
Delta File
+74 -74 llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f16.ll
+52 -52 llvm/test/CodeGen/AMDGPU/mad-mix.ll
+21 -21 llvm/test/CodeGen/AMDGPU/mad-mix-bf16.ll
+16 -16 llvm/test/CodeGen/AMDGPU/frem.ll
+5 -5 llvm/test/CodeGen/AMDGPU/bf16.ll
+4 -4 llvm/test/CodeGen/AMDGPU/fdiv.f16.ll
+172 -172 1 files not shown
+174 -175 7 files

LLVM/project 6c6fb00 llvm/lib/Target/AMDGPU SIShrinkInstructions.cpp, llvm/test/CodeGen/AMDGPU amdgcn.bitcast.512bit.ll min.ll

[AMDGPU] Optimize S_OR_B32 to S_ADDK_I32 where possible (#177949)

This PR fixes #177753 by converting disjoint S_OR_B32 to S_ADDK_I32
whenever possible. It avoids this transformation in case S_OR_B32 can be
converted to bitset.
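
The identity that makes such a rewrite sound can be sketched as follows: when two operands share no set bits, OR and ADD produce the same result because no bit position generates a carry. The helper names `isDisjoint` and `orAsAdd` are hypothetical, and the operand and immediate-range constraints of the real instructions are not modeled here.

```cpp
#include <cassert>
#include <cstdint>

// Two operands are disjoint when no bit is set in both.
bool isDisjoint(uint32_t a, uint32_t b) { return (a & b) == 0; }

// For disjoint operands, a + b equals a | b since no position carries.
uint32_t orAsAdd(uint32_t a, uint32_t b) {
  assert(isDisjoint(a, b) && "rewrite is only valid for disjoint operands");
  return a + b;
}
```

When the operands overlap, for instance `0x3` and `0x1`, the two operations differ, which is why the disjointness check is required.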

Note on Test Failures (Draft Status): This change causes significant
register reshuffling across the test suite due to the new allocation
hints and the swaps performed in case src0 is not a register and src1,
along with the change from or to addk. To avoid a massive, noisy diff
during the initial logic review:

This Draft PR only includes a representative sample of updated tests.
CodeGen/AMDGPU/combine-reg-or-const.ll -> Showcases change from S_OR to
S_ADDK
CodeGen/AMDGPU/s-barrier.ll -> Showcases swap between Src0 and Src1 if
src0 is not a register

The rest of the tests show the result of the register allocation hint we

    [3 lines not shown]
Delta File
+578 -590 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+33 -33 llvm/test/CodeGen/AMDGPU/GlobalISel/fshl.ll
+25 -25 llvm/test/CodeGen/AMDGPU/GlobalISel/fshr.ll
+17 -17 llvm/test/CodeGen/AMDGPU/min.ll
+16 -15 llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
+31 -0 llvm/test/CodeGen/AMDGPU/s_or_b32_transformation.ll
+700 -680 1 files not shown
+701 -681 7 files

LLVM/project ab79abb libcxx/include/__atomic atomic.h, libcxx/test/std/atomics/atomics.ref fetch_min.pass.cpp fetch_max.pass.cpp

implement atomic max/min
Delta File
+70 -0 libcxx/test/std/atomics/atomics.ref/fetch_min.pass.cpp
+70 -0 libcxx/test/std/atomics/atomics.ref/fetch_max.pass.cpp
+68 -0 libcxx/include/__atomic/atomic.h
+57 -0 libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.req/atomic_fetch_min_explicit.pass.cpp
+57 -0 libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.req/atomic_fetch_max_explicit.pass.cpp
+55 -0 libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.req/atomic_fetch_min.pass.cpp
+377 -0 14 files not shown
+541 -26 20 files

LLVM/project 4bffbc8 llvm/test/CodeGen/AMDGPU/GlobalISel regbankcombiner-copy-scc-vcc.mir ssubsat.ll

Add known bits, and adjust combine pattern.
Delta File
+275 -256 llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-copy-scc-vcc.mir
+21 -61 llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll
+28 -54 llvm/test/CodeGen/AMDGPU/GlobalISel/icmp.ll
+29 -47 llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-umed3.mir
+29 -47 llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-smed3.mir
+17 -58 llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
+399 -523 11 files not shown
+495 -715 17 files

LLVM/project 15c9c77 llvm/lib/Target/Mips MipsISelLowering.cpp, llvm/test/CodeGen/Mips musttail-disabled.ll

[MIPS] Do not silently ignore musttail (#178310)

Do not silently ignore musttail markings if UseMipsTailCalls is false.
Delta File
+16 -0 llvm/test/CodeGen/Mips/musttail-disabled.ll
+3 -0 llvm/lib/Target/Mips/MipsISelLowering.cpp
+19 -0 2 files

LLVM/project 19d6811 llvm/lib/Target/AMDGPU SIInstrInfo.h, llvm/test/TableGen RegClassByHwMode.td

Revert "[MC][TableGen] Expand Opcode field of MCInstrDesc" (#180321)

Reverts llvm/llvm-project#179652

This PR causes out-of-memory build failures on many Windows builders.
Delta File
+82 -82 llvm/lib/Target/AMDGPU/SIInstrInfo.h
+50 -50 llvm/test/TableGen/GlobalISelEmitter/GlobalISelEmitter.td
+37 -37 llvm/test/TableGen/RegClassByHwMode.td
+29 -29 llvm/test/TableGen/GlobalISelEmitter/DefaultOpsGlobalISel.td
+22 -22 llvm/test/TableGen/GlobalISelEmitter/Subreg.td
+14 -14 llvm/test/TableGen/GlobalISelEmitter/CustomPredicate.td
+234 -234 64 files not shown
+473 -480 70 files

LLVM/project f599f16 llvm/lib/Target/AMDGPU SIInstrInfo.h, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp

Revert "[MC][TableGen] Expand Opcode field of MCInstrDesc (#179652)"

This reverts commit 13d8870d455fafa734d29b1f3703386ef6e3b5f8.
Delta File
+82 -82 llvm/lib/Target/AMDGPU/SIInstrInfo.h
+50 -50 llvm/test/TableGen/GlobalISelEmitter/GlobalISelEmitter.td
+37 -37 llvm/test/TableGen/RegClassByHwMode.td
+29 -29 llvm/test/TableGen/GlobalISelEmitter/DefaultOpsGlobalISel.td
+22 -22 llvm/test/TableGen/GlobalISelEmitter/Subreg.td
+14 -14 llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+234 -234 64 files not shown
+473 -480 70 files

LLVM/project 060f325 mlir/lib/Dialect/Linalg/Transforms Vectorization.cpp, mlir/test/Dialect/Linalg/vectorization convolution-with-patterns.mlir

[mlir][Linalg] Promote lhs/rhs when vectorizing conv1D as outerproduct (#179883)

-- vector.outerproduct requires lhs/rhs to have the same element type as
   the result.
-- This commit adds a fix to promote lhs/rhs to the result's element
   type when vectorizing a conv1D slice to vector.outerproduct.
-- This is along similar lines to what happens when vectorizing a
   conv1D slice to vector.contract. The corresponding CHECK line was
   incorrect, and this commit fixes that too.

Signed-off-by: Abhishek Varma <abhvarma at amd.com>
Delta File
+57 -2 mlir/test/Dialect/Linalg/vectorization/convolution-with-patterns.mlir
+8 -2 mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+65 -4 2 files