LLVM/project 120f10d llvm/include/llvm/ADT GenericUniformityImpl.h, llvm/lib/Analysis UniformityAnalysis.cpp

review: address suggestions
Delta File
+21 -18 llvm/include/llvm/ADT/GenericUniformityImpl.h
+3 -3 llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/always-uniform-gmir.mir
+3 -3 llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/always-uniform.mir
+2 -1 llvm/lib/Analysis/UniformityAnalysis.cpp
+1 -1 llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/never-uniform.mir
+30 -26 5 files

LLVM/project 96ca63f mlir/test/Target/LLVMIR openmp-taskloop-bounds-cast.mlir

Fix new test again
Delta File
+2 -2 mlir/test/Target/LLVMIR/openmp-taskloop-bounds-cast.mlir
+2 -2 1 files

LLVM/project 4b4fbf1 llvm/lib/Target/AArch64 AArch64ConditionOptimizer.cpp

[NFC][AArch64] ConditionOptimizer: add CmpCondPair and tryOptimizePair (#187160)

Add CmpCondPair to bundle a compare/conditional instruction pair with
its condition code.

Update applyCmpAdjustment() to take CmpCondPair, and extract core
optimization logic into tryOptimizePair() to be used in both intra- and
cross-block paths.
Delta File
+128 -107 llvm/lib/Target/AArch64/AArch64ConditionOptimizer.cpp
+128 -107 1 files

LLVM/project 2a60a9d flang/test/Lower/OpenMP taskloop.f90, mlir/include/mlir/Dialect/OpenMP OpenMPOps.td

[mlir][OpenMP] Move taskloop clauses to the context op

The clauses are implemented when lowering the context op (which
generates the runtime calls and handles the outlining of the task
function, including privatization etc.). Therefore I thought it made more
sense to put the clauses on this operation rather than on the wrapped
loop.

RFC: https://discourse.llvm.org/t/rfc-openmp-alloca-placement-for-openmp-loop-wrappers/89512/7

Patch 2/3
Delta File
+64 -64 mlir/test/Dialect/OpenMP/ops.mlir
+64 -56 mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+46 -46 mlir/test/Dialect/OpenMP/invalid.mlir
+44 -36 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+20 -20 flang/test/Lower/OpenMP/taskloop.f90
+16 -19 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+254 -241 30 files not shown
+367 -355 36 files

LLVM/project cf6589d flang/lib/Lower/OpenMP OpenMP.cpp, mlir/lib/Dialect/OpenMP/IR OpenMPDialect.cpp

Address review comments: mark unused param and move var decl

- Mark the unused 'clauses' parameter in TaskloopOp::build with
  [[maybe_unused]]
- Move the declaration of 'wrapperClauseOps' in genStandaloneTaskloop
  to immediately before its first use

Assisted-by: Copilot, Claude Sonnet 4.6
Delta File
+1 -1 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+1 -1 flang/lib/Lower/OpenMP/OpenMP.cpp
+2 -2 2 files

LLVM/project 91b34bc mlir/test/Target/LLVMIR openmp-taskloop-bounds-cast.mlir

Fix test added in later commit

It seems the testing bots cherry-pick the branch onto current main
instead of testing the branch as it is.
Delta File
+7 -4 mlir/test/Target/LLVMIR/openmp-taskloop-bounds-cast.mlir
+7 -4 1 files

LLVM/project 0cefe63 lldb/source/Plugins/Process/FreeBSD-Kernel-Core ProcessFreeBSDKernelCore.cpp

[lldb][Process/FreeBSDKernelCore] Rework plugin destruction (#188426)

Destroy the plugin class in the same way as `ProcessElfCore`, another process
plugin derived from the `PostMortemProcess` class. After clearing the thread
list, invoke `Finalize()` to clean up resources properly. `Finalize()`
calls `DoDestroy()`, which releases `m_kvm` via `kvm_close()`.

---------

Signed-off-by: Minsoo Choo <minsoochoo0122 at proton.me>
Delta File
+14 -3 lldb/source/Plugins/Process/FreeBSD-Kernel-Core/ProcessFreeBSDKernelCore.cpp
+14 -3 1 files

LLVM/project d24c347 mlir/lib/Dialect/OpenMP/IR OpenMPDialect.cpp, mlir/test/Dialect/OpenMP invalid.mlir

Improve error message

Make it clear that the requirement is for direct nesting.
Delta File
+3 -3 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+2 -2 mlir/test/Dialect/OpenMP/invalid.mlir
+5 -5 2 files

LLVM/project 1463d8a mlir/include/mlir/Dialect/OpenMP OpenMPOps.td, mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp

[mlir][OpenMP] Separate OutlinableInterface from taskloop LoopWrapper

Separate taskloop context and loop lowering into different operations.
This allows us to have separate operations representing the outlinable
interface and the loop wrapper interface so that there is somewhere
better than the loop body to put task-local allocations:

```
omp.taskloop.context {
  llvm.alloca ...
  omp.taskloop {
    omp.loop_nest ... {
      ...
    }
  }
  omp.terminator
}
```


    [11 lines not shown]
Delta File
+225 -150 mlir/test/Dialect/OpenMP/ops.mlir
+221 -139 mlir/test/Dialect/OpenMP/invalid.mlir
+66 -48 mlir/test/Target/LLVMIR/openmp-taskloop-collapse.mlir
+58 -37 mlir/test/Target/LLVMIR/openmp-taskloop-cancel.mlir
+66 -11 mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+55 -10 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+691 -395 30 files not shown
+1,196 -668 36 files

LLVM/project 7e57184 flang/lib/Lower/OpenMP OpenMP.cpp, mlir/lib/Dialect/OpenMP/IR OpenMPDialect.cpp

[mlir][OpenMP] Rename taskLoopOp/taskloopOp variables to taskLoopWrapperOp/taskloopWrapperOp

Rename local variables for clarity to better reflect the type they hold.

Assisted-by: Copilot, Claude Sonnet 4.6
Delta File
+2 -2 flang/lib/Lower/OpenMP/OpenMP.cpp
+2 -2 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+4 -4 2 files

LLVM/project 64e75b1 lldb/docs/use troubleshooting.rst, lldb/source/Plugins/ExpressionParser/Clang ClangExpressionSourceCode.cpp

[lldb] Correct spelling of "language" (#188456)
Delta File
+2 -2 lldb/docs/use/troubleshooting.rst
+1 -1 lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionSourceCode.cpp
+3 -3 2 files

LLVM/project 5999c53 libc/include stdlib-malloc.yaml

[libc] Declare free_sized and free_aligned_sized in stdlib.h / malloc.h (#188364)
Delta File
+15 -0 libc/include/stdlib-malloc.yaml
+15 -0 1 files

LLVM/project 2f38a8f llvm/lib/Target/AMDGPU GCNVOPDUtils.cpp VOP3PInstructions.td, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp

AMDGPU: Codegen for v_dual_dot2acc_f32_f16/bf16 from VOP3 (#179226)

For V_DOT2_F32_F16 and V_DOT2_F32_BF16, add their VOPDName and mark
them with usesCustomInserter, which is used to add pre-RA register
allocation hints that prefer assigning dst and src2 to the same physical
register. When the hint is satisfied, canMapVOP3PToVOPD recognises the
instruction as eligible for VOPD pairing by checking whether it is VOP2-like:
dst==src2, no source modifiers, no clamp, and src1 is a register.
Mark both instructions as commutable to allow a literal in src1 to be
moved to src0, since VOPD only permits a literal in src0.
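
The VOP2-like conditions listed above can be sketched as a single predicate.
This is only an illustration: `Dot2Operands` and `isVOP2Like` are hypothetical
names, and the real check in canMapVOP3PToVOPD works on machine instructions
rather than on a plain struct.

```cpp
// Hypothetical, simplified view of the operands of a VOP3P dot2 instruction.
struct Dot2Operands {
  unsigned DstReg;      // physical register assigned to dst
  unsigned Src2Reg;     // physical register assigned to src2
  bool Src1IsReg;       // true if src1 is a register (not a literal)
  bool HasSrcModifiers; // true if any source modifier is set
  bool HasClamp;        // true if the clamp bit is set
};

// Returns true when all four conditions from the commit message hold:
// dst==src2, no source modifiers, no clamp, and src1 is a register.
inline bool isVOP2Like(const Dot2Operands &Ops) {
  return Ops.DstReg == Ops.Src2Reg && !Ops.HasSrcModifiers && !Ops.HasClamp &&
         Ops.Src1IsReg;
}
```

In this sketch an instruction whose register hint was not honoured (dst and
src2 differ) is simply rejected, which mirrors the "when the hint is
satisfied" wording above.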
Delta File
+258 -592 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fdot2.ll
+75 -93 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fdot2.f32.bf16.ll
+32 -1 llvm/lib/Target/AMDGPU/GCNVOPDUtils.cpp
+8 -5 llvm/lib/Target/AMDGPU/VOP3PInstructions.td
+8 -0 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+6 -0 llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+387 -691 1 files not shown
+389 -693 7 files

LLVM/project 1d2f14f compiler-rt/lib/scudo/standalone/tests common_test.cpp combined_test.cpp

[scudo] Use portable TEST_SKIP macro (#188045)

Which expands to ZXTEST_SKIP on Fuchsia.
Delta File
+2 -2 compiler-rt/lib/scudo/standalone/tests/common_test.cpp
+1 -1 compiler-rt/lib/scudo/standalone/tests/combined_test.cpp
+3 -3 2 files

LLVM/project 5f49ce5 llvm/lib/Target/ARM ARMTargetTransformInfo.cpp ARMTargetTransformInfo.h, llvm/test/Transforms/LoopVectorize/ARM mve-reg-pressure-spills.ll

[ARM] Consider register pressure when vectorizing with MVE (#188053)

MVE only has 8 vector registers, so it's not too hard for the vectorizer
to end up using more than that, resulting in enough spilling that the
code is worse than not vectorizing. Enable
shouldConsiderVectorizationRegPressure for targets with MVE so the
vectorizer doesn't vectorize in those cases.
Delta File
+7 -0 llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
+2 -0 llvm/lib/Target/ARM/ARMTargetTransformInfo.h
+1 -0 llvm/test/Transforms/LoopVectorize/ARM/mve-reg-pressure-spills.ll
+10 -0 3 files

LLVM/project 315afd6 flang/test/Lower/OpenMP taskloop.f90, mlir/lib/Dialect/OpenMP/IR OpenMPDialect.cpp

[mlir][OpenMP] Rename TaskloopOp/omp.taskloop to TaskloopWrapperOp/omp.taskloop.wrapper

Rename the loop wrapper operation to better distinguish it from the
context op (omp.taskloop.context), which handles outlining and runtime calls.
The new name makes the role of each operation clearer at a glance.

RFC: https://discourse.llvm.org/t/rfc-openmp-alloca-placement-for-openmp-loop-wrappers/89512/7

Patch 3/3

Assisted-by: Copilot, Claude Sonnet 4.6
Delta File
+37 -37 mlir/test/Dialect/OpenMP/ops.mlir
+21 -21 mlir/test/Dialect/OpenMP/invalid.mlir
+15 -14 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+13 -12 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+10 -10 flang/test/Lower/OpenMP/taskloop.f90
+9 -9 mlir/test/Target/LLVMIR/openmp-taskloop-cancel.mlir
+105 -103 31 files not shown
+176 -174 37 files

LLVM/project 7a89602 flang/lib/Lower/OpenMP OpenMP.cpp, mlir/lib/Dialect/OpenMP/IR OpenMPDialect.cpp

Address review comments: mark unused param and move var decl

- Mark the unused 'clauses' parameter in TaskloopOp::build with
  [[maybe_unused]]
- Move the declaration of 'wrapperClauseOps' in genStandaloneTaskloop
  to immediately before its first use

Assisted-by: Copilot, Claude Sonnet 4.6
Delta File
+1 -1 flang/lib/Lower/OpenMP/OpenMP.cpp
+1 -1 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+2 -2 2 files

LLVM/project 197b709 flang/test/Lower/OpenMP taskloop.f90, mlir/include/mlir/Dialect/OpenMP OpenMPOps.td

[mlir][OpenMP] Move taskloop clauses to the context op

The clauses are implemented when lowering the context op (which
generates the runtime calls and handles the outlining of the task
function, including privatization etc.). Therefore I thought it made more
sense to put the clauses on this operation rather than on the wrapped
loop.

RFC: https://discourse.llvm.org/t/rfc-openmp-alloca-placement-for-openmp-loop-wrappers/89512/7

Patch 2/3
Delta File
+64 -64 mlir/test/Dialect/OpenMP/ops.mlir
+64 -56 mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+46 -46 mlir/test/Dialect/OpenMP/invalid.mlir
+44 -36 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+20 -20 flang/test/Lower/OpenMP/taskloop.f90
+16 -19 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+254 -241 30 files not shown
+367 -355 36 files

LLVM/project 306b777 mlir/include/mlir/Dialect/Arith/IR ArithOps.td, mlir/lib/Conversion/ArithToLLVM ArithToLLVM.cpp

[mlir][arith] Add rounding mode flags to binary arithmetic operations
Delta File
+76 -14 mlir/include/mlir/Dialect/Arith/IR/ArithOps.td
+75 -0 mlir/test/Dialect/Arith/canonicalize.mlir
+61 -0 mlir/test/Conversion/ArithToLLVM/arith-to-llvm.mlir
+40 -15 mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp
+28 -8 mlir/lib/Dialect/Arith/IR/ArithOps.cpp
+9 -9 mlir/lib/Dialect/Math/Transforms/ExpandOps.cpp
+289 -46 5 files not shown
+325 -61 11 files

LLVM/project 5d67e4d mlir/lib/Dialect/OpenMP/IR OpenMPDialect.cpp, mlir/test/Dialect/OpenMP invalid.mlir

Improve error message

Make it clear that the requirement is for direct nesting.
Delta File
+3 -3 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+2 -2 mlir/test/Dialect/OpenMP/invalid.mlir
+5 -5 2 files

LLVM/project 0effa7c libclc/clc/lib/generic/math clc_asinpi.inc clc_asinpi.cl

libclc: Update asinpi (#188454)

This was originally ported from rocm device libs in
eea0997566cad3be13df897a06dfda74cbd684b9. Update for more recent
changes.
Delta File
+109 -108 libclc/clc/lib/generic/math/clc_asinpi.inc
+3 -4 libclc/clc/lib/generic/math/clc_asinpi.cl
+112 -112 2 files

LLVM/project 56714ad clang/test/OpenMP target_teams_distribute_parallel_for_simd_schedule_codegen.cpp teams_distribute_parallel_for_simd_schedule_codegen.cpp, libc/AOR_v20.02/math/test/traces sincosf.txt exp.txt

Merge branch 'main' into users/ssahasra/asyncmark-gfx1250
Delta File
+0 -31,999 libc/AOR_v20.02/math/test/traces/sincosf.txt
+0 -16,000 libc/AOR_v20.02/math/test/traces/exp.txt
+6,835 -6,798 llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
+6,432 -6,562 llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-7.ll
+5,294 -4,814 clang/test/OpenMP/target_teams_distribute_parallel_for_simd_schedule_codegen.cpp
+5,238 -4,758 clang/test/OpenMP/teams_distribute_parallel_for_simd_schedule_codegen.cpp
+23,799 -70,931 8,045 files not shown
+543,123 -369,657 8,051 files

LLVM/project 89431a3 llvm/lib/Analysis LazyValueInfo.cpp

[LVI] Use block numbers (#188270)

Store the cache as a vector indexed by block numbers instead of a map,
which results in a small compile-time improvement.
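
The idea of replacing a map with a vector indexed by block number can be
sketched as follows. This is a minimal illustration under assumptions:
`Block` and `BlockCache` are hypothetical names, not the actual
LazyValueInfo data structures, and blocks are assumed to carry small dense
numbers.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a basic block that carries a dense number.
struct Block {
  unsigned Number;
};

// Hypothetical cache with one slot per block number, so a lookup is a
// bounds check plus an index instead of a hash or tree search.
template <typename T> class BlockCache {
  std::vector<T> Values;
  std::vector<bool> Present;

public:
  // Store a value for the block, growing both vectors on demand.
  void set(const Block &B, T Value) {
    if (B.Number >= Values.size()) {
      Values.resize(B.Number + 1);
      Present.resize(B.Number + 1, false);
    }
    assert(Values.size() == Present.size() && "vectors must stay in sync");
    Values[B.Number] = std::move(Value);
    Present[B.Number] = true;
  }

  // Return the cached value, or nullptr if nothing is cached for the block.
  const T *lookup(const Block &B) const {
    if (B.Number >= Values.size() || !Present[B.Number])
      return nullptr;
    return &Values[B.Number];
  }
};
```

The trade-off in this sketch is memory proportional to the highest block
number seen, in exchange for cheaper lookups.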
Delta File
+48 -32 llvm/lib/Analysis/LazyValueInfo.cpp
+48 -32 1 files

LLVM/project f64cd66 libclc/clc/lib/generic/math clc_acospi.inc clc_acospi.cl

libclc: Update acospi

This was originally ported from rocm device libs in
084124a8fab6fd71d49ac4928d17c3ef8b350ead. Merge in more
recent changes.
Delta File
+94 -111 libclc/clc/lib/generic/math/clc_acospi.inc
+2 -3 libclc/clc/lib/generic/math/clc_acospi.cl
+96 -114 2 files

LLVM/project 7588660 libclc/clc/lib/generic/math clc_asinpi.inc clc_asinpi.cl

libclc: Update asinpi

This was originally ported from rocm device libs in
eea0997566cad3be13df897a06dfda74cbd684b9. Update for more recent
changes.
Delta File
+109 -108 libclc/clc/lib/generic/math/clc_asinpi.inc
+3 -4 libclc/clc/lib/generic/math/clc_asinpi.cl
+112 -112 2 files

LLVM/project 065a39b llvm/lib/Transforms/Vectorize VPlanTransforms.cpp

[VPlan] Tighten SafeAVL matching in convertEVLExitCond. NFC (#179164)

Follow-up from
https://github.com/llvm/llvm-project/pull/178181#discussion_r2743630145
Delta File
+6 -3 llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+6 -3 1 files

LLVM/project d21e1a3 llvm/include/llvm/Analysis VecFuncs.def, llvm/include/llvm/IR RuntimeLibcalls.td

[LIBM][AMDLIBM] - New vector calls for cdfnorm and round scalar calls (#187232)

In amdlibm, new vector calls

cdfnorm
amd_vrd2_cdfnorm
amd_vrd4_cdfnorm
amd_vrd8_cdfnorm

round
amd_vrs16_roundf
amd_vrs8_roundf
amd_vrs4_roundf
amd_vrd8_round
amd_vrd4_round
amd_vrd2_round

Link to aocl repo -
[aocl-libm-ose](https://github.com/amd/aocl-libm-ose)
Delta File
+141 -5 llvm/test/Transforms/LoopVectorize/X86/amdlibm-calls.ll
+18 -0 llvm/include/llvm/Analysis/VecFuncs.def
+11 -0 llvm/include/llvm/IR/RuntimeLibcalls.td
+170 -5 3 files

LLVM/project 86c1510 llvm/lib/Transforms/Vectorize VPlanRecipes.cpp, llvm/test/Transforms/LoopVectorize/AArch64 binop-costs.ll predicated-costs.ll

[VPlan] Remove isVector guard in getCostForRecipeWithOpcode. (#188126)

The legacy cost model computes and passes RHSInfo both when widening and
replicating. Match behavior in VPlan-based cost model.

The added test shows that we now compute the same cost as the legacy
cost model.

Without this change, the test added in
llvm/test/Transforms/LoopVectorize/AArch64/predicated-costs.ll would
crash with https://github.com/llvm/llvm-project/pull/187056.

PR: https://github.com/llvm/llvm-project/pull/188126
Delta File
+43 -0 llvm/test/Transforms/LoopVectorize/AArch64/binop-costs.ll
+42 -0 llvm/test/Transforms/LoopVectorize/AArch64/predicated-costs.ll
+7 -12 llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+92 -12 3 files

LLVM/project 2e7dd44 clang/lib/StaticAnalyzer/Checkers CStringChecker.cpp, clang/test/Analysis bstring.cpp malloc.c

[analyzer] Untangle subcheckers of CStringChecker (#186802)

It turns out that some checks for cstring functions happened as a side
effect of other checks. For example, the check for whether the arguments to
memcpy were uninitialized happened during buffer overflow checking.

The way this was implemented is that if alpha.unix.cstring.OutOfBounds
was disabled, alpha.unix.cstring.UninitializedRead couldn't emit any
warnings. It turns out that major modeling steps are early-exited if a
certain checker is disabled!

This patch moved the early returns to the report emission parts --
modeling still happens, only the bug report construction is omitted.
This would mean that if we find a fatal error (like buffer overflow) we
_should_ stop analysis even if we don't emit a warning (that's a part of
doing modeling), but I decided against implementing that.
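
The pattern described above, where modeling always runs and only report
emission is gated on the checker being enabled, can be sketched roughly like
this. All names here (`EnabledCheckers`, `AccessFacts`, `modelAccess`) are
hypothetical and are not the CStringChecker API.

```cpp
#include <string>
#include <vector>

// Hypothetical flags for which subcheckers may emit reports.
struct EnabledCheckers {
  bool OutOfBounds;
  bool UninitializedRead;
};

// Hypothetical facts the modeling step has established about one access.
struct AccessFacts {
  bool InBounds;
  bool Initialized;
};

// Every modeling step runs regardless of the flags; the flags only decide
// whether a finding is turned into a report.
inline std::vector<std::string> modelAccess(const AccessFacts &Facts,
                                            const EnabledCheckers &Enabled) {
  std::vector<std::string> Reports;
  // Step 1: bounds modeling. A disabled checker suppresses only the report.
  if (!Facts.InBounds && Enabled.OutOfBounds)
    Reports.push_back("out-of-bounds");
  // Step 2: initialization modeling. It is reached even when the
  // out-of-bounds checker is disabled or has found a problem.
  if (!Facts.Initialized && Enabled.UninitializedRead)
    Reports.push_back("uninitialized-read");
  return Reports;
}
```

In this sketch, disabling the out-of-bounds flag no longer hides an
uninitialized-read finding, and analysis continues after an overflow, which
matches the decision stated above not to stop on fatal errors.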

One hurdle is that CStringChecker is a dependency of MallocChecker, and
the current tests rely on the CStringChecker _not_ terminating execution

    [9 lines not shown]
Delta File
+61 -23 clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+57 -5 clang/test/Analysis/bstring.cpp
+11 -0 clang/test/Analysis/malloc.c
+129 -28 3 files

LLVM/project 458d3a8 libc/src/sys/time/linux utimes.cpp

[libc] Fix unused variable warning in utimes.cpp (#188347) (#188448)

Moved the declaration of 'ret' inside the SYS_utimes block to prevent an
unused variable warning on the libc-riscv32-qemu-yocto-fullbuild-dbg
builder, which doesn't define SYS_utimes.
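
The fix described above amounts to declaring the variable inside the
conditionally compiled block that uses it. A minimal sketch of that pattern,
assuming a hypothetical `DEMO_HAS_SYS_UTIMES` macro and a hypothetical
`fakeSyscall` helper in place of the real syscall wrapper and the real
fallback path:

```cpp
// Hypothetical stand-in for a raw syscall wrapper; returns 0 on success.
inline int fakeSyscall(int Number) { return Number >= 0 ? 0 : -1; }

// Compile with -DDEMO_HAS_SYS_UTIMES=1 to mimic a target that provides the
// syscall; leave it undefined to mimic one that does not.
inline int setTimes() {
#ifdef DEMO_HAS_SYS_UTIMES
  // 'Ret' is declared only on the path that uses it, so targets without
  // the macro never see an unused variable.
  int Ret = fakeSyscall(DEMO_HAS_SYS_UTIMES);
  return Ret;
#else
  // Hypothetical fallback path that needs no 'Ret' at all.
  return fakeSyscall(0);
#endif
}
```

Had `Ret` been declared before the `#ifdef`, the build without the macro
would leave it unused and trigger the warning.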
Delta File
+1 -3 libc/src/sys/time/linux/utimes.cpp
+1 -3 1 files