LLVM/project 28e980f llvm/include/llvm/Transforms/Utils MemoryTaggingSupport.h, llvm/lib/Target/AArch64 AArch64StackTagging.cpp

[MTE][Darwin] This patch extends support for the stack frame history buffer to Darwin. (#178049)

Darwin reserves slot 231 for storing a pointer to the history ring
buffer. It also uses bits 60-62 to store the size of the ring buffer.
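
As a rough illustration of the bit layout described above, the following C++ sketch pulls a 3-bit field out of bits 60-62 of a 64-bit value. The helper names `sizeField` and `withoutSizeField` are hypothetical and are not taken from the patch. How the field maps to an actual buffer size is not stated in the commit message, so the sketch only extracts the raw bits, and it assumes the remaining bits hold the pointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; not the code added by the patch.
constexpr unsigned SizeShift = 60;  // lowest bit of the size field
constexpr uint64_t SizeMask = 0x7;  // three bits: 60, 61 and 62

// Raw value stored in bits 60-62 of the slot.
constexpr uint64_t sizeField(uint64_t Slot) {
  return (Slot >> SizeShift) & SizeMask;
}

// Slot value with the size field cleared (assumed to be the pointer part).
constexpr uint64_t withoutSizeField(uint64_t Slot) {
  return Slot & ~(SizeMask << SizeShift);
}
```

Bit 63 is left untouched by both helpers, since the message only mentions bits 60-62.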

rdar://168176496
Delta File
+25 -14 llvm/test/CodeGen/AArch64/stack-tagging-prologue.ll
+28 -2 llvm/lib/Transforms/Utils/MemoryTaggingSupport.cpp
+18 -6 llvm/lib/Target/AArch64/AArch64StackTagging.cpp
+2 -1 llvm/include/llvm/Transforms/Utils/MemoryTaggingSupport.h
+73 -23 4 files

LLVM/project 9d3d1dd llvm/lib/MC WasmObjectWriter.cpp

[MC][WebAssembly] Use `contains` over `count` for map membership. NFC (#178348)

Delta File
+23 -22 llvm/lib/MC/WasmObjectWriter.cpp
+23 -22 1 files

LLVM/project 2615005 llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp

[AMDGPU][SIInsertWaitcnts] Cleanup: Remove WaitEventMaskForInst member variable (#178030)

The event mask is constant and target-dependent, so it should be accessed
through the WCG object.
Delta File
+10 -12 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+10 -12 1 files

LLVM/project d6ca3d0 llvm/lib/Object WasmObjectFile.cpp

[WebAssembly] Dump more info when printing symbols. NFC (#178328)

Delta File
+12 -3 llvm/lib/Object/WasmObjectFile.cpp
+12 -3 1 files

LLVM/project 18925d1 libc/shared/math fsqrtl.h, libc/src/__support/math fsqrtl.h CMakeLists.txt

[libc] [math] Refactor fsqrtl to be header-only (#176169)

This PR refactors fsqrtl to be header-only, as discussed. No functional
change intended. Test and build files were updated as required by the
refactor.
Fixes #175335
Delta File
+26 -0 libc/src/__support/math/fsqrtl.h
+24 -0 libc/shared/math/fsqrtl.h
+9 -1 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+8 -0 libc/src/__support/math/CMakeLists.txt
+2 -6 libc/src/math/generic/fsqrtl.cpp
+1 -1 libc/src/math/generic/CMakeLists.txt
+70 -8 3 files not shown
+73 -8 9 files

LLVM/project 43f58a8 llvm/test lit.cfg.py, llvm/utils profcheck-xfail.txt

Exclude some target-specific tests from profcheck (#178500)

Delta File
+0 -16 llvm/utils/profcheck-xfail.txt
+4 -0 llvm/test/lit.cfg.py
+4 -16 2 files

LLVM/project 7075f38 llvm/lib/Transforms/Instrumentation MemProfUse.cpp, llvm/test/Transforms/PGOProfile data-access-profile.ll

[StaticDataLayout][MemProf] Annotate string literal hotness by making use of data access profiles. (#178333)

The change is gated under a new option
`memprof-annotate-string-literal-section-prefix` so we can flag-gate it
for rollout purposes.

A follow-up PR https://github.com/llvm/llvm-project/pull/178336/changes
updates the codegen pass to reconcile the hotness similar to the
reconciliation for other global variables.
Delta File
+67 -26 llvm/test/Transforms/PGOProfile/data-access-profile.ll
+37 -6 llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
+104 -32 2 files

LLVM/project 3f3190e llvm/include/llvm/Transforms/Utils LoopUtils.h

[NFC] update doc comment on `setLoopEstimatedTripCount` (#178091)

See [this
discussion](https://github.com/llvm/llvm-project/pull/174896#issuecomment-3802361713)
prompted by PR #174896.

A 0-0 encoding in branch weights is invalid (the probability of an edge
is computed as a fraction where the denominator is the sum of the
weights and the numerator is its - the edge's - weight). So BPI actually
handles it as 1-1, which then results in raising the BFI of the loop
body that's meant to be cold.

The aforementioned PR addressed this, but didn't update the doc comment.
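
To make the arithmetic concrete, here is a small C++ sketch of turning a pair of branch weights into edge probabilities, with the 0-0 pair treated as 1-1 as described above. The type `EdgeProbabilities` and the function `fromBranchWeights` are hypothetical names for illustration. This is not BranchProbabilityInfo's implementation.

```cpp
#include <cassert>
#include <cstdint>

// Probabilities of the two outgoing edges of a conditional branch.
struct EdgeProbabilities {
  double First;
  double Second;
};

// Illustrative only: each edge's probability is its weight divided by
// the sum of both weights.
inline EdgeProbabilities fromBranchWeights(uint32_t W0, uint32_t W1) {
  uint64_t Sum = static_cast<uint64_t>(W0) + W1;
  if (Sum == 0) {
    // 0/0 is undefined, so fall back to treating the pair as 1-1.
    W0 = W1 = 1;
    Sum = 2;
  }
  return {static_cast<double>(W0) / Sum, static_cast<double>(W1) / Sum};
}
```

With this fallback a 0-0 encoding yields an even split, which is why a loop body intended to be cold ends up with a higher frequency than expected.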
Delta File
+13 -1 llvm/include/llvm/Transforms/Utils/LoopUtils.h
+13 -1 1 files

LLVM/project cd4c9d2 mlir/include/mlir/Dialect/XeGPU/Transforms Transforms.h, mlir/lib/Dialect/XeGPU/Transforms XeGPUPropagateLayout.cpp

[mlir][xegpu] Add initial support for layout conflict handling. (#173090)

This PR adds initial support for layout conflict resolution in XeGPU.
Layout conflict occurs when some op's use point expects a different
layout than what the op can currently provide. This conflict needs to be
resolved by adding certain other xegpu ops.

Initially, we only handle conflicts at tensor desc use points.
Delta File
+175 -41 mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+81 -0 mlir/test/Dialect/XeGPU/resolve-layout-conflicts.mlir
+76 -0 mlir/test/lib/Dialect/XeGPU/TestXeGPUTransforms.cpp
+7 -0 mlir/include/mlir/Dialect/XeGPU/Transforms/Transforms.h
+1 -1 mlir/test/Dialect/XeGPU/propagate-layout-subgroup.mlir
+1 -1 mlir/test/Dialect/XeGPU/propagate-layout.mlir
+341 -43 1 files not shown
+342 -44 7 files

LLVM/project 9abd65d llvm/test lit.cfg.py, llvm/utils profcheck-xfail.txt

Exclude `RISCV` tests from profcheck
Delta File
+0 -14 llvm/utils/profcheck-xfail.txt
+2 -0 llvm/test/lit.cfg.py
+2 -14 2 files

LLVM/project 7cf11ed llvm/lib/CodeGen EarlyIfConversion.cpp, llvm/lib/Target/AArch64 AArch64InstrInfo.cpp

[EarlyIfConversion] Add analysis for data-dependent conditional branches (#174457)

Add infrastructure to identify conditional branches on values loaded from
memory. Such branches are likely to be harder to predict accurately since
branch history (probably) provides little useful information.

This analysis walks the def-use chain from the branch condition to find
loads that feed into it. Several cases are excluded from consideration:
- Loads from constant pools (predictable values)
- Dereferenceable invariant loads (loop-invariant)
- Branches with biased probability (null checks, etc.)
- Loads not "close in program time" to the branch (must be in the same
  basic block with no intervening calls)

The analysis is disabled by default behind -enable-early-ifcvt-data-dependent.
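
The exclusion rules listed above can be sketched as a simple predicate. The C++ below is a toy model under stated assumptions: `LoadFeedingBranch` and `isDataDependentBranch` are hypothetical names, and the boolean fields stand in for properties the real pass would derive from MachineInstrs while walking the def-use chain.

```cpp
#include <cassert>
#include <vector>

// Hypothetical summary of one load reached from the branch condition.
struct LoadFeedingBranch {
  bool FromConstantPool;          // value is predictable
  bool DereferenceableInvariant;  // loop-invariant load
  bool InBranchBlock;             // same basic block as the branch
  bool CallBeforeBranch;          // a call sits between load and branch
};

// True if at least one load survives every exclusion rule.
inline bool isDataDependentBranch(const std::vector<LoadFeedingBranch> &Loads,
                                  bool BranchIsBiased) {
  if (BranchIsBiased)
    return false;  // e.g. null checks
  for (const LoadFeedingBranch &L : Loads) {
    if (L.FromConstantPool || L.DereferenceableInvariant)
      continue;
    if (!L.InBranchBlock || L.CallBeforeBranch)
      continue;  // not "close in program time"
    return true;
  }
  return false;
}
```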
Delta File
+983 -0 llvm/test/CodeGen/AArch64/early-ifcvt-load-to-cond-br.mir
+159 -5 llvm/lib/CodeGen/EarlyIfConversion.cpp
+65 -37 llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+45 -0 llvm/test/CodeGen/AArch64/early-ifcvt-remarks.mir
+1,252 -42 4 files

LLVM/project a8913a2 libc/shared/math logf16.h, libc/src/__support/math logf16.h CMakeLists.txt

[libc][math] Refactor logf16 to header-only shared math (#175408)

## Summary

Following the discussion in the RFC [1], this PR refactors logf16 into a
header-only shared math function.

[1]
https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450

## Implementation 

- Moved the core logic and lookup tables from `generic/logf16.cpp` to
`__support/math/logf16.h`
- Updated `generic/logf16.cpp` to include the new header and call
`internal::logf16`
- Updated `CMakeLists.txt` and `BUILD.bazel` to reflect the dependency
changes and new header library

Fixes: https://github.com/llvm/llvm-project/issues/175367
Delta File
+180 -0 libc/src/__support/math/logf16.h
+2 -148 libc/src/math/generic/logf16.cpp
+28 -0 libc/shared/math/logf16.h
+22 -1 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+21 -0 libc/src/__support/math/CMakeLists.txt
+1 -11 libc/src/math/generic/CMakeLists.txt
+254 -160 3 files not shown
+257 -160 9 files

LLVM/project 07ee61d llvm/lib/Transforms/Utils LoopUtils.cpp

[LoopUnroll] Fix unused variable warning (#178490)

Fixes 362c39d36dd87c5659b0caa3115dfa67f592cdf6.
Delta File
+2 -2 llvm/lib/Transforms/Utils/LoopUtils.cpp
+2 -2 1 files

LLVM/project df0c6f4 llvm/include/llvm/Frontend/OpenMP ConstructDecompositionT.h

[OpenMP] Rename some data members in ConstructDecompositionT for clarity, NFC (#178475)

Delta File
+129 -128 llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h
+129 -128 1 files

LLVM/project 024b8ac libc/shared/math llogbf128.h, libc/src/__support/math llogbf128.h CMakeLists.txt

[libc][math] Refactor llogbf128 to header-only (#175617)

Delta File
+34 -0 libc/src/__support/math/llogbf128.h
+29 -0 libc/shared/math/llogbf128.h
+16 -1 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+10 -0 libc/src/__support/math/CMakeLists.txt
+2 -6 libc/src/math/generic/llogbf128.cpp
+1 -2 libc/src/math/generic/CMakeLists.txt
+92 -9 3 files not shown
+96 -9 9 files

LLVM/project 2bd2e13 clang/lib/CodeGen CGObjCMac.cpp, clang/test/CodeGenObjC expose-direct-method-same-name.m expose-direct-method-visibility-linkage.m

Do not allow same-name functions
Delta File
+0 -103 clang/test/CodeGenObjC/expose-direct-method-same-name.m
+45 -0 clang/test/CodeGenObjC/expose-direct-method-visibility-linkage.m
+8 -8 clang/test/CodeGenObjC/expose-direct-method.m
+2 -3 clang/lib/CodeGen/CGObjCMac.cpp
+55 -114 4 files

LLVM/project 4f7f908 llvm/lib/Transforms/Instrumentation MemProfUse.cpp, llvm/test/Transforms/PGOProfile data-access-profile.ll

resolve comments
Delta File
+10 -31 llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
+2 -8 llvm/test/Transforms/PGOProfile/data-access-profile.ll
+12 -39 2 files

LLVM/project 5343841 llvm/include/llvm/CodeGen SDPatternMatch.h

[SDPatternMatch][NFC] Use empty SDNodeFlags instead of std::optional (#178483)

I think we can avoid using std::optional for SDNodeFlags in
UnaryOpc_match.

NFC.
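
One way to see why an optional wrapper can be unnecessary: if required flags are checked as a subset of the node's flags, an empty required set already means "no requirement". The C++ sketch below uses a plain bitmask as a stand-in for SDNodeFlags. The names `FlagBits`, `FlagA`, `FlagB` and `hasRequiredFlags` are hypothetical, and whether UnaryOpc_match compares flags exactly this way is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bitmask stand-in for a flags class such as SDNodeFlags.
using FlagBits = uint32_t;
constexpr FlagBits NoFlags = 0;
constexpr FlagBits FlagA = 1u << 0;
constexpr FlagBits FlagB = 1u << 1;

// A node matches when it carries every required flag. With an empty
// Required set the test is trivially true, so no separate
// "flags not specified" state is needed.
constexpr bool hasRequiredFlags(FlagBits NodeFlags, FlagBits Required) {
  return (NodeFlags & Required) == Required;
}
```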
Delta File
+3 -5 llvm/include/llvm/CodeGen/SDPatternMatch.h
+3 -5 1 files

LLVM/project e5d8396 llvm/test/CodeGen/AMDGPU isel-amdgcn-cs-chain-intrinsic-w32.ll isel-amdgcn-cs-chain-intrinsic-w64.ll

[AMDGPU] Introduce V_READANYLANE_B32

This is a non-convergent pseudo suitable for uniform inputs.
The MachineInstr::NoConvergent attribute allows hoisting
which is otherwise prohibited for a convergent instruction.
Delta File
+160 -160 llvm/test/CodeGen/AMDGPU/isel-amdgcn-cs-chain-intrinsic-w32.ll
+100 -100 llvm/test/CodeGen/AMDGPU/isel-amdgcn-cs-chain-intrinsic-w64.ll
+48 -48 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.make.buffer.rsrc.ll
+30 -30 llvm/test/CodeGen/AMDGPU/isel-amdgpu-cs-chain-intrinsic-dyn-vgpr-w32.ll
+33 -0 llvm/test/CodeGen/AMDGPU/readanylane.ll
+10 -10 llvm/test/CodeGen/AMDGPU/dag-preserve-disjoint-flag.ll
+381 -348 5 files not shown
+418 -357 11 files

LLVM/project b88d49e flang/lib/Optimizer/Transforms/CUDA CUFComputeSharedMemoryOffsetsAndSize.cpp, flang/test/Fir/CUDA cuda-shared-offset.mlir

[flang][cuda] Do not initialize shared variable (#178489)

Delta File
+6 -6 flang/test/Fir/CUDA/cuda-shared-offset.mlir
+0 -7 flang/lib/Optimizer/Transforms/CUDA/CUFComputeSharedMemoryOffsetsAndSize.cpp
+6 -13 2 files

LLVM/project 6f79891 clang/lib/Sema SemaDeclAttr.cpp, clang/test/Sema attr-modular-format.c

[clang] Check that first modular_format argument is an identifier (#178322)

This fixes an oversight discovered in #147431.
Delta File
+5 -0 clang/lib/Sema/SemaDeclAttr.cpp
+2 -0 clang/test/Sema/attr-modular-format.c
+7 -0 2 files

LLVM/project f0bf838 clang/lib/Sema SemaCoroutine.cpp, clang/test/CodeGenCoroutines coro-await-elidable.cpp

[Clang] Fix coro_await_elidable breaking with parenthesized expressions

The applySafeElideContext function used IgnoreImplicit() to find the
underlying CallExpr, but this didn't strip ParenExpr nodes. When code
like `co_await (fn(leaf()))` was parsed, the operand was wrapped in a
ParenExpr, causing HALO (Heap Allocation eLision Optimization) to fail.

This fix chains IgnoreImplicit()->IgnoreParens()->IgnoreImplicit() to
handle both orderings of implicit nodes and parentheses in the AST.

Fixes the issue where adding parentheses around co_await's argument
would prevent heap elision for coro_await_elidable coroutines, which
is particularly problematic since parentheses are often required in
real-world code due to co_await's tight binding with operators.
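
The effect of the chained stripping can be shown on a toy expression tree. The C++ below is a sketch only: `Expr`, `NodeKind` and the free functions are hypothetical stand-ins for Clang's AST classes and its `IgnoreImplicit()` / `IgnoreParens()` member functions, which operate on far richer node types.

```cpp
#include <cassert>

// Toy AST: a call, or a wrapper (parentheses or implicit node) around Sub.
enum class NodeKind { Call, Paren, Implicit };
struct Expr {
  NodeKind Kind;
  const Expr *Sub;  // wrapped expression, or nullptr for a call
};

// Strip any run of implicit wrapper nodes.
inline const Expr *ignoreImplicit(const Expr *E) {
  while (E->Kind == NodeKind::Implicit)
    E = E->Sub;
  return E;
}

// Strip any run of parenthesis nodes.
inline const Expr *ignoreParens(const Expr *E) {
  while (E->Kind == NodeKind::Paren)
    E = E->Sub;
  return E;
}

// Implicit -> parens -> implicit, so either nesting order is handled.
inline const Expr *findCall(const Expr *E) {
  E = ignoreImplicit(ignoreParens(ignoreImplicit(E)));
  return E->Kind == NodeKind::Call ? E : nullptr;
}
```

In this model, stripping implicit nodes alone stops at the parenthesis node, which mirrors the failure described above.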
Delta File
+50 -0 clang/test/CodeGenCoroutines/coro-await-elidable.cpp
+5 -1 clang/lib/Sema/SemaCoroutine.cpp
+55 -1 2 files

LLVM/project e01a880 mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp, mlir/lib/Dialect/Linalg/Transforms Vectorization.cpp

Merge branch 'main' into users/mtrofin/01-26-_nfc_update_doc_comment_on_setloopestimatedtripcount_
Delta File
+27 -27 mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+26 -26 mlir/lib/Target/Cpp/TranslateToCpp.cpp
+21 -21 mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+17 -17 mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+17 -17 mlir/lib/IR/AsmPrinter.cpp
+10 -11 mlir/lib/Dialect/SPIRV/IR/SPIRVTypes.cpp
+118 -119 44 files not shown
+280 -281 50 files

LLVM/project 59e4479 mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp, mlir/lib/Dialect/Linalg/Transforms Vectorization.cpp

[mlir] Fix new clang-tidy warning llvm-type-switch-case-types. NFC. (#178487)

Pre-committing this before landing the new check in
https://github.com/llvm/llvm-project/pull/177892
Delta File
+27 -27 mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+26 -26 mlir/lib/Target/Cpp/TranslateToCpp.cpp
+21 -21 mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+17 -17 mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+17 -17 mlir/lib/IR/AsmPrinter.cpp
+10 -11 mlir/lib/Dialect/SPIRV/IR/SPIRVTypes.cpp
+118 -119 42 files not shown
+267 -277 48 files

LLVM/project 1c7cf3a llvm/include/llvm/Transforms/Utils LoopUtils.h

[NFC] update doc comment on `setLoopEstimatedTripCount`
Delta File
+13 -1 llvm/include/llvm/Transforms/Utils/LoopUtils.h
+13 -1 1 files

LLVM/project ee28e8f llvm/test/CodeGen/X86 global-variable-partition-with-dap.ll

Add test coverage
Delta File
+56 -13 llvm/test/CodeGen/X86/global-variable-partition-with-dap.ll
+56 -13 1 files

LLVM/project 37e9381 mlir/lib/Dialect/Tosa/IR TosaOps.cpp, mlir/test/Dialect/Tosa ops.mlir

[mlir][tosa] Fix pad op verifier when padding is dynamic (#177622)

When padding is dynamic, the verifier should not return failure; it
shouldn't try to check the pad values.
Delta File
+11 -1 mlir/test/Dialect/Tosa/ops.mlir
+2 -3 mlir/lib/Dialect/Tosa/IR/TosaOps.cpp
+13 -4 2 files

LLVM/project 3327eea libcxx/test/benchmarks spec.gen.py

[libc++] Reduce the number of warnings when running SPEC (#160366)

We don't care about warnings in the SPEC suite, so just use -w to turn
them off.
Delta File
+1 -1 libcxx/test/benchmarks/spec.gen.py
+1 -1 1 files

LLVM/project 9aec188 llvm/include/llvm/CodeGen SDPatternMatch.h, llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

[DAG] SDPatternMatch - allow m_BinOp / m_c_BinOp to take an optional SDNodeFlags required for matching (#178435)

BinaryOpc_match is already wired up for this - but allow us to use
m_BinOp/m_c_BinOp with the required flags directly

Updated the foldShiftToAvg folds to make use of this
Delta File
+8 -12 llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+7 -5 llvm/include/llvm/CodeGen/SDPatternMatch.h
+8 -0 llvm/unittests/CodeGen/SelectionDAGPatternMatchTest.cpp
+23 -17 3 files

LLVM/project efe7562 llvm/lib/Transforms/InstCombine InstCombineCompares.cpp, llvm/test/Transforms/InstCombine abs-intrinsic.ll

[InstCombine] Add combines for unsigned comparison of absolute value to constant (#176148)

This patch implements the following two peephole optimisations:
1. `abs(X) u> K --> K >= 0 ? (X + K u> 2 * K) : false`;
2. If `abs(INT_MIN)` is `poison`, `abs(X) u< K --> K >= 1 ? (X + (K - 1) u<= 2 * (K - 1)) : K != 0`.

See the following Alive2 proofs:
[1](https://alive2.llvm.org/ce/z/J2SRSv) and
[2](https://alive2.llvm.org/ce/z/tfxTrU).
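
As an informal complement to the Alive2 proofs, the two rewrites can be brute-forced over every 8-bit value. The C++ sketch below is not the InstCombine code and all function names in it are hypothetical. It models `i8` with wrapping arithmetic through `uint8_t` and skips `X == INT8_MIN` for the second rewrite, where `abs` is assumed to be poison.

```cpp
#include <cassert>
#include <cstdint>

// abs on i8 viewed as unsigned: abs(-128) yields 128.
inline uint8_t absI8(int8_t X) {
  int V = X;
  return static_cast<uint8_t>(V < 0 ? -V : V);
}

// Rewrite 1: abs(X) u> K  -->  K >= 0 ? (X + K u> 2 * K) : false
inline bool foldedUGT(int8_t X, int8_t K) {
  if (K < 0)
    return false;
  // Operands promote to int; the casts wrap the results back to 8 bits.
  return static_cast<uint8_t>(X + K) > static_cast<uint8_t>(2 * K);
}

// Rewrite 2: abs(X) u< K  -->  K >= 1 ? (X + (K-1) u<= 2 * (K-1)) : K != 0
inline bool foldedULT(int8_t X, int8_t K) {
  if (K < 1)
    return K != 0;
  return static_cast<uint8_t>(X + (K - 1)) <=
         static_cast<uint8_t>(2 * (K - 1));
}

// Compare both rewrites against the original comparisons for all i8 pairs.
inline bool foldsHoldForAllI8() {
  for (int XI = -128; XI <= 127; ++XI) {
    for (int KI = -128; KI <= 127; ++KI) {
      int8_t X = static_cast<int8_t>(XI);
      int8_t K = static_cast<int8_t>(KI);
      uint8_t UK = static_cast<uint8_t>(K);
      if ((absI8(X) > UK) != foldedUGT(X, K))
        return false;
      if (XI == -128)
        continue;  // abs(INT_MIN) is poison for rewrite 2
      if ((absI8(X) < UK) != foldedULT(X, K))
        return false;
    }
  }
  return true;
}
```

An exhaustive 8-bit check like this only builds intuition; the linked proofs remain the authority for other bit widths.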
Delta File
+195 -3 llvm/test/Transforms/InstCombine/abs-intrinsic.ll
+26 -0 llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+221 -3 2 files