LLVM/project 4ea7488 mlir/lib/Tools/mlir-lsp-server MLIRServer.cpp

[MLIR] Apply clang-tidy fixes for readability-identifier-naming in MLIRServer.cpp (NFC)
Delta File
+9-9 mlir/lib/Tools/mlir-lsp-server/MLIRServer.cpp
+9-9 1 files

LLVM/project a9a05df clang/include/clang/CIR/Dialect/Builder CIRBaseBuilder.h, clang/lib/CIR/CodeGen CIRGenAtomic.cpp CIRGenBuilder.h

[CIR] Scoped atomic store (#171627)

This patch adds support for `__scoped_atomic_store` and
`__scoped_atomic_store_n`.
Delta File
+32-0 clang/test/CIR/CodeGen/atomic-scoped.c
+12-12 clang/test/CIR/CodeGen/atomic.c
+6-4 clang/lib/CIR/CodeGen/CIRGenAtomic.cpp
+5-2 clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+3-1 clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
+2-1 clang/lib/CIR/CodeGen/CIRGenBuilder.h
+60-20 2 files not shown
+63-20 8 files

LLVM/project 6b7b0ab mlir/include/mlir/Pass PassInstrumentation.h Pass.h, mlir/lib/Pass Pass.cpp

Enable pass instrumentation to signal failures. (#163126)

Enables adding instrumentation to the pass manager that can track/flag
invariants. This would be useful for cases where one has some tighter
requirements than the general dialects, or for a phase of conversion that
has tighter requirements than elsewhere.

It would also enable making verify just a regular instrumentation, I
believe, but that is a non-goal, as verify is a first-class concept and
the baseline for the ops and passes.

This would have covered some of the requirements raised in
https://discourse.llvm.org/t/pre-verification-logic-before-running-conversion-pass-in-mlir/88318/10
Delta File
+101-0 mlir/unittests/Pass/PassManagerTest.cpp
+23-13 mlir/lib/Pass/Pass.cpp
+4-0 mlir/include/mlir/Pass/PassInstrumentation.h
+4-0 mlir/include/mlir/Pass/Pass.h
+132-13 4 files

LLVM/project 0f2f9e1 mlir/lib/Dialect/Linalg/Transforms DropUnitDims.cpp

[MLIR] Apply clang-tidy fixes for performance-unnecessary-copy-initialization in DropUnitDims.cpp (NFC)
Delta File
+2-2 mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp
+2-2 1 files

LLVM/project de29f24 compiler-rt/test/sanitizer_common/TestCases/Posix popen.cpp

[sanitizer_common][test-only] Mark popen xfail on iossim (#171814)

rdar://166246774
Delta File
+2-0 compiler-rt/test/sanitizer_common/TestCases/Posix/popen.cpp
+2-0 1 files

LLVM/project 3558537 mlir/lib/Dialect/Tosa/IR TosaOps.cpp

[MLIR] Apply clang-tidy fixes for readability-identifier-naming in TosaOps.cpp (NFC)
Delta File
+65-66 mlir/lib/Dialect/Tosa/IR/TosaOps.cpp
+65-66 1 files

LLVM/project bb40d94 mlir/lib/Dialect/Linalg/Transforms ElementwiseOpFusion.cpp

[MLIR] Apply clang-tidy fixes for llvm-else-after-return in ElementwiseOpFusion.cpp (NFC)
Delta File
+1-2 mlir/lib/Dialect/Linalg/Transforms/ElementwiseOpFusion.cpp
+1-2 1 files

LLVM/project 7069db9 clang/lib/Driver/ToolChains HIPAMD.cpp HIPAMD.h, clang/test/Driver spirv-amd-toolchain.c

[clang][Driver] SPIRVAMDToolChain must not require device libs.

Prior to this change, the toolchain looked for device libs and failed.
This is fixed by not looking for device libs (for SPIR-V).
Delta File
+9-0 clang/lib/Driver/ToolChains/HIPAMD.cpp
+5-0 clang/lib/Driver/ToolChains/HIPAMD.h
+1-1 clang/test/Driver/spirv-amd-toolchain.c
+15-1 3 files

LLVM/project f548902 compiler-rt/test/sanitizer_common/TestCases/Posix dedup_token_length_test.cpp

[sanitizer_common][test-only] Remove xfail for darwin ubsan on dedup_token_length_test (#171812)

This test is currently XPASSing on the iossim CI.

rdar://166219043
Delta File
+0-1 compiler-rt/test/sanitizer_common/TestCases/Posix/dedup_token_length_test.cpp
+0-1 1 files

LLVM/project 44aa41c llvm/lib/Transforms/InstCombine InstCombineCalls.cpp, llvm/test/Transforms/InstCombine ldexp.ll fold-select-fmul-if-zero.ll

InstCombine: Fold ldexp with constant exponent to fmul

If we can represent this with an fmul, prefer it as the canonical
form. More optimizations understand fmul, and it allows contraction to
fma.
Delta File
+36-26 llvm/test/Transforms/InstCombine/ldexp.ll
+13-0 llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+2-8 llvm/test/Transforms/InstCombine/fold-select-fmul-if-zero.ll
+51-34 3 files
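
As a rough illustration of why ldexp with a constant exponent can be written as a multiply, the sketch below scales by an exactly representable power of two. The helper name `ldexpAsFmul` and the exponent range check are assumptions for illustration only; the exact conditions used by the InstCombine fold are not shown in this log.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not LLVM code): compute x * 2^exp with a plain
// multiply. This sketch only accepts exponents for which 2^exp is a
// normal float, so the scale factor is exact.
float ldexpAsFmul(float x, int exp) {
  assert(exp >= -126 && exp <= 127 && "2^exp must be a normal float");
  float scale = std::ldexp(1.0f, exp); // exactly 2^exp
  return x * scale;                    // the "fmul" form
}
```

Under IEEE-754 arithmetic both forms round the same exact value x * 2^exp, which is why the multiply can stand in for the ldexp call in this restricted range.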

LLVM/project 892f156 llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor nofpclass-fmul.ll nofpclass-nan-fmul.ll

ValueTracking: Teach computeKnownFPClass that multiply can avoid denormals

Multiplying by a large constant can be used to scale denormal inputs into
the normal range. This pattern appears frequently in math function library
implementations to make use of hardware instructions that do not support
denormals. We already handle this case for ldexp, but ldexp by a constant
is now canonicalized to an fmul.

The test cases are mostly the existing nofpclass tests for ldexp,
run through the new instcombine fold to replace ldexp with fmul.
Delta File
+10-10 llvm/test/Transforms/Attributor/nofpclass-fmul.ll
+18-0 llvm/lib/Analysis/ValueTracking.cpp
+1-1 llvm/test/Transforms/Attributor/nofpclass-nan-fmul.ll
+29-11 3 files
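
The denormal-scaling pattern described above can be sketched in plain C++. The scale factor 2^32 and the helper names are assumptions chosen for illustration; they are not taken from the patch, and real math libraries pick their own constants.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical scale factor 2^32 (an assumption for this sketch).
constexpr float kScale = 4294967296.0f;

// The smallest positive subnormal float, 2^-149.
float smallestSubnormal() { return std::numeric_limits<float>::denorm_min(); }

// True if x is a subnormal (denormal) float.
bool isSubnormal(float x) { return std::fpclassify(x) == FP_SUBNORMAL; }

// Multiply a finite input by the large constant. Any nonzero subnormal
// input is at least 2^-149, so the product is at least 2^-117, which is
// above the smallest normal float (2^-126).
float scaleIntoNormalRange(float x) {
  assert(std::isfinite(x) && "sketch only considers finite inputs");
  return x * kScale;
}
```

This assumes the default floating-point environment, with no flush-to-zero or denormals-are-zero mode enabled.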

LLVM/project 4be3df8 llvm/test/Transforms/Attributor nofpclass-fmul.ll

ValueTracking: Add baseline test for fmul denormal scaling handling (#171729)

Delta File
+288-0 llvm/test/Transforms/Attributor/nofpclass-fmul.ll
+288-0 1 files

LLVM/project ac2291d llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 setcc-wide-types.ll elementwise-store-of-scalar-splat.ll

[X86] Allow handling of i128/256/512 AND/OR/XOR bitlogic on the FPU (#171616)

If the scalar integer sources are freely transferable to the FPU, then
perform the bitlogic op as an SSE/AVX operation.

Uses the mayFoldIntoVector helper added in #171589.
Delta File
+481-562 llvm/test/CodeGen/X86/setcc-wide-types.ll
+194-64 llvm/test/CodeGen/X86/elementwise-store-of-scalar-splat.ll
+137-68 llvm/test/CodeGen/X86/ptest.ll
+60-33 llvm/test/CodeGen/X86/subvectorwise-store-of-vector-splat.ll
+8-22 llvm/test/CodeGen/X86/pr166744.ll
+29-0 llvm/lib/Target/X86/X86ISelLowering.cpp
+909-749 6 files

LLVM/project 9bcba9d llvm/test/CodeGen/X86 vector-compare-results.ll

[X86] vector-compare-results.ll - regenerate VPTERNLOG asm comments (#171803)

Delta File
+14-14 llvm/test/CodeGen/X86/vector-compare-results.ll
+14-14 1 files

LLVM/project 06aecdb mlir/lib/Dialect/SCF/IR SCF.cpp, mlir/test/Dialect/SCF invalid.mlir

[MLIR][SCF] Verify number of regions in scf.reduce (#171450)

This patch adds `ReduceOp::verifyRegions` to ensure that the number of
reduction regions equals the number of operands (`getReductions().size()
== getOperands().size()`).

Additionally, `ParallelOp::verify` is updated to gracefully handle cases
where the number of reduce operands differs from the initial values,
preventing verification logic crashes and relying on `ReduceOp` to
report structural inconsistencies.

Fixes: #118768
Delta File
+31-0 mlir/test/Dialect/SCF/invalid.mlir
+8-0 mlir/lib/Dialect/SCF/IR/SCF.cpp
+39-0 2 files

LLVM/project 9b6b52b llvm/lib/CodeGen/AsmPrinter AsmPrinter.cpp

[AsmPrinter][NFC] Reuse Target Triple variable (#171612)

Delta File
+11-11 llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+11-11 1 files

LLVM/project a9e62b3 llvm/lib/CodeGen AtomicExpandPass.cpp, llvm/test/CodeGen/ARM atomic-load-store.ll

[AtomicExpand] Add bitcasts when expanding load atomic vector

AtomicExpand fails for aligned `load atomic <n x T>` because it
does not find a compatible library call. This change adds the appropriate
bitcasts so that the call can be lowered. It also adds support for
128-bit lowering in tablegen to support SSE/AVX.
Delta File
+226-61 llvm/test/Transforms/AtomicExpand/X86/expand-atomic-non-integer.ll
+90-1 llvm/test/CodeGen/X86/atomic-load-store.ll
+51-0 llvm/test/CodeGen/ARM/atomic-load-store.ll
+15-4 llvm/lib/CodeGen/AtomicExpandPass.cpp
+382-66 4 files

LLVM/project c89d87a lldb/test/Shell/BuildScript toolchain-msvc.test

[lldb][test] Fix toolchain-msvc.test for native ARM64 MSVC environment (#171797)

This patch fixes toolchain-msvc.test on Windows ARM64 hosts running
in a native ARM64 environment set up via vcvarsarm64.bat. Our lab buildbot
recently switched from the cross vcvarsamd64_arm64.bat environment to the
native vcvarsarm64.bat one. This patch updates the FileCheck patterns to also
allow HostARM64 and arm64 PATH entries.

Changes:
- Extend the host regex to match HostARM64 (case-insensitive).
- Allow arm64 in the PATH tail.
- Apply the same fix in both the 32-bit and 64-bit sections.
Delta File
+6-6 lldb/test/Shell/BuildScript/toolchain-msvc.test
+6-6 1 files

LLVM/project d0767e9 compiler-rt/lib/orc elfnix_tls.systemz.S, compiler-rt/test/orc/TestCases/Linux/systemz trivial-tls.S

[JITLink] Add TLS support for SystemZ (#171559)

This patch adds TLS support for SystemZ on top of the orc-runtime support. The
orc-runtime support was split out into a separate PR, #171062, from the earlier
TLS support PR [#170706](https://github.com/llvm/llvm-project/pull/170706).

See conversations in
[#170706](https://github.com/llvm/llvm-project/pull/170706)

---------

Co-authored-by: anoopkg6 <anoopkg6 at github.com>
Delta File
+67-0 compiler-rt/test/orc/TestCases/Linux/systemz/trivial-tls.S
+65-1 llvm/lib/ExecutionEngine/JITLink/ELF_systemz.cpp
+42-0 compiler-rt/lib/orc/elfnix_tls.systemz.S
+15-0 llvm/include/llvm/ExecutionEngine/JITLink/systemz.h
+5-0 llvm/lib/ExecutionEngine/Orc/ELFNixPlatform.cpp
+2-0 llvm/lib/ExecutionEngine/JITLink/systemz.cpp
+196-1 1 files not shown
+197-1 7 files

LLVM/project 481ce81 llvm/lib/IR Instruction.cpp, llvm/test/Transforms/InstCombine issue64967-reassoc-fmul.ll 2006-10-26-VectorReassoc.ll

IR: Stop requiring nsz to reassociate fmul (#171726)

nsz can only change the behavior of the sign bit.
The sign bit of an fmul result can be computed as the xor of the
operand sign bits, and xor is associative. DAGCombiner already
reassociates a multiply by two constants without nsz.

Fixes #64967
Delta File
+6-12 llvm/test/Transforms/InstCombine/issue64967-reassoc-fmul.ll
+2-4 llvm/test/Transforms/InstCombine/2006-10-26-VectorReassoc.ll
+1-2 llvm/test/Transforms/InstCombine/fdiv.ll
+1-0 llvm/lib/IR/Instruction.cpp
+10-18 4 files
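
The sign-bit argument above can be checked with a small sketch. The helper names are hypothetical and this is ordinary C++ floating-point code, not the IR-level change itself; it only considers non-NaN inputs.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: the sign bit of a * b for non-NaN inputs is the
// xor of the operand sign bits, including when zeros are involved.
bool productSignBit(float a, float b) {
  assert(!std::isnan(a) && !std::isnan(b) && "sketch ignores NaN inputs");
  return std::signbit(a) != std::signbit(b); // xor of the two sign bits
}

// Because xor is associative, both groupings of a three-way product
// yield the same sign bit, even when the result is a signed zero.
bool groupingsAgreeOnSign(float a, float b, float c) {
  return std::signbit((a * b) * c) == std::signbit(a * (b * c));
}
```

Only the sign is compared here; reassociation can still change the rounded magnitude, which is what the reassoc flag already permits.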

LLVM/project 3aa4452 bolt/include/bolt/Core BinaryBasicBlock.h, bolt/test/AArch64 long-jmp-bti.s long-jmp-bti-ignored.s

[BOLT] Updates

- fix format
- add comments
Delta File
+11-11 bolt/test/AArch64/long-jmp-bti.s
+3-0 bolt/test/AArch64/long-jmp-bti-ignored.s
+1-1 bolt/include/bolt/Core/BinaryBasicBlock.h
+15-12 3 files

LLVM/project 504c108 bolt/lib/Passes LongJmp.cpp, bolt/lib/Rewrite GNUPropertyRewriter.cpp

[BOLT][BTI] Add needed BTIs in LongJmp or refuse to optimize binary

This patch adds BTI landing pads to ShortJmp/LongJmp targets in the
LongJmp pass when optimizing BTI binaries.

BOLT does not have the ability to add BTI to all types of functions.
This patch aims to insert the landing pad where possible, and emit an
error where it currently is not.

BOLT cannot insert BTIs into several function "types", including:
- ignored functions,
- PLT functions,
- other functions without a CFG.

Additional context:

In #161206, BOLT gained the ability to decode the .note.gnu.property
section, and warn about lack of BTI support for BOLT. However, this
warning is misleading: the emitted binary may not need extra BTI landing

    [3 lines not shown]
Delta File
+50-3 bolt/lib/Passes/LongJmp.cpp
+46-0 bolt/test/AArch64/long-jmp-bti.s
+35-0 bolt/test/AArch64/long-jmp-bti-ignored.s
+2-2 bolt/test/AArch64/no-bti-note.test
+2-2 bolt/test/AArch64/bti-note.test
+1-2 bolt/lib/Rewrite/GNUPropertyRewriter.cpp
+136-9 1 files not shown
+138-9 7 files

LLVM/project f49cf89 bolt/lib/Passes LongJmp.cpp

[BOLT] Fix param order
Delta File
+2-2 bolt/lib/Passes/LongJmp.cpp
+2-2 1 files

LLVM/project fbc121c bolt/include/bolt/Core MCPlusBuilder.h, bolt/lib/Target/AArch64 AArch64MCPlusBuilder.cpp

[BOLT][BTI] Add MCPlusBuilder::insertBTI (#167329)

This function contains most of the logic for BTI:
- It takes the BasicBlock and the instruction used to jump to it.
- It then checks whether the first non-pseudo instruction is a sufficient
landing pad for the call used.
- If not, it generates the correct BTI instruction.

Also introduce the isCallCoveredByBTI helper to simplify the logic.
Delta File
+116-0 bolt/unittests/Core/MCPlusBuilder.cpp
+75-0 bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+13-0 bolt/include/bolt/Core/MCPlusBuilder.h
+204-0 3 files

LLVM/project 8af88a4 mlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp

[MLIR][NVVM] Update PMEvent lowering to intrinsics (#171649)

This patch updates the lowering of `id`-based pmevent to use
intrinsics as well. The mask is simply (1 << event-id).

Signed-off-by: Durgadoss R <durgadossr at nvidia.com>
Delta File
+23-0 mlir/test/Target/LLVMIR/nvvm/pm_event.mlir
+21-0 mlir/test/Target/LLVMIR/nvvm/pm_event_invalid.mlir
+0-21 mlir/test/Target/LLVMIR/nvvmir-invalid.mlir
+10-10 mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+19-0 mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+0-15 mlir/test/Conversion/NVVMToLLVM/nvvm-to-llvm.mlir
+73-46 1 files not shown
+73-57 7 files
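
The id-to-mask mapping mentioned above, (1 << event-id), can be written out as a small sketch. The function name, the 0..15 id range, and the 16-bit mask width are assumptions made for this illustration; the log itself only states the shift.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: turn a single event id into a one-hot mask.
// Assumes ids in the range 0..15 and a 16-bit mask.
std::uint16_t pmEventMaskFromId(unsigned eventId) {
  assert(eventId < 16 && "assumed id range for this sketch");
  return static_cast<std::uint16_t>(1u << eventId);
}
```

With this mapping, an id-based event is just the mask-based form with exactly one bit set.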

LLVM/project f8d1f53 mlir/lib/Dialect/SCF/IR ValueBoundsOpInterfaceImpl.cpp, mlir/test/Dialect/SCF value-bounds-op-interface-impl.mlir

[mlir][scf] Add value bound for computed upper bound of forall loop (#171158)

Add an additional bound for the induction variable of scf.forall such
that:
%iv <= %lower_bound + (%trip_count - 1) * step

Same as https://github.com/llvm/llvm-project/pull/126426, but for the
scf.forall loop.
Delta File
+34-29 mlir/lib/Dialect/SCF/IR/ValueBoundsOpInterfaceImpl.cpp
+9-0 mlir/test/Dialect/SCF/value-bounds-op-interface-impl.mlir
+43-29 2 files
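
To make the bound %iv <= %lower_bound + (%trip_count - 1) * step concrete, here is a small arithmetic sketch for one loop dimension. The function names are hypothetical, and the ceiling-division trip count assumes a positive step and a half-open range [lb, ub); the MLIR implementation expresses this symbolically rather than with concrete integers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of iterations of a loop over [lb, ub)
// with a positive step, computed by ceiling division.
std::int64_t tripCount(std::int64_t lb, std::int64_t ub, std::int64_t step) {
  assert(step > 0 && "sketch assumes a positive step");
  if (ub <= lb)
    return 0;
  return (ub - lb + step - 1) / step;
}

// Largest value the induction variable takes:
// lb + (tripCount - 1) * step. Requires at least one iteration.
std::int64_t maxInductionValue(std::int64_t lb, std::int64_t ub,
                               std::int64_t step) {
  std::int64_t tc = tripCount(lb, ub, step);
  assert(tc > 0 && "bound is only meaningful for a non-empty loop");
  return lb + (tc - 1) * step;
}
```

For lb = 2, ub = 11, step = 4 the loop visits 2, 6, 10, so the computed bound of 10 is tighter than the generic bound ub - 1 = 10 only by coincidence here; with ub = 13 the visited values are 2, 6, 10 again and the computed bound stays at 10 while ub - 1 is 12.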

LLVM/project 0b17025 llvm/lib/Transforms/InstCombine InstCombineCalls.cpp, llvm/test/Transforms/InstCombine ldexp.ll fold-select-fmul-if-zero.ll

InstCombine: Fold ldexp with constant exponent to fmul

If we can represent this with an fmul, prefer it as the canonical
form. More optimizations understand fmul, and it allows contraction to
fma.
Delta File
+36-26 llvm/test/Transforms/InstCombine/ldexp.ll
+13-0 llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+2-8 llvm/test/Transforms/InstCombine/fold-select-fmul-if-zero.ll
+51-34 3 files

LLVM/project b552c70 llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor nofpclass-fmul.ll nofpclass-nan-fmul.ll

ValueTracking: Teach computeKnownFPClass that multiply can avoid denormals

Multiplying by a large constant can be used to scale denormal inputs into
the normal range. This pattern appears frequently in math function library
implementations to make use of hardware instructions that do not support
denormals. We already handle this case for ldexp, but ldexp by a constant
is now canonicalized to an fmul.

The test cases are mostly the existing nofpclass tests for ldexp,
run through the new instcombine fold to replace ldexp with fmul.
Delta File
+10-10 llvm/test/Transforms/Attributor/nofpclass-fmul.ll
+18-0 llvm/lib/Analysis/ValueTracking.cpp
+1-1 llvm/test/Transforms/Attributor/nofpclass-nan-fmul.ll
+29-11 3 files

LLVM/project ef8a54e llvm/test/Transforms/Attributor nofpclass-fmul.ll

ValueTracking: Add baseline test for fmul denormal scaling handling
Delta File
+288-0 llvm/test/Transforms/Attributor/nofpclass-fmul.ll
+288-0 1 files

LLVM/project 966cb03 llvm/lib/IR Instruction.cpp, llvm/test/Transforms/InstCombine issue64967-reassoc-fmul.ll 2006-10-26-VectorReassoc.ll

IR: Stop requiring nsz to reassociate fmul

nsz can only change the behavior of the sign bit.
The sign bit of an fmul result can be computed as the xor of the
operand sign bits, and xor is associative. DAGCombiner already
reassociates a multiply by two constants without nsz.

Fixes #64967
Delta File
+6-12 llvm/test/Transforms/InstCombine/issue64967-reassoc-fmul.ll
+2-4 llvm/test/Transforms/InstCombine/2006-10-26-VectorReassoc.ll
+1-2 llvm/test/Transforms/InstCombine/fdiv.ll
+1-0 llvm/lib/IR/Instruction.cpp
+10-18 4 files