LLVM/project 00f8ed1llvm/test/CodeGen/X86/GlobalISel prelegalizer-combiner-identity.mir prelegalizer-combiner-sub.mir

[X86][Gisel] add trivial arith tests for gisel x86-prelegalizer-combiner (#183544)
DeltaFile
+192-0llvm/test/CodeGen/X86/GlobalISel/prelegalizer-combiner-identity.mir
+108-0llvm/test/CodeGen/X86/GlobalISel/prelegalizer-combiner-sub.mir
+88-0llvm/test/CodeGen/X86/GlobalISel/prelegalizer-combiner-mul.mir
+41-0llvm/test/CodeGen/X86/GlobalISel/prelegalizer-combiner-div.mir
+41-0llvm/test/CodeGen/X86/GlobalISel/prelegalizer-combiner-rem.mir
+27-0llvm/test/CodeGen/X86/GlobalISel/prelegalizer-combiner-or.mir
+497-02 files not shown
+539-08 files

LLVM/project e3c287fllvm/lib/Target/AMDGPU SIInstrInfo.h, llvm/test/CodeGen/AMDGPU insert-skips-gfx1250.mir vgpr-set-msb-coissue.mir

[AMDGPU] Handle S_WAIT_XCNT in SIInstrInfo::isWaitcnt (#187726)

This affects the behavior of SIPreEmitPeephole and
AMDGPULowerVGPREncoding.
DeltaFile
+60-0llvm/test/CodeGen/AMDGPU/insert-skips-gfx1250.mir
+40-0llvm/test/CodeGen/AMDGPU/vgpr-set-msb-coissue.mir
+1-0llvm/lib/Target/AMDGPU/SIInstrInfo.h
+101-03 files

LLVM/project 94b222bllvm/include/llvm/CodeGen/GlobalISel LegalizerInfo.h

[GlobalISel] Add `widenScalarFor()` function (#187731)

The function is mentioned in `Legalizer.rst` but has been missing. This
also fixes the asymmetry between `narrowScalarXXX()`, which has both
`narrowScalarFor()` and `narrowScalarIf()`, and `widenScalarXXX()`, which
only had `widenScalarIf()`.
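
As a rough illustration of the `For`/`If` pairing described above, here is a toy model in which the list form is sugar over the predicate form. `ToyRuleSet`, `Bits`, and `apply` are hypothetical names invented for this sketch. It does not reproduce the signatures in `LegalizerInfo.h`, which are not shown in this log.

```cpp
#include <functional>
#include <initializer_list>
#include <utility>
#include <vector>

// Toy stand-in for a scalar type: just a bit width. This is NOT LLVM's LLT.
using Bits = unsigned;

// Hypothetical rule set, loosely modelled on the For/If naming in the commit.
class ToyRuleSet {
  struct Rule {
    std::function<bool(Bits)> matches; // does this rule apply to the type?
    std::function<Bits(Bits)> widen;   // which wider type to use instead
  };
  std::vector<Rule> rules;

public:
  // Predicate form: widen whenever `pred` accepts the type.
  ToyRuleSet &widenScalarIf(std::function<bool(Bits)> pred,
                            std::function<Bits(Bits)> mutation) {
    rules.push_back({std::move(pred), std::move(mutation)});
    return *this;
  }

  // List form: widen for an explicit set of types, built on the predicate form.
  ToyRuleSet &widenScalarFor(std::initializer_list<Bits> types,
                             std::function<Bits(Bits)> mutation) {
    std::vector<Bits> set(types);
    return widenScalarIf(
        [set](Bits t) {
          for (Bits s : set)
            if (s == t)
              return true;
          return false;
        },
        std::move(mutation));
  }

  // Returns the result of the first matching rule, or the input unchanged.
  Bits apply(Bits t) const {
    for (const Rule &r : rules)
      if (r.matches(t))
        return r.widen(t);
    return t;
  }
};
```

In this toy, a rule registered with `widenScalarFor({8, 16}, ...)` fires only for widths 8 and 16 and leaves every other width untouched.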
DeltaFile
+8-0llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
+8-01 files

LLVM/project 9fa53a8llvm/lib/Target/AArch64 AArch64ExpandPseudoInsts.cpp

[AArch64] Combine cases with the same code in `expandMOVImm` (NFC) (#187843)

Combine cases for `ORRWri`, `ORRXri`, `ANDXri` and `EORXri` in
`AArch64ExpandPseudoImpl::expandMOVImm`, because these cases are handled
with exactly the same code.
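
The refactoring pattern can be sketched as follows: several `case` labels fall through to a single body instead of repeating it. The `Opcode` enum and `usesSharedLogicalImmPath` are hypothetical and exist only for this illustration. The actual handler body in `expandMOVImm` is not shown in this log.

```cpp
// Hypothetical opcode subset for illustration; not the real AArch64 enum.
enum class Opcode { ORRWri, ORRXri, ANDXri, EORXri, MOVZWi };

// Returns true for the opcodes that share one handler in this sketch.
bool usesSharedLogicalImmPath(Opcode op) {
  switch (op) {
  case Opcode::ORRWri:
  case Opcode::ORRXri:
  case Opcode::ANDXri:
  case Opcode::EORXri:
    // One body serves all four labels instead of four identical copies.
    return true;
  default:
    return false;
  }
}
```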
DeltaFile
+2-19llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
+2-191 files

LLVM/project 9426fc1clang/lib/CodeGen/TargetBuiltins ARM.cpp, clang/lib/Sema SemaARM.cpp

[AArch64] Fix _sys implementation and MRS/MSR Sema checks (#187290)

This patch fixes the lowering of the _sys builtin, which used to lower
into an invalid MSR S1... instruction. The fix adds a new sys LLVM
intrinsic and lowers it properly into the sys instruction and its aliases.

I also fixed the Sema checks for the _sys, _ReadStatusRegister and
_WriteStatusRegister builtins so they correctly catch invalid
use cases.
DeltaFile
+126-0llvm/test/CodeGen/AArch64/aarch64-sys-intrinsic.ll
+21-12clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+10-16clang/test/CodeGen/arm64-microsoft-sys.c
+14-5llvm/lib/Target/AArch64/AArch64InstrFormats.td
+13-0clang/test/Sema/builtins-microsoft-arm64.c
+6-3clang/lib/Sema/SemaARM.cpp
+190-363 files not shown
+198-419 files

LLVM/project 2ab8924clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/std map unordered_map, clang-tools-extra/test/clang-tidy/checkers/modernize use-emplace.cpp

[clang-tidy][NFC] Use universal containers mock (#186669)

The changes are quite big, but most of them are just copy-pasting and
creating mocks.
DeltaFile
+12-306clang-tools-extra/test/clang-tidy/checkers/modernize/use-emplace.cpp
+127-0clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/std/map
+126-0clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/std/unordered_map
+121-0clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/std/set
+113-0clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/std/unordered_set
+82-0clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/std/functional
+581-30622 files not shown
+1,027-65928 files

LLVM/project befad79libclc/clc/lib/generic/math clc_remainder.cl clc_remainder.inc

libclc: Implement remainder with remquo (#187999)

This fixes conformance failures for double, and for float
without -cl-denorms-are-zero. The optimizer is
able to eliminate the unused quo handling, so most
of the code does not need to be duplicated.
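
The idea can be illustrated in standard C++ (not the libclc OpenCL source): `std::remquo` returns the same IEEE remainder as `std::remainder` and additionally writes low quotient bits, so a remainder function can call it and ignore the quotient. `remainderViaRemquo` is a hypothetical helper name used only for this sketch.

```cpp
#include <cmath>

// Hypothetical helper, not the libclc code: computes the IEEE remainder by
// calling remquo and discarding the quotient bits it also produces.
double remainderViaRemquo(double x, double y) {
  int quo = 0; // low bits of the rounded quotient; deliberately unused
  return std::remquo(x, y, &quo);
}
```

For example, `remainderViaRemquo(5.0, 3.0)` yields `-1.0`, because 5/3 rounds to the nearest integer 2 and 5 - 2*3 = -1. Since `quo` is never read, a compiler is free to drop the work that only feeds it.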
DeltaFile
+2-221libclc/clc/lib/generic/math/clc_remainder.cl
+13-0libclc/clc/lib/generic/math/clc_remainder.inc
+15-2212 files

LLVM/project 1a9fe17libclc/clc/include/clc/math remquo_decl.inc, libclc/clc/include/clc/shared binary_with_out_arg_scalarize.inc

libclc: Update remquo (#187998)

This was failing in the float case without -cl-denorms-are-zero
and failing for double. This now passes in all cases.

This was originally ported from rocm device libs in
8db45e4cf170cc6044a0afe7a0ed8876dcd9a863. This is mostly a port
of more recent changes, with a few modifications.

- Templatification, which almost but doesn't quite enable
  vectorization yet due to the outer branch and loop.

- Merging of the 3 types into one shared code path, instead of
  duplicating it per type with 3 different functions implemented together.
  There are only some slight differences for the half case, which mostly
  evaluates as float.

- Splitting out of the is_odd tracking, instead of deriving it from the
  accumulated quotient. This costs an extra register, but saves several

    [6 lines not shown]
DeltaFile
+13-260libclc/clc/lib/generic/math/clc_remquo.inc
+158-0libclc/clc/lib/generic/math/clc_remquo_stret.inc
+82-0libclc/clc/include/clc/shared/binary_with_out_arg_scalarize.inc
+41-15libclc/clc/lib/generic/math/clc_remquo.cl
+22-12libclc/clc/include/clc/math/remquo_decl.inc
+316-2875 files

LLVM/project d6373b4mlir/include/mlir/Dialect/LLVMIR LLVMIntrinsicOps.td, mlir/test/Dialect/LLVMIR roundtrip.mlir

[mlir][LLVM] Add more `llvm.intr.experimental.constrained.*` ops (#187948)

Add additional "constrained" intrinsic ops. A rounding mode can be
specified for these ops.

Assisted by: claude-4.6-opus-high
DeltaFile
+105-0mlir/test/Target/LLVMIR/llvmir-intrinsics.mlir
+77-0mlir/test/Target/LLVMIR/Import/intrinsic.ll
+67-2mlir/include/mlir/Dialect/LLVMIR/LLVMIntrinsicOps.td
+49-0mlir/test/Dialect/LLVMIR/roundtrip.mlir
+298-24 files

LLVM/project ac795f0clang/lib/AST/ByteCode InterpBuiltin.cpp

[clang][bytecode] Create fewer pointers in __builtin_nan() (#187990)

Check the elements directly for initialization state and keep track of
whether we found a NUL byte.
DeltaFile
+12-8clang/lib/AST/ByteCode/InterpBuiltin.cpp
+12-81 files

LLVM/project 8e53f91clang/lib/CIR/CodeGen CIRGenModule.cpp TargetInfo.cpp, clang/lib/CIR/CodeGen/Targets AMDGPU.cpp

[CIR][AMDGPU] Add AMDGPU-specific function attributes for HIP kernels
DeltaFile
+256-0clang/lib/CIR/CodeGen/Targets/AMDGPU.cpp
+82-0clang/test/CIR/CodeGenHIP/amdgpu-attrs.hip
+24-3clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVMIR.cpp
+8-6clang/lib/CIR/CodeGen/CIRGenModule.cpp
+10-0clang/lib/CIR/CodeGen/TargetInfo.cpp
+5-0clang/lib/CIR/CodeGen/TargetInfo.h
+385-91 files not shown
+386-97 files

LLVM/project e60c11fclang/lib/CIR/CodeGen CIRGenModule.cpp TargetInfo.cpp, clang/lib/CIR/CodeGen/Targets AMDGPU.cpp

[CIR][AMDGPU] Add AMDGPU-specific function attributes for HIP kernels
DeltaFile
+256-0clang/lib/CIR/CodeGen/Targets/AMDGPU.cpp
+82-0clang/test/CIR/CodeGenHIP/amdgpu-attrs.hip
+24-3clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVMIR.cpp
+8-6clang/lib/CIR/CodeGen/CIRGenModule.cpp
+13-0clang/lib/CIR/CodeGen/TargetInfo.cpp
+5-0clang/lib/CIR/CodeGen/TargetInfo.h
+388-91 files not shown
+389-97 files

LLVM/project bdfb59blibclc/clc/include/clc/math remquo_decl.inc, libclc/clc/include/clc/shared binary_with_out_arg_scalarize.inc

Address comments
DeltaFile
+5-0libclc/clc/include/clc/shared/binary_with_out_arg_scalarize.inc
+0-4libclc/clc/include/clc/math/remquo_decl.inc
+5-42 files

LLVM/project 083b36blibclc/clc/lib/generic/math clc_remainder.cl clc_remainder.inc

libclc: Update remainder

Previously this was failing conformance without -cl-denorms-are-zero
in the float case, and always failing in the double case.
DeltaFile
+17-212libclc/clc/lib/generic/math/clc_remainder.cl
+171-0libclc/clc/lib/generic/math/clc_remainder.inc
+188-2122 files

LLVM/project 0a0e785libclc/clc/lib/generic/math clc_remainder.inc clc_remainder.cl

libclc: Implement remainder with remquo

This fixes conformance failures for double, and for float
without -cl-denorms-are-zero. The optimizer is
able to eliminate the unused quo handling, so most
of the code does not need to be duplicated.
DeltaFile
+2-160libclc/clc/lib/generic/math/clc_remainder.inc
+1-25libclc/clc/lib/generic/math/clc_remainder.cl
+3-1852 files

LLVM/project b7ee3b0libclc/clc/lib/generic/math clc_remquo_stret.inc clc_remquo.inc

Fix missing definitions
DeltaFile
+163-0libclc/clc/lib/generic/math/clc_remquo_stret.inc
+0-154libclc/clc/lib/generic/math/clc_remquo.inc
+28-1libclc/clc/lib/generic/math/clc_remquo.cl
+191-1553 files

LLVM/project 7c6996fllvm/include/llvm/CodeGen ValueTypes.h, llvm/lib/CodeGen/SelectionDAG TargetLowering.cpp

[ValueType][NFC] Add widenIntegerElementType method (#187816)

Fixes #187805
DeltaFile
+9-20llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+9-1llvm/include/llvm/CodeGen/ValueTypes.h
+18-212 files

LLVM/project 85ab2a9llvm/include/llvm/CodeGen TargetInstrInfo.h, llvm/lib/CodeGen/AsmPrinter AsmPrinter.cpp

[AsmPrinter] Add generic support for verifying instruction sizes (#187703)

Many backends rely on TII reporting correct instruction sizes for MIR
level branch relaxation passes. Reporting too small a size can result in
MC fixup failures (or silent miscompiles for unvalidated fixups).

Some time ago I added validation to the PPC asm printer to verify that
the TII instruction size matches the actually emitted size. This was
very helpful to systematically fix all incorrectly reported instruction
sizes.

However, the same problem also exists in lots of other backends, so this
moves the validation into AsmPrinter, controlled by a new
getInstSizeVerifyMode() hook in TII, which is disabled by default.

The intention here is to gradually enable this validation for more
backends (which requires fixing them first).
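
The check described above can be sketched as a comparison between the size reported before emission and the number of bytes actually emitted. The mode names and the `sizeIsAcceptable` helper below are assumptions made for illustration. The log does not show the values returned by `getInstSizeVerifyMode()` or the code added to AsmPrinter.

```cpp
#include <cstddef>

// Hypothetical verification modes; the real hook's values are not shown here.
enum class SizeVerifyMode {
  None,              // validation disabled (the default, per the commit text)
  AllowOverEstimate, // reported size may exceed the emitted size
  Exact              // reported size must equal the emitted size
};

// Returns true when the size reported ahead of emission is acceptable for
// the given mode. An under-estimate is never accepted once checking is on,
// since that is the case the commit ties to fixup failures.
bool sizeIsAcceptable(SizeVerifyMode mode, std::size_t reported,
                      std::size_t emitted) {
  switch (mode) {
  case SizeVerifyMode::None:
    return true;
  case SizeVerifyMode::AllowOverEstimate:
    return reported >= emitted;
  case SizeVerifyMode::Exact:
    return reported == emitted;
  }
  return false;
}
```

Defaulting to a disabled mode matches the stated plan of enabling validation backend by backend after each one's reported sizes are fixed.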
DeltaFile
+35-0llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+0-26llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
+16-0llvm/include/llvm/CodeGen/TargetInstrInfo.h
+8-0llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
+3-0llvm/lib/Target/PowerPC/PPCInstrInfo.h
+62-265 files

LLVM/project 0d6185ellvm/test/CodeGen/AMDGPU callee-frame-setup.ll

[AMDGPU] Update test to match comment. NFC (#187273)

The comment says there shouldn't be any free registers, so update the
inline assembly to clobber all non-preserved SGPRs.
DeltaFile
+68-18llvm/test/CodeGen/AMDGPU/callee-frame-setup.ll
+68-181 files

LLVM/project 31caa34llvm/lib/Target/LoongArch LoongArchISelLowering.cpp, llvm/lib/Target/RISCV RISCVISelLowering.cpp

[LoongArch][RISCV] Fix incorrect indexing of incoming byval arguments in tail call eligibility check
DeltaFile
+48-0llvm/test/CodeGen/LoongArch/issue187832.ll
+48-0llvm/test/CodeGen/RISCV/issue187832.ll
+2-2llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+2-2llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+100-44 files

LLVM/project bb86440clang-tools-extra/clang-tidy/bugprone DerivedMethodShadowingBaseMethodCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Correctly ignore function templates in derived-method-shadowing-base-method (#185741) (#185875)

This commit fixes a false positive in the
derived-method-shadowing-base-method clang-tidy check, as described in
[ticket 185741](https://github.com/llvm/llvm-project/issues/185741)

Fixes #185741

---------

Co-authored-by: Tom James <tom.james at siemens.com>
Co-authored-by: Zeyi Xu <mitchell.xu2 at gmail.com>
DeltaFile
+6-2clang-tools-extra/clang-tidy/bugprone/DerivedMethodShadowingBaseMethodCheck.cpp
+7-0clang-tools-extra/test/clang-tidy/checkers/bugprone/derived-method-shadowing-base-method.cpp
+4-0clang-tools-extra/docs/ReleaseNotes.rst
+17-23 files

LLVM/project de514fbbolt/include/bolt/Profile DataReader.h, bolt/include/bolt/Rewrite RewriteInstance.h

[BOLT] Remove some unused code (NFC) (#183880)

Remove some unused code in BOLT:
- `RewriteInstance::linkRuntime` is declared but not defined
- `BranchContext` typedef is never used
- `FuncBranchData::getBranch` is defined but never used
- `FuncBranchData::getDirectCallBranch` is defined but never used
DeltaFile
+0-29bolt/lib/Profile/DataReader.cpp
+0-10bolt/include/bolt/Profile/DataReader.h
+0-3bolt/include/bolt/Rewrite/RewriteInstance.h
+0-423 files

LLVM/project 3fa88f0mlir/lib/Dialect/Tensor/IR TensorOps.cpp, mlir/test/Dialect/Tensor canonicalize.mlir

[mlir][tensor] Fix empty tensor with cast encoding fold (#187963)

Fixed a TODO where the empty-tensor-with-cast fold could not fold
encodings or attributes.
DeltaFile
+13-0mlir/test/Dialect/Tensor/canonicalize.mlir
+3-5mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
+16-52 files

LLVM/project 66afa8fllvm/lib/Target/X86 X86ISelLoweringCall.cpp, llvm/test/CodeGen/X86 x86-fp80-ret-no-x87.ll

[X86] Emit user-friendly error for x86_fp80 with x87 disabled on x86_64 (#183932)

When compiling a function that uses `x86_fp80` on x86_64 with x87 disabled (`-mattr=-x87`), LLVM crashes with a cryptic internal error.

Fixes #182450
DeltaFile
+13-0llvm/test/CodeGen/X86/x86-fp80-ret-no-x87.ll
+13-0llvm/lib/Target/X86/X86ISelLoweringCall.cpp
+26-02 files

LLVM/project 252eb2aflang/include/flang/Optimizer/Dialect FIROps.td, flang/lib/Optimizer/CodeGen CodeGen.cpp

[flang][FIR] add a new fir.bitcast operation (#187793)

This patch introduces a new bitcast operation for integer, float,
character, and logical types.

The main rationale for it is that it is currently not possible to express
such a bitcast in FIR without going through memory, and there is a need to
have some bitcast support when interfacing with the memref dialect, where
one cannot use fir.char<> and fir.logical and must use the underlying
storage type. Using fir.convert is not a good idea because it is a
semantic cast and it will, for instance, normalize integers when
converting from/to logical.

This could also be used to simplify the implementation of TRANSFER for
the cases of simple scalars of those types.
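
The difference between a semantic cast and a bit-preserving cast can be shown with a C++ analogy. This is not FIR, and `convertToLogical` and `bitcastSketch` are hypothetical names for this sketch. The float checks assume IEEE 754 single precision.

```cpp
#include <cstdint>
#include <cstring>

// Semantic conversion: the value is reinterpreted by meaning, so any nonzero
// storage value is normalized to 1.
std::uint8_t convertToLogical(std::uint8_t raw) { return raw != 0 ? 1 : 0; }

// Bit-preserving cast between two trivially copyable types of equal size.
// The bytes are copied unchanged; no normalization takes place.
template <typename To, typename From>
To bitcastSketch(const From &from) {
  static_assert(sizeof(To) == sizeof(From), "bitcast requires equal sizes");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}
```

With a storage byte of 2, `convertToLogical` returns 1 while `bitcastSketch<std::uint8_t>` returns 2, which is the kind of normalization the commit message wants to avoid when only the storage type should change.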

Assisted by: Claude
DeltaFile
+123-0flang/test/Fir/convert-to-llvm.fir
+51-9flang/lib/Optimizer/CodeGen/CodeGen.cpp
+48-0flang/test/Fir/invalid.fir
+48-0flang/lib/Optimizer/Dialect/FIROps.cpp
+28-0flang/include/flang/Optimizer/Dialect/FIROps.td
+26-0flang/test/Fir/bitcast-fold.fir
+324-91 files not shown
+348-97 files

LLVM/project d705957clang/include/clang/Basic AArch64CodeGenUtils.h, clang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp

[clang][Neon] Extract code shared by classic and CIR codegen (NFC) (#186448)

Extract intrinsic maps shared by the classic and CIR codegen into a new
header, AArch64CodeGenUtils.h, which is reused by both. This keeps the
implementations in sync and avoids code duplication.

The maps are moved without modification. The accompanying code (e.g.
`ARMVectorIntrinsicInfo`) is updated to follow Clang coding style
(CamelCase instead of the camelCase used in CIR).
DeltaFile
+14-676clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+651-0clang/include/clang/Basic/AArch64CodeGenUtils.h
+2-627clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+667-1,3033 files

LLVM/project cae0710llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Remove absolute value calculations in the Weak Zero SIV tests
DeltaFile
+7-7llvm/lib/Analysis/DependenceAnalysis.cpp
+7-71 files

LLVM/project 1b4e416llvm/test/Analysis/DependenceAnalysis weak-zero-siv-addrec-wrap.ll

[DA] Update tests for the Weak Zero SIV tests (NFC)
DeltaFile
+112-0llvm/test/Analysis/DependenceAnalysis/weak-zero-siv-addrec-wrap.ll
+112-01 files

LLVM/project 3b47aa1llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis weak-zero-siv-addrec-wrap.ll

[DA] Add nsw check for addrecs in the Weak Zero SIV tests
DeltaFile
+31-16llvm/test/Analysis/DependenceAnalysis/weak-zero-siv-addrec-wrap.ll
+3-0llvm/lib/Analysis/DependenceAnalysis.cpp
+34-162 files

LLVM/project 94267f8llvm/include/llvm/Analysis DependenceAnalysis.h, llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Consolidate the core logic of the Weak Zero SIV tests (NFCI)
DeltaFile
+80-124llvm/lib/Analysis/DependenceAnalysis.cpp
+5-0llvm/include/llvm/Analysis/DependenceAnalysis.h
+85-1242 files