LLVM/project 9fd8bc0compiler-rt/lib/fuzzer FuzzerInterceptors.cpp

[libFuzzer] Fix -Wunused-variable when building with NDEBUG (#188301)

The variable `FuzzerInitIsRunning` is only used within `assert()`.
Follow up to #178342.
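
As an illustration of this class of warning (a minimal sketch, not the actual patch): a value that is read only inside `assert()` becomes unused once `NDEBUG` removes the assertion. The names `InitIsRunning` and `checkedDouble` are hypothetical, and `[[maybe_unused]]` is just one common way to silence the warning; the commit itself may use a different approach.

```cpp
#include <cassert>

// Hypothetical stand-in for a flag such as FuzzerInitIsRunning.
static bool InitIsRunning = false;

int checkedDouble(int X) {
  // WasRunning is read only by the assert below. With -DNDEBUG the assert
  // expands to nothing, so without the attribute this local could trigger
  // -Wunused-variable.
  [[maybe_unused]] bool WasRunning = InitIsRunning;
  assert(!WasRunning && "must not be called while init is running");
  return 2 * X;
}
```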
DeltaFile
+6-0compiler-rt/lib/fuzzer/FuzzerInterceptors.cpp
+6-01 files

LLVM/project 68edb9flldb/source/Target ProcessTrace.cpp, lldb/unittests/Process ProcessTraceTest.cpp

[lldb] Fix trace load hang (#187768)

#179799 removed the `SetPrivateState(eStateStopped)` call in
`ProcessTrace::DidAttach()`. This makes the call to
`WaitForProcessToStop` hang forever, causing the `trace load` command to
hang.

This fix reintroduces the `SetPrivateState` call so a postmortem trace
process will "stop" after being loaded, matching the logic used in
`Process::LoadCore()`.
DeltaFile
+26-1lldb/unittests/Process/ProcessTraceTest.cpp
+4-0lldb/source/Target/ProcessTrace.cpp
+30-12 files

LLVM/project 8228749llvm/lib/CodeGen/AsmPrinter WinCFGuard.cpp, llvm/test/CodeGen/WinCFGuard cfguard-alias.ll

[CFGuard] Consider function aliases as indirect call targets (#188223)

With vector deleting destructors, it's common to include function
aliases in vftables.

After #185653 it's become more likely that the alias gets overridden in
a different TU. It's therefore important that it's the alias itself that
goes in the control-flow guard table.
DeltaFile
+28-0llvm/test/CodeGen/WinCFGuard/cfguard-alias.ll
+14-3llvm/lib/CodeGen/AsmPrinter/WinCFGuard.cpp
+42-32 files

LLVM/project 06c51b1llvm/docs LangRef.rst, llvm/include/llvm/IR FloatingPointOps.def

[IR] Allow non-constrained math intrinsics in strictfp functions

The current implementation of floating-point support uses two different
representations for each floating-point operation, such as `llvm.trunc`
and `llvm.experimental.constrained.trunc`. The main difference between
them is the presence of side effects that describe interaction with the
floating-point environment. Which of the two functions should be used is
determined by the enclosing function's attribute 'strictfp'. The
compiler does not check whether a regular function, like `llvm.trunc`,
is used in a strictfp function, so maintaining consistency is the user's
responsibility.  It is easy to mistakenly use the regular,
side-effect-free intrinsic in a strictfp function, and even LLVM tests
contain examples of this.

If the variant of the intrinsic is determined solely by the 'strictfp'
function attribute, the distinction between the two forms appears to be
redundant, and the regular form could be used in all cases. This would
require the compiler to deduce side effects from the function
attributes. In this scenario, floating-point operations would have

    [17 lines not shown]
DeltaFile
+150-28llvm/docs/LangRef.rst
+68-108llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+72-0llvm/include/llvm/IR/FloatingPointOps.def
+32-0llvm/lib/IR/IRBuilder.cpp
+9-14llvm/test/CodeGen/AMDGPU/amdgpu-simplify-libcall-pow.ll
+19-0llvm/lib/Analysis/BasicAliasAnalysis.cpp
+350-15014 files not shown
+425-17320 files

LLVM/project c191ad0mlir/include/mlir/Dialect/Arith/IR ArithOps.td, mlir/lib/Dialect/Arith/IR ArithOps.cpp

[mlir][arith] Mark `arith.remsi` and `arith.remui` as conditionally speculatable
DeltaFile
+16-2mlir/include/mlir/Dialect/Arith/IR/ArithOps.td
+13-0mlir/lib/Dialect/Arith/IR/ArithOps.cpp
+29-22 files

LLVM/project 87725e7libsycl/src CMakeLists.txt, libsycl/src/detail device_binary_structures.hpp program_manager.cpp

fix comments

Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova at intel.com>
DeltaFile
+27-65libsycl/src/detail/device_binary_structures.hpp
+26-26libsycl/src/detail/program_manager.cpp
+20-18libsycl/src/detail/program_manager.hpp
+7-7libsycl/src/detail/kernel_id.hpp
+6-5libsycl/src/detail/device_image_wrapper.hpp
+3-0libsycl/src/CMakeLists.txt
+89-1211 files not shown
+91-1227 files

LLVM/project efaf001lldb/docs/resources formatterbytecode.rst

[lldb][docs] Restore correct @update bytecode signature (#188295)

Mistakenly changed in https://github.com/llvm/llvm-project/pull/182155
DeltaFile
+1-1lldb/docs/resources/formatterbytecode.rst
+1-11 files

LLVM/project 0158cd0llvm/lib/Target/RISCV RISCVMergeBaseOffset.cpp, llvm/test/CodeGen/RISCV sfb-merge-base-offset.ll fold-addi-loadstore-cpi.mir

[RISCV] Merge Base Offset for SFB Pseudos (#187620)

This implements the Merge Base Offset pass for the SFB Load Pseudos.
These Pseudos are expanded after Merge Base Offset, so the pass needs to
handle them.

I also had to extend support in MergeBaseOffset to ensure that ImmOp
could be a Constant Pool Index, which seemed to be supported in some
checks but not others.
DeltaFile
+493-0llvm/test/CodeGen/RISCV/sfb-merge-base-offset.ll
+99-21llvm/lib/Target/RISCV/RISCVMergeBaseOffset.cpp
+76-0llvm/test/CodeGen/RISCV/fold-addi-loadstore-cpi.mir
+668-213 files

LLVM/project 95546c8lldb/examples/python formatter_bytecode.py, lldb/test/Shell/ScriptInterpreter/Python/Inputs/FormatterBytecode RigidArrayLLDBFormatterC.txt RigidArrayLLDBFormatterSwift.txt

[lldb] Fix value of sig_update in formatter_bytecode.py (#188292)
DeltaFile
+3-1lldb/examples/python/formatter_bytecode.py
+1-1lldb/test/Shell/ScriptInterpreter/Python/Inputs/FormatterBytecode/RigidArrayLLDBFormatterC.txt
+1-1lldb/test/Shell/ScriptInterpreter/Python/Inputs/FormatterBytecode/RigidArrayLLDBFormatterSwift.txt
+5-33 files

LLVM/project 01af152llvm/test/CodeGen/NVPTX i1-int-to-fp.ll

[NVPTX] Fix assumption of sm versioning (#188282)

The test case in #188118 assumes sm-90 is always available, leading to a
crash
```
 # | ptxas fatal   : SM version specified by .target is higher than default SM version assumed
```

This PR updates the test case to follow the check specified in
`llvm/test/CodeGen/NVPTX/cp-async-bulk-tensor-reduce.ll`,
namely `%if ptxas-sm_90 && ptxas-isa-7.8`
DeltaFile
+1-1llvm/test/CodeGen/NVPTX/i1-int-to-fp.ll
+1-11 files

LLVM/project 047cd90clang/lib/DependencyScanning DependencyScannerImpl.cpp, clang/test/ClangScanDeps p1689-suppress-warnings.cppm

Revert "[ClangScanDeps] Do not emit warning for P1689 format" (#188179)

Reverts llvm/llvm-project#186966
DeltaFile
+0-23clang/test/ClangScanDeps/p1689-suppress-warnings.cppm
+0-2clang/lib/DependencyScanning/DependencyScannerImpl.cpp
+0-252 files

LLVM/project 8d8f635llvm/lib/Target/SPIRV SPIRVCBufferAccess.h SPIRVCommandLine.h, llvm/lib/Target/SPIRV/MCTargetDesc SPIRVBaseInfo.h SPIRVInstPrinter.h

[NFC][SPIR-V] Fix include guard names to match file paths (#187689)
DeltaFile
+3-3llvm/lib/Target/SPIRV/MCTargetDesc/SPIRVBaseInfo.h
+3-3llvm/lib/Target/SPIRV/MCTargetDesc/SPIRVInstPrinter.h
+3-3llvm/lib/Target/SPIRV/MCTargetDesc/SPIRVTargetStreamer.h
+3-3llvm/lib/Target/SPIRV/SPIRVCBufferAccess.h
+3-3llvm/lib/Target/SPIRV/SPIRVCommandLine.h
+3-3llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.h
+18-187 files not shown
+35-3513 files

LLVM/project 549b529llvm/lib/Target/X86 X86AsmPrinter.h, llvm/test/CodeGen/X86 npm-asmprint.ll

[X86][NewPM] Mark X86AsmPrinter isRequired (#188278)

Otherwise the pass does not run when a function has the optnone
attribute, which means we get no assembly out for functions marked
optnone.
DeltaFile
+9-0llvm/test/CodeGen/X86/npm-asmprint.ll
+2-0llvm/lib/Target/X86/X86AsmPrinter.h
+11-02 files

LLVM/project 3deca2dllvm/include/llvm/Support KnownBits.h, llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp

[KnownBits] KnownBits::add - add optional arg indicating a X+X self add pattern (#188078)

Compute knownbits for ADD(X,X) as SHL(X,1)

Followup to #186461
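
To see why the self-add pattern helps, here is a toy 8-bit model (a sketch only; `Known8`, `shl1` and `addIndependent` are hypothetical names and do not reproduce the `llvm::KnownBits` API). The generic add is deliberately conservative and keeps only the trailing bits that are known in both operands, whereas treating X+X as X<<1 always proves that the low bit is zero.

```cpp
#include <cassert>
#include <cstdint>

// Toy 8-bit known-bits record: a bit set in Zero is known to be 0,
// a bit set in One is known to be 1, and a bit set in neither is unknown.
struct Known8 {
  uint8_t Zero = 0;
  uint8_t One = 0;
};

// Known bits of (X << 1): shift both masks; the new low bit is known zero.
Known8 shl1(Known8 X) {
  Known8 R;
  R.Zero = static_cast<uint8_t>((X.Zero << 1) | 1u);
  R.One = static_cast<uint8_t>(X.One << 1);
  return R;
}

// Conservative add for operands assumed to be independent: walk up from
// bit 0 and stop at the first bit that is unknown in either operand.
Known8 addIndependent(Known8 A, Known8 B) {
  Known8 R;
  unsigned Carry = 0;
  for (unsigned Bit = 0; Bit < 8; ++Bit) {
    const uint8_t M = static_cast<uint8_t>(1u << Bit);
    const bool AKnown = ((A.Zero | A.One) & M) != 0;
    const bool BKnown = ((B.Zero | B.One) & M) != 0;
    if (!AKnown || !BKnown)
      break;
    const unsigned Sum = ((A.One & M) ? 1u : 0u) + ((B.One & M) ? 1u : 0u) + Carry;
    if (Sum & 1u)
      R.One |= M;
    else
      R.Zero |= M;
    Carry = Sum >> 1;
  }
  return R;
}
```

With a fully unknown X, `addIndependent(X, X)` learns nothing, while `shl1(X)` still reports bit 0 as known zero.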
DeltaFile
+74-0llvm/unittests/Support/KnownBitsTest.cpp
+13-9llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+10-1llvm/include/llvm/Support/KnownBits.h
+97-103 files

LLVM/project 6159b25clang/test/CIR/IR invalid-linkage.cir

[CIR][NFC] Mark invalid-linkage.cir as XFAIL (#188279)

The invalid-linkage.cir test is currently failing as a result of a
recent change to the MLIR attribute parser. I am temporarily marking
this test as XFAIL while that problem is being worked on to unblock CIR
development. I added a check that will force the test to fail even after
the problem is fixed so that we don't start getting unexpected passes
when the fix is merged. (CIR testing isn't run during CI for MLIR
changes.) I will reenable the test after the problem has been fixed.
DeltaFile
+3-0clang/test/CIR/IR/invalid-linkage.cir
+3-01 files

LLVM/project ca2bc55llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.ps.live.ll

AMDGPU/GlobalISel: RegBankLegalize rules for ps_live (#188101)
DeltaFile
+7-3llvm/test/CodeGen/AMDGPU/llvm.amdgcn.ps.live.ll
+2-1llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+1-2llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.ps.live.mir
+10-63 files

LLVM/project 6e2a720llvm/include/llvm/ADT GenericUniformityImpl.h, llvm/include/llvm/Analysis TargetTransformInfo.h

[AMDGPU][Uniformity][TTI] Make Uniformity Analysis Operand-Aware via Custom Uniformity Checks (#137639)

See: https://github.com/llvm/llvm-project/issues/131779

Extends uniformity analysis to support instructions whose uniformity
depends on which specific operands are uniform. Introduces
`InstructionUniformity::Custom` and a target hook `TTI::isUniform(I,
UniformArgs)` that allows targets to define custom uniformity rules.
During propagation, custom candidates are checked via the target hook.
If we can prove they are uniform, we skip marking them divergent and let
iterative propagation re-evaluate as operands change.

Implements AMDGPU's `llvm.amdgcn.wave.shuffle` rules (uniform when
either operand is uniform, divergent only when both are divergent) as
the motivating example.

This inverted-logic approach is critical for correctness: proving
uniformity early during propagation would be unsafe, as operands can
transition from uniform to divergent during divergence propagation.
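
The wave-shuffle rule described above can be sketched as follows. This is a standalone illustration: `isWaveShuffleUniform` and `shouldMarkDivergent` are hypothetical helpers, and the real `TTI::isUniform` hook may take different argument types.

```cpp
#include <cassert>
#include <vector>

// UniformArgs[i] is true when operand i is currently believed uniform.
// Rule from the commit message: the result is uniform when either operand
// is uniform, and divergent only when both are divergent.
bool isWaveShuffleUniform(const std::vector<bool> &UniformArgs) {
  assert(UniformArgs.size() == 2 && "sketch assumes exactly two operands");
  return UniformArgs[0] || UniformArgs[1];
}

// During propagation, a custom candidate is marked divergent only when the
// rule can no longer prove it uniform; otherwise it is left alone and
// re-evaluated later as its operands change.
bool shouldMarkDivergent(const std::vector<bool> &UniformArgs) {
  return !isWaveShuffleUniform(UniformArgs);
}
```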

    [3 lines not shown]
DeltaFile
+50-0llvm/test/Analysis/UniformityAnalysis/AMDGPU/uniform_intrinsic.ll
+24-0llvm/include/llvm/ADT/GenericUniformityImpl.h
+23-0llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+15-2llvm/lib/Analysis/UniformityAnalysis.cpp
+11-0llvm/include/llvm/Analysis/TargetTransformInfo.h
+8-0llvm/lib/CodeGen/MachineUniformityAnalysis.cpp
+131-24 files not shown
+150-310 files

LLVM/project 399b465llvm/lib/Target/AMDGPU GCNVOPDUtils.cpp VOP3PInstructions.td, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp

AMDGPU: Codegen for v_dual_dot2acc_f32_f16/bf16 from VOP3

For V_DOT2_F32_F16 and V_DOT2_F32_BF16 add their VOPDName and mark
them with usesCustomInserter which will be used to add pre-RA register
allocation hints to preferably assign dst and src2 to the same physical
register. When the hint is satisfied, canMapVOP3PToVOPD recognises the
instruction as eligible for VOPD pairing by checking if it is VOP2 like:
dst==src2, no source modifiers, no clamp, and src1 is a register.
Mark both instructions as commutable to allow a literal in src1 to be
moved to src0, since VOPD only permits a literal in src0.
DeltaFile
+258-592llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fdot2.ll
+75-93llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fdot2.f32.bf16.ll
+32-1llvm/lib/Target/AMDGPU/GCNVOPDUtils.cpp
+8-5llvm/lib/Target/AMDGPU/VOP3PInstructions.td
+11-0llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+6-0llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+390-6911 files not shown
+392-6937 files

LLVM/project c7b5f7cllvm/lib/Support Parallel.cpp

[Support] Use atomic counter in parallelFor instead of per-task spawning (#187989)

This function is primarily used by lld and debug info tools.

Instead of pre-splitting work into up to MaxTasksPerGroup (1024) tasks
and spawning each through the Executor's mutex+condvar, use an atomic
counter for work distribution. Only ThreadCount workers are spawned;
each grabs the next chunk via atomic fetch_add.

This reduces futex calls from ~31K (glibc, release+assertions build) to
~1.4K when linking clang-14 (191MB PIE with --export-dynamic) with
`ld.lld --threads=8` (each parallelFor spawned up to 1024 tasks, each
requiring mutex lock + condvar signal).

```
                             Wall      System    futex
  glibc (assertions) before: 927ms     897ms     31K
  glibc (assertions) after:  879ms     765ms     1.4K
  mimalloc before:           872ms     694ms     25K
  mimalloc after:            830ms     661ms     1K
```
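
The work-distribution idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in `Parallel.cpp`: `parallelForSketch` and its `ChunkSize` parameter are hypothetical, and it uses raw `std::thread` workers rather than LLVM's Executor.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Runs Fn(I) for every I in [Begin, End). Only ThreadCount workers are
// started; each claims the next chunk of indices with an atomic fetch_add
// instead of one task being queued per chunk.
void parallelForSketch(std::size_t Begin, std::size_t End, unsigned ThreadCount,
                       std::size_t ChunkSize,
                       const std::function<void(std::size_t)> &Fn) {
  assert(ThreadCount > 0 && ChunkSize > 0);
  std::atomic<std::size_t> Next{Begin};

  auto Worker = [&] {
    for (;;) {
      // Claim the half-open range [Start, Start + ChunkSize).
      const std::size_t Start = Next.fetch_add(ChunkSize, std::memory_order_relaxed);
      if (Start >= End)
        return; // No work left.
      const std::size_t Stop = std::min(End, Start + ChunkSize);
      for (std::size_t I = Start; I < Stop; ++I)
        Fn(I);
    }
  };

  std::vector<std::thread> Workers;
  for (unsigned T = 0; T < ThreadCount; ++T)
    Workers.emplace_back(Worker);
  for (std::thread &W : Workers)
    W.join();
}
```

Because each chunk is claimed by exactly one `fetch_add`, no lock or condition variable is needed on the hot path; the only synchronization besides the counter is the final join.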
DeltaFile
+23-18llvm/lib/Support/Parallel.cpp
+23-181 files

LLVM/project e73d8f8compiler-rt/cmake/Modules CompilerRTUtils.cmake, compiler-rt/cmake/caches GPU.cmake

[compiler-rt] Support unit tests for the GPU build (#187895)

Summary:
This PR enables the basic unit tests for builtins to be run on the GPU
architectures. Other targets like profiling are supported, but their
host-device nature will make them more difficult to adequately unit
test. It may be possible to do basic tests there, simply to verify that
counters are present and in the proper format for when they are copied
to the host.
DeltaFile
+13-0compiler-rt/test/builtins/CMakeLists.txt
+12-0compiler-rt/cmake/Modules/CompilerRTUtils.cmake
+2-1compiler-rt/test/CMakeLists.txt
+1-1compiler-rt/cmake/caches/GPU.cmake
+1-1compiler-rt/test/builtins/Unit/lit.cfg.py
+2-0compiler-rt/test/lit.common.cfg.py
+31-36 files

LLVM/project 4eedd51clang-tools-extra/clang-tidy/bugprone StdNamespaceModificationCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Do not provide diagnostics for cert-dcl58-cpp on implicit declarations (#188152)

Do not emit cert-dcl58-cpp diagnostics for compiler-generated
intrinsics, as they would be false positives.

In the provided tests, the compiler generates align_val_t, which ends up
inside the std namespace as std::align_val_t. This symbol is
compiler-generated and has no source location, causing a compiler crash.
There is also no point in notifying users about violations they have no
control over.

Resolution: Diagnostics suppressed.

Co-authored-by: Vladislav Aranov <vladislav.aranov at ericsson.com>
DeltaFile
+21-0clang-tools-extra/test/clang-tidy/checkers/bugprone/std-namespace-modification-implicit.cpp
+18-0clang/test/AST/ast-dump-implicit-align-val.cpp
+4-0clang-tools-extra/clang-tidy/bugprone/StdNamespaceModificationCheck.cpp
+2-1clang-tools-extra/docs/ReleaseNotes.rst
+45-14 files

LLVM/project de6ed3cllvm/lib/Target/PowerPC PPCInstrInfo.cpp PPCInstr64Bit.td

[PowerPC] Fix some instruction sizes (#188227)

This fixes:
 * PADDIdtprel: Lowers to PADDI8, which is prefixed.
 * PATCHABLE_FUNCTION_ENTER/PATCHABLE_RET: Handle xray sleds.

These came up when generalizing the instruction size verification
infrastructure.
DeltaFile
+11-1llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
+1-0llvm/lib/Target/PowerPC/PPCInstr64Bit.td
+12-12 files

LLVM/project ce44d63llvm/unittests/CodeGen/GlobalISel IRTranslatorBF16Test.cpp

[GlobalISel][Test] Fix `IRTranslatorBF16Test` crash (#188273)

Skip the test when the AArch64 target is unavailable.
DeltaFile
+2-0llvm/unittests/CodeGen/GlobalISel/IRTranslatorBF16Test.cpp
+2-01 files

LLVM/project d85db97clang/include/clang/CIR MissingFeatures.h, clang/lib/CIR/CodeGen TargetInfo.cpp TargetInfo.h

update requiresAMDGPUProtectedVisibility and other minor fixes
DeltaFile
+51-64clang/lib/CIR/CodeGen/Targets/AMDGPU.cpp
+12-5clang/lib/CIR/CodeGen/TargetInfo.cpp
+4-0clang/lib/CIR/CodeGen/TargetInfo.h
+0-1clang/include/clang/CIR/MissingFeatures.h
+67-704 files

LLVM/project 31cb434clang/lib/CIR/CodeGen/Targets AMDGPU.cpp, clang/lib/CIR/Lowering/DirectToLLVM LowerToLLVMIR.cpp

add support for amdgpu-expand-waitcnt-profiling
DeltaFile
+44-32clang/lib/CIR/CodeGen/Targets/AMDGPU.cpp
+16-0clang/test/CIR/CodeGenHIP/amdgpu-attrs.hip
+1-4clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVMIR.cpp
+61-363 files

LLVM/project be773ffclang/lib/CIR/CodeGen CIRGenModule.cpp TargetInfo.cpp, clang/lib/CIR/CodeGen/Targets AMDGPU.cpp

[CIR][AMDGPU] Add AMDGPU-specific function attributes for HIP kernels
DeltaFile
+256-0clang/lib/CIR/CodeGen/Targets/AMDGPU.cpp
+82-0clang/test/CIR/CodeGenHIP/amdgpu-attrs.hip
+24-3clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVMIR.cpp
+8-6clang/lib/CIR/CodeGen/CIRGenModule.cpp
+10-0clang/lib/CIR/CodeGen/TargetInfo.cpp
+5-0clang/lib/CIR/CodeGen/TargetInfo.h
+385-91 files not shown
+386-97 files

LLVM/project ab903b4llvm/include/llvm/ADT StringSwitch.h, llvm/unittests/ADT StringSwitchTest.cpp

[ADT] Add predicate based match support to StringSwitch (#188046)

This introduces `Predicate` and `IfNotPredicate` case selection to
StringSwitch to allow use cases like

```
StringSwitch<...>(..)
  .Case("foo", FooTok)
  .Predicate([](StringRef Str){ ... }, IdentifierTok)
...
```

This is mostly useful for improving conciseness and clarity when
processing generated strings, diagnostics, and similar.
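
A self-contained sketch of the chaining pattern follows. `MiniStringSwitch` is a hypothetical miniature written for illustration, not `llvm::StringSwitch`; the first-match-wins behaviour and the shape of `Predicate` are assumptions based on the snippet above, and `IfNotPredicate` is not modelled.

```cpp
#include <cassert>
#include <cctype>
#include <optional>
#include <string_view>

// Hypothetical miniature of a string switch: the first matching case wins.
template <typename T> class MiniStringSwitch {
  std::string_view Str;
  std::optional<T> Result;

public:
  explicit MiniStringSwitch(std::string_view S) : Str(S) {}

  // Matches when the input equals the literal exactly.
  MiniStringSwitch &Case(std::string_view Lit, T Value) {
    if (!Result && Str == Lit)
      Result = Value;
    return *this;
  }

  // Matches when the callable returns true for the input string.
  template <typename Pred> MiniStringSwitch &Predicate(Pred P, T Value) {
    if (!Result && P(Str))
      Result = Value;
    return *this;
  }

  // Returns the matched value, or the fallback if nothing matched.
  T Default(T Value) const { return Result ? *Result : Value; }
};

enum Token { FooTok, IdentifierTok, UnknownTok };

Token classify(std::string_view S) {
  return MiniStringSwitch<Token>(S)
      .Case("foo", FooTok)
      .Predicate(
          [](std::string_view Str) {
            return !Str.empty() &&
                   std::isalpha(static_cast<unsigned char>(Str[0])) != 0;
          },
          IdentifierTok)
      .Default(UnknownTok);
}
```

Here `classify("foo")` takes the exact-match case, while any other string that starts with a letter falls through to the predicate.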
DeltaFile
+14-0llvm/unittests/ADT/StringSwitchTest.cpp
+8-0llvm/include/llvm/ADT/StringSwitch.h
+22-02 files

LLVM/project ffd0cfbllvm/test/CodeGen/X86 vector-interleaved-store-i64-stride-7.ll vector-interleaved-store-i64-stride-6.ll

Rebase

Created using spr 1.3.7
DeltaFile
+4,978-4,984llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-7.ll
+4,590-4,623llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-6.ll
+3,850-4,310llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-8.ll
+3,562-3,632llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-8.ll
+2,430-2,474llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-7.ll
+1,815-1,852llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-7.ll
+21,225-21,87548 files not shown
+29,269-29,41254 files

LLVM/project 5f0b3d6llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

[SLP][NFC]Fix formatting and debug printing, NFC
DeltaFile
+3-3llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+3-31 files

LLVM/project e71cbabllvm/lib/Transforms/Vectorize SLPVectorizer.cpp

Address comment

Created using spr 1.3.7
DeltaFile
+2-2llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+2-21 files