LLVM/project f2cf0cdllvm/lib/Target/RISCV RISCVInstrInfoP.td

[RISCV] Remove unneeded ImmLeaf from simm8_unsigned. NFC (#184960)
DeltaFile
+1-1llvm/lib/Target/RISCV/RISCVInstrInfoP.td
+1-11 files

LLVM/project a33cecbllvm/include/llvm/Analysis DependenceAnalysis.h, llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Rewrite formula in the Weak Zero SIV tests
DeltaFile
+67-72llvm/lib/Analysis/DependenceAnalysis.cpp
+8-8llvm/test/Analysis/DependenceAnalysis/weak-zero-siv-large-btc.ll
+4-8llvm/include/llvm/Analysis/DependenceAnalysis.h
+2-6llvm/test/Analysis/DependenceAnalysis/weak-zero-siv-overflow.ll
+2-2llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-large-btc.ll
+83-965 files

LLVM/project 128e676llvm/include/llvm/Analysis DependenceAnalysis.h, llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Remove isPeelFirst and isPeelLast
DeltaFile
+1-24llvm/include/llvm/Analysis/DependenceAnalysis.h
+0-20llvm/lib/Analysis/DependenceAnalysis.cpp
+3-3llvm/test/Analysis/DependenceAnalysis/WeakZeroDstSIV.ll
+3-3llvm/test/Analysis/DependenceAnalysis/WeakZeroSrcSIV.ll
+7-504 files

LLVM/project 7dda8ballvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis weak_zero_siv_parametric_coeff.ll WeakZeroSrcSIV.ll

[DA] Fix the Weak Zero SIV tests when the coeff may be zero (#183736)

In the Weak Zero SIV tests, given two subscripts `{c0,+,a}` and `c1`,
when `c0 == c1`, the tests conclude that a dependency exists from the
former subscript at the first iteration to the latter subscript at every
iteration. However, this conclusion is correct only when `a` is not
zero, which was not being checked.
This patch adds non-zero checks for `a` in the Weak Zero SIV tests.
This also fixes the test cases added in #183735.
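
To illustrate why the non-zero check matters, here is a minimal brute-force sketch. It is not the DependenceAnalysis code; the function name and the explicit `trip_count` parameter are assumptions made for the example. It lists the iterations at which the subscript `{c0,+,a}` equals the constant subscript `c1`.

```python
def matching_iterations(c0, a, c1, trip_count):
    """Return the iterations i in [0, trip_count) where c0 + a*i == c1.

    Hypothetical helper for illustration only: it enumerates iterations
    instead of reasoning symbolically as the real tests do.
    """
    return [i for i in range(trip_count) if c0 + a * i == c1]


# c0 == c1 and a != 0: only the first iteration touches the constant subscript.
nonzero_coeff = matching_iterations(5, 2, 5, 4)

# c0 == c1 and a == 0: every iteration touches it, so concluding
# "first iteration only" would be wrong.
zero_coeff = matching_iterations(5, 0, 5, 4)
```

With `a == 0` the subscript never moves away from `c0`, so the dependence is not limited to the first iteration.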
DeltaFile
+4-10llvm/test/Analysis/DependenceAnalysis/weak_zero_siv_parametric_coeff.ll
+2-2llvm/lib/Analysis/DependenceAnalysis.cpp
+1-1llvm/test/Analysis/DependenceAnalysis/WeakZeroSrcSIV.ll
+1-1llvm/test/Analysis/DependenceAnalysis/WeakZeroDstSIV.ll
+8-144 files

LLVM/project d17c5f9llvm/test/CodeGen/AArch64 clmul-fixed.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

Rebase and address comments

Created using spr 1.3.6-beta.1
DeltaFile
+53,024-7,001llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+15,172-1,553llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+6,812-3,080llvm/test/CodeGen/AArch64/clmul-fixed.ll
+6,520-0llvm/test/CodeGen/X86/bit-manip-i512.ll
+3,441-0llvm/test/MC/AMDGPU/gfx13_asm_vflat.s
+3,257-0llvm/test/CodeGen/X86/bit-manip-i256.ll
+88,226-11,6341,391 files not shown
+138,997-27,6991,397 files

LLVM/project 9243a1cllvm/test/CodeGen/AArch64 clmul-fixed.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.6-beta.1

[skip ci]
DeltaFile
+53,024-7,001llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+15,172-1,553llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+6,812-3,080llvm/test/CodeGen/AArch64/clmul-fixed.ll
+6,520-0llvm/test/CodeGen/X86/bit-manip-i512.ll
+3,441-0llvm/test/MC/AMDGPU/gfx13_asm_vflat.s
+3,257-0llvm/test/CodeGen/X86/bit-manip-i256.ll
+88,226-11,6341,390 files not shown
+138,949-27,6701,396 files

LLVM/project e12bb1allvm/test/CodeGen/RISCV/rvv abd.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.6-beta.1
DeltaFile
+106-0llvm/test/CodeGen/RISCV/rvv/abd.ll
+106-01 files

LLVM/project 2b21231llvm/lib/Transforms/IPO MemProfContextDisambiguation.cpp, llvm/test/ThinLTO/X86 remark-missing-info.ll memprof-basic.ll

[MemProf] Enhance thin link optimization remarks (#184829)

Don't require -memprof-report-hinted-sizes for emitting opt remarks
during the thin link step. Invoke the handling also when opt remarks are
enabled for MemProf per OptimizationRemarkEmitter::allowExtraAnalysis.

Also, add a fallback message if we don't have the context size
information, adding tests for those new messages.

I also realized we don't currently emit these messages for MemProf with
regular LTO, and added a TODO.
DeltaFile
+64-0llvm/test/ThinLTO/X86/remark-missing-info.ll
+59-0llvm/test/Transforms/MemProfContextDisambiguation/remark-missing-info.ll
+44-6llvm/test/ThinLTO/X86/memprof-basic.ll
+29-6llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
+7-7llvm/test/Transforms/MemProfContextDisambiguation/inlined3.ll
+3-3llvm/test/Transforms/MemProfContextDisambiguation/basic.ll
+206-223 files not shown
+211-259 files

LLVM/project f31e65flibclc/clc/lib/amdgcn/mem_fence clc_mem_fence.cl, libclc/opencl/lib/generic SOURCES

libclc: Add atomic_work_item_fence (#184844)
DeltaFile
+18-0libclc/opencl/lib/generic/atomic/atomic_work_item_fence.cl
+2-0libclc/clc/lib/amdgcn/mem_fence/clc_mem_fence.cl
+1-0libclc/opencl/lib/generic/SOURCES
+21-03 files

LLVM/project 42e6928llvm/test/CodeGen/AArch64 clmul-fixed.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

Merge branch 'main' into users/kasuga-fj/da-fix-weak-zero-siv
DeltaFile
+53,024-7,001llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+15,172-1,553llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+6,812-3,080llvm/test/CodeGen/AArch64/clmul-fixed.ll
+6,520-0llvm/test/CodeGen/X86/bit-manip-i512.ll
+3,441-0llvm/test/MC/AMDGPU/gfx13_asm_vflat.s
+3,257-0llvm/test/CodeGen/X86/bit-manip-i256.ll
+88,226-11,6341,516 files not shown
+142,295-28,6401,522 files

LLVM/project 1f2e52fllvm/test/tools/llvm-ir2vec/Inputs input.ll, llvm/test/tools/llvm-ir2vec/bindings ir2vec-bindings.py

[llvm-ir2vec] Adding getFuncNames API to ir2vec python bindings (#180473)

This is mostly a user convenience, but I thought it would be helpful.

Otherwise, at the moment, the user has to fetch the entire embeddings
dict just to see which functions a module has.
DeltaFile
+11-0llvm/tools/llvm-ir2vec/Bindings/PyIR2Vec.cpp
+11-0llvm/test/tools/llvm-ir2vec/bindings/ir2vec-bindings.py
+3-0llvm/test/tools/llvm-ir2vec/Inputs/input.ll
+25-03 files

LLVM/project cb6936ellvm/test/Transforms/LoopVectorize/AArch64 blend-costs.ll

[LV] Remove branch on false in blend-costs.ll test. NFC (#184816)

I have a patch I want to post that improves blend masks, but it ends up
with a weird diff in this test stemming from the branch on false.

This replaces it with an external boolean. This should still test
scalarizing a blend which I believe is the original intent.
DeltaFile
+49-23llvm/test/Transforms/LoopVectorize/AArch64/blend-costs.ll
+49-231 files

LLVM/project 46d29d4lld/ELF Relocations.cpp RelocScan.h

[ELF] Remove unused handleTlsRelocation (#184951)

Now that all targets use target-specific relocation scanning for TLS
(RISC-V, in #181332, being the last), handleTlsRelocation is unused.
DeltaFile
+0-100lld/ELF/Relocations.cpp
+0-15lld/ELF/RelocScan.h
+2-3lld/ELF/InputSection.cpp
+2-1183 files

LLVM/project 4541e23llvm/lib/Target/RISCV RISCVISelDAGToDAG.cpp RISCVInstrInfoP.td, llvm/test/CodeGen/RISCV rvp-ext-rv64.ll rvp-ext-rv32.ll

[RISCV][P-ext] Select plui.h/w and improve usage of pli.b/h/w. (#184937)

This patch adds custom instruction selection of splat_vector of
constants. Rather than using the element size from the VT, find
the smallest splat size in the constant. This allows us to use
pli.b for i16 or i32 elements that contain a byte splat.
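
The idea of "smallest splat size" can be sketched as follows. This is an illustrative model only, not the RISCVISelDAGToDAG code: the function name is hypothetical, and the constant is modeled as a plain integer of `total_bits` bits with a minimum element width of 8.

```python
def smallest_splat_bits(value, total_bits):
    """Return the smallest element width (>= 8 bits, power of two) whose
    repetition reproduces `value` within `total_bits` bits."""
    value &= (1 << total_bits) - 1
    width = 8
    while width < total_bits:
        elem = value & ((1 << width) - 1)
        # Rebuild the constant by repeating the low `width` bits.
        rebuilt = 0
        for shift in range(0, total_bits, width):
            rebuilt |= elem << shift
        if rebuilt == value:
            return width
        width *= 2
    # No narrower repetition found: the constant is its own smallest splat.
    return total_bits


byte_splat_in_i32 = smallest_splat_bits(0x7F7F7F7F, 32)   # repeats 0x7F
half_splat_in_i32 = smallest_splat_bits(0x00050005, 32)   # repeats 0x0005
no_splat_in_i16 = smallest_splat_bits(0x1234, 16)
```

In this model, an i32 element holding `0x7F7F7F7F` reports a width of 8, which is the case where a byte-sized immediate form would suffice.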
DeltaFile
+82-14llvm/test/CodeGen/RISCV/rvp-ext-rv64.ll
+41-14llvm/test/CodeGen/RISCV/rvp-ext-rv32.ll
+32-0llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+0-8llvm/lib/Target/RISCV/RISCVInstrInfoP.td
+155-364 files

LLVM/project 62a5e53mlir/lib/Transforms/Utils DialectConversion.cpp, mlir/test/Transforms test-legalizer.mlir

[mlir] Improve dialect conversion failure diagnostics (#182729)

This PR improves MLIR dialect conversion failure diagnostics when
legalization fails.

Previously, the diagnostic mostly included the operation name (and in
partial conversion, whether it was explicitly marked illegal). This
change keeps that prefix and appends the printed failing operation. This
provides immediate operand/result/type context directly in the same
error line.

### Example

Before:
```
failed to legalize operation 'test.type_consumer' that was explicitly marked illegal
```

After:

    [6 lines not shown]
DeltaFile
+10-4mlir/lib/Transforms/Utils/DialectConversion.cpp
+2-2mlir/test/Transforms/test-legalizer.mlir
+12-62 files

LLVM/project 54e4eeblibc/src/__support/CPP bit.h

reapply static
DeltaFile
+1-1libc/src/__support/CPP/bit.h
+1-11 files

LLVM/project fb7d255libcxx/include string

[libc++][string] Replace ASAN volatile wrapper with memory barrier (#184693)

The previous `_LIBCPP_ASAN_VOLATILE_WRAPPER` approach was used to prevent
speculative loads of string data before the short/long state was
determined. This patch replaces that mechanism with a more explicit
`__annotate_memory_barrier()` using an empty volatile assembly block.

This PR is inspired by #183457 and by a downstream false positive on
`__get_long_size`. It fails the same way `__get_long_pointer` did before we
had `_LIBCPP_ASAN_VOLATILE_WRAPPER`. The barrier approach avoids
expanding `_LIBCPP_ASAN_VOLATILE_WRAPPER` for size_t and, in general,
looks more readable.

I failed to create a reasonable reproducer for a test; I suspect it requires
a precise set of compiler flags and a libc++ site_config, which would be hard
to maintain in a test.
DeltaFile
+21-28libcxx/include/string
+21-281 files

LLVM/project b6c06fdlibc/src/__support/FPUtil bfloat16.h NearestIntegerOperations.h, libc/src/__support/FPUtil/generic add_sub.h

[libc][math] Qualify ceil functions to constexpr
DeltaFile
+59-7libc/test/shared/shared_math_test.cpp
+13-13libc/src/__support/FPUtil/generic/add_sub.h
+11-11libc/src/__support/FPUtil/bfloat16.h
+8-8libc/src/__support/FPUtil/NearestIntegerOperations.h
+13-1libc/src/__support/math/ceill.h
+7-7libc/src/__support/FPUtil/comparison_operations.h
+111-479 files not shown
+141-7215 files

LLVM/project 1093a18llvm/lib/Transforms/Vectorize/SandboxVectorizer DependencyGraph.cpp, llvm/unittests/Transforms/Vectorize/SandboxVectorizer DependencyGraphTest.cpp

[SandboxVec][DAG] Handle unscheduled successors when user is external (#183861)

Whenever an IR use-def edge gets updated, the DAG gets notified about
the change by having its `notifySetUse()` callback called. The
callback's job is to update the DAG node's `UnscheduledSuccs` counter
which is the number of successor nodes that are yet to be scheduled.

This update makes sense only if both ends of the use-def edge are in the
DAG. Up until now we would still update the counter even if the user was
outside the DAG. This patch fixes this, so from now on we skip updating
`UnscheduledSuccs` if the user is outside the DAG.
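
The behavior described above can be sketched with a toy model. This is not the SandboxVectorizer implementation: the class names, the string-keyed node table, and the exact increment/decrement rule are assumptions made for illustration. Only the "skip when the user is outside the DAG" guard reflects the commit message.

```python
class ToyNode:
    def __init__(self, name):
        self.name = name
        # Number of successors (users) that are not yet scheduled.
        self.unscheduled_succs = 0


class ToyDAG:
    def __init__(self, names):
        self.nodes = {name: ToyNode(name) for name in names}

    def notify_set_use(self, user, old_def, new_def):
        """Called when `user` stops using `old_def` and starts using `new_def`."""
        # The guard from the commit message: an edge whose user is
        # outside the DAG must not affect any counter.
        if user not in self.nodes:
            return
        if old_def in self.nodes:
            self.nodes[old_def].unscheduled_succs -= 1
        if new_def in self.nodes:
            self.nodes[new_def].unscheduled_succs += 1


dag = ToyDAG(["a", "b", "u"])
dag.nodes["a"].unscheduled_succs = 1      # "u" initially uses "a"

dag.notify_set_use("u", "a", "b")         # user inside the DAG: counters move
inside = (dag.nodes["a"].unscheduled_succs, dag.nodes["b"].unscheduled_succs)

dag.notify_set_use("ext", "b", "a")       # user outside the DAG: no change
outside = (dag.nodes["a"].unscheduled_succs, dag.nodes["b"].unscheduled_succs)
```

In this sketch the second call leaves both counters untouched because `"ext"` is not a DAG node.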
DeltaFile
+41-0llvm/unittests/Transforms/Vectorize/SandboxVectorizer/DependencyGraphTest.cpp
+12-2llvm/lib/Transforms/Vectorize/SandboxVectorizer/DependencyGraph.cpp
+53-22 files

LLVM/project eafd076llvm/lib/Target/RISCV RISCVISelDAGToDAG.cpp, llvm/test/CodeGen/RISCV rvp-ext-rv32.ll rvp-ext-rv64.ll

[RISCV][P-ext] Select (splat_vector 0) as copy from X0. (#184911)
DeltaFile
+16-0llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+1-2llvm/test/CodeGen/RISCV/rvp-ext-rv32.ll
+1-2llvm/test/CodeGen/RISCV/rvp-ext-rv64.ll
+18-43 files

LLVM/project 3015979clang/test/CodeGen/RISCV riscv-inline-asm.c, llvm/lib/Target/RISCV RISCVISelLowering.cpp

[RISCV] Support 'f' Inline Assembly Constraint for bfloat16 (#184566)

This patch adds the 'f' and 'cf' inline assembly constraints for the `bfloat16` type, so that such values are passed in floating-point registers.
DeltaFile
+45-0llvm/test/CodeGen/RISCV/inline-asm-bf-constraint-f.ll
+4-0llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+4-0clang/test/CodeGen/RISCV/riscv-inline-asm.c
+53-03 files

LLVM/project 4ea72c1lld/ELF Relocations.cpp InputSection.cpp, lld/ELF/Arch RISCV.cpp

[ELF] Add target-specific relocation scanning for RISC-V (#181332)

Implement RISCV::scanSectionImpl, following the pattern established
for x86 (#178846) and AArch64 (#181099). This merges the getRelExpr
and TLS handling for SHF_ALLOC sections into the target-specific
scanner, enabling devirtualization and eliminating abstraction
overhead.

- Inline relocation classification into scanSectionImpl with a switch
  on relocation type, replacing the generic rs.scan() path.
- Use processR_PC/processR_PLT_PC for common PC-relative and PLT
  relocations.
- Handle TLS IE and GD directly (RISC-V does not optimize GD/LD/IE).
- Replace TLS-optimization-specific expressions for TLSDESC, following
  the x86 pattern: R_RELAX_TLS_GD_TO_IE -> R_GOT_PC,
  R_RELAX_TLS_GD_TO_LE -> R_TPREL. Update relocateAlloc and relax()
  to dispatch on relocation type instead of RelExpr for TLSDESC.
- Simplify getRelExpr to only handle relocations needed by
  relocateNonAlloc and preprocessRelocs.

    [4 lines not shown]
DeltaFile
+185-94lld/ELF/Arch/RISCV.cpp
+36-6lld/test/ELF/riscv-vendor-relocations.s
+32-0lld/test/ELF/riscv-vendor-relocations2.test
+6-21lld/ELF/Relocations.cpp
+2-2lld/test/ELF/riscv-reloc-leb128.s
+1-1lld/ELF/InputSection.cpp
+262-1246 files

LLVM/project f7ca74fllvm/lib/Target/RISCV RISCVInstrInfoV.td, llvm/test/MC/RISCV/rvv zvlsseg-invalid.s

[RISCV] Add register overlap checks to the assembler for vector indexed segment load (#184569)

The destination vector register group cannot overlap the source vector
register group for vector indexed segment loads. This patch adds
register overlap checks to the assembler.
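
As a rough illustration of what an overlap check means, the sketch below models each register group as a contiguous range of vector register numbers. The function name and parameters are hypothetical, and how the group sizes follow from the instruction encoding is deliberately not modeled: the caller supplies the register counts. The actual patch expresses the constraint in RISCVInstrInfoV.td.

```python
def register_groups_overlap(dst_start, dst_count, src_start, src_count):
    """Return True if [dst_start, dst_start + dst_count) intersects
    [src_start, src_start + src_count)."""
    dst_end = dst_start + dst_count
    src_end = src_start + src_count
    # Two half-open intervals intersect iff each starts before the other ends.
    return dst_start < src_end and src_start < dst_end


# Destination group v4-v5 with index register v5: overlapping.
overlapping = register_groups_overlap(4, 2, 5, 1)

# Destination group v4-v5 with index register v8: disjoint.
disjoint = register_groups_overlap(4, 2, 8, 1)
```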
DeltaFile
+66-0llvm/test/MC/RISCV/rvv/zvlsseg-invalid.s
+4-0llvm/lib/Target/RISCV/RISCVInstrInfoV.td
+70-02 files

LLVM/project 76ffbc7llvm/lib/Target/AMDGPU SIRegisterInfo.cpp, llvm/test/CodeGen/AMDGPU vgpr-spill.mir

Review comments
DeltaFile
+8-7llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+4-4llvm/test/CodeGen/AMDGPU/vgpr-spill.mir
+12-112 files

LLVM/project f712c97clang/include/clang/DependencyScanning ModuleDepCollector.h, clang/lib/DependencyScanning ModuleDepCollector.cpp

[clang][deps] Store `IgnoreCWD` on `ModuleDeps` (#184921)

This aligns us with downstream, where we need to be able to query
whether a module depends on CWD or not.
DeltaFile
+6-5clang/lib/DependencyScanning/ModuleDepCollector.cpp
+4-1clang/include/clang/DependencyScanning/ModuleDepCollector.h
+10-62 files

LLVM/project eaae8e2llvm/lib/Target/RISCV RISCVISelLowering.cpp

[RISCV] Remove outdated TODO in isExtractSubvectorCheap (#184938)

Index 0 is already handled by an early return, so the TODO comment about
extracting index 0 from a mask vector is no longer needed.
DeltaFile
+0-1llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+0-11 files

LLVM/project 0538d0aclang/lib/Basic/Targets NVPTX.h, clang/test/Misc nvptx.unsupported_core.cl nvptx.languageOptsOpenCL.cl

[NVPTX] Enable OpenCL 3d_image_writes support (#143331)

NVIDIA supports opencl_3d_image_writes according to
https://developer.nvidia.com/blog/nvidia-is-now-opencl-3-0-conformant/

This PR makes it possible to drop the explicit enabling of image
extensions via the -cl-ext command-line option, e.g. at
https://github.com/intel/llvm/blob/43b3d42e2b2060e9e9e3a96469a1982dc4c10ddd/libclc/CMakeLists.txt#L503
DeltaFile
+0-7clang/test/Misc/nvptx.unsupported_core.cl
+2-3clang/test/Misc/nvptx.languageOptsOpenCL.cl
+4-0clang/lib/Basic/Targets/NVPTX.h
+6-103 files

LLVM/project bcb85e3llvm/test/CodeGen/AArch64 clmul-fixed.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

Merge branch 'main' into users/c8ef/fold_left_first
DeltaFile
+53,024-7,001llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+15,172-1,553llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+6,520-0llvm/test/CodeGen/X86/bit-manip-i512.ll
+3,717-56llvm/test/CodeGen/AArch64/clmul-fixed.ll
+3,441-0llvm/test/MC/AMDGPU/gfx13_asm_vflat.s
+3,257-0llvm/test/CodeGen/X86/bit-manip-i256.ll
+85,131-8,6101,222 files not shown
+123,904-21,9951,228 files

LLVM/project 8acbd0clibcxx/test/libcxx/strings/basic.string/string.cons constexpr_initialization_stress.pass.cpp

5000

Created using spr 1.3.7
DeltaFile
+1-1libcxx/test/libcxx/strings/basic.string/string.cons/constexpr_initialization_stress.pass.cpp
+1-11 files

LLVM/project 61ead49clang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/test/AST/HLSL Texture2D-vector-AST.hlsl Texture2D-scalar-AST.hlsl

Merge branch 'main' into users/vitalybuka/spr/libcxxstring-replace-asan-volatile-wrapper-with-memory-barrier
DeltaFile
+754-88clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+726-0clang/test/AST/HLSL/Texture2D-vector-AST.hlsl
+722-0clang/test/AST/HLSL/Texture2D-scalar-AST.hlsl
+560-0clang/test/SemaSYCL/sycl-kernel-launch.cpp
+364-111llvm/test/CodeGen/AMDGPU/llvm.fptrunc.round.ll
+0-439clang/test/AST/HLSL/Texture2D-AST.hlsl
+3,126-638256 files not shown
+9,480-2,995262 files