LLVM/project b39a9db clang/lib/Tooling/DependencyScanning ModuleDepCollector.cpp, clang/test/ClangScanDeps modules-current-modulemap-file-dep.c modules-header-sharing.m

[clang][deps] Add module map describing compiled module to file dependencies. (#160226)

When we add the module map describing the compiled module to the command
line, add it to the file dependencies as well.

Discovered while working on reproducers where a command line input was
missing in the captured files as it wasn't considered a dependency.
Delta  File
+57 -0  clang/test/ClangScanDeps/modules-current-modulemap-file-dep.c
+7 -0  clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+2 -1  clang/test/ClangScanDeps/modules-header-sharing.m
+2 -1  clang/test/ClangScanDeps/modules-implementation-private.m
+2 -1  clang/test/ClangScanDeps/modules-implementation-module-map.c
+2 -1  clang/test/ClangScanDeps/modules-fmodule-name-no-module-built.m
+72 -4  6 files

LLVM/project d033d69 mlir/test/Dialect/Transform test-pass-application.mlir, mlir/test/Pass invalid-unsupported-operation.mlir

address comments
Delta  File
+0 -32  mlir/test/Dialect/Transform/test-pass-application.mlir
+10 -0  mlir/test/Pass/invalid-unsupported-operation.mlir
+10 -32  2 files

LLVM/project 13ed14f llvm/test/Transforms/SLPVectorizer/AMDGPU slp-v2f16.ll

AMDGPU: Autogenerate checks in a test (#168815)

Delta  File
+345 -52  llvm/test/Transforms/SLPVectorizer/AMDGPU/slp-v2f16.ll
+345 -52  1 files

LLVM/project 7198279 clang/lib/AST/ByteCode Compiler.cpp, clang/test/AST/ByteCode switch.cpp

[clang][bytecode] Implement case ranges (#168418)

Fixes #165969

Implement GNU case ranges for constexpr bytecode interpreter.
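
For readers unfamiliar with the extension, here is a minimal sketch of the kind of code this affects: a constexpr function whose switch uses GNU case ranges (`low ... high`). The function `classify` and its return codes are illustrative assumptions, not taken from the patch or its tests. Case ranges are a GCC/Clang extension, not standard C++.

```cpp
#include <cassert>

// Hypothetical example: classify a character using GNU case ranges.
// `case low ... high:` matches every value in the inclusive range.
constexpr int classify(char c) {
  switch (c) {
  case '0' ... '9':
    return 1; // digit
  case 'a' ... 'z':
  case 'A' ... 'Z':
    return 2; // letter
  default:
    return 0; // anything else
  }
}

// Evaluated at compile time, so the constant evaluator has to understand
// the range labels.
static_assert(classify('7') == 1, "digits fall in the first range");
static_assert(classify('q') == 2, "lowercase letters fall in a letter range");
static_assert(classify('!') == 0, "other characters reach default");
```

The static_asserts force constant evaluation of the switch, which is the path a constexpr interpreter has to handle.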
Delta  File
+103 -0  clang/test/AST/ByteCode/switch.cpp
+30 -3  clang/lib/AST/ByteCode/Compiler.cpp
+133 -3  2 files

LLVM/project 47b756a llvm/lib/Target/RISCV RISCVVLOptimizer.cpp, llvm/test/CodeGen/RISCV/rvv vl-opt.ll vl-opt.mir

[RISCV] Only reduce VLs of instructions with demanded VLs (#168693)

In RISCVVLOptimizer we first compute all the demanded VLs, then we walk
backwards through the function and try to reduce any VLs.

We don't actually need to walk backwards anymore since after #124530 the
order in which we modify the instructions doesn't matter.

This patch changes it to just iterate over the instructions with a
demanded VL computed, which means we don't iterate over scalar
instructions etc.

This also fixes #168665, where we triggered an assert on instructions
with a dead $vxsat implicit-def:

dead %x:vr = PseudoVSADDU_VV_M1 $noreg, $noreg, $noreg, -1, 3 /* e8 */,
0 /* tu, mu */, implicit-def dead $vxsat

Because $vxsat is a reserved register, DeadMachineInstructionElim won't

    [6 lines not shown]
Delta  File
+24 -31  llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
+25 -0  llvm/test/CodeGen/RISCV/rvv/vl-opt.ll
+12 -0  llvm/test/CodeGen/RISCV/rvv/vl-opt.mir
+61 -31  3 files

LLVM/project 19a4296 llvm/lib/Transforms/Vectorize VectorCombine.cpp, llvm/test/Transforms/VectorCombine/AMDGPU extract-insert-i8.ll

VectorCombine: Improve the insert/extract fold in the narrowing case

Keeping the extracted element in a natural position in the narrowed
vector has two beneficial effects:

1. It makes the narrowing shuffles cheaper (at least on AMDGPU), which
   allows the insert/extract fold to trigger.
2. It makes the narrowing shuffles in a chain of extract/insert
   compatible, which allows foldLengthChangingShuffles to successfully
   recognize a chain that can be folded.

There are minor X86 test changes that look reasonable to me. The IR
change for AVX2 in llvm/test/Transforms/VectorCombine/X86/extract-insert-poison.ll
doesn't change the assembly generated by `llc -mtriple=x86_64-- -mattr=AVX2`
at all.

commit-id:c151bb04
Delta  File
+6 -16  llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+2 -15  llvm/test/Transforms/VectorCombine/AMDGPU/extract-insert-i8.ll
+8 -4  llvm/test/Transforms/VectorCombine/X86/extract-insert-poison.ll
+4 -4  llvm/test/Transforms/VectorCombine/X86/extract-insert.ll
+2 -2  llvm/test/Transforms/VectorCombine/X86/pr126085.ll
+22 -41  5 files

LLVM/project 21c74b5 llvm/lib/Transforms/Vectorize VectorCombine.cpp, llvm/test/Transforms/VectorCombine/AMDGPU extract-insert-i8.ll

VectorCombine: Fold chains of shuffles fed by length-changing shuffles

Such chains can arise from folding insert/extract chains.

commit-id:a960175d
Delta  File
+168 -0  llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+8 -33  llvm/test/Transforms/VectorCombine/AMDGPU/extract-insert-i8.ll
+176 -33  2 files

LLVM/project 0be4495 llvm/lib/Target/AMDGPU AMDGPUTargetTransformInfo.cpp, llvm/test/Analysis/CostModel/AMDGPU shufflevector.ll

AMDGPU: Improve getShuffleCost accuracy for 8- and 16-bit shuffles

These shuffles can always be implemented using v_perm_b32, and so this
rewrites the analysis from the perspective of "how many v_perm_b32s does
it take to assemble each register of the result?"

The test changes in Transforms/SLPVectorizer/reduction.ll are
reasonable: VI (gfx8) has native f16 math, but not packed math.

commit-id:8b76e888
Delta  File
+498 -488  llvm/test/Analysis/CostModel/AMDGPU/shufflevector.ll
+107 -20  llvm/test/Transforms/SLPVectorizer/AMDGPU/reduction.ll
+92 -30  llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+33 -64  llvm/test/Transforms/VectorCombine/AMDGPU/extract-insert-i8.ll
+17 -34  llvm/test/Transforms/SLPVectorizer/AMDGPU/slp-v2f16.ll
+1 -31  llvm/test/Transforms/VectorCombine/AMDGPU/extract-insert-chain-to-shuffles.ll
+748 -667  6 files

LLVM/project 6683adb llvm/test/Transforms/VectorCombine/AMDGPU extract-insert-chain-to-shuffles.ll extract-insert-i8.ll

VectorCombine/AMDGPU: Cleanup a test and add a new one

The existing, recently added test contains a whole lot of noise in the
form of dead instructions. Also, prefer named values.

The new test isolates a separate issue with concatenating i8 vectors.

commit-id:0b9965b7
Delta  File
+47 -531  llvm/test/Transforms/VectorCombine/AMDGPU/extract-insert-chain-to-shuffles.ll
+186 -0  llvm/test/Transforms/VectorCombine/AMDGPU/extract-insert-i8.ll
+233 -531  2 files

LLVM/project ba875db clang/lib/CodeGen CGExpr.cpp, compiler-rt/lib/ubsan_minimal ubsan_minimal_handlers.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta  File
+13 -39  compiler-rt/lib/ubsan_minimal/ubsan_minimal_handlers.cpp
+0 -16  clang/lib/CodeGen/CGExpr.cpp
+1 -1  compiler-rt/test/ubsan_minimal/TestCases/null.cpp
+1 -1  compiler-rt/test/ubsan_minimal/TestCases/misalignment.cpp
+15 -57  4 files

LLVM/project 3f151a3 libcxx/include/__memory shared_ptr.h unique_ptr.h, libcxx/test/libcxx/utilities/smartptr nodiscard.verify.cpp

[libc++][memory] Applied `[[nodiscard]]` to smart pointers (#168483)

Applied `[[nodiscard]]` where relevant to smart pointers and related
functions.

- [x] - `std::unique_ptr`
- [x] - `std::shared_ptr`
- [x] - `std::weak_ptr`

See guidelines:
- https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant
- `[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue. For example a locking
constructor in unique_lock.

---------

Co-authored-by: Hristo Hristov <zingam at outlook.com>
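
As a rough illustration of what `[[nodiscard]]` buys in practice, here is a minimal sketch. The helpers `holdsValue` and `readOrDefault` are hypothetical and not part of libc++ or of this patch. Which smart pointer members are actually annotated is defined by the patch itself and is not listed in this log.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: the return value is the only reason to call it, so
// silently discarding the result is almost certainly a bug.
[[nodiscard]] inline bool holdsValue(const std::unique_ptr<int> &P) {
  return P != nullptr;
}

// Returns the pointee if there is one, otherwise the supplied default.
inline int readOrDefault(const std::unique_ptr<int> &P, int Default) {
  // A bare `holdsValue(P);` statement here would draw a compiler warning
  // about ignoring a [[nodiscard]] result.
  return holdsValue(P) ? *P : Default;
}
```

Using the result, as `readOrDefault` does, compiles cleanly; the diagnostic only fires when the value is dropped on the floor.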
Delta  File
+127 -2  libcxx/test/libcxx/utilities/smartptr/nodiscard.verify.cpp
+57 -41  libcxx/include/__memory/shared_ptr.h
+13 -8  libcxx/include/__memory/unique_ptr.h
+1 -1  libcxx/test/std/utilities/memory/util.smartptr/util.smartptr.shared/util.smartptr.shared.obs/unique.deprecated_in_cxx17.verify.cpp
+198 -52  4 files

LLVM/project 9c66069 clang/include/clang/Options Options.td, clang/lib/Driver SanitizerArgs.cpp

rename

Created using spr 1.3.7
Delta  File
+5 -5  clang/lib/Driver/SanitizerArgs.cpp
+2 -2  clang/include/clang/Options/Options.td
+2 -2  clang/test/Driver/fsanitize.c
+1 -1  clang/test/CodeGen/cfi-icall-trap-recover-runtime.c
+1 -1  clang/test/CodeGenCXX/cfi-vcall-trap-recover-runtime.cpp
+1 -1  compiler-rt/test/ubsan_minimal/TestCases/override-callback.c
+12 -12  6 files

LLVM/project c21b8a7 clang/include/clang/Options Options.td, clang/lib/Driver SanitizerArgs.cpp

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.7

[skip ci]
Delta  File
+5 -5  clang/lib/Driver/SanitizerArgs.cpp
+2 -2  clang/include/clang/Options/Options.td
+2 -2  clang/test/Driver/fsanitize.c
+9 -9  3 files

LLVM/project 5a7c143 clang/include/clang/Options Options.td, clang/lib/Driver SanitizerArgs.cpp

rename

Created using spr 1.3.7
Delta  File
+5 -5  clang/lib/Driver/SanitizerArgs.cpp
+2 -2  clang/include/clang/Options/Options.td
+2 -2  clang/test/Driver/fsanitize.c
+9 -9  3 files

LLVM/project fda20d9 utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[bazel] Fix #165009 (#168804)

Delta  File
+11 -0  utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+11 -0  1 files

LLVM/project 79fffed llvm/lib/DWP DWP.cpp, llvm/test/tools/llvm-dwp/X86 incompatible_dwarf_version.test

[llvm-dwp] Give more information when incompatible version found (#168511)

Provide more information when detecting a DWARF version mismatch in .dwo
files to help locate the issue and align with other similar errors.
Delta  File
+6 -1  llvm/lib/DWP/DWP.cpp
+6 -0  llvm/test/tools/llvm-dwp/X86/incompatible_dwarf_version.test
+12 -1  2 files

LLVM/project beac880 compiler-rt/test/asan/TestCases stack_container_dynamic_lib.cpp

Better fix for the stack_container_dynamic_lib test (#168798)

Add the missing %libdl to the link command
Delta  File
+2 -5  compiler-rt/test/asan/TestCases/stack_container_dynamic_lib.cpp
+2 -5  1 files

LLVM/project c36dd77 clang/include/clang/Options Options.td, clang/lib/CodeGen CGExpr.cpp

rename

Created using spr 1.3.7
Delta  File
+6 -6  llvm/lib/Passes/PassBuilder.cpp
+6 -5  clang/lib/Driver/SanitizerArgs.cpp
+4 -3  clang/lib/CodeGen/CGExpr.cpp
+3 -3  llvm/include/llvm/Transforms/Instrumentation/BoundsChecking.h
+3 -3  clang/include/clang/Options/Options.td
+2 -2  clang/test/Driver/fsanitize.c
+24 -22  8 files not shown
+33 -30  14 files

LLVM/project adac920 clang/include/clang/Basic CodeGenOptions.def, clang/include/clang/Driver SanitizerArgs.h

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.7

[skip ci]
Delta  File
+6 -5  clang/lib/Driver/SanitizerArgs.cpp
+3 -3  clang/include/clang/Options/Options.td
+2 -2  clang/test/Driver/fsanitize.c
+1 -1  clang/include/clang/Basic/CodeGenOptions.def
+1 -1  clang/include/clang/Driver/SanitizerArgs.h
+13 -12  5 files

LLVM/project d223103 clang/include/clang/Basic CodeGenOptions.def, clang/include/clang/Driver SanitizerArgs.h

rename

Created using spr 1.3.7
Delta  File
+6 -5  clang/lib/Driver/SanitizerArgs.cpp
+3 -3  clang/include/clang/Options/Options.td
+2 -2  clang/test/Driver/fsanitize.c
+1 -1  clang/include/clang/Driver/SanitizerArgs.h
+1 -1  clang/include/clang/Basic/CodeGenOptions.def
+13 -12  5 files

LLVM/project bc2f9a5 clang-tools-extra/clang-doc ClangDoc.cpp Representation.h

[clang-doc] Remove unused headers

Removes unused headers or replaces them with headers that directly
provide the symbol instead. For example, `Serialize.h` included `AST.h`,
but it was actually `Serialize.cpp` that needed concept expressions, so
now it includes just `ExprConcepts.h`.
Delta  File
+0 -4  clang-tools-extra/clang-doc/ClangDoc.cpp
+1 -3  clang-tools-extra/clang-doc/Representation.h
+0 -4  clang-tools-extra/clang-doc/BitcodeWriter.h
+0 -3  clang-tools-extra/clang-doc/Serialize.h
+2 -0  clang-tools-extra/clang-doc/Serialize.cpp
+0 -2  clang-tools-extra/clang-doc/ClangDoc.h
+3 -16  5 files not shown
+4 -23  11 files

LLVM/project 9e9fe08 llvm/lib/Transforms/Vectorize LoadStoreVectorizer.cpp, llvm/test/CodeGen/AMDGPU splitkit-getsubrangeformask.ll fmul-2-combine-multi-use.ll

Re-land [Transform][LoadStoreVectorizer] allow redundant in Chain (#168135)

This is the fixed version of
https://github.com/llvm/llvm-project/pull/163019
Delta  File
+83 -88  llvm/test/CodeGen/AMDGPU/splitkit-getsubrangeformask.ll
+65 -87  llvm/test/CodeGen/AMDGPU/fmul-2-combine-multi-use.ll
+109 -0  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/vectorize-redund-loads.ll
+69 -34  llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
+57 -40  llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/multiple_tails.ll
+25 -26  llvm/test/CodeGen/AMDGPU/branch-folding-implicit-def-subreg.ll
+408 -275  10 files not shown
+476 -339  16 files

LLVM/project be68c42 utils/bazel/llvm-project-overlay/llvm BUILD.bazel

buildifier fix
Delta  File
+2 -2  utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+2 -2  1 files

LLVM/project 9536234 llvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp

[AMDGPU] Allow hazard checks for WMMA co-exec

Currently we just insert V_NOP instructions; try to schedule
something into the shadow instead.

It is still somewhat imprecise: for example, AdvanceCycle() will
use TII.getNumWaitStates() anyway, but in scheduling mode we are
not required to be precise. We only have to be precise in the end,
in hazard recognizer mode. The EmittedInstrs buffer is also
limited to MaxLookAhead, even though VALU-only hazards may never
expire and would require an unbounded buffer. But that's OK: we can
at least mitigate whatever the buffer can hold. The buffer is also
currently much bigger than any of the VALU hazards may need.

That said, the rest of the 'fix*' functions here that use V_NOPs
can be changed the same way. This one is just the worst because
it may require up to 9 nops.
Delta  File
+6 -0  llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+6 -0  1 files

LLVM/project 148faa6 utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[bazel] Fix #165009
Delta  File
+11 -0  utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+11 -0  1 files

LLVM/project f969694 clang/include/clang/Basic TargetID.h, clang/lib/Basic TargetID.cpp

[ClangLinkerWrapper] Refactor target ID sanitization for Windows file names (#168744)

Fix non-RDC mode HIP compilation for the new driver on Windows due to
invalid temporary file names when offload arch is a target ID containing
':', which is invalid in file names on Windows.

Refactor the existing handling of ':' in file names on Windows from
clang driver into a shared function sanitizeTargetIDInFileName in
clang/Basic/TargetID.h. This function replaces ':' with '@' on Windows
only, preserving the original behavior.

Update both clang/lib/Driver/Driver.cpp and
clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp to use this
shared function, ensuring consistent handling across both tools.
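
The replacement rule described above can be sketched as follows. This is a hypothetical stand-in, not the real sanitizeTargetIDInFileName: the actual signature is not shown in this log, and the name `sanitizeTargetIDSketch` and the explicit `IsWindows` flag are assumptions made so that both behaviors can be exercised on any host.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical sketch of the described behavior: on Windows, ':' is not
// valid in file names, so each ':' in the target ID becomes '@'. On other
// hosts the target ID is returned unchanged.
inline std::string sanitizeTargetIDSketch(std::string TargetID,
                                          bool IsWindows) {
  if (IsWindows)
    std::replace(TargetID.begin(), TargetID.end(), ':', '@');
  return TargetID;
}
```

For instance, a target ID such as `gfx90a:xnack+` would become `gfx90a@xnack+` when `IsWindows` is true and stay as-is otherwise.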
Delta  File
+24 -8  clang/test/Driver/linker-wrapper-hip-no-rdc.c
+10 -0  clang/lib/Basic/TargetID.cpp
+1 -6  clang/lib/Driver/Driver.cpp
+4 -3  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+5 -0  clang/include/clang/Basic/TargetID.h
+44 -17  5 files

LLVM/project def8ecb mlir/lib/Dialect/Tosa/Transforms TosaDecomposeTransposeConv.cpp TosaDecomposeDepthwise.cpp, mlir/test/Dialect/Tosa tosa-decompose-depthwise.mlir tosa-decompose-transpose-conv.mlir

[tosa] : Relax dynamic dimension checks for batch for conv decompositions (#168764)

This PR relaxes the validation checks to allow input/output data to have
dynamic batch dimensions.
Delta  File
+23 -0  mlir/test/Dialect/Tosa/tosa-decompose-depthwise.mlir
+21 -0  mlir/test/Dialect/Tosa/tosa-decompose-transpose-conv.mlir
+14 -4  mlir/lib/Dialect/Tosa/Transforms/TosaDecomposeTransposeConv.cpp
+7 -2  mlir/lib/Dialect/Tosa/Transforms/TosaDecomposeDepthwise.cpp
+65 -6  4 files

LLVM/project 2c3aa92 llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 reused-last-instruction-in-split-node.ll matching-insert-point-for-nodes.ll

[SLP]Fix insertion point for setting for the nodes

Many of the def-use chain problems in the SLP vectorizer stem from the
fact that some nodes reuse the same instruction as their insertion
point. An insertion point is not an instruction, but the place between
instructions. To set it correctly, it is better to generate a pseudo
instruction immediately after the last instruction and use it as the
insertion point. This resolves the issues in most cases.

Fixes #168512 #168576
Delta  File
+148 -0  llvm/test/Transforms/SLPVectorizer/X86/reused-last-instruction-in-split-node.ll
+91 -0  llvm/test/Transforms/SLPVectorizer/X86/matching-insert-point-for-nodes.ll
+15 -0  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+2 -2  llvm/test/Transforms/SLPVectorizer/X86/gathered-node-with-in-order-parent.ll
+2 -2  llvm/test/Transforms/SLPVectorizer/X86/shuffle-mask-emission.ll
+258 -4  5 files

LLVM/project 5170c53 llvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp GCNHazardRecognizer.h

[AMDGPU] Refactor hazard recognizer for VALU-pipeline hazards. NFCI.

This is in preparation for handling these in the scheduler. I do not
expect any changes to the produced code here; it is just infrastructure.
Our current problem with the VALU pipeline hazards is that we only
insert V_NOP instructions in the hazard recognizer mode, but ignore
the hazards during scheduling. This patch is meant to create a mechanism
to actually account for them during scheduling.
Delta  File
+43 -38  llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+10 -3  llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h
+53 -41  2 files

LLVM/project 4e275f7 clang/lib/CodeGen ABIInfo.cpp ABIInfo.h, clang/lib/CodeGen/Targets AArch64.cpp X86.cpp

[Arm64EC][clang] Implement varargs support in clang. (#152411)

The clang side of the calling convention code for arm64 vs. arm64ec is
close enough that this isn't really noticeable in most cases, but the
rule for choosing whether to pass a struct directly or indirectly is
significantly different.

(Adapted from my old patch https://reviews.llvm.org/D125419 .)

Fixes #89615.
Delta  File
+79 -0  clang/test/CodeGen/arm64ec-varargs.c
+32 -12  clang/lib/CodeGen/Targets/AArch64.cpp
+6 -0  clang/lib/CodeGen/Targets/X86.cpp
+5 -0  clang/lib/CodeGen/ABIInfo.cpp
+4 -0  clang/lib/CodeGen/ABIInfo.h
+126 -12  5 files