LLVM/project 20b0089lld/ELF SymbolTable.cpp

[ELF] Remove redundant memset in SymbolTable::insert. NFC (#198132)

make<SymbolUnion>() value-initializes the union, zero-initializing all
sizeof(SymbolUnion) bytes. The following memset(sym, 0, sizeof(Symbol))
is therefore redundant.

This placeholder path runs no Symbol constructor, so it was not covered
by the constructor initialization in
905a88b923433eb8cd83677ea55bee82eb9ba498.
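
The value-initialization argument above can be illustrated with a small standalone sketch. `StorageUnion` and `makeValueInit` are hypothetical stand-ins for `SymbolUnion` and `make<T>()`, not the lld definitions; the sketch only assumes that the allocation helper ends in `new (mem) T()`.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-in for SymbolUnion: raw storage for any symbol kind.
union StorageUnion {
  alignas(8) unsigned char asDefined[48];
  alignas(8) unsigned char asUndefined[48];
};

// Hypothetical stand-in for make<T>(): the trailing "()" requests
// value-initialization, which zero-initializes a type that has no
// user-provided constructor.
template <typename T> T *makeValueInit(void *mem) { return new (mem) T(); }

// Returns true if every byte of the freshly made union reads as zero,
// even though the backing buffer was filled with 0xAA beforehand.
bool placeholderIsZeroed() {
  alignas(StorageUnion) unsigned char buf[sizeof(StorageUnion)];
  for (unsigned char &c : buf)
    c = 0xAA;
  StorageUnion *u = makeValueInit<StorageUnion>(buf);
  for (std::size_t i = 0; i != sizeof(u->asDefined); ++i)
    if (u->asDefined[i] != 0)
      return false;
  return true;
}
```

A `memset` of the same storage right after `makeValueInit` would write zeros over bytes that are already zero.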
DeltaFile
+2-2lld/ELF/SymbolTable.cpp
+2-21 files

LLVM/project 19502e4llvm/test/CodeGen/AMDGPU/GlobalISel sdivrem.ll udivrem.ll, llvm/test/CodeGen/Thumb2 mve-clmul.ll

Rebase

Created using spr 1.3.7
DeltaFile
+8,633-8,584llvm/test/CodeGen/Thumb2/mve-clmul.ll
+3,436-2,769llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll
+2,801-2,109llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll
+0-4,752llvm/test/tools/llvm-mca/RISCV/SiFiveP800/vlseg-vsseg.s
+4,549-0llvm/test/tools/llvm-mca/RISCV/SiFiveP800/rvv/arithmetic.test
+3,706-328llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
+23,125-18,5422,566 files not shown
+155,715-74,0332,572 files

LLVM/project f70897fllvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 vector-replicaton-i1-mask.ll

[X86] isExtractSubvectorCheap - fix typo in vXi1 extraction test (#198127)

Fix typo in check for ResVT subvector being half the size of the SrcVT vector (instead of vice-versa).

Fixes #195695
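
As a rough illustration of the direction of the check, here is a sketch with a hypothetical helper that works on plain element counts. It is not the code in X86ISelLowering.cpp; it only shows that "result is half of source" and "source is half of result" are different predicates.

```cpp
// Hypothetical helper: true when the extracted (result) vector has exactly
// half as many elements as the source vector it is extracted from.
bool isHalfWidthExtract(unsigned resNumElts, unsigned srcNumElts) {
  return resNumElts * 2 == srcNumElts;
}
```

With the operands swapped, the predicate would accept a 16-element "result" taken from an 8-element source, which cannot be a subvector extraction.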
DeltaFile
+1,243-8,768llvm/test/CodeGen/X86/vector-replicaton-i1-mask.ll
+3-2llvm/lib/Target/X86/X86ISelLowering.cpp
+1,246-8,7702 files

LLVM/project 905a88blld/ELF Symbols.h InputFiles.cpp

[ELF] Initialize Symbol fields in the constructor instead of via memset (#198129)

`initSectionsAndLocalSyms` and `makeDefined` memset the storage to zero
and then placement-new a Symbol-derived object into it. Placement new
begins a new object's lifetime. The standard does not seem to guarantee
the memset bytes carry into members the constructor leaves
uninitialized.

lld built by GCC 16 can make Valgrind report reads of Symbol::flags
(via getSymSectionIndex during finalizeSections) as uses of
uninitialized values (ClangBuiltLinux/linux#2162).

This patch reinstates the per-field initialization that commit
778742760534 ("[ELF] Avoid redundant assignment to Symbol fields. NFC")
had replaced with a bulk memset.
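
A minimal sketch of the constructor-based approach, using a hypothetical and much-reduced `MiniSymbol` rather than the real `Symbol` class. The field names are illustrative; the point is that every field gets a defined value from the constructor, so no preceding `memset` is needed.

```cpp
#include <cstdint>
#include <new>

// Hypothetical, much-reduced Symbol. Each field has a default member
// initializer, so construction alone gives every field a defined value.
struct MiniSymbol {
  const char *name = nullptr;
  std::uint32_t nameSize = 0;
  std::uint8_t binding = 0;
  std::uint8_t flags = 0;

  MiniSymbol(const char *n, std::uint32_t len, std::uint8_t bind)
      : name(n), nameSize(len), binding(bind) {}
};

// Construct into deliberately dirty storage and return the flags value a
// later reader would observe. No memset is involved.
std::uint8_t flagsAfterConstruction() {
  alignas(MiniSymbol) unsigned char buf[sizeof(MiniSymbol)];
  for (unsigned char &c : buf)
    c = 0xFF;
  MiniSymbol *s = new (buf) MiniSymbol("foo", 3, 1);
  return s->flags;
}
```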
DeltaFile
+12-8lld/ELF/Symbols.h
+0-2lld/ELF/InputFiles.cpp
+12-102 files

LLVM/project 2e4c820llvm/lib/Transforms/Vectorize VPlanCFG.h VPlanConstruction.cpp

[VPlan] Refine plain CFG iterator name and strengthen assert (NFC). (#198124)

Address post-commit comments for
https://github.com/llvm/llvm-project/pull/197499:
* Add an rpo prefix to the name to indicate the traversal order (similar
  to other vp_depth_first_ helpers).
* Add a comment about skipped VPIRBBs, plus an assert.
DeltaFile
+8-3llvm/lib/Transforms/Vectorize/VPlanCFG.h
+1-1llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+9-42 files

LLVM/project e024375llvm/lib/Transforms/Vectorize VPlanUtils.h VPlanTransforms.cpp

[VPlan] Add blocksAs helper (NFC). (#198122)

Add new blocksAs helper which casts all blocks in the provided range to
the specified type, instead of filtering out non-matching blocks.
Migrate a number of users that expect only VPBasicBlocks.

Pointed out post-commit in
https://github.com/llvm/llvm-project/pull/197499.
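
The difference between filtering and casting a range can be sketched outside of VPlan. The class hierarchy and the `filterBlocks`/`castBlocks` names below are hypothetical, and `dynamic_cast` plus `assert` stands in for LLVM's `dyn_cast`/`cast`; this is not the `blocksAs` implementation.

```cpp
#include <cassert>
#include <vector>

struct Block {
  virtual ~Block() = default;
};
struct BasicBlock : Block {
  int id;
  explicit BasicBlock(int i) : id(i) {}
};
struct RegionBlock : Block {};

// Filtering: silently skip anything that is not a T.
template <typename T>
std::vector<T *> filterBlocks(const std::vector<Block *> &blocks) {
  std::vector<T *> out;
  for (Block *b : blocks)
    if (auto *t = dynamic_cast<T *>(b))
      out.push_back(t);
  return out;
}

// Casting: every block is expected to be a T; a mismatch is a bug and
// trips the assertion instead of being dropped.
template <typename T>
std::vector<T *> castBlocks(const std::vector<Block *> &blocks) {
  std::vector<T *> out;
  for (Block *b : blocks) {
    T *t = dynamic_cast<T *>(b);
    assert(t && "range contains a block of an unexpected type");
    out.push_back(t);
  }
  return out;
}
```

The casting form documents the caller's expectation: a range that unexpectedly contains a region block fails loudly rather than producing a shorter result.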
DeltaFile
+11-0llvm/lib/Transforms/Vectorize/VPlanUtils.h
+3-3llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+2-2llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
+1-1llvm/lib/Transforms/Vectorize/VPlanCFG.h
+17-64 files

LLVM/project 7f0fe47llvm/utils/gn/secondary/llvm/tools/llvm-ir2vec/lib BUILD.gn

[gn build] Port 6ea6d51c0b88 (#198108)
DeltaFile
+4-1llvm/utils/gn/secondary/llvm/tools/llvm-ir2vec/lib/BUILD.gn
+4-11 files

LLVM/project 1f53485llvm/lib/Target/RISCV/MCTargetDesc RISCVELFStreamer.cpp RISCVBaseInfo.h

[RISC-V][RVY] Introduce pure-capability ABI names

Adding this will allow updating #177249 to define the datalayout only
based on the triple and ABI instead of inspecting the feature string
which is a per-function property and not a per-module one.

The RVY ABIs are currently under review at this psABI pull request:
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/499

Pull Request: https://github.com/llvm/llvm-project/pull/194270
DeltaFile
+7-0llvm/lib/Target/RISCV/MCTargetDesc/RISCVELFStreamer.cpp
+7-0llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
+7-0llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.cpp
+21-03 files

LLVM/project 6e8b6efllvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/AArch64 scalable-type-revec.ll

[SLP][REVEC] Fix crash on scalable vector types with -slp-revec

isValidElementType incorrectly called toScalarizedTy for scalable vector
types because isVectorizedTy returns true for all vector types. This let
scalable types pass as valid revectorization elements, causing a fatal
"Cannot implicitly convert a scalable size to a fixed-width size" error
in getNumElements when it called getVectorizedTypeVF(Ty).getFixedValue().

Fixes #198076

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/198123
DeltaFile
+22-0llvm/test/Transforms/SLPVectorizer/AArch64/scalable-type-revec.ll
+1-1llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+23-12 files

LLVM/project fc6996bllvm/lib/Transforms/Vectorize VPlanConstruction.cpp VPlanTransforms.h, llvm/test/Transforms/LoopVectorize/VPlan predicator.ll tail-folding.ll

[VPlan] Split out adding canonical IV recipes to separate transform. (#197541)

Introduce canonical IV recipes after initial scalar
transformations/simplifications. Conceptually it is a separate
transformation, and moving it later simplifies initial construction. The
canonical IV is only needed once we handle early exits/introduce
regions.

This is needed to compute costs of scalar VPlans, where we need to
compare the cost of the original loop control instructions.

PR: https://github.com/llvm/llvm-project/pull/197541
DeltaFile
+13-21llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+14-0llvm/test/Transforms/LoopVectorize/VPlan/predicator.ll
+7-4llvm/unittests/Transforms/Vectorize/VPlanTestBase.h
+6-3llvm/lib/Transforms/Vectorize/VPlanTransforms.h
+5-4llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+8-0llvm/test/Transforms/LoopVectorize/VPlan/tail-folding.ll
+53-324 files not shown
+67-3210 files

LLVM/project f724d70llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 mul-const-addressing-mode.ll

[AArch64] Decompose constant multiplies used only by memory addresses (#194584)

AArch64 currently avoids decomposing a constant multiply when the
multiply has a single ADD/SUB user, preserving the opportunity to form
MADD/MSUB.

That heuristic is too conservative when the ADD/SUB is used only as a
memory address. In that case the ADD/SUB is selected as part of
load/store address mode selection, so preserving the multiply does not
produce MADD/MSUB and prevents the existing constant-multiply
decomposition from exposing ADD/LSL forms usable by AArch64
register-offset addressing.

Relax the bailout for ADD/SUB users that are only used as unindexed
load/store base addresses.

Fixes #161446.
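
The arithmetic behind the decomposition can be sketched in C++. The constant 5 and the function names are illustrative assumptions; the sketch shows only that `base + idx * 5` equals `base + (idx + (idx << 2))`, which is the shape of shift-and-add form the message describes.

```cpp
#include <cstdint>

// Address computed with an explicit multiply by a constant.
std::uint64_t addressWithMul(std::uint64_t base, std::uint64_t idx) {
  return base + idx * 5;
}

// Same address with the multiply decomposed: idx * 5 == idx + (idx << 2).
// Only adds and a left shift remain.
std::uint64_t addressDecomposed(std::uint64_t base, std::uint64_t idx) {
  std::uint64_t scaled = idx + (idx << 2);
  return base + scaled;
}
```

Both functions agree for all inputs because unsigned arithmetic wraps modulo 2^64 in both forms.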
DeltaFile
+137-0llvm/test/CodeGen/AArch64/mul-const-addressing-mode.ll
+37-3llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+174-32 files

LLVM/project 8156fcellvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 reused-extract-scalar-lanes.ll

[SLP] Prefer VF-matching scalar-set match in gather-shuffle lookup

In isGatherShuffledSingleRegisterEntry, the perfect-match search accepted
an entry that isSame(TE->Scalars) regardless of the entry's vector factor.
isSame can succeed via ReuseShuffleIndices on an entry whose actual VF is
smaller than TE->Scalars.size(); the subsequent mask construction then
copies TE->getCommonMask() indices that overrun the chosen source's lanes,
producing wrong shufflevector masks and a more-poisonous result than the
scalar code.

Fixes #197765
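
The overrun described above can be expressed as a simple validity condition on a single-source shuffle mask. The helper below is a hypothetical sketch, not SLPVectorizer code; it assumes the usual convention that a negative mask element marks a poison lane.

```cpp
#include <cstddef>
#include <vector>

// Returns true if every non-poison mask index selects a lane that exists in
// a source vector with sourceVF lanes. Negative indices mark poison lanes.
bool maskFitsSource(const std::vector<int> &mask, std::size_t sourceVF) {
  for (int idx : mask)
    if (idx >= 0 && static_cast<std::size_t>(idx) >= sourceVF)
      return false;
  return true;
}
```

A four-element mask such as {0, 1, 2, 3} is fine for a source with four lanes but overruns a source whose actual VF is two.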

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/198120
DeltaFile
+3-1llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+1-3llvm/test/Transforms/SLPVectorizer/X86/reused-extract-scalar-lanes.ll
+4-42 files

LLVM/project 078f0e7llvm/test/Transforms/SLPVectorizer/X86 reused-extract-scalar-lanes.ll

[SLP][NFC] Add a test with the incorrect shuffle for matched entries, NFC



Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/198119
DeltaFile
+164-0llvm/test/Transforms/SLPVectorizer/X86/reused-extract-scalar-lanes.ll
+164-01 files

LLVM/project b1a8b4fllvm/lib/Transforms/Vectorize VPlanTransforms.cpp

[LV] Update stale comment for partial reduction operands (NFC)

The `neg` form was removed in #187228 (this case now uses the out-of-loop sub, which is preferable, see #189739).
DeltaFile
+0-2llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+0-21 files

LLVM/project b2366c4llvm/utils/FileCheck FileCheck.cpp

[FileCheck][NFC] Introduce MarkerRange for -dump-input (#196800)

`MarkerRange` makes the computation of marker ranges clearer because it
encapsulates handling of several subtle boundary cases:
- It handles adjustments to line numbers when a range boundary appears
  at a line boundary.
- It avoids related mistakes in determining whether the range is
  contained within a single line.
- It avoids the mistake of producing no marker in an input annotation
  for an empty range.

It will be used more in a future patch that extends `-dump-input` to
present search ranges for all errors.

This PR is stacked on PR #196799.
DeltaFile
+85-39llvm/utils/FileCheck/FileCheck.cpp
+85-391 files

LLVM/project ea2eeb2llvm/test/CodeGen/AMDGPU/GlobalISel srem.i64.ll sdiv.i64.ll

[AArch64][GlobalISel] Improve multiplication with multiple registers (#197943)

When working on codegen for `llvm.umul.fix.sat`, I noticed that, among
many other things, GISel also generates worse code for mul when the
data spans multiple registers (for example, when registers are 64 bits
wide but you want to multiply two 128-bit values).

Here is the example ll:
```
define i128 @i128(i128 %a, i128 %b) {
entry:
  %s = mul i128 %a, %b
  ret i128 %s
}
```
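
For reference, the arithmetic that a truncating 128-bit multiply performs on 64-bit halves can be sketched in C++. `U128`, `mulHigh64`, and `mul128` are hypothetical names for this illustration; the sketch says nothing about which instruction sequence either selector emits.

```cpp
#include <cstdint>

struct U128 {
  std::uint64_t lo, hi;
};

// High 64 bits of the full 64x64 -> 128 unsigned product, computed from
// 32-bit pieces so that no intermediate value overflows.
std::uint64_t mulHigh64(std::uint64_t a, std::uint64_t b) {
  std::uint64_t aLo = a & 0xffffffffu, aHi = a >> 32;
  std::uint64_t bLo = b & 0xffffffffu, bHi = b >> 32;
  std::uint64_t ll = aLo * bLo;
  std::uint64_t lh = aLo * bHi;
  std::uint64_t hl = aHi * bLo;
  std::uint64_t hh = aHi * bHi;
  std::uint64_t mid = (ll >> 32) + (lh & 0xffffffffu) + (hl & 0xffffffffu);
  return hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
}

// Truncating 128-bit multiply: one low product, one high product, and two
// cross products that only contribute to the upper half.
U128 mul128(U128 a, U128 b) {
  U128 r;
  r.lo = a.lo * b.lo;
  r.hi = mulHigh64(a.lo, b.lo) + a.lo * b.hi + a.hi * b.lo;
  return r;
}
```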

This is what GISel gave:
```
  mul   x9, x0, x3

    [19 lines not shown]
DeltaFile
+945-949llvm/test/CodeGen/AMDGPU/GlobalISel/srem.i64.ll
+942-946llvm/test/CodeGen/AMDGPU/GlobalISel/sdiv.i64.ll
+411-384llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-srem.mir
+411-384llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sdiv.mir
+384-384llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-udiv.mir
+384-384llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-urem.mir
+3,477-3,43128 files not shown
+5,654-5,65934 files

LLVM/project fa4d737llvm/lib/Target/X86 X86ISelLowering.cpp

[X86] LowerVECREDUCE - add AllowScalarization operand (#198109)

Pull scalarization control out of the LowerVECREDUCE call to allow
different decisions based on the VECREDUCE opcode in future patches.
DeltaFile
+5-5llvm/lib/Target/X86/X86ISelLowering.cpp
+5-51 files

LLVM/project ce6661dllvm/utils/FileCheck FileCheck.cpp

Simplify comments and a line of code
DeltaFile
+20-44llvm/utils/FileCheck/FileCheck.cpp
+20-441 files

LLVM/project 0d7bccfllvm/lib/CodeGen AtomicExpandPass.cpp, llvm/test/CodeGen/ARM atomic-load-store.ll

[AtomicExpand] Add bitcasts when expanding store atomic vector

AtomicExpand fails for aligned `store atomic <n x T>` because it
does not find a compatible library call. This change adds appropriate
ptrtoint + bitcast so that the call can be lowered, mirroring the
load-side handling from #148900.
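
A source-level analogue of the idea, assuming a two-float vector that fits in 64 bits: the vector's bits are reinterpreted as an integer of the same width so that an ordinary integer atomic operation can carry them. `Float2` and the two helpers are hypothetical; this is not what AtomicExpandPass emits.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

struct Float2 {
  float x, y;
};
static_assert(sizeof(Float2) == sizeof(std::uint64_t),
              "sketch assumes the vector is exactly 64 bits wide");

// Store: copy the vector's bits into a same-sized integer, then do a single
// integer atomic store.
void atomicStoreFloat2(std::atomic<std::uint64_t> &slot, Float2 v) {
  std::uint64_t bits;
  std::memcpy(&bits, &v, sizeof bits);
  slot.store(bits, std::memory_order_seq_cst);
}

// Load: the mirror image, one integer atomic load followed by a bit copy.
Float2 atomicLoadFloat2(const std::atomic<std::uint64_t> &slot) {
  std::uint64_t bits = slot.load(std::memory_order_seq_cst);
  Float2 v;
  std::memcpy(&v, &bits, sizeof v);
  return v;
}
```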
DeltaFile
+99-6llvm/test/CodeGen/X86/atomic-load-store.ll
+98-0llvm/test/Transforms/AtomicExpand/X86/expand-atomic-non-integer.ll
+49-0llvm/test/CodeGen/ARM/atomic-load-store.ll
+4-2llvm/lib/CodeGen/AtomicExpandPass.cpp
+250-84 files

LLVM/project 222484bclang/include/clang/Basic CodeGenOptions.def, clang/include/clang/Options Options.td

Remove default setting signaling_nan attribute for strictfp functions

We cannot describe this behavior in the Clang User Manual because
strictfp is not visible to the user.
DeltaFile
+8-9clang/include/clang/Options/Options.td
+4-8clang/lib/Driver/ToolChains/Clang.cpp
+5-5clang/test/CodeGen/fp-floatcontrol-stack.cpp
+4-4clang/test/Driver/fp-model.c
+1-6clang/lib/CodeGen/CodeGenFunction.cpp
+1-4clang/include/clang/Basic/CodeGenOptions.def
+23-366 files not shown
+30-4412 files

LLVM/project d90baa0lldb/source/Commands CommandObjectBreakpoint.cpp CommandObjectTarget.cpp

[lldb] Make CommandObject::GetTarget filter out the dummy target (#198026)

Follow-up to #197805. Make CommandObject::GetTarget the canonical target
accessor for command code, and tighten its semantics so that DoExecute
methods can't accidentally operate on the dummy target.

GetTarget now returns Target* instead of Target&. The result is the
target from the command's frozen execution context, falling back to the
interpreter's execution context. The dummy target is filtered out and
replaced with nullptr unless the command opts in via one of the
eCommandRequires{Target,Process,Thread,Frame} flags (in which case
CheckRequirements has already guaranteed a real target) or via the new
eCommandAllowsDummyTarget flag.

This is the first half of the cleanup discussed at the end of #197805. A
follow-up will audit DoExecute methods that still reach for
GetSelectedTarget or m_exe_ctx.GetTargetPtr() directly and migrate them
to GetTarget.
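
The filtering rule can be sketched as a small free function. The flag names come from the message above, but their numeric values, the `Target` struct, and `getTargetFiltered` are assumptions made for this illustration and do not reflect the lldb declarations.

```cpp
#include <cstdint>

// Hypothetical stand-in for lldb's Target.
struct Target {
  bool isDummy = false;
};

// Flag names follow the commit message; the values are made up.
enum CommandFlags : std::uint32_t {
  eCommandRequiresTarget = 1u << 0,
  eCommandAllowsDummyTarget = 1u << 1,
};

// Returns the target a command may operate on, or nullptr. A dummy target
// is passed through only when the command opted in. In the requires-target
// case, a real target is assumed to have been guaranteed already.
Target *getTargetFiltered(Target *candidate, std::uint32_t flags) {
  if (!candidate)
    return nullptr;
  if (!candidate->isDummy)
    return candidate;
  if (flags & (eCommandRequiresTarget | eCommandAllowsDummyTarget))
    return candidate;
  return nullptr;
}
```

Returning a pointer rather than a reference is what lets callers observe the "no usable target" case and report an error.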
DeltaFile
+161-149lldb/source/Commands/CommandObjectBreakpoint.cpp
+143-114lldb/source/Commands/CommandObjectTarget.cpp
+55-48lldb/source/Commands/CommandObjectWatchpoint.cpp
+49-46lldb/source/Commands/CommandObjectSource.cpp
+20-23lldb/source/Commands/CommandObjectProcess.cpp
+24-18lldb/source/Commands/CommandObjectFrame.cpp
+452-39814 files not shown
+578-49120 files

LLVM/project b110a11llvm/include/llvm/Target TargetSelectionDAG.td, llvm/lib/Target/X86 X86InstrFragmentsSIMD.td X86InstrAVX512.td

[X86] Cast atomic vectors in IR to support floats

Extend the X86 `alignedstore` PatFrag to also match `atomic_store`
with vector-size alignment, so existing MOVAPS/MOVAPD/MOVDQA-family
aligned-store patterns cover 128-bit aligned vector atomic stores on
SSE/AVX/AVX-512 without per-type duplicates. `<4 x float>`,
`<2 x double>`, `<2 x i64>`, `<4 x i32>`, `<8 x half>`, `<8 x bfloat>`
all codegen to a single `movaps`/`movapd` on AVX+ via this.

Adds v8f16/v8bf16 bitconvert variants to the widen-path
`atomic_store_32` / `atomic_store_64` patterns so `<2 x half>`,
`<2 x bfloat>`, `<4 x half>`, `<4 x bfloat>` atomic stores reaching
the PR4 widen path also collapse to a single instruction on AVX+
targets.

Vectors whose `getTypeAction` is split rather than widen still rely
on PR6's `SplitVecOp_ATOMIC_STORE`; that path bitcasts the vector
to a scalar integer and issues an integer `atomic_store_N`, picked
up by the pre-existing scalar atomic-store patterns. The two

    [4 lines not shown]
DeltaFile
+86-0llvm/test/CodeGen/X86/atomic-load-store.ll
+5-4llvm/lib/Target/X86/X86InstrFragmentsSIMD.td
+1-1llvm/include/llvm/Target/TargetSelectionDAG.td
+1-1llvm/lib/Target/X86/X86InstrAVX512.td
+93-64 files

LLVM/project e4c9611llvm/lib/CodeGen/SelectionDAG LegalizeVectorTypes.cpp LegalizeTypes.h, llvm/test/CodeGen/X86 atomic-load-store.ll

[SelectionDAG] Split vector types for atomic store

Vector types that aren't widened are split so that a single ATOMIC_STORE
is issued for the entire vector at once. This enables SelectionDAG to
translate vectors with bfloat and half element types.
DeltaFile
+440-0llvm/test/CodeGen/X86/atomic-load-store.ll
+20-0llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+1-0llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+461-03 files

LLVM/project 95ab456llvm/test/Transforms/JumpTableToSwitch profile-no-guid-metadata.ll

[JTS] Drop test for multiple zero values in VP metadata

This will soon become a verifier failure. Drop the test so that we can
actually enforce this in the verifier without causing test failures.

Reviewers: mtrofin

Pull Request: https://github.com/llvm/llvm-project/pull/197617
DeltaFile
+4-38llvm/test/Transforms/JumpTableToSwitch/profile-no-guid-metadata.ll
+4-381 files

LLVM/project f587a58llvm/lib/Transforms/Instrumentation PGOMemOPSizeOpt.cpp

[PGO] Remove pgo-memop-opt VP metadata verification

This is no longer necessary now that we are explicitly deduplicating
values at construction time. This will also soon be enforced in the
verifier.

https://reviews.llvm.org/D92074 and https://reviews.llvm.org/D136211
have more context on the introduction of this check/its evolution.

Reviewers: mtrofin

Pull Request: https://github.com/llvm/llvm-project/pull/197616
DeltaFile
+0-7llvm/lib/Transforms/Instrumentation/PGOMemOPSizeOpt.cpp
+0-71 files

LLVM/project 1d14696llvm/lib/ProfileData InstrProf.cpp, llvm/test/Transforms/PGOProfile consecutive-zeros-metadata.ll

[InstrProf] Deduplicate VP values

Zero VP values can come up in some places. They are intentional around
external symbols for indirect call sites, and it seems like they might
be unintentional around memop VP metadata
(https://reviews.llvm.org/D92074). This patch combines them so that we
can enforce the invariant that there are no duplicate values in VP
metadata, which allows passes to make some simplifying assumptions. We
also deduplicate non-zero values, because there is error handling for
them and still some undebugged cases where they show up
(https://reviews.llvm.org/D136211).
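
A minimal sketch of deduplicating value-profile entries. It assumes that "combining" duplicates means summing their counts and keeping first-seen order; `ValueCount` and `dedupValues` are hypothetical names, and the real InstrProf.cpp change may differ in both respects.

```cpp
#include <cstdint>
#include <vector>

struct ValueCount {
  std::uint64_t value;
  std::uint64_t count;
};

// Merge entries that share a value by summing their counts (an assumption
// of this sketch), preserving the order in which values first appear.
std::vector<ValueCount> dedupValues(const std::vector<ValueCount> &in) {
  std::vector<ValueCount> out;
  for (const ValueCount &vc : in) {
    bool merged = false;
    for (ValueCount &seen : out) {
      if (seen.value == vc.value) {
        seen.count += vc.count;
        merged = true;
        break;
      }
    }
    if (!merged)
      out.push_back(vc);
  }
  return out;
}
```

After this step, a consumer can assume each value appears at most once, which is the invariant the message wants to enforce.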

This ended up being a bit messier than I would like due to the need to
handle non-zero duplicate values and preserve existing error handling
behavior in llvm-profdata. I've left comments explaining this so we can
hopefully clean this up when llvm-profdata eventually gets fixed. The
error has shown up in some places
(https://issues.chromium.org/issues/353702041), so does still exist, but
I still have not been able to find profraw files to be able to fix the

    [6 lines not shown]
DeltaFile
+30-4llvm/lib/ProfileData/InstrProf.cpp
+25-0llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros-metadata.proftext
+21-0llvm/test/Transforms/PGOProfile/consecutive-zeros-metadata.ll
+76-43 files

LLVM/project 77f0918clang-tools-extra/test/clang-tidy/checkers/modernize macro-to-enum-headers.cpp, clang-tools-extra/test/clang-tidy/checkers/modernize/Inputs/macro-to-enum modernize-macro-to-enum3.h

[clang-tidy][NFC] Fix modernize-macro-to-enum testcases (#198093)

Previously, these header files were not tested; the newly added test
case fixes the problem.

AI usage: Codex was used to suggest the new tests.
Closes https://github.com/llvm/llvm-project/issues/173530
DeltaFile
+15-0clang-tools-extra/test/clang-tidy/checkers/modernize/macro-to-enum-headers.cpp
+8-0clang-tools-extra/test/clang-tidy/checkers/modernize/Inputs/macro-to-enum/modernize-macro-to-enum3.h
+23-02 files

LLVM/project eeff32fllvm/test/CodeGen/X86 vector-shift-ashr-sub128.ll vector-shift-ashr-256.ll

[X86] Add vXi8 sra-by-one tests (#198096)

Test coverage for #198061
DeltaFile
+213-0llvm/test/CodeGen/X86/vector-shift-ashr-sub128.ll
+97-0llvm/test/CodeGen/X86/vector-shift-ashr-256.ll
+71-0llvm/test/CodeGen/X86/vector-shift-ashr-128.ll
+28-0llvm/test/CodeGen/X86/vector-shift-ashr-512.ll
+409-04 files

LLVM/project 77cf17ellvm/test/CodeGen/X86 vector-fshr-rot-256.ll vector-fshr-rot-128.ll

[X86] Add vXi8 rot-by-one tests (#198095)

Test coverage for #198059 and #198060
DeltaFile
+89-0llvm/test/CodeGen/X86/vector-fshr-rot-256.ll
+86-0llvm/test/CodeGen/X86/vector-fshr-rot-128.ll
+86-0llvm/test/CodeGen/X86/vector-fshl-rot-256.ll
+80-0llvm/test/CodeGen/X86/vector-fshl-rot-128.ll
+56-0llvm/test/CodeGen/X86/vector-fshl-rot-512.ll
+56-0llvm/test/CodeGen/X86/vector-fshr-rot-512.ll
+453-06 files

LLVM/project 1fb6c22mlir/include/mlir/IR Operation.h, mlir/lib/IR Operation.cpp

[mlir] Cleanup Operation.cpp (NFC) (#197712)

This PR cleans up Operation.cpp based on clangd suggestions. It
removes unused headers, fixes incorrect comments, and improves
performance by applying std::move where appropriate.
DeltaFile
+2-3mlir/lib/IR/Operation.cpp
+0-1mlir/include/mlir/IR/Operation.h
+2-42 files