LLVM/project 75b3226clang/include/clang/ScalableStaticAnalysisFramework/Core/EntityLinker LUSummary.h, clang/unittests/ScalableStaticAnalysisFramework LUSummaryTest.cpp CMakeLists.txt

[clang][ssaf] Add accessor for `LUNamespace` (#195756)
DeltaFile
+25-0clang/unittests/ScalableStaticAnalysisFramework/LUSummaryTest.cpp
+2-0clang/include/clang/ScalableStaticAnalysisFramework/Core/EntityLinker/LUSummary.h
+1-0clang/unittests/ScalableStaticAnalysisFramework/CMakeLists.txt
+28-03 files

LLVM/project 7ff811allvm/lib/DWARFLinker Utils.cpp, llvm/lib/DWARFLinker/Classic DWARFLinker.cpp

[DWARFLinker] Patch DW_AT_LLVM_stmt_sequence in the parallel linker (#195388)

Mirror dsymutil's stmt-sequence rewriting in the parallel linker so each
attribute ends up pointing at the DW_LNE_set_address that opens its
containing output sequence, with the correct offset in the combined
.debug_line.

At DIE cloning time we resolve each attribute's input offset to the
address of its first row and record the pair (DIEValue, address) on the
CompileUnit, alongside a DebugOffsetPatch on the .debug_info section so
combination adds the CU's .debug_line start offset. The line-table
emitter then fills a map from row address to the byte offset of the
sequence-opening DW_LNE_set_address.

After emission, each recorded attribute is rewritten by relocating its
input address through the CU's function ranges and looking the result up
in the map. When resolution fails the DWARF max-offset sentinel is
written instead, and the patch applier preserves it unchanged.

First-row lookups share a lazy per-CU cache to keep resolution O(1) per
attribute.
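
The final rewrite step described above can be pictured with a small standalone sketch. This is not the linker's code: `FunctionRange`, `relocate`, `resolveStmtSequence`, and `kMaxOffsetSentinel` are hypothetical names, the relocation is reduced to a fixed delta over a single range, and the sentinel value assumes the DWARF32 maximum offset.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>

// Assumption: the "max-offset sentinel" is the DWARF32 maximum offset.
constexpr uint64_t kMaxOffsetSentinel = 0xFFFFFFFFu;

// Hypothetical map filled by the line-table emitter: output row address ->
// byte offset of the DW_LNE_set_address that opens the containing sequence.
using SeqOffsetMap = std::map<uint64_t, uint64_t>;

// Hypothetical function range: input addresses in [Low, High) move by Delta.
struct FunctionRange {
  uint64_t Low;
  uint64_t High;
  int64_t Delta;
};

// Relocate an input address through the range, or fail if it is outside it.
std::optional<uint64_t> relocate(const FunctionRange &R, uint64_t InputAddr) {
  if (InputAddr < R.Low || InputAddr >= R.High)
    return std::nullopt;
  return InputAddr + static_cast<uint64_t>(R.Delta);
}

// Resolve one recorded attribute: relocate, then look the result up in the
// map. Any failure yields the sentinel instead of a real offset.
uint64_t resolveStmtSequence(const FunctionRange &R, const SeqOffsetMap &Map,
                             uint64_t InputAddr) {
  std::optional<uint64_t> Out = relocate(R, InputAddr);
  if (!Out)
    return kMaxOffsetSentinel;
  auto It = Map.find(*Out);
  return It == Map.end() ? kMaxOffsetSentinel : It->second;
}
```

In this sketch, both an address outside the function range and a relocated address with no map entry produce the sentinel, matching the "when resolution fails" case in the description.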
DeltaFile
+169-69llvm/lib/DWARFLinker/Parallel/DWARFLinkerCompileUnit.cpp
+13-106llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp
+118-0llvm/unittests/DWARFLinkerParallel/DWARFLinkerTest.cpp
+100-0llvm/lib/DWARFLinker/Utils.cpp
+64-2llvm/lib/DWARFLinker/Parallel/DWARFLinkerCompileUnit.h
+33-2llvm/lib/DWARFLinker/Parallel/DIEAttributeCloner.cpp
+497-1796 files not shown
+590-20012 files

LLVM/project 8ec6a2dllvm/lib/Transforms/IPO Inliner.cpp, llvm/test/Transforms/Inline inline_store_to_load.ll

Reland [Inliner] Use store-to-load forwarding to resolve call arguments (#195526)

Adds store-to-load forwarding after the inliner has successfully inlined
a call. This can simplify subsequent inlining attempts and give them a
more precise cost analysis.

This makes it possible to optimize away empty `std::set` and `std::map`
in both `libc++` and `libstdc++`, as well as many other real-world cases.

Reland of #190607. It was reverted because it was causing crashes in
#195135. These were crashes in `FindAvailableLoadedValue` on mixed
address space pointers and should be fixed by #195256.
DeltaFile
+212-0llvm/test/Transforms/Inline/inline_store_to_load.ll
+2-36llvm/test/Transforms/PhaseOrdering/inline-store-to-load.ll
+37-0llvm/lib/Transforms/IPO/Inliner.cpp
+251-363 files

LLVM/project 86b346dclang/lib/Sema SemaExpr.cpp, clang/test/CodeGenHLSL/builtins fma.hlsl

[HLSL] For builtins aliases, apply implicit conversions before running custom type checking (#195365)

Fixes https://github.com/llvm/llvm-project/issues/195329 by making HLSL
builtin aliases apply implicit conversions before running custom type
checking.

After this PR:
- There are no more size 1 vectors being passed and returned to/from
aliased Clang builtins because they get truncated to scalars due to the
HLSL alias builtin not having explicit size 1 vector overloads.
- HLSL alias builtins no longer accept matrices unless they have
explicit matrix overloads. Matrices get implicitly truncated to scalars
and resolve to the scalar Clang builtin being aliased.
- Many calls with mismatched vector sizes no longer error with
`arguments are of different types` and instead follow Clang's overload
resolution rules with respect to HLSL's implicit conversion sequences.
(e.g., `dot(float3, float2)` -> `dot(float2, float2)` with warning)
- Calls with implicitly-convertible types no longer error. They are now
implicitly converted, and with a warning in some cases. (e.g.,

    [3 lines not shown]
DeltaFile
+48-20clang/test/SemaHLSL/BuiltIns/fma-errors.hlsl
+22-11clang/test/SemaHLSL/BuiltIns/f16tof32-errors.hlsl
+19-11clang/test/SemaHLSL/BuiltIns/f32tof16-errors.hlsl
+18-1clang/lib/Sema/SemaExpr.cpp
+9-4clang/test/CodeGenHLSL/builtins/fma.hlsl
+6-2clang/test/SemaHLSL/BuiltIns/lerp-errors.hlsl
+122-493 files not shown
+132-539 files

LLVM/project 48d25e3llvm/docs AMDGPUUsage.rst, llvm/docs/AMDGPU DeveloperGuideline.rst

[NFC][AMDGPU][Doc] Add developer guideline

This guideline covers topics on top of the existing LLVM guidelines.
DeltaFile
+431-0llvm/docs/AMDGPU/DeveloperGuideline.rst
+1-0llvm/docs/AMDGPUUsage.rst
+432-02 files

LLVM/project 0e0c9a5llvm/lib/Transforms/Vectorize VPlanTransforms.cpp VPlanRecipes.cpp, llvm/test/Transforms/LoopVectorize make-scalarization-decisions.ll extract-value-widen.ll

[VPlan] Scalarize to first-lane-only directly on VPlan (#184267)

This is needed to enable subsequent
https://github.com/llvm/llvm-project/pull/182595.

I don't think we can fully port all scalarization logic from the legacy
path to a VPlan-based one right now, because at that point in the
pipeline interleave groups aren't yet lowered into any VPlan-based
representation, so this pass operates on incomplete information.
Currently, the pass can transform when "all uses are scalar" (that won't
change later) but not when "uses are a mix of vector and scalar" (that
might change after lowering interleave groups).

As such, I decided to implement something much simpler that is enough
for #182595. However, we perform this transformation before delegating
to the old CM-based decision, so it **is** effective immediately and
takes precedence even for consecutive loads/stores.


    [2 lines not shown]
DeltaFile
+46-0llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+43-0llvm/test/Transforms/LoopVectorize/make-scalarization-decisions.ll
+22-0llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+5-5llvm/test/Transforms/LoopVectorize/AArch64/binop-costs.ll
+2-5llvm/test/Transforms/LoopVectorize/extract-value-widen.ll
+4-2llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+122-126 files not shown
+136-1612 files

LLVM/project a6f40eellvm/lib/DWARFLinker/Parallel DependencyTracker.cpp, llvm/test/tools/dsymutil asm-line-tables.test

[DWARFLinker] Add assembly-label range handling to parallel linker (#195366)

Assembly CUs typically have DW_TAG_label entries instead of subprograms,
so the parallel linker's line-table filter saw no function ranges and
dropped every row. Mirror the classic linker: for labels in
Mips_Assembler or Assembly CUs, look up an assembly range via
getAssemblyRangeForAddress and call addFunctionRange before falling back
to addLabelLowPc.
DeltaFile
+28-15llvm/lib/DWARFLinker/Parallel/DependencyTracker.cpp
+10-2llvm/test/tools/dsymutil/asm-line-tables.test
+38-172 files

LLVM/project 42c3267flang/lib/Optimizer/OpenMP MapInfoFinalization.cpp, flang/test/Lower/OpenMP derived-type-allocatable-map.f90 allocatable-dtype-intermediate-map-gen.f90

[Flang][OpenMP] Fix assert trigger in MapInfoFinalization pass for implicit record member maps (#193851)

The current implementation of the implicit record member mapping segment
of the MapInfoFinalization pass assumes that child maps of parents are
already bound to the target's block arguments, but that is not the case
upon initial lowering from PFT to MLIR. The binding currently happens at
the end of the MapInfoFinalization pass, where we "canonicalize" that
all maps are inserted as block arguments to their respective targets.

This assumption unfortunately leads to a few cases where we trigger the
assertion. To address this, we can impose this canonicalization of map
<-> block arguments as soon as we enter the pass, and then once again at
the end of the pass for any new members generated by the
MapInfoFinalization pass. This allows the implicit record member mapping
process to continue unhindered whilst changing very little elsewhere
other than the ordering of block arguments (hence some lit test
tweaks). The main downside is the extra processing required for running
the "canonicalization" twice.

    [4 lines not shown]
DeltaFile
+52-3flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
+42-0flang/test/Transforms/omp-map-info-finalization-member-record.fir
+33-0offload/test/offloading/fortran/target-map-nested-dtype-allocatable-member.f90
+3-3flang/test/Transforms/omp-map-info-finalization.fir
+2-2flang/test/Lower/OpenMP/derived-type-allocatable-map.f90
+1-1flang/test/Lower/OpenMP/allocatable-dtype-intermediate-map-gen.f90
+133-96 files

LLVM/project 7c30772llvm/test/CodeGen/AArch64 st1-lane.ll arm64-neon-2velem.ll

[AArch64][GlobalISel] Lower unmerge to extract_subvector (#195046)

This follows and reuses the existing lowering for unmerge -> extract
vector element, extending it to also lower unmerge -> subvector extract
for half-sized vector extracts. This allows certain tablegen patterns to
match.

An extra extract_subvector(dup) combine is needed to optimize away
unnecessary instructions. The ext vs mov/dup brings us in line with
SDAG, but we may change both to use mov/dup.
DeltaFile
+42-90llvm/test/CodeGen/AArch64/st1-lane.ll
+65-65llvm/test/CodeGen/AArch64/arm64-neon-2velem.ll
+52-54llvm/test/CodeGen/AArch64/neon-extadd.ll
+36-36llvm/test/CodeGen/AArch64/arm64-neon-2velem-high.ll
+25-46llvm/test/CodeGen/AArch64/highextractbitcast.ll
+20-42llvm/test/CodeGen/AArch64/arm64-extract_subvector.ll
+240-33336 files not shown
+548-66442 files

LLVM/project e4e4198compiler-rt/lib/asan asan_linux.cpp asan_malloc_linux.cpp

[NFCI] clarify that asan-*linux.cpp files affect *nix OS'es (#195565)

**Prior Work:** Aims to supersede (#132263), which seems inactive,
specifically by applying my own comment:
https://github.com/llvm/llvm-project/pull/132263#issuecomment-3051238734

**Context:** It aims to minimally document that the
`asan_(malloc_)?linux.cpp` files may affect non-Linux OSes (despite the
name), such as Solaris, BSD, and other *nix OSes. This is worth
documenting, as otherwise we risk breakage due to confusion, as occurred
[here](https://github.com/llvm/llvm-project/pull/131975#issuecomment-2741097471).

This is done by minimally augmenting the file header comment to say
precisely this.
Unlike the prior PR, this does not rename any files, which should reduce
the 'git noise' impact of this change.

_Thanks!_
DeltaFile
+1-1compiler-rt/lib/asan/asan_linux.cpp
+1-1compiler-rt/lib/asan/asan_malloc_linux.cpp
+2-22 files

LLVM/project ab7cec9llvm/lib/Transforms/InstCombine InstCombineCalls.cpp, llvm/test/Transforms/InstCombine assume.ll

[InstCombine] Remove redundant assume fold (#195852)

The fold is fully redundant with the fold using `computeKnownBits`, so
we can let that do the work instead.
DeltaFile
+1-6llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+0-1llvm/test/Transforms/InstCombine/assume.ll
+1-72 files

LLVM/project 6c797ebllvm/lib/Target/AArch64 AArch64SchedC1Premium.td, llvm/test/tools/llvm-mca/AArch64/Cortex C1Premium-sve-instructions.s C1Premium-writeback.s

Merge branch 'main' into users/eas/vplan-based-first-lane-only-scalarize
DeltaFile
+6,873-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-sve-instructions.s
+3,979-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-writeback.s
+3,163-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-neon-instructions.s
+2,565-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-forwarding.s
+2,523-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-basic-instructions.s
+2,348-0llvm/lib/Target/AArch64/AArch64SchedC1Premium.td
+21,451-0771 files not shown
+40,147-7,253777 files

LLVM/project 176dff4lldb/source/Host/common File.cpp

[lldb][windows] fix cross DLL file descriptor lookup crash (#195855)

On Windows, file descriptors are only valid in the same DLL: they are
really just handles mapped to an index in a table in the CRT. Calling a
liblldb method with a file descriptor from lldb-dap will cause the
program to crash. See
https://github.com/llvm/llvm-project/issues/193971.

This patch fixes the issue by refactoring the `NativeFile` constructors
so that they no longer try to convert `FILE` types to handles through
the CRT lookup table.
DeltaFile
+22-5lldb/source/Host/common/File.cpp
+22-51 files

LLVM/project 6f5570cllvm/test/CodeGen/AMDGPU llvm.amdgcn.rsq.ll llvm.amdgcn.rsq.clamp.ll

[AMDGPU][NFC] Autogenerate check lines for llvm.amdgcn.rsq.clamp.ll and llvm.amdgcn.rsq.ll (#195867)
DeltaFile
+128-16llvm/test/CodeGen/AMDGPU/llvm.amdgcn.rsq.ll
+60-23llvm/test/CodeGen/AMDGPU/llvm.amdgcn.rsq.clamp.ll
+188-392 files

LLVM/project ea6b7f8llvm/lib/Transforms/InstCombine InstCombineCalls.cpp

[InstCombine][NFC] Use CreateAssumption instead of CreateCall (#195862)
DeltaFile
+4-9llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+4-91 files

LLVM/project f23c407llvm/lib/CodeGen PeepholeOptimizer.cpp, llvm/test/CodeGen/AMDGPU peephole-fold-imm.mir

PeepholeOpt: Clear kill flags in foldImmediate (#195680)

When foldImmediate replaces a COPY destination with its source,
this extends the live range of the source, but it does not update the
kill flags.

Clear kill flags on the source register after replacement.

This was found while working on REG_SEQUENCE optimizations motivated by
AMDGPU demands. Both an AMDGPU and an X86 test case are added to show that
the issue is not AMDGPU specific.
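
A toy model of the problem, not the PeepholeOptimizer code, might look like the following. `Operand` and `replaceRegAndClearKills` are hypothetical names, and the instruction stream is reduced to a flat list of register uses. Clearing every kill flag on the source register is the conservative choice modelled here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical register use: which register is read, and whether this use
// is marked as the last one (the "kill").
struct Operand {
  unsigned Reg;
  bool IsKill;
};

// Rewrite every use of Dst to Src. Src is now live up to the last former
// use of Dst, so any kill flag that was set on an earlier use of Src may be
// stale. Conservatively drop all kill flags on Src.
void replaceRegAndClearKills(std::vector<Operand> &Uses, unsigned Dst,
                             unsigned Src) {
  for (Operand &U : Uses)
    if (U.Reg == Dst)
      U.Reg = Src;
  for (Operand &U : Uses)
    if (U.Reg == Src)
      U.IsKill = false;
}
```

Without the second loop, the first use in a sequence such as `{Src, kill}, {Dst, kill}` would still claim to kill `Src` even though the rewritten second use reads it afterwards.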
DeltaFile
+27-0llvm/test/CodeGen/X86/peephole.mir
+26-0llvm/test/CodeGen/AMDGPU/peephole-fold-imm.mir
+1-0llvm/lib/CodeGen/PeepholeOptimizer.cpp
+54-03 files

LLVM/project 6af0cbfmlir/lib/Dialect/Vector/Transforms LowerVectorGather.cpp, mlir/test/Dialect/Vector vector-gather-lowering.mlir

[mlir][vector] Account for subview offset in gather lowering. (#195359)

Strided vector.gather on a column subview was reading the wrong column
because the rewrite to a collapsed gather dropped the subview's static
offset.
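
The effect of dropping the offset can be seen in a plain C++ analogy. This is not the MLIR lowering itself: `linearIndex` and `gatherColumn` are hypothetical helpers over a flat row-major buffer, where a column view has a static offset equal to the column number and a stride equal to the row length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linear index of element i of a 1-D strided view into a flat buffer.
std::size_t linearIndex(std::size_t Offset, std::size_t Stride, std::size_t I) {
  return Offset + I * Stride;
}

// Gather Count elements of the strided view starting at Offset.
std::vector<int> gatherColumn(const std::vector<int> &Flat, std::size_t Offset,
                              std::size_t Stride, std::size_t Count) {
  std::vector<int> Out;
  Out.reserve(Count);
  for (std::size_t I = 0; I < Count; ++I)
    Out.push_back(Flat[linearIndex(Offset, Stride, I)]);
  return Out;
}
```

For a 3x3 row-major buffer, column 1 is the view with offset 1 and stride 3. Calling `gatherColumn` with offset 0 instead silently returns column 0, which is the kind of wrong-column read the commit describes.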

---------

Signed-off-by: hanhanW <hanhan0912 at gmail.com>
DeltaFile
+49-14mlir/lib/Dialect/Vector/Transforms/LowerVectorGather.cpp
+61-0mlir/test/Dialect/Vector/vector-gather-lowering.mlir
+110-142 files

LLVM/project a8a3b96llvm/lib/Target/AArch64 AArch64InstrFormats.td AArch64RegisterInfo.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

fixup! More optimisations spotted by Marian after the specs changed
DeltaFile
+40-42llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
+24-43llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+16-30llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp
+5-6llvm/lib/Target/AArch64/AArch64InstrFormats.td
+4-5llvm/lib/Target/AArch64/AArch64RegisterInfo.td
+1-3llvm/lib/Target/AArch64/AArch64InstrInfo.td
+90-1291 files not shown
+91-1307 files

LLVM/project bb3de90llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp, llvm/test/MC/AArch64 armv9a-tlbip.s

fixup! Address Carol's PR comments
DeltaFile
+5-0llvm/test/MC/AArch64/armv9a-tlbip.s
+3-0llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+8-02 files

LLVM/project 75e74bbllvm/lib/Target/AArch64 AArch64InstrFormats.td

fixup! Remove superfluous code
DeltaFile
+0-7llvm/lib/Target/AArch64/AArch64InstrFormats.td
+0-71 files

LLVM/project 8b02faflldb/test/Shell/Commands command-disassemble-aarch64-extensions.s command-disassemble-aarch64-color.s, llvm/lib/Target/AArch64 AArch64InstrFormats.td

fixup! Update diff because SYSP definition has changed
DeltaFile
+126-114llvm/test/MC/AArch64/armv9a-sysp.s
+19-21llvm/test/MC/AArch64/armv9-sysp-diagnostics.s
+7-32llvm/lib/Target/AArch64/AArch64InstrFormats.td
+2-11llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+2-2lldb/test/Shell/Commands/command-disassemble-aarch64-extensions.s
+2-2lldb/test/Shell/Commands/command-disassemble-aarch64-color.s
+158-1822 files not shown
+159-1878 files

LLVM/project 935b473llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp, llvm/test/MC/AArch64 armv9-sysp-diagnostics.s

fixup! Improve error parsing
DeltaFile
+46-25llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+12-12llvm/test/MC/AArch64/armv9-sysp-diagnostics.s
+58-372 files

LLVM/project 48ee08ellvm/lib/Target/AArch64 AArch64InstrFormats.td

fixup! Fixes after rebasing following Marian's change
DeltaFile
+3-3llvm/lib/Target/AArch64/AArch64InstrFormats.td
+3-31 files

LLVM/project 389ae90llvm/lib/Target/AArch64/MCTargetDesc AArch64InstPrinter.cpp

fixup! Address PR comment about shortened `sysp` with xzr/xzr
DeltaFile
+17-16llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
+17-161 files

LLVM/project 1c0f241llvm/lib/Target/AArch64 AArch64RegisterInfo.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

fixup! Implement Marian's suggestion to implement as XSeqPairsClass + [XZR, XZR]
DeltaFile
+54-82llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+35-73llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
+12-9llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp
+8-1llvm/lib/Target/AArch64/AArch64RegisterInfo.td
+0-7llvm/test/MC/AArch64/armv9a-sysp.s
+1-3llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.h
+110-1756 files

LLVM/project 46568abllvm/lib/Target/AArch64 AArch64InstrInfo.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

fixup! Remove SYSPxt_XZR and update code to reflect this
DeltaFile
+27-34llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+41-14llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
+8-26llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp
+2-30llvm/lib/Target/AArch64/AArch64InstrInfo.td
+0-20llvm/test/MC/AArch64/armv9-sysp-invalid.s
+13-3llvm/test/MC/AArch64/armv9-sysp-diagnostics.s
+91-1274 files not shown
+105-13710 files

LLVM/project e37f42bllvm/lib/Target/AArch64 AArch64InstrFormats.td AArch64InstrInfo.td, llvm/lib/Target/AArch64/Disassembler AArch64Disassembler.cpp

[AArch64][llvm] Tighten SYSP; don't disassemble invalid encodings

Tighten SYSP aliases, so that invalid encodings are disassembled
to `<unknown>`. This is because:

```
  Cn is a 4-bit unsigned immediate, in the range 8 to 9
  Cm is a 4-bit unsigned immediate, in the range 0 to 7
  op1 is a 3-bit unsigned immediate, in the range 0 to 6
  op2 is a 3-bit unsigned immediate, in the range 0 to 7
```

Ensure we check this when disassembling, and also constrain
tablegen for compile-time errors of invalid encodings.

Also adjust the testcases in `armv9-sysp-diagnostics.s` and
`llvm/test/MC/AArch64/armv9a-sysp.s`, as they were invalid,
and add a few invalid (out-of-range) SYSP-alikes to
test that `<unknown>` is printed.
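
The operand ranges listed above amount to a simple predicate. The following is a minimal sketch with a hypothetical function name, `isValidSyspOperands`, and is not the check added to the disassembler.

```cpp
#include <cassert>

// Hypothetical predicate mirroring the ranges quoted in the commit message:
// op1 in [0, 6], Cn in [8, 9], Cm in [0, 7], op2 in [0, 7].
constexpr bool isValidSyspOperands(unsigned Op1, unsigned Cn, unsigned Cm,
                                   unsigned Op2) {
  return Op1 <= 6 && Cn >= 8 && Cn <= 9 && Cm <= 7 && Op2 <= 7;
}
```

Under these assumptions, an encoding with `Cn` of 10 or `op1` of 7 fails the predicate, which corresponds to the cases that should disassemble to `<unknown>`.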
DeltaFile
+111-111llvm/test/MC/AArch64/armv9a-sysp.s
+25-1llvm/lib/Target/AArch64/AArch64InstrFormats.td
+25-0llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp
+20-0llvm/test/MC/AArch64/armv9-sysp-invalid.s
+7-8llvm/test/MC/AArch64/armv9-sysp-diagnostics.s
+7-3llvm/lib/Target/AArch64/AArch64InstrInfo.td
+195-1233 files not shown
+207-1279 files

LLVM/project 5685d3cllvm/lib/Target/AArch64/MCTargetDesc AArch64InstPrinter.cpp, llvm/test/MC/AArch64 armv9a-sysp.s

fixup! Add no-alias tests
DeltaFile
+4-3llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
+7-0llvm/test/MC/AArch64/armv9a-sysp.s
+11-32 files

LLVM/project b1d8d4fllvm/lib/Target/AArch64 AArch64InstrFormats.td AArch64InstrInfo.td

fixup! Address Marian's PR comments: use imm0_6 predicate
DeltaFile
+9-1llvm/lib/Target/AArch64/AArch64InstrFormats.td
+2-2llvm/lib/Target/AArch64/AArch64InstrInfo.td
+11-32 files

LLVM/project 6f9dd5cllvm/lib/Target/AArch64 AArch64InstrFormats.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

fixup! Templatise bounds checking and improve tests
DeltaFile
+15-4llvm/test/MC/AArch64/armv9-sysp-diagnostics.s
+18-0llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+12-5llvm/lib/Target/AArch64/AArch64InstrFormats.td
+0-8llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
+45-174 files