LLVM/project 4f5b905llvm/include/llvm/CodeGen ScheduleDAG.h, llvm/lib/CodeGen ScheduleDAG.cpp

[ScheduleDAG] Add a reachability cache to amortize DFS calls (#195079)

ScheduleDAGTopologicalSort::IsReachable falls back to a DFS on its
slow path. For some connectivity patterns this can result in roughly
quadratic behavior.

Add a cache of {A, B} -> Reachable(A, B). This is invalidated whenever
AddPred or InitDAGTopologicalSorting is called.
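
The caching scheme described above can be sketched roughly as follows. This is a stand-alone illustration, not the code from the patch: the `ReachabilityGraph` class, its `addEdge`, `isReachable`, and `dfsCalls` members, and the packed 64-bit key are all hypothetical names chosen for the example, and the fast path of the real `IsReachable` is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a scheduling DAG. This is NOT LLVM's
// ScheduleDAGTopologicalSort; it only illustrates the caching idea.
class ReachabilityGraph {
  std::vector<std::vector<int>> Succs;           // adjacency list
  std::unordered_map<std::uint64_t, bool> Cache; // {From, To} -> reachable
  unsigned DFSCalls = 0;                         // counts slow-path walks

  // Pack the node pair into one 64-bit key (an assumption of this sketch).
  static std::uint64_t key(int From, int To) {
    return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(From)) << 32) |
           static_cast<std::uint32_t>(To);
  }

  // Iterative depth-first search from From, looking for To.
  bool dfs(int From, int To) {
    ++DFSCalls;
    std::vector<bool> Visited(Succs.size(), false);
    std::vector<int> Work{From};
    Visited[From] = true;
    while (!Work.empty()) {
      int N = Work.back();
      Work.pop_back();
      if (N == To)
        return true;
      for (int S : Succs[N]) {
        if (!Visited[S]) {
          Visited[S] = true;
          Work.push_back(S);
        }
      }
    }
    return false;
  }

public:
  explicit ReachabilityGraph(int NumNodes) : Succs(NumNodes) {}

  // Adding an edge can change any cached answer, so drop the whole cache.
  void addEdge(int From, int To) {
    assert(From >= 0 && From < static_cast<int>(Succs.size()));
    assert(To >= 0 && To < static_cast<int>(Succs.size()));
    Succs[From].push_back(To);
    Cache.clear();
  }

  // Answer from the cache when possible; otherwise run the DFS once and
  // remember the result for later queries on the same pair.
  bool isReachable(int From, int To) {
    auto [It, Inserted] = Cache.try_emplace(key(From, To), false);
    if (Inserted)
      It->second = dfs(From, To);
    return It->second;
  }

  unsigned dfsCalls() const { return DFSCalls; }
};
```

Clearing the entire cache on every edge insertion is the simplest invalidation policy. It pays off when many queries arrive between modifications, which is the pattern the commit message suggests.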

For an antagonistic testcase, SelectionDAG time went from 1300s to 250s.

No testcase as no functional change, performance only.

---------

Co-authored-by: James Molloy <jmolloy at google.com>
DeltaFile
+11-0llvm/lib/CodeGen/ScheduleDAG.cpp
+3-0llvm/include/llvm/CodeGen/ScheduleDAG.h
+14-02 files

LLVM/project 23f5a7ellvm/test/CodeGen/AMDGPU/GlobalISel legalize-load-private.mir legalize-llvm.amdgcn.image.sample.a16.ll

AMDGPU/GlobalISel: Switch to extended LLTs

Switch is required to be able to translate bfloat.

After the switch most of the codegen patterns now require explicit
type on register to match instead of LLT::scalar.
So we can still use LLT::scalar for type checks but new instructions
created during lowerings/combines need to use the proper extended LLT.

Inst-select test sources are fully switched to i32/f32 so patterns can match;
sources for legalizer and regbanklegalize are left as is (they should probably
be switched as well).

New functionality worth noting is f16 and bitcast lowering to i32
f16 = g_bitcast i16
->
i32 = g_anyext i16
f16 = g_trunc i32

f16 = trunc i32 is legal
DeltaFile
+6,753-6,685llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir
+5,732-5,732llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.a16.ll
+5,570-5,519llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
+5,045-5,045llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-store-global.mir
+5,017-4,999llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir
+3,948-3,900llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.dim.a16.ll
+32,065-31,880589 files not shown
+107,722-105,330595 files

LLVM/project 98a84fcclang/include/clang/Options Options.td, clang/lib/Sema AnalysisBasedWarnings.cpp SemaLifetimeSafety.h

run-lifetime-analysis
DeltaFile
+2-2clang/include/clang/Options/Options.td
+1-2clang/lib/Sema/AnalysisBasedWarnings.cpp
+2-0clang/lib/Sema/SemaLifetimeSafety.h
+5-43 files

LLVM/project 76cdf60llvm/include/llvm/Analysis ValueTracking.h, llvm/lib/Analysis ValueTracking.cpp BasicAliasAnalysis.cpp

[BasicAA] Don't look through llvm.ptrmask in GEP decomposition (#197082)

DecomposeGEPExpression() looked through llvm.ptrmask via
getArgumentAliasingToReturnedPointer(Call, MustPreserveNullness=false).
ptrmask preserves the underlying object but can change the byte address
by clearing low bits, so treating its result as having the same symbolic
offset as its argument produces stale offsets and bogus NoAlias answers.
The bug was introduced by 3f2850bc606c847075673554fe49d4a35f525b61.
    
Rename MustPreserveNullness to MustPreserveOffset, the property
DecomposeGEPExpression actually needs. Offset preservation is strictly
stronger than nullness preservation, so existing callers remain correct
and the accepted intrinsic set is unchanged (ptrmask stays excluded).
Switch DecomposeGEPExpression to pass MustPreserveOffset=true. Every
call site is now tagged with MustPreserveOffset=.
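
To make the stale-offset problem concrete, here is a small sketch that models addresses as plain integers. The helpers `maskAddress`, `offsetLostByMask`, and `rangesOverlap` are hypothetical names invented for this illustration and are not part of BasicAA. The only property assumed about `llvm.ptrmask` is that it ANDs the address with the mask.

```cpp
#include <cassert>
#include <cstdint>

// Models the address computation of llvm.ptrmask: a bitwise AND with the mask.
constexpr std::uintptr_t maskAddress(std::uintptr_t Addr, std::uintptr_t Mask) {
  return Addr & Mask;
}

// Number of bytes by which masking moved the address downwards.
constexpr std::uintptr_t offsetLostByMask(std::uintptr_t Addr,
                                          std::uintptr_t Mask) {
  return Addr - maskAddress(Addr, Mask);
}

// Hypothetical helper: do the byte ranges [A, A+SizeA) and [B, B+SizeB),
// both given as offsets from the same base, overlap?
constexpr bool rangesOverlap(std::uintptr_t A, std::uintptr_t SizeA,
                             std::uintptr_t B, std::uintptr_t SizeB) {
  return A < B + SizeB && B < A + SizeA;
}

// Base + 5 masked down to 8-byte alignment lands back on Base.
static_assert(maskAddress(0x1005, ~std::uintptr_t{7}) == 0x1000,
              "masking clears the low three bits");
```

With a base of `0x1000`, the pointer `Base + 5` masked with `~7` is `Base + 0`. A 4-byte access at offset 0 and a 4-byte access through the masked pointer overlap, but an analysis that keeps the stale offset 5 would compare `[0, 4)` against `[5, 9)` and wrongly conclude they are disjoint.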
DeltaFile
+16-14llvm/lib/Analysis/ValueTracking.cpp
+25-0llvm/test/Analysis/BasicAA/ptrmask-gep-decomposition.ll
+12-8llvm/include/llvm/Analysis/ValueTracking.h
+3-3llvm/lib/Transforms/IPO/AttributorAttributes.cpp
+5-1llvm/lib/Analysis/BasicAliasAnalysis.cpp
+2-1llvm/lib/Analysis/CaptureTracking.cpp
+63-273 files not shown
+68-309 files

LLVM/project 25507b6llvm/include/llvm/BinaryFormat ELF.h, llvm/lib/ObjectYAML ELFYAML.cpp

[Hexagon] Define Hexagon v93 ELF flags (#196643)
DeltaFile
+19-0llvm/test/tools/obj2yaml/ELF/hexagon-eflags.yaml
+2-0llvm/include/llvm/BinaryFormat/ELF.h
+2-0llvm/lib/ObjectYAML/ELFYAML.cpp
+23-03 files

LLVM/project 8897e28clang/include/clang/Options Options.td, clang/lib/Sema AnalysisBasedWarnings.cpp SemaLifetimeSafety.h

run-lifetime-analysis
DeltaFile
+2-2clang/include/clang/Options/Options.td
+1-1clang/lib/Sema/AnalysisBasedWarnings.cpp
+2-0clang/lib/Sema/SemaLifetimeSafety.h
+5-33 files

LLVM/project 8bddd0fllvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/AArch64 neon-dotreduce.ll

[DAG] visitBITCAST - fold (conv (scalar_to_vector(load x))) -> (load (conv*)x) (#196978)

Legalization can leave superfluous scalar_to_vector nodes with the
scalar bitwidth matching the vector bitwidth - peek through these when
attempting bitcast folds.

Only one match in trunk at the moment, but there are some additional
folds encountered in #149798
DeltaFile
+5-0llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+1-2llvm/test/CodeGen/AArch64/neon-dotreduce.ll
+6-22 files

LLVM/project a83cfdalldb/include/lldb/Expression LLVMUserExpression.h, lldb/source/Plugins/ExpressionParser/Clang ClangUserExpression.cpp ClangUserExpression.h

[LLDB] Simplify the API of ClangUserExpression::ScanContext [NFC] (#197037)

- this function is a virtual function, but it is called by the leaf
class ClangUserExpression

- it also returns a Status only to then report any error as a warning

This patch devirtualizes the function, since there is no use case for
overriding it in other expression evaluator plugins, and it cleans up
the Status usage by passing in DiagnosticManager directly, like its
sibling functions do.
DeltaFile
+35-27lldb/source/Plugins/ExpressionParser/Clang/ClangUserExpression.cpp
+2-2lldb/source/Plugins/ExpressionParser/Clang/ClangUserExpression.h
+0-3lldb/include/lldb/Expression/LLVMUserExpression.h
+37-323 files

LLVM/project 350536ellvm/test/CodeGen/X86 vector-reduce-fminimum.ll

[X86] add llvm.vector.reduce.fminimum test coverage (#197210)
DeltaFile
+1,252-0llvm/test/CodeGen/X86/vector-reduce-fminimum.ll
+1,252-01 files

LLVM/project f48026bllvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp

[DAG] canCreateUndefOrPoison - fmaxnum/fminnum/fmaximum/fminimum/fmaximumnum/fminimumnum don't create poison (#197195)

Test coverage is proving tricky due to lack of folds that work with these - I'm open to suggestions if we don't want to just eyeball this.
DeltaFile
+8-0llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+8-01 files

LLVM/project 5977cb5clang-tools-extra/clangd Selection.cpp, clang-tools-extra/clangd/unittests SelectionTests.cpp

[clangd] Avoid crash on pseudo-destructor selection (#195939)

clangd crashes during textDocument/codeAction on valid pseudo-destructor
expressions like y->~decltype(A())(). The bug is in
Selection.cpp::earlySourceRange(), which assumes destructor names always
have NamedTypeInfo. The fix is adding null checks before calling
getTypeLoc().

Fixes #195788.
DeltaFile
+18-0clang-tools-extra/clangd/unittests/SelectionTests.cpp
+10-4clang-tools-extra/clangd/Selection.cpp
+28-42 files

LLVM/project 297e3e9llvm/test/CodeGen/X86 avx512-intrinsics-fast-isel.ll

[X86] avx512-intrinsics-fast-isel.ll - add nounwind to remove cfi noise (#197207)
DeltaFile
+407-471llvm/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
+407-4711 files

LLVM/project 4bce216llvm/test/tools/llvm-objdump lit.local.cfg, llvm/test/tools/llvm-objdump/wasm line-numbers.s

SymbolizableObjectFile: Fix Wasm test to avoid layering violation (#193574)

Tests for LLVM libraries should not require wasm-ld. It's not necessary
in this case to generate the binary at test time, so instead check in a
YAMLized pre-linked binary.
DeltaFile
+72-0llvm/test/tools/llvm-objdump/wasm/Inputs/line-numbers.yaml
+21-18llvm/test/tools/llvm-objdump/wasm/line-numbers.s
+0-4llvm/test/tools/llvm-objdump/lit.local.cfg
+93-223 files

LLVM/project 45fc52bllvm/utils/FileCheck FileCheck.cpp

Run clang-format
DeltaFile
+1-1llvm/utils/FileCheck/FileCheck.cpp
+1-11 files

LLVM/project 785e3c0llvm/include/llvm/Transforms/Scalar GVN.h, llvm/lib/Transforms/Scalar GVN.cpp

[GVN][NVPTX] Rename PRE flag to ScalarPRE, disable option in NVPTX (#190386)

Scalar PRE in GVN may cause performance issues in the NVPTX backend
by increasing register pressure. This PR renames the enable-pre flag to
enable-scalar-pre and updates its usage to cover an additional case of
scalar PRE being performed. The newly renamed option is also used to
disable scalar PRE for NVPTX.
DeltaFile
+96-0llvm/test/CodeGen/NVPTX/gvn-scalar-pre-reg-pressure.ll
+61-0llvm/test/Transforms/GVN/PRE/no-scalar-pre.ll
+24-15llvm/lib/Transforms/Scalar/GVN.cpp
+6-5llvm/include/llvm/Transforms/Scalar/GVN.h
+3-3llvm/test/Transforms/GVN/PRE/pre-basic-add.ll
+2-2llvm/test/Other/new-pm-print-pipeline.ll
+192-259 files not shown
+208-4015 files

LLVM/project 9c4ff6ecompiler-rt/test/asan/TestCases/Posix multiple_sigaltstack.cpp

[ASan][Darwin] Make multiple_sigaltstack.cpp test use MINSIGSTKSZ (#197204)
DeltaFile
+4-0compiler-rt/test/asan/TestCases/Posix/multiple_sigaltstack.cpp
+4-01 files

LLVM/project 015bb78llvm/lib/Transforms/Vectorize VPlanPatternMatch.h

[NFC][LLVM][VPlan] Fix "parameter ‘P’ set but not used" warning. (#197194)

For Is... = {} the fold expression short-circuits to true and does not
evaluate P.
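
The behavior described above can be reproduced in isolation. The sketch below is not the VPlanPatternMatch.h code; `allPositive`, `isPositive`, and `EvalCount` are hypothetical names, and `[[maybe_unused]]` is shown as one common way to silence such a warning, not necessarily the fix this commit applies.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Counts how many times the predicate below actually runs.
static int EvalCount = 0;

template <typename T> bool isPositive(const T &V) {
  ++EvalCount;
  return V > 0;
}

// A unary fold over && with an empty pack Is... is defined to be true, and
// in that case P is never read. [[maybe_unused]] tells the compiler that
// this is intentional.
template <typename Tuple, std::size_t... Is>
bool allPositive([[maybe_unused]] const Tuple &P, std::index_sequence<Is...>) {
  return (isPositive(std::get<Is>(P)) && ...);
}
```

Calling `allPositive(std::tuple<>{}, std::index_sequence<>{})` returns true and leaves `EvalCount` at zero, which is exactly the situation in which a compiler may report the parameter as set but not used.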
DeltaFile
+2-1llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
+2-11 files

LLVM/project c388be4llvm/utils/FileCheck FileCheck.cpp

Add unused inits; remove unused constructor
DeltaFile
+3-2llvm/utils/FileCheck/FileCheck.cpp
+3-21 files

LLVM/project f612211llvm/lib/Transforms/Vectorize LoopVectorize.cpp VPlanConstruction.cpp, llvm/test/Transforms/LoopVectorize/VPlan tail-folding.ll

[VPlan] Introduce reduction selects for tail folding in foldTailByMasking. NFCI (#192987)

Currently addComputeReductionResult has to check the cost model to see
if the loop is tail folded, and if so then manually fix up the backedge
value so any tail elements are ignored.

This PR moves this handling into foldTailByMasking itself so the plan
doesn't require fixing up. We do this by setting the incoming value
for the latch phi to the reduction phi instead of poison. A blend will
be created for this automatically.
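
As a rough scalar model of the idea, the loop below simulates a tail-folded sum reduction in which inactive lanes keep the previous value of the reduction phi instead of taking a poison value. It does not use any VPlan API; `VF`, `tailFoldedSum`, and the placeholder value for inactive lanes are assumptions made for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vectorization factor for the simulation.
constexpr std::size_t VF = 4;

// Sums Data using VF lanes per iteration. The final iteration may have
// inactive lanes; for those, the select keeps the old phi value.
int tailFoldedSum(const std::vector<int> &Data) {
  std::array<int, VF> Phi{}; // per-lane reduction phi, starts at 0
  for (std::size_t Base = 0; Base < Data.size(); Base += VF) {
    for (std::size_t Lane = 0; Lane < VF; ++Lane) {
      bool Active = Base + Lane < Data.size(); // tail-folding mask
      // Inactive lanes load nothing; use an arbitrary placeholder value.
      int Loaded = Active ? Data[Base + Lane] : 12345;
      int Updated = Phi[Lane] + Loaded;
      // select(mask, updated, phi): masked-off lanes pass the phi through.
      Phi[Lane] = Active ? Updated : Phi[Lane];
    }
  }
  // Horizontal reduction of the lanes after the loop.
  int Sum = 0;
  for (int V : Phi)
    Sum += V;
  return Sum;
}
```

Because the masked-off lanes never absorb the placeholder value, the horizontal reduction after the loop needs no extra fix-up for the tail.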

The main benefits of this are that the reduction is correct when tail
folding is applied, and we don't need to worry about tail folding in as
many places.

In order to preserve some of the optimizations that we get on
VPInstruction::Select we need to convert the VPBlendRecipe to a select.
DeltaFile
+32-16llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+14-6llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+1-1llvm/test/Transforms/LoopVectorize/VPlan/tail-folding.ll
+47-233 files

LLVM/project 98b84b5llvm/utils/FileCheck FileCheck.cpp

Merge branch 'filecheck-marker-mid-tail' into filecheck-marker-range
DeltaFile
+3-2llvm/utils/FileCheck/FileCheck.cpp
+3-21 files

LLVM/project 153ef72llvm/utils/FileCheck FileCheck.cpp

Merge branch 'filecheck-input-annotation-labeler' into filecheck-marker-mid-tail
DeltaFile
+3-2llvm/utils/FileCheck/FileCheck.cpp
+3-21 files

LLVM/project 2e599c6llvm/utils/FileCheck FileCheck.cpp

Fix typo; add nullptr inits
DeltaFile
+3-2llvm/utils/FileCheck/FileCheck.cpp
+3-21 files

LLVM/project 16e8a3cllvm/include/llvm/Analysis MemoryBuiltins.h, llvm/lib/Analysis MemoryBuiltins.cpp

[MemoryBuiltins] Remove isNewLikeFn() (#197209)

This function is unused.
DeltaFile
+0-6llvm/lib/Analysis/MemoryBuiltins.cpp
+0-4llvm/include/llvm/Analysis/MemoryBuiltins.h
+0-102 files

LLVM/project 3e6b973llvm/test/CodeGen/AMDGPU/GlobalISel legalize-load-private.mir legalize-llvm.amdgcn.image.sample.a16.ll

AMDGPU/GlobalISel: Switch to extended LLTs

Switch is required to be able to translate bfloat.

After the switch most of the codegen patterns now require explicit
type on register to match instead of LLT::scalar.
So we can still use LLT::scalar for type checks but new instructions
created during lowerings/combines need to use the proper extended LLT.

Inst-select test sources are fully switched to i32/f32 so patterns can match;
sources for legalizer and regbanklegalize are left as is (they should probably
be switched as well).

New functionality worth noting is f16 and bitcast lowering to i32
f16 = g_bitcast i16
->
i32 = g_anyext i16
f16 = g_trunc i32

f16 = trunc i32 is legal
DeltaFile
+6,753-6,685llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir
+5,732-5,732llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.a16.ll
+5,570-5,519llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
+5,045-5,045llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-store-global.mir
+5,017-4,999llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir
+3,948-3,900llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.dim.a16.ll
+32,065-31,880581 files not shown
+107,268-104,875587 files

LLVM/project 422d023libc/cmake/modules LLVMLibCArchitectures.cmake

[libc] Pass -c to compiler when detecting target (#197012)

Follow-up to #176680 where I claimed having done this, but apparently
didn't actually add it to the commit.

Hopefully no observable behavior change; will tell the compiler to omit
linker info in its output, which we don't need for this detection step.
DeltaFile
+1-1libc/cmake/modules/LLVMLibCArchitectures.cmake
+1-11 files

LLVM/project 13a20bdllvm/test/CodeGen/X86 select-big-integer.ll

[X86] Add test coverage for #196493 (#197198)
DeltaFile
+82-1llvm/test/CodeGen/X86/select-big-integer.ll
+82-11 files

LLVM/project 17dde87llvm/include/llvm/Support Debug.h, llvm/lib/Support Debug.cpp

[Support] Add a function to print the debug log (#197184)

With `EnableDebugBuffering`, the debug log is stored in a circular
buffer and printed, with a nice banner, on program termination - this is
achieved via a signal handler. For in-process tool execution, such as
for running the regression tests using daemon versions of the tools, we
need to be able to trigger the printing/flushing of the debug log from
the process itself. This PR just adds a small function `printDebugLog`
which checks if debug output and debug log buffering are enabled and, if
so, prints the debug log.

The code for printing the debug log in the signal handler is moved to a
new function `printDebugLogImpl` which is called by the signal handler
and `printDebugLog` - the reason this is separate from `printDebugLog`
is to avoid running the option check in the signal handler
implementation, in case options were reset before the signal handler is
called, as this would be an unintentional behavioral change.
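
The split between a checked entry point and an unchecked implementation can be sketched like this. It is a simplified model rather than the code in Debug.cpp: the `sketch` namespace, the `BufferedLog` vector (standing in for the circular buffer), the banner text, the `std::ostream` parameter, and the `bool` return value are all assumptions made so the example is self-contained and testable.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

namespace sketch {
// Hypothetical stand-ins for the debug options and the buffered log.
bool DebugFlag = false;
bool EnableDebugBuffering = false;
std::vector<std::string> BufferedLog;

// Unconditional printer. A signal handler would call this directly, so that
// options reset before the signal arrives cannot suppress the output.
void printDebugLogImpl(std::ostream &OS) {
  OS << "*** Debug Log Output ***\n";
  for (const std::string &Line : BufferedLog)
    OS << Line << '\n';
  BufferedLog.clear();
}

// Checked entry point for in-process callers. Prints only when both debug
// output and buffering are enabled; returns whether anything was printed.
bool printDebugLog(std::ostream &OS) {
  if (!DebugFlag || !EnableDebugBuffering)
    return false;
  printDebugLogImpl(OS);
  return true;
}
} // namespace sketch
```

Keeping the option check out of `printDebugLogImpl` preserves the behavior of the signal-handler path, while callers inside the process get a safe no-op when buffering is disabled.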
DeltaFile
+11-2llvm/lib/Support/Debug.cpp
+5-0llvm/include/llvm/Support/Debug.h
+16-22 files

LLVM/project ef6df9dclang/docs ReleaseNotes.rst, clang/lib/Sema SemaDecl.cpp SemaObjC.cpp

[Clang] Fix assertion when __block is used on global variables in C mode (#194856)

This is a reland PR, related to #183988.

I added an extra check in handleBlocksAttr to ensure that illegal Decl
values are not passed to downstream functions,
and removed an unnecessary check in `CheckCompleteVariableDeclaration`.

Also added an extra regression test.

Fixes #183974
DeltaFile
+11-0clang/test/Sema/block-on-objc-ivars.m
+0-6clang/lib/Sema/SemaDecl.cpp
+6-0clang/lib/Sema/SemaObjC.cpp
+5-0clang/test/Sema/gh183974.c
+1-0clang/docs/ReleaseNotes.rst
+23-65 files

LLVM/project 68dfda0llvm/test/CodeGen/AMDGPU packed-fneg-fsub-fp16.ll strict_fsub.f16.ll, llvm/test/CodeGen/AMDGPU/GlobalISel legalize-fsub.mir combine-fma-sub-mul.ll

[AMDGPU] Optimize fneg and fsub with packed fp16 ops (#196659)

This work optimizes fneg and fsub when packed half math instructions are supported.
  On the GlobalISel path, wider G_FSUB vectors with f16 elements are
split to v2f16 so that v_pk_add_f16 can be selected.
  On the SelectionDAG path, we make FNEG legal and also split wider vectors
to v2f16. This way, fneg can be folded into the source modifiers of packed half ops.
DeltaFile
+557-0llvm/test/CodeGen/AMDGPU/packed-fneg-fsub-fp16.ll
+85-108llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fsub.mir
+30-82llvm/test/CodeGen/AMDGPU/GlobalISel/combine-fma-sub-mul.ll
+27-67llvm/test/CodeGen/AMDGPU/strict_fsub.f16.ll
+11-22llvm/test/CodeGen/AMDGPU/wmma-gfx12-w32-f16-f32-matrix-modifiers.ll
+8-24llvm/test/CodeGen/AMDGPU/GlobalISel/combine-fma-sub-neg-mul.ll
+718-3032 files not shown
+724-3068 files

LLVM/project 5a13758libc/cmake/modules LLVMLibCCompileOptionRules.cmake

Revert "[libc] Build with -Wshadow" (#197201)

Reverts llvm/llvm-project#196519

Passed CI on the PR, but apparently breaks several bots.
DeltaFile
+0-1libc/cmake/modules/LLVMLibCCompileOptionRules.cmake
+0-11 files