LLVM/project 9bb1933clang/lib/CIR/CodeGen CIRGenBuiltinAMDGPU.cpp, clang/test/CIR/CodeGenHIP builtins-amdgcn.hip

[CIR][AMDGPU] Add lowering for amdgcn readlane readfirstlane builtins
DeltaFile
+16-0clang/test/CIR/CodeGenHIP/builtins-amdgcn.hip
+13-2clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp
+29-22 files

LLVM/project 215bd25clang/lib/StaticAnalyzer/Core ExprEngine.cpp

[analyzer] Clean up evalBind, fix bad logic (#196313)

This commit refactors `ExprEngine::evalBind` to eliminate the use of a
`NodeBuilder` and fix incorrect logic that was apparently introduced
because the `NodeBuilder` had obfuscated the underlying set operations.

In the special case when the engine is binding to an `Unknown` or
`Undefined` memory location, with the old code on each execution path
_either_ only the `check::Bind` checkers _or_ only the pointer escape
checkers were invoked. This commit ensures that on each execution path
_both_ the `check::Bind` checkers _and then_ the pointer escape checkers
get a chance to activate.
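
The "both, in sequence" behavior described above can be pictured as composing two stages over a set of nodes. The following is only a rough sketch under simplifying assumptions: `Node`, `Stage`, `runStage` and `bindThenEscape` are hypothetical names, nodes are plain strings, and none of this is the analyzer's actual API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model: a node is a label, and a stage maps one node to zero
// or more successor nodes (zero successors models a sink).
using Node = std::string;
using Nodes = std::vector<Node>;
using Stage = std::function<Nodes(const Node &)>;

// Applies `stage` to every input node and concatenates the results.
Nodes runStage(const Nodes &input, const Stage &stage) {
  Nodes output;
  for (const Node &node : input) {
    Nodes next = stage(node);
    output.insert(output.end(), next.begin(), next.end());
  }
  return output;
}

// Every surviving path goes through the bind stage and then the escape
// stage, instead of through only one of the two.
Nodes bindThenEscape(const Nodes &preds, const Stage &checkBind,
                     const Stage &pointerEscape) {
  return runStage(runStage(preds, checkBind), pointerEscape);
}

// Small self-check with two toy stages.
void demoBindThenEscape() {
  Stage checkBind = [](const Node &n) { return Nodes{n + "+bind"}; };
  Stage pointerEscape = [](const Node &n) { return Nodes{n + "+escape"}; };
  Nodes result = bindThenEscape({"path"}, checkBind, pointerEscape);
  assert(result.size() == 1 && result[0] == "path+bind+escape");
}
```

In this model a stage that returns no nodes ends the path, so the second stage simply has nothing to run on.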

I'm pretty sure that the bad logic did not cause incorrect behavior of
the analyzer, because there are no `checkBind` checkers that generate
non-sink transitions when the location is `Unknown` or `Undefined`.

I also added an assertion that the location argument of `evalBind`
cannot be a `NonLoc`, because this is a common-sense precondition,
appears to hold in practice, and makes it easier to reason about the
behavior of this function.
DeltaFile
+19-33clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+19-331 files

LLVM/project 26cae62lldb/source/Host/common NativeProcessProtocol.cpp, lldb/source/Plugins/Process/FreeBSD NativeProcessFreeBSD.cpp

Reapply "[lldb] Do not refcount breakpoints in lldb-server" (#195858) (#196891)

This reapplies #195858 with a fix for 32-bit arm (and generally, any
architecture that uses software single-stepping). The problem was that
the temporary breakpoints used for single-stepping were interfering with
the breakpoints set by the client.

The fix is to check for existing breakpoints before setting the
temporary ones. To achieve this, I've separated the notion of "next PC
candidates for a thread" from "step breakpoints we've actually set".
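
The separation described above can be sketched roughly as follows. This is a simplified model, not the lldb-server code: `StepState`, `armStep` and `disarmStep` are hypothetical names, and a `std::set` of addresses stands in for the process's breakpoint list.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

using Address = std::uint64_t;

// Hypothetical per-thread state: where the thread may land next, and which
// temporary breakpoints were actually inserted for the step.
struct StepState {
  std::vector<Address> nextPcCandidates;
  std::vector<Address> stepBreakpointsSet;
};

// Inserts temporary breakpoints for a software single-step, skipping any
// address that already has a breakpoint (for example one set by the client).
void armStep(std::set<Address> &breakpoints, StepState &state) {
  state.stepBreakpointsSet.clear();
  for (Address pc : state.nextPcCandidates) {
    // insert() reports whether the address was newly added.
    if (breakpoints.insert(pc).second)
      state.stepBreakpointsSet.push_back(pc);
  }
}

// Removes only the temporaries recorded by armStep; breakpoints that existed
// before the step are left in place.
void disarmStep(std::set<Address> &breakpoints, StepState &state) {
  for (Address pc : state.stepBreakpointsSet)
    breakpoints.erase(pc);
  state.stepBreakpointsSet.clear();
}

// Small self-check: a client breakpoint at 0x1000 survives the step.
void demoStep() {
  std::set<Address> breakpoints = {0x1000};
  StepState state;
  state.nextPcCandidates = {0x1000, 0x1004};
  armStep(breakpoints, state);
  assert(state.stepBreakpointsSet.size() == 1);
  disarmStep(breakpoints, state);
  assert(breakpoints.count(0x1000) == 1 && breakpoints.size() == 1);
}
```

Because removal is driven by the list of breakpoints actually set, no reference counting is needed to keep the client's breakpoint alive in this model.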

The FreeBSD code had some software single-stepping code, but:
- this was [introduced](https://reviews.llvm.org/D95802) for mips64
support, which was
[removed](https://github.com/llvm/llvm-project/pull/179582) earlier this
year
- AFAICT, this never worked since the original patch only checked
`m_threads_stepping_with_breakpoint`, but never set it to anything.


    [18 lines not shown]
DeltaFile
+19-21lldb/source/Plugins/Process/Utility/NativeProcessSoftwareSingleStep.cpp
+1-16lldb/source/Plugins/Process/FreeBSD/NativeProcessFreeBSD.cpp
+4-10lldb/source/Host/common/NativeProcessProtocol.cpp
+11-2lldb/test/API/functionalities/multi-breakpoint/TestMultiBreakpoint.py
+6-7lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
+5-2lldb/source/Plugins/Process/Utility/NativeProcessSoftwareSingleStep.h
+46-582 files not shown
+51-618 files

LLVM/project f7f911flibc/include CMakeLists.txt, libc/include/llvm-libc-types in_port_t.h CMakeLists.txt

[libc] Add some types to netinet/in.h (#196932)

Not including more types because I need to fix the in_addr definition first.

This exposes stdint macros and types through the header, but POSIX
permits that behavior (and explicitly requires that we define uint8_t
and uint32_t).

No test as this is just adding a typedef, and I don't *think* we have
tests for that, but I can add a "check that type is defined" test if
that is desirable.
DeltaFile
+16-0libc/include/llvm-libc-types/in_port_t.h
+4-1libc/include/CMakeLists.txt
+4-1libc/include/netinet/in.yaml
+1-0libc/include/llvm-libc-types/CMakeLists.txt
+25-24 files

LLVM/project 694fc0ellvm/test/CodeGen/AArch64/GlobalISel select-intrinsic-aarch64-sdiv.mir preselect-process-phis.mir

Fix tests
DeltaFile
+4-22llvm/test/CodeGen/AArch64/GlobalISel/select-intrinsic-aarch64-sdiv.mir
+2-4llvm/test/CodeGen/AArch64/GlobalISel/preselect-process-phis.mir
+6-262 files

LLVM/project 65a206fllvm/test/CodeGen/X86 fold-int-pow2-with-fmul-or-fdiv.ll

[X86] fold-int-pow2-with-fmul-or-fdiv.ll - regenerate with (V)PADD asm comments (#197137)

Reduce diff in #197097
DeltaFile
+11-11llvm/test/CodeGen/X86/fold-int-pow2-with-fmul-or-fdiv.ll
+11-111 files

LLVM/project 8393227mlir/lib/Transforms Mem2Reg.cpp, mlir/test/Transforms mem2reg.mlir

fix handling of region
DeltaFile
+23-0mlir/test/Transforms/mem2reg.mlir
+3-3mlir/lib/Transforms/Mem2Reg.cpp
+26-32 files

LLVM/project e08e48dllvm/test/CodeGen/AArch64/GlobalISel select-intrinsic-aarch64-sdiv.mir, llvm/test/TableGen/GlobalISelEmitter MatchTableOptimizerRecursion.td

[GlobalISel] Recursively Optimise MatchTable Matchers

The core of this change is the additional call to `Matcher::optimize()` in the `optimizeRules` function,
which enables the match table optimization logic to recurse on the children of every GroupMatcher, forming
additional groups (which hoist more common predicates into a shared group).
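
As a rough illustration of recursive grouping, the sketch below hoists a shared leading predicate out of consecutive rules and then recurses on the group's children. It is a toy model under stated assumptions: `Rule`, `Matcher`, `optimizeRules` and `countPredicateChecks` here are hypothetical and far simpler than the TableGen classes, and the count ignores the extra `Try`/`Reject` opcodes a real group needs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A rule is an ordered list of predicate names followed by an action.
struct Rule {
  std::vector<std::string> predicates;
  std::string action;
};

// Either a leaf (one rule) or a group that tests `hoisted` once for all of
// its children.
struct Matcher {
  std::string hoisted;            // set for groups only
  std::vector<Matcher> children;  // non-empty for groups only
  Rule rule;                      // meaningful for leaves only
};

// Groups consecutive rules that share their first predicate, then recurses
// on each group's children with that predicate stripped.
std::vector<Matcher> optimizeRules(const std::vector<Rule> &rules) {
  std::vector<Matcher> result;
  std::size_t i = 0;
  while (i < rules.size()) {
    std::size_t j = i + 1;
    if (!rules[i].predicates.empty()) {
      const std::string &first = rules[i].predicates.front();
      while (j < rules.size() && !rules[j].predicates.empty() &&
             rules[j].predicates.front() == first)
        ++j;
    }
    Matcher m;
    if (j - i < 2) {
      m.rule = rules[i];  // nothing to share: keep the rule as a leaf
    } else {
      m.hoisted = rules[i].predicates.front();
      std::vector<Rule> stripped(rules.begin() + i, rules.begin() + j);
      for (Rule &r : stripped)
        r.predicates.erase(r.predicates.begin());
      m.children = optimizeRules(stripped);  // the recursive step
    }
    result.push_back(m);
    i = j;
  }
  return result;
}

// Counts predicate tests in the tree: one per group plus each leaf's own.
std::size_t countPredicateChecks(const std::vector<Matcher> &matchers) {
  std::size_t total = 0;
  for (const Matcher &m : matchers)
    total += m.children.empty() ? m.rule.predicates.size()
                                : 1 + countPredicateChecks(m.children);
  return total;
}

// Small self-check: 8 predicate tests in the flat rules shrink to 5.
void demoGrouping() {
  std::vector<Rule> rules = {Rule{{"A", "B", "X"}, "r1"},
                             Rule{{"A", "B", "Y"}, "r2"},
                             Rule{{"A", "C"}, "r3"}};
  assert(countPredicateChecks(optimizeRules(rules)) == 5);
}
```

Without the recursive call, only `A` would be hoisted in the demo and the shared `B` would still be tested once per rule.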

To enable that, I had to update the `getFirstConditionAsRootType` implementation to support `GroupMatcher`.
I also included a small refactoring of the match table optimization pipeline that was identical between the
GlobalISel and GlobalISelCombiner emitters.

The results of this change are up to a 25% size reduction for GlobalISel match tables.
There is a tiny increase (a few bytes) in a combiner table because we now create new groups
(which need up to 3 additional opcodes because of the new `Try` and `Reject` required) to hoist one predicate for only 2 rules, which
results in a small net regression there (one or two more ops).

I used a small bash script to compare all relevant files; this is the before/after:
```
FILE                                          OLD      NEW    DIFF%    SAME?
----                                      -------  -------    -----    -----

    [8 lines not shown]
DeltaFile
+204-0llvm/test/TableGen/GlobalISelEmitter/MatchTableOptimizerRecursion.td
+67-19llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp
+5-34llvm/utils/TableGen/GlobalISelEmitter.cpp
+1-34llvm/utils/TableGen/GlobalISelCombinerEmitter.cpp
+12-7llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.h
+18-0llvm/test/CodeGen/AArch64/GlobalISel/select-intrinsic-aarch64-sdiv.mir
+307-941 files not shown
+310-947 files

LLVM/project d176a1ellvm/lib/CodeGen/GlobalISel InstructionSelect.cpp, llvm/lib/Target/AArch64/GISel AArch64InstructionSelector.cpp

[GlobalISel][AMDGPU][AArch64] Fix GlobalISel copy propagation (#188781)

Disallow propagation of sub-registers after GlobalISel, as the current
code blindly drops any sub-register information. This also fixes
bugs in the AArch64 and AMDGPU back-ends, which relied on the incorrect
behavior and would fail with the fix:

* Update `selectG_UNMERGE_VALUES` in AMDGPU so instead of generating
`hi16` for SGPR it shifts higher bits into the destination register
using `lshr`.
* Prevent AArch64 back-end from generating spurious `sub_32:gpr32all`
when selecting copy.
* Test changes: `fpto[s/u]i-sat-vector.ll`: The correct number of
conversions is now generated as higher 16-bits are handled correctly;
however, it introduces `lshr` instructions. This should be resolved in
#188287 by enabling `s_cvt_hi_*`.
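
For the AMDGPU bullet, the arithmetic behind taking the upper half with a logical shift right can be sketched in plain C++. This only illustrates the bit manipulation and is not the instruction selector's code; `unmerge32To16` and `merge16To32` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Splits a 32-bit value into {low, high} 16-bit halves. The high half is
// obtained by shifting the upper bits down rather than by addressing a
// "hi16" sub-register.
std::pair<std::uint16_t, std::uint16_t> unmerge32To16(std::uint32_t value) {
  std::uint16_t low = static_cast<std::uint16_t>(value & 0xFFFFu);
  std::uint16_t high = static_cast<std::uint16_t>(value >> 16);
  return {low, high};
}

// Reassembles the halves; used here only to show the split is lossless.
std::uint32_t merge16To32(std::uint16_t low, std::uint16_t high) {
  return (static_cast<std::uint32_t>(high) << 16) | low;
}

// Small self-check: splitting and merging round-trips.
void demoUnmerge() {
  std::pair<std::uint16_t, std::uint16_t> halves = unmerge32To16(0xABCD1234u);
  assert(merge16To32(halves.first, halves.second) == 0xABCD1234u);
}
```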
DeltaFile
+144-55llvm/test/CodeGen/AMDGPU/fptosi-sat-vector.ll
+128-51llvm/test/CodeGen/AMDGPU/fptoui-sat-vector.ll
+9-7llvm/lib/CodeGen/GlobalISel/InstructionSelect.cpp
+10-2llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+7-0llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+298-1155 files

LLVM/project 9346acdllvm/lib/TableGen TGParser.cpp, llvm/test/TableGen submulticlass-leteq.td submulticlass-typecheck.td

[TableGen] Add submulticlass typechecking to template arg values (#197128)

Some typechecking was missing when parsing a submulticlass reference.
Add the CheckTemplateArgValues call in ParseSubMultiClassReference.

Resolves https://github.com/llvm/llvm-project/issues/84910.
DeltaFile
+21-0llvm/test/TableGen/submulticlass-leteq.td
+12-0llvm/test/TableGen/submulticlass-typecheck.td
+5-0llvm/lib/TableGen/TGParser.cpp
+38-03 files

LLVM/project 2ec483dclang/test/CodeGenOpenCL builtins-amdgcn-gfx1250.cl, llvm/include/llvm/IR IntrinsicsAMDGPU.td

[AMDGPU] Update permlane_bcast/down/up/xor intrinsic to support more types
DeltaFile
+2,848-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.permlane.gfx1250.ll
+25-8llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+12-12llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+9-9clang/test/CodeGenOpenCL/builtins-amdgcn-gfx1250.cl
+16-2llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+8-6llvm/lib/Target/AMDGPU/VOP3Instructions.td
+2,918-373 files not shown
+2,938-429 files

LLVM/project bfe5d5bclang/include/clang/Analysis/Analyses/LifetimeSafety LifetimeSafety.h, clang/include/clang/Basic DiagnosticSemaKinds.td

[LifetimeSafety] Diagnose invalidated-field (#196680)

Teach lifetime safety invalidation diagnostics to handle origins that
escape through fields before the referenced object is invalidated.
Previously they were skipped.

Partially addresses https://github.com/llvm/llvm-project/issues/195706
DeltaFile
+60-0clang/test/Sema/warn-lifetime-safety-invalidations.cpp
+32-0clang/lib/Sema/SemaLifetimeSafety.h
+20-1clang/lib/Analysis/LifetimeSafety/Checker.cpp
+6-0clang/include/clang/Analysis/Analyses/LifetimeSafety/LifetimeSafety.h
+4-0clang/include/clang/Basic/DiagnosticSemaKinds.td
+122-15 files

LLVM/project d611791llvm/include/llvm/IR Function.h InstructionListener.h, llvm/lib/IR Function.cpp Value.cpp

refactoring
DeltaFile
+15-9llvm/include/llvm/IR/Function.h
+11-4llvm/lib/IR/Function.cpp
+6-2llvm/include/llvm/IR/InstructionListener.h
+1-3llvm/lib/IR/Value.cpp
+33-184 files

LLVM/project b4aa4d4llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 avx512-skx-insert-subvec.ll

[X86] combineINSERT_SUBVECTOR - only fold vXi1 zero-widening if scalar mask source has one use (#197125)

Fixes infinite loop reported on #192699
DeltaFile
+21-0llvm/test/CodeGen/X86/avx512-skx-insert-subvec.ll
+2-2llvm/lib/Target/X86/X86ISelLowering.cpp
+23-22 files

LLVM/project fccd6famlir/lib/Interfaces ControlFlowInterfaces.cpp

[mlir][Interfaces] Disallow tokens in `BranchOpInterface` / `RegionBranchOpInterface` verifiers
DeltaFile
+14-0mlir/lib/Interfaces/ControlFlowInterfaces.cpp
+14-01 files

LLVM/project 9ca55c0llvm/lib/Transforms/InstCombine InstCombineAddSub.cpp, llvm/test/Transforms/InstCombine sub-xor.ll

[InstCombine] Relax the requirements for (X ^ C2) + C -> (C2 + C) - X (#196897)

If (C2 - X) has no borrow between bits, it is equivalent to (X ^ C2).
A borrow would occur when c2_bit=0 and x_bit=1.
It follows that c2_bit=1 or x_bit=0 means no borrow.

Remove an artificial condition that C2 must be a low bits mask.

Proof: https://alive2.llvm.org/ce/z/uNMsg_
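
The no-borrow argument can also be checked by brute force on a small bit width. The sketch below is an illustration on concrete 8-bit values, not the InstCombine code (which would have to prove the condition from known bits); the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// True when C2 - X never borrows: every set bit of x is also set in c2,
// i.e. there is no position with c2_bit=0 and x_bit=1.
bool noBorrow(std::uint8_t c2, std::uint8_t x) {
  return (x & static_cast<std::uint8_t>(~c2)) == 0;
}

// Left-hand side of the fold: (X ^ C2) + C, wrapping at 8 bits.
std::uint8_t xorThenAdd(std::uint8_t x, std::uint8_t c2, std::uint8_t c) {
  return static_cast<std::uint8_t>((x ^ c2) + c);
}

// Right-hand side of the fold: (C2 + C) - X, wrapping at 8 bits.
std::uint8_t addThenSub(std::uint8_t x, std::uint8_t c2, std::uint8_t c) {
  return static_cast<std::uint8_t>((c2 + c) - x);
}

// Exhaustively checks that both sides agree whenever noBorrow holds.
bool foldHoldsForAll8BitValues() {
  for (unsigned c2 = 0; c2 < 256; ++c2)
    for (unsigned x = 0; x < 256; ++x) {
      if (!noBorrow(c2, x))
        continue;
      for (unsigned c = 0; c < 256; ++c)
        if (xorThenAdd(x, c2, c) != addThenSub(x, c2, c))
          return false;
    }
  return true;
}

// Small self-check, including a C2 that is not a low-bits mask.
void demoFold() {
  assert(foldHoldsForAll8BitValues());
  assert(noBorrow(0xA5, 0x21));
  assert(xorThenAdd(0x21, 0xA5, 3) == addThenSub(0x21, 0xA5, 3));
}
```

When the condition fails, for instance `C2 = 0x01` and `X = 0x02`, the two sides differ, which is why some proof of no borrow is still required.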
DeltaFile
+78-0llvm/test/Transforms/InstCombine/sub-xor.ll
+6-7llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
+84-72 files

LLVM/project 87cba87clang/test/AST ast-dump-templates.cpp, llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll

Merge branch 'main' into users/jmmartinez/error_on_prune
DeltaFile
+652-9,305clang/test/AST/ast-dump-templates.cpp
+5,061-4,162llvm/test/CodeGen/Thumb2/mve-clmul.ll
+7,601-671llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+8,195-0llvm/test/MC/AMDGPU/gfx13_asm_vop3.s
+8,182-0llvm/test/MC/AMDGPU/gfx13_asm_vop3-fake16.s
+6,873-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-sve-instructions.s
+36,564-14,13810,621 files not shown
+527,596-229,59610,627 files

LLVM/project 3f88ed1llvm/test/MC/AMDGPU gfx13_asm_vop3.s gfx13_asm_vop3-fake16.s

Merge branch 'main' into users/jakos-sec/spr/safestack-add-sigaction-interceptor
DeltaFile
+8,195-0llvm/test/MC/AMDGPU/gfx13_asm_vop3.s
+8,182-0llvm/test/MC/AMDGPU/gfx13_asm_vop3-fake16.s
+5,587-0llvm/test/MC/AMDGPU/gfx13_asm_vop3_dpp16.s
+5,574-0llvm/test/MC/AMDGPU/gfx13_asm_vop3_dpp16-fake16.s
+4,106-0llvm/test/MC/AMDGPU/gfx13_asm_vop3_from_vop1-fake16.s
+3,524-0llvm/test/MC/AMDGPU/gfx13_asm_vop3_dpp8.s
+35,168-0476 files not shown
+59,727-10,327482 files

LLVM/project c956013lld/ELF SyntheticSections.cpp

[lld] Remove unused argument of DataExtractor constructor (NFC) (#196361)

`AddressSize` parameter is not used by `DataExtractor` and will be
removed in the future. See #190519 for more context.
DeltaFile
+1-2lld/ELF/SyntheticSections.cpp
+1-21 files

LLVM/project fb18fe7clang/lib/CodeGen/TargetBuiltins ARM.cpp, clang/test/CodeGen/AArch64 cpu-supports-target.c

[AArch64] Guard against vector invalidation in EmitAArch64CpuSupports. (#196909)

This prevents the vector from being invalidated whilst iterating over it.
As far as I can tell we were adding elements twice.

Fixes #196789
DeltaFile
+15-0clang/test/CodeGen/AArch64/cpu-supports-target.c
+3-2clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+18-22 files

LLVM/project 9f3d304clang-tools-extra/clang-tidy/misc StaticInitializationCycleCheck.cpp StaticInitializationCycleCheck.h, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Add new check 'misc-static-initialization-cycle' (#175342)
DeltaFile
+395-0clang-tools-extra/clang-tidy/misc/StaticInitializationCycleCheck.cpp
+277-0clang-tools-extra/test/clang-tidy/checkers/misc/static-initialization-cycle.cpp
+63-0clang-tools-extra/docs/clang-tidy/checks/misc/static-initialization-cycle.rst
+31-0clang-tools-extra/clang-tidy/misc/StaticInitializationCycleCheck.h
+5-0clang-tools-extra/docs/ReleaseNotes.rst
+3-0clang-tools-extra/clang-tidy/misc/MiscTidyModule.cpp
+774-02 files not shown
+776-08 files

LLVM/project a469fe9llvm/test/Transforms/LoopVectorize skip-iterations.ll

[LV] Regenerate skip-iterations checks (NFC) (#197105)
DeltaFile
+154-8llvm/test/Transforms/LoopVectorize/skip-iterations.ll
+154-81 files

LLVM/project 58a639dclang-tools-extra/clang-tidy/hicpp HICPPTidyModule.cpp, clang-tools-extra/clang-tidy/tool check_alphabetical_order_test.py

[clang-tidy] Remove hicpp module [3/4] (#197076)

This is part three of removing the hicpp-* checks.

RFC:
https://discourse.llvm.org/t/rfc-regarding-the-current-status-of-hicpp-checks/89883

Part of https://github.com/llvm/llvm-project/issues/183462
DeltaFile
+16-11clang-tools-extra/docs/ReleaseNotes.rst
+0-24clang-tools-extra/clang-tidy/hicpp/HICPPTidyModule.cpp
+0-23clang-tools-extra/docs/clang-tidy/checks/hicpp/undelegated-constructor.rst
+10-10clang-tools-extra/test/clang-tidy/infrastructure/config-file.cpp
+10-6clang-tools-extra/clang-tidy/tool/check_alphabetical_order_test.py
+0-11clang-tools-extra/test/clang-tidy/checkers/hicpp/no-assembler-msvc.cpp
+36-8514 files not shown
+57-16720 files

LLVM/project ace5004clang/test/Headers wasm.c __clang_hip_math.hip, llvm/lib/Analysis ValueTracking.cpp

[ValueTracking] Handle sext, zext in computeConstantRange

Propagate constant ranges through sign extension, zero extension.
Extends the existing handling for truncations.
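
As a rough picture of what propagating a range through an extension means, the sketch below works on inclusive, non-wrapping ranges of unsigned bit patterns. It is a simplification under stated assumptions and does not use LLVM's `ConstantRange`; `URange`, `maskOf`, `zextRange` and `sextRange` are hypothetical names, and widths are limited to 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Inclusive range of unsigned bit patterns with lo <= hi (no wrapping).
struct URange {
  std::uint32_t lo;
  std::uint32_t hi;
};

// All-ones mask for a width of 1..32 bits.
std::uint32_t maskOf(unsigned bits) {
  return bits >= 32 ? 0xFFFFFFFFu : ((1u << bits) - 1u);
}

// Zero extension leaves every bit pattern unchanged, so the bounds carry
// over to the wider type as they are.
URange zextRange(URange r, unsigned fromBits, unsigned toBits) {
  assert(fromBits >= 1 && fromBits < toBits && toBits <= 32);
  assert(r.lo <= r.hi && r.hi <= maskOf(fromBits));
  return r;
}

// Sign extension copies the sign bit into the new high bits.
URange sextRange(URange r, unsigned fromBits, unsigned toBits) {
  assert(fromBits >= 1 && fromBits < toBits && toBits <= 32);
  assert(r.lo <= r.hi && r.hi <= maskOf(fromBits));
  const std::uint32_t signBit = 1u << (fromBits - 1);
  const std::uint32_t highBits = maskOf(toBits) & ~maskOf(fromBits);
  if (r.hi < signBit)
    return r;  // every value is non-negative: patterns are unchanged
  if (r.lo >= signBit)
    return {r.lo | highBits, r.hi | highBits};  // every value is negative
  // The range straddles the sign bit; without wrapping ranges this sketch
  // gives up and returns the full range of the wide type.
  return {0u, maskOf(toBits)};
}

// Small self-check: i8 [0x80, 0xFF] sign-extends to i16 [0xFF80, 0xFFFF].
void demoExtend() {
  URange wide = sextRange({0x80, 0xFF}, 8, 16);
  assert(wide.lo == 0xFF80 && wide.hi == 0xFFFF);
}
```

A representation that allows wrapping ranges could describe the straddling case more precisely than the conservative fallback used here.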
DeltaFile
+42-42clang/test/Headers/wasm.c
+57-0llvm/unittests/Analysis/ValueTrackingTest.cpp
+24-25clang/test/Headers/__clang_hip_math.hip
+17-4llvm/lib/Analysis/ValueTracking.cpp
+140-714 files

LLVM/project 51893b4llvm/lib/CodeGen MachineBlockPlacement.cpp

[MachineBlockPlacement] Fix use-after-erase (#197109)

`ComputedEdges.erase(FoundEdge)` invalidates `FoundEdge`, but the
function then returns `FoundEdge->second`. Read the bucket value into
a local before erasing.
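
The pattern of reading the value before erasing can be sketched with a standard container. This is an illustration only: it uses `std::unordered_map` rather than the map type in `MachineBlockPlacement.cpp`, and `takeEdge` is a hypothetical helper.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Removes `key` from `edges` and returns the value it mapped to, if any.
std::optional<std::string>
takeEdge(std::unordered_map<int, std::string> &edges, int key) {
  auto found = edges.find(key);
  if (found == edges.end())
    return std::nullopt;
  // Move the value into a local first: erase() invalidates `found`, so
  // reading found->second afterwards would be a use-after-erase.
  std::string value = std::move(found->second);
  edges.erase(found);
  return value;
}

// Small self-check: the value is returned and the entry is gone.
void demoTakeEdge() {
  std::unordered_map<int, std::string> edges = {{1, "a->b"}, {2, "b->c"}};
  std::optional<std::string> taken = takeEdge(edges, 1);
  assert(taken.has_value() && *taken == "a->b");
  assert(edges.size() == 1);
}
```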
DeltaFile
+6-5llvm/lib/CodeGen/MachineBlockPlacement.cpp
+6-51 files

LLVM/project e51bb36llvm/include/llvm/Analysis AliasAnalysis.h, llvm/lib/Analysis BasicAliasAnalysis.cpp AliasAnalysis.cpp

[AA] Respect potential synchronization effects of inline asm (#196965)

Respect potential synchronization effects of inline assembly calls on
not-yet-escaped memory.

We only do this if the call is both non-nosync and ModRefs "other"
memory. This is consistent with the atomic memory effects established in
https://github.com/llvm/llvm-project/pull/193768 and makes sure that
things like readonly/argmemonly continue to work as expected even for
frontends that do not emit nosync (which, right now, is all of them).

The limitation to inline asm should not actually exist: The issue
applies to all calls. This just fixes a particularly important case in a
targeted way. (The fact that inline asm memory barriers do not work as
expected is a problem for making optimizations of monotonic accesses
more aggressive, e.g. it caused issues for
https://github.com/llvm/llvm-project/pull/195015.)

The ability of inline asm (with a `~{memory}` clobber) to synchronize
was explicitly specified in
https://github.com/llvm/llvm-project/pull/150191.
DeltaFile
+35-0llvm/test/Analysis/BasicAA/atomics.ll
+15-1llvm/lib/Analysis/BasicAliasAnalysis.cpp
+2-4llvm/lib/Analysis/AliasAnalysis.cpp
+5-0llvm/include/llvm/Analysis/AliasAnalysis.h
+57-54 files

LLVM/project c85f29fllvm/lib/Target/PowerPC PPCISelLowering.cpp, llvm/test/CodeGen/PowerPC pr175297.ll

[PowerPC] Fix types when emitting ppc_altivec_vupklsw (#187789)

When lowering BUILD_VECTOR, we produce this intrinsic node, but fail to
adjust the input/output types to ensure ISel works.
This patch simply adds the necessary bitcasts.

Fixes: https://github.com/llvm/llvm-project/issues/175297
DeltaFile
+92-0llvm/test/CodeGen/PowerPC/pr175297.ll
+4-1llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+96-12 files

LLVM/project cc7353bllvm/test/MC/AMDGPU gfx13_asm_vop3.s gfx13_asm_vop3-fake16.s

[AMDGPU] Add VOP3 encoding for gfx13 (#196258)

Co-authored-by: Ivan Kosarev <ivan.kosarev at amd.com>
DeltaFile
+8,195-0llvm/test/MC/AMDGPU/gfx13_asm_vop3.s
+8,182-0llvm/test/MC/AMDGPU/gfx13_asm_vop3-fake16.s
+5,587-0llvm/test/MC/AMDGPU/gfx13_asm_vop3_dpp16.s
+5,574-0llvm/test/MC/AMDGPU/gfx13_asm_vop3_dpp16-fake16.s
+4,106-0llvm/test/MC/AMDGPU/gfx13_asm_vop3_from_vop1-fake16.s
+3,524-0llvm/test/MC/AMDGPU/gfx13_asm_vop3_dpp8.s
+35,168-010 files not shown
+39,596-29816 files

LLVM/project 7fddf99clang/lib/AST/ByteCode Compiler.cpp, clang/test/AST/ByteCode fixed-point.cpp

[clang][bytecode] Pass correct QualType to getFixedPointSemantics() (#196952)

The expression type might be different, so pass the QualType we have at
hand.
DeltaFile
+5-0clang/test/AST/ByteCode/fixed-point.cpp
+1-1clang/lib/AST/ByteCode/Compiler.cpp
+6-12 files

LLVM/project 4ef1ef5llvm/test/TableGen aarch64-apple-tuning-features.td

[AArch64] Add a regression test for Apple tuning features (NFC) (#196792)

This patch adds a TableGen regression test that directly checks complete
feature lists per generation for Apple CPUs, to guard against changes
that can break the <CPU,features> association if we lack indirect
coverage.

A follow-up patch should introduce generational delta encoding for Apple
tuning features that this test should help verify.
DeltaFile
+28-0llvm/test/TableGen/aarch64-apple-tuning-features.td
+28-01 files