LLVM/project 2e54873 llvm/utils/TableGen GlobalISelEmitter.cpp, llvm/utils/TableGen/Common/GlobalISel GlobalISelMatchTable.cpp

Comments
Delta  File
+1 -4  llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp
+1 -3  llvm/utils/TableGen/GlobalISelEmitter.cpp
+2 -7  2 files

LLVM/project 691ad51 clang/docs LanguageExtensions.rst, clang/include/clang/Options Options.td

Enable driver changes for fexec-charset
Delta  File
+14 -6  clang/lib/Driver/ToolChains/Clang.cpp
+14 -4  clang/include/clang/Options/Options.td
+11 -3  clang/test/Driver/clang_f_opts.c
+10 -0  llvm/lib/Support/TextEncoding.cpp
+4 -3  clang/test/Driver/cl-options.c
+3 -3  clang/docs/LanguageExtensions.rst
+56 -19  3 files not shown
+60 -19  9 files

LLVM/project a95fbc9 clang/lib/AST PrintfFormatString.cpp FormatString.cpp, clang/lib/Sema SemaChecking.cpp

Add format string handling
Delta  File
+58 -31  clang/lib/AST/PrintfFormatString.cpp
+46 -40  clang/lib/AST/FormatString.cpp
+33 -21  clang/lib/Sema/SemaChecking.cpp
+25 -11  clang/lib/AST/FormatStringParsing.h
+15 -8  clang/lib/AST/ScanfFormatString.cpp
+19 -0  llvm/lib/Support/TextEncoding.cpp
+196 -111  10 files not shown
+255 -121  16 files

LLVM/project 0943a74 clang/include/clang/Basic TargetInfo.h, clang/lib/AST ASTContext.cpp

convert to exec-charset inside getPredefinedStringLiteralFromCache, test __builtin_FILE()
Delta  File
+28 -0  clang/test/CodeGen/systemz-charset.cpp
+10 -0  clang/lib/AST/ASTContext.cpp
+5 -4  clang/lib/Lex/TextEncodingConfig.cpp
+3 -0  clang/lib/Basic/TargetInfo.cpp
+2 -0  clang/include/clang/Basic/TargetInfo.h
+48 -4  5 files

LLVM/project 3b9a1b8 clang/test/CodeGenHIP amdgpu-barrier-type.hip, llvm/lib/Target/AMDGPU AMDGPU.h

Address comments
Delta  File
+25 -9  clang/test/CodeGenHIP/amdgpu-barrier-type.hip
+16 -0  llvm/test/CodeGen/AMDGPU/barrier-addrspace-dereference.ll
+2 -2  llvm/lib/Target/AMDGPU/AMDGPU.h
+2 -2  llvm/test/CodeGen/AMDGPU/s-barrier-lowering.ll
+0 -3  llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-named-barrier.ll
+45 -16  5 files

LLVM/project 78b9857 clang/test/CodeGen systemz-charset.c

fix CI
Delta  File
+2 -0  clang/test/CodeGen/systemz-charset.c
+2 -0  1 file

LLVM/project 14bf8e7 libclc CMakeLists.txt, libclc/clc/lib/spirv/vulkan/math clc_sw_fma.cl

[libclc] Base the build around `add_sources` instead of source list (#197034)

Summary:
The current build uses a curated + deduplicated source list. This PR
seeks to simplify this a little bit and canonicalize the behavior.

Now we create the targets, `clc` and `opencl`, up front. We add the
directories which add sources to this target. We normalize the
architecture to the variants. We always add target specific versions
first. When we add sources we check if the file already exists and defer
to the architecture specific one.

This normalizes the behavior: the directories are now laid out as
`clc/<arch>/<os>`. We normalize the architectures to `amdgpu`, `nvptx`, and
`spirv`. We use the OS for the newly created vulkan target. We now
control variants via checking if the directory for that exists, so it's
nested more naturally.

Hopefully this makes more sense; the goal is to exercise the fact that
we have individual builds now. Previously this did not work because you
could not add_subdirectory more than once.
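
The precedence rule described above (target-specific directories are added first, and a later directory defers to a file that already exists) can be modeled in a few lines. This is only a Python sketch of the idea, not the CMake in `AddLibclc.cmake`; the function names, the `generic` directory, and the file lists are assumptions made for illustration.

```python
from pathlib import PurePosixPath


def add_sources(target, directory, files):
    """Add files from `directory` to `target` (hypothetical helper).

    `target` maps a relative source name to the path that provides it.
    A name that is already present is skipped, so whichever directory
    was added first wins.
    """
    for name in files:
        if name in target:
            continue  # defer to the more specific version added earlier
        target[name] = str(PurePosixPath(directory) / name)


def collect(layers):
    """Build one target from (directory, files) layers, most specific first."""
    target = {}
    for directory, files in layers:
        add_sources(target, directory, files)
    return target


# Hypothetical layout: the vulkan-specific directory is listed before a
# generic fallback that provides the same file name.
layers = [
    ("clc/lib/spirv/vulkan", ["math/clc_sw_fma.cl"]),
    ("clc/lib/generic", ["math/clc_sw_fma.cl", "math/clc_sin.cl"]),
]
sources = collect(layers)
```

With these inputs, `math/clc_sw_fma.cl` resolves to the vulkan copy, while `math/clc_sin.cl` falls back to the generic one.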
Delta  File
+0 -275  libclc/clc/lib/vulkan/math/clc_sw_fma.cl
+275 -0  libclc/clc/lib/spirv/vulkan/math/clc_sw_fma.cl
+0 -151  libclc/opencl/lib/vulkan/shared/vstore_half.cl
+151 -0  libclc/opencl/lib/spirv/vulkan/shared/vstore_half.cl
+68 -74  libclc/cmake/modules/AddLibclc.cmake
+52 -89  libclc/CMakeLists.txt
+546 -589  60 files not shown
+1,369 -1,399  66 files

LLVM/project 8690190 llvm/lib/Target/AMDGPU VOP3Instructions.td, llvm/test/CodeGen/AMDGPU v_ashr_pk.ll

[AMDGPU][True16] true16 impl for v_ashr_pk_u/i8_i32
Delta  File
+118 -29  llvm/test/CodeGen/AMDGPU/v_ashr_pk.ll
+24 -24  llvm/test/MC/AMDGPU/gfx1250_asm_vop3.s
+24 -24  llvm/test/MC/AMDGPU/gfx13_asm_vop3.s
+24 -16  llvm/test/MC/AMDGPU/gfx13_asm_vop3_dpp16.s
+24 -12  llvm/test/MC/Disassembler/AMDGPU/gfx1250_dasm_vop3.txt
+24 -9  llvm/lib/Target/AMDGPU/VOP3Instructions.td
+238 -114  4 files not shown
+276 -134  10 files

LLVM/project f69b1e9 llvm/lib/Transforms/InstCombine InstCombineSelect.cpp

[InstCombine] Fix one operator precedence
Delta  File
+4 -4  llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+4 -4  1 file

LLVM/project 84812fd llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

[NFC][DAG] scalarizeExtractedBinOp - pull out constant build vector detection into isExtractFree helper (#197155)

Prep work for #196493
Delta  File
+6 -5  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+6 -5  1 file

LLVM/project 783cb86 llvm/lib/Transforms/InstCombine InstCombineAndOrXor.cpp

[NFC][InstCombine] fix duplicate CreateNot in ((A^C)^B) & (B^A) fold
Delta  File
+1 -1  llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+1 -1  1 file

LLVM/project e8f4a57 llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

save one knownbits computation
Delta  File
+7 -4  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+7 -4  1 file

LLVM/project ab87cb0 llvm/test/CodeGen/X86 vector-reduce-add-mask.ll vector-reduce-add-sext.ll

[X86] vector-reduce-add-*.ll - add 32-bit test coverage (#197152)
Delta  File
+1,646 -808  llvm/test/CodeGen/X86/vector-reduce-add-mask.ll
+1,463 -870  llvm/test/CodeGen/X86/vector-reduce-add-sext.ll
+1,188 -489  llvm/test/CodeGen/X86/vector-reduce-add.ll
+441 -194  llvm/test/CodeGen/X86/vector-reduce-add-zext.ll
+4,738 -2,361  4 files

LLVM/project 06615e0 llvm/test/CodeGen/X86 vector-reduce-ctpop.ll

[X86] vector-reduce-ctpop.ll - add 32-bit test coverage (#197149)
Delta  File
+4,686 -918  llvm/test/CodeGen/X86/vector-reduce-ctpop.ll
+4,686 -918  1 file

LLVM/project a464577 llvm/lib/IR Instructions.cpp

[IR] Preserve samesign when cloning ICmpInst (#197118)

Clone should preserve IR flags faithfully.
Delta  File
+3 -1  llvm/lib/IR/Instructions.cpp
+3 -1  1 file

LLVM/project 41e493a llvm/lib/CodeGen InlineSpiller.cpp, llvm/test/CodeGen/AMDGPU remat-through-copy.mir

[RegAlloc] Trace through COPYs to find rematerializable definitions (#190955)

After live range splitting, successful rematerialization in one split
interval can remove the original defining instruction, leaving only COPY
instructions in other split intervals. When attempting to rematerialize
uses in those intervals, the code fails to find the original definition
and gives up.

This patch traces backwards through COPY chains to recover the original
rematerializable definition instead of giving up.
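
The backward walk through a COPY chain can be pictured with a toy model. The sketch below is Python rather than the C++ in `InlineSpiller.cpp`, and everything in it (the `defs` map, the register names, the `is_remat` predicate, the opcode string) is a hypothetical stand-in for the real live-interval data structures.

```python
def find_remat_def(reg, defs, is_remat):
    """Follow COPY sources backwards from `reg` (toy model).

    `defs` maps a virtual register name to (opcode, source_register).
    Returns (register, opcode) of the first non-COPY definition if it is
    rematerializable, otherwise None.
    """
    seen = set()
    while reg in defs and reg not in seen:
        seen.add(reg)  # guard against a malformed cyclic chain
        opcode, src = defs[reg]
        if opcode != "COPY":
            return (reg, opcode) if is_remat(opcode) else None
        reg = src  # keep tracing through the copy
    return None


# Hypothetical chain: %3 = COPY %2, %2 = COPY %1, %1 = S_MOV_B32 ...
defs = {
    "%3": ("COPY", "%2"),
    "%2": ("COPY", "%1"),
    "%1": ("S_MOV_B32", None),
}
found = find_remat_def("%3", defs, lambda op: op == "S_MOV_B32")
```

Without the tracing step, a lookup that stops at `%3` sees only a COPY and has nothing to rematerialize.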
Delta  File
+82 -0  llvm/test/CodeGen/AMDGPU/remat-through-copy.mir
+39 -4  llvm/lib/CodeGen/InlineSpiller.cpp
+8 -30  llvm/test/CodeGen/RISCV/rvv/remat.ll
+129 -34  3 files

LLVM/project de0a051 mlir/docs Tokens.md LangRef.md, mlir/include/mlir/IR CommonTypeConstraints.td

address comments
Delta  File
+17 -21  mlir/docs/Tokens.md
+2 -17  mlir/test/IR/token-type.mlir
+1 -8  mlir/include/mlir/IR/CommonTypeConstraints.td
+0 -7  mlir/test/lib/Dialect/Test/TestOps.td
+2 -1  mlir/docs/LangRef.md
+22 -54  5 files

LLVM/project c2d51a2 llvm/lib/Transforms/Vectorize VPlanAnalysis.cpp VPlan.h

[VPlan] Add Type* and getType() to VPSymbolicValue (NFC) (#195183)

Add a Type* field to VPSymbolicValue, along with a getType() method to
query the stored scalar type.

This makes it easier to retrieve the type of various symbolic values,
and also simplifies VPTypeAnalysis construction.

PR: https://github.com/llvm/llvm-project/pull/195183
Delta  File
+3 -23  llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
+16 -8  llvm/lib/Transforms/Vectorize/VPlan.h
+7 -4  llvm/lib/Transforms/Vectorize/VPlan.cpp
+7 -1  llvm/lib/Transforms/Vectorize/VPlanValue.h
+2 -6  llvm/lib/Transforms/Vectorize/VPlanAnalysis.h
+4 -3  llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+39 -45  5 files not shown
+45 -50  11 files

LLVM/project 0b53aa8 llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

save some knownbits computations

Co-authored-by: Copilot <copilot at github.com>
Delta  File
+9 -5  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+9 -5  1 file

LLVM/project 686ed50 llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 atomic-load-store.ll

[X86] Manage atomic store of fp -> int promotion in DAG

When lowering atomic <1 x T> vector types with floats, selection can fail since
this pattern is unsupported. To support this, floats can be cast to
an integer type of the same size.
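
For readers unfamiliar with the idea, "cast to an integer type of the same size" here means reinterpreting the bits, not converting the numeric value. A minimal Python sketch of that reinterpretation for 32-bit floats follows; the helper names are hypothetical, and the code models only the bit-level cast, not the X86 DAG lowering itself.

```python
import struct


def f32_bits_to_i32(value: float) -> int:
    """Reinterpret the 32 bits of an IEEE-754 float as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def i32_bits_to_f32(bits: int) -> float:
    """Reinterpret a 32-bit unsigned integer as an IEEE-754 float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Round trip: the integer carries exactly the same bits as the float,
# so storing the integer and loading it back as a float loses nothing.
bits = f32_bits_to_i32(1.0)
restored = i32_bits_to_f32(bits)
```

Because the cast preserves every bit, an atomic store of the integer is observably the same store as one of the original float.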
Delta  File
+134 -0  llvm/test/CodeGen/X86/atomic-load-store.ll
+4 -0  llvm/lib/Target/X86/X86ISelLowering.cpp
+138 -0  2 files

LLVM/project 9bb1933 clang/lib/CIR/CodeGen CIRGenBuiltinAMDGPU.cpp, clang/test/CIR/CodeGenHIP builtins-amdgcn.hip

[CIR][AMDGPU] Add lowering for amdgcn readlane readfirstlane builtins
Delta  File
+16 -0  clang/test/CIR/CodeGenHIP/builtins-amdgcn.hip
+13 -2  clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp
+29 -2  2 files

LLVM/project 2b96f9c llvm/lib/Target/AMDGPU SIISelLowering.cpp AMDGPULowerExecSync.cpp, llvm/test/CodeGen/AMDGPU addrspacecast-barrier.ll s-barrier.ll

[RFC][AMDGPU] Add BARRIER address space

Add a new BARRIER address space for global variables that represent the barrier IDs in GFX12.5.

These barrier addresses just have values corresponding 1-1 to barrier IDs. They are still implemented on top of LDS, but the offsetting happens during an addrspacecast to generic, not whenever the barrier GV is used.

The motivation for this is to make the relation between LDS and barrier GVs explicit in the compiler. It does add a bit more complexity, but that complexity was already there, just hidden by pretending barrier GVs were actual LDS.
Delta  File
+442 -0  llvm/test/CodeGen/AMDGPU/addrspacecast-barrier.ll
+62 -45  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+34 -54  llvm/lib/Target/AMDGPU/AMDGPULowerExecSync.cpp
+54 -31  llvm/test/CodeGen/AMDGPU/s-barrier.ll
+36 -31  llvm/test/CodeGen/AMDGPU/s-barrier-lowering.ll
+52 -14  llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+680 -175  38 files not shown
+1,091 -473  44 files

LLVM/project 215bd25 clang/lib/StaticAnalyzer/Core ExprEngine.cpp

[analyzer] Clean up evalBind, fix bad logic (#196313)

This commit refactors `ExprEngine::evalBind` to eliminate the use of a
`NodeBuilder` and fix incorrect logic that was apparently introduced
because the `NodeBuilder` had obfuscated the underlying set operations.

In the special case when the engine is binding to an `Unknown` or
`Undefined` memory location, with the old code on each execution path
_either_ only the `check::Bind` checkers _or_ only the pointer escape
checkers were invoked. This commit ensures that on each execution path
_both_ the `check::Bind` checkers _and then_ the pointer escape checkers
get a chance to activate.

I'm pretty sure that the bad logic did not cause incorrect behavior of
the analyzer, because there are no `checkBind` checkers that generate
non-sink transitions when the location is `Unknown` or `Undefined`.

I also added an assertion that the location argument of `evalBind`
cannot be a `NonLoc`, because this is a common sense precondition, seems
to be actually true and makes it easier to reason about the behavior of
this function.
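
The "both, in sequence" behavior for the `Unknown`/`Undefined` location case can be illustrated with a toy pipeline. This Python sketch is not the analyzer's code: execution paths are modeled as plain tuples, a checker is any function that maps one path to a list of successor paths, and all names are hypothetical.

```python
def run_stage(paths, checkers):
    """Apply each checker in order to every path (toy model).

    A checker returns a list of successors: one element to continue,
    several to split the path, or an empty list to act as a sink.
    """
    for checker in checkers:
        paths = [succ for path in paths for succ in checker(path)]
    return paths


def eval_bind_unknown_location(pred, bind_checkers, escape_checkers):
    """Run the bind checkers first, then the escape checkers on every survivor."""
    after_bind = run_stage([pred], bind_checkers)
    return run_stage(after_bind, escape_checkers)


# Hypothetical checkers that just record that they ran.
def bind_checker(path):
    return [path + ("bind",)]


def escape_checker(path):
    return [path + ("escape",)]


result = eval_bind_unknown_location((), [bind_checker], [escape_checker])
```

In this model every surviving path has passed through both stages, which is the property the commit message describes; a bind checker that sinks the path simply leaves nothing for the escape stage.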
Delta  File
+19 -33  clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+19 -33  1 file

LLVM/project 21024a1 llvm/test/CodeGen/AArch64/GlobalISel select-intrinsic-aarch64-sdiv.mir preselect-process-phis.mir

Fix tests
Delta  File
+4 -22  llvm/test/CodeGen/AArch64/GlobalISel/select-intrinsic-aarch64-sdiv.mir
+2 -4  llvm/test/CodeGen/AArch64/GlobalISel/preselect-process-phis.mir
+6 -26  2 files

LLVM/project 1ba6ef2 llvm/utils/TableGen/Common/GlobalISel GlobalISelMatchTable.h

[GlobalISel][MatchTable] Fix RTTI of Imm/ImmPredicate classes
Delta  File
+7 -5  llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.h
+7 -5  1 file

LLVM/project f2c5901 clang/lib/CodeGen TargetInfo.h CodeGenModule.cpp, clang/lib/CodeGen/Targets AMDGPU.cpp SPIR.cpp

[NFCI][clang] Allow overriding any global variable address space

Allow the target to change the AS of a global variable at will, not just whenever Clang cannot assign one.
This enables the next patch that will specialize LDS GVs for barriers as a separate address space.
Delta  File
+10 -9  clang/lib/CodeGen/Targets/AMDGPU.cpp
+9 -7  clang/lib/CodeGen/TargetInfo.h
+7 -8  clang/lib/CodeGen/Targets/SPIR.cpp
+11 -2  clang/lib/CodeGen/CodeGenModule.cpp
+5 -6  clang/lib/CodeGen/TargetInfo.cpp
+6 -3  clang/lib/CodeGen/Targets/AVR.cpp
+48 -35  6 files

LLVM/project 2018000 llvm/lib/Target/AMDGPU AMDGPUMemoryUtils.cpp AMDGPUMemoryUtils.h

[NFC][AMDGPU] Generalize some LDS MemoryUtils

In preparation for upcoming work, I need some functions used by the LDS lowering
system to work on any GV. I removed the LDS specific queries inside these functions
and replaced them with functors passed by the caller, so these utility functions can be reused.

I also cleaned up a few things that weren't up to code, such as lowercase variable names.
Delta  File
+30 -36  llvm/lib/Target/AMDGPU/AMDGPUMemoryUtils.cpp
+37 -9  llvm/lib/Target/AMDGPU/AMDGPUMemoryUtils.h
+20 -17  llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
+24 -10  llvm/lib/Target/AMDGPU/AMDGPULowerExecSync.cpp
+7 -6  llvm/lib/Target/AMDGPU/AMDGPUSwLowerLDS.cpp
+118 -78  5 files

LLVM/project 1396099 llvm/test/CodeGen/AArch64/GlobalISel select-intrinsic-aarch64-sdiv.mir, llvm/test/TableGen/GlobalISelEmitter MatchTableOptimizerRecursion.td

[GlobalISel] Recursively Optimise MatchTable Matchers

The core of this change is the additional call to `Matcher::optimize()` in the `optimizeRules` function,
which enables the match table optimization logic to recurse on the children of every GroupMatcher, forming
additional groups (which hoist more common predicates into a shared group).

To enable that, I had to update the `getFirstConditionAsRootType` implementation to support `GroupMatcher`.
I also included a small refactoring of the match table optimization pipeline that was identical between the
GlobalISel and GlobalISelCombiner emitters.

The results of this change are up to a 25% size reduction for GlobalISel match tables.
There is a tiny increase (a few bytes) in a combiner table because we now create new groups
(which need up to 3 additional opcodes because of the new `Try` and `Reject` required) to hoist one predicate for only 2 rules, which
results in a small net negative change (one or two more ops).
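
The effect of recursing into each group can be sketched with a small model. The Python below is an illustration of the general idea only (group consecutive rules that share a leading predicate, strip it, then repeat on the group's children); it is not the `Matcher::optimize()` implementation, and the rule representation and predicate names are hypothetical.

```python
from itertools import groupby


def optimize(rules):
    """Recursively hoist shared leading predicates into groups (toy model).

    `rules` is a list of (predicates, action) pairs, where `predicates`
    is a tuple of strings. The result is a list of nodes, each either
    ("rule", remaining_predicates, action) or ("group", predicate, children).
    """
    out = []
    first_pred = lambda rule: rule[0][0] if rule[0] else None
    for pred, run in groupby(rules, key=first_pred):
        run = list(run)
        if pred is None or len(run) < 2:
            # Nothing to share: emit the rules unchanged.
            out.extend(("rule", preds, action) for preds, action in run)
        else:
            # Hoist the shared predicate, then recurse on what is left.
            stripped = [(preds[1:], action) for preds, action in run]
            out.append(("group", pred, optimize(stripped)))
    return out


rules = [
    (("isG_ADD", "isS32", "immA"), "r1"),
    (("isG_ADD", "isS32", "immB"), "r2"),
    (("isG_ADD", "isS64"), "r3"),
    (("isG_SUB",), "r4"),
]
tree = optimize(rules)
```

A single, non-recursive pass would stop after hoisting `isG_ADD`; the recursive call is what also factors out `isS32` for the first two rules. As the message notes, a group has its own overhead, so hoisting one predicate for only two rules is not always a win.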

I used a small bash script to compare all relevant files; this is the before/after:
```
FILE                                          OLD      NEW    DIFF%    SAME?
----                                      -------  -------    -----    -----

    [8 lines not shown]
Delta  File
+204 -0  llvm/test/TableGen/GlobalISelEmitter/MatchTableOptimizerRecursion.td
+67 -19  llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp
+5 -34  llvm/utils/TableGen/GlobalISelEmitter.cpp
+1 -34  llvm/utils/TableGen/GlobalISelCombinerEmitter.cpp
+12 -7  llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.h
+18 -0  llvm/test/CodeGen/AArch64/GlobalISel/select-intrinsic-aarch64-sdiv.mir
+307 -94  1 file not shown
+310 -94  7 files

LLVM/project 26cae62 lldb/source/Host/common NativeProcessProtocol.cpp, lldb/source/Plugins/Process/FreeBSD NativeProcessFreeBSD.cpp

Reapply "[lldb] Do not refcount breakpoints in lldb-server" (#195858) (#196891)

This reapplies #195858 with a fix for 32-bit arm (and generally, any
architecture that uses software single-stepping). The problem was that
the temporary breakpoints used for single-stepping were interfering with
the breakpoints set by the client.

The fix is to check for existing breakpoints before setting the
temporary ones. To achieve this, I've separated the notion of "next PC
candidates for a thread" from "step breakpoints we've actually set".
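
The separation between "next PC candidates" and "step breakpoints actually set" can be shown with a small model. This Python sketch is not the lldb-server code; the set-based breakpoint table, the function names, and the addresses are assumptions for illustration.

```python
def set_step_breakpoints(existing, candidates):
    """Install temporary breakpoints only where none exists yet (toy model).

    `existing` is the set of addresses that already have a breakpoint.
    Returns the list of addresses this call actually installed, which is
    all that cleanup is allowed to remove.
    """
    installed = []
    for addr in candidates:
        if addr in existing:
            continue  # a client breakpoint already traps here; leave it alone
        existing.add(addr)
        installed.append(addr)
    return installed


def clear_step_breakpoints(existing, installed):
    """Remove only the temporary breakpoints recorded by the call above."""
    for addr in installed:
        existing.discard(addr)


# Hypothetical state: the client has a breakpoint at 0x1000, and the
# single-step logic computed two possible next PCs.
breakpoints = {0x1000}
installed = set_step_breakpoints(breakpoints, [0x1000, 0x1004])
clear_step_breakpoints(breakpoints, installed)
```

Because cleanup works from the `installed` list rather than from the candidate list, the client's breakpoint at `0x1000` survives the step.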

The FreeBSD code had some software single-stepping code, but:
- this was [introduced](https://reviews.llvm.org/D95802) for mips64
support, which was
[removed](https://github.com/llvm/llvm-project/pull/179582) earlier this
year
- AFAICT, this never worked since the original patch only checked
`m_threads_stepping_with_breakpoint`, but never set it to anything.


    [18 lines not shown]
Delta  File
+19 -21  lldb/source/Plugins/Process/Utility/NativeProcessSoftwareSingleStep.cpp
+1 -16  lldb/source/Plugins/Process/FreeBSD/NativeProcessFreeBSD.cpp
+4 -10  lldb/source/Host/common/NativeProcessProtocol.cpp
+11 -2  lldb/test/API/functionalities/multi-breakpoint/TestMultiBreakpoint.py
+6 -7  lldb/source/Plugins/Process/Linux/NativeProcessLinux.cpp
+5 -2  lldb/source/Plugins/Process/Utility/NativeProcessSoftwareSingleStep.h
+46 -58  2 files not shown
+51 -61  8 files

LLVM/project b1fe10d llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp

1
Delta  File
+0 -1  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+0 -1  1 file