LLVM/project 1a34007llvm/lib/Transforms/Vectorize VPlan.h

[VPlan] Inline WidenSelect::isInvariantCond (NFC) (#166742)

VPWidenSelectRecipe::isInvariantCond has a sole use: inline it in the
use-site, as it is not meant to be used standalone.
DeltaFile
+1-5llvm/lib/Transforms/Vectorize/VPlan.h
+1-51 files

LLVM/project 54c9dddlibcxxabi/src/demangle Utility.h ItaniumDemangle.h, lldb/source/Core DemangledNameInfo.cpp

[libcxxabi][ItaniumDemangle] Separate GtIsGt counter into more states (#166578)

Currently `OutputBuffer::GtIsGt` is used to tell us if we're inside
template arguments and have printed a '(' without a closing ')'. If so,
we don't need to quote '<' when printing it as part of a binary
expression inside a template argument. Otherwise we need to. E.g.,
```
foo<a<(b < c)>> // Quotes around binary expression needed.
```

LLDB's `TrackingOutputBuffer` has heuristics that rely on checking
whether we are inside template arguments, regardless of the current
parentheses depth. We've been using `isGtInsideTemplateArgs` for this,
but that isn't correct, resulting in us incorrectly tracking the
basename of functions like:
```
void func<(foo::Enum)1>()
```
Here `GtIsGt > 0` despite us being inside template arguments (because we

    [12 lines not shown]
DeltaFile
+20-6libcxxabi/src/demangle/Utility.h
+20-6llvm/include/llvm/Demangle/Utility.h
+7-5libcxxabi/src/demangle/ItaniumDemangle.h
+7-5llvm/include/llvm/Demangle/ItaniumDemangle.h
+10-0lldb/unittests/Core/MangledTest.cpp
+2-2lldb/source/Core/DemangledNameInfo.cpp
+66-246 files

LLVM/project 7bbed06clang-tools-extra/clang-doc/assets class-template.mustache namespace-template.mustache, clang-tools-extra/test/clang-doc mustache-separate-namespace.cpp mustache-index.cpp

[clang-doc] remove indentation for preformatted text

Text that is in between <pre> tags is formatted verbatim. Thus, the
text that was correctly indented in relation to its depth in HTML was
being indented incorrectly when rendered. That resulted in bad looking pages.
DeltaFile
+2-6clang-tools-extra/clang-doc/assets/class-template.mustache
+1-3clang-tools-extra/test/clang-doc/mustache-separate-namespace.cpp
+1-3clang-tools-extra/test/clang-doc/mustache-index.cpp
+1-3clang-tools-extra/clang-doc/assets/namespace-template.mustache
+5-154 files

LLVM/project fc5e0c0libcxx/include/__algorithm for_each_n.h fill_n.h, libcxx/include/__iterator distance.h segmented_iterator.h

[libc++] Simplify most of the segmented iterator optimizations (#164797)

This patch does two things.
(1) It replaces SFINAE with `if constexpr`, avoiding some overload
resolution and unnecessary boilerplate.
(2) It removes an overload from `__for_each_n` to forward to
`__for_each`, since `__for_each` doesn't provide any further
optimizations.
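
As a rough illustration of point (1), the sketch below selects a fast path with `if constexpr` inside one function body instead of two SFINAE-constrained overloads. `count_steps` and `count_steps_example` are hypothetical names invented for this example, not libc++ internals, and C++17 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical helper (not libc++'s implementation): a single function
// template whose fast path is chosen at compile time by `if constexpr`.
template <class Iter>
std::ptrdiff_t count_steps(Iter first, Iter last) {
  using Category = typename std::iterator_traits<Iter>::iterator_category;
  if constexpr (std::is_base_of_v<std::random_access_iterator_tag, Category>) {
    // Random-access iterators: constant-time subtraction.
    return last - first;
  } else {
    // Any other iterator category: walk the range one step at a time.
    std::ptrdiff_t n = 0;
    for (; first != last; ++first)
      ++n;
    return n;
  }
}

// Small usage check that exercises both branches.
inline void count_steps_example() {
  std::vector<int> v{1, 2, 3, 4};
  std::list<int> l{1, 2, 3};
  assert(count_steps(v.begin(), v.end()) == 4);
  assert(count_steps(l.begin(), l.end()) == 3);
}
```

Only the branch that matches the iterator category is instantiated, so no second overload or `enable_if` condition takes part in overload resolution.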
DeltaFile
+20-47libcxx/include/__algorithm/for_each_n.h
+11-30libcxx/include/__algorithm/fill_n.h
+15-20libcxx/include/__iterator/distance.h
+12-18libcxx/include/__algorithm/for_each.h
+10-12libcxx/include/__algorithm/fill.h
+0-5libcxx/include/__iterator/segmented_iterator.h
+68-1326 files

LLVM/project 9cfef46clang-tools-extra/clang-doc/assets class-template.mustache namespace-template.mustache

[clang-doc] remove indentation for preformatted text

Text that is in between <pre> tags is formatted verbatim. Thus, the
text that was correctly indented in relation to its depth in HTML was
being indented incorrectly when rendered. That resulted in bad looking pages.
DeltaFile
+2-4clang-tools-extra/clang-doc/assets/class-template.mustache
+1-3clang-tools-extra/clang-doc/assets/namespace-template.mustache
+3-72 files

LLVM/project c8adbd7orc-rt/include/orc-rt Endian.h, orc-rt/unittests EndianTest.cpp CMakeLists.txt

[orc-rt] Add endian_read/write operations. (#166892)

The endian_read and endian_write operations read and write unsigned
integers stored in a given endianness.
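
A minimal sketch of what such operations could look like follows. The names `ByteOrder`, `read_unsigned`, and `write_unsigned` are assumptions made for illustration; the actual names and signatures are the ones in `orc-rt/include/orc-rt/Endian.h`, which this digest does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical names throughout; not the orc-rt API.
enum class ByteOrder { Little, Big };

// Assemble an unsigned integer of type T from sizeof(T) bytes at Src.
template <class T>
T read_unsigned(const unsigned char *Src, ByteOrder Order) {
  static_assert(std::is_unsigned_v<T>, "unsigned integer types only");
  std::uint64_t Value = 0;
  for (std::size_t I = 0; I < sizeof(T); ++I) {
    // Little-endian: byte 0 is least significant.
    // Big-endian: byte 0 is most significant.
    std::size_t Shift = Order == ByteOrder::Little ? I : sizeof(T) - 1 - I;
    Value |= static_cast<std::uint64_t>(Src[I]) << (8 * Shift);
  }
  return static_cast<T>(Value);
}

// Store Value into sizeof(T) bytes at Dst in the requested byte order.
template <class T>
void write_unsigned(unsigned char *Dst, T Value, ByteOrder Order) {
  static_assert(std::is_unsigned_v<T>, "unsigned integer types only");
  for (std::size_t I = 0; I < sizeof(T); ++I) {
    std::size_t Shift = Order == ByteOrder::Little ? I : sizeof(T) - 1 - I;
    Dst[I] = static_cast<unsigned char>(
        (static_cast<std::uint64_t>(Value) >> (8 * Shift)) & 0xFF);
  }
}

// Round trip: whatever is written in one order reads back in that order.
inline void endian_round_trip_example() {
  unsigned char Buf[4];
  write_unsigned<std::uint32_t>(Buf, 0xDEADBEEFu, ByteOrder::Big);
  assert(Buf[0] == 0xDE && Buf[3] == 0xEF);
  assert(read_unsigned<std::uint32_t>(Buf, ByteOrder::Big) == 0xDEADBEEFu);
}
```

Working byte by byte keeps the sketch independent of the host's own endianness and of alignment.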
DeltaFile
+100-0orc-rt/unittests/EndianTest.cpp
+44-0orc-rt/include/orc-rt/Endian.h
+1-0orc-rt/unittests/CMakeLists.txt
+145-03 files

LLVM/project 3aa7a24lldb/test/Shell/Unwind/Inputs call-asm.c

[NFCI][lldb][test] Avoid unnecessary GNU extension for assembly call (#166769)

`asm()` on function declarations is used for specifying the mangling. But it's a GNU extension and
very much unnecessary here since the name matches.

Fixes compilation when the compiler defaults to a non-extensions mode.
DeltaFile
+1-2lldb/test/Shell/Unwind/Inputs/call-asm.c
+1-21 files

LLVM/project 6145b9dllvm/lib/Target/RISCV RISCVInstrInfo.cpp, llvm/test/CodeGen/RISCV machine-outliner-cfi.mir

[RISCV] Support outlining of CFI instructions in the machine outliner (#166149)

Add support for outlining CFI instructions if

  a) the outlined function is being tail called
  b) all of the CFI instructions in the function are being outlined

This is similar to what is being done on AArch64 and X86.

---------

Co-authored-by: Craig Topper <craig.topper at sifive.com>
DeltaFile
+183-26llvm/test/CodeGen/RISCV/machine-outliner-cfi.mir
+31-21llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+214-472 files

LLVM/project 856ef96llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV zicond-opts.ll xaluo.ll

[RISCV] Optimize (and (icmp x, 0, neq), (icmp y, 0, neq)) utilizing zicond extension

PR  #166469

```
    %1 = icmp x, 0, neq
    %2 = icmp y, 0, neq
    %3 = and %1, %2
```
Originally lowered to:
```
    %1 = snez x
    %2 = snez y
    %3 = and %1, %2
```
With optimization:
```
    %1 = snez x
    %3 = czero.eqz %1, y
```
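
To see why the two sequences agree, the instructions can be modelled in plain C++. This is only a model of the instruction semantics (`snez` yields 1 for a non-zero input, and Zicond's `czero.eqz rd, rs1, rs2` yields 0 when `rs2` is zero and `rs1` otherwise); the function names are invented for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// snez rd, rs: rd = (rs != 0) ? 1 : 0
constexpr std::uint64_t snez(std::uint64_t rs) { return rs != 0 ? 1 : 0; }

// czero.eqz rd, rs1, rs2: rd = (rs2 == 0) ? 0 : rs1
constexpr std::uint64_t czero_eqz(std::uint64_t rs1, std::uint64_t rs2) {
  return rs2 == 0 ? 0 : rs1;
}

// Sequence before the change: snez, snez, and.
constexpr std::uint64_t and_of_snez(std::uint64_t x, std::uint64_t y) {
  return snez(x) & snez(y);
}

// Sequence after the change: snez, czero.eqz.
constexpr std::uint64_t snez_then_czero(std::uint64_t x, std::uint64_t y) {
  return czero_eqz(snez(x), y);
}

// The four zero/non-zero combinations, checked at compile time.
static_assert(and_of_snez(0, 0) == snez_then_czero(0, 0), "0, 0");
static_assert(and_of_snez(0, 9) == snez_then_czero(0, 9), "0, non-zero");
static_assert(and_of_snez(9, 0) == snez_then_czero(9, 0), "non-zero, 0");
static_assert(and_of_snez(9, 4) == snez_then_czero(9, 4), "non-zero, non-zero");

// Runtime spot check over a few sample values.
inline bool sequences_agree() {
  const std::uint64_t Samples[] = {0, 1, 2, 0x80000000ull, ~0ull};
  for (std::uint64_t X : Samples)
    for (std::uint64_t Y : Samples)
      if (and_of_snez(X, Y) != snez_then_czero(X, Y))
        return false;
  return true;
}
```

Since `snez x` is already 0 or 1, zeroing it when `y` is zero gives the same result as and-ing it with `snez y`, which saves one instruction.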
DeltaFile
+120-14llvm/test/CodeGen/RISCV/zicond-opts.ll
+43-1llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+20-24llvm/test/CodeGen/RISCV/xaluo.ll
+183-393 files

LLVM/project 80232b5llvm/lib/Target/AMDGPU SIInstrInfo.cpp SIInstrInfo.h, llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll ran-out-of-sgprs-allocation-failure.mir

[AMDGPU] Add liverange split instructions into BB Prolog

The COPY inserted for liverange split during sgpr-regalloc
pipeline currently breaks the BB prolog during the subsequent
vgpr-regalloc phase while spilling and/or splitting the vector
liveranges. This patch fixes it by correctly including the
LR split instructions during sgpr-regalloc and wwm-regalloc
pipelines into the BB prolog.
DeltaFile
+683-659llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+58-62llvm/test/CodeGen/AMDGPU/ran-out-of-sgprs-allocation-failure.mir
+27-7llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+2-0llvm/lib/Target/AMDGPU/SIInstrInfo.h
+770-7284 files

LLVM/project a7bf45autils/bazel/llvm-project-overlay/mlir BUILD.bazel

[bazel] Add missing deps for AlignmentAttrInterface.h (#166899)

Commit 4a7d3df added `#include "AlignmentAttrInterface.h"` in three
places: MemRef.h, SPIRVOps.h, and VectorOps.h; my previous bazel fix
7e9db96 only covered MemRef.h, but not the other two.
DeltaFile
+2-0utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+2-01 files

LLVM/project 1e3a049clang/include/clang/Basic BuiltinsAMDGPU.def, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Add builtins for wave reduction intrinsics
DeltaFile
+84-0clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+8-0clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+4-0clang/include/clang/Basic/BuiltinsAMDGPU.def
+96-03 files

LLVM/project ba56c47llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.sub.ll llvm.amdgcn.reduce.add.ll

[AMDGPU] Add wave reduce intrinsics for float types - 2

Supported Ops: `fadd`, `fsub`
DeltaFile
+967-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.sub.ll
+949-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.add.ll
+39-3llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2-0llvm/lib/Target/AMDGPU/SIInstructions.td
+1,957-34 files

LLVM/project 8f8b501llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td

[AMDGPU] Add wave reduce intrinsics for float types - 1

Supported Ops: `fmin`, `fmax`
DeltaFile
+881-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.max.ll
+881-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.min.ll
+42-4llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+4-1llvm/lib/Target/AMDGPU/SIInstructions.td
+1-1llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+1,809-65 files

LLVM/project 77b9301llvm/lib/Target/AArch64 AArch64FrameLowering.cpp AArch64AsmPrinter.cpp, llvm/test/CodeGen/AArch64 seh-extended-spills.ll

AArch64: support extended spills in SEH on WoS (#166849)

When lowering code for Windows, we might be using a non-standard calling
convention (e.g. `preserve_most`). In such a case, we might be spilling
registers which are unexpected (i.e. x9-x15). Use the ARM64EC opcodes to
indicate such spills.

This adds support for handling these spills but is insufficient
on its own. The encoded results are incorrect due to the expectation
that the pairwise spills are always 16-byte aligned, which we currently
do not enforce. Fixing that is beyond the scope of emitting the SEH
directives for the spill.
DeltaFile
+34-0llvm/test/CodeGen/AArch64/seh-extended-spills.ll
+23-7llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+16-0llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+2-0llvm/lib/Target/AArch64/AArch64InstrInfo.td
+2-0llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp
+2-0llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+79-76 files

LLVM/project 40c89e5llvm/lib/Target/X86 X86AvoidTrailingCall.cpp X86.h

[X86][NewPM] Add New Pass Manager wiring for x86-avoid-trailing-call (#166723)

DeltaFile
+26-7llvm/lib/Target/X86/X86AvoidTrailingCall.cpp
+12-2llvm/lib/Target/X86/X86.h
+2-2llvm/lib/Target/X86/X86TargetMachine.cpp
+1-1llvm/lib/Target/X86/X86PassRegistry.def
+41-124 files

LLVM/project 28fdda6llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV rv64zba.ll

[RISCV] Use SLLI.UW in double-SHL_ADD multiplications (#166728)

Similarly to muls by 3/5/9 << N, emit the SHL first for other SHL_ADD
multiplications, if it can be folded into SLLI.UW.
DeltaFile
+37-41llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+56-0llvm/test/CodeGen/RISCV/rv64zba.ll
+93-412 files

LLVM/project a9587c1llvm/include/llvm/CodeGen MachineInstr.h, llvm/lib/CodeGen SplitKit.cpp

[CodeGen] Introduce MI flag for Live Range split instructions

Some targets need to identify the COPY instructions that
correspond to RA-inserted live range splits. Add the new
flag `MachineInstr::LRSplit` to serve that purpose.
DeltaFile
+2-1llvm/include/llvm/CodeGen/MachineInstr.h
+2-0llvm/lib/CodeGen/SplitKit.cpp
+4-12 files

LLVM/project bf1b866libc/config/baremetal/aarch64 entrypoints.txt, libc/config/baremetal/arm entrypoints.txt

[libc] Add localtime_r to baremetal entrypoints (#166677)

DeltaFile
+4-4libc/src/time/baremetal/CMakeLists.txt
+2-3libc/config/baremetal/aarch64/entrypoints.txt
+2-0libc/config/baremetal/arm/entrypoints.txt
+2-0libc/config/baremetal/riscv/entrypoints.txt
+10-74 files

LLVM/project 41825fbllvm/lib/Option ArgList.cpp, llvm/unittests/Option OptionSubCommandsTest.cpp

[Option] Fix simple subcommand with positional arguments (#166859)

Fix subcommand detection when a subcommand is used with positional arguments.
When only one valid subcommand is passed,
`ArgList::getSubCommand()` should return the correct subcommand even
if other positionals are passed.
DeltaFile
+13-0llvm/unittests/Option/OptionSubCommandsTest.cpp
+1-3llvm/lib/Option/ArgList.cpp
+14-32 files

LLVM/project b9669ballvm/test/CodeGen/AMDGPU unstructured-cfg-def-use-issue.ll branch-folding-implicit-def-subreg.ll, llvm/test/Instrumentation/HWAddressSanitizer/X86 basic.ll

[WIP][IR][Constants] Change the semantic of `ConstantPointerNull` to represent an actual `nullptr` instead of a zero-value pointer

The value of a `nullptr` is not always `0`. For example, on AMDGPU, the `nullptr` in address spaces 3 and 5 is `0xffffffff`. Currently, there is no target-independent way to get this information, making it difficult and error-prone to handle null pointers in target-agnostic code.

We do have `ConstantPointerNull`, but it might be a little confusing and misleading. It represents a pointer with an all-zero value rather than necessarily a real `nullptr`. Therefore, to represent a real `nullptr` in address space `N`, we need to use `addrspacecast ptr null to ptr addrspace(N)` and it can't be folded.

In this PR, we change the semantics of `ConstantPointerNull` to represent an actual `nullptr` instead of a zero-value pointer. Here are the detailed changes.

* `ptr addrspace(N) null` will represent the actual `nullptr` in address space `N`.
* `ptr addrspace(N) zeroinitializer` will represent a zero-value pointer in address space `N`.

* `Constant::getNullValue` will return a _null_ value. It is the same as the current semantics except for `PointerType`, for which it will return a real `nullptr` pointer.
* `Constant::getZeroValue` will return a zero-value constant. It is exactly the same as the current semantics. To represent a zero-value pointer, a `ConstantExpr` will be used (effectively `inttoptr i8 0 to ptr addrspace(N)`).
* Correspondingly, there will be both `Constant::isNullValue` and `Constant::isZeroValue`.

The RFC is https://discourse.llvm.org/t/rfc-introduce-sentinel-pointer-value-to-datalayout/85265. It is a little bit old and the title might look different, but everything eventually converges to this change. An early attempt can be found in https://github.com/llvm/llvm-project/pull/131557, which has many valuable discussions as well.

This PR is still WIP but any early feedback is welcome. I'll include as many necessary code changes as possible in this PR, but eventually this needs to be carefully split into multiple PRs, and I'll do it after the changes look good to everyone.
DeltaFile
+76-73llvm/test/CodeGen/AMDGPU/unstructured-cfg-def-use-issue.ll
+71-56llvm/test/CodeGen/AMDGPU/branch-folding-implicit-def-subreg.ll
+62-60llvm/test/CodeGen/AMDGPU/tuple-allocation-failure.ll
+46-44llvm/test/CodeGen/AMDGPU/agpr-copy-no-free-registers.ll
+41-41llvm/test/Instrumentation/HWAddressSanitizer/X86/basic.ll
+40-37llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
+336-31185 files not shown
+869-67391 files

LLVM/project c9b4169clang/include/clang/Analysis/Analyses/LifetimeSafety Facts.h, clang/lib/Analysis/LifetimeSafety Dataflow.h Facts.cpp

[LifetimeSafety] Optimize fact storage with IDs and vector-based lookup (#165963)

Optimize the FactManager and DataflowAnalysis classes by using vector-based storage with ID-based lookups instead of maps.

- Added a `FactID` type using the `utils::ID` template to uniquely identify facts
- Modified `Fact` class to store and manage IDs
- Changed `FactManager` to use vector-based storage indexed by block ID instead of a map
- Updated `DataflowAnalysis` to use vector-based storage for states instead of maps
- Modified lookups to use ID-based indexing for better performance
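
The general idea behind these changes can be sketched as follows: when IDs are dense and handed out in insertion order, a vector indexed by ID can replace a hash map. `ItemID` and `IndexedStore` are hypothetical stand-ins for this sketch, not the `FactID` or `FactManager` types from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical strongly typed ID; the patch itself uses its utils::ID template.
struct ItemID {
  std::size_t Value;
};

// Hypothetical store: IDs are assigned sequentially on insertion, so the ID
// doubles as the vector index and lookups need no hashing.
class IndexedStore {
public:
  ItemID add(std::string Name) {
    Items.push_back(std::move(Name));
    return ItemID{Items.size() - 1};
  }

  const std::string &get(ItemID ID) const {
    assert(ID.Value < Items.size() && "ID out of range");
    return Items[ID.Value];
  }

  std::size_t size() const { return Items.size(); }

private:
  std::vector<std::string> Items;
};
```

A lookup is then a bounds-checked array access, and the elements sit contiguously in memory, which is plausibly where the improvement comes from.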

Reduces the compile-time hit on long-tail targets like `tools/clang/lib/CodeGen/CMakeFiles/obj.clangCodeGen.dir/TargetBuiltins/RISCV.cpp.o` from [21%](http://llvm-compile-time-tracker.com/compare_clang.php?from=6e25a04027ca786b7919657c7df330a33985ceea&to=20b42efa277c8b1915db757863e1fc26531cfd53&stat=instructions%3Au&sortBy=absolute-difference) to [3.2%](http://llvm-compile-time-tracker.com/compare_clang.php?from=6e25a04027ca786b7919657c7df330a33985ceea&to=d2d1cd1109c3a85344457bfff6f092ae7b96b211&stat=instructions%3Au&sortBy=absolute-difference).
DeltaFile
+23-8clang/include/clang/Analysis/Analyses/LifetimeSafety/Facts.h
+9-5clang/lib/Analysis/LifetimeSafety/Dataflow.h
+5-8clang/lib/Analysis/LifetimeSafety/Facts.cpp
+1-0clang/lib/Analysis/LifetimeSafety/LifetimeSafety.cpp
+38-214 files

LLVM/project 0dafeb9clang/include/clang/Analysis/Analyses/LifetimeSafety Facts.h, clang/lib/Analysis/LifetimeSafety Dataflow.h Facts.cpp

Avoid using DenseMap for CFGBlock and program points
DeltaFile
+23-8clang/include/clang/Analysis/Analyses/LifetimeSafety/Facts.h
+9-5clang/lib/Analysis/LifetimeSafety/Dataflow.h
+5-8clang/lib/Analysis/LifetimeSafety/Facts.cpp
+1-0clang/lib/Analysis/LifetimeSafety/LifetimeSafety.cpp
+38-214 files

LLVM/project 5314d99llvm/include/llvm/Support ThreadPool.h, llvm/lib/Support ThreadPool.cpp

Use `llvm::unique_function` in the async APIs (#166727)

This is needed to allow using these APIs with callable objects that
transitively capture move-only constructs. These come up very widely
when writing concurrent code, such as `std::future`, `std::promise`,
`std::unique_lock`, etc.
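
The underlying constraint is that `std::function` requires its target to be copy-constructible, so a lambda that captures a move-only object cannot be stored in one. The sketch below uses a deliberately tiny move-only wrapper, `MoveOnlyTask`, as a hypothetical stand-in for `llvm::unique_function`; it is not how LLVM implements it.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical, minimal move-only wrapper for callables of type `int()`.
class MoveOnlyTask {
  struct Base {
    virtual ~Base() = default;
    virtual int call() = 0;
  };
  template <class F> struct Model final : Base {
    explicit Model(F Callable) : Fn(std::move(Callable)) {}
    int call() override { return Fn(); }
    F Fn;
  };
  std::unique_ptr<Base> Impl;

public:
  // Takes the callable by value and moves it into heap storage.
  template <class F>
  MoveOnlyTask(F Callable)
      : Impl(std::make_unique<Model<F>>(std::move(Callable))) {}
  MoveOnlyTask(MoveOnlyTask &&) = default;
  MoveOnlyTask &operator=(MoveOnlyTask &&) = default;

  int operator()() { return Impl->call(); }
};

inline int run_move_only_example() {
  auto Owned = std::make_unique<int>(42);
  // The closure owns a unique_ptr, so it can be moved but not copied.
  auto Fn = [P = std::move(Owned)] { return *P; };
  static_assert(!std::is_copy_constructible_v<decltype(Fn)>,
                "a lambda capturing a unique_ptr is move-only");
  MoveOnlyTask Task(std::move(Fn));    // accepted: only a move is required
  MoveOnlyTask Moved(std::move(Task)); // the wrapper itself is movable
  return Moved();
}
```

The same closure passed to a `std::function<int()>` would fail to compile, which is the situation the commit message describes for the async APIs.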
DeltaFile
+13-10llvm/include/llvm/Support/ThreadPool.h
+14-0llvm/unittests/Support/ThreadPool.cpp
+2-2llvm/lib/Support/ThreadPool.cpp
+29-123 files

LLVM/project f8e9b89clang/lib/CodeGen CGBuiltin.cpp

[CodeGen] Fix a warning

This patch fixes:

  clang/lib/CodeGen/CGBuiltin.cpp:1216:13: error: unused variable
  'CAT' [-Werror,-Wunused-variable]
DeltaFile
+1-1clang/lib/CodeGen/CGBuiltin.cpp
+1-11 files

LLVM/project 2dd60afclang/include/clang/Analysis/Analyses/LifetimeSafety Facts.h, clang/lib/Analysis/LifetimeSafety Dataflow.h Facts.cpp

Avoid using DenseMap for CFGBlock and program points
DeltaFile
+23-8clang/include/clang/Analysis/Analyses/LifetimeSafety/Facts.h
+9-5clang/lib/Analysis/LifetimeSafety/Dataflow.h
+5-8clang/lib/Analysis/LifetimeSafety/Facts.cpp
+1-0clang/lib/Analysis/LifetimeSafety/LifetimeSafety.cpp
+38-214 files

LLVM/project 25c01e2llvm/test/CodeGen/LoongArch expandmemcmp.ll expandmemcmp-optsize.ll

rebase && update tests
DeltaFile
+2,594-715llvm/test/CodeGen/LoongArch/expandmemcmp.ll
+1,619-527llvm/test/CodeGen/LoongArch/expandmemcmp-optsize.ll
+18-9llvm/test/CodeGen/LoongArch/memcmp.ll
+4,231-1,2513 files

LLVM/project 92d82f8llvm/lib/Target/LoongArch LoongArchTargetTransformInfo.cpp

enable tail expansion which will reduce branches like aarch64/riscv
DeltaFile
+7-3llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp
+7-31 files

LLVM/project 6fe5270llvm/lib/Target/LoongArch LoongArchTargetTransformInfo.cpp LoongArchTargetTransformInfo.h

[LoongArch] Initial implementation for `enableMemCmpExpansion` hook

After overriding `TargetTransformInfo::enableMemCmpExpansion`
in this commit, `MergeICmps` and `ExpandMemCmp` passes will be
enabled on LoongArch.
DeltaFile
+20-1llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp
+2-1llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.h
+22-22 files

LLVM/project 630f43allvm/test/CodeGen/RISCV remat.ll

[RISCV] Move remat.ll test from riscv32 to riscv64. NFC

This is to allow us to test rematting i64 lds in #166774. Checked and
we're still rematting the same lis on rv64 in this test.
DeltaFile
+108-109llvm/test/CodeGen/RISCV/remat.ll
+108-1091 files