LLVM/project bd40d56libcxx/utils conformance

[libc++] Fix backslash substitution in RST synchronization scripts
DeltaFile
+1-1libcxx/utils/conformance
+1-11 files

LLVM/project f07877bllvm/test/Analysis/DependenceAnalysis monotonicity-loop-guard.ll

[DA] Add tests for nsw doesn't hold on entire iteration space (NFC) (#162281)

The monotonicity definition states its domain as follows:

```
/// The property of monotonicity of a SCEV. To define the monotonicity, assume
/// a SCEV defined within N-nested loops. Let i_k denote the iteration number
/// of the k-th loop. Then we can regard the SCEV as an N-ary function:
///
///   F(i_1, i_2, ..., i_N)
///
/// The domain of i_k is the closed range [0, BTC_k], where BTC_k is the
/// backedge-taken count of the k-th loop
```
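
As an illustration of the domain in this definition, the following sketch brute-forces the single-loop case. It is not LLVM code: `to_signed` and `is_monotonic_on_domain` are hypothetical helpers, and the 8-bit width and the affine form F(i) = start + step * i are assumptions chosen to keep the example small.

```python
def to_signed(value, bits=8):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def is_monotonic_on_domain(start, step, btc, bits=8):
    """Check F(i) = start + step * i (wrapping) for signed monotonicity.

    The domain is the closed range [0, btc], matching the definition above.
    """
    values = [to_signed(start + step * i, bits) for i in range(btc + 1)]
    pairs = list(zip(values, values[1:]))
    non_decreasing = all(a <= b for a, b in pairs)
    non_increasing = all(a >= b for a, b in pairs)
    return non_decreasing or non_increasing
```

With start = 0 and step = 1, the function is monotonic for BTC = 127 but not for BTC = 128, because the value wraps on the last iteration of the larger domain.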

The current monotonicity check implementation doesn't match this
definition because:

- Just checking nowrap property of addrecs recursively is not sufficient

    [7 lines not shown]
DeltaFile
+141-0llvm/test/Analysis/DependenceAnalysis/monotonicity-loop-guard.ll
+141-01 files

LLVM/project 377efc7mlir/lib/Dialect/Arith/IR InferIntRangeInterfaceImpls.cpp, mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp

[mlir][arith] Fix SelectOp unsafe int range inference with uninitialized range case (#173716)

This PR fixes a bug in `arith::SelectOp::inferResultRangesFromOptional`
where uninitialized SelectOp branch int ranges were incorrectly joined
with initialized int ranges during dataflow analysis, leading to
incorrect folding in `-int-range-optimizations`.

**The Issue:**
When an `arith.select` branch has an uninitialized range (e.g., from an
op like `nvvm.read.ptx.sreg.cluster.ctaid.x`, `scf.switch`, `llvm.call`,
... that lacks range inference), the analysis computed
`IntegerValueRange::join(Uninitialized, Constant) = Constant`. This
caused the `arith.select` to be replaced with the constant, ignoring the
dynamic branch.
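
The hazard can be modeled outside MLIR with a small sketch. Everything here is an assumption for illustration: `None` stands in for an uninitialized range, a range is a plain `(min, max)` tuple, and the function names are hypothetical rather than the MLIR API. The conservative policy shown is one way to avoid the problem, not necessarily what the patch does.

```python
from typing import Optional, Tuple

Range = Optional[Tuple[int, int]]  # None models an uninitialized range


def join_ignoring_uninitialized(lhs: Range, rhs: Range) -> Range:
    """Join that treats an uninitialized operand as 'no information yet'."""
    if lhs is None:
        return rhs
    if rhs is None:
        return lhs
    return (min(lhs[0], rhs[0]), max(lhs[1], rhs[1]))


def select_range_conservative(true_range: Range, false_range: Range) -> Range:
    """Only report a range for the select when both branches have one."""
    if true_range is None or false_range is None:
        return None
    return join_ignoring_uninitialized(true_range, false_range)
```

Here `join_ignoring_uninitialized(None, (32, 32))` yields `(32, 32)`, which is the result that lets a later pass fold the select to a constant. `select_range_conservative(None, (32, 32))` yields `None`, so no fold is justified.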

**Example:**
```mlir
// The bug before fix: -int-range-optimizations replaces %1 with %c32
// led to incorrect results and unsafe behaviours

    [14 lines not shown]
DeltaFile
+18-0mlir/test/Dialect/Arith/int-range-interface.mlir
+17-0mlir/test/lib/Dialect/Test/TestOps.td
+10-0mlir/test/lib/Dialect/Test/TestOpDefs.cpp
+7-1mlir/lib/Dialect/Arith/IR/InferIntRangeInterfaceImpls.cpp
+6-0mlir/test/Dialect/LLVMIR/nvvm-test-range.mlir
+2-0mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+60-16 files

LLVM/project 0a69bccclang/lib/CodeGen CGObjCMac.cpp CGObjCGNU.cpp

[ObjCDirectPreconditionThunk] Setup helper functions (#170617)

## TL;DR

This is a stack of PRs implementing features to expose direct methods
ABI.
You can see the RFC, design, and discussion
[here](https://discourse.llvm.org/t/rfc-optimizing-code-size-of-objc-direct-by-exposing-function-symbols-and-moving-nil-checks-to-thunks/88866).

https://github.com/llvm/llvm-project/pull/170616 Flag
`-fobjc-direct-precondition-thunk` set up
https://github.com/llvm/llvm-project/pull/170617 **Code refactoring to
ease later reviews**
https://github.com/llvm/llvm-project/pull/170618 Thunk generation
https://github.com/llvm/llvm-project/pull/170619 Optimizations, some
class objects can be known to be realized

## Implementation details


    [14 lines not shown]
DeltaFile
+73-24clang/lib/CodeGen/CGObjCMac.cpp
+12-1clang/lib/CodeGen/CGObjCGNU.cpp
+6-0clang/lib/CodeGen/CGObjCRuntime.h
+91-253 files

LLVM/project 0316441clang/include/clang/Basic LangOptions.h Features.def, clang/include/clang/Options Options.td

Add fine-grained `__has_feature()` cutout (#170822)

This is a follow-up to pull #148323. It mints
`-fsanitize-ignore-for-ubsan-feature=...`, accepting a list of (UBSan)
sanitizers that should not cause
`__has_feature(undefined_behavior_sanitizer)` to evaluate true.

---------

Co-authored-by: Kalvin Lee <kdlee at chromium.org>
DeltaFile
+30-0clang/test/Lexer/has_feature_undefined_behavior_sanitizer.cpp
+16-1clang/lib/Driver/SanitizerArgs.cpp
+15-0clang/lib/Frontend/CompilerInvocation.cpp
+8-0clang/include/clang/Options/Options.td
+3-0clang/include/clang/Basic/LangOptions.h
+1-1clang/include/clang/Basic/Features.def
+73-21 files not shown
+74-27 files

LLVM/project aa29926llvm/lib/CodeGen/SelectionDAG TargetLowering.cpp, llvm/test/CodeGen/AArch64 dag-combine-setcc.ll

[SDAG] (setcc (sub nsw a, b), zero, s??) -> (setcc a, b, s??) (#175459)

This often happens when the dag combiner produces sign/zero extends and
realizes that nsw/nuw can be added, for example in the case of `(abds
(sext a), (sext b))`

alive2:
- slt, nsw: [link](https://alive2.llvm.org/ce/z/cgjMSx)
- sgt, nsw: [link](https://alive2.llvm.org/ce/z/JP7h2f)
- sle, nsw: [link](https://alive2.llvm.org/ce/z/n5Wuc_)
- sge, nsw: [link](https://alive2.llvm.org/ce/z/Eps53-)
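
As a complement to the alive2 proofs, the fold can be checked exhaustively at a small bit width. This is a sketch under stated assumptions: 8-bit integers, and pairs whose subtraction overflows are skipped because `nsw` makes that result poison. `fold_holds` and `wrap` are hypothetical helper names.

```python
import operator

BITS = 8
LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1

PREDICATES = {
    "slt": operator.lt,
    "sgt": operator.gt,
    "sle": operator.le,
    "sge": operator.ge,
}


def wrap(value):
    """Wrap an integer into the signed BITS-bit range."""
    return ((value - LO) % (1 << BITS)) + LO


def fold_holds(pred, require_nsw=True):
    """Check pred(a - b, 0) == pred(a, b) for every pair of BITS-bit values."""
    for a in range(LO, HI + 1):
        for b in range(LO, HI + 1):
            diff = a - b
            if not LO <= diff <= HI:
                if require_nsw:
                    continue  # signed overflow: poison under nsw, nothing to check
                diff = wrap(diff)
            if pred(diff, 0) != pred(a, b):
                return False
    return True
```

All four predicates pass with `require_nsw=True`. Passing `require_nsw=False` makes the check fail, for example at a = -128, b = 1, which is why the flag is required.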
DeltaFile
+86-74llvm/test/CodeGen/Thumb2/mve-vabdus.ll
+73-2llvm/test/CodeGen/AArch64/dag-combine-setcc.ll
+7-0llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+166-763 files

LLVM/project e40101blibcxx/docs/Status Cxx2cIssues.csvgb, libcxx/utils conformance synchronize_csv_status_files.py

[libc++] Improve the script to manage libc++ conformance issues (#172905)

The previous script was fairly inflexible. This patch refactors the
script into a tool that can be used in various ways to manage the
conformance-tracking bits of libc++. This should make it possible to
synchronize the CSV status files, but also to find GitHub issues that
aren't linked to the 'C++ Standards Conformance' project, to create
missing issues more easily, etc.
DeltaFile
+650-0libcxx/utils/conformance
+0-472libcxx/utils/synchronize_csv_status_files.py
+1-0libcxx/utils/requirements.txt
+0-0libcxx/docs/Status/Cxx2cIssues.csvgb
+651-4724 files

LLVM/project efad356offload/plugins-nextgen/amdgpu/dynamic_hsa hsa.cpp, offload/plugins-nextgen/amdgpu/src rtl.cpp

[OFFLOAD] Update CUDA and AMD plugins to new debug format (#175787)

DeltaFile
+10-4offload/plugins-nextgen/cuda/dynamic_cuda/cuda.cpp
+7-3offload/plugins-nextgen/amdgpu/dynamic_hsa/hsa.cpp
+5-4offload/plugins-nextgen/cuda/src/rtl.cpp
+4-3offload/plugins-nextgen/amdgpu/src/rtl.cpp
+1-1offload/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h
+27-155 files

LLVM/project e6cdfb7clang/lib/APINotes APINotesYAMLCompiler.cpp, clang/lib/Sema SemaDeclCXX.cpp

Fix typos and spelling errors across codebase (#156270)

Corrected various spelling mistakes such as 'occurred', 'receiver',
'initialized', 'length', and others in comments, variable names,
function names, and documentation throughout the project. These
changes improve code readability and maintain consistency in naming
and documentation.

Co-authored-by: Louis Dionne <ldionne.2 at gmail.com>
DeltaFile
+5-5lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.h
+5-5clang/lib/Sema/SemaDeclCXX.cpp
+5-5clang/lib/APINotes/APINotesYAMLCompiler.cpp
+4-4llvm/lib/Target/AMDGPU/AMDGPUPostLegalizerCombiner.cpp
+4-4clang/unittests/Format/FormatTestCSharp.cpp
+3-3openmp/libompd/gdb-plugin/ompdModule.c
+26-2685 files not shown
+143-14391 files

LLVM/project 3e9045bllvm/test/MC/AMDGPU gfx8_asm_vop3.s gfx7_asm_vop3.s, llvm/test/MC/Disassembler/AMDGPU gfx9_vop3.txt

Merge branch 'main' into users/kasuga-fj/da-monotonic-check-1
DeltaFile
+42,349-42,348llvm/test/MC/AMDGPU/gfx8_asm_vop3.s
+41,419-41,418llvm/test/MC/AMDGPU/gfx7_asm_vop3.s
+36,428-36,427llvm/test/MC/AMDGPU/gfx9_asm_vop3.s
+28,175-28,174llvm/test/MC/AMDGPU/gfx9_asm_vopc.s
+22,708-22,884llvm/test/MC/Disassembler/AMDGPU/gfx9_vop3.txt
+22,276-22,275llvm/test/MC/AMDGPU/gfx8_asm_vopc.s
+193,355-193,52612,462 files not shown
+1,908,177-1,389,14612,468 files

LLVM/project 71cb2b0llvm/test/CodeGen/AMDGPU fneg-combines.f16.ll bf16.ll

AMDGPU: Change ABI of 16-bit scalar values for gfx6/gfx7

Keep bf16/f16 values encoded as the low half of a 32-bit register,
instead of promoting to float. This avoids unwanted FP effects
from the fpext/fptrunc which should not be implied by just
passing an argument. This also fixes ABI divergence between
SelectionDAG and GlobalISel.

I've wanted to make this change for ages, and failed the last
few times. The main complication was the hack to return
shader integer types in SGPRs, which now needs to inspect
the underlying IR type.
DeltaFile
+372-419llvm/test/CodeGen/AMDGPU/fneg-combines.f16.ll
+247-430llvm/test/CodeGen/AMDGPU/bf16.ll
+116-174llvm/test/CodeGen/AMDGPU/fcopysign.bf16.ll
+139-139llvm/test/CodeGen/AMDGPU/constant-address-space-32bit.ll
+112-153llvm/test/CodeGen/AMDGPU/select-fabs-fneg-extract.f16.ll
+140-114llvm/test/CodeGen/AMDGPU/fcopysign.f16.ll
+1,126-1,42981 files not shown
+3,579-4,36087 files

LLVM/project ca4c52allvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.960bit.ll

AMDGPU: Change ABI of 16-bit element vectors on gfx6/7

Fix ABI on old subtargets to match new subtargets, packing
16-bit element subvectors into 32-bit registers. Previously
this would be scalarized and promoted to i32/float.

Note this only changes the vector cases. Scalar i16/half are
still promoted to i32/float for now. I've unsuccessfully tried
to make that switch in the past, so leave that for later.

This will help with removal of softPromoteHalfType.
DeltaFile
+47,697-51,378llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+14,474-16,242llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+16,328-12,881llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+13,036-14,705llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+11,668-13,311llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+10,558-11,908llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.768bit.ll
+113,761-120,425151 files not shown
+200,130-204,067157 files

LLVM/project aa92839llvm/lib/CodeGen/GlobalISel CallLowering.cpp

GlobalISel: Fix mishandling vector-as-scalar in return values

This fixes 2 cases when the AMDGPU ABI is fixed to pass <2 x i16>
values as packed on gfx6/gfx7. The ABI does not pack values
currently; this is a pre-fix for that change.

Insert a bitcast if there is a single part with a different size.
Previously this would miscompile by going through the scalarization
and extend path, dropping the high element.

Also fix assertions in odd cases, like <3 x i16> -> i32. This needs
to unmerge with excess elements from the widened source vector.

All of this code is in need of a cleanup; this should look more
like the DAG version using getVectorTypeBreakdown.
DeltaFile
+24-2llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
+24-21 files

LLVM/project 2e0e4f6llvm/test/CodeGen/AMDGPU bf16.ll llvm.exp2.bf16.ll, llvm/test/CodeGen/AMDGPU/GlobalISel irtranslate-bf16.ll

AMDGPU: Directly use v2bf16 as register type for bf16 vectors. (#175761)

Previously we were casting v2bf16 to i32, unlike the f16 case. Simplify
this by using the natural vector type. This is probably a leftover from
before v2bf16 was treated as legal. This is preparation for fixing a
miscompile in globalisel.
DeltaFile
+465-462llvm/test/CodeGen/AMDGPU/bf16.ll
+121-282llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslate-bf16.ll
+122-133llvm/test/CodeGen/AMDGPU/llvm.exp2.bf16.ll
+91-91llvm/test/CodeGen/AMDGPU/minimumnum.bf16.ll
+91-91llvm/test/CodeGen/AMDGPU/maximumnum.bf16.ll
+14-24llvm/test/CodeGen/AMDGPU/llvm.log2.bf16.ll
+904-1,0833 files not shown
+910-1,1009 files

LLVM/project 5cb4d32. .gitignore

[NFC] Add tablegen_compile_commands.yml to .gitignore (#175687)

People may want to symlink the autogenerated
tablegen_compile_commands.yml into their source directories by analogy
with compile_commands.json, so this commit gives it similar
.gitignore treatment.
DeltaFile
+1-0.gitignore
+1-01 files

LLVM/project d20f617llvm/include/llvm/Analysis BlockFrequencyInfoImpl.h LazyCallGraph.h, llvm/lib/IR DiagnosticInfo.cpp Attributes.cpp

[NFCI][LLVM] Remove `raw_string_ostream::flush` calls (#164086)

Remove calls to `flush()` on `raw_string_ostream` objects as they are not needed.
DeltaFile
+16-50llvm/unittests/Frontend/HLSLRootSignatureDumpTest.cpp
+6-9llvm/lib/IR/DiagnosticInfo.cpp
+2-8llvm/lib/IR/Attributes.cpp
+2-7llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
+2-6llvm/include/llvm/Analysis/LazyCallGraph.h
+0-8llvm/lib/IR/Core.cpp
+28-8841 files not shown
+30-15647 files

LLVM/project 94cc82dllvm/include/llvm/Frontend/OpenMP ConstructDecompositionT.h

[OpenMP] Remove special handling of implicit clauses in decomposition (#174654)

Applying implicit clauses should not cause any issues. The only
exception is that "simd linear(x)" could imply a "firstprivate", and
that clause is not allowed on the simd construct.
Add a check for that specific case, and apply all implicit clauses as if
they were explicit.
DeltaFile
+15-11llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h
+15-111 files

LLVM/project 34603dbllvm/lib/Transforms/Utils CMakeLists.txt

Fix dependencies after PR #174490 (#175793)

Added the missing direct dependency (it was already an indirect dependency due to `Analysis`)
DeltaFile
+1-0llvm/lib/Transforms/Utils/CMakeLists.txt
+1-01 files

LLVM/project 4a9a13cllvm/include/llvm/ADT FloatingPointMode.h, llvm/include/llvm/Support KnownFPClass.h

InstCombine: Handle fadd in SimplifyDemandedFPClass (#174853)

Note some of the tests currently fail with alive, but not
due to this patch. Namely, when performing the fadd x, 0 -> x
simplification in functions with non-IEEE denormal handling.
The existing instsimplify ignores the denormals-are-zero hazard by
checking cannotBeNegativeZero instead of isKnownNeverLogicalZero.

Also note the self handling doesn't really do anything yet, other
than propagate consistent known-fpclass information until there is
multiple use support.

This also leaves behind the original ValueTracking support, without
switching to the new KnownFPClass::fadd utility. This will be easier
to clean up after the subsequent fsub support patch.
DeltaFile
+93-124llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-fadd.ll
+113-0llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+46-0llvm/lib/Support/KnownFPClass.cpp
+17-0llvm/include/llvm/ADT/FloatingPointMode.h
+10-0llvm/include/llvm/Support/KnownFPClass.h
+279-1245 files

LLVM/project f8ae051libcxx/test/libcxx/ranges/range.adaptors/range.as_rvalue nodiscard.verify.cpp, libcxx/test/libcxx/ranges/range.adaptors/range.as_rvalue_view nodiscard.verify.cpp

[libc++][ranges][NFC] Cleanup `nodiscard.verify.cpp` tests (#175725)

- Removed redundant files
- Renamed files to the common `nodiscard.verify.cpp`
DeltaFile
+63-0libcxx/test/libcxx/ranges/range.adaptors/range.as_rvalue/nodiscard.verify.cpp
+0-63libcxx/test/libcxx/ranges/range.adaptors/range.as_rvalue_view/nodiscard.verify.cpp
+0-61libcxx/test/libcxx/ranges/range.adaptors/range.common.view/nodiscard.verify.cpp
+61-0libcxx/test/libcxx/ranges/range.adaptors/range.common/nodiscard.verify.cpp
+0-21libcxx/test/libcxx/ranges/range.adaptors/range.common.view/adaptor.nodiscard.verify.cpp
+19-0libcxx/test/libcxx/ranges/range.adaptors/range.counted/nodiscard.verify.cpp
+143-1451 files not shown
+143-1647 files

LLVM/project 62f629allvm/include/llvm/Transforms/Utils LowerMemIntrinsics.h, llvm/lib/Transforms/Utils LowerMemIntrinsics.cpp

[LowerMemIntrinsics] Propagate value profile to branch weights (#174490)

If the mem intrinsics have value profile information associated, we can synthesize branch weights when converting them (the intrinsics) to loops.  
  
Issue #147390
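
To make the idea concrete, here is one plausible way such weights could be derived. It is not the computation used by the patch. The assumptions are: the value profile is a list of `(length_in_bytes, observed_count)` pairs, the lowered loop copies a fixed number of bytes per iteration, and zero-iteration cases are skipped by a guard. The function name `loop_branch_weights` is hypothetical.

```python
def loop_branch_weights(length_profile, bytes_per_iteration):
    """Return (backedge_weight, exit_weight) for a hypothetical copy loop.

    length_profile: list of (length_in_bytes, observed_count) pairs.
    """
    backedge = 0
    exits = 0
    for length, count in length_profile:
        iterations = length // bytes_per_iteration
        if iterations == 0:
            continue  # assumed: a guard skips the loop entirely
        # Each execution takes the backedge (iterations - 1) times, then exits once.
        backedge += (iterations - 1) * count
        exits += count
    return backedge, exits
```

For a profile of `[(16, 10), (64, 2), (0, 5)]` and 16 bytes per iteration, this gives a backedge weight of 6 and an exit weight of 12.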
DeltaFile
+112-37llvm/lib/Transforms/Utils/LowerMemIntrinsics.cpp
+31-18llvm/test/Transforms/PreISelIntrinsicLowering/X86/memcpy-inline-non-constant-len.ll
+20-8llvm/test/Transforms/PreISelIntrinsicLowering/X86/memset-inline-non-constant-len.ll
+4-2llvm/include/llvm/Transforms/Utils/LowerMemIntrinsics.h
+0-3llvm/utils/profcheck-xfail.txt
+167-685 files

LLVM/project b88ea53llvm/lib/Target/AMDGPU GCNRegPressure.cpp, llvm/test/CodeGen/AMDGPU machine-scheduler-sink-trivial-remats-attr.mir

[WIP] Change how ArchVGPR excess is computed. It's not clear why it was considering AGPRs.
DeltaFile
+20-20llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-attr.mir
+13-8llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+33-282 files

LLVM/project 3accd1fllvm/lib/Target/AMDGPU GCNSchedStrategy.cpp

[Review] typos in comment
DeltaFile
+4-3llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+4-31 files

LLVM/project 9b1f3d4llvm/lib/Target/AMDGPU GCNRegPressure.cpp GCNRegPressure.h

[Review] Change constructor of RegExcess to take a pressure and a target and rename spillsToMemory to spillsToMemoryForTargetOccupancy
DeltaFile
+14-8llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+5-0llvm/lib/Target/AMDGPU/GCNRegPressure.h
+5-0llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+24-83 files

LLVM/project 58e74dellvm/lib/Target/AMDGPU GCNRegPressure.cpp

[Review] Move the class into an anonymous namespace
DeltaFile
+2-0llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+2-01 files

LLVM/project e24ab2allvm/lib/Target/AMDGPU GCNRegPressure.cpp

[Review] Use unified vgpr count with unified-register-file
DeltaFile
+2-2llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+2-21 files

LLVM/project 42f18ccllvm/test/CodeGen/AMDGPU swdev-549940.ll

Remove undef from test (it still preserves the test behavior before and after the fix)
DeltaFile
+1-1llvm/test/CodeGen/AMDGPU/swdev-549940.ll
+1-11 files

LLVM/project 9bc63e3llvm/lib/Target/AMDGPU GCNRegPressure.cpp GCNSchedStrategy.cpp, llvm/test/CodeGen/AMDGPU machine-scheduler-sink-trivial-remats-attr.mir swdev-549940.ll

[AMDGPU] Rematerialize VGPR candidates when SGPR spills to VGPR over the VGPR limit

Before, when selecting candidates to rematerialize, we would only
consider SGPR candidates when there was an excess of SGPR registers.

Failing to eliminate the excess would result in spills to VGPRs.
This is normally not an issue, unless spilling to VGPRs results in
excess VGPRs.

This patch does 2 things:
* It relaxes the GCNRPTarget success criteria: now we accept regions
  where we spill SGPRs to VGPRs, as long as this does not end up in
  excess VGPRs.
* It changes isSaveBeneficial to consider the excess VGPRs (which
  includes the SGPRs that would be spilled to VGPR).

With these changes, the compiler rematerializes VGPRs when the excess
SGPRs would result in VGPR excess.
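
The relaxed criterion can be sketched as simple arithmetic. This is a rough model, not the GCNRegPressure code. It assumes that each spill VGPR holds one SGPR per lane (64 lanes here) and that register limits are plain integers. The function name and parameters are hypothetical.

```python
import math


def vgpr_excess_with_sgpr_spills(num_sgprs, num_vgprs, max_sgprs, max_vgprs,
                                 lanes_per_vgpr=64):
    """Return the VGPR excess once excess SGPRs are spilled into VGPR lanes."""
    sgpr_excess = max(0, num_sgprs - max_sgprs)
    # Assumed: one VGPR can absorb up to lanes_per_vgpr spilled SGPRs.
    spill_vgprs = math.ceil(sgpr_excess / lanes_per_vgpr)
    return max(0, num_vgprs + spill_vgprs - max_vgprs)
```

Under this model a region with an SGPR excess is acceptable when the function returns 0. A nonzero result is the case where rematerializing VGPR candidates becomes worthwhile.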


    [4 lines not shown]
DeltaFile
+30-30llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-attr.mir
+15-9llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+3-1llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+1-1llvm/test/CodeGen/AMDGPU/swdev-549940.ll
+1-0llvm/lib/Target/AMDGPU/GCNRegPressure.h
+50-415 files

LLVM/project d8bbb25llvm/test/CodeGen/AMDGPU swdev-549940.ll

Unacceptably large test
DeltaFile
+609-0llvm/test/CodeGen/AMDGPU/swdev-549940.ll
+609-01 files

LLVM/project 87dc9dfllvm/lib/Target/AMDGPU GCNRegPressure.cpp

[NFC][AMDGPU] Refactor common code computing excess register pressure into RegExcess class
DeltaFile
+47-45llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+47-451 files