LLVM/project 4c9efc6llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+5-3llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4-4llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+9-72 files

LLVM/project e1f69eellvm/lib/Target/PowerPC PPCAsmPrinter.cpp PPCPrepareIFuncsOnAIX.cpp, llvm/test/CodeGen/PowerPC aix-ifunc-toc-restore-query-neg.ll aix-ifunc-obj.ll

[AIX] Implement the ifunc attribute. (#153049)

Currently, the AIX linker and loader do not provide a mechanism to
implement ifuncs similar to GNU_ifunc on ELF Linux.
On AIX, we will lower `__attribute__((ifunc("resolver")))` to the llvm
`ifunc` as other platforms do. The llvm `ifunc` in turn will get lowered
at late stages of the optimization pipeline to an AIX-specific
implementation. No special linkage or relocations are needed when
generating assembly/object output.

On AIX, a function `foo` has two symbols associated with it: a function
descriptor (`foo`) residing in the `.data` section, and an entry point
(`.foo`) residing in the `.text` section. The first field of the
descriptor is the address of the entry point. Typically, the address
field in the descriptor is initialized once: statically, at load time
(?), or at runtime if runtime linking is enabled.
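
As a rough illustration of the descriptor layout described above (not the AIX ABI definition and not the code from this patch), the sketch below models a descriptor as a struct whose first field is the entry-point address and calls a function through it. The names `Descriptor`, `callThrough`, `genericImpl`, and `tunedImpl` are hypothetical, and the remaining descriptor fields are omitted. It only shows that a call made through the descriptor follows whatever address the field currently holds.

```c++
#include <cassert>

// Hypothetical model of a function descriptor: only the first field
// (the entry-point address) is represented; other fields are omitted.
using EntryPoint = int (*)(int);

struct Descriptor {
  EntryPoint Entry; // address of the code to run
};

static int genericImpl(int X) { return X + 1; }
static int tunedImpl(int X) { return X + 100; }

// A call through a descriptor loads the entry point and jumps to it.
inline int callThrough(const Descriptor &D, int Arg) { return D.Entry(Arg); }

// Calls through the same descriptor before and after its address field is
// overwritten, and reports whether the second call was redirected.
inline bool descriptorRedirects() {
  Descriptor Foo{genericImpl};
  int Before = callThrough(Foo, 1); // runs genericImpl
  Foo.Entry = tunedImpl;            // rewrite the address field
  int After = callThrough(Foo, 1);  // same descriptor, now runs tunedImpl
  return Before == 2 && After == 101;
}
```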

Here we would like to use the address field in the descriptor to
implement the `ifunc` semantics. Specifically, the ifunc function will

    [29 lines not shown]
DeltaFile
+270-24llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
+114-0llvm/lib/Target/PowerPC/PPCPrepareIFuncsOnAIX.cpp
+94-0llvm/test/CodeGen/PowerPC/aix-ifunc-toc-restore-query-neg.ll
+89-0llvm/test/CodeGen/PowerPC/aix-ifunc-obj.ll
+77-0llvm/test/CodeGen/PowerPC/aix-ifunc-toc-restore-query.ll
+75-0llvm/test/CodeGen/PowerPC/aix-ifunc.ll
+719-2427 files not shown
+966-5633 files

LLVM/project 9481902llvm/test/CodeGen/RISCV tls-models.ll

[llvm][RISCV] precommit test update via UTC (#179508)

Run UTC in preparation of additional tests.
DeltaFile
+107-71llvm/test/CodeGen/RISCV/tls-models.ll
+107-711 files

LLVM/project b418233llvm/lib/MC MCAsmInfoGOFF.cpp

Revert MCAsmInfoGOFF.cpp
DeltaFile
+1-1llvm/lib/MC/MCAsmInfoGOFF.cpp
+1-11 files

LLVM/project dd19a5amlir/test/Conversion/ConvertToEmitC tosa.mlir, mlir/test/Dialect/EmitC/tosa td.mlir ops.mlir

[mlir][emitc] Update and extend the TOSA -> EmitC test (#177339)

This patch updates and extends the TOSA-to-EmitC lowering test:
  * Conversion/ConvertToEmitC/tosa.mlir

Summary of changes and rationale:
* Remove `buffer-alignment=0` from the lowering pipeline; it is not required
  (the existing `CHECK` lines are not affected).
* Move the test from Conversion/ConvertToEmitC/tosa.mlir to
  Dialect/EmitC/tosa/ops.mlir. Conversion tests are intended for single
  conversion passes (e.g. `-convert-dialect1-to-dialect2`), whereas this test
  exercises a more complex lowering pipeline with multiple explicit steps (e.g.
  TOSA -> Linalg, bufferization, etc.).
* Add a Transform Dialect sequence to complement the existing lowering pipeline
  definition. This introduces an additional `RUN` line that is compatible with
  the original one. Using the Transform Dialect makes the pipeline easier to
  document, maintain, and experiment with.
DeltaFile
+44-0mlir/test/Dialect/EmitC/tosa/td.mlir
+43-0mlir/test/Dialect/EmitC/tosa/ops.mlir
+0-41mlir/test/Conversion/ConvertToEmitC/tosa.mlir
+2-0mlir/test/Dialect/EmitC/tosa/lit.local.cfg
+89-414 files

LLVM/project 079b55fllvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+4-4llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+5-3llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+9-72 files

LLVM/project f0c519dllvm/include/llvm/IR Intrinsics.h, llvm/lib/IR Intrinsics.cpp

[NFC][TableGen] Adopt CodeGenHelpers in IntrinsicEmitter (#179310)

- Adopt IfDefEmitter in IntrinsicEmitter.
- Remove #undef for various flags in Intrinsics.cpp/Intrinsics.h as the
TableGen generated code does that now.
DeltaFile
+30-53llvm/utils/TableGen/Basic/IntrinsicEmitter.cpp
+10-9llvm/test/TableGen/intrinsic-arginfo.td
+0-10llvm/lib/IR/Intrinsics.cpp
+0-2llvm/include/llvm/IR/Intrinsics.h
+40-744 files

LLVM/project 90f575bclang/lib/Analysis UnsafeBufferUsage.cpp

[NFC][Clang][UnsafeBufferUsage] Simplify libc function matchers. (#178985)

DeltaFile
+3-14clang/lib/Analysis/UnsafeBufferUsage.cpp
+3-141 files

LLVM/project 35365c0llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+5-3llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4-4llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+9-72 files

LLVM/project d07b1c4llvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU isel-amdgcn-cs-chain-intrinsic-w32.ll isel-amdgcn-cs-chain-intrinsic-w64.ll

[AMDGPU] Allow hoisting of V_READFIRSTLANE_B32 for uniform operand

readfirstlane can be moved across control flow for uniform inputs.
The MachineInstr::NoConvergent attribute allows hoisting
which is otherwise prohibited for a convergent instruction.
DeltaFile
+82-82llvm/test/CodeGen/AMDGPU/isel-amdgcn-cs-chain-intrinsic-w32.ll
+52-52llvm/test/CodeGen/AMDGPU/isel-amdgcn-cs-chain-intrinsic-w64.ll
+24-24llvm/test/CodeGen/AMDGPU/llvm.amdgcn.make.buffer.rsrc.ll
+33-0llvm/test/CodeGen/AMDGPU/readanylane.ll
+16-16llvm/test/CodeGen/AMDGPU/isel-amdgpu-cs-chain-intrinsic-dyn-vgpr-w32.ll
+11-0llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+218-1743 files not shown
+225-1809 files

LLVM/project efb0c4allvm/include/llvm/CodeGen SelectionDAGNodes.h, llvm/lib/CodeGen/SelectionDAG SelectionDAGDumper.cpp InstrEmitter.cpp

Add SDNodeFlag::NoConvergent
DeltaFile
+6-1llvm/include/llvm/CodeGen/SelectionDAGNodes.h
+3-0llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
+3-0llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
+12-13 files

LLVM/project 9bfeaafllvm/lib/IR Intrinsics.cpp

[LLVM][Intrinsics] Minor cleanup in getIntrinsicInfoTableEntries (#179317)

Change `IITValues` from SmallVector to a simple array, since its maximum
size is bounded and relatively small. As a result, using a SmallVector
for this array is not necessary.
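
The general idea of swapping a growable vector for fixed storage when the element count is bounded can be sketched as follows. This is not the code from `getIntrinsicInfoTableEntries`; the bound of 8, the nibble packing, and the names `MaxValues`, `Decoded`, and `decodeNibbles` are assumptions made for this example only.

```c++
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed bound for this sketch: a 32-bit word holds at most 8 nibbles.
constexpr std::size_t MaxValues = 8;

struct Decoded {
  std::array<uint8_t, MaxValues> Values{}; // fixed storage, no heap use
  std::size_t Count = 0;                   // number of valid entries
};

// Splits Word into 4-bit values, lowest nibble first, stopping once the
// remaining bits are all zero. At most MaxValues entries are written.
inline Decoded decodeNibbles(uint32_t Word) {
  Decoded D;
  while (Word != 0) {
    D.Values[D.Count++] = static_cast<uint8_t>(Word & 0xF);
    Word >>= 4;
  }
  return D;
}
```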
DeltaFile
+7-4llvm/lib/IR/Intrinsics.cpp
+7-41 files

LLVM/project 06a903ellvm/lib/Target/AMDGPU SIFoldOperands.cpp

[AMDGPU] Clear no convergence flag on operand folding. NFCI (#179438)

Clear the flag. Verification fails if it is set, because only convergent
operations may have the NoConvergent flag. NFCI for now, because this
case does not currently happen.
DeltaFile
+2-0llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+2-01 files

LLVM/project cb5e2dbllvm/lib/Target/SPIRV SPIRVInstructionSelector.cpp SPIRVLegalizerInfo.cpp, llvm/test/CodeGen/SPIRV/llvm-intrinsics sincos-opencl.ll sincos-glsl.ll

[SPIR-V] Add lowering for G_FSINCOS (#179053)

Use either OpenCL::sincos for compute or a sequence of HLSL::sin +
HLSL::cos for shaders.
DeltaFile
+60-0llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+42-0llvm/test/CodeGen/SPIRV/llvm-intrinsics/sincos-opencl.ll
+38-0llvm/test/CodeGen/SPIRV/llvm-intrinsics/sincos-glsl.ll
+3-0llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp
+143-04 files

LLVM/project 2d6dce4llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+5-3llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4-4llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+9-72 files

LLVM/project 3064291llvm/lib/Transforms/InstCombine InstCombineCalls.cpp, llvm/test/Transforms/InstCombine assume.ll assume-loop-align.ll

Reapply "[InstCombine] Always fold alignment assumptions into operand bundles (#177597)" (#179497)

Truncating at 32 bits is now avoided by removing a cast to `unsigned`.
This would also break at 64 bits (with a pointer size > 64 bits), but I
don't think LLVM supports such a thing.
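
To illustrate the kind of truncation mentioned above, the sketch below narrows a 64-bit alignment value through a 32-bit unsigned type. It is not the InstCombine code; `truncatedAlign` is a hypothetical helper, and `uint32_t` stands in for a 32-bit `unsigned`.

```c++
#include <cassert>
#include <cstdint>

// Hypothetical helper: narrows a 64-bit value to 32 bits, as a cast to a
// 32-bit `unsigned` would, then widens it back. Bits 32..63 are lost.
inline uint64_t truncatedAlign(uint64_t Align) {
  return static_cast<uint64_t>(static_cast<uint32_t>(Align));
}
```

Under these assumptions an alignment of `1ULL << 32` comes back as 0, while a small value such as 16 is unchanged.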

This reverts commit bc7315749d6d16d0f162f816b3ec0ef7169615f2.
DeltaFile
+44-49llvm/test/Transforms/InstCombine/assume.ll
+2-8llvm/test/Transforms/InstCombine/assume-loop-align.ll
+1-4llvm/test/Transforms/InstCombine/assume_inevitable.ll
+2-3llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+49-644 files

LLVM/project 10d3859llvm/lib/Target/ARM ARMISelLowering.cpp ARMCallingConv.cpp

ARM: Avoid using isTarget wrappers around Triple predicates (#179512)

These are module level properties, and querying them through
a function-level subtarget context is confusing. Plus we don't
need an aliased name.

Continue change started in 91439817e8d19613ac6e25ca9abd5e7534a9d33b
DeltaFile
+28-20llvm/lib/Target/ARM/ARMISelLowering.cpp
+3-2llvm/lib/Target/ARM/ARMCallingConv.cpp
+31-222 files

LLVM/project 7297d48llvm/lib/Transforms/Utils LoopUnroll.cpp LoopUnrollRuntime.cpp, llvm/test/Transforms/LoopUnroll loop-probability-one.ll

[LoopUnroll] Fix block frequencies for newly unconditional latches

As another step in issue #135812, this patch fixes block frequencies
when LoopUnroll converts a conditional latch in an unrolled loop
iteration to unconditional.  It thus includes complete loop unrolling
(the conditional backedge becomes an unconditional loop exit), which
might be applied to the original loop or to its remainder loop.

As explained in detail in the header comments on the
fixProbContradiction function that this patch introduces, these
conversions mean LoopUnroll has proven that the original uniform latch
probability is incorrect for the original loop iterations associated
with the converted latches.  However, LoopUnroll often is able to
perform these corrections for only some iterations, leaving other
iterations with the original latch probability, and thus corrupting
the aggregate effect on the total frequency of the original loop body.

This patch ensures that the total frequency of the original loop body,
summed across all its occurrences in the unrolled loop after the

    [27 lines not shown]
DeltaFile
+1,121-0llvm/test/Transforms/LoopUnroll/branch-weights-freq/unroll-complete.ll
+460-5llvm/lib/Transforms/Utils/LoopUnroll.cpp
+380-0llvm/test/Transforms/LoopUnroll/branch-weights-freq/unroll-partial-unconditional-latch.ll
+284-50llvm/test/Transforms/LoopUnroll/branch-weights-freq/unroll-epilog.ll
+122-85llvm/test/Transforms/LoopUnroll/loop-probability-one.ll
+2-2llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp
+2,369-1422 files not shown
+2,372-1438 files

LLVM/project 9b164edllvm/lib/Support KnownFPClass.cpp, llvm/test/Transforms/Attributor nofpclass-powi.ll

ValueTracking: Handle tracking nan through powi

NaNs should propagate simply; the infinity cases are complicated.
DeltaFile
+161-1llvm/test/Transforms/Attributor/nofpclass-powi.ll
+12-0llvm/lib/Support/KnownFPClass.cpp
+173-12 files

LLVM/project c3fb4ccllvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+5-3llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4-4llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+9-72 files

LLVM/project faa4b97llvm/test/Transforms/InstCombine and-or-icmps.ll canonicalize-selects-icmp-condition-bittest.ll, llvm/test/Transforms/PGOProfile chr.ll chr_coro.ll

[InstCombine] fold icmp ne (and X, 1), 0 --> trunc X to i1 (#178977)

Remove the vector check so this fold is always done.

proof: https://alive2.llvm.org/ce/z/oabD6J
closes #172888
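
The scalar form of this fold can be checked exhaustively for a small width. The sketch below is a plain C++ model rather than LLVM code; it assumes 8-bit values, and `icmpNeAnd1`, `truncToI1`, and `foldHoldsForAllI8` are hypothetical names. It models `trunc X to i1` as keeping only the lowest bit.

```c++
#include <cassert>
#include <cstdint>

// Model of `icmp ne (and X, 1), 0`.
inline bool icmpNeAnd1(uint8_t X) { return (X & 1u) != 0; }

// Model of `trunc X to i1`: shift the lowest bit into the top position of
// an 8-bit value, which discards every other bit.
inline bool truncToI1(uint8_t X) {
  return static_cast<uint8_t>(X << 7) == 0x80;
}

// Compares both forms for every 8-bit input.
inline bool foldHoldsForAllI8() {
  for (unsigned V = 0; V <= 0xFF; ++V) {
    uint8_t X = static_cast<uint8_t>(V);
    if (icmpNeAnd1(X) != truncToI1(X))
      return false;
  }
  return true;
}
```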
DeltaFile
+98-105llvm/test/Transforms/PGOProfile/chr.ll
+18-25llvm/test/Transforms/InstCombine/and-or-icmps.ll
+23-11llvm/test/Transforms/PGOProfile/chr_coro.ll
+16-16llvm/test/Transforms/InstCombine/canonicalize-selects-icmp-condition-bittest.ll
+12-18llvm/test/Transforms/InstCombine/load-cmp.ll
+12-17llvm/test/Transforms/InstCombine/icmp-and-shift.ll
+179-19215 files not shown
+215-25521 files

LLVM/project c0827d3llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+5-3llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4-4llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+9-72 files

LLVM/project 7798a89clang/lib/Sema HLSLBuiltinTypeDeclBuilder.cpp SemaHLSL.cpp, clang/test/AST/HLSL ByteAddressBuffers-AST.hlsl

Implement `ByteAddressBuffer` Load/Store methods (#176058)

Closes #108058.

This PR:
- Adds the `uint` `Load` and `Store` methods (`Load/Store`,
`Load2/Store2`, `Load3/Store3`, `Load4/Store4`) to the existing
`ByteAddressBuffer` objects
- Adds the new templated `Load` and `Store` methods to
`ByteAddressBuffer` objects, which allow types other than `uint` (e.g.
aggregate types) to be used with them directly
- One exception to this is array types, which are rejected by the
methods (as array returns will be disallowed in 202x)
- Adds the relevant `AST`, `CodeGenHLSL`, and `SemaHLSL` tests for these
methods

*Note: the `HLSL Tests` check is failing because this implementation
makes the `ByteAddressBuffer` tests XPASS. Will remove the XFAILs from
these tests in a follow-up.*
DeltaFile
+264-3clang/test/AST/HLSL/ByteAddressBuffers-AST.hlsl
+161-23clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.cpp
+160-0clang/test/CodeGenHLSL/resources/ByteAddressBuffers-methods.hlsl
+50-0clang/lib/Sema/SemaHLSL.cpp
+44-0clang/test/SemaHLSL/BuiltIns/ByteAddressBuffers.hlsl
+9-3clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.h
+688-296 files not shown
+721-4012 files

LLVM/project e9ca496clang/include/clang/Analysis/Scalable/Serialization SerializationFormat.h, clang/lib/Analysis/Scalable/Serialization SerializationFormat.cpp

[clang][ssaf] Add FormatInfo sub-registry and tests [3/3]

Add `FormatInfoEntry` template to support per-analysis-type serialization
within a `SerializationFormat`.
This allows different formats to be implemented for the different
analyses in a decoupled way.

For testing, this patch also implements the MockSerializationFormat
demonstrating the FormatInfo sub-registry pattern.

Assisted-by: claude
DeltaFile
+94-3clang/unittests/Analysis/Scalable/Registries/MockSerializationFormat.cpp
+87-0clang/unittests/Analysis/Scalable/Registries/FancyAnalysisData.cpp
+56-0clang/unittests/Analysis/Scalable/Registries/SerializationFormatRegistryTest.cpp
+15-2clang/include/clang/Analysis/Scalable/Serialization/SerializationFormat.h
+15-0clang/unittests/Analysis/Scalable/Registries/MockSerializationFormat.h
+9-0clang/lib/Analysis/Scalable/Serialization/SerializationFormat.cpp
+276-51 files not shown
+277-57 files

LLVM/project cbbb877mlir/lib/IR SymbolTable.cpp, mlir/test/Dialect/GPU invalid.mlir

[MLIR] Enforce symbol visibility during symbol lookup (#179370)

Update symbol resolution to examine whether a nested symbol being
resolved is private, and fail in that case. This ensures that we
maintain invariants on symbol visibility that we depend on in
optimisations.
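
A toy model of the rule, assuming that only components after the first one in a nested reference are checked for visibility, might look like this. It is not the `SymbolTable.cpp` change; the flat `SymbolTable` map and the `resolves` function are hypothetical stand-ins.

```c++
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical flat symbol table: fully qualified name -> "is private".
using SymbolTable = std::map<std::string, bool>;

// Resolves Path (e.g. {"outer", "inner"}) one component at a time.
// Fails if a component is missing or, by assumption, if a component
// after the first one is private.
inline bool resolves(const SymbolTable &Table,
                     const std::vector<std::string> &Path) {
  std::string Qualified;
  for (std::size_t I = 0; I < Path.size(); ++I) {
    if (I > 0)
      Qualified += "::";
    Qualified += Path[I];
    auto It = Table.find(Qualified);
    if (It == Table.end())
      return false; // no such symbol
    if (I > 0 && It->second)
      return false; // nested symbol is private
  }
  return !Path.empty();
}
```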
DeltaFile
+25-0mlir/test/IR/test-symbol-uses.mlir
+3-1mlir/lib/IR/SymbolTable.cpp
+4-0mlir/test/lib/IR/TestSymbolUses.cpp
+1-1mlir/test/Dialect/GPU/invalid.mlir
+33-24 files

LLVM/project 4c936dcclang/include/clang/Analysis/Scalable/Serialization SerializationFormatRegistry.h SerializationFormat.h, clang/lib/Analysis/Scalable/Serialization SerializationFormatRegistry.cpp

[clang][ssaf] Add SerializationFormatRegistry [2/3]

Add a registry infrastructure for SerializationFormat implementations,
enabling registration and instantiation of different serialization formats.

For example:
```c++
  static SerializationFormatRegistry::Add<MyFormat>
    RegisterFormat("MyFormat", "Description");
```

Formats can then be instantiated by name using `makeFormat()`.
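
The registration pattern can be sketched with the standard library alone. This is a stand-in written for illustration, not the registry added by the patch: `Format`, `FormatRegistry`, and the `makeFormat` signature below are assumptions, and the description argument from the snippet above is left out.

```c++
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical base class standing in for SerializationFormat.
struct Format {
  virtual ~Format() = default;
  virtual std::string name() const = 0;
};

// Hypothetical name -> factory registry.
class FormatRegistry {
public:
  using Factory = std::function<std::unique_ptr<Format>()>;

  // Constructing an Add<T> object registers T under Name.
  template <typename T> struct Add {
    explicit Add(const std::string &Name) {
      table()[Name] = [] { return std::unique_ptr<Format>(new T()); };
    }
  };

  // Returns a new instance of the named format, or nullptr if unknown.
  static std::unique_ptr<Format> makeFormat(const std::string &Name) {
    auto It = table().find(Name);
    if (It == table().end())
      return nullptr;
    return It->second();
  }

private:
  // A function-local static avoids static-initialization-order problems.
  static std::map<std::string, Factory> &table() {
    static std::map<std::string, Factory> Table;
    return Table;
  }
};

struct MyFormat : Format {
  std::string name() const override { return "MyFormat"; }
};

// Registration happens during static initialization.
static FormatRegistry::Add<MyFormat> RegisterFormat("MyFormat");
```

With this sketch, `FormatRegistry::makeFormat("MyFormat")` yields a `MyFormat` instance and an unregistered name yields a null pointer.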

The patch also updates the SerializationFormat base class to accept
FileSystem and OutputBackend parameters for virtualizing I/O
operations.

Assisted-by: claude
DeltaFile
+73-0clang/include/clang/Analysis/Scalable/Serialization/SerializationFormatRegistry.h
+39-0clang/unittests/Analysis/Scalable/Registries/MockSerializationFormat.cpp
+38-0clang/lib/Analysis/Scalable/Serialization/SerializationFormatRegistry.cpp
+30-0clang/unittests/Analysis/Scalable/Registries/MockSerializationFormat.h
+29-0clang/unittests/Analysis/Scalable/Registries/SerializationFormatRegistryTest.cpp
+10-0clang/include/clang/Analysis/Scalable/Serialization/SerializationFormat.h
+219-03 files not shown
+227-09 files

LLVM/project a6598d9llvm/include/llvm/Support VirtualOutputBackends.h VirtualOutputFile.h, llvm/lib/Support VirtualOutputBackends.cpp VirtualOutputFile.cpp

[llvm][Support] Add InMemoryOutputBackend [1/3]

Add InMemoryOutputBackend, an output backend that creates files in
memory backed by string buffers in a map.
This is useful for unittests, where we don't want to create files on the
file system, but still want to check the content of the created files.
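
The idea of files backed by string buffers in a map can be sketched as below. This is a simplified stand-in and not the `InMemoryOutputBackend` interface; `InMemoryFiles`, its methods, and `emitReport` are hypothetical names.

```c++
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in: "files" are string buffers stored in a map keyed
// by path, so nothing touches the real file system.
class InMemoryFiles {
public:
  // Stores Content under Path, replacing any earlier content.
  void write(const std::string &Path, const std::string &Content) {
    Buffers[Path] = Content;
  }

  bool exists(const std::string &Path) const {
    return Buffers.count(Path) != 0;
  }

  // Returns the stored content, or an empty string for an unknown path.
  std::string read(const std::string &Path) const {
    auto It = Buffers.find(Path);
    return It == Buffers.end() ? std::string() : It->second;
  }

private:
  std::map<std::string, std::string> Buffers;
};

// Code under test writes through the backend instead of opening a file.
inline void emitReport(InMemoryFiles &Out) { Out.write("report.txt", "ok\n"); }
```

A unit test can then call `emitReport` and inspect the buffer for `report.txt` directly.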

Assisted-by: claude
DeltaFile
+55-0llvm/unittests/Support/VirtualOutputBackendsTest.cpp
+30-0llvm/include/llvm/Support/VirtualOutputBackends.h
+18-0llvm/include/llvm/Support/VirtualOutputFile.h
+5-0llvm/lib/Support/VirtualOutputBackends.cpp
+2-0llvm/lib/Support/VirtualOutputFile.cpp
+110-05 files

LLVM/project 59eb721llvm/lib/Target/ARM ARMISelLowering.cpp ARMCallingConv.cpp

ARM: Avoid using isTarget wrappers around Triple predicates

These are module level properties, and querying them through
a function-level subtarget context is confusing. Plus we don't
need an aliased name.

Continue change started in 91439817e8d19613ac6e25ca9abd5e7534a9d33b
DeltaFile
+28-20llvm/lib/Target/ARM/ARMISelLowering.cpp
+3-2llvm/lib/Target/ARM/ARMCallingConv.cpp
+31-222 files

LLVM/project 5586d4allvm/lib/Target/X86 X86ISelLowering.cpp

[X86] mayFoldIntoVector - recognise larger than legal logic ops may fold to vectors (#179503)

Inspired by the hack to #174761 - move the custom operation handling
inside mayFoldIntoVector where we can more accurately predict ops that
can be moved to the vector unit
DeltaFile
+6-4llvm/lib/Target/X86/X86ISelLowering.cpp
+6-41 files

LLVM/project cc42144llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel unmerge-sgpr-s16.mir

AMDGPU/GlobalISel: Fix sgpr s16 unmerge lowering in regbanklegalize

Used to fail EXPENSIVE_CHECKS because of type mismatch.
DeltaFile
+5-3llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4-4llvm/test/CodeGen/AMDGPU/GlobalISel/unmerge-sgpr-s16.mir
+9-72 files