LLVM/project 3b51279clang/lib/Driver/ToolChains Gnu.cpp, clang/test/Driver amdgpu-unwind.cl

[Clang] Default to async unwind tables for amdgcn (#183148)

To avoid codegen changes when enabling debug-info (see
https://bugs.llvm.org/show_bug.cgi?id=37240) we want to
enable unwind tables by default.

There is some pessimization in post-prologepilog scheduling, and a
general solution to the problem of CFI_INSTRUCTION-as-scheduling-barrier
should be explored.

Change-Id: I83625875966928c7c4411cd7b95174dc58bda25a
DeltaFile
+26-0clang/test/Driver/amdgpu-unwind.cl
+1-0clang/lib/Driver/ToolChains/Gnu.cpp
+27-02 files

LLVM/project 3548cdaclang/include/clang/ScalableStaticAnalysisFramework/Core/Serialization SerializationFormat.h

Fix MSVC template parsing error in SerializationFormat (#196571)

This commit fixes a hard compilation error on Windows (when building with
Clang's MSVC compatibility mode) and a subsequent access violation that 
occurred during Windows CI testing.

Root Causes:
1. When compiling with `-fms-compatibility`, Clang's two-phase template
   lookup fails to resolve function-local static variables (`SavedSerialize` 
   and `SavedDeserialize`) captured by a local class (`ConcreteCodec`) inside 
   an uninstantiated template. It incorrectly assumes they are members of a 
   dependent base class.
2. Originally, `TypedSerializerFn` and `DeserializerFn` were typed as
   `llvm::function_ref`. Storing these in static variables created dangling 
   pointers, as `function_ref` is a non-owning wrapper that only referenced 
   the temporaries decaying on the constructor's stack, causing an 0xC0000005 
   access violation on x64 Windows.
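
The second root cause can be sketched in isolation. This is not the elided fix itself, only a minimal illustration under assumptions: `SerializerFn`, `SavedCodec`, and `makeCodec` are hypothetical names, and `std::function` stands in for whatever owning callable type the patch actually uses. The point is that an owning wrapper copies the callable, so a stored callback stays valid after the temporary that created it is gone.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical callback type. std::function owns a copy of its target,
// unlike a non-owning reference wrapper such as llvm::function_ref.
using SerializerFn = std::function<std::string(int)>;

// Hypothetical codec that keeps its callback for later use.
class SavedCodec {
public:
  explicit SavedCodec(SerializerFn Fn) : Serialize(std::move(Fn)) {
    assert(Serialize && "a callable target is required");
  }

  std::string run(int Value) const { return Serialize(Value); }

private:
  SerializerFn Serialize; // Owning copy; safe to keep in long-lived storage.
};

// The lambda below is a temporary. Because SavedCodec stores an owning
// copy (including the captured Prefix), the codec can outlive it.
inline SavedCodec makeCodec(const std::string &Prefix) {
  return SavedCodec(
      [Prefix](int Value) { return Prefix + std::to_string(Value); });
}
```

Had `Serialize` been a non-owning reference to the lambda, calling `run` after `makeCodec` returned would read a destroyed object, which is the kind of dangling access described above.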

The Fix:

    [11 lines not shown]
DeltaFile
+18-12clang/include/clang/ScalableStaticAnalysisFramework/Core/Serialization/SerializationFormat.h
+18-121 files

LLVM/project c2ce236clang/lib/Sema SemaLifetimeSafety.h

[LifetimeSafety] Expand diagnostic list that enables analysis (#198599)

Now, when any lifetime safety related diagnostic is not ignored, we run
the analysis.

No tests were added since this does not add new functionality.
DeltaFile
+7-1clang/lib/Sema/SemaLifetimeSafety.h
+7-11 files

LLVM/project 3fdbee1llvm/lib/Target/NVPTX NVVMIntrRange.cpp, llvm/test/CodeGen/NVPTX reqnctapercluster-const-fold.ll intr-range.ll

[NVPTX] Constant fold clusterDim when reqnctapercluster is specified (#195967)

This is a follow-up of https://github.com/llvm/llvm-project/pull/191575.

Currently, NVPTX cannot fold the `cluster_nctaid.x/y/z` and
`cluster_nctarank` intrinsic calls into const values when
`reqnctapercluster` is specified, which prevents the code from further
optimization.

Therefore, in this change, we extend the `NVVMIntrRange` pass to:

- Tighten `cluster_nctaid.x/y/z` intrinsic calls to one value range,
which can be const folded in later InstCombine pass
- Tighten `cluster_nctarank` intrinsic calls to one value range when
`cluster_dim` is specified
- Tighten `cluster_ctaid.x/y/z` range attributes to use per-dimension
`cluster_dim` bounds
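
As a rough illustration of the three bullets above, the ranges can be modeled as half-open intervals derived from the cluster dimensions. This is a sketch, not the pass's code: `Range`, `ClusterDim`, and the helper names are hypothetical, and it assumes the usual PTX meaning of the special registers (`cluster_nctaid` is the per-dimension cluster size, `cluster_nctarank` the total CTA count in the cluster, `cluster_ctaid` the CTA index within the cluster).

```cpp
#include <cassert>
#include <cstdint>

// Half-open value range [Lo, Hi), in the spirit of an IR range attribute.
struct Range {
  uint64_t Lo;
  uint64_t Hi;
  // A range holding exactly one value can be folded to a constant.
  bool isSingleValue() const { return Hi - Lo == 1; }
};

// Hypothetical holder for the required cluster dimensions.
struct ClusterDim {
  uint32_t X, Y, Z;
};

// cluster_nctaid.<dim>: always equals the cluster extent in that dimension.
inline Range nctaidRange(uint32_t Extent) {
  assert(Extent > 0 && "cluster extent must be positive");
  return {Extent, static_cast<uint64_t>(Extent) + 1};
}

// cluster_nctarank: always equals the total number of CTAs in the cluster.
inline Range nctarankRange(ClusterDim D) {
  uint64_t Total = static_cast<uint64_t>(D.X) * D.Y * D.Z;
  return {Total, Total + 1};
}

// cluster_ctaid.<dim>: an index bounded by that dimension's extent.
inline Range ctaidRange(uint32_t Extent) { return {0, Extent}; }
```

The single-value ranges are what would let a later folding step replace the first two kinds of call with constants; the third is only bounded, not constant.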
DeltaFile
+98-0llvm/test/CodeGen/NVPTX/reqnctapercluster-const-fold.ll
+39-24llvm/lib/Target/NVPTX/NVVMIntrRange.cpp
+6-6llvm/test/CodeGen/NVPTX/intr-range.ll
+143-303 files

LLVM/project 76edbfdclang/lib/Format TokenAnnotator.cpp, clang/unittests/Format FormatTest.cpp

[clang-format] Harden annotation of operator keywords (#196768)

The star was already annotated as TT_PointerOrReference; just overwrite
it to avoid crashing. Also remove the annotation above, since it would
always be overwritten (or at least I don't see a case where it would
not be, and no test fails).

Fixes #196054.
DeltaFile
+2-1clang/lib/Format/TokenAnnotator.cpp
+1-0clang/unittests/Format/FormatTest.cpp
+3-12 files

LLVM/project 86b58f5libc/src/__support/mathvec expf.h, libc/test/src/mathvec CMakeLists.txt

[libc][mathvec] Add exhaustive tester for SIMD math routines (#189488)

An exhaustive tester based on the scalar version.

Uses LIBC scalar math routines as a reference rather than MPFR.

Also corrects a missed 1 ULP value in expf when the target doesn't
support FMAs.
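
For readers unfamiliar with this style of test, here is a minimal sketch of the idea, not the tester added in this commit. `orderedBits`, `ulpDistance`, and `maxUlpError` are hypothetical helpers: the loop walks a range of float bit patterns and measures how far a candidate routine is from a reference routine in ULPs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Maps a float to an integer that is ordered the same way as the floats.
inline int64_t orderedBits(float F) {
  uint32_t U;
  std::memcpy(&U, &F, sizeof U);
  // Floats are sign-magnitude, so negative values map to negated magnitudes.
  return (U & 0x80000000u) ? -static_cast<int64_t>(U & 0x7fffffffu)
                           : static_cast<int64_t>(U);
}

// Number of representable floats between A and B (0 when they are equal).
inline uint64_t ulpDistance(float A, float B) {
  assert(!std::isnan(A) && !std::isnan(B));
  int64_t D = orderedBits(A) - orderedBits(B);
  return D < 0 ? static_cast<uint64_t>(-D) : static_cast<uint64_t>(D);
}

// Evaluates both routines on every bit pattern in [First, Last] and
// returns the worst ULP error of Candidate relative to Reference.
template <class CandidateFn, class ReferenceFn>
uint64_t maxUlpError(uint32_t First, uint32_t Last, CandidateFn Candidate,
                     ReferenceFn Reference) {
  uint64_t Worst = 0;
  // 64-bit counter so the loop terminates even when Last is 0xffffffff.
  for (uint64_t Bits = First; Bits <= Last; ++Bits) {
    uint32_t U = static_cast<uint32_t>(Bits);
    float X;
    std::memcpy(&X, &U, sizeof X);
    float C = Candidate(X);
    float R = Reference(X);
    if (std::isnan(R))
      continue; // Inputs with a NaN reference result are skipped here.
    if (std::isnan(C))
      return UINT64_MAX; // Unexpected NaN counts as the worst failure.
    Worst = std::max(Worst, ulpDistance(C, R));
  }
  return Worst;
}
```

Passing `0` and `0xffffffff` as the bounds would cover every float input, which is what makes such a test exhaustive.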
DeltaFile
+200-0libc/test/src/mathvec/exhaustive/exhaustive_test.h
+32-0libc/test/src/mathvec/exhaustive/expf_test.cpp
+25-0libc/test/src/mathvec/exhaustive/CMakeLists.txt
+6-0libc/src/__support/mathvec/expf.h
+4-0libc/test/src/mathvec/CMakeLists.txt
+267-05 files

LLVM/project 528c97bclang/include/clang/CIR/Dialect/Builder CIRBaseBuilder.h, clang/include/clang/CIR/Dialect/IR CIROps.td

[CIR] Fix get_method callee type for member pointer calls (#198358)

Member-pointer calls through `cir.get_method` were lowering to an indirect
callee type that still listed the member function's implicit `this`
parameter after `createGetMethod` had already prepended the adjusted
`void*` receiver.  A call like `(obj->*pmf)(arg)` therefore carried a
three-parameter `var_callee_type` but only two argument operands, and
`-fclangir -emit-llvm` failed LLVM's variadic-call verifier with
`expected var_callee_type to have at most N parameters`. Classic codegen
emits `(ptr, …)` for the same pattern.
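
The parameter-count mismatch can be modeled with plain lists of type names. This is a toy sketch under assumptions, not the CIR builder code: `TypeList` and `buildCalleeParams` are hypothetical, and types are represented as strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

using TypeList = std::vector<std::string>;

// MemberParams models the member function's parameter list with the
// implicit `this` in slot 0. The callee type should start with the
// adjusted `void*` receiver and then list only the explicit parameters.
inline TypeList buildCalleeParams(const TypeList &MemberParams) {
  assert(!MemberParams.empty() && "expected implicit this in slot 0");
  TypeList Callee;
  Callee.push_back("void*"); // Adjusted receiver replaces `this`.
  // Skip slot 0 so `this` is not listed a second time.
  Callee.insert(Callee.end(), MemberParams.begin() + 1, MemberParams.end());
  return Callee;
}
```

For a member taking one `int`, this yields two parameters, matching the two operands of `(obj->*pmf)(arg)`. Copying the whole list after the receiver would yield three, which is the mismatch described above.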

The libc++ sweep had one remaining `frontend-crash-other` bucket hit on
`F_nullptr.pass.cpp`, which boils down to `__builtin_invoke` on a
varargs member function pointer — the same callee/operand mismatch in a
minimal repro.

The fix skips the implicit-`this` slot when cloning the member signature
into the callee function type in `createGetMethod`, and tightens

    [3 lines not shown]
DeltaFile
+27-0clang/test/CIR/CodeGen/builtin-invoke-varargs-member.cpp
+9-9clang/test/CIR/CodeGen/pointer-to-member-func.cpp
+6-3clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
+5-1clang/lib/CIR/Dialect/IR/CIRDialect.cpp
+2-2clang/include/clang/CIR/Dialect/IR/CIROps.td
+49-155 files

LLVM/project 98227f5clang/lib/CIR/Dialect/IR CIRTypes.cpp, clang/test/CIR/CodeGen empty-union.cpp

[CIR] Guard union ABI alignment when getLargestMember is empty (#198340)

Padding-only unions (an empty union lowered as a single `!u8i`
padding member) leave `getLargestMember()` null when CIRGen walks
record layout through MLIR's DataLayout API. `RecordType::getABIAlignment`
then passed that null `Type` into `getTypeABIAlignment` and crashed.
This showed up compiling libc++ types such as
`std::__variant_detail::__union` nested under `common_iterator`.

Return ABI alignment `1` when there is no largest member, matching a
byte-padded empty union.  This parallels how empty unions are already
handled for size (`getTypeSizeInBits` uses zero size in that situation).
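
The shape of the guard can be sketched as follows. This is an illustration only, with hypothetical names (`MemberType`, `unionABIAlignment`) and `std::optional` standing in for a possibly-null MLIR `Type`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a member type that knows its ABI alignment.
struct MemberType {
  uint64_t ABIAlign;
};

// Returns the union's ABI alignment. A padding-only union has no largest
// member, so fall back to byte alignment instead of querying a null type.
inline uint64_t unionABIAlignment(const std::optional<MemberType> &Largest) {
  if (!Largest)
    return 1;
  assert(Largest->ABIAlign != 0 && "alignment must be nonzero");
  return Largest->ABIAlign;
}
```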

Regression coverage adds a nested-union global in `empty-union.cpp`.
DeltaFile
+28-5clang/test/CIR/CodeGen/empty-union.cpp
+6-2clang/lib/CIR/Dialect/IR/CIRTypes.cpp
+34-72 files

LLVM/project bdead96mlir/include/mlir/Target/LLVMIR ModuleTranslation.h, mlir/lib/Dialect/LLVMIR/IR LLVMDialect.cpp

[mlir][LLVMIR] Allow address-of-global as a leaf in array constants (#198424)

Large `llvm.mlir.global` initializers built as nested `llvm.insertvalue`
chains make `LLVMModuleTranslation::convertGlobalsAndAliases` call
`ConstantFoldInsertValueInstruction` on every step, rebuilding the
whole `ConstantArray` each time. That is O(N²) in the number of
elements and shows up as multi-minute compiles on translation units with
huge pointer tables (SPEC CPU 2026 `gcc/insn-automata.cc` is the
motivating case; Eric Keane's `convertOperationImpl` profile matches
this path).

This change lets `llvm.mlir.constant` carry an `ArrayAttr` of
`FlatSymbolRefAttr` leaves that name globals (not just functions), adds
a name-keyed global map beside the existing op-keyed map, and resolves
those refs in `getLLVMConstant`.  A translate test checks the resulting
single LLVM constant array initializer.
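
To make the cost difference concrete, here is a small model that counts element copies. It is a sketch under assumptions rather than the translation code: a `std::vector<int>` stands in for an immutable constant array, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Builds an N-element aggregate through a chain of single-element
// inserts, where every insert produces a fresh copy of the whole
// aggregate. Returns the number of element copies performed.
inline std::size_t copiesViaInsertChain(std::size_t N) {
  std::vector<int> Agg(N, 0);
  std::size_t Copies = 0;
  for (std::size_t I = 0; I < N; ++I) {
    std::vector<int> Next(Agg); // Rebuild the entire aggregate.
    Copies += Next.size();
    Next[I] = static_cast<int>(I);
    Agg = std::move(Next);
  }
  return Copies;
}

// Builds the same aggregate once from a flat list of leaves.
// Returns the number of elements written.
inline std::size_t copiesViaSingleConstant(std::size_t N) {
  std::vector<int> Agg;
  Agg.reserve(N);
  for (std::size_t I = 0; I < N; ++I)
    Agg.push_back(static_cast<int>(I));
  assert(Agg.size() == N);
  return Agg.size();
}
```

For 100 elements the chain performs 10000 element copies while the flat form writes 100, which is the quadratic versus linear gap the commit targets.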
DeltaFile
+43-10mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
+19-0mlir/test/Target/LLVMIR/llvmir-global-addressof-leaf.mlir
+13-1mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
+11-0mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h
+10-0mlir/test/Dialect/LLVMIR/invalid.mlir
+96-115 files

LLVM/project 22972efflang/lib/Optimizer/HLFIR/Transforms OptimizedBufferization.cpp, flang/test/HLFIR opt-bufferization-skip-volatile.fir

[flang] Recognize effects on non-addressable resources in opt-bufferization (#198051)

opt-bufferization has been only handling `fir::DebuggingResource`
explicitly. This patch adds support for other non-addressable resources,
such as `fir::VolatileMemoryResource`. This allows merging
elemental/assign for the `volatile_src_nonvolatile_dst` example in the
updated LIT test.
DeltaFile
+119-23flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp
+6-8flang/test/HLFIR/opt-bufferization-skip-volatile.fir
+125-312 files

LLVM/project c357518llvm/docs CIBestPractices.rst

[Docs] Recommend Container Pinning (#198572)

Add some info to CI Best Practices about pinning container images to a
specific image SHA, which we agreed was a best practice in #197315 (and
maybe somewhere else, but I cannot find anything).

This updates the best practices but does not currently attempt to
actually fix all the cases where we are using unpinned container images.
DeltaFile
+17-4llvm/docs/CIBestPractices.rst
+17-41 files

LLVM/project ed89f08llvm/include/llvm/Object ELFObjectFile.h, llvm/lib/DWP DWP.cpp

[llvm-dwp] Fix incorrect ELF OS/ABI in DWP output (#198486)

I received a report internally that
https://github.com/llvm/llvm-project/pull/192112 caused issues with
lldb. LLDB was not able to load the dwp files because of the OS mismatch
between the binary and the dwp file.

Investigating, it turns out that the refactor caused DWPWriter to use the
result of `ELFObjectFileBase::getOS()` as the output OS/ABI, but getOS()
returns `Triple::OSType`, not the raw `e_ident[EI_OSABI]` byte. These
enums have different numbering :( oops.

This caused certain tools that validate OS/ABI consistency between a
binary and its DWP to reject the debug info.
Fix by adding getEIdentOSABI() to ELFObjectFileBase (parallel to
getEIdentABIVersion()) and using it instead of getOS().
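
The enum mix-up can be reproduced in miniature. In this sketch the `ElfOsAbi` values are the standard ELF ones (`ELFOSABI_NONE` = 0, `ELFOSABI_LINUX` = 3, `ELFOSABI_FREEBSD` = 9), while `AbstractOS` is a made-up enum whose numbering is an assumption and does not reflect the real `Triple::OSType` values. `ObjectFile` and the two output helpers are likewise hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Raw e_ident[EI_OSABI] values as defined by the ELF specification.
enum ElfOsAbi : uint8_t { OSABI_None = 0, OSABI_Linux = 3, OSABI_FreeBSD = 9 };

// Hypothetical abstract OS enum with unrelated numbering.
enum class AbstractOS : uint8_t { Unknown = 0, FreeBSD = 1, Linux = 2 };

// Minimal model of an input object file.
struct ObjectFile {
  uint8_t IdentOSABI;

  // Abstract view of the OS, for callers that do not want raw bytes.
  AbstractOS getOS() const {
    switch (IdentOSABI) {
    case OSABI_Linux:
      return AbstractOS::Linux;
    case OSABI_FreeBSD:
      return AbstractOS::FreeBSD;
    default:
      return AbstractOS::Unknown;
    }
  }

  // Raw byte, suitable for copying into another ELF header.
  uint8_t getEIdentOSABI() const { return IdentOSABI; }
};

// Bug pattern: reinterpret the abstract enum as the raw header byte.
inline uint8_t buggyOutputOSABI(const ObjectFile &In) {
  return static_cast<uint8_t>(In.getOS());
}

// Fixed pattern: forward the raw header byte unchanged.
inline uint8_t fixedOutputOSABI(const ObjectFile &In) {
  return In.getEIdentOSABI();
}
```

With a Linux input, the fixed helper writes 3 while the buggy one writes 2 in this model, so a consumer comparing the two headers would see a mismatch.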

Assisted-by: Claude
DeltaFile
+25-0llvm/test/tools/llvm-dwp/X86/osabi.test
+7-0llvm/include/llvm/Object/ELFObjectFile.h
+1-1llvm/lib/DWP/DWP.cpp
+33-13 files

LLVM/project 861baealldb/tools/lldb-dap/extension/src process-tree.ts

[lldb-dap] Add missing `arguments` field to LldbDapProcessEntry (#198597)

The TypeScript interface was missing the optional `arguments` field that
`parseListProcessesOutput` reads and `pick-process` displays, breaking
the extension build.
DeltaFile
+1-0lldb/tools/lldb-dap/extension/src/process-tree.ts
+1-01 files

LLVM/project 41c45a2clang-tools-extra/clangd/refactor/tweaks DefineOutline.cpp, clang-tools-extra/clangd/unittests/tweaks DefineOutlineTests.cpp

[clangd] Let DefineOutline tweak create a definition from scratch (#71950)

Fixes https://github.com/clangd/clangd/issues/445
DeltaFile
+72-11clang-tools-extra/clangd/refactor/tweaks/DefineOutline.cpp
+26-6clang-tools-extra/clangd/unittests/tweaks/DefineOutlineTests.cpp
+3-0clang-tools-extra/docs/ReleaseNotes.rst
+101-173 files

LLVM/project f037e17lldb/source/Commands CommandObjectTarget.cpp

[lldb] Don't require a real target for `target modules list -g` (#198594)

The `-g` flag lists the global module list, which doesn't need a target.
Switch to eCommandAllowsDummyTarget and error out explicitly in
DoExecute on the non-global paths when no real target is selected.

Fixes a regression introduced by #198429.
DeltaFile
+8-2lldb/source/Commands/CommandObjectTarget.cpp
+8-21 files

LLVM/project 70f8c7bllvm/test/CodeGen/AMDGPU atomic_optimizations_global_pointer.ll atomic_optimizations_local_pointer.ll, llvm/test/MachineVerifier/AMDGPU dpp-sgpr-src1.mir

[AMDGPU] Disable dpp src1 sgpr on gfx11 (#164241)

https://github.com/llvm/llvm-project/pull/67461 enabled SGPRs as src1 by
default for all dpp opcodes with manual checks for targets where this is
not supported. In that case, isOperandLegal checked if the second
operand is legal as src0.
https://github.com/llvm/llvm-project/pull/155595 disabled this check by
removing the calls to isOperandLegal, which resulted in SGPRs being used
as operands for src1 on gfx11. This PR reenables this check and fixes
the lit test.

---------

Co-authored-by: Paul Trojahn <paul.trojahn at amd.com>
DeltaFile
+83-79llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
+70-66llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
+26-0llvm/test/CodeGen/AMDGPU/si-fold-operands-gfx11.mir
+18-3llvm/test/CodeGen/AMDGPU/dpp_combine.ll
+20-0llvm/test/MachineVerifier/AMDGPU/dpp-sgpr-src1.mir
+7-5llvm/test/CodeGen/AMDGPU/dpp_combine_gfx11.mir
+224-1532 files not shown
+235-1618 files

LLVM/project b16e3a0libc/startup/linux/aarch64 tls.cpp

[libc] Remove broken __builtin_aarch64_wsr fallback in set_thread_ptr (#197295)

The fallback used __builtin_aarch64_wsr (32-bit) instead of
__builtin_aarch64_wsr64, truncating the 64-bit thread pointer value and
causing non-deterministic runtime crashes.
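
The effect of the narrower write can be shown without any AArch64 builtins. The sketch below only models the value conversion; the function names are hypothetical and no system register is touched.

```cpp
#include <cassert>
#include <cstdint>

// Models passing a 64-bit thread pointer through a 32-bit parameter:
// the upper 32 bits are discarded by the conversion.
inline uint64_t valueAfter32BitWrite(uint64_t ThreadPtr) {
  return static_cast<uint32_t>(ThreadPtr);
}

// Models the 64-bit variant, which preserves the full value.
inline uint64_t valueAfter64BitWrite(uint64_t ThreadPtr) { return ThreadPtr; }

// True when the pointer happens to fit in 32 bits, so truncation is
// invisible for that particular address.
inline bool survives32BitWrite(uint64_t ThreadPtr) {
  return valueAfter32BitWrite(ThreadPtr) == ThreadPtr;
}
```

In this model, whether the truncation is visible depends on where the TLS block was allocated. That is one plausible reading of why the crashes looked non-deterministic, though the commit message does not say so.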

Modern GCC correctly warns about it and -Werror=conversion catches it.


```
/var/tmp/portage/llvm-runtimes/libc-22.1.5/work/libc/startup/linux/aarch64/tls.cpp: In function ‘bool __llvm_libc_22_1_5_::set_thread_ptr(uintptr_t)’:
/var/tmp/portage/llvm-runtimes/libc-22.1.5/work/libc/startup/linux/aarch64/tls.cpp:90:38: error: conversion from ‘uintptr_t’ {aka ‘long unsigned int’} to ‘unsigned int’ may change value [-Werror=conversion]
   90 |   __builtin_aarch64_wsr("tpidr_el0", val);
      |                                      ^~~
cc1plus: all warnings being treated as errors
```
DeltaFile
+2-2libc/startup/linux/aarch64/tls.cpp
+2-21 files

LLVM/project e13d9c2clang/lib/CIR/CodeGen CIRGenAtomic.cpp, clang/test/CIR/CodeGen atomic.c

[CIR] Implement atomic cmp exchange with non-const 'weak' lowering (#198546)

This was left as an NYI, but appears in self build!

This patch follows the existing solution in that we are doing the
branching of weak vs not-weak at the CIR level. This is necessary
because the LLVM intrinsics (and the CIR operations) take 'weak' as a
constant value.

Unlike classic-codegen, this patch uses an 'if' instead of a 'switch' on
the 'weak' value. This is mainly for readability (since it is a switch
over a bool!), but also because our 'switch' doesn't seem to support
'bool', so this would require an additional cast.
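
At the source level, the same shape looks like this. It is an analogy using `std::atomic`, not the CIR lowering: `compareExchange` is a hypothetical helper, and sequentially consistent ordering is assumed for both the success and failure cases.

```cpp
#include <atomic>
#include <cassert>

// Compare-exchange where the weak/strong choice is only known at run
// time. Each underlying operation has its weakness fixed at compile
// time, so the runtime flag selects between two separate operations.
inline bool compareExchange(std::atomic<int> &Obj, int &Expected, int Desired,
                            bool IsWeak) {
  if (IsWeak)
    return Obj.compare_exchange_weak(Expected, Desired,
                                     std::memory_order_seq_cst,
                                     std::memory_order_seq_cst);
  return Obj.compare_exchange_strong(Expected, Desired,
                                     std::memory_order_seq_cst,
                                     std::memory_order_seq_cst);
}
```

Note that the weak form may fail spuriously even when the values match, so callers normally retry it in a loop.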

As a future direction, we may wish to modify the CIR operations to take
'weak' and 'failure' value (both are constants in LLVM intrinsics!) as
non-constants, and handle the switch/if statement during lowering. This
would give us an opportunity to optimize the value out without having to
collapse the if/switch/etc, and minimize the size of the CIR. However,
as that is a larger direction, this patch skips that for now.
DeltaFile
+354-0clang/test/CIR/CodeGen/atomic.c
+35-3clang/lib/CIR/CodeGen/CIRGenAtomic.cpp
+389-32 files

LLVM/project bb95a8dclang/lib/CIR/CodeGen CIRGenExpr.cpp, clang/test/CIR/CodeGen builtin-call.cpp

[CIR] Fix assumption that 'curFn' is always a function in direct-call (#197766)

The builtin-function check in direct-call codegen tried to determine
whether the call appears inside a function of the same name. However,
the enclosing op is not necessarily a function, since we also use
'global' ops as curFn. This patch removes the assumption and applies
the check only when we're in a function.
DeltaFile
+77-0clang/test/CIR/CodeGen/builtin-call.cpp
+4-3clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+81-32 files

LLVM/project 58a43dcmlir/lib/Dialect/SPIRV/IR SPIRVOps.cpp, mlir/lib/Target/SPIRV/Deserialization Deserializer.cpp

[mlir][SPIR-V] Support literal struct type in spirv.Constant (#198414)
DeltaFile
+29-1mlir/test/Dialect/SPIRV/IR/structure-ops.mlir
+21-3mlir/lib/Dialect/SPIRV/IR/SPIRVOps.cpp
+14-0mlir/test/Target/SPIRV/constant.mlir
+4-3mlir/lib/Target/SPIRV/Serialization/Serializer.cpp
+1-1mlir/lib/Target/SPIRV/Deserialization/Deserializer.cpp
+69-85 files

LLVM/project ace44dcllvm/test/CodeGen/AMDGPU memory-legalizer-local-nontemporal.ll shl_add.ll

[AMDGPU] Gate `S_LSHL[1-4]_ADD_U32` patterns on uniform results (#198508)

Like the other SOP2 patterns in this file, these scalar instructions
require the result to be uniform. Wrap them in `UniformBinFrag` so
divergent shl/add chains use `V_LSHL_ADD_U32`.
DeltaFile
+147-102llvm/test/CodeGen/AMDGPU/memory-legalizer-local-nontemporal.ll
+233-0llvm/test/CodeGen/AMDGPU/shl_add.ll
+104-72llvm/test/CodeGen/AMDGPU/memory-legalizer-private-nontemporal.ll
+87-62llvm/test/CodeGen/AMDGPU/memory-legalizer-local-volatile.ll
+51-51llvm/test/CodeGen/AMDGPU/flat-scratch.ll
+43-43llvm/test/CodeGen/AMDGPU/dynamic_stackalloc.ll
+665-3308 files not shown
+748-40114 files

LLVM/project 987b0e3mlir/include/mlir/Dialect/AMDGPU/IR AMDGPUOps.td, mlir/lib/Conversion/AMDGPUToROCDL AMDGPUToROCDL.cpp

[mlir][AMDGPU] Extend amdgpu.transpose_load for gfx1250 (#198354)

This commit adds support for gfx1250's ds_load_tr* instructions to
`amdgpu.transpose_load` since they're pretty close to the gfx950 ones.

---------

Co-authored-by: Codex <codex at openai.com>
DeltaFile
+97-35mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+41-0mlir/test/Conversion/AMDGPUToROCDL/transpose_load_gfx1250.mlir
+22-15mlir/lib/Dialect/AMDGPU/IR/AMDGPUOps.cpp
+10-6mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPUOps.td
+5-5mlir/test/Conversion/AMDGPUToROCDL/transpose_load.mlir
+8-0mlir/test/Conversion/AMDGPUToROCDL/transpose_load_gfx1250_invalid.mlir
+183-612 files not shown
+192-628 files

LLVM/project ed14419mlir/include/mlir/IR BuiltinDialectBytecode.td BytecodeBase.td, mlir/test/Dialect/Builtin/Bytecode types.mlir

[mlirbc] Add missing encoding for float types (#191962)

This enables the encoding but makes it easy to disable, so that reader-side updates can land first.
DeltaFile
+43-1mlir/include/mlir/IR/BuiltinDialectBytecode.td
+26-2mlir/test/Dialect/Builtin/Bytecode/types.mlir
+4-0mlir/include/mlir/IR/BytecodeBase.td
+73-33 files

LLVM/project d46cca0llvm/test/tools/dsymutil/ARM thumb.c, llvm/test/tools/dsymutil/X86 reproducer.test modules-pruning.cpp

[dsymutil] Add missing --linker {classic,parallel} in tests (#198568)

As I'm preparing to toggle the default, I found another set of tests
that don't explicitly pass the linker to dsymutil.
DeltaFile
+4-4llvm/test/tools/dsymutil/X86/reproducer.test
+5-0llvm/test/tools/dsymutil/X86/modules-pruning.cpp
+2-2llvm/test/tools/dsymutil/X86/remarks-linking-archive.text
+2-2llvm/test/tools/dsymutil/ARM/thumb.c
+2-2llvm/test/tools/dsymutil/X86/modules.m
+2-2llvm/test/tools/dsymutil/X86/odr-uniquing.cpp
+17-1215 files not shown
+32-2621 files

LLVM/project 43b66dfllvm/docs LangRef.rst

[IR] Explicitly note C standard library UB (#198562)

This language is to my understanding a bit outdated (if we're in a
freestanding environment, we should be handling things fine to my
knowledge, or at least I'm not aware of any outstanding issues reported
by people compiling for freestanding environments/different languages
which are somewhat prominent at this point). The language here dates
back to 68f971b1d67d51272f5c141fc9e4740e27e279f4 with some minor
modifications in 722212d1a0672ae18a23db58c4cfb7e38073abfa. Explicitly
note the UB aspect, as this came up recently when working on llubi in
#190147 and I do not think it hurts to note it explicitly.
DeltaFile
+2-2llvm/docs/LangRef.rst
+2-21 files

LLVM/project bd74b5bllvm/include/llvm/CodeGen MachineFunction.h, llvm/lib/CodeGen MachineFunction.cpp

[AMDGPU][MC] Replace shifted registers in CFI instructions

Change-Id: I0d99e9fe43ec3b6fecac20531119956dca2e4e5c
DeltaFile
+67-67llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir
+33-0llvm/lib/MC/MCDwarf.cpp
+15-15llvm/test/CodeGen/AMDGPU/dwarf-multi-register-use-crash.ll
+10-0llvm/lib/CodeGen/MachineFunction.cpp
+4-4llvm/test/CodeGen/AMDGPU/debug-frame.ll
+4-0llvm/include/llvm/CodeGen/MachineFunction.h
+133-865 files not shown
+143-9011 files

LLVM/project 1df6d5fllvm/lib/Target/AMDGPU SIFrameLowering.cpp SIMachineFunctionInfo.h, llvm/test/CodeGen/AMDGPU amdgpu-spill-cfi-saved-regs.ll

[AMDGPU] Implement -amdgpu-spill-cfi-saved-regs

These spills need special CFI anyway, so implementing them directly
where CFI is emitted avoids the need to invent a mechanism to track them
from ISel.

Change-Id: If4f34abb3a8e0e46b859a7c74ade21eff58c4047
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+2,926-0llvm/test/CodeGen/AMDGPU/amdgpu-spill-cfi-saved-regs.ll
+12-0llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
+10-0llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
+9-0llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+2-0llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+2,959-05 files

LLVM/project 5a78b00llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll gfx-callable-argument-types.ll

[AMDGPU] Implement CFI for CSR spills

Introduce new SPILL pseudos to allow CFI to be generated for only CSR
spills, and to make ISA-instruction-level accurate information.

Other targets either generate slightly incorrect information or rely on
conventions for how spills are placed within the entry block. The
approach in this change produces larger unwind tables, with the
increased size being spent on additional DW_CFA_advance_location
instructions needed to describe the unwinding accurately.

Change-Id: I9b09646abd2ac4e56eddf5e9aeca1a5bebbd43dd
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+3,568-2,598llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,912-1,913llvm/test/CodeGen/AMDGPU/gfx-callable-argument-types.ll
+2,700-12llvm/test/CodeGen/AMDGPU/accvgpr-spill-scc-clobber.mir
+631-631llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+505-510llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+394-399llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+9,710-6,063108 files not shown
+14,825-9,526114 files

LLVM/project 2af9d2dllvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.960bit.ll

[AMDGPU] Use register pair for PC spill

Change-Id: Ibedeef926f7ff235a06de65a83087c151f66a416
DeltaFile
+4,331-4,331llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,742-1,740llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+1,562-1,560llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+1,462-1,460llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+1,238-1,236llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+1,030-1,028llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.768bit.ll
+11,365-11,35589 files not shown
+18,153-18,04495 files

LLVM/project e2acf26llvm/lib/Target/AMDGPU SIFrameLowering.cpp, llvm/test/CodeGen/AMDGPU debug-frame.ll eliminate-frame-index-v-add-u32.mir

[AMDGPU] Emit entry function Dwarf CFI

Entry functions represent the end of unwinding, as they are the
outer-most frame. This implies they can only have a meaningful
definition for the CFA, which AMDGPU defines using a memory location
description with a literal private address space address. The return
address is set to undefined as a sentinel value to signal the end of
unwinding.

Change-Id: I21580f6a24f4869ba32939c9c6332506032cc654
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+1,405-0llvm/test/CodeGen/AMDGPU/debug-frame.ll
+204-12llvm/test/CodeGen/AMDGPU/eliminate-frame-index-v-add-u32.mir
+134-6llvm/test/CodeGen/AMDGPU/eliminate-frame-index-v-add-co-u32.mir
+114-10llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-add-i32.mir
+42-5llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
+34-0llvm/test/CodeGen/AMDGPU/entry-function-cfi.mir
+1,933-3322 files not shown
+2,044-5028 files