LLVM/project f8a0c40offload/libomptarget omptarget.cpp, offload/plugins-nextgen/common/src RecordReplay.cpp

[offload][OpenMP] Fix record replay when no memory is used

Programs that do not use any memory (e.g., no mappings) were failing because
we were trying to execute zero-size transfers.
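A minimal sketch of the kind of guard this implies, assuming a replay loop that walks recorded memory regions. The names `replay_transfers`, `regions`, and `submit` are hypothetical and are not the offload runtime's API; the actual patch may be structured differently.

```python
def replay_transfers(regions, submit):
    """Issue one transfer per recorded region, skipping empty ones.

    `regions` is a list of (address, size) pairs and `submit` is a callback
    standing in for the device copy. Both are hypothetical.
    """
    issued = 0
    for address, size in regions:
        if size == 0:
            # Nothing to copy: skip instead of submitting a zero-size transfer.
            continue
        submit(address, size)
        issued += 1
    return issued
```

With an empty `regions` list, which models a program with no mappings, the function returns 0 and never calls `submit`.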
DeltaFile
+17-18offload/plugins-nextgen/common/src/RecordReplay.cpp
+18-12offload/libomptarget/omptarget.cpp
+26-0offload/test/tools/omp-kernel-replay/record-replay-empty-memory.cpp
+2-1offload/tools/kernelreplay/llvm-omp-kernel-replay.cpp
+63-314 files

LLVM/project 004f829clang/cmake/caches Fuchsia-stage2.cmake

[clang][cmake] Disable exceptions for ASan runtime on Fuchsia (#204512)

Fuchsia's default runtime environment prefers no-exceptions. Compiling
the C++ slice of ASan (asan_new_delete.cpp) with exceptions introduces
dependencies on EH symbols (__cxa_begin_catch, etc.) in
libclang_rt.asan.so. This causes link failures when linking
ASan-enabled binaries with noexcept libc++abi.
Explicitly disable COMPILER_RT_ASAN_ENABLE_EXCEPTIONS for Fuchsia
targets in the stage2 cache.
DeltaFile
+1-0clang/cmake/caches/Fuchsia-stage2.cmake
+1-01 files

LLVM/project 3a9fdecllvm/lib/Target/AArch64 AArch64SystemOperands.td AArch64InstrFormats.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

fixup! Convert PSB to use PSBHint for consistency
DeltaFile
+7-23llvm/lib/Target/AArch64/AArch64SystemOperands.td
+17-5llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+5-6llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
+4-4llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.cpp
+3-2llvm/lib/Target/AArch64/AArch64InstrFormats.td
+1-1llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
+37-411 files not shown
+38-417 files

LLVM/project 38e2806llvm/lib/Target/AArch64 AArch64SystemOperands.td AArch64InstrInfo.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

fixup! Address PR comments
DeltaFile
+24-48llvm/lib/Target/AArch64/AArch64SystemOperands.td
+25-23llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
+15-23llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+15-8llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.cpp
+5-13llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
+9-8llvm/lib/Target/AArch64/AArch64InstrInfo.td
+93-1234 files not shown
+107-13610 files

LLVM/project 316e5a3llvm/lib/Target/X86 X86AvoidStoreForwardingBlocks.cpp

[X86] Simplify duplicate MMO offset tracking in breakBlockedCopies (NFC) (#202904)

LMMOffset and SMMOffset in breakBlockedCopies/buildCopies/buildCopy were
both initialized to 0 and advanced in lockstep by identical amounts, so
they were always equal. Collapse them into a single Offset used for both
the load and store MachineMemOperands.

This also removes a latent typo: the final buildCopies call passed
LMMOffset for the store offset argument instead of SMMOffset. Since the
two were always equal this was harmless, and the unified Offset makes
the divergence unrepresentable.
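The lockstep argument can be illustrated with a small model. This is a sketch only: the function names and chunk sizes are hypothetical and do not mirror the actual code in X86AvoidStoreForwardingBlocks.cpp.

```python
def offsets_two_counters(sizes):
    """Model of the old scheme: separate load and store offsets."""
    load_offset = 0
    store_offset = 0
    pairs = []
    for size in sizes:
        pairs.append((load_offset, store_offset))
        # Both counters start at 0 and advance by the same amount,
        # so they can never differ.
        load_offset += size
        store_offset += size
    return pairs


def offsets_single_counter(sizes):
    """Model of the simplified scheme: one offset used for both operands."""
    offset = 0
    pairs = []
    for size in sizes:
        pairs.append((offset, offset))
        offset += size
    return pairs
```

For any list of chunk sizes the two functions return the same pairs, which is why the change is NFC.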

Found via @jlebar's X86 LLVM bug hunt / FuzzX effort:
https://github.com/SemiAnalysisAI/FuzzX/blob/master/x86/bugs/042-sfb-buildcopies-wrong-mmo-offset/NOTES.md

cc @jlebar
DeltaFile
+21-35llvm/lib/Target/X86/X86AvoidStoreForwardingBlocks.cpp
+21-351 files

LLVM/project 8199e9flldb/test/API/commands/process/attach TestProcessAttach.py

[lldb][test] Speed up ProcessAttach test (#201530)

ProcessAttach is our slowest test and runs for about 70s. We spend 60s
in the autocontinue test waiting for the target program to terminate.

We wait because the autocontinue test does not run its command in async
mode, so after the attach we block until the next breakpoint is hit or
the program terminates.

This patch makes the attach and autocontinue run in async mode so we
don't wait for the program to finish. This reduces the test time from
70s to about 10s.

It also replaces an assertTrue call that was supposed to be an
assertEqual; the mistake let the test succeed even though the inferior
process had already terminated.
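The assertTrue pitfall is easy to reproduce with plain `unittest`. The sketch below assumes the faulty call passed two values to `assertTrue`; the exact expression in TestProcessAttach.py is not shown here, and the state strings are made up for illustration.

```python
import unittest


class AssertPitfall(unittest.TestCase):
    def test_assert_true_ignores_second_value(self):
        actual_state = "exited"     # hypothetical: the inferior already terminated
        expected_state = "stopped"  # hypothetical: what the test meant to check
        # assertTrue(expr, msg=None) only checks that expr is truthy. The second
        # argument is the failure message, so this passes for any non-empty string.
        self.assertTrue(actual_state, expected_state)

    def test_assert_equal_compares_values(self):
        # assertEqual compares both values and raises on a mismatch.
        with self.assertRaises(AssertionError):
            self.assertEqual("exited", "stopped")


def run_demo():
    """Run both tests and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertPitfall)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Both tests pass: the first one passes despite the mismatched states, which is the same kind of false success the commit describes.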
DeltaFile
+24-1lldb/test/API/commands/process/attach/TestProcessAttach.py
+24-11 files

LLVM/project c12e1adllvm/docs AMDGPUAsyncOperations.rst

review comments, and some clean up

- "produce and append"
- consistently use "initiate"
- use "dynamic instance" instead of "execute an instruction"
DeltaFile
+12-15llvm/docs/AMDGPUAsyncOperations.rst
+12-151 files

LLVM/project a02e66bllvm/lib/Target/AArch64/GISel AArch64InstructionSelector.cpp AArch64RegisterBankInfo.cpp, llvm/test/CodeGen/AArch64/GlobalISel select-insert-vector-elt.mir regbank-insert-vector-elt.mir

[AArch64][GlobalISel] Select narrow G_INSERT_VECTOR_ELT GPR operands (#203568)

RegBankSelect currently extends narrow i8/i16 G_INSERT_VECTOR_ELT GPR
operands to 32 bits. Move this widening to pre-isel lowering. This will
help enable a simple, fast, purely type-based RBS alternative.

Assisted-by: codex
DeltaFile
+24-11llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+30-0llvm/test/CodeGen/AArch64/GlobalISel/select-insert-vector-elt.mir
+28-0llvm/test/CodeGen/AArch64/GlobalISel/regbank-insert-vector-elt.mir
+0-20llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
+82-314 files

LLVM/project 17143b0llvm/include/llvm/CodeGen UnreachableBlockElim.h RenameIndependentSubregs.h, llvm/include/llvm/Transforms/Scalar StructurizeCFG.h

[NPM] Make a few more passes Required
DeltaFile
+4-4llvm/lib/Target/AMDGPU/AMDGPU.h
+2-2llvm/include/llvm/CodeGen/UnreachableBlockElim.h
+1-1llvm/include/llvm/CodeGen/RenameIndependentSubregs.h
+1-1llvm/include/llvm/CodeGen/TwoAddressInstructionPass.h
+1-1llvm/include/llvm/Transforms/Scalar/StructurizeCFG.h
+1-1llvm/include/llvm/Transforms/Utils/UnifyLoopExits.h
+10-1014 files not shown
+24-2420 files

LLVM/project d5194e2openmp/runtime/src kmp_adt.h, openmp/runtime/unittests/ADT TestVector.cpp CMakeLists.txt

[libomp] Add kmp_vector (ADT 2/2) (#176163)

See rationale in the commit adding kmp_str_ref.

This commit introduces kmp_vector, a class intended primarily for small
vectors. It currently only includes methods I need at the moment, but
it's easily extensible.
DeltaFile
+627-0openmp/runtime/unittests/ADT/TestVector.cpp
+194-0openmp/runtime/src/kmp_adt.h
+1-0openmp/runtime/unittests/ADT/CMakeLists.txt
+822-03 files

LLVM/project 62e5c56llvm/include/llvm/Analysis ModuleSummaryAnalysis.h, llvm/lib/Analysis StackSafetyAnalysis.cpp ModuleSummaryAnalysis.cpp

[NPM] Port ImmutableModuleSummaryAnalysis to NPM
DeltaFile
+29-0llvm/include/llvm/Analysis/ModuleSummaryAnalysis.h
+5-2llvm/lib/Analysis/StackSafetyAnalysis.cpp
+2-0llvm/lib/Passes/PassRegistry.def
+1-0llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
+37-24 files

LLVM/project 14de4e5clang/test/AST/ByteCode cxx23.cpp

[clang][bytecode][NFC] Remove an outdated comment (#204509)

They don't disagree anymore.
DeltaFile
+0-1clang/test/AST/ByteCode/cxx23.cpp
+0-11 files

LLVM/project 36b516bclang/test/Driver hip-sanitize-options.hip, clang/test/Driver/Inputs/rocm/amdgcn/bitcode oclc_isa_version_12-5-generic.bc

AMDGPU: Remove xnack-any-only subtarget feature and handling

This reverts commit f4caa0a172d96597c375e6b6b2192c289723a6b9.

This feature was added to gfx12-5-generic only, which does not make
sense given that both gfx1250 and gfx1251 have the same unconditional
xnack handling. It also does not make sense to diagnose trying to use
a specific xnack mode on the generic target only, and only from the
backend.

The current feature management is a confusing mess, given that we have
2 parallel feature systems. AMDGPUTargetParser has a table containing
a bitmask of features, which already contained FEATURE_XNACK_ALWAYS
for gfx1250/gfx1251, but not gfx12-5-generic. Add this handling there
so the sanitizer detection is consistent on the generic target.

These 2 feature tables probably should be unified in some way. We also
probably should have a subtarget feature for the xnack handling, but it
should be inverted. xnack-any-only is an antifeature, in that it removes

    [2 lines not shown]
DeltaFile
+0-9llvm/test/CodeGen/AMDGPU/gfx12-5-generic-no-xnack.ll
+5-0clang/test/Driver/hip-sanitize-options.hip
+0-5llvm/lib/Target/AMDGPU/AMDGPU.td
+0-4llvm/lib/Target/AMDGPU/GCNSubtarget.cpp
+1-1llvm/include/llvm/TargetParser/AMDGPUTargetParser.def
+0-0clang/test/Driver/Inputs/rocm/amdgcn/bitcode/oclc_isa_version_12-5-generic.bc
+6-196 files

LLVM/project 4c94fd8clang/test/Driver hip-sanitize-options.hip, clang/test/Driver/Inputs/rocm/amdgcn/bitcode oclc_isa_version_12-5-generic.bc

AMDGPU: Remove xnack-any-only subtarget feature and handling

This reverts commit f4caa0a172d96597c375e6b6b2192c289723a6b9.

This feature was added to gfx12-5-generic only, which does not make
sense given that both gfx1250 and gfx1251 have the same unconditional
xnack handling. It also does not make sense to diagnose trying to use
a specific xnack mode on the generic target only, and only from the
backend.

The current feature management is a confusing mess, given that we have
2 parallel feature systems. AMDGPUTargetParser has a table containing
a bitmask of features, which already contained FEATURE_XNACK_ALWAYS
for gfx1250/gfx1251, but not gfx12-5-generic. Add this handling there
so the sanitizer detection is consistent on the generic target.

These 2 feature tables probably should be unified in some way. We also
probably should have a subtarget feature for the xnack handling, but it
should be inverted. xnack-any-only is an antifeature, in that it removes

    [2 lines not shown]
DeltaFile
+0-9llvm/test/CodeGen/AMDGPU/gfx12-5-generic-no-xnack.ll
+0-5llvm/lib/Target/AMDGPU/AMDGPU.td
+5-0clang/test/Driver/hip-sanitize-options.hip
+0-4llvm/lib/Target/AMDGPU/GCNSubtarget.cpp
+1-1llvm/include/llvm/TargetParser/AMDGPUTargetParser.def
+0-0clang/test/Driver/Inputs/rocm/amdgcn/bitcode/oclc_isa_version_12-5-generic.bc
+6-196 files

LLVM/project 42743ablld/MachO InputFiles.cpp, lld/test/MachO icf-ignore-literal-ptr-labels.s ignore-literal-cstring-labels.s

[lld-macho] Ignore labels on sections ld64 treats as ignoreLabel (#194275)

In ld64, labels on records in some sections never become named atoms and
never enter the symbol table:

- Unconditionally: __cfstring, __objc_classrefs, and __objc_selrefs
- Prefix-gated on `L`/`l`: __literal{4,8,16} and __cstring-family
sections such as __objc_methname

LLD, however, ran every such label through `SymbolTable::addDefined`,
which diverged from ld64 whenever an identically-named symbol appeared
in another section. This patch mirrors ld64's behavior in LLD. The
Defined is still created for the affected labels, but it bypasses the
symbol table entirely and cannot collide with any cross-TU symbol.
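The two rules above can be sketched as a predicate. This is only an illustration in Python, not the InputFiles.cpp change: the function name is hypothetical, and the membership of the cstring family is an assumption beyond the one example (`__objc_methname`) the commit message gives.

```python
# Sections whose labels are always ignored, per the commit message.
UNCONDITIONAL = {"__cfstring", "__objc_classrefs", "__objc_selrefs"}

# Sections whose labels are ignored only when they start with 'L' or 'l'.
# Assumption: "__cstring" and "__objc_methname" stand in for the whole
# cstring family, whose full membership is not listed in the message.
PREFIX_GATED = {"__literal4", "__literal8", "__literal16",
                "__cstring", "__objc_methname"}


def label_bypasses_symbol_table(section, label):
    """Return True if a label in `section` should skip the symbol table."""
    if section in UNCONDITIONAL:
        return True
    if section in PREFIX_GATED:
        return label.startswith(("L", "l"))
    return False
```

For example, `label_bypasses_symbol_table("__cstring", "_greeting")` is False because the label lacks the `L`/`l` prefix, while any label in `__cfstring` bypasses the table.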

I have encountered a few link failures caused by this, and reduced them
into the regression tests in the patch.
DeltaFile
+98-0lld/test/MachO/icf-ignore-literal-ptr-labels.s
+45-0lld/test/MachO/ignore-literal-cstring-labels.s
+12-1lld/MachO/InputFiles.cpp
+155-13 files

LLVM/project ff0a5e1llvm/lib/Transforms/IPO ThinLTOBitcodeWriter.cpp WholeProgramDevirt.cpp, llvm/test/ThinLTO/X86 devirt_function_alias2.ll

[CFI] Create an external linkage alias instead of promoting internals
DeltaFile
+20-33llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
+20-5llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
+10-7llvm/test/Transforms/ThinLTOBitcodeWriter/comdat.ll
+16-0llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+6-4llvm/test/ThinLTO/X86/devirt_function_alias2.ll
+4-2llvm/test/Transforms/ThinLTOBitcodeWriter/split-vfunc-internal.ll
+76-513 files not shown
+83-569 files

LLVM/project 2829ab3llvm/include/llvm/IR GlobalValue.h, llvm/include/llvm/Transforms/Utils AssignGUID.h

Reland #184065
DeltaFile
+61-17llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+45-30llvm/lib/LTO/LTO.cpp
+64-2llvm/lib/IR/Globals.cpp
+49-3llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+45-5llvm/include/llvm/IR/GlobalValue.h
+49-0llvm/include/llvm/Transforms/Utils/AssignGUID.h
+313-57117 files not shown
+853-407123 files

LLVM/project 909773dclang/lib/AST TypePrinter.cpp, clang/test/AST ast-dump-riscv-rvv-fixed-length-mask-types.c

[RISCV] Fix the AST type printing code for VectorKind::RVVFixedLengthMask_1/2/4 (#204498)

These types have a fixed size of 1, 2, 4. The formula used for the other
types does not apply.

Assisted-by: Claude
DeltaFile
+51-0clang/test/AST/ast-dump-riscv-rvv-fixed-length-mask-types.c
+38-13clang/lib/AST/TypePrinter.cpp
+89-132 files

LLVM/project 6bc8b3cllvm/lib/Transforms/IPO ThinLTOBitcodeWriter.cpp WholeProgramDevirt.cpp, llvm/test/ThinLTO/X86 devirt_function_alias2.ll

[CFI] Create an external linkage alias instead of promoting internals
DeltaFile
+20-33llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
+20-5llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
+10-7llvm/test/Transforms/ThinLTOBitcodeWriter/comdat.ll
+16-0llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+6-4llvm/test/ThinLTO/X86/devirt_function_alias2.ll
+4-2llvm/test/Transforms/ThinLTOBitcodeWriter/split-vfunc-internal.ll
+76-513 files not shown
+83-569 files

LLVM/project 1817d11llvm/include/llvm/IR GlobalValue.h, llvm/include/llvm/Transforms/Utils AssignGUID.h

Reland #184065
DeltaFile
+61-17llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+45-30llvm/lib/LTO/LTO.cpp
+64-2llvm/lib/IR/Globals.cpp
+49-3llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+45-5llvm/include/llvm/IR/GlobalValue.h
+49-0llvm/include/llvm/Transforms/Utils/AssignGUID.h
+313-57116 files not shown
+853-400122 files

LLVM/project fecb127clang/test/Preprocessor init-datetime-macros.c

[clang-cl][test] Use /Zs to avoid writing unnecessary output files (#204501)

#194779 adds a test clang/test/Preprocessor/init-datetime-macros.c which
verifies some diagnostics. However, it does so with `/c`, which will
unnecessarily generate an output, and when run on a build system that
does not run tests in a writeable dir by default, will cause the test to
fail.

Since we don't care about the resulting object file, use `/Zs`
(equivalent of `-fsyntax-only`) to check the diagnostics but not produce
any output files.
DeltaFile
+1-1clang/test/Preprocessor/init-datetime-macros.c
+1-11 files

LLVM/project 0f56de1offload/libomptarget omptarget.cpp, offload/plugins-nextgen/common/src RecordReplay.cpp

[offload][OpenMP] Fix record replay when no memory is used

Programs that do not use any memory (e.g., no mappings) were failing because
we were trying to execute zero-size transfers.
DeltaFile
+18-12offload/libomptarget/omptarget.cpp
+26-0offload/test/tools/omp-kernel-replay/record-replay-empty-memory.cpp
+13-9offload/plugins-nextgen/common/src/RecordReplay.cpp
+2-1offload/tools/kernelreplay/llvm-omp-kernel-replay.cpp
+59-224 files

LLVM/project c02f7b1offload/libomptarget device.cpp, offload/plugins-nextgen/common/include RecordReplay.h PluginInterface.h

[offload] Improve report printing for kernel recording
DeltaFile
+35-15offload/plugins-nextgen/common/src/RecordReplay.cpp
+15-2openmp/docs/design/Runtimes.rst
+9-5offload/plugins-nextgen/common/include/RecordReplay.h
+8-2offload/libomptarget/device.cpp
+4-4offload/plugins-nextgen/common/src/PluginInterface.cpp
+3-2offload/plugins-nextgen/common/include/PluginInterface.h
+74-301 files not shown
+76-317 files

LLVM/project d3ac9b5bolt/include/bolt/Core DebugData.h DIEBuilder.h, bolt/include/bolt/Rewrite DWARFRewriter.h

[RFC][BOLT] Add a new parallel DWARF processing(2/2) (#197859)

This PR implements a new parallel DWARF debug info processing pipeline
for BOLT that significantly speeds up `--update-debug-sections` for
large binaries. It is the second part of the split of the overall
changes described in the RFC
[[RFC][BOLT] A New Parallel DWARF Processing Approach in BOLT](https://discourse.llvm.org/t/rfc-bolt-a-new-parallel-dwarf-processing-approach-in-bolt/90736).

This PR does the following:
1. **Equivalence-class CU partitioning:** Replaces batchsize grouping
with union-find over DW_FORM_ref_addr references. Connected CUs share a
bucket; isolated CUs become singletons.

> For the non-LTO case, CUs have no cross-CU dependencies, so each CU is
placed into its own singleton bucket and processed fully in parallel.
> For the LTO case, CUs with cross-CU dependencies are grouped into the
same bucket and processed sequentially within that bucket, while

    [7 lines not shown]
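The equivalence-class partitioning in item 1 can be sketched with a small union-find. This is a sketch under stated assumptions, not BOLT's implementation: CUs are modeled as integer indices, and `cross_refs` is a hypothetical list of (referencing CU, referenced CU) pairs standing in for DW_FORM_ref_addr references.

```python
def partition_cus(num_cus, cross_refs):
    """Group CU indices into buckets of CUs connected by cross-CU references."""
    parent = list(range(num_cus))

    def find(x):
        # Walk to the representative, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in cross_refs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller index as the representative.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    buckets = {}
    for cu in range(num_cus):
        buckets.setdefault(find(cu), []).append(cu)
    return sorted(buckets.values())
```

With no cross-CU references (the non-LTO case), every CU lands in its own singleton bucket. With references 0→2 and 2→4 among five CUs, CUs 0, 2, and 4 share one bucket while 1 and 3 stay isolated.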
DeltaFile
+513-202bolt/lib/Rewrite/DWARFRewriter.cpp
+50-7bolt/include/bolt/Rewrite/DWARFRewriter.h
+55-0bolt/test/X86/dwarf4-cross-cu-ranges.test
+30-15bolt/lib/Core/DebugData.cpp
+16-12bolt/include/bolt/Core/DebugData.h
+7-2bolt/include/bolt/Core/DIEBuilder.h
+671-2382 files not shown
+673-2408 files

LLVM/project bc5c332llvm/lib/Target/AMDGPU AMDGPUISelDAGToDAG.cpp, llvm/test/CodeGen/AMDGPU i128-add-carry-chain.ll

[AMDGPU] Keep i64 carry chains on VCC when feeding VALU users

This PR fixes an issue where ISel could mix scalar and vector carry chains when
lowering widened integer add/sub operations. A scalar-looking i64 carry producer
may feed a divergent carry consumer, so ISel now keeps that carry chain on VCC
to avoid invalid MIR.
DeltaFile
+65-0llvm/test/CodeGen/AMDGPU/i128-add-carry-chain.ll
+36-2llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+101-22 files

LLVM/project 258b68fllvm/lib/Target/LoongArch LoongArchISelLowering.cpp LoongArchLSXInstrInfo.td, llvm/test/CodeGen/LoongArch/lasx/ir-instruction fptoui.ll fptosi.ll

[LoongArch] Combine FP_TO_UINT/FP_TO_SINT with [X]VFTINTRZ instruction (#201569)

Combine double conversion to signed 32-bit integer with
`[X]VFTINTRZ_W_D` instructions.

There are three cases:
1. For VT smaller than i32, we promote it to i32 and then truncate to
the final result.
2. For `fptoui double to i32`, we convert it to `fptosi double to i64`
and then truncate. We avoid doing so when LASX is enabled because the
corresponding pattern already exists in TableGen.
3. Last, for `fptosi double to i32`, we split the operands into blocks
(128-bit or 256-bit, depending on whether LASX is enabled) and feed
them into `[X]VFTINTRZ_W_D` instructions. When using the XV version, a
shuffle is needed because the data layout is per 128-bit lane.
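Case 2 relies on a simple arithmetic fact: any double whose truncated value fits in an unsigned 32-bit integer also fits in a signed 64-bit integer, so a signed conversion followed by a truncation gives the same result. The sketch below models only the scalar arithmetic, not the vector instructions; the function names are hypothetical.

```python
def fptosi_i64(x):
    """Model fptosi to i64: truncate toward zero, for in-range inputs only."""
    value = int(x)  # int() on a float truncates toward zero
    if not -(1 << 63) <= value < (1 << 63):
        raise ValueError("input out of range for i64")
    return value


def fptoui_i32_via_i64(x):
    """Model fptoui to i32 as fptosi to i64 followed by a truncation to 32 bits."""
    return fptosi_i64(x) & 0xFFFFFFFF
```

For example, `fptoui_i32_via_i64(3000000000.7)` yields 3000000000, a value above the signed 32-bit maximum of 2147483647, which is why the intermediate type has to be wider than i32.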
DeltaFile
+96-0llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+42-2llvm/test/CodeGen/LoongArch/lsx/ir-instruction/fptoui.ll
+39-2llvm/test/CodeGen/LoongArch/lsx/ir-instruction/fptosi.ll
+20-0llvm/test/CodeGen/LoongArch/lasx/ir-instruction/fptoui.ll
+17-3llvm/test/CodeGen/LoongArch/lasx/ir-instruction/fptosi.ll
+9-0llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+223-71 files not shown
+227-77 files

LLVM/project 84ebdccoffload/libomptarget device.cpp, offload/plugins-nextgen/common/include RecordReplay.h PluginInterface.h

[offload] Improve report printing for kernel recording
DeltaFile
+35-16offload/plugins-nextgen/common/src/RecordReplay.cpp
+13-2openmp/docs/design/Runtimes.rst
+9-5offload/plugins-nextgen/common/include/RecordReplay.h
+8-2offload/libomptarget/device.cpp
+4-4offload/plugins-nextgen/common/src/PluginInterface.cpp
+3-2offload/plugins-nextgen/common/include/PluginInterface.h
+72-311 files not shown
+74-327 files

LLVM/project b2ffc0dllvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange inner-header-has-duplicate-succs.ll

[LoopInterchange] Reject if inner loop header has duplicate successors (#204128)

Previously, loop interchange crashed in several cases where the inner
loop header had duplicate successors. In practice, the following was
happening:

- During the transformation phase, the inner loop header was not split
because its first non-PHI instruction was its terminator.
- `updateSuccessor` was called on the header with `MustUpdateOnce=true`,
which triggers an assertion failure.

This patch fixes the issue by rejecting such cases during the legality
check phase. I believe this situation is rare, so it should not
significantly affect real-world cases.

Fix #203887.
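The rejected shape can be sketched as a check on a block's successor list. This is an illustration only: the function name and block names are hypothetical, and the real check in LoopInterchange.cpp operates on LLVM basic blocks rather than strings.

```python
def has_duplicate_successors(successors):
    """Return True if the same block appears more than once as a successor."""
    return len(set(successors)) != len(successors)


def is_legal_inner_header(successors):
    """Model of the legality check: reject headers with duplicate successors."""
    return not has_duplicate_successors(successors)
```

A header ending in a conditional branch whose two targets are the same block, modeled as `["inner.body", "inner.body"]`, is rejected, while `["inner.body", "inner.exit"]` is accepted.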
DeltaFile
+184-0llvm/test/Transforms/LoopInterchange/inner-header-has-duplicate-succs.ll
+7-0llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+191-02 files

LLVM/project d58c356llvm/lib/Target/AMDGPU AMDGPUISelDAGToDAG.cpp SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU packed-fp64.ll packed-u64.ll

[AMDGPU] Make v2x64 BUILD_VECTOR legal on gfx1251
DeltaFile
+120-174llvm/test/CodeGen/AMDGPU/packed-fp64.ll
+70-106llvm/test/CodeGen/AMDGPU/packed-u64.ll
+14-36llvm/test/CodeGen/AMDGPU/shl.v2i64.ll
+15-16llvm/test/CodeGen/AMDGPU/pk-lshl-add-u64.ll
+11-6llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+3-2llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+233-3406 files

LLVM/project a24e158llvm/lib/Target/AMDGPU SIFoldOperands.cpp, llvm/test/CodeGen/AMDGPU fold-imm-pk64.mir

[AMDGPU] Prevent folding of immediates larger than 64 bit
DeltaFile
+37-0llvm/test/CodeGen/AMDGPU/fold-imm-pk64.mir
+3-0llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+40-02 files