LLVM/project 63074dallvm/lib/CodeGen/AsmPrinter DwarfDebug.cpp, llvm/test/DebugInfo attr-btf_tag.ll

[DebugInfo][DwarfDebug] Move emission of globals from beginModule() to endModule() (5/7) (#184219)

RFC
https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544

This patch moves the emission of global variables from
`DwarfDebug::beginModule()` to `DwarfDebug::endModule()`.

It has the following effects:
1. The order of debug entities in the resulting DWARF changes.
2. Currently, if a DISubprogram requires emission of both concrete
out-of-line and inlined subprogram DIEs, and such a subprogram contains
a static local variable, the DIE for the variable is emitted into the
concrete out-of-line subprogram DIE. As a result, the variable is not
available in the debugger when breaking at the inlined function instance.

It happens because static locals are emitted in
`DwarfDebug::beginModule()`, but abstract DIEs for functions that are
not completely inlined away are created only later during

    [18 lines not shown]
DeltaFile
+93-89llvm/test/DebugInfo/X86/gnu-public-names.ll
+78-78llvm/test/DebugInfo/NVPTX/debug-addr-class.ll
+63-63llvm/test/MC/WebAssembly/dwarfdump.ll
+50-37llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+34-37llvm/test/DebugInfo/attr-btf_tag.ll
+31-31llvm/test/MC/WebAssembly/dwarfdump64.ll
+349-33539 files not shown
+610-55945 files

LLVM/project 6e1aee4llvm/lib/Target/AMDGPU SIInstructions.td, llvm/test/CodeGen/AMDGPU bfe-i8-i16.ll fptoi.i128.ll

[AMDGPU] Select v_bfe_u32 for i8/i16 (and (srl x, c), mask) (#182446)

Combine i8 and i16 (and (srl x, c), mask) instructions into v_bfe_u32. This optimization is skipped on true_i16 targets.

resolves issue #179494
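
As a rough scalar illustration of why the pattern maps onto a bitfield extract: a shift right by c followed by a mask of contiguous low bits reads `width` bits starting at bit `c`. The helper names below (bfe_u32, and_srl_i16) are hypothetical and only model the arithmetic; they are not the TableGen patterns from the patch.

```cpp
#include <cstdint>

// Hypothetical scalar model of an unsigned bitfield extract:
// read `width` bits of `x` starting at bit `offset`.
// Assumes 0 < width < 32 and offset < 32.
constexpr std::uint32_t bfe_u32(std::uint32_t x, unsigned offset,
                                unsigned width) {
  return (x >> offset) & ((1u << width) - 1u);
}

// The i16 form of (and (srl x, c), mask), where `mask` is assumed to be
// a run of contiguous low bits such as 0x00FF.
constexpr std::uint16_t and_srl_i16(std::uint16_t x, unsigned c,
                                    std::uint16_t mask) {
  return static_cast<std::uint16_t>((x >> c) & mask);
}

// Shifting 0xABCD right by 4 and keeping 8 bits yields 0xBC either way.
static_assert(and_srl_i16(0xABCD, 4, 0x00FF) == bfe_u32(0xABCD, 4, 8),
              "shift-and-mask matches the bitfield extract");
```

The equivalence only holds when the mask is a contiguous run of low bits, which is the assumption this sketch makes.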
DeltaFile
+141-0llvm/test/CodeGen/AMDGPU/bfe-i8-i16.ll
+16-20llvm/test/CodeGen/AMDGPU/fptoi.i128.ll
+15-4llvm/lib/Target/AMDGPU/SIInstructions.td
+8-8llvm/test/CodeGen/AMDGPU/permute_i8.ll
+5-7llvm/test/CodeGen/AMDGPU/buffer-fat-pointers-contents-legalization.ll
+4-5llvm/test/CodeGen/AMDGPU/bitcast_vector_bigint.ll
+189-443 files not shown
+196-529 files

LLVM/project d62fbb6llvm/lib/Target/SPIRV SPIRVInstructionSelector.cpp, llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers fp_const.ll

[SPIRV] Update the global registry when expanding function pointer (#183873)

We do not update the global registry when expanding a G_GLOBAL_VALUE for
a function pointer. Then during pointer validation, we can get a
garbage value from the global registry.
DeltaFile
+28-0llvm/test/CodeGen/SPIRV/pointers/fun-ptr-to-itself.ll
+4-3llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+4-2llvm/test/CodeGen/SPIRV/extensions/SPV_INTEL_function_pointers/fp_const.ll
+36-53 files

LLVM/project 5e7d3afflang/lib/Lower/OpenMP Utils.cpp, flang/test/Lower/OpenMP task-affinity.f90

Remove convert for iterator indices since array_coor accepts AnyCoordinateType
DeltaFile
+11-26flang/test/Lower/OpenMP/task-affinity.f90
+0-2flang/lib/Lower/OpenMP/Utils.cpp
+11-282 files

LLVM/project a212ebdllvm/include/llvm/Transforms/IPO Attributor.h, llvm/lib/Analysis ValueTracking.cpp

ValueTracking: Handle constant structs in computeKnownFPClass (#184192)

Also fix attributor not bothering to deal with structs.
DeltaFile
+47-2llvm/test/Transforms/Attributor/nofpclass.ll
+16-0llvm/lib/Analysis/ValueTracking.cpp
+1-9llvm/include/llvm/Transforms/IPO/Attributor.h
+2-3llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-aggregates.ll
+66-144 files

LLVM/project 20902f0llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor nofpclass-bitcast.ll

ValueTracking: Teach computeKnownFPClass to look at bitcast + integer max (#184073)

The returned class will still be one of the bitpatterns.

This pattern is used in rocm device libraries in assorted functions,
e.g.,

https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/ocml/src/rlen3F.cl#L20

I believe it is blocking the elimination of finite checks in some of
the more complex functions.
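
A minimal sketch of the source-level shape being described, assuming an unsigned integer max over the raw bits of two floats (the signedness and exact form used in the device libraries are not shown here). The helper names are hypothetical. The point is that the result always carries the bit pattern of one of the inputs, so whatever is known about both inputs' FP class also holds for the result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");

// Reinterpret a float as its raw bits (the scalar analogue of a bitcast).
inline std::uint32_t bitsOf(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// Reinterpret raw bits as a float.
inline float floatOf(std::uint32_t u) {
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}

// Integer max over the bit patterns, then cast back to float.
inline float maxByBits(float a, float b) {
  const std::uint32_t r = std::max(bitsOf(a), bitsOf(b));
  // The result is always exactly one of the two input bit patterns.
  assert(r == bitsOf(a) || r == bitsOf(b));
  return floatOf(r);
}
```

Note that the comparison orders by bit pattern, not by numeric value: with an unsigned max, maxByBits(1.0f, -2.0f) returns -2.0f because the sign bit makes its pattern larger.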
DeltaFile
+229-0llvm/test/Transforms/Attributor/nofpclass-bitcast.ll
+27-7llvm/lib/Analysis/ValueTracking.cpp
+256-72 files

LLVM/project 719e2fdflang/lib/Lower/OpenMP Utils.cpp, flang/test/Lower/OpenMP task-affinity.f90

Handle non-default lower bounds in iterator
DeltaFile
+63-0flang/test/Lower/OpenMP/task-affinity.f90
+10-1flang/lib/Lower/OpenMP/Utils.cpp
+73-12 files

LLVM/project de0862dlibunwind/src UnwindCursor.hpp

[WIP][PAC][libunwind] Handle LR and IP signing around sigreturn frame

Support stepping through sigreturn frame in PtrAuth-protected libunwind.

Unfortunately, this involves signing non-protected IP value from
sigcontext struct saved on the stack by the kernel.
DeltaFile
+15-3libunwind/src/UnwindCursor.hpp
+15-31 files

LLVM/project 34541e5clang/lib/Headers/hlsl hlsl_alias_intrinsics.h, clang/lib/Sema SemaHLSL.cpp

[HLSL] Add WaveActiveAllEqual functions (#183634)

This PR adds the WaveActiveAllEqual function to HLSL.
It also adds extra macro logic to CGHLSLBuiltins so that you can specify
a different intrinsic name for the SPIRV intrinsic.
Fixes https://github.com/llvm/llvm-project/issues/99162
DeltaFile
+124-0clang/lib/Headers/hlsl/hlsl_alias_intrinsics.h
+109-0llvm/test/CodeGen/DirectX/WaveActiveAllEqual.ll
+92-0llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+54-0llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WaveActiveAllEqual.ll
+45-0clang/test/CodeGenHLSL/builtins/WaveActiveAllEqual.hlsl
+25-0clang/lib/Sema/SemaHLSL.cpp
+449-010 files not shown
+505-116 files

LLVM/project 75b2ea5clang/docs ReleaseNotes.rst, clang/include/clang/Analysis/Analyses UnsafeBufferUsage.h

[Clang][UnsafeBufferUsage] Warn about two-arg string_view constructors. (#180471)

"This patch extends the unsafe buffer usage warning to cover
std::string_view constructors that take a pointer and size, similar to
the existing check for std::span.

The warning message has been updated to be generic ('container
construction' instead of 'span construction') and existing tests have
been updated to match.

Fixes #166644."
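
To illustrate the kind of construction the message refers to, here is a small sketch (C++17). The wrapper names are hypothetical, and the exact conditions under which the warning fires are not stated in this log, so treat this only as a picture of the pointer-plus-size form versus a form that carries no separate size.

```cpp
#include <cstddef>
#include <string_view>

// Pointer-plus-size form: nothing ties `size` to the real extent of
// `data`, so a wrong size silently produces an out-of-bounds view.
constexpr std::string_view fromPointerAndSize(const char *data,
                                              std::size_t size) {
  return std::string_view(data, size);
}

// Single-argument form: the length is derived from the terminating
// '\0', so there is no separate size to get wrong.
constexpr std::string_view fromCString(const char *text) {
  return std::string_view(text);
}

static_assert(fromPointerAndSize("hello world", 5) == "hello",
              "caller-chosen size decides where the view ends");
static_assert(fromCString("hello").size() == 5,
              "length is computed from the string itself");
```

Passing 50 instead of 5 to fromPointerAndSize would compile just as readily, which is the hazard the pointer-and-size form shares with the std::span case.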
DeltaFile
+131-0clang/lib/Analysis/UnsafeBufferUsage.cpp
+44-0clang/test/SemaCXX/warn-unsafe-buffer-usage-string-view.cpp
+33-3clang/lib/Sema/AnalysisBasedWarnings.cpp
+5-1clang/include/clang/Basic/DiagnosticSemaKinds.td
+4-0clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h
+3-0clang/docs/ReleaseNotes.rst
+220-42 files not shown
+223-48 files

LLVM/project 2032960flang/test/Lower/Intrinsics bessel_jn.f90 atand.f90

[flang][NFC] Converted five tests from old lowering to new lowering (part 24) (#184538)

Tests converted from test/Lower/Intrinsics: atan2d.f90, atan2pi.f90,
atand.f90, atanpi.f90, bessel_jn.f90
DeltaFile
+69-86flang/test/Lower/Intrinsics/bessel_jn.f90
+25-20flang/test/Lower/Intrinsics/atand.f90
+25-20flang/test/Lower/Intrinsics/atanpi.f90
+13-12flang/test/Lower/Intrinsics/atan2pi.f90
+9-8flang/test/Lower/Intrinsics/atan2d.f90
+141-1465 files

LLVM/project 210d238llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor nofpclass-bitcast.ll

ValueTracking: Teach computeKnownFPClass to look at bitcast + integer max

The returned class will still be one of the bitpatterns.

This pattern is used in rocm device libraries in assorted functions, e.g.,
https://github.com/ROCm/llvm-project/blob/amd-staging/amd/device-libs/ocml/src/rlen3F.cl#L20

I believe it is blocking the elimination of finite checks in some of the more
complex functions.
DeltaFile
+229-0llvm/test/Transforms/Attributor/nofpclass-bitcast.ll
+27-7llvm/lib/Analysis/ValueTracking.cpp
+256-72 files

LLVM/project a8d8247llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass-aggregates.ll

InstCombine: Handle insertvalue in SimplifyDemandedFPClass
DeltaFile
+68-0llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-aggregates.ll
+9-0llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+77-02 files

LLVM/project 16c3474llvm/include/llvm/Transforms/IPO Attributor.h, llvm/lib/Analysis ValueTracking.cpp

ValueTracking: Handle constant structs in computeKnownFPClass

Also fix attributor not bothering to deal with structs.
DeltaFile
+47-2llvm/test/Transforms/Attributor/nofpclass.ll
+16-0llvm/lib/Analysis/ValueTracking.cpp
+1-9llvm/include/llvm/Transforms/IPO/Attributor.h
+2-3llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-aggregates.ll
+66-144 files

LLVM/project a7914aellvm/lib/Target/RISCV RISCVInstrInfoP.td, llvm/lib/Target/RISCV/AsmParser RISCVAsmParser.cpp

[RISCV] Allow unsigned immediates for pli.h, pli.dh, pli.w (#184554)

Allow unsigned immediates that look like simm10 when only
considering the lower 16 or 32 bits. For pli.h and pli.dh this is
[65024,65535]. For pli.w, this is [4294966784,4294967295].

Since we're only inserting 16 or 32 bits, it makes sense to me that only
those 16 or 32 bits matter. This is similar to how we allow `li a0,
0xffffffff` on RV32.
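
The ranges follow from sign-extending the low 16 or 32 bits and checking the result against simm10, i.e. [-512, 511]. A small sketch of that arithmetic, with hypothetical helper names that are not taken from the assembler parser:

```cpp
#include <cstdint>

// simm10 covers [-512, 511].
constexpr bool isSImm10(std::int64_t v) { return v >= -512 && v <= 511; }

// Hypothetical check: interpret `imm` through its low `bits` bits
// (assumed to be 16 or 32), sign-extend, and test against simm10.
constexpr bool fitsAsSImm10(std::uint64_t imm, unsigned bits) {
  const std::uint64_t limit = std::uint64_t{1} << bits;
  if (imm >= limit)
    return false; // does not fit in the element width at all
  const std::int64_t value =
      (imm & (limit >> 1))
          ? static_cast<std::int64_t>(imm) - static_cast<std::int64_t>(limit)
          : static_cast<std::int64_t>(imm);
  return isSImm10(value);
}

// 16-bit elements: 65024 is -512 and 65535 is -1 after sign extension.
static_assert(fitsAsSImm10(65024, 16) && fitsAsSImm10(65535, 16), "");
static_assert(!fitsAsSImm10(65023, 16), "");
// 32-bit elements: 4294966784 is -512 after sign extension.
static_assert(fitsAsSImm10(4294966784ULL, 32), "");
static_assert(!fitsAsSImm10(4294966783ULL, 32), "");
```

The lower bounds are 2^16 - 512 = 65024 and 2^32 - 512 = 4294966784, matching the ranges quoted in the message.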
DeltaFile
+43-5llvm/lib/Target/RISCV/RISCVInstrInfoP.td
+13-0llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+6-0llvm/test/MC/RISCV/rv32p-valid.s
+6-0llvm/test/MC/RISCV/rv64p-valid.s
+68-54 files

LLVM/project 64c0f62lldb/source/Core PluginManager.cpp

[lldb] Make the PluginManager thread safe (#184452)

In #184273, John pointed out that the PluginManager is currently not
thread safe. While we don't currently provide any guarantees, it seems
desirable to be able to interact safely with the PluginManager from
different threads. For example, we allow dynamically loading plugins
from the command interpreter, which in theory could be coming from
different threads.
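
As a generic picture of what making such a registry thread safe can look like, the sketch below guards a list of plugin names with a mutex. PluginRegistry and its members are hypothetical and are not the PluginManager code from this change; the log does not show how the patch actually does it.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical registry: every access to the shared list takes the lock.
class PluginRegistry {
public:
  // Returns false if a plugin with this name is already registered.
  bool registerPlugin(const std::string &name) {
    assert(!name.empty() && "plugin name must not be empty");
    std::lock_guard<std::mutex> guard(m_mutex);
    for (const std::string &existing : m_names)
      if (existing == name)
        return false;
    m_names.push_back(name);
    return true;
  }

  // Returns a copy so callers never iterate the shared list unlocked.
  std::vector<std::string> snapshot() const {
    std::lock_guard<std::mutex> guard(m_mutex);
    return m_names;
  }

private:
  mutable std::mutex m_mutex;
  std::vector<std::string> m_names;
};
```

Returning a copy from snapshot() trades some allocation for not exposing references into state that another thread may be mutating.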
DeltaFile
+28-22lldb/source/Core/PluginManager.cpp
+28-221 files

LLVM/project 9dc6537clang-tools-extra/clang-tidy/misc ConstCorrectnessCheck.cpp, clang-tools-extra/docs/clang-tidy/checks/misc const-correctness.rst

[clang-tidy] Don't report unnamed params for misc-const-correctness (#184388)

Previously misc-const-correctness warned about non-const unnamed
parameters, but this commit excludes them because these warnings are not
actually useful. An unnamed parameter cannot be referenced at all, so
marking it as 'const' doesn't add any information.

Also the diagnostic messages look awkward without a name.

Fixes #184330
DeltaFile
+31-0clang-tools-extra/test/clang-tidy/checkers/misc/const-correctness-parameters.cpp
+7-1clang-tools-extra/clang-tidy/misc/ConstCorrectnessCheck.cpp
+3-3clang-tools-extra/docs/clang-tidy/checks/misc/const-correctness.rst
+41-43 files

LLVM/project df52bb4llvm/lib/Transforms/Scalar LoopUnrollPass.cpp

[LoopUnrollPass] Don't use clang specific syntax in optimization remarks (#182430)

DeltaFile
+15-17llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
+15-171 files

LLVM/project 0baf5a0libunwind/src Unwind-wasm.c

Revert "Silence -Wunused-parameter warnings in Unwind-wasm.c" (#175776)

Reverts llvm/llvm-project#125412

See the discussion in #125412 for why this is necessary. The summary is
that:

- Eliding parameter names is a C23 extension, but libunwind builds its C
files with `-std=c99`, so this change broke the build.
- `-Wno-unused-parameter` is part of the build for libunwind, so the
codebase does allow unused parameters.
DeltaFile
+4-2libunwind/src/Unwind-wasm.c
+4-21 files

LLVM/project 2bc84fallvm/include/llvm/CodeGen MachineFunction.h, llvm/lib/CodeGen MachineFunction.cpp

[AMDGPU][MC] Replace shifted registers in CFI instructions

Change-Id: I0d99e9fe43ec3b6fecac20531119956dca2e4e5c
DeltaFile
+67-67llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir
+33-0llvm/lib/MC/MCDwarf.cpp
+15-15llvm/test/CodeGen/AMDGPU/dwarf-multi-register-use-crash.ll
+10-0llvm/lib/CodeGen/MachineFunction.cpp
+4-4llvm/test/CodeGen/AMDGPU/debug-frame.ll
+4-0llvm/include/llvm/CodeGen/MachineFunction.h
+133-864 files not shown
+141-8810 files

LLVM/project 7981b2allvm/lib/Target/AMDGPU SIFrameLowering.cpp SIMachineFunctionInfo.h, llvm/test/CodeGen/AMDGPU amdgpu-spill-cfi-saved-regs.ll

[AMDGPU] Implement -amdgpu-spill-cfi-saved-regs

These spills need special CFI anyway, so implementing them directly
where CFI is emitted avoids the need to invent a mechanism to track them
from ISel.

Change-Id: If4f34abb3a8e0e46b859a7c74ade21eff58c4047
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+2,998-0llvm/test/CodeGen/AMDGPU/amdgpu-spill-cfi-saved-regs.ll
+37-12llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
+11-2llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
+9-0llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+7-0llvm/lib/Target/AMDGPU/SIFrameLowering.h
+2-1llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
+3,064-152 files not shown
+3,067-168 files

LLVM/project 2bc69e5llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll gfx-callable-argument-types.ll

[AMDGPU] Implement CFI for CSR spills

Introduce new SPILL pseudos to allow CFI to be generated for only CSR
spills, and to make the information accurate at the ISA-instruction level.

Other targets either generate slightly incorrect information or rely on
conventions for how spills are placed within the entry block. The
approach in this change produces larger unwind tables, with the
increased size being spent on additional DW_CFA_advance_location
instructions needed to describe the unwinding accurately.

Change-Id: I9b09646abd2ac4e56eddf5e9aeca1a5bebbd43dd
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+3,360-1,955llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,924-1,929llvm/test/CodeGen/AMDGPU/gfx-callable-argument-types.ll
+2,700-12llvm/test/CodeGen/AMDGPU/accvgpr-spill-scc-clobber.mir
+531-531llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+508-508llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+405-406llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+9,428-5,341103 files not shown
+13,413-7,714109 files

LLVM/project f07d684llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll

[AMDGPU] Use register pair for PC spill

Change-Id: Ibedeef926f7ff235a06de65a83087c151f66a416
DeltaFile
+2,562-2,562llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,276-1,274llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+818-816llvm/test/CodeGen/AMDGPU/materialize-frame-index-sgpr.ll
+613-613llvm/test/CodeGen/AMDGPU/gfx-callable-argument-types.ll
+552-552llvm/test/CodeGen/AMDGPU/indirect-call.ll
+100-898llvm/test/CodeGen/AMDGPU/bf16.ll
+5,921-6,71586 files not shown
+9,565-10,29692 files

LLVM/project b559ab2llvm/test/CodeGen/AMDGPU accvgpr-spill-scc-clobber.mir pei-build-av-spill.mir

[AMDGPU] Implement CFI for non-kernel functions

This does not implement CSR spills other than those AMDGPU handles
during PEI. The remaining spills are handled in a subsequent patch.

Change-Id: I5e3a9a62cf9189245011a82a129790d813d49373
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+5,568-0llvm/test/CodeGen/AMDGPU/accvgpr-spill-scc-clobber.mir
+3,000-96llvm/test/CodeGen/AMDGPU/pei-build-av-spill.mir
+2,208-72llvm/test/CodeGen/AMDGPU/pei-build-spill.mir
+2,196-0llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-mov-b32.mir
+2,136-0llvm/test/CodeGen/AMDGPU/vgpr-spill-scc-clobber.mir
+1,671-1llvm/test/CodeGen/AMDGPU/debug-frame.ll
+16,779-16978 files not shown
+22,562-60884 files

LLVM/project 1607b8bllvm/lib/Target/AMDGPU SIFrameLowering.cpp, llvm/test/CodeGen/AMDGPU debug-frame.ll eliminate-frame-index-v-add-u32.mir

[AMDGPU] Emit entry function Dwarf CFI

Entry functions represent the end of unwinding, as they are the
outer-most frame. This implies they can only have a meaningful
definition for the CFA, which AMDGPU defines using a memory location
description with a literal private address space address. The return
address is set to undefined as a sentinel value to signal the end of
unwinding.

Change-Id: I21580f6a24f4869ba32939c9c6332506032cc654
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+1,405-0llvm/test/CodeGen/AMDGPU/debug-frame.ll
+204-12llvm/test/CodeGen/AMDGPU/eliminate-frame-index-v-add-u32.mir
+134-6llvm/test/CodeGen/AMDGPU/eliminate-frame-index-v-add-co-u32.mir
+114-10llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-add-i32.mir
+42-5llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
+34-0llvm/test/CodeGen/AMDGPU/entry-function-cfi.mir
+1,933-3321 files not shown
+2,040-4527 files

LLVM/project 827ee52clang/lib/Driver/ToolChains Gnu.cpp, clang/test/Driver amdgpu-unwind.cl

[Clang] Default to async unwind tables for amdgcn

To avoid codegen changes when enabling debug-info (see
https://bugs.llvm.org/show_bug.cgi?id=37240) we want to
enable unwind tables by default.

There is some pessimization in post-prologepilog scheduling, and a
general solution to the problem of CFI_INSTRUCTION-as-scheduling-barrier
should be explored.

Change-Id: I83625875966928c7c4411cd7b95174dc58bda25a
DeltaFile
+26-0clang/test/Driver/amdgpu-unwind.cl
+1-0clang/lib/Driver/ToolChains/Gnu.cpp
+27-02 files

LLVM/project fbc3a31mlir/lib/Dialect/GPU/Pipelines GPUToXeVMPipeline.cpp

[mlir][xevm] Remove unnecessary attach target pass. (#184432)

DeltaFile
+0-1mlir/lib/Dialect/GPU/Pipelines/GPUToXeVMPipeline.cpp
+0-11 files

LLVM/project 2c95b8dllvm/lib/Target/AMDGPU/MCTargetDesc AMDGPUTargetStreamer.cpp, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp AMDGPUBaseInfo.h

AMDGPU: Clean up print handling of AMDGPUTargetID (#184643)

Provide print to raw_ostream method and use it where applicable.
DeltaFile
+7-5llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+8-3llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
+9-0llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+24-83 files

LLVM/project 2b3e30dllvm/lib/CodeGen MachineInstr.cpp, llvm/test/CodeGen/AArch64 sme-streaming-checkvl.ll

[CodeGen] Treat hasOrderedMemoryRef as implying arbitrary loads or stores (#182000)

This prevents MachineSink from sinking loads past fences (or any other instruction marked as hasSideEffects).

Fixes: #181708
DeltaFile
+250-5llvm/test/CodeGen/AMDGPU/misched-remat-revert.ll
+44-0llvm/test/CodeGen/AMDGPU/machine-sink-fence.ll
+12-13llvm/test/CodeGen/AMDGPU/iglp-no-clobber.ll
+13-6llvm/test/CodeGen/AArch64/sme-streaming-checkvl.ll
+5-5llvm/test/CodeGen/Thumb2/mve-postinc-lsr.ll
+1-2llvm/lib/CodeGen/MachineInstr.cpp
+325-316 files

LLVM/project d9d6b16llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor nofpclass.ll

ValueTracking: Handle ConstantDataSequential in computeKnownFPClass (#184191)

DeltaFile
+16-17llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-aggregates.ll
+18-0llvm/test/Transforms/Attributor/nofpclass.ll
+7-0llvm/lib/Analysis/ValueTracking.cpp
+41-173 files