[RISCV][TTI] Implement cost of llvm.experimental.vector.extract.last.active (#184067)
This patch implements the cost of
llvm.experimental.vector.extract.last.active which will lower to:
vcpop.m a0, v0
beqz a0, exit # Return passthru when all lanes are inactive.
vid.v v10, v0.t
vredmaxu.vs v10, v10, v10
vmv.x.s a0, v10
zext.b a0, a0
vslidedown v8, v8,
This patch also helps conditional-scalar-assignment (CSA) work for
scalable vectors.
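As a reading aid, the semantics being costed can be modelled in scalar code. This is a minimal sketch, not the RISC-V lowering and not code from the patch: the function name `extractLastActive` and the use of `std::vector` are assumptions made for illustration. The backward loop plays the role of the masked index search, and the early return value corresponds to the passthru path taken when no lane is active.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference model (hypothetical helper, not the actual lowering):
// return the element of `data` at the highest index whose mask bit is set,
// or `passthru` when no lane is active.
int32_t extractLastActive(const std::vector<int32_t> &data,
                          const std::vector<bool> &mask, int32_t passthru) {
  assert(data.size() == mask.size() && "data and mask must have equal length");
  // Walk from the last lane to the first; the first hit is the last active lane.
  for (std::size_t i = data.size(); i-- > 0;) {
    if (mask[i])
      return data[i];
  }
  return passthru;
}
```

With data `{10, 20, 30, 40}` and mask `{1, 0, 1, 0}`, the last active lane is index 2, so the model returns that lane's element; with an all-false mask it returns the passthru value.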
[Support] Move HTTP client/server to new LLVMSupportHTTP lib (NFC) (#184572)
Relocate HTTPClient and HTTPServer from the Debuginfod library to
llvm/Support/HTTP so they can be reused by other components.
---------
Co-authored-by: Alexandre Ganea <aganea at havenstudios.com>
Co-authored-by: Jonas Devlieghere <jonas at devlieghere.com>
[X86] Fix wrong RIP-relative relocations for AVX10.2 saturation conversions (#185254)
AVX10.2 saturation conversion instructions (VCVT{T}{BF16,PH,PS}2I{U}BS)
incorrectly inherited from AVX512{PS,XD,PD}Ii8Base, which sets
ImmT=Imm8. This caused X86MCCodeEmitter to account for a nonexistent
trailing immediate byte when computing RIP-relative displacement fixups,
producing an addend of -5 instead of the correct -4.
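The arithmetic behind the -5 versus -4 addend can be sketched as follows. This is an illustrative model only, not the X86MCCodeEmitter code; `ripRelativeAddend` is a hypothetical name. It assumes a 32-bit displacement field and relies on the fact that RIP-relative addressing is computed from the end of the instruction, so any bytes that follow the displacement shift the addend.

```cpp
#include <cassert>

// Hypothetical helper: addend for a RIP-relative disp32 fixup.
// RIP refers to the end of the instruction, so the addend is minus the
// number of bytes from the start of the displacement field to the end of
// the instruction: the 4 displacement bytes plus any trailing immediate.
inline int ripRelativeAddend(int trailingImmBytes) {
  const int disp32Bytes = 4;
  return -(disp32Bytes + trailingImmBytes);
}
```

Under this model, an instruction with no trailing immediate gets `ripRelativeAddend(0)`, i.e. -4, while wrongly assuming an Imm8 gives `ripRelativeAddend(1)`, i.e. -5, which matches the symptom described above.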
Replace AVX512PSIi8Base/AVX512XDIi8Base/AVX512PDIi8Base with just the
needed prefix classes (PS is default, XD, PD), dropping the bogus
ImmT=Imm8. Preserve the original ExeDomain values (SSEPackedSingle,
SSEPackedDouble, SSEPackedInt; verified with `llvm-tblgen
--print-records ./llvm/lib/Target/X86/X86.td -I./llvm/include/
-I./llvm/lib/Target/X86/`). ExeDomain is not covered by any test.
Fix https://github.com/llvm/llvm-project/issues/184251
[libclc] Use custom CMake handling to overhaul libclc compilation (#185247)
Summary:
This PR uses https://github.com/llvm/llvm-project/pull/185243 to
overhaul compilation of libclc. This brings libclc to the same kind of
compilation flow that the other GPU libraries use (compiler-rt, libc,
libc++, openmp, flang-rt).
The bulk of this change is simply converting the SOURCES files to
CMake variables and altering the compilation. Now that these are
standard CMake libraries, we no longer need to define custom
library handling and targets.
This builds as a static library, which we then pass to `llvm-link`
to produce a single `.bc` bitcode file, as before.
The final result is then optimized as a whole.
Hopefully this doesn't break anything.
cxgbe(4): minor changes in code dealing with ncores
1. ncores and devlog information are read together, so it makes
sense to validate them in the same routine (and nowhere else).
2. ncores is never 0 and idx % ncores is always a valid coreid.
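The invariant in the two points above can be illustrated with a small sketch. It is not the driver code (the driver is written in C); `AdapterParams`, `validateParams`, and `coreIdFor` are hypothetical names chosen for this example. The idea is that the zero check lives in exactly one validation routine, after which the modulo is always well defined and in range.

```cpp
#include <cassert>

// Hypothetical stand-in for the parameters read from the device.
struct AdapterParams {
  unsigned ncores;
};

// Validation happens once, where the value is read; zero is rejected here.
bool validateParams(const AdapterParams &p) { return p.ncores != 0; }

// After validation, the modulo always yields a core id in [0, ncores).
unsigned coreIdFor(const AdapterParams &p, unsigned idx) {
  assert(p.ncores != 0 && "params must be validated first");
  return idx % p.ncores;
}
```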
MFC after: 1 week
Sponsored by: Chelsio Communications
[flang] Fix segfault in CSHIFT/EOSHIFT with dynamically optional DIM (#184431)
When `DIM` is passed as an optional dummy argument and is absent at
runtime, the HLFIR lowering for the `CSHIFT` and `EOSHIFT` intrinsics
treated it as unconditionally present. This resulted in an unconditional
load of the `DIM` reference, causing a null pointer dereference and a
runtime segmentation fault when absent.
The underlying issue was that the `dim` argument for `cshift` and
`eoshift` was not marked with `handleDynamicOptional` during intrinsic
argument lowering setup. As a result, the `isPresent` state was never
populated, and the lowering implementation incorrectly fell through to
an unconditional scalar load.
This patch resolves the issue by:
1. Updating the `dim` entries for `cshift` and `eoshift` in
`IntrinsicCall.cpp` to use `handleDynamicOptional`. This enables
`getOperandVector()` to appropriately emit a guarded load (via
`loadOptionalValue()`) that safely returns a 0 placeholder when `DIM` is
[10 lines not shown]
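The difference between the unconditional load and the guarded load can be sketched in plain C++. This is only an analogy and not the HLFIR lowering: an absent optional argument is modelled here as a null pointer, and `loadOptionalOrZero` is a hypothetical name, not a Flang API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: an absent optional dummy argument is a null pointer.
// The guarded load checks presence first and yields a 0 placeholder when
// the argument is absent, instead of dereferencing unconditionally.
int64_t loadOptionalOrZero(const int64_t *dim) {
  return dim != nullptr ? *dim : 0;
}
```

The caller can then branch on presence separately; the placeholder only guarantees that producing the operand never dereferences a null reference.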
[Flang] Fix crash in structure constructor lowering for PDT (#183543)
Fixes - [#181278](https://github.com/llvm/llvm-project/issues/181278)
This patch fixes a crash in Flang when parsing array constructors like:
`[ty0(2)(4)]`
The current implementation parses this as a type constructor `ty0(2)`,
followed by what appears to be another call `(4)`, instead of rejecting it
as invalid syntax. The lowering of `StructureConstructor` attempts to
retrieve the parent derived type using `sym->owner().derivedTypeSpec()`,
which returns `nullptr` for PDT cases and leads to a crash.
In `flang/lib/Lower/ConvertConstant.cpp`, a safeguard is added
that falls back to the constructor’s derived type
specification when the parent type cannot be obtained, preventing the
null dereference and eliminating the crash. This change addresses only
the immediate crash; proper diagnostic handling for this invalid syntax
[4 lines not shown]
Revert "Reapply "[VPlan] Remove manual region removal when simplifying for VF and UF. (#181252)""
This reverts commit 6aa115bba55054b0dc81ebfc049e8c7a29e614b2.
This is causing crashes. See #185345 for details.
[AMDGPU] Multi dword spilling for unaligned tuples
While spilling unaligned tuples, rather than breaking the
spill into 32-bit accesses, spill the first register as a
single 32-bit spill, and spill the remainder of the tuple
as an aligned tuple.
Some additional bookkeeping is required in the spilling
loop to manage the state.
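The splitting strategy described above can be sketched as a small planning function. This is a simplified model under stated assumptions, not the SIRegisterInfo code: `planSpill` and `Piece` are hypothetical names, "aligned" is assumed to mean an even starting register index, and the remainder is emitted as one piece even though a real implementation may need to split it further into legal spill sizes.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// (first register index, number of dwords) for one spill access.
using Piece = std::pair<unsigned, unsigned>;

// Hypothetical planner: for a tuple starting at an odd (unaligned) register,
// spill the first register alone, then the rest as one aligned tuple.
std::vector<Piece> planSpill(unsigned firstReg, unsigned numDwords) {
  assert(numDwords > 0 && "tuple must contain at least one register");
  std::vector<Piece> pieces;
  const bool unaligned = (firstReg % 2) != 0; // assumption: even alignment
  if (unaligned && numDwords > 1) {
    pieces.emplace_back(firstReg, 1u);                // single 32-bit spill
    pieces.emplace_back(firstReg + 1, numDwords - 1); // aligned remainder
  } else {
    pieces.emplace_back(firstReg, numDwords);         // already aligned
  }
  return pieces;
}
```

For a four-dword tuple starting at register 1, the plan is a one-dword access at register 1 followed by a three-dword access at register 2, rather than four separate 32-bit accesses.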
[AMDGPU] Remove alignment constraint from spill pseudos (#177317)
Spill pseudo opcodes don't require the target register class alignment
constraint.
For targets which do have alignment constraints, lower the spills to
32-bit accesses.
Update the machine verifier accordingly.
SGPR spill pseudos already didn't enforce alignment constraints.
Modify the VGPR spill register classes to not enforce them either.