[clang][AArch64] Use structured bindings in feature parsing code (#197689)
This is clearer than having to remember that `first` is the CPU and
`second` is the feature list.
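
To illustrate the readability point, here is a minimal sketch, not the actual clang code. The helper `splitCPUAndFeatures`, the `describe` wrapper, and the "cpu+feat1+feat2" input format are all hypothetical; the only thing being shown is how a structured binding names both halves of a returned pair.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper: splits "cpu+feat1+feat2" into a CPU name and a
// list of feature names. Not the parser used in clang.
std::pair<std::string, std::vector<std::string>>
splitCPUAndFeatures(std::string_view Spec) {
  assert(!Spec.empty() && "expected at least a CPU name");
  constexpr std::size_t NPos = std::string_view::npos;
  std::size_t Plus = Spec.find('+');
  std::string CPU(Spec.substr(0, Plus));
  std::vector<std::string> Features;
  while (Plus != NPos) {
    std::size_t Next = Spec.find('+', Plus + 1);
    std::size_t Len = (Next == NPos) ? NPos : Next - Plus - 1;
    Features.emplace_back(Spec.substr(Plus + 1, Len));
    Plus = Next;
  }
  return {CPU, Features};
}

// With a plain pair the caller would write Split.first / Split.second.
// The structured binding gives each half a meaningful name instead.
std::string describe(std::string_view Spec) {
  auto [CPU, Features] = splitCPUAndFeatures(Spec);
  return CPU + ":" + std::to_string(Features.size());
}
```

For example, `describe("cortex-a53+crc+sve")` yields the CPU name followed by a feature count of 2.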
[PowerPC] Match intrinsics ppc_amo_st[dw]at with a pattern
The intrinsics are 1:1 to the instructions except for the order of
the operands, thus it is easy to match them with a pattern.
However, the intrinsics are defined as reading and writing
memory, while the instructions explicitly set mayLoad to false.
Looking at the ISA description, it seems to me that the latter
is incorrect. In any case, the side-effect flags must be the
same, otherwise the pattern is rejected.
[libc] Annex K: Add constraint handler unit test class
This unit test class is intended for the tests related to Annex K.
Functions in Annex K may call a constraint handler, so the new class
makes it easier to check that the constraint-handling mechanism works
as expected.
[libc++] Replace ranges::find_first_of with std::find_first_of in __try_constant_folding (#197641)
This reduces the time it takes to instantiate `std::format` from ~160ms
to ~120ms in my testing.
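
For reference, a minimal sketch of the iterator-pair form of the algorithm. The helper `firstBrace` and its use of braces as the search set are hypothetical and are not taken from `__try_constant_folding`; the sketch only shows the call shape of `std::find_first_of`.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Hypothetical helper: index of the first '{' or '}' in Text,
// or Text.size() if neither character occurs.
std::size_t firstBrace(std::string_view Text) {
  constexpr std::string_view Braces = "{}";
  // Classic <algorithm> overload: two iterator pairs, no ranges machinery.
  auto It = std::find_first_of(Text.begin(), Text.end(),
                               Braces.begin(), Braces.end());
  return static_cast<std::size_t>(It - Text.begin());
}
```

The timing figures above come from the commit author; this sketch makes no claim about why the classic overload is cheaper to instantiate.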
[AMDGPU][NFC] Remove redundant hasMadU64U32NoCarry helper (#197682)
Use hasMadNC64_32Insts() (backed by SubtargetFeature) for MAD 64_32
no-carry and drop the old helper.
linux/io: handle memtype_wc mapping for !DMAP range
The amdgpu driver in drm-kmod will attempt to update/reserve certain GPU
VRAM ranges as write-combining. Depending on the system, this address
range may fall outside of FreeBSD's constructed DMAP. We cannot use
pmap_change_attr() in this case.
When INVARIANTS is enabled, this results in the following:
panic: physical address 0x880000000 not covered by the DMAP
Add a guard against triggering the KASSERT in PHYS_TO_DMAP().
This limitation in our implementation of arch_io_reserve_memtype_wc() is
already known in drm-kmod's amdgpu_bo_init(), and errors are ignored
there (see "BSDFIXME"). This change is only to eliminate the preventable
assertion failure within this scheme.
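
The shape of such a guard can be sketched as follows. This is a userland toy model under stated assumptions, not the kernel change: `DirectMap`, `coveredByDirectMap`, and `reserveWriteCombining` are hypothetical names, and the boolean return value stands in for whatever error the real routine reports.

```cpp
#include <cstdint>

// Hypothetical model of a direct map covering physical addresses
// in the half-open range [Start, End).
struct DirectMap {
  std::uint64_t Start;
  std::uint64_t End;
};

// True only if the whole range [Phys, Phys + Size) lies inside the map.
// Written so that Phys + Size is never computed and cannot overflow.
bool coveredByDirectMap(const DirectMap &Map, std::uint64_t Phys,
                        std::uint64_t Size) {
  return Size != 0 && Phys >= Map.Start && Phys < Map.End &&
         Size <= Map.End - Phys;
}

// Hypothetical stand-in for the reserve routine: report failure instead
// of translating an address that the direct map does not cover.
bool reserveWriteCombining(const DirectMap &Map, std::uint64_t Phys,
                           std::uint64_t Size) {
  if (!coveredByDirectMap(Map, Phys, Size))
    return false; // the caller decides whether to ignore the error
  // The memory attribute change would happen here in a real implementation.
  return true;
}
```

With a map ending below 0x880000000, the address from the panic message is rejected by the check rather than reaching the translation step.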
Tested by: kevans
[4 lines not shown]
[libc] Include correct headers in type_traits (#197691)
Otherwise we end up with errors like the following when building with
bazel:
```c++
In file included from external/+_repo_rules+llvm-project/libc/src/__support/CPP/type_traits/is_move_constructible.h:12:
external/+_repo_rules+llvm-project/libc/src/__support/CPP/type_traits/is_constructible.h:32:14: error: no template named 'bool_constant'
32 | : public bool_constant<__is_constructible(T, Args...)> {};
```
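
The dependency behind this error can be sketched in a single file. The `sketch` namespace and the local `integral_constant` are stand-ins, not the libc headers; the point is only that `bool_constant` must be declared before `is_constructible` uses it, which in a header means including the file that defines it directly rather than relying on a transitive include.

```cpp
namespace sketch {

// Stand-in definitions; the real ones live in their own headers.
template <typename T, T V> struct integral_constant {
  static constexpr T value = V;
};
template <bool V> using bool_constant = integral_constant<bool, V>;

// Uses bool_constant as a base class, so the declaration above must be
// visible here. __is_constructible is a compiler builtin in Clang and GCC.
template <typename T, typename... Args>
struct is_constructible
    : public bool_constant<__is_constructible(T, Args...)> {};

} // namespace sketch

static_assert(sketch::is_constructible<int, int>::value,
              "int is constructible from int");
```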
[DAG] SimplifyMultipleUseDemandedBits - fold (mul X, 1) -> X (#197677)
Use DemandedElts + KnownBits to match hidden identity patterns. This
helps especially with reduction patterns padded by legalisation.
Once #197455 has landed, I intend to convert this (plus
SMIN/SMAX/UMIN/UMAX and the existing ISD::ADD case) to use
isIdentityElement directly.
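
The idea of a "hidden" identity can be sketched with a toy model. This is not the SelectionDAG code: the `KnownBits` struct below is a simplified 32-bit stand-in, and `isKnownOne` and `mulIsIdentityOnDemandedLanes` are hypothetical names. The multiplier need not be a constant splat of 1; it only has to be known to equal 1 in the lanes that are actually demanded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy per-lane known-bits record: a set bit in Zero means that bit is
// known to be 0, a set bit in One means that bit is known to be 1.
struct KnownBits {
  std::uint32_t Zero;
  std::uint32_t One;
};

// A lane is known to equal 1 when bit 0 is known set and every other
// bit is known clear.
bool isKnownOne(const KnownBits &K) {
  return K.One == 1u && K.Zero == ~1u;
}

// Returns true if mul(X, Y) may be replaced by X, given the known bits
// of each lane of Y and a bitmask of the lanes the user demands.
bool mulIsIdentityOnDemandedLanes(const std::vector<KnownBits> &YLanes,
                                  std::uint64_t DemandedElts) {
  assert(YLanes.size() <= 64 && "toy model supports at most 64 lanes");
  for (std::size_t I = 0; I < YLanes.size(); ++I) {
    if (((DemandedElts >> I) & 1) == 0)
      continue; // lane not demanded, its value is irrelevant
    if (!isKnownOne(YLanes[I]))
      return false;
  }
  return true;
}
```

A two-lane multiplier whose second lane is unknown still folds when only the first lane is demanded, which is the padded-reduction situation the message mentions.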
[LV][NFC] Remove instcombine from RUN lines in AArch64 tests (#197448)
This PR continues my earlier work on removing unnecessary extra passes
from the RUN lines, which makes it easier to map the expected vectoriser
output to the CHECK lines. As a result, it has exposed some potential
optimisations that we may be able to perform in VPlan.
Here is a summary of the changes I've noticed:
1. instcombine likes to canonicalise GEPs into certain forms. I'm not
sure if there is value in VPlan trying to guess what the canonical form
should be.
2. In tests like sve-cond-inv-loads.ll, etc. the pattern sub(urem) is
often replaced with and(sub). This is potentially something the
vectoriser could improve although I don't know if it would change the
cost model.
3. There is poor codegen in gather_nxv4i32_ind64_stride2 in the file
sve-gather-scatter.ll, which is due to
[19 lines not shown]
[PowerPC] Update base crypto builtins and intrinsics (#197017)
Update the base crypto builtins and LLVM intrinsics to drop the mma_
prefix. Also fix the builtin definitions for dmsha2hash, dmsha3hash,
and dmxxshapad to use the correct immediate constraints.
[CodeGen] Debug insns must not affect liveness analysis (#193104)
Register references in debug instructions can affect LiveRegUnits
analysis. Skip over debug instructions.
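
A toy model of the rule, assuming a simplified instruction record: `Instr` and `liveBefore` are hypothetical and are not the LiveRegUnits API. A backward scan removes defined registers and adds used ones, and debug instructions are skipped so that their register references cannot change the result.

```cpp
#include <set>
#include <vector>

// Hypothetical instruction record for illustration only.
struct Instr {
  bool IsDebug;
  std::vector<unsigned> Defs;
  std::vector<unsigned> Uses;
};

// Computes the registers live before Block, walking backwards from the
// set that is live after it.
std::set<unsigned> liveBefore(const std::vector<Instr> &Block,
                              std::set<unsigned> Live) {
  for (auto It = Block.rbegin(); It != Block.rend(); ++It) {
    if (It->IsDebug)
      continue; // debug instructions must not affect liveness
    for (unsigned Reg : It->Defs)
      Live.erase(Reg);
    for (unsigned Reg : It->Uses)
      Live.insert(Reg);
  }
  return Live;
}
```

Without the `IsDebug` check, a register mentioned only by a debug instruction would appear live, so builds with and without debug info could produce different code.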
Tests in this PR would fail due to calls to LiveRegUnits::stepBackward
in RegisterScavenging, DeadMachineInstructionElim, and
AArch64InstrInfo.cpp getOutlinableRanges().
Other call-sites to stepBackward may also pass debug instructions to
LiveRegUnits::stepBackward, but LIT testing did not fail when
-debugify-and-strip-all-safe was enabled by default.
---------
Signed-off-by: John Lu <John.Lu at amd.com>
math/py-scipy: Work around meson finding cython-${NOT_PYVERSION}
For example, when building py313-scipy with py314-cython installed,
meson will find py314-cython. Judging by the two previous workarounds,
meson appears to lack a way to specify the path instead of searching.
usbdevs: Add TP-Link UB500 (RTL8761BUV) USB ID
This device is not yet supported.
Unfortunately some recently purchased UB400 dongles also contain this
Realtek IC.
Sponsored by: The FreeBSD Foundation