InstCombine: Improve single use fabs SimplifyDemandedFPClass handling (#176359)
SimplifyDemandedFPClass's handling of fabs recently became smarter in
the multiple use case than in the single use case. Unify these so the
single use case is equally smart. This includes propagating ninf / nnan
context into
the instruction, and accounting for nsz if the only bit difference is
for zero.
AMDGPU/GlobalISel: Regbanklegalize rules for G_UNMERGE_VALUES
Move G_UNMERGE_VALUES handling to AMDGPURegBankLegalizeRules.cpp.
Fix sgpr S16 unmerge by lowering it with shifts on S32.
Previously sgpr S16 unmerge was selected using _lo16 and _hi16 subreg
indexes which are exclusive to vgpr register classes.
For the remaining cases we do a trivial mapping, assigning the same reg
bank to all operands, vgpr or sgpr.
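As a rough illustration of the shift-based S16 unmerge described above,
the following Python sketch models splitting one 32-bit value into its
low and high 16-bit halves. The function name is hypothetical, and the
explicit masks exist only to model fixed-width registers in Python; this
is not the MIR that the rule actually emits.

```python
def unmerge_s32_to_s16(value):
    """Split a 32-bit value into its (low, high) 16-bit halves."""
    value &= 0xFFFFFFFF          # model a 32-bit register
    lo = value & 0xFFFF          # low half: keep the bottom 16 bits
    hi = (value >> 16) & 0xFFFF  # high half: shift right by 16
    return lo, hi
```

For example, `unmerge_s32_to_s16(0x12345678)` yields the pair
`(0x5678, 0x1234)`.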
exterr: Sort output from make_libc_exterr_cat_filenames.sh
Otherwise the script may permute the order of entries in the file since
find(1) output is not stable.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D54669
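The ordering problem fixed by the exterr change above is not specific to
find(1): any directory traversal may return entries in filesystem order.
A minimal Python sketch of the same remedy, sorting the collected paths
before emitting them, could look like this. The function name and the
`.c` filter are assumptions for illustration and are not taken from the
script.

```python
import os

def list_source_files(root):
    """Return relative paths of .c files under root in a stable order."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".c"):
                full = os.path.join(dirpath, name)
                found.append(os.path.relpath(full, root))
    # Traversal order depends on the filesystem, so sort before use.
    return sorted(found)
```

Sorting at the end keeps generated files byte-identical across runs, so
regenerating them does not produce spurious diffs.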
pkg-query.8: Add support for complex attribute evaluation in query
Update pkg-query(8) to include documentation for the newly supported
multiline variables in evaluation expressions.
- Split variables into 'Normal Variables' and 'Multiline variables'
- Add description and example for multiline variable evaluation (e.g., %dn)
- List supported multiline variables: %d, %r, %C, %L, %B, %b, %A
pkg-query: Add support for query evaluation of complex attributes
Add support for querying complex attributes (lists) in the evaluation
string mechanism. This includes:
- %d[nov]: Dependencies (Name, Origin, Version)
- %r[nov]: Reverse dependencies (Name, Origin, Version)
- %C: Categories
- %L: Licenses
- %B: Shared libraries required
- %b: Shared libraries provided
- %A[tv]: Annotations (Tag, Value)
These are implemented by generating subqueries using EXISTS/NOT EXISTS
operators in the resulting SQL.
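To illustrate the EXISTS/NOT EXISTS approach, here is a minimal Python
sketch using sqlite3. The table layout (`packages`, `deps`), the helper
names, and the mapping of `=` to EXISTS and `!=` to NOT EXISTS are
assumptions made for this example; they are not taken from the pkg
sources.

```python
import sqlite3

def dep_name_clause(op, value):
    """Build a WHERE fragment for a hypothetical '%dn <op> value' test."""
    if op not in ("=", "!="):
        raise ValueError("unsupported operator: " + op)
    # Assumption: '=' means "some dependency matches",
    # '!=' means "no dependency matches".
    prefix = "EXISTS" if op == "=" else "NOT EXISTS"
    sql = (prefix + " (SELECT 1 FROM deps AS d"
           " WHERE d.package_id = p.id AND d.name = ?)")
    return sql, [value]

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE packages (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE deps (package_id INTEGER, name TEXT);
    INSERT INTO packages VALUES (1, 'app'), (2, 'tool');
    INSERT INTO deps VALUES (1, 'libfoo'), (1, 'libbar'), (2, 'libbar');
""")

def query(op, value):
    """Return names of packages whose dependency list satisfies the test."""
    clause, params = dep_name_clause(op, value)
    rows = conn.execute(
        "SELECT p.name FROM packages AS p WHERE " + clause
        + " ORDER BY p.name", params)
    return [r[0] for r in rows]
```

A subquery is used rather than a join so that a package with several
dependencies still appears at most once in the result.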
The following fields are not implemented as their use cases are unclear:
- %F[psugmftl]: the list of files
- %S[pugmf]: the list of (sub-)directories
[3 lines not shown]
tests: Add test cases for dependency query evaluation
Add regression tests to verify the behavior of evaluating dependency
attributes (%dn) in query expressions.
- Test exact match (=)
- Test inequality (!=)
[DSE][Verifier] Respect the calling convention of the function specified by "alloc-variant-zeroed" (#175911)
Require that the calling convention between the zeroed and non-zeroed
variants is the same, and set it appropriately in the DSE transform.
[AMDGPU] si-peephole-sdwa: Handle V_PACK_B32_F16_e64 (WIP)
Change si-peephole-sdwa to eliminate V_PACK_B32_F16_e64 instructions
by changing the second operand to write to the upper word of the
destination directly.
[AMDGPU] Enable ISD::{FSIN,FCOS} custom lowering to work on v2f16
Currently ISD::FSIN and ISD::FCOS of type MVT::v2f16 are legalized by
first expanding and then using a custom lowering on the resulting f16
instructions. This ordering prevents using packed math variants of the
instructions introduced by the legalization (e.g. the multiplication),
if available, and makes it difficult to eliminate the packing of the
results by using SDWA form; previous attempts to deal with the latter
situation in the si-peephole-sdwa pass were unwieldy since it was
necessary to reconstruct the association between the source and target
vectors.
Change the legalization action for ISD::FSIN and ISD::FCOS of type
MVT::v2f16 to Custom and change the custom intrinsic lowering to deal
with the v2f16 for the intrinsics introduced in this way.
[AMDGPU] SIISelLowering: Use intrinsics in LowerTrig
This allows further legalization actions to be applied to the
resulting nodes, which is a preparatory step for extending the
custom lowering to vector types.
devel/godot35: Deprecate
The legacy version should have been removed with devel/godot35-tools;
consider migrating to devel/godot.
PR: 292141
(cherry picked from commit 4073e1292654ee946e82f5f25147e49c7a35559b)
[MLIR][XeGPU] Add support for cross-subgroup reduction from wg to sg (#170936)
This PR adds support for cross-sg reduction while distributing from
workgroup to subgroup. It has the following limitations:
1. Cannot reduce to a scalar
2. For cross-sg, only 1:1 decomposition (each sg should be assigned only
one tile in the original WG tile) is supported for now. For example, for
a WG tile of size 256x128, sg_layout = [8, 4] with sg_data = [16, 16] is
not supported.
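One way to read the 1:1 decomposition requirement is that sg_layout
times sg_data must cover the WG tile exactly in every dimension, so that
no subgroup is assigned more than one tile. The Python sketch below
encodes that reading. The function name is hypothetical, and the check
is an interpretation of the limitation, not code from the PR.

```python
def is_one_to_one(wg_tile, sg_layout, sg_data):
    """True when sg_layout[i] * sg_data[i] == wg_tile[i] in every dim."""
    if not (len(wg_tile) == len(sg_layout) == len(sg_data)):
        raise ValueError("rank mismatch")
    return all(l * d == t for t, l, d in zip(wg_tile, sg_layout, sg_data))
```

With the numbers from the example, `is_one_to_one([256, 128], [8, 4],
[16, 16])` is false because 8 * 16 = 128 does not cover the 256 rows,
while sg_data = [32, 32] with the same layout would satisfy the check.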