[LV] Add select instruction to VPReplicateRecipe::computeCost (#186825)
I've added the Instruction::Select opcode to the existing list of
opcodes that call getCostForRecipeWithOpcode. There are currently 5
tests that query the cost of a select:
Transforms/LoopVectorize/AArch64/widen-gep-all-indices-invariant.ll
Transforms/LoopVectorize/first-order-recurrence-with-uniform-ops.ll
Transforms/LoopVectorize/narrow-to-single-scalar.ll
Transforms/LoopVectorize/replicate_fneg.ll
Transforms/LoopVectorize/single-scalar-cast-minbw.ll
All five still pass with this change, which is reasonable evidence that
the costs are correct.
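
As a rough illustration of the pattern described above (one more opcode
joining a list of cases that share a cost helper), here is a toy sketch.
Every name in it (Opcode, sharedCostForOpcode, computeReplicateCost) and
every cost value is a hypothetical placeholder, not LLVM's actual API or
cost model:

```cpp
#include <cstdint>
#include <optional>

// Toy stand-in for an instruction opcode; not LLVM's Instruction enum.
enum class Opcode { Add, Mul, ICmp, Select, Call };

// Hypothetical shared helper. The returned numbers are placeholders only.
inline std::int64_t sharedCostForOpcode(Opcode Op) {
  switch (Op) {
  case Opcode::Add:    return 1;
  case Opcode::Mul:    return 2;
  case Opcode::ICmp:   return 1;
  case Opcode::Select: return 1;
  case Opcode::Call:   return 10;
  }
  return 0; // unreachable for valid enumerators
}

// Opcodes in the case list are routed to the shared helper; anything else
// reports "not handled here" so a caller could fall back to another path.
inline std::optional<std::int64_t> computeReplicateCost(Opcode Op) {
  switch (Op) {
  case Opcode::Add:
  case Opcode::Mul:
  case Opcode::ICmp:
  case Opcode::Select: // the kind of one-line addition the commit describes
    return sharedCostForOpcode(Op);
  default:
    return std::nullopt;
  }
}
```

In this sketch, dropping the Select case label would make
computeReplicateCost(Opcode::Select) return std::nullopt instead of a cost.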
libclc: Use small trig reduction for nan (#186983)
NaN should work on either path, but the small reduction
path is smaller. There are also possible codegen benefits to
knowing the large reduction will not need to handle NaNs.
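
A minimal sketch of the idea, assuming the path is chosen by a magnitude
test. The threshold kSmallLimit, the ReductionPath enum and the function
chooseReduction are hypothetical names for illustration and are not taken
from libclc:

```cpp
#include <cmath>

enum class ReductionPath { Small, Large };

// Hypothetical cutoff; the real value used by libclc is not shown here.
constexpr float kSmallLimit = 1.0e5f;

// Route NaN to the small path explicitly, so the large path can be
// written (and optimized) on the assumption that it never sees NaN.
inline ReductionPath chooseReduction(float x) {
  if (std::isnan(x) || std::fabs(x) < kSmallLimit)
    return ReductionPath::Small;
  return ReductionPath::Large;
}
```

Because NaN propagates through ordinary arithmetic, the small path needs no
special case to produce NaN for a NaN input.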
devel/deheader: update to 1.12
Changes:
* Fixed typo in hsearch() pattern.
* Added requirements for entire SUSv2 coverage.
* Spellchecked the documentation.
* Corrected an error in CMake build detection.
* Typo fixes and code hardening by ChatGPT 5.2.
* Don’t over-remove duplicates when -r is specified.
libclc: Move edge case handling of trig functions (#186429)
The explicit handling of NaN is unnecessary. Clamp infinities
to NaN at the input. This lets optimizations of the implementation
code that follows take advantage of the knowledge that it does not
need to handle infinities.
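
The input clamp can be sketched as follows. This is an illustration under
assumptions, not the libclc code: clampInfToNan and sinSketch are
hypothetical names, and std::sin stands in for the real implementation:

```cpp
#include <cmath>
#include <limits>

// Replace +/-infinity with NaN up front; all other values pass through.
inline float clampInfToNan(float x) {
  return std::isinf(x) ? std::numeric_limits<float>::quiet_NaN() : x;
}

// After the clamp, the body only ever sees finite values or NaN, so it
// needs no separate branch for infinities; NaN simply propagates.
inline float sinSketch(float x) {
  x = clampInfToNan(x);
  return std::sin(x); // stand-in for the actual reduction and polynomial
}
```

The result for an infinite input is NaN, as it is for the standard sine,
but the decision is made once at the entry point.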
[VPlan] Factor collectGroupedReplicateMemOps (NFC) (#186820)
Factor collectGroupedReplicateMemOps out of
collectComplementaryPredicatedMemOps, so it can be reused in other
places.
[OpenMP][OMPT] Add missing `error` entry to device tracing record union (#185683)
While `omp-tools.h` already includes the `ompt_record_error_t` struct,
the corresponding union entry was missing from `ompt_record_ompt_t`.
This commit adds the missing entry.
Note that this does not enable any functionality for device tracing
records.
This only aligns the struct with OpenMP v5.1 and newer. OpenMP v5.0 did
not contain the `error` directive.
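
To illustrate what a missing union entry means in practice, here is a toy
record type. All names (ToyRecord, ToyErrorRecord, makeErrorRecord, and so
on) are hypothetical and only loosely modelled on the shape described
above; see `omp-tools.h` for the real definitions:

```cpp
#include <cstddef>

// Toy per-event payloads; field names are illustrative assumptions.
struct ToyThreadBeginRecord { int thread_type; };
struct ToyErrorRecord {
  int severity;
  const char *message;
  std::size_t length;
};

enum class ToyRecordKind { ThreadBegin, Error };

// A tagged record: `kind` says which union member is meaningful.
struct ToyRecord {
  ToyRecordKind kind;
  union {
    ToyThreadBeginRecord thread_begin;
    ToyErrorRecord error; // without this member, the payload struct exists
                          // but cannot be stored in a record
  } record;
};

inline ToyRecord makeErrorRecord(int severity, const char *msg,
                                 std::size_t len) {
  ToyRecord r{};
  r.kind = ToyRecordKind::Error;
  r.record.error = ToyErrorRecord{severity, msg, len};
  return r;
}
```

As the commit notes, adding the member only makes the type complete;
nothing starts producing such records as a result.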
CC: @jprotze
Signed-off-by: Jan André Reuter <j.reuter at fz-juelich.de>