[compiler-rt][ARM] Optimized FP double <-> single conversion (#179926)
This commit provides assembly versions of the conversions both ways
between double and float.
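As a reference for the semantics only (not the assembly in this commit), the two directions can be sketched in Python, whose `float` is an IEEE 754 double. The helper name `to_single` is hypothetical; it narrows to single precision via `struct` and widens the result back, which shows that narrowing may round while widening is exact.

```python
import struct


def to_single(x: float) -> float:
    """Round a double to single precision, then widen it back to a double.

    Hypothetical helper for illustration; values outside the float range
    are not handled here (struct.pack raises OverflowError for them).
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 0.1 is not exactly representable, so narrowing changes the value.
narrowed = to_single(0.1)

# 1.5 fits in a float exactly, so the round trip is lossless.
exact = to_single(1.5)
```

A value that already fits in single precision survives the round trip unchanged, which is a cheap sanity check for any conversion routine.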
[SLP] Improve InsertElement scalarization cost modeling
When costing InsertElement tree entries, pass getScalarizationOverhead the
per-lane insert operands via AdjustedVL, set ForPoisonSrc from whether the
base vector is entirely undef, and supply a VectorInstrContext hint derived
from the demanded insert instructions. Move the scalarization cost adjustment
to after InMask is computed so ForPoisonSrc reflects the actual base vector
state.
Reviewers: bababuck, RKSimon, hiraditya
Pull Request: https://github.com/llvm/llvm-project/pull/199514
[VPlan] Construct VPlan1 once, share across buildVPlans calls. (#197276)
Extract the VF-independent VPlan1 setup pipeline (header phis,
simplification, early-exit handling, middle check, loop regions, tail
folding, mask introduction) into a new helper tryToBuildVPlan1().
Construct the initial VPlan1 once, and pass it to repeated buildVPlans
calls.
Note that this means we need to move collectInLoopReductions up. We may
now also construct VPlan1 on code paths where we previously did not
(those where UserVF validation/selection fails), but I think that should
be fine as this makes the overall code simpler and the UserVF code paths
are for testing.
PR: https://github.com/llvm/llvm-project/pull/197276
[clangd] Prefer .hpp files over .h with header source switch (#198152)
Previously, the "Switch Between Source/Header" action picked `.h` over
`.hpp` when both files existed next to a `.cpp` file, because `.h` is
listed first in the header-extension list.
This patch reorders `HeaderExtensions` and `SourceExtensions` so the
`C++`-flavored extensions come before `.h` and `.c`. The `C++` flavor is
preferred since (at least in my opinion) more people use `clangd` for
`C++` than for `C` with `.hpp` headers, so switching from `.cpp` should
go to `.hpp`, not `.h`.
This introduces an edge case: switching from `.c` now goes to `.hpp`
instead of `.h`. I think this situation is rarer than having a `.cpp`
next to both `.hpp` and `.h`, since `.h` headers can be used as an
`extern "C"` wrapper of a C++ library.
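The effect of the reordering can be sketched as a first-match lookup over an ordered extension list. This is an illustration in Python, not clangd's code: `pick_counterpart` and both extension lists are hypothetical, and the real `HeaderExtensions` contents may differ.

```python
from typing import Iterable, Optional, Sequence

# Hypothetical orders for illustration; clangd's actual lists may differ.
OLD_HEADER_EXTS = [".h", ".hh", ".hpp", ".hxx"]
NEW_HEADER_EXTS = [".hh", ".hpp", ".hxx", ".h"]


def pick_counterpart(stem: str, existing: Iterable[str],
                     exts: Sequence[str]) -> Optional[str]:
    """Return the first existing file named stem + ext, trying exts in order."""
    present = set(existing)
    for ext in exts:
        candidate = stem + ext
        if candidate in present:
            return candidate
    return None


files = ["foo.cpp", "foo.h", "foo.hpp"]
old_choice = pick_counterpart("foo", files, OLD_HEADER_EXTS)
new_choice = pick_counterpart("foo", files, NEW_HEADER_EXTS)

# The edge case from the message: a .c file next to both header kinds.
c_choice = pick_counterpart("bar", ["bar.c", "bar.h", "bar.hpp"],
                            NEW_HEADER_EXTS)
```

With the old order the lookup stops at `foo.h`; with the new order it reaches `foo.hpp` first, and the same ordering is what sends `bar.c` to `bar.hpp`.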
net-im/libquotient: Update to 0.9.6.1
- Build shared library
- Remove stale CONFLICTS
- Pass the port to kde@ team
PR: 295459
Approved by: adridg (maintainer)
py-jupyterlab: updated to 4.5.7
4.5.7
Enhancements made
- Update default font family to honor macOS system-wide ui-monospace
Bugs fixed
- Video and Audio Content Providers: Fix JupyterLite support
- Fix notebook hang when dropping cells
- Fix Contextual Help keyboard shortcut reliability and menu Help functionality
- Fix focusing input element when opening a dialog from Command Palette
- Fix native context menu blocked even when context menu is suppressed
- Fix flaky toolbar item placement in popup
Maintenance and upkeep improvements
[7 lines not shown]
py-jupyter_server: updated to 2.18.2
2.18.2
Bugs fixed
- Fix saving user avatar URL
- Fix path resolution if `root_dir` is a filesystem root
Maintenance and upkeep improvements
- Add Zulip notification when a release is complete
- chore: update pre-commit hooks
[LV] Handle loop.dependence.mask in verifyLastActiveLaneRecipe() (#199897)
This verification can be called after the alias-mask has been expanded,
so it needs to recognize loop.dependence.mask intrinsics.
[MLIR][AMDGPU] Add permlane16.var and permlanex16.var intrinsic ops (#199501)
## Summary
Add ROCDL and AMDGPU dialect support for the GFX12+ variable-selector
permlane intrinsics (`v_permlane16_var_b32` / `v_permlanex16_var_b32`).
Unlike the existing fixed-selector `permlane16`/`permlanex16` ops where
source-lane indices come from SGPR immediates, the "var" variants take
per-lane source-lane indices from a VGPR, enabling arbitrary per-lane
intra-row and cross-row permutations within a wave32 subgroup.
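A rough lane-level model of the two "var" permutations, written in Python for illustration. It assumes a wave32 made of two rows of 16 lanes and uses only the low 4 bits of each per-lane selector; the `old`, `fi`, and `boundControl` operands and the EXEC mask are deliberately ignored, and the function names are hypothetical.

```python
ROW = 16   # lanes per row
WAVE = 32  # lanes per wave32 subgroup


def permlane16_var(src0, sel):
    """Each lane reads src0 from the lane in its OWN row chosen by sel[lane]."""
    assert len(src0) == WAVE and len(sel) == WAVE
    out = []
    for lane in range(WAVE):
        row_base = lane - lane % ROW
        out.append(src0[row_base + sel[lane] % ROW])
    return out


def permlanex16_var(src0, sel):
    """Each lane reads src0 from the lane in the OTHER row chosen by sel[lane]."""
    assert len(src0) == WAVE and len(sel) == WAVE
    out = []
    for lane in range(WAVE):
        other_base = (lane - lane % ROW) ^ ROW
        out.append(src0[other_base + sel[lane] % ROW])
    return out


values = list(range(100, 132))

# Per-lane selectors that reverse each row in place.
reverse = [15 - lane % ROW for lane in range(WAVE)]
reversed_rows = permlane16_var(values, reverse)

# Identity selectors with the cross-row variant swap the two rows.
identity = [lane % ROW for lane in range(WAVE)]
swapped_rows = permlanex16_var(values, identity)
```

Because the selector is an ordinary per-lane value, any mapping within a row (or into the opposite row) can be expressed, not just the patterns encodable in scalar immediates.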
### ROCDL dialect
- `ROCDL_Permlane16VarOp` → `llvm.amdgcn.permlane16.var`
- `ROCDL_PermlaneX16VarOp` → `llvm.amdgcn.permlanex16.var`
- Both take `(old, src0, src1, fi, boundControl)` with `fi` and
`boundControl` as immediate i1 attrs
### AMDGPU dialect
[11 lines not shown]