Reapply "AMDGPU: Use real copysign in fast pow (#97152)"
This reverts commit bff619f91015a633df659d7f60f842d5c49351df.
This was reverted due to regressions caused by poor copysign
optimization, which have been fixed.
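For readers unfamiliar with the pattern, a minimal sketch of what "real
copysign" means in a fast pow expansion. It assumes the usual
exp2(n * log2(|x|)) form for an integer exponent; fast_pown_sketch is a
hypothetical name for illustration, not the AMDGPU implementation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (assumption, not the AMDGPU library code):
// compute the magnitude from |x|, then restore the sign with copysign
// instead of OR-ing a sign bit into the result by hand.
double fast_pown_sketch(double x, int n) {
    double magnitude = std::exp2(n * std::log2(std::fabs(x)));
    // The result is negative only when x is negative and n is odd.
    double sign_source = (n & 1) ? x : 1.0;
    return std::copysign(magnitude, sign_source);
}
```

Expressing the sign fixup as copysign leaves it visible to the optimizer
as a single operation rather than as opaque integer bit manipulation.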
[libc++] Rewrite the std::make_heap benchmark (#178696)
This rewrites the `make_heap` benchmark to make it significantly faster
to run. In my test it saves ~10 minutes.
This patch also drops `ranges::` heap benchmarks, since we've decided to
remove `ranges::` benchmarks if there is a `std::` equivalent.
[flang][openmp] Fix GPU byref reduction descriptor initialization (#178934)
When generating GPU reduction code for arrays passed by reference, only
the base_ptr field was initialized in the shuffled descriptor, leaving
extent, stride, and rank fields uninitialized. This caused garbage
metadata to be passed to user reduction combiners, resulting in
incorrect iteration bounds and crashes on GPU targets.
Fix by copying the entire source descriptor and then updating the
base_ptr to point to thread-private storage. This preserves all metadata
(extents, strides, rank) while correctly pointing to the shuffled data
location.
The fix applies to three reduction helper functions:
- _omp_reduction_shuffle_and_reduce_func (warp-level shuffle)
- _omp_reduction_list_to_global_reduce_func (block-to-global)
- _omp_reduction_global_to_list_copy_func (global-to-block)
Fixes multi-dimensional array reductions on GPU target regions with
[2 lines not shown]
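The copy-then-patch approach described above can be sketched as follows.
The Descriptor struct and make_shuffled_descriptor are hypothetical and
heavily simplified; the real Flang descriptor layout and the generated
helper code differ.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified array descriptor (assumption for illustration).
struct Descriptor {
    void *base_ptr;
    int rank;
    std::size_t extent[2];
    std::size_t stride[2];
};

// Copy every field from the source descriptor, then redirect only
// base_ptr to the thread-private storage. Initializing base_ptr alone
// would leave rank, extent and stride holding garbage.
Descriptor make_shuffled_descriptor(const Descriptor &src,
                                    void *private_storage) {
    Descriptor dst = src;            // preserves rank, extents, strides
    dst.base_ptr = private_storage;  // point at the shuffled data
    return dst;
}
```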
[AArch64] Use brk{a,b} for a lane mask from cttz.elts (#178674)
cttz.elts is usually lowered (for SVE) to a brkb followed by a cntp. If
we then want a mask based on that (say, for early exit masking) then we
would use a whilelo from 0 to the result of cntp. But that just gives us
the same mask as the initial brkb, so we can just remove the cntp and
the whilelo.
Brka matches the extra +1 in the pattern.
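A scalar model of the equivalence, as a sketch only. It assumes 8 lanes
and an all-true governing predicate; the functions below are simplified
stand-ins named after the SVE instructions, not the instructions
themselves.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 8;  // assumed lane count for illustration
using Pred = std::array<bool, kLanes>;

// Lanes strictly before the first true lane (models brkb).
Pred brkb(const Pred &p) {
    Pred out{};
    for (std::size_t i = 0; i < kLanes && !p[i]; ++i) out[i] = true;
    return out;
}

// Lanes up to and including the first true lane (models brka).
Pred brka(const Pred &p) {
    Pred out{};
    for (std::size_t i = 0; i < kLanes; ++i) {
        out[i] = true;
        if (p[i]) break;
    }
    return out;
}

// Number of active lanes (models cntp).
std::size_t cntp(const Pred &p) {
    std::size_t n = 0;
    for (bool b : p) n += b ? 1 : 0;
    return n;
}

// Lane i is active while lo + i < hi (models whilelo).
Pred whilelo(std::size_t lo, std::size_t hi) {
    Pred out{};
    for (std::size_t i = 0; i < kLanes; ++i) out[i] = (lo + i) < hi;
    return out;
}
```

In this model whilelo(0, cntp(brkb(p))) yields the same mask as brkb(p),
and whilelo(0, cntp(brkb(p)) + 1) yields the same mask as brka(p), which
is why the cntp and whilelo can be dropped.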
py-dulwich: updated to 1.0.0
1.0.0 2026-01-17
* Release of 1.0!
From here on, Dulwich will not break backwards compatibility until 2.0 -
although we may print ``DeprecationWarning`` when using deprecated
functionality.
Micro releases (1.x.y) will be reserved for important bugfixes.
Major releases (1.x.0) will introduce new features and functionality,
without breaking backwards compatibility.
[alpha.webkit.RetainPtrCtorAdoptChecker] Don't treat calling (void)copy:(id) as a leak (#179713)
UIResponderStandardEditActions defines (void)copy:(id)sender, but this
selector should not be treated as a copy operation: it is a "copy" in
the sense of the application triggering copy and paste through the
system pasteboard.
---------
Co-authored-by: Balázs Benics <benicsbalazs at gmail.com>
ruby40: fix build with CPPFLAGS=-g
CPPFLAGS was passed unmodified to dtrace, which does not support every
compiler flag that can appear there.
Add link to upstream bug report.
[AArch64] Add test coverage for funnel shift with undef amount. NFC (#179888)
Precommit tests for #57256 showing inconsistencies between SDAG and
GISel for funnel shift with undef amount. GISel is wrong and should
match SDAG.