[clang][cir] Adding myself in CODEOWNERS for CIRGenBuiltinAArch64.cpp (#187570)
This is to help with #185382 and to make sure that I don't miss any PRs.
libclc: Use log intrinsic for half and float cases for amdgpu (#187538)
This is pretty verbose and ugly. We're pulling the base implementation
in for the double cases, and scalarizing it. Also fully defining the
half and float cases to directly use the intrinsic, for all vector
types. It would be much more convenient if we had linker based overrides
for the generic implementations, rather than per source file.
libclc: Rewrite log implementation as gentype inc file (#187537)
Follow the ordinary gentype conventions for the log implementation,
instead of using a plain header. This doesn't quite yet enable
vectorization, due to how the table is currently indexed. This should
make it easier for targets to selectively overload the function for
a subset of types.
[AArch64] Use an unknown size for memcpy ops with non-constant sizes. (#187445)
The previous value of 0 allowed loads to move past the MOPS
operations, which is not valid. Use a LocationSize::afterPointer()
size instead.
The GISel lowering currently loses the MMO, which is fine as it should
be conservatively treated as a load/store to any location.
libclc: Update trigpi functions (#187579)
These were originally ported from the ROCm device libs in
bc81ebefb7d9d9d71d20bfee2ce4cccb09701e9b. Merge in more recent changes.
[LV] Explain why a less profitable VF was chosen (NFCI) (#187469)
I was very puzzled the other day when it showed that VF 8 had a cost of
X and VF 16 had a cost of X/2, yet it still chose VF 8. This PR adds
some extra debug output to explain why this happens.
timerfd: Suppress kqueue readability after jump read
Do not report EVFILT_READ after reading a discontinuous clock jump.
This makes the kqueue filter consistent with Linux epoll behavior
and timerfd_poll(), which already checks tfd_jumped != TFD_READ before
reporting POLLIN.
MFC after: 2 weeks
timerfd: Wake up on discontinuous jump
If a discontinuous realtime clock change occurs and sets any TFD_JUMPED
bits on the timerfd, then wake up waiting readers. This fixes failures
from the timerfd_root__clock_change_notification test case.
MFC after: 2 weeks
sys/time: Add saturating sbt conversions
When converting from timespec to sbintime, the timespec's 64-bit tv_sec
component is shifted to the left 32 bits, causing any information in the
upper 32 bits to be lost.
This data loss during conversion can turn timespecs with very large
tv_sec counters into sbintimes that represent much smaller time
durations.
Add tstosbt_sat() and tvtosbt_sat(), which are saturating versions of
tstosbt and tvtosbt. With these routines, any overflow resulting from
the conversion is clamped to [-SBT_MAX - 1, SBT_MAX].
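
The data loss and the clamping described above can be illustrated with a
minimal userland sketch. It is not the kernel implementation: the type
sbt_sketch_t, the SBT_SKETCH_* macros and both sketch_* functions are
hypothetical stand-ins, and the sketch assumes a signed 64-bit value in
32.32 fixed point, so that any seconds value outside the int32_t range
overflows.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for sbintime_t: signed 64-bit, 32.32 fixed point. */
typedef int64_t sbt_sketch_t;
#define SBT_SKETCH_MAX INT64_MAX
#define SBT_SKETCH_MIN INT64_MIN	/* equals -SBT_SKETCH_MAX - 1 */

/* Plain conversion: bits 32..63 of sec are shifted out and lost. */
static sbt_sketch_t
sketch_tstosbt(int64_t sec, long nsec)
{
	assert(nsec >= 0 && nsec < 1000000000L);
	/* Shift as unsigned so the overflow is well defined (it wraps). */
	uint64_t whole = (uint64_t)sec << 32;
	/* Truncating fraction; the real routines may round differently. */
	uint64_t frac = ((uint64_t)nsec << 32) / 1000000000u;
	return (sbt_sketch_t)(whole + frac);
}

/* Saturating conversion: clamp instead of silently dropping bits. */
static sbt_sketch_t
sketch_tstosbt_sat(int64_t sec, long nsec)
{
	if (sec > INT32_MAX)
		return SBT_SKETCH_MAX;
	if (sec < INT32_MIN)
		return SBT_SKETCH_MIN;
	return sketch_tstosbt(sec, nsec);
}
```

With sec = 2^32 + 5, the plain sketch returns the fixed-point value for
5 seconds, while the saturating sketch returns SBT_SKETCH_MAX.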
Reviewed by: imp, markj
Differential Revision: https://reviews.freebsd.org/D55791
MFC after: 2 weeks
timerfd: Add tests
Take Jan Kokemuller's timerfd tests from the epoll-shim project,
stripping out code that isn't directly related to FreeBSD.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D55789
MFC after: 2 weeks
timerfd: Use saturating sbintime conversions
Some timerfd consumers set expirations with timespec tv_sec components
larger than 2^31 - 1. In such cases, converting that timespec to
sbintime results in data loss or a sign flip, yielding a shorter
expiration than desired.
To avoid this problem, use saturating timespec-to-sbintime conversion
functions. These will clamp the converted sbintime to SBT_MAX under
circumstances where the normal conversion functions would overflow.
Saturating conversions still result in data loss, but the consequences
are less severe, causing problems only after SBT_MAX (~68 years) of
system uptime elapses.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D55792
MFC after: 2 weeks
timerfd: Fix interval callout scheduling
When a timerfd interval callout misses its scheduled activation time, a
differential is calculated based on the actual activation time and the
scheduled activation time. This differential is divided by the timerfd's
interval time and the quotient is added to the timerfd's counter.
Before this change, the next callout was scheduled to activate at:
scheduled activation time + timerfd interval.
This change fixes the scheduling of the next callout to activate at:
actual activation time + timerfd interval - remainder.
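
The arithmetic above can be sketched as follows. This is an illustration
only, not the timerfd code: struct resched_sketch and resched_sketch()
are hypothetical names, times are plain integers in an arbitrary unit,
and the sketch assumes the actual activation time is not earlier than
the scheduled one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for the sketch. */
struct resched_sketch {
	uint64_t missed;	/* quotient added to the counter */
	int64_t next;		/* activation time of the next callout */
};

static struct resched_sketch
resched_sketch(int64_t scheduled, int64_t actual, int64_t interval)
{
	struct resched_sketch r;
	int64_t diff;

	assert(interval > 0 && actual >= scheduled);
	/* How late the callout ran. */
	diff = actual - scheduled;
	/* Whole intervals that elapsed while the callout was late. */
	r.missed = (uint64_t)(diff / interval);
	/* Realign to the original period: actual + interval - remainder. */
	r.next = actual + interval - diff % interval;
	return r;
}
```

For example, with a scheduled time of 100, an interval of 10 and an
actual activation at 127, the differential is 27, the quotient is 2 and
the remainder is 7. The old formula would schedule the next callout at
110, which is already in the past; the new one schedules it at 130.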
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D55790
MFC after: 2 weeks
libclc: Replace flush_if_daz implementation (#187569)
The fallback non-canonicalize path didn't work. Use a more
straightforward implementation. Eventually this should use
the pattern from #172998.
[Coroutines][NFC] Elide coro.free based on frame instead of coro.id (#187627)
Part 2/4: Implement HALO for coroutines that flow off final suspend.
Parent PR approved in https://github.com/llvm/llvm-project/pull/185336,
with no changes since then.
Since `coro.id` is unavailable in resumers, elide `coro.free` based on
the frame instead of `coro.id`.