[lldb][dwarf] Compute fully qualified names on simplified template names with DWARFTypePrinter (#117071)
This is a reland of https://github.com/llvm/llvm-project/pull/112811.
Fixed the bot breakage by running ld.lld explicitly.
[libc++][Android] BuildKite CI: update Clang and sysroot versions (#116151)
Android clang-r536225 identifies as Clang 19 but it predates LLVM
19.0.0. It is based off of fc57f88f007497a4ead0ec8607ac66e1847b02d6.
[libc++][Android] Allow testing libc++ with clang-r536225 (#116149)
The Android clang-r536225 compiler identifies as Clang 19, but it is
based on commit fc57f88f007497a4ead0ec8607ac66e1847b02d6, which predates
the official LLVM 19.0.0 release.
Some tests need fixes:
* The sized delete tests fail because clang-r536225 leaves sized
deallocation off by default.
* std::is_array<T[0]> is true when this Android Clang version is used with
a trunk libc++, but we expect it to be false in the test. In practice,
Clang and libc++ usually come from the same commit on Android.
[memprof] Upgrade a unit test to MemProf Version 3 (#117063)
This patch upgrades a unit test to MemProf Version 3 while removing
those bits that cannot be upgraded to Version 3.
The bits being removed expect instrprof_error::hash_mismatch from a
broken MemProf profile that references a frame that doesn't actually
exist. Now, Version 3 no longer issues
instrprof_error::hash_mismatch. Even if it still issued
instrprof_error::hash_mismatch, we would have a couple of hurdles:
- InstrProfWriter::addMemProfData will soon require all (or none) of
the fields (frames, call stacks, and records) be populated. That
is, it won't accept an instance of IndexedMemProfData with frames
missing.
- writeMemProfV3 asserts that every frame occurs at least once:
assert(MemProfData.Frames.size() == FrameHistogram.size());
[2 lines not shown]
[libc] Use best-fit binary trie to make malloc logarithmic (#106259)
This reworks the free store implementation in libc's malloc to use a
dlmalloc-style binary trie of circularly linked FIFO free lists. This
data structure can be maintained in logarithmic time, but it still
permits a relatively small implementation compared to other
logarithmic-time ordered maps.
The implementation doesn't do the various bitwise tricks or
optimizations used in actual dlmalloc; it instead optimizes for
(relative) readability and minimum code size. Specific optimizations can
be added as necessary given future profiling.
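The lookup idea can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the libc implementation: the names trie_node, trie_insert, and trie_best_fit are hypothetical, each node stores only a size (the FIFO free list per node and node removal are omitted), and the root is assumed to cover a power-of-two range of sizes. Each level halves the range, so both operations visit at most about log2(width) nodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node holding one distinct free-block size. A real allocator
 * would hang a FIFO list of same-sized free blocks off each node. */
struct trie_node {
    size_t size;
    struct trie_node *child[2]; /* [0] lower half of the range, [1] upper */
};

/* Insert `size` into a trie whose root covers [min, min + width), where
 * width is a power of two. Each level halves the covered range. */
static void trie_insert(struct trie_node **slot, size_t min, size_t width,
                        size_t size)
{
    assert(size >= min && size - min < width);
    while (*slot != NULL) {
        if ((*slot)->size == size)
            return; /* same size: a real allocator appends to the list */
        width /= 2;
        int upper = size - min >= width;
        if (upper)
            min += width;
        slot = &(*slot)->child[upper];
    }
    *slot = calloc(1, sizeof(**slot));
    if (*slot != NULL)
        (*slot)->size = size;
}

/* Return the node with the smallest size >= want, or NULL if none fits. */
static const struct trie_node *trie_best_fit(const struct trie_node *t,
                                             size_t min, size_t width,
                                             size_t want)
{
    const struct trie_node *best = NULL;
    const struct trie_node *deferred = NULL; /* deepest skipped upper subtree */

    /* Descend along the path that `want` itself would take. */
    while (t != NULL) {
        if (t->size >= want && (best == NULL || t->size < best->size))
            best = t;
        if (best != NULL && best->size == want)
            return best; /* exact fit */
        width /= 2;
        int upper = want >= min + width;
        if (!upper && t->child[1] != NULL)
            deferred = t->child[1]; /* everything in there is > want */
        if (upper)
            min += width;
        t = t->child[upper];
    }
    /* The smallest size in the deferred subtree lies on its leftmost path. */
    for (t = deferred; t != NULL; t = t->child[0] ? t->child[0] : t->child[1])
        if (best == NULL || t->size < best->size)
            best = t;
    return best;
}
```

Only the most recently skipped upper subtree needs to be remembered, because every shallower skipped subtree covers strictly larger sizes.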
Add test for maximum SMB valid pwrite range
We need to validate that we can write above the 16 TiB offset and that
writes are properly stopped at about the 64 TiB offset.
[RISCV][GISel] Move G_BRJT expansion to legalization (#73711)
Instead of custom selecting a bunch of instructions, we can expand to
generic MIR during legalization.
pf: Let rdr rules modify the src port if doing so would avoid a conflict
If NAT rules cause inbound connections to different external IPs to be
mapped to the same internal IP, and some application uses the same
source port for multiple such connections, rdr translation may result in
conflicts that cause some of the connections to be dropped.
Address this by letting rdr rules detect state conflicts and modulate
the source port to avoid them.
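The conflict and its resolution can be sketched with a toy state table in C. Everything here is hypothetical (state_key, state_table, choose_src_port, and the port range passed in); it is not pf's data model. It only shows why two connections that differ solely in their external destination IP collide once rdr maps both to the same internal address, and how picking a different source port separates them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified post-translation state key. */
struct state_key {
    uint32_t src_ip;
    uint16_t src_port;
    uint32_t dst_ip;   /* internal IP after rdr */
    uint16_t dst_port;
};

#define MAX_STATES 16

struct state_table {
    struct state_key keys[MAX_STATES];
    size_t count;
};

static bool key_equal(const struct state_key *a, const struct state_key *b)
{
    return a->src_ip == b->src_ip && a->src_port == b->src_port &&
        a->dst_ip == b->dst_ip && a->dst_port == b->dst_port;
}

static bool key_in_use(const struct state_table *t, const struct state_key *k)
{
    for (size_t i = 0; i < t->count; i++)
        if (key_equal(&t->keys[i], k))
            return true;
    return false;
}

/* Insert a key; fails on a conflict or when the toy table is full. */
static bool state_insert(struct state_table *t, const struct state_key *k)
{
    if (t->count == MAX_STATES || key_in_use(t, k))
        return false;
    t->keys[t->count++] = *k;
    return true;
}

/* Keep the original source port when the key is free; otherwise try the
 * ports in [low, high]. On failure the original port is restored. */
static bool choose_src_port(const struct state_table *t, struct state_key *k,
                            uint16_t low, uint16_t high)
{
    uint16_t orig = k->src_port;

    assert(low <= high);
    if (!key_in_use(t, k))
        return true;
    for (uint32_t p = low; p <= high; p++) { /* 32-bit counter: no wrap */
        k->src_port = (uint16_t)p;
        if (!key_in_use(t, k))
            return true;
    }
    k->src_port = orig;
    return false;
}
```

Restoring the original port on failure leaves the decision about the conflict to whoever inserts the state next.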
Reviewed by: kp, allanjude
MFC after: 3 months
Sponsored by: Klara, Inc.
Sponsored by: Modirum
Differential Revision: https://reviews.freebsd.org/D44488
(cherry picked from commit 9897a66923a3e79c22fcbd4bc80afae9eb9f277c)
pf: Make pf_get_translation() more expressive
Currently pf_get_translation() returns a pointer to a matching
nat/rdr/binat rule, or NULL if no rule was matched or an error occurred
while applying the translation. That is, we don't distinguish between
errors and the lack of a matching rule. Thus, if an error (e.g., a
memory allocation failure or a state conflict) occurs, we simply handle
the packet as if no translation rule was present. This is not
desirable.
Make pf_get_translation() return the matching rule as an out-param and
instead return a reason code which indicates whether there was no
translation rule, or there was a translation rule and we failed to apply
it, or there was a translation rule and we applied it successfully.
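The shape of such an interface can be sketched in C as follows. The enum values, struct nat_rule, and both functions are hypothetical stand-ins rather than pf's actual identifiers. Reporting the matched rule even on failure, and dropping the packet in that case, are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reason codes; the names used in pf itself may differ. */
enum xlate_result {
    XLATE_NONE, /* no translation rule matched */
    XLATE_FAIL, /* a rule matched, but applying it failed */
    XLATE_OK    /* a rule matched and was applied */
};

/* Toy rule: matches one destination port; `usable` models success. */
struct nat_rule {
    int match_port;
    int usable;
};

/* Return a reason code and hand the matching rule back via `rulep`. */
static enum xlate_result
get_translation(struct nat_rule *rules, size_t nrules, int port,
    struct nat_rule **rulep)
{
    assert(rulep != NULL);
    *rulep = NULL;
    for (size_t i = 0; i < nrules; i++) {
        if (rules[i].match_port != port)
            continue;
        *rulep = &rules[i]; /* reported even when applying it fails */
        return rules[i].usable ? XLATE_OK : XLATE_FAIL;
    }
    return XLATE_NONE;
}

enum verdict { PASS_UNTRANSLATED, PASS_TRANSLATED, DROP };

/* The caller can now tell "no rule" apart from "rule failed". */
static enum verdict
handle_packet(struct nat_rule *rules, size_t nrules, int port)
{
    struct nat_rule *rule;

    switch (get_translation(rules, nrules, port, &rule)) {
    case XLATE_OK:
        return PASS_TRANSLATED;
    case XLATE_FAIL:
        return DROP; /* assumption: failures are no longer ignored */
    case XLATE_NONE:
    default:
        return PASS_UNTRANSLATED;
    }
}
```

With a pointer-or-NULL return value, the XLATE_FAIL and XLATE_NONE cases would be indistinguishable to the caller.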
Reviewed by: kp, allanjude
MFC after: 3 months
Sponsored by: Klara, Inc.
Sponsored by: Modirum
[3 lines not shown]
pf: Let pf_state_insert() handle redirect state conflicts
When handling a redirect state conflict, pf_get_translation() tries
modifying the source port to avoid it. If it fails to find a free port,
the translation is aborted.
Instead, if we fail to find a free source port, simply press on with the
original source port and let pf_state_insert() handle the conflict as it
pleases, rather than second-guessing what it will do. In particular,
pf_state_insert() has special handling for TCP connections in a terminal
state, and might succeed despite a state conflict.
Reviewed by: kp
MFC after: 3 months
Sponsored by: Klara, Inc.
Sponsored by: Modirum
Differential Revision: https://reviews.freebsd.org/D46612
(cherry picked from commit 9569fddd8d0e48211e67fdc63dd72eba83883525)
gve: Fix TX livelock
Before this change, the transmit taskqueue would enqueue itself when it
could not find space on the NIC ring, in the hope that space would
eventually be made. This results in the following livelock, which only
occurs after passing ~200Gbps of TCP traffic for many hours:
                        100% CPU
┌───────────┐wait on  ┌──────────┐         ┌───────────┐
│user thread│  cpu    │gve xmit  │wait on  │gve cleanup│
│with mbuf  ├────────►│taskqueue ├────────►│taskqueue  │
│uma lock   │         │          │ NIC ring│           │
└───────────┘         └──────────┘  space  └─────┬─────┘
     ▲                                           │
     │      wait on mbuf uma lock                │
     └───────────────────────────────────────────┘
Further details about the livelock are available on
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281560.
[15 lines not shown]
gve: Add DQO QPL support
DQO is the descriptor format for our next generation virtual NIC.
It is necessary to make full use of the hardware bandwidth on many
newer GCP VM shapes.
This patch extends the previously introduced DQO descriptor format
with a "QPL" mode. QPL stands for Queue Page List and refers to
the fact that the hardware cannot access arbitrary regions of the
host memory and instead expects a fixed bounce buffer comprising
a list of pages.
The QPL aspects are similar to the already existing GQI queue
format, in that the mbufs being input in the Rx path have
external storage in the form of vm pages attached to them, and
in the Tx path we always copy the mbuf payload into QPL pages.
Signed-off-by: Shailend Chand <shailend at google.com>
Reviewed-by: markj
[4 lines not shown]
gve: Add DQO RDA support
DQO is the descriptor format for our next generation virtual NIC.
It is necessary to make full use of the hardware bandwidth on many
newer GCP VM shapes.
One major change with DQO from its predecessor GQI is that it uses
dual descriptor rings for both TX and RX queues.
The TX path uses a descriptor ring to send descriptors to HW, and
receives packet completion events on a TX completion ring.
The RX path posts buffers to HW using an RX descriptor ring and
receives incoming packets on an RX completion ring.
In GQI-QPL, the hardware could not access arbitrary regions of
guest memory, which is why there was a pre-negotiated bounce buffer
(QPL: Queue Page List). DQO-RDA has no such limitation.
[14 lines not shown]
linux sendfile: Fix handling of non-blocking sockets
FreeBSD sendfile() may perform a partial transfer and return EAGAIN if
the socket is non-blocking. Linux sendfile() expects no error in this
case, so squash EAGAIN.
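The error mapping can be sketched in C as follows. The helper name squash_partial_eagain is hypothetical and this is not the linuxulator code. Squashing EAGAIN only when at least one byte was transferred is an assumption drawn from the "partial transfer" wording above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: map the native sendfile() outcome to the error a
 * Linux caller should see. */
static int squash_partial_eagain(int error, long long bytes_sent)
{
    assert(bytes_sent >= 0);
    /* A non-blocking socket accepted part of the data: report success so
     * the caller sees a short transfer count rather than an error. */
    if (error == EAGAIN && bytes_sent > 0)
        return 0;
    return error;
}
```

A caller would then return bytes_sent as the result whenever the mapped error is 0.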
PR: 282495
Tested by: pieter at krikkit.xyz
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D47447
(cherry picked from commit a43b745aaf4f5bbc96875d2ab3ec9bea8024eda4)
ZAP: Reduce leaf array and free chunks fragmentation
The previous implementation of zap_leaf_array_free() put chunks on the
free list in reverse order. Also, zap_leaf_transfer_entry() and
zap_entry_remove() were freeing name and value arrays in reverse
order. Together this created a mess in the free list, making
subsequent allocations much more fragmented than necessary.
This patch re-implements zap_leaf_array_free() to keep the existing
chunk order, and implements a non-destructive zap_leaf_array_copy()
to be used in zap_leaf_transfer_entry(), allowing properly ordered
freeing of name and value arrays there and in zap_entry_remove().
With this change, a test of some writes and deletes shows the percentage
of non-contiguous chunks in the DDT dropping from 61% and 47% to 0% and
17% for arrays and frees respectively. Explicit sorting could do even
better, especially for ZAPs with variable-size arrays, but it would
also cost much more, while this should be very cheap.
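The effect of free order on the free list can be sketched with a toy model in C. The struct toy_leaf layout and all function names are hypothetical and far simpler than real ZAP leaves. The sketch only contrasts pushing chunks one at a time, which reverses them, with splicing the whole chain onto the list, which keeps their order.

```c
#include <assert.h>
#include <stddef.h>

#define NCHUNKS 8
#define CHAIN_END (-1)

/* Toy leaf: next[i] links chunk i to the next chunk of its array, or to
 * the next free chunk once it is on the free list. */
struct toy_leaf {
    int next[NCHUNKS];
    int free_head;
};

/* Push chunks one at a time: the array ends up reversed on the list. */
static void free_array_reversing(struct toy_leaf *l, int first)
{
    assert(first >= CHAIN_END && first < NCHUNKS);
    while (first != CHAIN_END) {
        int following = l->next[first];
        l->next[first] = l->free_head;
        l->free_head = first;
        first = following;
    }
}

/* Splice the whole chain in front of the list: chunk order is kept. */
static void free_array_in_order(struct toy_leaf *l, int first)
{
    int last;

    assert(first >= CHAIN_END && first < NCHUNKS);
    if (first == CHAIN_END)
        return;
    for (last = first; l->next[last] != CHAIN_END; last = l->next[last])
        ;
    l->next[last] = l->free_head;
    l->free_head = first;
}

/* Count neighbouring free-list entries that are not consecutive chunks. */
static int free_list_breaks(const struct toy_leaf *l)
{
    int breaks = 0;

    for (int c = l->free_head; c != CHAIN_END && l->next[c] != CHAIN_END;
        c = l->next[c])
        if (l->next[c] != c + 1)
            breaks++;
    return breaks;
}
```

For a four-chunk array 0, 1, 2, 3 freed onto an empty list, the first variant leaves the list as 3, 2, 1, 0 and the second as 0, 1, 2, 3, so a later allocation of the same length gets contiguous chunks only in the second case.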
[3 lines not shown]
rtsw: Break out as soon as we find we're doing the inversion workaround
Once we set that we're doing the inversion workaround, there's no sense
continuing to search for the inversion workaround.
Sponsored by: Netflix
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D47686
cdefs: Document what we do when _XOPEN_SOURCE is an empty string
X/Open originally had _XOPEN_SOURCE defined to signify conformance with
the Single Unix Specification, starting with its third iteration. There
it defined _XOPEN_SOURCE being defined as the same thing as
_POSIX_C_SOURCE=2, though the different versions of the spec had slight
variances as to what's defined and whether or not _XOPEN_SOURCE_EXTENSION
needed to be defined. Document that we don't do anything in this case.
It turns out that enabling the proper strict environment breaks at least
some old software, so for the moment it's a nop until that can be sorted
out (though that is a very low priority task).
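How an empty definition behaves in preprocessor arithmetic can be sketched in C. DEMO_XOPEN_SOURCE is a hypothetical stand-in for _XOPEN_SOURCE, used so the example does not alter what system headers expose, and the snippet is not taken from cdefs.h. The `- 0` idiom makes an empty macro evaluate to 0, so a comparison such as `>= 500` is simply false and nothing is enabled. The sketch assumes a C11 compiler for static_assert.

```c
#include <assert.h>

/* Hypothetical stand-in for _XOPEN_SOURCE: defined, but with no value,
 * as in "cc -DDEMO_XOPEN_SOURce=" style usage on a command line. */
#define DEMO_XOPEN_SOURCE

#if !defined(DEMO_XOPEN_SOURCE)
#define DEMO_XOPEN_LEVEL (-1)                    /* not requested at all */
#else
#define DEMO_XOPEN_LEVEL (DEMO_XOPEN_SOURCE - 0) /* empty expands to (- 0) */
#endif

#if (DEMO_XOPEN_SOURCE - 0) >= 500
#error "not reached: an empty definition compares as 0"
#endif

/* An empty definition is indistinguishable from an explicit 0 here. */
static_assert(DEMO_XOPEN_LEVEL == 0, "empty macro evaluates to 0");

static int demo_xopen_level(void)
{
    return DEMO_XOPEN_LEVEL;
}
```

Defining the macro as 700 instead would make DEMO_XOPEN_LEVEL evaluate to 700, which is how versioned requests are normally told apart.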
Sponsored by: Netflix