[GlobalOpt] Add TTI interface useFastCCForInternalCall for FASTCC (#164768)
Background: The X86 APX feature adds 16 registers within the same 64-bit
mode. PR #164638 tries to extend FASTCC to use these registers. However,
a blocking issue is that a calling convention cannot change depending on
whether a feature is enabled.
The solution is to disable FASTCC if APX is not ready. This is an NFC
change to the final code generation, because X86 doesn't define an
alternative ABI for FASTCC in 64-bit mode. We can solve the potential
compatibility issue of #164638 with this patch.
aes(9): Rewrite x86 SSE2 implementation.
This computes eight AES_k instances simultaneously, using the
bitsliced 32-bit aes_ct logic which computes two blocks at a time in
uint32_t arithmetic, vectorized four ways.
Previously, the SSE2 code was a very naive adaptation of aes_ct64,
which computes four blocks at a time in uint64_t arithmetic, without
any 2x vectorization -- I did it at the time because:
(a) it was easier to get working,
(b) it only affects really old hardware with neither AES-NI nor SSSE3
which are both much much faster.
But it was bugging me that this was a kind of dumb use of SSE2.
Substantially reduces stack usage (from ~1200 bytes to ~800 bytes)
and should approximately double throughput for CBC decryption and for
XTS encryption/decryption.
[10 lines not shown]
aes(9): New 64-bit bitsliced implementation.
Derived from BearSSL's aes_ct64 code. Compared to the aes_ct code,
on machines with native 64-bit integer arithmetic, aes_ct64 should
have approximately:
- the same throughput for:
. CBC encryption,
. CCM encryption/decryption, and
. CBC-MAC;
- double the throughput for:
. CBC decryption,
. XTS encryption/decryption.
(aes_ct computes AES on two blocks at a time; aes_ct64 computes it on
four blocks at a time, with roughly the same number of instructions.
CBC encryption and CBC-MAC are inherently sequential; CCM, being a
combination of CTR and CBC-MAC, can only really be parallelized two
[12 lines not shown]
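The difference between inherently sequential CBC encryption and batchable CBC decryption can be sketched as follows. This is a minimal illustration, not the aes_ct64 code: toy_encrypt and toy_decrypt are a hypothetical invertible 64-bit toy cipher standing in for AES, and the batch width of four only mirrors the four-blocks-at-a-time figure above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy block cipher on 64-bit blocks (NOT AES, NOT secure).
// It exists only so that the CBC data dependencies can be shown.
static std::uint64_t rotl(std::uint64_t x, int r) { return (x << r) | (x >> (64 - r)); }
static std::uint64_t rotr(std::uint64_t x, int r) { return (x >> r) | (x << (64 - r)); }

std::uint64_t toy_encrypt(std::uint64_t k, std::uint64_t x) { return rotl(x ^ k, 13) + k; }
std::uint64_t toy_decrypt(std::uint64_t k, std::uint64_t y) { return rotr(y - k, 13) ^ k; }

// CBC encryption: block i needs ciphertext block i-1 before it can start,
// so the cipher calls form a chain and cannot be batched.
std::vector<std::uint64_t> cbc_encrypt(std::uint64_t k, std::uint64_t iv,
                                       const std::vector<std::uint64_t>& pt) {
  std::vector<std::uint64_t> ct(pt.size());
  std::uint64_t prev = iv;
  for (std::size_t i = 0; i < pt.size(); i++) {
    prev = toy_encrypt(k, pt[i] ^ prev);
    ct[i] = prev;
  }
  return ct;
}

// CBC decryption: every cipher call takes only a ciphertext block as input,
// so up to kBatch calls are independent and could run side by side in a
// bitsliced or vectorized implementation. The chaining XOR happens afterwards.
std::vector<std::uint64_t> cbc_decrypt_batched(std::uint64_t k, std::uint64_t iv,
                                               const std::vector<std::uint64_t>& ct) {
  constexpr std::size_t kBatch = 4;
  std::vector<std::uint64_t> pt(ct.size());
  for (std::size_t i = 0; i < ct.size(); i += kBatch) {
    const std::size_t n = std::min(kBatch, ct.size() - i);
    std::uint64_t tmp[kBatch];
    for (std::size_t j = 0; j < n; j++)
      tmp[j] = toy_decrypt(k, ct[i + j]);  // no dependency between iterations
    for (std::size_t j = 0; j < n; j++) {
      const std::uint64_t prev = (i + j == 0) ? iv : ct[i + j - 1];
      pt[i + j] = tmp[j] ^ prev;
    }
  }
  return pt;
}
```

Decrypting the output of cbc_encrypt with cbc_decrypt_batched under the same key and IV returns the original blocks. Only the decryption loop has independent cipher calls to batch.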
Revert "[ASTMatchers] Make isExpandedFromMacro accept llvm::StringRef… (#167060)" (#169238)
This reverts commit a52e1af7f766e26a78d10d31da98af041dd66410.
That commit reverted a change (making isExpandedFromMacro take a
std::string) that was explicitly added to avoid lifetime issues. We ran
into issues with some internal matchers due to this, and it probably is
not an uncommon downstream use case. This patch restores the original
functionality and adds a test to ensure that the functionality is
preserved.
https://reviews.llvm.org/D90303 contains more discussion.
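The kind of lifetime issue an owning std::string parameter avoids can be sketched as follows. This is an illustration only, not the actual matcher code or the downstream failure: NameMatcher, make_owning_matcher and matcher_for_prefixed are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a matcher that remembers a macro name.
using NameMatcher = std::function<bool(const std::string&)>;

// Taking the name by value and moving it into the closure gives the matcher
// its own copy. If the closure stored a non-owning view instead (such as
// llvm::StringRef or std::string_view), it would only be valid for as long
// as the caller's string stays alive.
NameMatcher make_owning_matcher(std::string macro_name) {
  return [name = std::move(macro_name)](const std::string& candidate) {
    return candidate == name;
  };
}

// A typical caller: the name is built as a temporary that is destroyed at
// the end of the full expression. This is safe here only because the
// matcher owns its copy of the string.
NameMatcher matcher_for_prefixed(const std::string& prefix) {
  return make_owning_matcher(prefix + "_ASSERT");
}
```

Here matcher_for_prefixed("MY") returns a matcher that accepts "MY_ASSERT" even though the concatenated temporary is long gone by the time the matcher runs.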
libpthread: Link with -Wl,-z,nodelete.
Can't safely unload libpthread because of the interaction with libc
thread stubs.
PR lib/59784: dlopening and dlclosing libpthread is broken
net/unison240: Fetch from github, deprecate
The distfiles used by the port are not available anymore, so take the one from GitHub. Adapt the port to build with the slightly different layout of this distfile. [1]
The new distfile does not provide HTML, PS and PDF documentation, so remove those files from the port.
I'm also deprecating this port; it is an ancient version not really supported anymore by upstream. Set a long expiration time.
While here:
- Refresh Makefiles for other legacy unison ports
- Remove CONFLICTS with the no longer existing -devel port
PR: 291166 [1]
MFH: 2025Q4
(cherry picked from commit ba72838fff3e7fa001d247aa5409e889a7c864c3)