[X86] Simplify duplicate MMO offset tracking in breakBlockedCopies (NFC) (#202904)
LMMOffset and SMMOffset in breakBlockedCopies/buildCopies/buildCopy were
both initialized to 0 and advanced in lockstep by identical amounts, so
they were always equal. Collapse them into a single Offset used for both
the load and store MachineMemOperands.
This also removes a latent typo: the final buildCopies call passed
LMMOffset for the store offset argument instead of SMMOffset. Since the
two were always equal this was harmless, and the unified Offset makes
the divergence unrepresentable.
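
As an illustration of why the two counters are redundant, here is a minimal
Python model, not the LLVM code. The function names and the chunk-size list
are hypothetical; they only mimic "two offsets initialized to 0 and advanced
by identical amounts".

```python
def split_copy_two_counters(chunk_sizes):
    """Model of the old shape: separate load and store offsets (hypothetical)."""
    load_off = store_off = 0
    pairs = []
    for size in chunk_sizes:
        pairs.append((load_off, store_off))
        # Both counters advance by the same amount, so they never diverge.
        load_off += size
        store_off += size
    return pairs


def split_copy_one_counter(chunk_sizes):
    """Model of the new shape: one offset used for both memory operands."""
    offset = 0
    pairs = []
    for size in chunk_sizes:
        pairs.append((offset, offset))
        offset += size
    return pairs
```

With a single counter there is no second variable to pass by mistake, which
is the sense in which the divergence becomes unrepresentable.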
Found via @jlebar's X86 LLVM bug hunt / FuzzX effort:
https://github.com/SemiAnalysisAI/FuzzX/blob/master/x86/bugs/042-sfb-buildcopies-wrong-mmo-offset/NOTES.md
cc @jlebar
[lldb][test] Speed up ProcessAttach test (#201530)
ProcessAttach is our slowest test and runs for about 70s. We spend 60s
in the autocontinue test waiting for the target program to terminate.
We wait because the autocontinue test does not run its command in async
mode, so after the attach we block until the next breakpoint is hit or
the program terminates.
This patch makes the attach and autocontinue run in async mode so we
don't wait for the program to finish. This reduces the test time from
70s to about 10s.
It also replaces an assertTrue call that was supposed to be an
assertEqual, which made the test succeed even though the inferior
process had already terminated.
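
A minimal sketch of the assertTrue/assertEqual pitfall using plain unittest
rather than the lldb test suite. The state strings and the class and function
names are hypothetical stand-ins, not values from the actual test.

```python
import unittest


class AssertPitfall(unittest.TestCase):
    def test_assert_true_ignores_second_argument(self):
        state, expected = "exited", "stopped"  # hypothetical values
        # assertTrue(expr, msg): the second argument is only the failure
        # message, so this passes because a non-empty string is truthy.
        self.assertTrue(state, expected)

    def test_assert_equal_compares(self):
        state, expected = "exited", "stopped"  # hypothetical values
        # assertEqual compares its two arguments and fails on a mismatch.
        with self.assertRaises(AssertionError):
            self.assertEqual(state, expected)


def run_pitfall_suite():
    """Run both cases and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertPitfall)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Both cases pass, which shows that the assertTrue form never checks the
expected value at all.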
[AArch64][GlobalISel] Select narrow G_INSERT_VECTOR_ELT GPR operands (#203568)
RegBankSelect currently extends narrow i8/i16 G_INSERT_VECTOR_ELT GPR
operands to 32 bits. Move this widening to pre-isel lowering. This will
help enable a simple, fast, purely type-based RBS alternative.
Assisted-by: codex
databases/cassandra[45]: add run_depends on java
These ports need to have a specific JDK installed to run.
In cassandra3 this was already correctly configured.
NB: Cassandra 4 has been out of beta for a long time.
PR: 296095
Approved-by: Angelo Polo <language.devel@>
[libomp] Add kmp_vector (ADT 2/2) (#176163)
See rationale in the commit adding kmp_str_ref.
This commit introduces kmp_vector, a class intended primarily for small
vectors. It currently only includes methods I need at the moment, but
it's easily extensible.
AMDGPU: Remove xnack-any-only subtarget feature and handling
This reverts commit f4caa0a172d96597c375e6b6b2192c289723a6b9.
This feature was added to gfx12-5-generic only, which does not make
sense given that both gfx1250 and gfx1251 have the same unconditional
xnack handling. It also does not make sense to diagnose trying to use
a specific xnack mode on the generic target only, and only from the
backend.
The current feature management is a confusing mess, given that we have
2 parallel feature systems. AMDGPUTargetParser has a table containing
a bitmask of features, which already contained FEATURE_XNACK_ALWAYS
for gfx1250/gfx1251, but not gfx12-5-generic. Add this handling there
so the sanitizer detection is consistent on the generic target.
These 2 feature tables probably should be unified in some way. We also
probably should have a subtarget feature for the xnack handling, but it
should be inverted. xnack-any-only is an antifeature, in that it removes
[2 lines not shown]
[lld-macho] Ignore labels on sections ld64 treats as ignoreLabel (#194275)
In ld64, labels on records in some sections never become named atoms and
never enter the symbol table:
- Unconditionally: __cfstring, __objc_classrefs, and __objc_selrefs
- Prefix-gated on `L`/`l`: __literal{4,8,16} and __cstring-family
sections such as __objc_methname
LLD, however, ran every such label through `SymbolTable::addDefined`,
which diverged from ld64 whenever an identically-named symbol appeared
in another section. This patch mirrors ld64's behavior in LLD. The
Defined is still created for the affected labels, but it bypasses the
symbol table entirely and cannot collide with any cross-TU symbol.
I have encountered a few link failures caused by this, and reduced them
into the regression tests in the patch.
[RISCV] Fix the AST type printing code for VectorKind::RVVFixedLengthMask_1/2/4 (#204498)
These types have fixed sizes of 1, 2, and 4, respectively. The formula
used for the other types does not apply.
Assisted-by: Claude
zsh: Matthew Martin takes maintainer, ok sthen
It's been many years since the former maintainer gave an ok for this port
or even touched it. Matthew has been maintaining this shell for well over
a decade, so this change simply reflects reality.
witness: add tunables debug.witness.lock_order_{data_count,hash_size}
Add tunable debug.witness.lock_order_data_count to allow adjusting the
number of witness lock order data entries (stacks) without recompiling
the kernel. This may help to display stacks when a lock order reversal
is reported but the number of entries is exhausted before recording the
first lock order, by allowing the user to reboot with an adjusted
tunable and try again.
Tunable debug.witness.lock_order_hash_size is also provided to allow the
hash table load factor to be managed, though that is not required.
Also tweak witness_lock_order_add to avoid computing a hash when it
won't be needed because the lock order data entries are exhausted.
Reviewed by: kib, markj
Sponsored by: Dell Inc.
Differential Revision: https://reviews.freebsd.org/D57600
witness: actually set read-only tunables in time for witness_startup
SYSCTL_XXX with CTLFLAG_RDTUN and without CTLFLAG_NOFETCH should not be
used for values that are needed before SI_SUB_KLD. Otherwise they are
tuned after they are needed. Set CTLFLAG_RDTUN | CTLFLAG_NOFETCH for
the debug.witness.witness_count and debug.witness.skipspin sysctls and
add separate tunables for them, which run at SI_SUB_TUNABLES time, i.e.,
in time for witness_startup.
Reviewed by: kib, markj
Sponsored by: Dell Inc.
Differential Revision: https://reviews.freebsd.org/D57613
[clang-cl][test] Use /Zs to avoid writing unnecessary output files (#204501)
#194779 adds a test clang/test/Preprocessor/init-datetime-macros.c which
verifies some diagnostics. However, it does so with `/c`, which
unnecessarily generates an output file. On a build system that does not
run tests in a writable directory by default, this causes the test to
fail.
Since we don't care about the resulting object file, use `/Zs`
(equivalent of `-fsyntax-only`) to check the diagnostics but not produce
any output files.