[libc][docs][NFC] Document cross-compilation testing with QEMU (#188838)
Added a "Building and Testing with an Emulator" section to
full_cross_build.rst using riscv64 and qemu-riscv64 as the example.
Outlined necessary CMake flags for cross-compiling with Clang, including
CMAKE_C_COMPILER_TARGET, CMAKE_CXX_COMPILER_TARGET, and
LLVM_ENABLE_LLD=ON. Switched from CMAKE_SYSROOT to LIBC_KERNEL_HEADERS
and added the gcc-riscv64-linux-gnu package dependency to avoid sysroot
issues on Debian-based systems while retaining access to
cross-compiler runtime objects.
Explained the self-hosted libc-hermetic-tests target as the required
target for executing tests during a standalone cross build, since the
standard check-libc tests are not hermetic.
Refactored existing CMake examples in full_cross_build.rst to use -S and
-B flags instead of cd and mkdir.
Removed prompt characters from code blocks and separated host
[5 lines not shown]
[flang][OpenMP] Restrict isSafeToParallelize to write-only thread-local effects (#188595)
This is a follow-up fix for commit 0f5e9bee.
Only write effects to thread-local memory should be considered safe to
parallelize in workshare lowering, not reads. When both reads and writes
were safe, the cascading effect in moveToSingle could cause entire
SingleRegions to become fully parallelized, eliminating the omp.single
and its implicit barrier. This removed synchronization points needed to
keep threads coordinated inside sequential loops containing workshared
operations, causing race conditions in forall-workshare patterns.
This was exposed by the Fujitsu Test Suite and made the following tests
regress:
FAIL: test-suite :: Fujitsu/Fortran/0398/Fujitsu-Fortran-0398_0031.test
FAIL: test-suite :: Fujitsu/Fortran/0398/Fujitsu-Fortran-0398_0013.test
FAIL: test-suite :: Fujitsu/Fortran/0398/Fujitsu-Fortran-0398_0030.test
FAIL: test-suite :: Fujitsu/Fortran/0398/Fujitsu-Fortran-0398_0014.test
Updates #143330
[MLIR][AMDGPU] Added l2-prefetch op to AMDGPU (#188457)
This PR adds `global_prefetch` op to prefetch a cache line to high-level
caches using the aligned address of the source `memref` and an offset
provided by the indices of the element containing the cache line. This
provides temporal hints (e.g., regular or high-priority). Note that
out-of-bounds access is allowed in speculative mode. Ensure the source
`memref` is in address space `1`.
---------
Co-authored-by: Krzysztof Drewniak <Krzysztof.Drewniak at amd.com>
[MLIR][Arith] Fix int-range-optimizations miscompile from stale solver state (#188992)
The `--int-range-optimizations` pass runs the `DataFlowSolver` once,
then calls `applyPatternsGreedily` with a `DataFlowListener` that erases
solver state when ops are deleted. However, the greedy driver's
`simplifyRegions` step (which calls `runRegionDCE` between pattern
iterations) can remove block arguments without notifying the listener.
This frees the `BlockArgumentImpl` storage, which may be reused by a
subsequent allocation. The solver then finds stale lattice state keyed
at the reused address and incorrectly treats the new block argument as a
known constant, causing a miscompile.
The existing `enableFolding(false)` was added for the same class of bug
(folding can also remove block arguments). This patch extends the fix by
also disabling region simplification, preventing dead-arg elimination
from causing the same address-reuse problem.
Fixes #137281
Fixes #126195
Assisted-by: Claude Code
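The address-reuse failure described above can be reproduced in a toy model. This is only a sketch: `Arena`, `Solver`, and `run` are hypothetical names invented for illustration, not MLIR APIs, and the allocator is assumed to hand a freed address straight back to the next allocation.

```python
class Arena:
    """Toy allocator: hands out integer 'addresses' and reuses freed ones."""

    def __init__(self):
        self._next = 0
        self._free = []

    def alloc(self):
        if self._free:
            return self._free.pop()  # reuse the most recently freed address
        addr = self._next
        self._next += 1
        return addr

    def free(self, addr):
        self._free.append(addr)


class Solver:
    """Toy analysis cache keyed by address (stand-in for lattice state)."""

    def __init__(self):
        self.state = {}

    def set_constant(self, addr, value):
        self.state[addr] = value

    def lookup(self, addr):
        return self.state.get(addr)  # None means "nothing known"

    def erase(self, addr):
        self.state.pop(addr, None)


def run(notify_listener):
    """Delete a value, allocate a new one, and ask the solver about it."""
    arena, solver = Arena(), Solver()
    old_arg = arena.alloc()
    solver.set_constant(old_arg, 42)  # analysis proved old_arg == 42

    if notify_listener:
        solver.erase(old_arg)         # a notified listener drops the state
    arena.free(old_arg)               # the cleanup step frees the storage

    new_arg = arena.alloc()           # unrelated value, same address
    assert new_arg == old_arg
    return solver.lookup(new_arg)
```

In this model `run(False)` returns the stale constant for the new value, while `run(True)` returns `None`. The patch avoids the unnotified deletion entirely by disabling region simplification rather than adding a notification.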
[MLIR][SCF] Fix ForLoopRangeFolding miscompile with non-positive MulIOp multiplier (#188995)
The scf-for-loop-range-folding pass transforms loops of the form
for (i = lb; i < ub; i += step) { use(i * c) }
into
for (j = lb*c; j < ub*c; j += step*c) { use(j) }
This transformation is only valid when c is strictly positive, since
scf.for requires a positive step. When c is zero or negative, the new
step becomes zero or effectively negative (wrapping in unsigned
arithmetic for index type), producing an incorrect loop.
Add a guard that restricts the MulIOp folding to cases where the
loop-invariant multiplier is a statically known positive integer
constant. Non-constant loop-invariant multipliers are also excluded
since their sign cannot be determined at compile time.
[4 lines not shown]
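A small simulation of the two loop forms shows why the multiplier has to be positive. It is a simplified sketch: `original` and `folded` are hypothetical helpers, and Python's signed, unbounded integers stand in for the index type, so the unsigned wrap-around mentioned above is not modelled.

```python
def original(lb, ub, step, c):
    """Values seen by use() in: for (i = lb; i < ub; i += step) use(i * c)."""
    if step <= 0:
        raise ValueError("scf.for requires a positive step")
    out = []
    i = lb
    while i < ub:
        out.append(i * c)
        i += step
    return out


def folded(lb, ub, step, c):
    """Values seen by use() in: for (j = lb*c; j < ub*c; j += step*c) use(j)."""
    new_lb, new_ub, new_step = lb * c, ub * c, step * c
    if new_step <= 0 and new_lb < new_ub:
        # In this signed model the folded loop would never terminate.
        raise ValueError("folded loop would not terminate")
    out = []
    j = new_lb
    while j < new_ub:
        out.append(j)
        j += new_step
    return out


print(original(1, 10, 3, 2), folded(1, 10, 3, 2))    # same values
print(original(0, 4, 1, -1), folded(0, 4, 1, -1))    # folded loop is empty
print(original(0, 4, 1, 0), folded(0, 4, 1, 0))      # folded loop is empty
```

With `c = 2` both forms produce the same values. With `c = -1` or `c = 0` the folded bounds no longer describe a non-empty range, so the folded loop runs zero times while the original runs four.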
mail/claws-mail*: Update 4.3.1 => 4.4.0, remove GTK2 option that used version 3.21.0
Release Notes:
https://lists.claws-mail.org/pipermail/users/2026-March/034710.html
- Remove the GTK2 option: GTK2 is supported only in 3.x, which has reached EoL.
- Make cosmetic improvements and cleanups.
PR: 293704
Approved by: Chris Hutchinson <portmaster at bsdforge.com> (maintainer)
Co-authored-by: Polarian <polarian at polarian.dev>
Co-authored-by: Chris Hutchinson <portmaster at bsdforge.com>
DEVICE_IDENTIFY.9: Fix function call to detect driver in example code
Fixes: ccabc7c2e556 ("DEVICE_IDENTIFY.9: Modernize description and use cases")
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Replace pysnmp with truenas_pysnmp C extension for SNMP traps
Rewrites snmp_trap.py to use the truenas_pysnmp C extension and replaces
the python3-pysnmp4 dependency with python3-truenas-pysnmp.
[LangRef] Specify that syncscopes can affect the monotonic modification order
If a target specifies that atomics with mismatching syncscopes appear
non-atomic to each other, there is no point in requiring them to be ordered in
the monotonic modification order. Notably, the [AMDGPU target user
guide](https://llvm.org/docs/AMDGPUUsage.html#memory-scopes) has specified
syncscopes to relax the modification order for years.
So far, I haven't found an example where this less constrained ordering would
be observable (at least with the AMDGPU inclusive scope rules). Whenever a load
would be able to see two monotonic stores with non-inclusive scope, that's
considered a data race (i.e., the load would return `undef`), so it cannot be
used to observe the order of the stores.
[AMDGPUUsage] Specify what one-as syncscopes do
This matches the currently implemented and (as far as I could determine)
intended semantics of these syncscopes.
The sync scope table is unchanged except for removing its indentation;
otherwise it would be rendered as part of the preceding note.
[LangRef][AMDGPU] Specify that syncscope can cause atomic operations to race
Targets should be able to specify that the syncscope of atomic operations
influences whether they participate in data races with each other.
For example, in AMDGPU, we want (and already implement) the load in the
following case to be in a data race (i.e., return `undef` according to the
current definition), because there is an atomic store with workgroup syncscope
executing in a different workgroup:
```
; workgroup 0:
store atomic i32 1, ptr %p syncscope("workgroup") monotonic, align 4
; workgroup 1:
store atomic i32 2, ptr %p syncscope("workgroup") monotonic, align 4
load atomic i32, ptr %p syncscope("workgroup") monotonic, align 4
```
[3 lines not shown]
[LangRef] Allow monotonic & seq_cst accesses to inter-operate with other accesses
Currently, the LangRef says that atomic operations (which includes `unordered`
operations, which don't participate in the monotonic modification order) must
read a value from the modification order of monotonic operations.
In the following example, this means that the load does not have a store it
could read from, because all stores it may see do not participate in the
monotonic modification order:
```
; thread 0:
store atomic i32 1, ptr %p unordered, align 4
; thread 1:
store atomic i32 2, ptr %p unordered, align 4
load atomic i32, ptr %p unordered, align 4
```
[18 lines not shown]
[libc][math] Implement C23 half precision erfc function (#180930)
Add support for the half-precision complementary error function
`erfcf16`, using a Sollya-generated polynomial implementation with
proper handling of special cases.
Extend the MPFR utilities with erfc support to allow tests.
closes: #180927
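For a feel of the expected special-case behaviour, a rough reference model can be written with the Python standard library. This is not the libc implementation (which the commit describes as a Sollya-generated polynomial) and not a substitute for MPFR-based testing: `to_f16` and `erfcf16_ref` are hypothetical helper names, and rounding a double-precision result to half is not guaranteed to match a correctly rounded half result in every case.

```python
import math
import struct


def to_f16(x):
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def erfcf16_ref(x):
    """Evaluate erfc in double precision, then round the result to binary16."""
    return to_f16(math.erfc(to_f16(x)))


# Special cases: erfc(0) = 1, erfc(+inf) = 0, erfc(-inf) = 2, erfc(NaN) = NaN.
for value in (0.0, math.inf, -math.inf, math.nan):
    print(value, erfcf16_ref(value))
```

Results that are too small for half precision, such as `erfc(10.0)`, round to zero in this model.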
Revert "[MLIR] Fix ErasedOpsListener false positives for newly created ops/blocks" (#189010)
Reverts llvm/llvm-project#188956
Hit "merge" by accident on the wrong tab, juggling too many PRs in
parallel...
[SPIR-V] Support for C++ for OpenCL source language (#188791)
- Add CPP_for_OpenCL source language operand
- Handle opencl.cxx.version metadata
Align handling with SPIR-V translator logic and tests presented there
[CFG] Add shortcut if CycleInfo is available (#188928)
isPotentiallyReachable() currently returns "reachable" early if BB
dominates StopBB. If CycleInfo is available, and BB is not part of a
cycle, we can also perform the reverse inference: Return "not reachable"
if StopBB dominates BB.
This both allows aborting the walk earlier, and provides a more precise
result.
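The reverse inference can be checked on a small graph. The sketch below is an illustration rather than the LLVM code: `CFG`, `reachable_from`, `dominators`, and `shortcut` are hypothetical names, a plain "can the block reach itself" test stands in for CycleInfo, and every block is assumed to be reachable from the entry.

```python
CFG = {
    "entry": ["h"],
    "h": ["body", "exit"],   # loop header
    "body": ["h"],           # back edge
    "exit": ["tail"],
    "tail": [],
}


def reachable_from(succs, start):
    """Blocks reachable from `start` by following at least one edge."""
    seen, stack = set(), list(succs[start])
    while stack:
        block = stack.pop()
        if block not in seen:
            seen.add(block)
            stack.extend(succs[block])
    return seen


def dominators(succs, entry):
    """Map each block to the set of blocks that dominate it."""
    blocks = set(succs)
    preds = {b: set() for b in blocks}
    for b, targets in succs.items():
        for t in targets:
            preds[t].add(b)
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:  # iterate the dataflow equations to a fixed point
        changed = False
        for b in blocks - {entry}:
            new = set(blocks)
            for p in preds[b]:
                new &= dom[p]
            new.add(b)
            if new != dom[b]:
                dom[b], changed = new, True
    return dom


def shortcut(succs, entry, bb, stop_bb):
    """True/False when dominance alone decides the query, otherwise None."""
    dom = dominators(succs, entry)
    if bb != stop_bb and bb in dom[stop_bb]:
        return True                      # BB dominates StopBB
    in_cycle = bb in reachable_from(succs, bb)
    if bb != stop_bb and stop_bb in dom[bb] and not in_cycle:
        return False                     # StopBB dominates BB, BB not in a cycle
    return None                          # fall back to the full walk
```

Here `shortcut(CFG, "entry", "tail", "exit")` is `False`, while `shortcut(CFG, "entry", "body", "h")` is `None`: `h` dominates `body`, but `body` sits in a cycle and does reach `h`, which is why the cycle check is required.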
[Hexagon] Add coverage tests for CodeGen passes (#183951)
Add tests targeting specific Hexagon CodeGen passes with low coverage:
- peephole-sxtw-combine.mir: HexagonPeephole pass exercising SXTW
removal, combine generation, and LSR copy patterns. Improves
HexagonPeephole.cpp line coverage from 63.89% to 99.31%.
- vec-print-wq.ll: HexagonVectorPrint pass with V (single vector) and W
(double vector) register printing via 128b HVX. Improves
HexagonVectorPrint.cpp line coverage from 71.19% to 87.29%.
- tfr-cleanup-double-imm.mir: HexagonTfrCleanup pass exercising 64-bit
immediate rewrite paths. Improves HexagonTfrCleanup.cpp line coverage
from 80.85% to 88.30%.
- cfgopt-newpt-invert.ll: HexagonCFGOptimizer pass exercising branch
inversion with new-value predicate transfers.
[Hexagon] Add AP register to liveins when used for frame index access (#188942)
This is a follow-up to commit 3ef59d80c5ce ("[Hexagon] Fix
use-before-def of AP register in prologue CSR spills").
When the AP (alignment pointer) register is used as a base register for
frame index elimination, add it to the basic block's livein set. This
ensures liveness information is accurate for the machine verifier.
The original commit fixed the use-before-def issue by moving PS_aligna
after CSR spills. However, when the prologepilog pass is run in
isolation (as in MIR tests) with expensive checks enabled, the verifier
reports an error because AP is used in blocks where it's not marked as
live-in.
In the full compilation pipeline, the Hexagon Packetizer adds AP as an
implicit operand to instruction bundles, which satisfies the verifier.
However, when running only the prologepilog pass (before packetization),
AP remains an explicit operand and must be in the livein set.
This fix adds AP to liveins when AP is used as the base register,
ensuring correct liveness tracking regardless of whether packetization
has run.
after a report from 'K r' on bugs that the manual page's description of the
rfc868 '-o' option has incorrect dates, let's recognize that rfc868 is no longer
a good way to get time information and only the ntp interface is needed.
ok sthen florian henning
[libunwind][PAC] Defang ptrauth's PC in valid CFI range abort (#184041)
It turns out making the CFI check a release mode abort causes many, if
not the majority, of JITs to fail during unwinding as they do not set up
CFI sections for their generated code. As a result any JITs that do
nominally support unwinding (and catching) through their JIT or assembly
frames trip this abort.
rdar://170862047
[XRay] Always register constructor(0) alongside .preinit_array (#188788)
On musl-based systems the dynamic linker does not process
DT_PREINIT_ARRAY, so the .preinit_array entry alone never calls
__xray_init(). Without initialization, the global XRay Flags struct is
zero-initialized and flags()->xray_mode is NULL. When the basic-mode or
FDR-mode static initializers run from .init_array and call
internal_strcmp(flags()->xray_mode, ...), they dereference NULL and
crash.
Fix this by always registering a constructor(0) in addition to the
.preinit_array entry. On glibc where .preinit_array works, __xray_init()
will have already run and the constructor returns immediately (the
function is idempotent). On musl, the constructor ensures __xray_init()
runs before other .init_array entries that depend on XRay flags being
initialized.
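The startup ordering can be modelled in a few lines. This is a sketch only: `XRayModel` and `startup` are hypothetical names, the mode string is illustrative, and the real code is C++ inside the XRay runtime.

```python
class XRayModel:
    """Toy model of the runtime state touched during startup."""

    def __init__(self):
        self.xray_mode = None      # models the zero-initialized flags struct
        self.init_calls = 0        # how many times real init work ran
        self._initialized = False

    def xray_init(self):
        """Idempotent initializer: only the first call does any work."""
        if self._initialized:
            return
        self._initialized = True
        self.init_calls += 1
        self.xray_mode = "xray-basic"

    def mode_initializer(self):
        """Models an .init_array entry that compares the mode string."""
        if self.xray_mode is None:
            raise RuntimeError("NULL xray_mode dereferenced")
        return self.xray_mode == "xray-basic"


def startup(preinit_array_runs, register_constructor):
    """Run the startup sequence and return how often init work happened."""
    rt = XRayModel()
    if preinit_array_runs:         # loader that processes DT_PREINIT_ARRAY
        rt.xray_init()
    if register_constructor:       # constructor(0), ahead of other entries
        rt.xray_init()
    rt.mode_initializer()
    return rt.init_calls
```

With the constructor registered, `startup(True, True)` and `startup(False, True)` both perform the initialization exactly once. `startup(False, False)` models the crash on a loader that skips `.preinit_array`.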
[lldb] Add PlatformWebInspectorWasm (#188751)
Add a new PlatformWebInspectorWasm, which is a Wasm platform that
automatically connects to the WebInspector platform server.
The existing "wasm" platform handles WebAssembly generally and allows
you to configure a runtime to launch under. The "webinspector-wasm"
platform does the inverse, and only supports attaching to an already
running WebAssembly instance in Safari. The workflow here is always
`platform process list` followed by `platform process attach`. This
explains why you can only force create this platform and it's never
automatically selected when loading a Wasm target.