[AArch64] Address issue reported in PR#196029 (#199122)
For certain types of truncating stores, the lowering action is set to
custom although no custom lowering exists for them.
This patch addresses the issue reported in PR #196029 by removing the custom lowering entry.
Remove incorrect gcal link from C/C++ language wg (#201374)
The C and C++ language working group meets on the first and third Wed of
the month, but Google Calendar does not support doing this via a single
event. Instead, we have one event for recurring on the 1st Wed and a
second event for recurring on the 3rd Wed. That means we cannot use a
single gcal link for the event. Instead of listing two links, this
removes the gcal link entirely because the meeting is also listed on the
community calendar itself. This reduces confusion for folks, but it
would be nice to get a replacement link at some point.
[NVPTX] Respect FTZ flag when lowering atomicrmw fadd. (#200732)
Previously we unconditionally lowered LLVM atomicrmw fadd to PTX
atom.add. This is incorrect, because it ignores the FTZ behavior of the
LLVM and PTX instructions.
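To make the FTZ concern concrete, here is a minimal Python sketch of a single-precision add with and without flush-to-zero. It models only the arithmetic, not the atomic operation or the actual NVPTX lowering. The helper names `to_f32`, `flush`, and `fadd_f32` are illustrative and are not part of LLVM or PTX.

```python
import math
import struct

F32_MIN_NORMAL = 2.0 ** -126  # smallest positive normal float32 value


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def flush(x: float) -> float:
    """Replace a subnormal float32 value with a sign-preserving zero."""
    if x != 0.0 and abs(x) < F32_MIN_NORMAL:
        return math.copysign(0.0, x)
    return x


def fadd_f32(a: float, b: float, ftz: bool) -> float:
    """float32 add; with ftz=True, subnormal inputs and results are flushed."""
    a, b = to_f32(a), to_f32(b)
    if ftz:
        a, b = flush(a), flush(b)
    result = to_f32(a + b)
    return flush(result) if ftz else result


tiny = to_f32(1e-40)  # a subnormal float32 value
print(fadd_f32(tiny, tiny, ftz=False))  # nonzero subnormal sum
print(fadd_f32(tiny, tiny, ftz=True))   # 0.0
```

The two calls disagree only when subnormal values are involved, which is why a lowering that ignores the FTZ setting can silently change results.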
i386: Fix build (of 'genassym.o')
i386's genassym.c needs to define some assembly symbols holding the size
of NFS structures to support NFS_ROOT while booting with an nfs_diskless
structure. For this, it needs to include a few NFS headers, which
require definitions from <sys/mount.h> (fhandle_t, vfs_init_t). The
include of <sys/mount.h> was removed by commit 72ab129799a2 ("x86:
remove sys/mount.h from genassym.c").
<sys/mount.h> has recently started including <sys/vnode.h>, so it needs
"vnode_if.h" to have been generated for the compilation of 'genassym.o'
not to fail. Make sure this is the case (for all architectures, for
simplicity) by tweaking the rule for 'genassym.o' in
'sys/conf/kern.post.mk', leaving a comment there so that it can be
removed when i386 is dropped (or if the above-mentioned dependency is
broken).
Fixes: 72ab129799a2 ("x86: remove sys/mount.h from genassym.c")
Sponsored by: The FreeBSD Foundation
[LoopInterchange] Bail out if function that may diverge is called (#201348)
This patch fixes the issue pointed out in
https://github.com/llvm/llvm-project/pull/200828#issuecomment-4593914293.
As demonstrated by the test cases added in #201331, it is not legal to
interchange loops that contain call instructions which may diverge. This
patch adds an additional check and bails out early when we cannot prove
that a call instruction in the loops doesn't diverge.
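The legality problem can be illustrated with a small Python sketch. It is not LLVM code: a call that diverges is modelled by raising a hypothetical `Diverged` exception so that we can inspect which side effects happened before the program stopped making progress. The names `may_diverge` and `run` are assumptions made for this illustration.

```python
class Diverged(Exception):
    """Stands in for a call that never returns (e.g. an infinite loop)."""


def may_diverge(i: int, j: int) -> None:
    # Hypothetical callee: it diverges on one particular iteration.
    if (i, j) == (0, 1):
        raise Diverged


def run(interchanged: bool, n: int = 2, m: int = 2) -> list:
    """Return the stores that became visible before the call diverged."""
    stores = []
    # The original nest iterates i outer, j inner; interchange swaps them.
    if interchanged:
        order = [(i, j) for j in range(m) for i in range(n)]
    else:
        order = [(i, j) for i in range(n) for j in range(m)]
    try:
        for i, j in order:
            stores.append((i, j))  # observable side effect, e.g. a store
            may_diverge(i, j)
    except Diverged:
        pass
    return stores


print(run(interchanged=False))  # [(0, 0), (0, 1)]
print(run(interchanged=True))   # [(0, 0), (1, 0), (0, 1)]
```

After interchange, the store for iteration (1, 0) becomes visible even though the original program never reaches it, so the transformation changes observable behavior.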
genfs_do_io: fix a pagedaemon deadlock
this should fix the following panic i observed on my machine.
```
panic: out of memory before the pagedaemon thread exists
cpu0: Begin traceback...
vpanic() at netbsd:vpanic+0x189
panic() at netbsd:panic+0x3c
uvm_wait() at netbsd:uvm_wait+0xa5
uvm_km_kmem_alloc() at netbsd:uvm_km_kmem_alloc+0x21b
pool_page_alloc() at netbsd:pool_page_alloc+0x2c
pool_grow() at netbsd:pool_grow+0x367
pool_get() at netbsd:pool_get+0x9f
pool_cache_get_slow() at netbsd:pool_cache_get_slow+0x136
pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x256
getiobuf() at netbsd:getiobuf+0x23
genfs_do_io() at netbsd:genfs_do_io+0xde
genfs_gop_write() at netbsd:genfs_gop_write+0x52
```
[5 lines not shown]
sw_reg_strategy: fix a pagedaemon deadlock
this should fix the following panic i observed on my machine.
```
panic: out of memory before the pagedaemon thread exists
cpu0: Begin traceback...
vpanic() at netbsd:vpanic+0x189
panic() at netbsd:panic+0x3c
uvm_wait() at netbsd:uvm_wait+0xa5
uvm_km_kmem_alloc() at netbsd:uvm_km_kmem_alloc+0x21b
pool_page_alloc() at netbsd:pool_page_alloc+0x2c
pool_grow() at netbsd:pool_grow+0x367
pool_get() at netbsd:pool_get+0x9f
pool_cache_get_slow() at netbsd:pool_cache_get_slow+0x136
pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x256
getiobuf() at netbsd:getiobuf+0x23
swstrategy() at netbsd:swstrategy+0x25a
bdev_strategy() at netbsd:bdev_strategy+0x83
```
[6 lines not shown]
[Dexter] Add basic result evaluation for structured scripts (#198803)
This patch adds evaluation for structured scripts, completing the
features required to run simple Dexter tests using structured scripts.
The basic output from these evaluations is a list of named metrics
aggregating the results of evaluating !value nodes. The verbose output
gives a per-step summary of the results for each expect node active at
that step.
Most of the new functionality is in the evaluation/ dir, which has also
absorbed some functionality previously stored in the
ScriptDebuggerController for matching !where nodes to a debugger StepIR,
as this is logic which is common to both managing a debugger session and
evaluating the end result.
NAS-141235 / 26.0.0-RC.1 / Fix ZFS tiering event subscription (by anodos325) (#19055)
During the course of development, subscriptions in our design documents
shifted from being locked into dataset names to being locked into the
tier job id, which is dataset_name@uuid.
Original PR: https://github.com/truenas/middleware/pull/19054
Co-authored-by: Andrew Walker <andrew.walker at truenas.com>
NAS-141236 / 27.0.0-BETA.1 / Fix ZFS tiering event subscription (#19054)
During the course of development, subscriptions in our design documents
shifted from being locked into dataset names to being locked into the
tier job id, which is dataset_name@uuid.
(cherry picked from commit 3a765d25fab254f95636b61ad8de00b2f9bed9a4)
[flang][openacc] add extension which accepts multiple names in a OpenACC routine directive (#200296)
This PR adds an extension which allows one or more function names in a
single named routine directive. This is treated as multiple named
routine directives with the same clauses. The bind clause is forbidden.
The empty list of names isn't accepted. Routine clauses are stable under
unparsing.
This PR tests Parsing, Unparsing, Semantics, and Lowering.
[LV] Vectorize early exit loops with stores using masking (#178454)
This is an alternative approach to vectorizing early exit loops with
stores that avoids needing to add an extra check block. This is a
fairly straightforward approach that should work on vector ISAs
supporting masked memory ops.
The basic approach is to create a mask covering all lanes _before_ any
exiting lane, using cttz.elts and active.lane.mask (which sets all lanes
to true if the uncountable exit wasn't taken). If the uncountable exit
was taken, then there will still be one scalar iteration left to perform
after the vector loop, which also determines which exit block we should
branch to.
We no longer need to advance exit conditions in the vector body to the
next iteration (compared to the other PR), though we still need to move
the recipes needed to generate the exit condition (depending on which
memory operations are first in the loop).
[56 lines not shown]
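The masking idea described above can be sketched in plain Python. This is a model of the technique, not the VPlan implementation: `VF`, `first_true`, and the example loop body (doubling each element until a `key` is found) are assumptions chosen for illustration. `first_true` plays the role of cttz.elts, and the `l < first` comparison plays the role of active.lane.mask.

```python
VF = 4  # vector width in lanes (an assumption for this sketch)


def scalar_loop(src, dst, key, start=0):
    """Reference loop: store until an element equal to `key` is found."""
    for i in range(start, len(src)):
        if src[i] == key:      # uncountable early exit
            return i
        dst[i] = src[i] * 2    # store that must not run at or after the exit
    return len(src)            # countable exit


def first_true(lanes):
    """Index of the first true lane, or len(lanes) if there is none."""
    for idx, bit in enumerate(lanes):
        if bit:
            return idx
    return len(lanes)


def masked_vector_loop(src, dst, key):
    i = 0
    while i + VF <= len(src):
        exit_cond = [src[i + l] == key for l in range(VF)]
        first = first_true(exit_cond)
        mask = [l < first for l in range(VF)]  # lanes before any exiting lane
        for l in range(VF):                    # masked store
            if mask[l]:
                dst[i + l] = src[i + l] * 2
        i += first
        if first < VF:                         # uncountable exit was taken
            break
    # Scalar epilogue: runs the exiting iteration (or the remainder) and
    # decides which exit is taken.
    return scalar_loop(src, dst, key, start=i)


src = [1, 2, 3, 4, 5, 6, 9, 8, 7, 10]
ref, vec = [0] * len(src), [0] * len(src)
print(scalar_loop(src, ref, 9), masked_vector_loop(src, vec, 9))  # 6 6
print(ref == vec)  # True
```

When the exit lane is found, the vector loop stops just before it, so the scalar epilogue executes exactly one iteration to take the early exit.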
NAS-141235 / 27.0.0-BETA.1 / Fix ZFS tiering event subscription (#19054)
During the course of development, subscriptions in our design documents
shifted from being locked into dataset names to being locked into the
tier job id, which is dataset_name@uuid.
[AArch64][llvm] Restrict luti6 (4 regs, 8-bit) to 0 <= Zn <= 7
The `luti6` instruction (table, four registers, 8-bit) should only
allow `0 <= Zn <= 7`, since there are only 3 bits to encode Zn. It
currently also accepts:
```
luti6 { z0.b - z3.b }, zt0, { z8 - z10 }
```
which produces a duplicate encoding to the following:
```
luti6 { z0.b - z3.b }, zt0, { z0 - z2 }
```
Fix tablegen to ensure Zn is only allowed in the correct range of 0 to 7.
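The aliasing can be illustrated with a short Python sketch. It assumes that an out-of-range register number is simply truncated to the 3-bit field, which is consistent with the duplicate encoding described above but is not taken from the actual TableGen definitions. The helper names are hypothetical.

```python
ZN_BITS = 3  # width of the Zn field, per the commit message


def encode_zn_unchecked(zn: int) -> int:
    """Keep only the low ZN_BITS bits (assumed truncation behavior)."""
    return zn & ((1 << ZN_BITS) - 1)


def encode_zn_checked(zn: int) -> int:
    """Reject register numbers that do not fit in the field."""
    if not 0 <= zn < (1 << ZN_BITS):
        raise ValueError(f"Zn must be in 0..{(1 << ZN_BITS) - 1}, got z{zn}")
    return zn


print(encode_zn_unchecked(8) == encode_zn_unchecked(0))  # True
```

With the range check in place, `encode_zn_checked(8)` raises instead of silently producing the same bits as z0.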
[CIR][AArch64] Lower vfmaq_lane_v and vfma_laneq_v (#197084)
Lower BI__builtin_neon_vfmaq_lane_v and BI__builtin_neon_vfma_laneq_v in
CIR.
This handles the covered vfmaq_lane_* and vfma_laneq_* ACLE wrappers by
bitcasting operands to the expected types, selecting the requested lane
from the lane source operand, and emitting fma through
emitCallMaybeConstrainedBuiltin.
For vfmaq_lane_v, the selected lane is splatted with emitNeonSplat.
For vfma_laneq_v, the lane is selected from the wider lane source; the
f64 case extracts the scalar lane before emitting scalar fma.
Neighboring scalar lane/laneq wrappers and other out-of-scope forms
remain explicit NYI cases.
Tests are moved into the existing CIR-enabled fused multiply files under
clang/test/CodeGen/AArch64/neon/, reusing upstream LLVM checks where
[3 lines not shown]