[flang][cuda][acc] Fix use_device device attribute for USE-renamed variables (#205902)
Example:
```fortran
module m
complex(8), allocatable, pinned :: v(:,:)
interface callee
subroutine callee_x(x, n)
complex(8), device :: x(:,:)
integer :: n
end subroutine
end interface
end module
subroutine driver(n)
use m, only : callee, v_renamed => v
integer :: n
!$acc data copy(v_renamed)
!$acc host_data use_device(v_renamed)
[11 lines not shown]
[lldb] Reject DW_OP_deref_size with size 0 (#205911)
`Evaluate_DW_OP_deref` validated that the dereference size was `<= 8` but
not that it was non-zero. The DWARF expression evaluator parses
untrusted operands, so a `DW_OP_deref_size` with size operand `0` is
reachable (it is hit by the lldb-dwarf-expression-fuzzer).
A zero dereference size flows into `DerefSizeExtractDataHelper`, which
constructs a `DataExtractor` with `addr_size == 0` and aborts on its
assertion. The unit test that feeds `DW_OP_lit0, DW_OP_deref_size, 0x00`
shows the crash:
```
[ RUN ] DWARFExpressionMockProcessTest.DW_OP_deref_size_zero
Assertion failed: (addr_size >= 1 && addr_size <= 8), function
DataExtractor, file DataExtractor.cpp, line 134.
#8 DataExtractor::DataExtractor(...)
#11 DWARFExpression::Evaluate(...)
[6 lines not shown]
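The missing lower bound can be sketched as follows. This is a minimal
illustration, not the code from the patch: `IsValidDerefSize` is a
hypothetical helper name, and only the 1 to 8 byte range is taken from the
assertion quoted above.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLDB API): accept only the sizes that the
// quoted DataExtractor assertion allows, i.e. 1 through 8 bytes.
constexpr bool IsValidDerefSize(std::uint8_t size) {
  return size >= 1 && size <= 8;
}

// The old check only covered the upper bound, so 0 slipped through.
static_assert(!IsValidDerefSize(0), "zero-sized deref must be rejected");
static_assert(IsValidDerefSize(8), "8 bytes is the largest valid size");
```

An evaluator built along these lines would report an error for the
`DW_OP_lit0, DW_OP_deref_size, 0x00` input instead of reaching the
assertion.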
[Clang] Optionally use NewPM to run CodeGen Pipeline (#205928)
This patch adds a new -cc1 option to clang that runs the codegen
pipeline using the NewPM. This enables easy testing of this flow through
clang.
This actually lands #191579, which I accidentally landed into a user
branch.
[AMDGPU] Fold constant offsets into named barrier addresses
Allow isOffsetFoldingLegal to fold a constant offset into an LDS
named-barrier global, and include the node offset when materializing the
LDS address in LowerGlobalAddress. s_barrier_signal_var on a GEP'd named
barrier now selects the immediate form, matching a bare global and GlobalISel.
With object linking the offset folds into the relocation addend.
The barrier ID is derived from the address via (addr >> 4) & 0x3F, so a
byte offset that does not land on a 16-byte barrier boundary is still
valid: it simply selects the containing barrier. No alignment assertion
is needed, and such offsets must not crash the compiler (see the
misaligned test).
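As a small illustration of why a misaligned offset is harmless, the ID
derivation above can be written out directly. This is a sketch only:
`BarrierIdFromAddress` is a hypothetical name, and the addresses assume a
barrier array based at LDS address 0.

```cpp
#include <cstdint>

// Barrier ID as described above: (addr >> 4) & 0x3F.
// The function name is hypothetical, not part of the AMDGPU backend.
constexpr std::uint32_t BarrierIdFromAddress(std::uint32_t addr) {
  return (addr >> 4) & 0x3F;
}

// A byte offset of 20 does not land on a 16-byte boundary, but it still
// selects the containing barrier, the same one as offset 16.
static_assert(BarrierIdFromAddress(16) == 1, "aligned offset");
static_assert(BarrierIdFromAddress(20) == 1, "misaligned offset, same barrier");
```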
Change-Id: I639bc723eb001573585cc05d0ad19f2773054f21
Assisted-by: Cursor
[AMDGPU] Pre-commit test for constant-offset named barrier signal_var
A GEP into a named-barrier array (&bars[1]) lowers s_barrier_signal_var to
the dynamic m0 form on SelectionDAG, unlike the bare global and GlobalISel.
With object linking it emits a runtime add of the offset instead of folding
it into the relocation addend.
Also add a misaligned-offset test: a byte GEP that does not land on a
barrier boundary is valid IR and must not crash the compiler.
Change-Id: I7cea0dd64d050eb3e2143841e7136355cbb3bc50
Assisted-by: Cursor
Assert SA bulk array bounds before sa_bulk_update
The SA bulk arrays are filled by a series of conditional SA_ADD_BULK_ATTR
calls whose worst-case count is easy to miscount, so a future attribute
add could silently overrun the fixed-size array.
Add ASSERT3S(count, <=, ARRAY_SIZE(bulk)) before each sa_bulk_update so
an overrun trips in debug builds, and use ARRAY_SIZE so the bound stays
tied to the declaration. Linux zfs_setattr allocates its arrays on the
heap sized by 'bulks', so it asserts against that instead.
Also tighten FreeBSD zfs_setattr's xattr_bulk from [7] to [6], its
actual worst case.
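The idea of tying the bound to the declaration can be sketched outside the
kernel as follows. Everything here is illustrative: `ArraySize` stands in
for the kernel's ARRAY_SIZE macro, and `BulkAttr`, `bulk`, and `CountAttrs`
are hypothetical names with a made-up set of three conditional attributes.

```cpp
#include <cstddef>

// Stand-in for ARRAY_SIZE: the bound comes from the array's declaration.
template <typename T, std::size_t N>
constexpr std::size_t ArraySize(const T (&)[N]) {
  return N;
}

struct BulkAttr {
  int id;  // hypothetical payload
};

BulkAttr bulk[4];  // hypothetical fixed-size bulk array

// One unconditional attribute plus up to three conditional ones.
constexpr std::size_t CountAttrs(bool mode, bool times, bool xattr) {
  return 1 + (mode ? 1 : 0) + (times ? 1 : 0) + (xattr ? 1 : 0);
}

// If a fourth conditional attribute were added without resizing `bulk`,
// this worst-case check would fail instead of silently overrunning.
static_assert(CountAttrs(true, true, true) <= ArraySize(bulk),
              "worst case must fit in the bulk array");
```

The kernel change checks the runtime count with ASSERT3S in debug builds
rather than at compile time; the sketch only shows why comparing against
the declared size keeps the two in step.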
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #18573
Update mtime/ctime when fallocate grows a file
Growing a file with fallocate updated its size but left mtime/ctime
unchanged and didn't log the change. A fallocate that changes the file
size should update mtime/ctime, and the change should be logged so it
survives a crash.
Pass log=TRUE to zfs_freesp() on the extend path so it updates the
timestamps and logs the size change, matching zfs_space(). Punch-hole
and zero-range already use this path and are unaffected.
Reviewed-by: Rob Norris <rob.norris at truenas.com>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #18573
Persist z_seq across znode eviction
Commit 312bdab0f5 advertises STATX_ATTR_CHANGE_MONOTONIC and builds
the NFSv4 change_cookie from (ctime.tv_sec << 32) | zp->z_seq.
zp->z_seq is reset to a magic constant in zfs_znode_alloc(), so any
event that drops the znode from cache (memory pressure, remount,
reboot) regresses the lower bits of the cookie, a backward step
within the same second.
NFSv4 clients that trust this contract treat a regressed cookie as
evidence that the file's metadata cannot be relied on. VMware ESXi
over NFSv4.1 surfaces this as "The file specified is not a virtual
disk", and a VM stored on the affected NFS-exported ZFS dataset
fails to power on.
Widen z_seq to 64 bits and present it directly as the change_cookie,
dropping the ctime packing, so the cookie is a single monotonic
counter that no longer depends on the clock. FreeBSD's va_filerev
consumer also takes the wider value.
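The regression in the old packing can be reproduced with plain integers.
This is a sketch under stated assumptions: `PackedCookie` is a hypothetical
name for the expression quoted above, and `kSeqResetValue` is a placeholder
for the magic constant, whose real value is not given here.

```cpp
#include <cstdint>

// Old scheme as described above: (ctime.tv_sec << 32) | z_seq.
constexpr std::uint64_t PackedCookie(std::uint64_t ctime_sec,
                                     std::uint32_t z_seq) {
  return (ctime_sec << 32) | z_seq;
}

// Placeholder for the value z_seq is reset to on znode allocation.
constexpr std::uint32_t kSeqResetValue = 0x1000;

// Five updates within second 100, then the znode is evicted and re-read
// in the same second: the cookie steps backward.
static_assert(PackedCookie(100, kSeqResetValue) <
                  PackedCookie(100, kSeqResetValue + 5),
              "cookie after eviction is smaller than before");
```

A single persisted 64-bit counter avoids this because nothing resets its
low bits when the znode leaves the cache.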
[29 lines not shown]
[clang-doc] Try to make testing more uniform (#205586)
Today clang-doc has tests for its various backends that use the same
input files, and mix the checks for each format. This leads to very
large test files that are quite hard to update or maintain. Thus far
we've assumed that this is better than updating several files, but as we
leverage mustache and JSON more and more to test feature completeness,
much of the output complexity is now limited to each backend and its
mustache templates. To make this simpler to maintain, we can lean into
common test Inputs, keeping the annotated source separate from the test
checks, and split the checks out into their own directory hierarchy.
This patch is mostly a mechanical rewrite. It was done with the
assistance of an LLM, but was checked by me, and verified with
instrumentation-based coverage that we did not lose any line coverage.
[lldb] Support indexes of hidden children in SBValue.child accessor (#205604)
A value object can have "hidden" children, which are children that can
be accessed by an index that is past the number of children reported by
the value object. This is used, for example, with `$$dereference$$`
children. It's also useful for designing a value object which shows only
a subset of its children.
This change updates the `SBValue.child` accessor to allow indexes that
are past the reported number of children, to match the behavior of
`GetChildAtIndex`.
[flang][OpenMP] Delete definitions of non-delimited end-directives, NFC
Delimited directives are those that come in begin/end pairs, e.g. "begin
declare target"/"end declare target". Other block-associated directives
in Fortran do have end-forms, but they don't need to have specific
directive enums. Some such enums have been used in the past, but are not
anymore. Delete those extraneous definitions to clean up the OMP.td file.
netinet6: refactor in6_pcbconnect()
If the inpcb is already bound to a local address, there is no reason to
call in6_pcbladdr(). If the inpcb is already bound to a local port, there
is no reason to call in_pcb_lport_dest(). In the opposite case, if the
inpcb is not bound, and we are about to choose a non-conflicting local
addr:port, then there is no reason to call in6_pcblookup_internal().
This change makes in6_pcbconnect() look much more like the IPv4
in_pcbconnect(). I tracked this strange logic all the way back to the
initial KAME import and failed to find any reasoning for it.
Reviewed by: pouria
Differential Revision: https://reviews.freebsd.org/D57534
[Clang] Optionally use NewPM to run CodeGen Pipeline (#191579)
This patch adds a new -cc1 option to clang that runs the codegen
pipeline using the NewPM. This enables easy testing of this flow through
clang.
[flang][cuda] Fix predefined variable processing with inlining (#205888)
The pass was skipping some variables, for example when they were inlined
inside a cuf kernel.
[LV] Use getSmallBestKnownTC in IV-overflow-check (#195226)
It has the benefit of also handling scalable TCs.
Co-authored-by: Florian Hahn <flo at fhahn.com>
[AArch64] Fix stack protectors with tiny code model. (#205668)
The tiny code model was using the address of the stack protector as the
stack protector, which doesn't provide the expected protection. Fix it
to use the usual adrp+ldr.
[CIR] Fix return type of __cxa_atexit (#205905)
The return type should be 'int', not 'void'. We even have a comment
above the code that generates this that it should be an int.
This patch changes it and updates all the affected tests.