LLVM/project dc2b5bfflang/lib/Lower OpenACC.cpp, flang/lib/Semantics resolve-names.cpp

[flang][cuda][acc] Fix use_device device attribute for USE-renamed variables (#205902)

Example:
```fortran
module m
  complex(8), allocatable, pinned :: v(:,:)
  interface callee
    subroutine callee_x(x, n)
      complex(8), device :: x(:,:)
      integer :: n
    end subroutine
  end interface
end module

subroutine driver(n)
  use m, only : callee, v_renamed => v
  integer :: n
  !$acc data copy(v_renamed)
  !$acc host_data use_device(v_renamed)
```

    [11 lines not shown]
DeltaFile
+22-4flang/lib/Semantics/resolve-names.cpp
+16-0flang/test/Lower/OpenACC/acc-host-data-cuda-device.f90
+12-3flang/lib/Lower/OpenACC.cpp
+50-73 files

LLVM/project 6b42621lldb/source/Expression DWARFExpression.cpp, lldb/unittests/Expression DWARFExpressionTest.cpp

[lldb] Reject DW_OP_deref_size with size 0 (#205911)

`Evaluate_DW_OP_deref` validated that the dereference size was `<= 8` but
not that it was non-zero. The DWARF expression evaluator parses
untrusted operands, so a `DW_OP_deref_size` with size operand `0` is
reachable (it is hit by the lldb-dwarf-expression-fuzzer).
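
The missing check amounts to a range test on the size operand. The following is a minimal standalone sketch, not the lldb code: `is_valid_deref_size` is a hypothetical helper, and the only facts taken from the commit message are that the size must be at most 8 and must not be zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an lldb API): returns true when a
// DW_OP_deref_size operand is usable, i.e. between 1 and 8 bytes.
constexpr bool is_valid_deref_size(std::uint8_t size) {
  // The old check only enforced the upper bound (size <= 8);
  // the lower bound rejects the zero-size operand.
  return size >= 1 && size <= 8;
}
```

With a check of this shape, an expression such as `DW_OP_lit0, DW_OP_deref_size, 0x00` would presumably be rejected as an evaluation error rather than reaching the assertion.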

A zero dereference size flows into `DerefSizeExtractDataHelper`, which
constructs a `DataExtractor` with `addr_size == 0` and aborts on its
assertion. The unit test that feeds `DW_OP_lit0, DW_OP_deref_size, 0x00`
shows the crash:

```
[ RUN      ] DWARFExpressionMockProcessTest.DW_OP_deref_size_zero
Assertion failed: (addr_size >= 1 && addr_size <= 8), function
DataExtractor, file DataExtractor.cpp, line 134.
 #8  DataExtractor::DataExtractor(...)
 #11 DWARFExpression::Evaluate(...)
```

    [6 lines not shown]
DeltaFile
+16-0lldb/unittests/Expression/DWARFExpressionTest.cpp
+1-1lldb/source/Expression/DWARFExpression.cpp
+17-12 files

LLVM/project 38f6ccbclang/include/clang/Basic CodeGenOptions.def, clang/include/clang/Options Options.td

[Clang] Optionally use NewPM to run CodeGen Pipeline (#205928)

This patch adds a new -cc1 option to clang that runs the codegen
pipeline using the NewPM. This enables easy testing of this flow through
clang.

This actually lands #191579, which I had accidentally landed into a
user branch.
DeltaFile
+77-17clang/lib/CodeGen/BackendUtil.cpp
+9-0clang/test/CodeGen/X86/newpm.c
+8-0clang/include/clang/Options/Options.td
+1-0clang/include/clang/Basic/CodeGenOptions.def
+95-174 files

LLVM/project 16ef565llvm/lib/Target/AMDGPU AMDGPUISelLowering.cpp SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU s-barrier-signal-var-gep-misaligned.ll s-barrier-signal-var-gep.ll

[AMDGPU] Fold constant offsets into named barrier addresses

Allow isOffsetFoldingLegal to fold a constant offset into an LDS
named-barrier global, and include the node offset when materializing the
LDS address in LowerGlobalAddress. s_barrier_signal_var on a GEP'd named
barrier now selects the immediate form, matching a bare global and GlobalISel.
With object linking the offset folds into the relocation addend.

The barrier ID is derived from the address via (addr >> 4) & 0x3F, so a
byte offset that does not land on a 16-byte barrier boundary is still
valid: it simply selects the containing barrier. No alignment assertion
is needed, and such offsets must not crash the compiler (see the
misaligned test).
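
The barrier ID formula above can be illustrated with a small standalone sketch. The function name `barrier_id` and the 32-bit address type are assumptions for illustration; only the expression `(addr >> 4) & 0x3F` comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Barrier ID as described in the commit message: (addr >> 4) & 0x3F.
// Hypothetical helper, not taken from the AMDGPU backend sources.
constexpr std::uint32_t barrier_id(std::uint32_t lds_addr) {
  return (lds_addr >> 4) & 0x3F;
}

// A byte offset that is not a multiple of 16 still maps to the barrier
// whose 16-byte slot contains the address.
static_assert(barrier_id(0x10) == barrier_id(0x18), "same containing barrier");
```

Under this formula, addresses `0x10` through `0x1F` all select barrier 1, which is why a misaligned offset is harmless rather than an error.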

Change-Id: I639bc723eb001573585cc05d0ad19f2773054f21
Assisted-by: Cursor
DeltaFile
+0-65llvm/test/CodeGen/AMDGPU/s-barrier-signal-var-gep-misaligned.ll
+8-6llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+4-8llvm/test/CodeGen/AMDGPU/s-barrier-signal-var-gep.ll
+8-0llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+20-794 files

LLVM/project e983d49llvm/test/CodeGen/AMDGPU s-barrier-signal-var-gep-misaligned.ll

[AMDGPU] Pre-commit test for constant-offset named barrier signal_var

A GEP into a named-barrier array (&bars[1]) lowers s_barrier_signal_var to
the dynamic m0 form on SelectionDAG, unlike the bare global and GlobalISel.
With object linking it emits a runtime add of the offset instead of folding
it into the relocation addend.

Also add a misaligned-offset test: a byte GEP that does not land on a
barrier boundary is valid IR and must not crash the compiler.

Change-Id: I7cea0dd64d050eb3e2143841e7136355cbb3bc50
Assisted-by: Cursor
DeltaFile
+65-0llvm/test/CodeGen/AMDGPU/s-barrier-signal-var-gep-misaligned.ll
+65-01 files

LLVM/project 537b7f8llvm/include/llvm/CodeGen CommandFlags.h, llvm/lib/CodeGen CommandFlags.cpp

opt: Use Triple overloads of TargetRegistry lookup (#205889)

Continue migrating target creation functions to use
Triple instead of StringRef.
DeltaFile
+9-2llvm/lib/CodeGen/CommandFlags.cpp
+7-0llvm/include/llvm/CodeGen/CommandFlags.h
+4-2llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
+3-2llvm/tools/llvm-opt-fuzzer/llvm-opt-fuzzer.cpp
+1-1llvm/tools/llvm-reduce/ReducerWorkItem.cpp
+1-1llvm/tools/opt/optdriver.cpp
+25-81 files not shown
+26-97 files

LLVM/project 533321eclang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/Sema SemaOpenMP.cpp

Add fixes
DeltaFile
+11-1clang/test/OpenMP/teams_num_teams_messages.cpp
+9-0clang/lib/Sema/SemaOpenMP.cpp
+2-0clang/include/clang/Basic/DiagnosticSemaKinds.td
+2-0clang/tools/libclang/CIndex.cpp
+24-14 files

OpenZFS/src d3af7c2module/os/freebsd/zfs zfs_vnops_os.c zfs_dir.c, module/os/linux/zfs zfs_dir.c zfs_vnops_os.c

Assert SA bulk array bounds before sa_bulk_update

The SA bulk arrays are filled by a series of conditional SA_ADD_BULK_ATTR
calls whose worst-case count is easy to miscount, so a future attribute
add could silently overrun the fixed-size array.
Add ASSERT3S(count, <=, ARRAY_SIZE(bulk)) before each sa_bulk_update so
an overrun trips in debug builds, and use ARRAY_SIZE so the bound stays
tied to the declaration. Linux zfs_setattr allocates its arrays on the
heap sized by 'bulks', so it asserts against that instead.
Also tighten FreeBSD zfs_setattr's xattr_bulk from [7] to [6], its
actual worst case.
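
The pattern can be sketched outside ZFS as follows. This is an illustration under assumptions, not the OpenZFS code: `BulkAttr`, `array_size`, and `fill_bulk` are hypothetical names, and a plain `assert` stands in for `ASSERT3S`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for one SA bulk entry; the real structure differs.
struct BulkAttr {
  int attr_id;
};

// Element count of a fixed-size array, playing the role of ARRAY_SIZE.
template <typename T, std::size_t N>
constexpr std::size_t array_size(const T (&)[N]) {
  return N;
}

// Conditionally fills a fixed-size array and checks the count before the
// array would be handed to the update call. Returns the entries written.
inline std::size_t fill_bulk(bool want_mode, bool want_size, bool want_times) {
  BulkAttr bulk[4];  // worst case in this sketch: 1 + 1 + 2 entries
  std::size_t count = 0;
  if (want_mode)
    bulk[count++] = {1};
  if (want_size)
    bulk[count++] = {2};
  if (want_times) {
    bulk[count++] = {3};
    bulk[count++] = {4};
  }
  // The bound comes from the declaration, so resizing `bulk` needs no
  // second edit here.
  assert(count <= array_size(bulk));
  return count;
}
```

If a later change added a fifth conditional entry without growing `bulk`, the assertion would fire in a debug build instead of the overrun going unnoticed.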

Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #18573
DeltaFile
+4-1module/os/freebsd/zfs/zfs_vnops_os.c
+4-0module/os/linux/zfs/zfs_dir.c
+4-0module/os/linux/zfs/zfs_vnops_os.c
+4-0module/os/freebsd/zfs/zfs_dir.c
+3-0module/zfs/zfs_vnops.c
+1-0module/os/freebsd/zfs/zfs_acl.c
+20-14 files not shown
+24-110 files

OpenZFS/src 1d98dfemodule/os/linux/zfs zpl_file.c, tests/runfiles linux.run

Update mtime/ctime when fallocate grows a file

Growing a file with fallocate updated its size but left mtime/ctime
unchanged and didn't log the change. A fallocate that changes the file
size should update mtime/ctime, and the change should be logged so it
survives a crash.
Pass log=TRUE to zfs_freesp() on the extend path so it updates the
timestamps and logs the size change, matching zfs_space(). Punch-hole
and zero-range already use this path and are unaffected.

Reviewed-by: Rob Norris <rob.norris at truenas.com>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #18573
DeltaFile
+77-0tests/zfs-tests/tests/functional/fallocate/fallocate_extend_timestamps.ksh
+6-1module/os/linux/zfs/zpl_file.c
+1-1tests/runfiles/linux.run
+1-0tests/zfs-tests/tests/Makefile.am
+85-24 files

LLVM/project c7edbc0llvm/test/CodeGen/AMDGPU s-barrier-signal-var-gep.ll

[AMDGPU] Pre-commit test for constant-offset named barrier signal_var

A GEP into a named-barrier array (&bars[1]) lowers s_barrier_signal_var to
the dynamic m0 form on SelectionDAG, unlike the bare global and GlobalISel.
With object linking it emits a runtime add of the offset instead of folding
it into the relocation addend.

Also add a misaligned-offset test: a byte GEP that does not land on a
barrier boundary is valid IR and must not crash the compiler.

Change-Id: I7cea0dd64d050eb3e2143841e7136355cbb3bc50
Assisted-by: Cursor
DeltaFile
+258-0llvm/test/CodeGen/AMDGPU/s-barrier-signal-var-gep.ll
+258-01 files

LLVM/project 2c71c11lldb/include/lldb/Breakpoint BreakpointResolverName.h, lldb/source/Breakpoint BreakpointResolverName.cpp

[lldb] Replace ConstString in BreakpointResolverName::AddNameLookup signature (#205924)
DeltaFile
+10-10lldb/source/Breakpoint/BreakpointResolverName.cpp
+1-1lldb/include/lldb/Breakpoint/BreakpointResolverName.h
+11-112 files

OpenZFS/src d0b43eeinclude/sys zfs_znode.h, module/os/freebsd/zfs zfs_vnops_os.c zfs_znode_os.c

Persist z_seq across znode eviction

Commit 312bdab0f5 advertises STATX_ATTR_CHANGE_MONOTONIC and builds
the NFSv4 change_cookie from (ctime.tv_sec << 32) | zp->z_seq.
zp->z_seq is reset to a magic constant in zfs_znode_alloc(), so any
event that drops the znode from cache (memory pressure, remount,
reboot) regresses the lower bits of the cookie, a backward step
within the same second.
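
The regression can be reproduced with a small sketch of the packing. `pack_cookie` is a hypothetical helper, the sequence values are made up, and treating `z_seq` as a 32-bit field is an assumption of this sketch; only the expression `(ctime.tv_sec << 32) | z_seq` comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Old scheme from the commit message: (ctime.tv_sec << 32) | z_seq.
// Assumption for this sketch: z_seq occupies the low 32 bits.
constexpr std::uint64_t pack_cookie(std::uint64_t ctime_sec,
                                    std::uint32_t z_seq) {
  return (ctime_sec << 32) | z_seq;
}

// Illustrative values only: the counter had advanced to 1000, then the
// znode was evicted and re-created with the counter reset to 7.
constexpr std::uint64_t before_eviction = pack_cookie(100, 1000);
constexpr std::uint64_t after_eviction = pack_cookie(100, 7);
static_assert(after_eviction < before_eviction,
              "cookie steps backward within the same second");
```

Once the clock moves to the next second the packed value is larger again, so the backward step is only visible to a client that samples the cookie twice within one second.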

NFSv4 clients that trust this contract treat a regressed cookie as
evidence that the file's metadata cannot be relied on. VMware ESXi
over NFSv4.1 surfaces this as "The file specified is not a virtual
disk", and a VM stored on the affected NFS-exported ZFS dataset
fails to power on.

Widen z_seq to 64 bit and present it directly as the change_cookie,
dropping the ctime packing, so the cookie is a single monotonic
counter that no longer depends on the clock. FreeBSD's va_filerev
consumer also takes the wider value.

    [29 lines not shown]
DeltaFile
+36-34module/os/linux/zfs/zfs_vnops_os.c
+24-15module/os/freebsd/zfs/zfs_vnops_os.c
+25-11module/zfs/zfs_vnops.c
+32-4module/os/linux/zfs/zfs_znode_os.c
+28-3module/os/freebsd/zfs/zfs_znode_os.c
+26-1include/sys/zfs_znode.h
+171-6810 files not shown
+247-8616 files

LLVM/project 0ce7d23clang-tools-extra/test/clang-doc enum.cpp templates.cpp, clang-tools-extra/test/clang-doc/html enum.cpp

[clang-doc] Try to make testing more uniform (#205586)

Today clang-doc has tests for its various backends that use the same
input files, and mix the checks for each format. This leads to very
large test files that are quite hard to update or maintain. Thus far
we've assumed that this is better than updating several files, but as we
leverage mustache and JSON more and more to test feature completeness,
much of the output complexity is now limited to each backend and its
mustache templates. To make this simpler to maintain, we can lean into
common test Inputs, keeping the annotated source separate from the test
checks, and split the checks out into their own directory hierarchy.
This patch is mostly mechanical rewriting of code. This was done with
the assistance of an LLM, but was checked by me, and verified with
instrumentation based coverage that we did not lose any line coverage.
DeltaFile
+0-839clang-tools-extra/test/clang-doc/enum.cpp
+0-491clang-tools-extra/test/clang-doc/templates.cpp
+0-380clang-tools-extra/test/clang-doc/namespace.cpp
+377-0clang-tools-extra/test/clang-doc/json/enum.cpp
+354-0clang-tools-extra/test/clang-doc/html/enum.cpp
+196-0clang-tools-extra/test/clang-doc/json/templates.cpp
+927-1,710106 files not shown
+2,716-2,827112 files

LLVM/project bb02279llvm/test/CodeGen/AMDGPU div_v2i128.ll bf16.ll, llvm/test/CodeGen/AMDGPU/GlobalISel udiv.i64.ll urem.i64.ll

Merge branch 'main' into users/aokblast/moneypunct_fbsd_test
DeltaFile
+2,592-2,587llvm/test/CodeGen/AMDGPU/div_v2i128.ll
+1,940-1,931llvm/test/CodeGen/AMDGPU/bf16.ll
+1,410-1,359llvm/test/CodeGen/AMDGPU/GlobalISel/udiv.i64.ll
+1,351-1,351llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll
+1,701-810llvm/test/CodeGen/AMDGPU/llvm.set.rounding.ll
+1,555-816mlir/lib/Dialect/XeGPU/Transforms/XeGPULayoutImpl.cpp
+10,549-8,8542,914 files not shown
+105,678-92,8572,920 files

LLVM/project e749849flang/test/Semantics cuf09.cuf cuf27.cuf, flang/test/Semantics/CUDA cuf09.cuf cuf27.cuf

[flang][cuda][NFC] Move CUDA Fortran semantic tests to a dedicated directory (#205927)
DeltaFile
+283-0flang/test/Semantics/CUDA/cuf09.cuf
+0-283flang/test/Semantics/cuf09.cuf
+0-116flang/test/Semantics/cuf27.cuf
+116-0flang/test/Semantics/CUDA/cuf27.cuf
+0-110flang/test/Semantics/cuf03.cuf
+110-0flang/test/Semantics/CUDA/cuf03.cuf
+509-50972 files not shown
+2,028-2,02878 files

LLVM/project de4a78d

Merge branch 'users/chinmaydd/gep-fix-1' into users/chinmaydd/named-barrier-gep-offset
DeltaFile
+0-00 files

LLVM/project 720a24allvm/lib/Target/AMDGPU AMDGPUISelLowering.cpp SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU s-barrier-signal-var-gep.ll s-barrier-signal-var-gep-misaligned.ll

[AMDGPU] Fold constant offsets into named barrier addresses

Allow isOffsetFoldingLegal to fold a constant offset into an LDS
named-barrier global, and include the node offset when materializing the
LDS address in LowerGlobalAddress. s_barrier_signal_var on a GEP'd named
barrier now selects the immediate form, matching a bare global and GlobalISel.
With object linking the offset folds into the relocation addend.

The barrier ID is derived from the address via (addr >> 4) & 0x3F, so a
byte offset that does not land on a 16-byte barrier boundary is still
valid: it simply selects the containing barrier. No alignment assertion
is needed, and such offsets must not crash the compiler (see the
misaligned test).

Change-Id: I639bc723eb001573585cc05d0ad19f2773054f21
Assisted-by: Cursor
DeltaFile
+64-10llvm/test/CodeGen/AMDGPU/s-barrier-signal-var-gep.ll
+0-65llvm/test/CodeGen/AMDGPU/s-barrier-signal-var-gep-misaligned.ll
+8-6llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+8-0llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+80-814 files

LLVM/project bbba463lldb/bindings/interface SBValueExtensions.i, lldb/test/API/functionalities/data-formatter/data-formatter-python-synth TestDataFormatterPythonSynth.py

[lldb] Support indexes of hidden children in SBValue.child accessor (#205604)

A value object can have "hidden" children, which are children that can
be accessed by an index that is past the number of children reported by
the value object. This is used for example with `$$dereference$$`
children. It's also useful for designing a value object which shows only
a subset of its children.

This change updates the `SBValue.child` accessor to allow indexes that
are past the reported number of children, to match the behavior of
`GetChildAtIndex`.
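
The behavior described above can be modeled with a toy structure. This is a hypothetical model, not the lldb API: `ValueModel` and its members are invented for illustration, and it only mirrors the idea that the accessor no longer rejects indexes at or past the reported child count.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of a value object, not the lldb API.
struct ValueModel {
  std::vector<std::string> children;  // all children, including hidden ones
  std::size_t reported_count;         // number of children the object reports

  // Models GetChildAtIndex as described: valid for any existing child.
  std::optional<std::string> get_child_at_index(std::size_t idx) const {
    if (idx < children.size())
      return children[idx];
    return std::nullopt;
  }

  // Models the updated accessor: it defers to get_child_at_index instead
  // of first rejecting idx >= reported_count.
  std::optional<std::string> child(std::size_t idx) const {
    return get_child_at_index(idx);
  }
};
```

For a model that reports two children but stores a third hidden one, `child(2)` returns the hidden child, while an index with no child behind it still yields an empty result.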
DeltaFile
+10-0lldb/test/API/functionalities/data-formatter/data-formatter-python-synth/TestDataFormatterPythonSynth.py
+4-3lldb/bindings/interface/SBValueExtensions.i
+14-32 files

LLVM/project acc5799llvm/include/llvm/Frontend/OpenMP OMP.td

[flang][OpenMP] Delete definitions of non-delimited end-directives, NFC

Delimited directives are those that come in begin/end pairs, e.g. "begin
declare target"/"end declare target". Other block-associated directives
in Fortran do have end-forms, but they don't need to have specific
directive enums. Some such enums have been used in the past, but are not
anymore. Delete those extraneous definitions to clean up the OMP.td file.
DeltaFile
+1-65llvm/include/llvm/Frontend/OpenMP/OMP.td
+1-651 files

LLVM/project 1b7faebllvm/test/CodeGen/AMDGPU s-barrier-signal-var-gep.ll

[AMDGPU] Pre-commit test for constant-offset named barrier signal_var

A GEP into a named-barrier array (&bars[1]) lowers s_barrier_signal_var to
the dynamic m0 form on SelectionDAG, unlike the bare global and GlobalISel.
With object linking it emits a runtime add of the offset instead of folding
it into the relocation addend.

Also add a misaligned-offset test: a byte GEP that does not land on a
barrier boundary is valid IR and must not crash the compiler.

Change-Id: I7cea0dd64d050eb3e2143841e7136355cbb3bc50
Assisted-by: Cursor
DeltaFile
+177-0llvm/test/CodeGen/AMDGPU/s-barrier-signal-var-gep.ll
+177-01 files

LLVM/project 852c60fclang-tools-extra/test/clang-doc basic-project.mustache.test, clang-tools-extra/test/clang-doc/html enum.cpp

fix more whitespace issues
DeltaFile
+0-11clang-tools-extra/test/clang-doc/html/enum.cpp
+0-9clang-tools-extra/test/clang-doc/md/enum.cpp
+0-3clang-tools-extra/test/clang-doc/basic-project.mustache.test
+0-233 files

LLVM/project d1c291bflang/test/Semantics/OpenMP atomic01.f90 clause-validity01.f90

[flang][OpenMP] Unify wording of directive names in diagnostics (#205608)
DeltaFile
+63-63flang/test/Semantics/OpenMP/atomic01.f90
+28-28flang/test/Semantics/OpenMP/clause-validity01.f90
+27-27flang/test/Semantics/OpenMP/combined-constructs.f90
+24-24flang/test/Semantics/OpenMP/if-clause-45.f90
+14-14flang/test/Semantics/OpenMP/device-constructs.f90
+10-10flang/test/Semantics/OpenMP/atomic-compare.f90
+166-16643 files not shown
+255-25549 files

FreeBSD/src 90ea8e8sys/netinet6 in6_pcb.c

netinet6: refactor in6_pcbconnect()

If the inpcb is already bound to a local address, there is no reason to
call in6_pcbladdr().  If the inpcb is already bound to a local port, there
is no reason to call in_pcb_lport_dest().  In the opposite case, if the
inpcb is not bound, and we are about to choose a non-conflicting local
addr:port, then there is no reason to call in6_pcblookup_internal().

This change makes in6_pcbconnect() look much more like the IPv4
in_pcbconnect().  I tracked this strange logic all the way down to the initial
KAME import and failed to find any reasoning for it.

Reviewed by:            pouria
Differential Revision:  https://reviews.freebsd.org/D57534
DeltaFile
+17-17sys/netinet6/in6_pcb.c
+17-171 files

LLVM/project a2b1a42clang/include/clang/Basic CodeGenOptions.def, clang/include/clang/Options Options.td

[Clang] Optionally use NewPM to run CodeGen Pipeline (#191579)

This patch adds a new -cc1 option to clang that runs the codegen
pipeline using the NewPM. This enables easy testing of this flow through
clang.
DeltaFile
+77-17clang/lib/CodeGen/BackendUtil.cpp
+9-0clang/test/CodeGen/X86/newpm.c
+8-0clang/include/clang/Options/Options.td
+1-0clang/include/clang/Basic/CodeGenOptions.def
+95-174 files

LLVM/project f60f84fllvm/include/llvm/CodeGen CommandFlags.h

Apply suggestions from code review

Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
DeltaFile
+1-0llvm/include/llvm/CodeGen/CommandFlags.h
+1-01 files

LLVM/project 67ab8b4flang/lib/Optimizer/Transforms/CUDA CUFPredefinedVarToGPU.cpp, flang/test/Fir/CUDA predefined-variables.mlir

[flang][cuda] Fix predefined variable processing with inlining (#205888)

The pass was skipping some variables, for example when they were inlined
inside a cuf kernel.
DeltaFile
+83-0flang/test/Fir/CUDA/predefined-variables.mlir
+13-11flang/lib/Optimizer/Transforms/CUDA/CUFPredefinedVarToGPU.cpp
+96-112 files

LLVM/project 53e0a1fllvm/lib/Transforms/Vectorize LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize/AArch64 indvar-overflow-check-scalable-tc.ll

[LV] Use getSmallBestKnownTC in IV-overflow-check (#195226)

It has the benefit of also handling scalable TCs.

Co-authored-by: Florian Hahn <flo at fhahn.com>
DeltaFile
+158-0llvm/test/Transforms/LoopVectorize/AArch64/indvar-overflow-check-scalable-tc.ll
+97-0llvm/test/Transforms/LoopVectorize/RISCV/indvar-overflow-check-scalable-tc.ll
+22-11llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+277-113 files

LLVM/project b05f34cflang/lib/Lower Bridge.cpp, flang/test/Lower/CUDA cuda-data-transfer.cuf

[flang][cuda] Do not add shape when no references are involved (#205919)
DeltaFile
+17-0flang/test/Lower/CUDA/cuda-data-transfer.cuf
+8-3flang/lib/Lower/Bridge.cpp
+25-32 files

LLVM/project 52de711llvm/lib/Target/AArch64 AArch64InstrInfo.cpp, llvm/test/CodeGen/AArch64 tiny-model-static.ll

[AArch64] Fix stack protectors with tiny code model. (#205668)

The tiny code model was using the address of the stack protector as the
stack protector, which doesn't provide the expected protection. Fix it
to use the usual adrp+ldr.
DeltaFile
+79-2llvm/test/CodeGen/AArch64/tiny-model-static.ll
+0-8llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+79-102 files

LLVM/project abe675fclang/test/CIR/CodeGen global-tls-dyn-init.cpp global-init.cpp

[CIR] Fix return type of __cxa_atexit (#205905)

The return type should be 'int', not 'void'. We even have a comment
above the code that generates this that it should be an int.

This patch changes it and updates all the affected tests.
DeltaFile
+6-6clang/test/CIR/CodeGen/global-tls-dyn-init.cpp
+6-6clang/test/CIR/CodeGen/global-init.cpp
+6-6clang/test/CIR/CodeGen/thread-local-in-func.cpp
+4-5clang/test/CIR/CodeGen/global-temp-dtor.cpp
+3-3clang/test/CIR/CodeGen/static-members.cpp
+2-2clang/test/CIR/CodeGen/global-array-dtor.cpp
+27-286 files not shown
+36-3612 files