LLVM/project f5d5ff9flang/docs OpenACC-extensions.md, flang/include/flang/Semantics semantics.h

[flang][semantics][OpenACC] Warn for DEFAULT(NONE) scalars by default (#205683)

Change OpenACC `DEFAULT(NONE)` scalar handling to use the
pre-OpenACC-3.2 scalar behavior by default while emitting a warning.
Scalars referenced in a `default(none)` compute region without an
explicit data clause now warn by default instead of erroring. Arrays and
other non-scalars still error under `default(none)`.

Users can opt into OpenACC 3.2 strict scalar behavior with:
`-fopenacc-default-none-scalars-strict` and the default scalar warning
can be suppressed with: `-Wno-openacc-default-none-scalars-strict`
DeltaFile
+13-17flang/docs/OpenACC-extensions.md
+28-0flang/unittests/Frontend/CompilerInstanceTest.cpp
+15-12flang/test/Semantics/OpenACC/acc-default-none-scalars-strict.f90
+12-12flang/test/Semantics/OpenACC/acc-default-none-scalars.f90
+9-0flang/include/flang/Semantics/semantics.h
+7-2flang/lib/Frontend/CompilerInvocation.cpp
+84-437 files not shown
+97-5013 files

FreeBSD/ports b60f4fbnet/bsdec2-image-upload distinfo Makefile

net/bsdec2-image-upload: update to 1.4.11
DeltaFile
+3-3net/bsdec2-image-upload/distinfo
+1-1net/bsdec2-image-upload/Makefile
+4-42 files

FreeBSD/src 240330alib/libusb libusb20_ugen20.c

libusb: don't treat EINVAL from USB_FS_COMPLETE as device detach

ugen20_process() treats any non-EBUSY errno returned by USB_FS_COMPLETE
as device detach and returns LIBUSB20_ERROR_OTHER. This causes libusb10
to set device_is_gone and fail all subsequent transfers with
LIBUSB_ERROR_NO_DEVICE.

However, USB_FS_COMPLETE can also return EINVAL when a completion
references an endpoint that no longer exists, for example after
SET_INTERFACE or SET_CONFIG removes and recreates endpoints. This is a
transient condition and does not indicate device detach.

Treat EINVAL the same as EBUSY and stop draining completions. This
prevents a guest selecting an isochronous streaming altsetting from
permanently breaking the passed-through device.

Reviewed by:    bapt
Event:          Halifax Hackathon 202606
Location:       Peggy's Cove Rock

    [2 lines not shown]
DeltaFile
+10-1lib/libusb/libusb20_ugen20.c
+10-11 files
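
The errno handling described in the libusb commit above can be sketched as a small classifier. This is an illustrative sketch only, not the code in `libusb20_ugen20.c`: the enum `drain_action` and the function `classify_complete_errno` are hypothetical names introduced here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome of one completion poll; not a libusb type. */
enum drain_action {
    DRAIN_STOP,     /* stop draining completions; device still present */
    DRAIN_DETACHED  /* treat the device as gone */
};

/*
 * Classify the errno reported by a failed completion request.
 * EBUSY and EINVAL both mean "stop draining": EINVAL can be a transient
 * result of endpoints being removed and recreated. Any other errno is
 * treated as device detach.
 */
enum drain_action classify_complete_errno(int err)
{
    assert(err > 0);   /* errno values are positive */

    switch (err) {
    case EBUSY:
    case EINVAL:
        return DRAIN_STOP;
    default:
        return DRAIN_DETACHED;
    }
}
```

With this shape, `classify_complete_errno(EINVAL)` yields `DRAIN_STOP`, so a caller would not mark the device as gone after an altsetting change.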

LLVM/project c580406clang/lib/CIR/CodeGen CIRGenFunction.cpp CIRGenModule.cpp, clang/test/CIR/CodeGen goto-address-label-table.c label-values.c

[CIR] Wire const goto labels into indirect branch (#201644)

A computed goto through a constant dispatch table -- the GNU static
dispatch-table idiom `static const void *tbl[] = {&&L1, &&L2}; goto *tbl[i];`
-- reached `errorNYI("Indirect goto without a goto block")` in
`emitIndirectGotoStmt`. #203644 emits the label-address constant (the
value-like `#cir.block_addr_info`) into the table, but it takes a label's
address in a constant context without registering the label as address-taken,
so no indirect-goto block exists for the following `goto *tbl[i]` to branch to.
(#203644 landed the constant attribute, its lowering, and the GotoSolver label
retention; this is the remaining dispatch wiring.)

`VisitAddrLabelExpr` in the constant emitter now records each label via
`takeAddressOfConstantLabel`, which instantiates the indirect-goto block and
tracks the label; `finishIndirectBranch` then adds those labels as
`cir.indirect_br` successors alongside the existing op-form labels. A label
named more than once in a table is kept as a distinct successor each time, to
match classic codegen.


    [8 lines not shown]
DeltaFile
+104-0clang/test/CIR/CodeGen/goto-address-label-table.c
+23-29clang/lib/CIR/CodeGen/CIRGenFunction.cpp
+9-38clang/test/CIR/CodeGen/label-values.c
+0-23clang/lib/CIR/CodeGen/CIRGenModule.cpp
+12-4clang/lib/CIR/CodeGen/CIRGenExprConstant.cpp
+9-4clang/lib/CIR/CodeGen/CIRGenFunction.h
+157-983 files not shown
+162-1269 files
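
For readers unfamiliar with the idiom the CIR commit above refers to, here is a minimal sketch of a computed goto through a constant table. It relies on the GNU C "labels as values" extension, which GCC and Clang accept. The function name `dispatch`, the return values, and the bounds check are illustrative assumptions.

```c
#include <assert.h>

/*
 * The table is a constant initializer that takes the address of each
 * label; the goto then branches indirectly through the selected entry.
 */
int dispatch(int i)
{
    static const void *tbl[] = {&&L1, &&L2};

    assert(i >= 0 && i < 2);   /* hypothetical bounds check for the sketch */
    goto *tbl[i];
L1:
    return 10;
L2:
    return 20;
}
```

Here the only place the labels' addresses are taken is the constant initializer, which is the case the commit says previously left no indirect-goto block to branch to.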

LLVM/project 7cb370dlldb/include/lldb/Breakpoint BreakpointName.h, lldb/source/Target Target.cpp

[lldb] Replace ConstString with std::string in BreakpointName (#205910)
DeltaFile
+9-10lldb/include/lldb/Breakpoint/BreakpointName.h
+3-2lldb/source/Target/Target.cpp
+12-122 files

LLVM/project 0c4cc9fclang/include/clang/Basic CodeGenOptions.def, clang/include/clang/Options Options.td

Revert "[Clang] Optionally use NewPM to run CodeGen Pipeline" (#205943)

Reverts llvm/llvm-project#205928

The change is missing dependencies in a shared-library build. Will
investigate offline.
DeltaFile
+17-77clang/lib/CodeGen/BackendUtil.cpp
+0-9clang/test/CodeGen/X86/newpm.c
+0-8clang/include/clang/Options/Options.td
+0-1clang/include/clang/Basic/CodeGenOptions.def
+17-954 files

LLVM/project eea0690llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 copyable-reduced-erased.ll

[SLP]Fix crash erasing reduced value extract still used by reduction

A reduced value vectorized in an operand subtree is replaced by an
extractelement that can be excluded from another reduction group's
candidates as incompatible, yet it is still consumed by the final
reduction. Keep such excluded extracts externally used so they are not
erased while vectorizing that group.

Fixes #205886

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/205942
DeltaFile
+64-0llvm/test/Transforms/SLPVectorizer/X86/copyable-reduced-erased.ll
+17-5llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+81-52 files

LLVM/project 294f9beutils/bazel/llvm-project-overlay/clang/unittests BUILD.bazel

[Bazel] Fixes a18c09a (#205917)

This fixes a18c09a05d6d6a950b1b7cd9c59a43fcee5cb442.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+2-0utils/bazel/llvm-project-overlay/clang/unittests/BUILD.bazel
+2-01 files

NetBSD/src SZylNScusr.bin/mail thread.c fio.c

   PR bin/59635 - src/usr.bin/mail: fix post realloc() cleanup

   This is a rather hackish solution, much better would be to abandon the
   pointers altogether, and simply use message offsets (ints) into the array
   to provide the relationships between messages.

   Or abandon the message array (and the need for realloc() along with it)
   and replace it with a list.

   Both methods would achieve the aim of getting rid of the need to go and
   massage the data to keep things correct when a realloc moves things around.

   Either would require more changes in more places than this crude change,
   and to get this done before -11 gets released, the fewer changes the better.

   Another possibility would be to just revert to the adjustment method used
   in -10 (which looks like it should work to me - but I don't know why it
   was changed).


    [4 lines not shown]
VersionDeltaFile
1.17+48-15usr.bin/mail/thread.c
1.46+15-11usr.bin/mail/fio.c
1.29+16-6usr.bin/mail/def.h
1.4+3-2usr.bin/mail/thread.h
+82-344 files

LLVM/project b0735fdclang/unittests/ScalableStaticAnalysis/Analyses/PointerFlow PointerFlowTest.cpp

[SSAF][PointerFlow] Upstream Reference-to-pointer binding tests

The majority of the content of rdar://179151476 duplicates the
PointerFlow analysis after
https://github.com/llvm/llvm-project/pull/203633.  Therefore, we only
need to upstream the tests, both for better test coverage and to prove
the duplication.

rdar://179151476
DeltaFile
+134-0clang/unittests/ScalableStaticAnalysis/Analyses/PointerFlow/PointerFlowTest.cpp
+134-01 files

LLVM/project 62b9b98llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU AMDGPUInstructionSelector.cpp SIISelLowering.cpp

[AMDGPU] Guard more intrinsics with target features
DeltaFile
+1-51llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+0-42llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+0-24llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+15-2llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+4-4llvm/test/CodeGen/AMDGPU/unsupported-av-store.ll
+4-4llvm/test/CodeGen/AMDGPU/unsupported-av-load.ll
+24-12712 files not shown
+47-14318 files

LLVM/project 44ae8a8clang/lib/CodeGen CodeGenAction.cpp, llvm/lib/CodeGen/SelectionDAG SelectionDAGBuilder.cpp

[RFC][CodeGen] Add generic target feature checks for intrinsics

This PR adds target-independent infrastructure for annotating LLVM intrinsics
with required subtarget feature expressions.

It introduces a TargetFeatures string field to intrinsic TableGen records.
TableGen emits an intrinsic-to-feature mapping table.

Both SelectionDAG and GlobalISel now perform this check before lowering target
intrinsics. This allows targets to opt in by annotating intrinsic definitions
directly, rather than adding custom checks during lowering, legalization, or
instruction selection.

This PR uses one AMDGPU intrinsic as an example.
DeltaFile
+96-3llvm/lib/MC/MCSubtargetInfo.cpp
+38-0clang/lib/CodeGen/CodeGenAction.cpp
+33-1llvm/utils/TableGen/Basic/IntrinsicEmitter.cpp
+31-0llvm/lib/IR/DiagnosticInfo.cpp
+28-0llvm/test/TableGen/intrinsic-target-features.td
+25-0llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+251-414 files not shown
+361-920 files

LLVM/project 276c06fflang/lib/Semantics check-omp-structure.cpp check-omp-structure.h

[flang][OpenMP] Delete no longer needed CheckAllowedClause

This removes the older overload of CheckAllowedClause(clauseId).
After 0f1abfe0af that function was no longer doing anything.
DeltaFile
+17-158flang/lib/Semantics/check-omp-structure.cpp
+58-3flang/lib/Semantics/check-omp-structure.h
+0-8flang/lib/Semantics/check-omp-loop.cpp
+0-1flang/lib/Semantics/check-omp-variant.cpp
+75-1704 files

LLVM/project dc2b5bfflang/lib/Lower OpenACC.cpp, flang/lib/Semantics resolve-names.cpp

[flang][cuda][acc] Fix use_device device attribute for USE-renamed variables (#205902)

Example:
```fortran
module m
  complex(8), allocatable, pinned :: v(:,:)
  interface callee
    subroutine callee_x(x, n)
      complex(8), device :: x(:,:)
      integer :: n
    end subroutine
  end interface
end module

subroutine driver(n)
  use m, only : callee, v_renamed => v
  integer :: n
  !$acc data copy(v_renamed)
  !$acc host_data use_device(v_renamed)

    [11 lines not shown]
DeltaFile
+22-4flang/lib/Semantics/resolve-names.cpp
+16-0flang/test/Lower/OpenACC/acc-host-data-cuda-device.f90
+12-3flang/lib/Lower/OpenACC.cpp
+50-73 files

LLVM/project 6b42621lldb/source/Expression DWARFExpression.cpp, lldb/unittests/Expression DWARFExpressionTest.cpp

[lldb] Reject DW_OP_deref_size with size 0 (#205911)

`Evaluate_DW_OP_deref` validated that the dereference size was `<= 8`
but not that it was non-zero.  The DWARF expression evaluator parses
untrusted operands, so a `DW_OP_deref_size` with size operand `0` is
reachable (it is hit by the lldb-dwarf-expression-fuzzer).

A zero dereference size flows into `DerefSizeExtractDataHelper`, which
constructs a `DataExtractor` with `addr_size == 0` and aborts on its
assertion. The unit test that feeds `DW_OP_lit0, DW_OP_deref_size, 0x00`
shows the crash:

```
[ RUN      ] DWARFExpressionMockProcessTest.DW_OP_deref_size_zero
Assertion failed: (addr_size >= 1 && addr_size <= 8), function
DataExtractor, file DataExtractor.cpp, line 134.
 #8  DataExtractor::DataExtractor(...)
 #11 DWARFExpression::Evaluate(...)

    [6 lines not shown]
DeltaFile
+16-0lldb/unittests/Expression/DWARFExpressionTest.cpp
+1-1lldb/source/Expression/DWARFExpression.cpp
+17-12 files
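
The range check that the lldb commit above describes can be illustrated with a small stand-alone reader. This is a sketch in C, not lldb code: `read_sized_le` is a hypothetical helper, and the little-endian byte order is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read `size` bytes from `buf` as a little-endian unsigned integer.
 * Sizes outside 1..8 are rejected up front instead of being handed to
 * code that would assert on them.
 */
bool read_sized_le(const uint8_t *buf, unsigned size, uint64_t *out)
{
    assert(buf != NULL && out != NULL);

    if (size < 1 || size > 8)
        return false;          /* size 0 is the case the commit adds */

    uint64_t value = 0;
    for (unsigned i = 0; i < size; i++)
        value |= (uint64_t)buf[i] << (8 * i);
    *out = value;
    return true;
}
```

Checking only `size <= 8` would let a zero size through; checking both bounds turns malformed input into an ordinary error return.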

LLVM/project 38f6ccbclang/include/clang/Basic CodeGenOptions.def, clang/include/clang/Options Options.td

[Clang] Optionally use NewPM to run CodeGen Pipeline (#205928)

This patch adds a new -cc1 option to clang that runs the codegen
pipeline using the NewPM. This enables easy testing of this flow through
clang.

This actually lands #191579, which I accidentally landed into a user
branch.
DeltaFile
+77-17clang/lib/CodeGen/BackendUtil.cpp
+9-0clang/test/CodeGen/X86/newpm.c
+8-0clang/include/clang/Options/Options.td
+1-0clang/include/clang/Basic/CodeGenOptions.def
+95-174 files

LLVM/project 16ef565llvm/lib/Target/AMDGPU AMDGPUISelLowering.cpp SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU s-barrier-signal-var-gep-misaligned.ll s-barrier-signal-var-gep.ll

[AMDGPU] Fold constant offsets into named barrier addresses

Allow isOffsetFoldingLegal to fold a constant offset into an LDS
named-barrier global, and include the node offset when materializing the
LDS address in LowerGlobalAddress. s_barrier_signal_var on a GEP'd named
barrier now selects the immediate form, matching a bare global and GlobalISel.
With object linking the offset folds into the relocation addend.

The barrier ID is derived from the address via (addr >> 4) & 0x3F, so a
byte offset that does not land on a 16-byte barrier boundary is still
valid: it simply selects the containing barrier. No alignment assertion
is needed, and such offsets must not crash the compiler (see the
misaligned test).

Change-Id: I639bc723eb001573585cc05d0ad19f2773054f21
Assisted-by: Cursor
DeltaFile
+0-65llvm/test/CodeGen/AMDGPU/s-barrier-signal-var-gep-misaligned.ll
+8-6llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+4-8llvm/test/CodeGen/AMDGPU/s-barrier-signal-var-gep.ll
+8-0llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+20-794 files
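
The barrier ID formula quoted in the commit above, `(addr >> 4) & 0x3F`, can be checked with a few worked values. The function name `barrier_id` and the sample addresses are made up for illustration; only the formula comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Barrier ID as given in the commit message: (addr >> 4) & 0x3F. */
uint32_t barrier_id(uint32_t addr)
{
    return (addr >> 4) & 0x3F;
}

/* Worked values for aligned and misaligned byte offsets. */
void barrier_id_examples(void)
{
    assert(barrier_id(0x10) == 1);  /* aligned: start of barrier 1 */
    assert(barrier_id(0x18) == 1);  /* misaligned: still inside barrier 1 */
    assert(barrier_id(0x20) == 2);  /* next 16-byte slot */
}
```

Because the low four address bits are shifted out, every byte offset within a 16-byte slot maps to the same ID, which is why a misaligned offset selects the containing barrier.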

LLVM/project e983d49llvm/test/CodeGen/AMDGPU s-barrier-signal-var-gep-misaligned.ll

[AMDGPU] Pre-commit test for constant-offset named barrier signal_var

A GEP into a named-barrier array (&bars[1]) lowers s_barrier_signal_var to
the dynamic m0 form on SelectionDAG, unlike the bare global and GlobalISel.
With object linking it emits a runtime add of the offset instead of folding
it into the relocation addend.

Also add a misaligned-offset test: a byte GEP that does not land on a
barrier boundary is valid IR and must not crash the compiler.

Change-Id: I7cea0dd64d050eb3e2143841e7136355cbb3bc50
Assisted-by: Cursor
DeltaFile
+65-0llvm/test/CodeGen/AMDGPU/s-barrier-signal-var-gep-misaligned.ll
+65-01 files

LLVM/project 537b7f8llvm/include/llvm/CodeGen CommandFlags.h, llvm/lib/CodeGen CommandFlags.cpp

opt: Use Triple overloads of TargetRegistry lookup (#205889)

Continue migrating target creation functions to use
Triple instead of StringRef.
DeltaFile
+9-2llvm/lib/CodeGen/CommandFlags.cpp
+7-0llvm/include/llvm/CodeGen/CommandFlags.h
+4-2llvm/tools/llvm-isel-fuzzer/llvm-isel-fuzzer.cpp
+3-2llvm/tools/llvm-opt-fuzzer/llvm-opt-fuzzer.cpp
+1-1llvm/tools/llvm-reduce/ReducerWorkItem.cpp
+1-1llvm/tools/opt/optdriver.cpp
+25-81 files not shown
+26-97 files

LLVM/project 533321eclang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/Sema SemaOpenMP.cpp

Add fixes
DeltaFile
+11-1clang/test/OpenMP/teams_num_teams_messages.cpp
+9-0clang/lib/Sema/SemaOpenMP.cpp
+2-0clang/include/clang/Basic/DiagnosticSemaKinds.td
+2-0clang/tools/libclang/CIndex.cpp
+24-14 files

OpenZFS/src d3af7c2module/os/freebsd/zfs zfs_vnops_os.c zfs_dir.c, module/os/linux/zfs zfs_dir.c zfs_vnops_os.c

Assert SA bulk array bounds before sa_bulk_update

The SA bulk arrays are filled by a series of conditional SA_ADD_BULK_ATTR
calls whose worst-case count is easy to miscount, so a future attribute
add could silently overrun the fixed-size array.
Add ASSERT3S(count, <=, ARRAY_SIZE(bulk)) before each sa_bulk_update so
an overrun trips in debug builds, and use ARRAY_SIZE so the bound stays
tied to the declaration. Linux zfs_setattr allocates its arrays on the
heap sized by 'bulks', so it asserts against that instead.
Also tighten FreeBSD zfs_setattr's xattr_bulk from [7] to [6], its
actual worst case.

Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #18573
DeltaFile
+4-1module/os/freebsd/zfs/zfs_vnops_os.c
+4-0module/os/linux/zfs/zfs_dir.c
+4-0module/os/linux/zfs/zfs_vnops_os.c
+4-0module/os/freebsd/zfs/zfs_dir.c
+3-0module/zfs/zfs_vnops.c
+1-0module/os/freebsd/zfs/zfs_acl.c
+20-14 files not shown
+24-110 files
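
The pattern in the OpenZFS commit above (conditionally fill a fixed-size array, then assert the count against the declared size) can be sketched in plain C. Everything named here is hypothetical: `struct bulk_attr`, `consume_bulk`, and `fill_bulk` stand in for the SA types and calls, and the standard `assert` stands in for `ASSERT3S`.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for an SA bulk entry; not the OpenZFS type. */
struct bulk_attr {
    int attr_id;
};

/* Hypothetical consumer: combines the attribute IDs it was given. */
static int consume_bulk(const struct bulk_attr *bulk, size_t count)
{
    int mask = 0;
    for (size_t i = 0; i < count; i++)
        mask |= bulk[i].attr_id;
    return mask;
}

/*
 * Fill the array with a conditional set of attributes, check the count
 * against the array's declared size, then hand it to the consumer.
 */
int fill_bulk(int want_mtime, int want_ctime, int want_size)
{
    struct bulk_attr bulk[3];
    size_t count = 0;

    if (want_mtime)
        bulk[count++].attr_id = 1;
    if (want_ctime)
        bulk[count++].attr_id = 2;
    if (want_size)
        bulk[count++].attr_id = 4;

    /* Bound is tied to the declaration, so resizing bulk[] updates it. */
    assert(count <= ARRAY_SIZE(bulk));
    return consume_bulk(bulk, count);
}
```

If a fourth conditional add were introduced without growing `bulk`, the assertion would trip in a debug build rather than the overrun going unnoticed.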

OpenZFS/src 1d98dfemodule/os/linux/zfs zpl_file.c, tests/runfiles linux.run

Update mtime/ctime when fallocate grows a file

Growing a file with fallocate updated its size but left mtime/ctime
unchanged and didn't log the change. A fallocate that changes the file
size should update mtime/ctime, and the change should be logged so it
survives a crash.
Pass log=TRUE to zfs_freesp() on the extend path so it updates the
timestamps and logs the size change, matching zfs_space(). Punch-hole
and zero-range already use this path and are unaffected.

Reviewed-by: Rob Norris <rob.norris at truenas.com>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
Closes #18573
DeltaFile
+77-0tests/zfs-tests/tests/functional/fallocate/fallocate_extend_timestamps.ksh
+6-1module/os/linux/zfs/zpl_file.c
+1-1tests/runfiles/linux.run
+1-0tests/zfs-tests/tests/Makefile.am
+85-24 files

LLVM/project c7edbc0llvm/test/CodeGen/AMDGPU s-barrier-signal-var-gep.ll

[AMDGPU] Pre-commit test for constant-offset named barrier signal_var

A GEP into a named-barrier array (&bars[1]) lowers s_barrier_signal_var to
the dynamic m0 form on SelectionDAG, unlike the bare global and GlobalISel.
With object linking it emits a runtime add of the offset instead of folding
it into the relocation addend.

Also add a misaligned-offset test: a byte GEP that does not land on a
barrier boundary is valid IR and must not crash the compiler.

Change-Id: I7cea0dd64d050eb3e2143841e7136355cbb3bc50
Assisted-by: Cursor
DeltaFile
+258-0llvm/test/CodeGen/AMDGPU/s-barrier-signal-var-gep.ll
+258-01 files

LLVM/project 2c71c11lldb/include/lldb/Breakpoint BreakpointResolverName.h, lldb/source/Breakpoint BreakpointResolverName.cpp

[lldb] Replace ConstString in BreakpointResolverName::AddNameLookup signature (#205924)
DeltaFile
+10-10lldb/source/Breakpoint/BreakpointResolverName.cpp
+1-1lldb/include/lldb/Breakpoint/BreakpointResolverName.h
+11-112 files

OpenZFS/src d0b43eeinclude/sys zfs_znode.h, module/os/freebsd/zfs zfs_vnops_os.c zfs_znode_os.c

Persist z_seq across znode eviction

Commit 312bdab0f5 advertises STATX_ATTR_CHANGE_MONOTONIC and builds
the NFSv4 change_cookie from (ctime.tv_sec << 32) | zp->z_seq.
zp->z_seq is reset to a magic constant in zfs_znode_alloc(), so any
event that drops the znode from cache (memory pressure, remount,
reboot) regresses the lower bits of the cookie, a backward step
within the same second.

NFSv4 clients that trust this contract treat a regressed cookie as
evidence that the file's metadata cannot be relied on. VMware ESXi
over NFSv4.1 surfaces this as "The file specified is not a virtual
disk", and a VM stored on the affected NFS-exported ZFS dataset
fails to power on.

Widen z_seq to 64 bit and present it directly as the change_cookie,
dropping the ctime packing, so the cookie is a single monotonic
counter that no longer depends on the clock. FreeBSD's va_filerev
consumer also takes the wider value.

    [29 lines not shown]
DeltaFile
+36-34module/os/linux/zfs/zfs_vnops_os.c
+24-15module/os/freebsd/zfs/zfs_vnops_os.c
+25-11module/zfs/zfs_vnops.c
+32-4module/os/linux/zfs/zfs_znode_os.c
+28-3module/os/freebsd/zfs/zfs_znode_os.c
+26-1include/sys/zfs_znode.h
+171-6810 files not shown
+247-8616 files
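
The cookie regression described in the z_seq commit above can be reproduced with the packing formula from the commit message. This is a sketch under stated assumptions: the reset value `SEQ_RESET` is hypothetical (the commit message does not give the real constant), the sequence counter is assumed to be 32 bits wide before the change, and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reset value; the real constant is not given in the commit. */
#define SEQ_RESET 0x1000u

/* Old scheme from the commit message: (ctime.tv_sec << 32) | z_seq. */
uint64_t packed_cookie(uint64_t ctime_sec, uint32_t seq)
{
    return (ctime_sec << 32) | seq;
}

void cookie_regression_example(void)
{
    uint64_t sec = 1700000000u;      /* same ctime second throughout */
    uint32_t seq = SEQ_RESET + 3;    /* three bumps since allocation */
    uint64_t before = packed_cookie(sec, seq);

    seq = SEQ_RESET;                 /* eviction and re-allocation reset seq */
    uint64_t after = packed_cookie(sec, seq);

    assert(after < before);          /* the cookie stepped backwards */
}
```

A counter that persists across eviction cannot step backwards this way, which is the property the commit restores.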

LLVM/project 0ce7d23clang-tools-extra/test/clang-doc enum.cpp templates.cpp, clang-tools-extra/test/clang-doc/html enum.cpp

[clang-doc] Try to make testing more uniform (#205586)

Today clang-doc has tests for its various backends that use the same
input files, and mix the checks for each format. This leads to very
large test files that are quite hard to update or maintain. Thus far
we've assumed that this is better than updating several files, but as we
leverage mustache and JSON more and more to test feature completeness,
much of the output complexity is now limited to each backend and its
mustache templates. To make this simpler to maintain, we can lean into
common test Inputs, keeping the annotated source separate from the test
checks, and split the checks out into their own directory hierarchy.
This patch is mostly mechanical rewriting of code. This was done with
the assistance of an LLM, but was checked by me, and verified with
instrumentation based coverage that we did not lose any line coverage.
DeltaFile
+0-839clang-tools-extra/test/clang-doc/enum.cpp
+0-491clang-tools-extra/test/clang-doc/templates.cpp
+0-380clang-tools-extra/test/clang-doc/namespace.cpp
+377-0clang-tools-extra/test/clang-doc/json/enum.cpp
+354-0clang-tools-extra/test/clang-doc/html/enum.cpp
+196-0clang-tools-extra/test/clang-doc/json/templates.cpp
+927-1,710106 files not shown
+2,716-2,827112 files

LLVM/project bb02279llvm/test/CodeGen/AMDGPU div_v2i128.ll bf16.ll, llvm/test/CodeGen/AMDGPU/GlobalISel udiv.i64.ll urem.i64.ll

Merge branch 'main' into users/aokblast/moneypunct_fbsd_test
DeltaFile
+2,592-2,587llvm/test/CodeGen/AMDGPU/div_v2i128.ll
+1,940-1,931llvm/test/CodeGen/AMDGPU/bf16.ll
+1,410-1,359llvm/test/CodeGen/AMDGPU/GlobalISel/udiv.i64.ll
+1,351-1,351llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll
+1,701-810llvm/test/CodeGen/AMDGPU/llvm.set.rounding.ll
+1,555-816mlir/lib/Dialect/XeGPU/Transforms/XeGPULayoutImpl.cpp
+10,549-8,8542,914 files not shown
+105,678-92,8572,920 files

LLVM/project e749849flang/test/Semantics cuf09.cuf cuf27.cuf, flang/test/Semantics/CUDA cuf09.cuf cuf27.cuf

[flang][cuda][NFC] Move CUDA Fortran semantic tests to a dedicated directory (#205927)
DeltaFile
+283-0flang/test/Semantics/CUDA/cuf09.cuf
+0-283flang/test/Semantics/cuf09.cuf
+0-116flang/test/Semantics/cuf27.cuf
+116-0flang/test/Semantics/CUDA/cuf27.cuf
+0-110flang/test/Semantics/cuf03.cuf
+110-0flang/test/Semantics/CUDA/cuf03.cuf
+509-50972 files not shown
+2,028-2,02878 files

LLVM/project de4a78d

Merge branch 'users/chinmaydd/gep-fix-1' into users/chinmaydd/named-barrier-gep-offset
DeltaFile
+0-00 files

LLVM/project 720a24allvm/lib/Target/AMDGPU AMDGPUISelLowering.cpp SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU s-barrier-signal-var-gep.ll s-barrier-signal-var-gep-misaligned.ll

[AMDGPU] Fold constant offsets into named barrier addresses

Allow isOffsetFoldingLegal to fold a constant offset into an LDS
named-barrier global, and include the node offset when materializing the
LDS address in LowerGlobalAddress. s_barrier_signal_var on a GEP'd named
barrier now selects the immediate form, matching a bare global and GlobalISel.
With object linking the offset folds into the relocation addend.

The barrier ID is derived from the address via (addr >> 4) & 0x3F, so a
byte offset that does not land on a 16-byte barrier boundary is still
valid: it simply selects the containing barrier. No alignment assertion
is needed, and such offsets must not crash the compiler (see the
misaligned test).

Change-Id: I639bc723eb001573585cc05d0ad19f2773054f21
Assisted-by: Cursor
DeltaFile
+64-10llvm/test/CodeGen/AMDGPU/s-barrier-signal-var-gep.ll
+0-65llvm/test/CodeGen/AMDGPU/s-barrier-signal-var-gep-misaligned.ll
+8-6llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+8-0llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+80-814 files