[flang][CUDA] Keep host literals from using unified-memory generic distance (#201257)
Fix CUDA generic resolution under `-gpu=mem:unified` so unattributed
literals and expression temporaries are not treated as unified-memory
actuals.
Previously, a host scalar literal such as `1.0` could score as
compatible with a `DEVICE` dummy and incorrectly select the
device-scalar overload. This could pass a host stack address to a device
helper and fail at runtime. The fix applies the unified/managed memory
distance columns only to symbol-backed actuals.
[flang][cuda] Fix host loads from CUDA constant globals (#203064)
This fixes CUDA Fortran lowering for scalar module variables with the
constant attribute that are read from host code, such as launch
configuration expressions or CUF kernel loop bounds.
Previously, host-side declarations for these globals could be rewritten
to device constant-memory addresses, causing host loads to dereference
the result of _FortranACUFGetDeviceAddress. The fix preserves host reads
from the host-visible global while still using the device address for
host-to-device assignment updates.
A FIR regression test covers host reads and assignment updates for
scalar CUDA constant globals.
(textproc/R-vroom) Updated 1.6.5 to 1.7.1
# vroom 1.7.1
-------------
* Internal changes requested by CRAN for forward compatibility with
clang 22.
* Internal changes requested by CRAN to remove non-API calls to
`Rf_findVarInFrame` and `R_NamespaceRegistry`.
# vroom 1.7.0
-------------
* [vroom.tidyverse.org](https://vroom.tidyverse.org/) is the new home
of vroom's website, catching up to the much earlier move (April
2022) of vroom's GitHub repository from the r-lib organization to
the tidyverse. The motivation for that was to make it easier to
transfer issues between these two closely connected packages.
* The `path` parameter has been removed from `vroom_write()`. This
[77 lines not shown]
[BOLT] Change DataAggregator error types (#203651)
1. In `filterBinaryMMapInfo`, replace `inconvertibleErrorCode` with an `errc`
code, since `parseMainEvents` converts the returned Error to std::error_code.
2. In `parsePerfData`, pass through Error returned by `prepareToParse`
for memory events.
Test Plan: updated perf_test.test
[libc] fix EAGAIN being treated as timeout in mutex and rwlock (#203574)
Fixes #203411.
This PR fixes `EAGAIN` being treated as a timeout in mutex and rwlock.
Two changes are applied:
1. Timeout sites now always check for timeout explicitly, which makes the
logic more robust.
2. The futex wait now discards `EAGAIN`/`EWOULDBLOCK` errors and
returns 0.
We don't distinguish waking up from a signal from waking up from a mismatch,
for the following 3 reasons:
- We have a userspace guard that avoids the futex syscall if we already know
the value would match. It seems awkward to make that check return an error, as
we may wake up and loop back to the check, where the signal is consumed but
we still return an error.
- futex syscall can spuriously wake up anyway, there is no way to tell
[3 lines not shown]
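The two changes above can be sketched in plain C as follows. This is a simplified model, not the libc implementation: `normalize_wait_error`, `is_timeout`, and `acquire_with_script` are hypothetical names, and the futex syscall is replaced by a scripted list of raw error codes so the logic can run anywhere. How the real code treats error codes other than `EAGAIN`/`EWOULDBLOCK` and `ETIMEDOUT` is not modeled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: map a raw futex-wait error to what the caller sees.
 * EAGAIN/EWOULDBLOCK (value mismatch) is discarded and reported as 0.
 * Other codes pass through unchanged in this sketch. */
int normalize_wait_error(int raw_err) {
  if (raw_err == EAGAIN || raw_err == EWOULDBLOCK)
    return 0;
  return raw_err;
}

/* Hypothetical helper: timeout sites test for ETIMEDOUT explicitly
 * instead of treating any nonzero result as a timeout. */
bool is_timeout(int wait_result) { return wait_result == ETIMEDOUT; }

/* Simulated acquire loop. Each entry of raw_errors stands in for the
 * result of one futex wait. Returns ETIMEDOUT only on a genuine timeout,
 * and 0 once the script is exhausted (standing in for "lock acquired"). */
int acquire_with_script(const int *raw_errors, size_t n) {
  assert(raw_errors != NULL || n == 0);
  for (size_t i = 0; i < n; ++i) {
    int result = normalize_wait_error(raw_errors[i]);
    if (is_timeout(result))
      return ETIMEDOUT;
    /* Otherwise: loop back and re-check the lock word. */
  }
  return 0;
}
```

With this shape, a run of `EAGAIN` results simply causes more loop iterations, and only `ETIMEDOUT` ends the wait with a timeout.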
QuantileType bytecode patch (#203495)
Since PR https://github.com/llvm/llvm-project/pull/190321 was merged,
several issues have been identified, such as QuantileType not being added
to the ByteCode files. This PR fills in these missing pieces, which should
make QuantileType a complete and functional type.
[libc] implement mkstemp (#199220)
Fixes #191266.
Implements `mkstemp` as specified in POSIX.
Currently Linux-only, since it relies on the Linux syscall wrappers for
`getrandom` and `open`.
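For reference, a minimal sketch of how a caller typically uses the POSIX `mkstemp` interface. It illustrates the standard API only, not this implementation's internals. The helper name `write_to_temp_file` and the `/tmp/mkstemp-demo-` prefix are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create a unique temporary file, write msg to it,
 * then close and remove it. Returns the number of bytes written, or -1
 * if the file could not be created or written. */
long write_to_temp_file(const char *msg) {
  assert(msg != NULL);
  /* The template must be a writable array ending in "XXXXXX";
   * mkstemp replaces those six characters in place. */
  char path[] = "/tmp/mkstemp-demo-XXXXXX";
  int fd = mkstemp(path);
  if (fd == -1)
    return -1;
  ssize_t written = write(fd, msg, strlen(msg));
  close(fd);
  unlink(path); /* Clean up the file created for the demo. */
  return (long)written;
}
```

Passing a string literal directly to `mkstemp` would be undefined behavior, because the function modifies the template. That is why the sketch copies it into a local array.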
Revert "[AMDGPU] In `LowerDYNAMIC_STACKALLOC`, hoist the `readfirstlane` up one instruction" (#203645)
Reverts llvm/llvm-project#201528
Reverting because the change causes an "illegal VGPR to SGPR copy".
[SPIR-V] Lower freeze instructions with aggregate operands (#203584)
An aggregate freeze takes its result type from its operand, like a PHI
or select, but was handled by neither the up-front value-id mutation nor
replaceMemInstrUses, so the pass aborted with "illegal aggregate
intrinsic user". Mutate aggregate freezes to the i32 value-id type and
replace their operands alongside PHIs and selects.
MFV: openssl 3.5.7
This change is a security release which resolves several issues with OpenSSL 3.5,
the highest severity issue being ranked "High". Users are strongly encouraged to
update to this release.
More information about the release (from a high level) can be found in
the release notes [1].
1. https://github.com/openssl/openssl/blob/openssl-3.5.7/NEWS.md
All conflicts were resolved with `--theirs`, taking the release diff
over the local diff; the conflicts occurred due to preemptive security
fixes applied by so@ in e508c343.
MFC after: 3 days (the important security issues have been
preemptively addressed)
Merge commit '3a71a35ad9dad0e5d2cad8efecc8ba9d57c42d43'
[8 lines not shown]
[AMDGPU][true16] extract 16bit for scratch_load_ubyte_st when spilling (#203589)
In sramecc mode, scratch_load_ubyte_st is selected for 16-bit spilling.
This needs a temporary vgpr32, from which lo16 is extracted.
audio/libresidfp: New port: Software emulation of MOS6581/8580 SID chip
Fork of Dag Lem's reSID 0.16, a reverse-engineered software emulation
meant to replicate the SID as faithfully as possible while keeping good
performance for realtime use.
Merge branch 'main' of github.com:llvm/llvm-project into users/ziqingluo/PR-179173940
Conflicts:
clang/unittests/ScalableStaticAnalysisFramework/Analyses/PointerFlow/PointerFlowTest.cpp
[AArch64][PAuth] Fix return-address auth for swifttailcc with FPDiff > 0 (#203340)
When a swifttailcc tail call has FPDiff > 0 (the caller received more
stack argument space than the callee pops), the epilogue contains an SP
adjustment to discard the leftover argument space. The existing code
treated both FPDiff < 0 and FPDiff > 0 uniformly in a single 'FPDiff !=
0' block, using AUTI[AB]1716 with a reconstructed entry-SP in x16 for
both cases.
For FPDiff < 0 (callee pops more) that reconstruction is necessary and
correct. For FPDiff > 0 it is wrong: by the time we enter the block the
post-index LDP has already adjusted SP back to the frame base, but the
'add sp, sp, #N' argument pop has not yet run. Entry SP equals the
current SP at that point, so AUTI[AB]SP would work directly. Instead,
the combined block bumped SP via StackOffset::getFixed(-FPDiff), which
overshoots, and then emitted AUTIA1716 with a wrong discriminator. Worse
yet, the SP restore had already been emitted *before* the auth, leaving
the live argument stack below SP and outside the red-zone during the
authentication window.
[9 lines not shown]
[SSAF][PointerFlow] Recognize reference-to-pointer/array Decls
Decls of reference-to-pointer/array types are now treated the same as
those of pointer/array type.
rdar://179173940