[flang] Disable copy-out to INTENT(IN) args (#192382)
Don't copy out to actual args that themselves happen to be INTENT(IN)
dummy args.
```
subroutine test(a)
  real, intent(in) :: a(:)
  call require_contiguous_arg(a(1:size(a):2)) ! copy-in only, no copy-out
end subroutine
```
---------
Co-authored-by: Claude Sonnet 4.6 <noreply at anthropic.com>
[X86] Improve FREEZE node elimination for SETCC operations (#192362)
This improves FREEZE node handling around SETCC and SETCC_CARRY
operations to enable better optimization, particularly for APX
CCMP/CTEST pattern matching with fast-math comparisons.
Resolves https://github.com/llvm/llvm-project/issues/191716.
[lldb/test] Fix shared library symlinks for remote testing (#189177)
When running tests on a remote device, framework convenience symlinks
created by test Makefiles (e.g. `$(BUILDDIR)/Framework` pointing to
`$(BUILDDIR)/Framework.framework/Framework`) cause launch failures.
`Platform::Install` recreates these as symlinks on the remote device
pointing to host build paths that don't exist, resulting in "No such
file or directory" from dyld.
This patch changes `LN_SF` in Makefile.rules to strip the common
directory prefix from the symlink source using `patsubst` so it produces
relative symlinks instead of absolute ones.
It also resolves symlinks with `os.path.realpath()` in
`registerSharedLibrariesWithTarget` before registering modules so that
`Platform::Install` sees a regular file and transfers the actual binary
content.
[2 lines not shown]
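As a rough illustration of the two ideas in the lldb symlink change above, the sketch below creates a relative symlink and then resolves it to the real file. It is not the patch itself: the patch uses `patsubst` in Makefile.rules, whereas this sketch uses `os.path.relpath` as a stand-in, and the helper names `make_relative_symlink`, `resolve_for_install` and `demo` are hypothetical.

```python
import os
import tempfile


def make_relative_symlink(target, link):
    """Create `link` pointing at `target` via a path relative to link's directory."""
    rel = os.path.relpath(target, start=os.path.dirname(link))
    os.symlink(rel, link)
    return rel


def resolve_for_install(path):
    """Resolve symlinks so the caller sees the regular file (hypothetical helper)."""
    return os.path.realpath(path)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        build_dir = os.path.realpath(tmp)
        framework_dir = os.path.join(build_dir, "Framework.framework")
        os.mkdir(framework_dir)
        binary = os.path.join(framework_dir, "Framework")
        with open(binary, "w") as f:
            f.write("binary contents")

        # Convenience symlink: <build>/Framework -> Framework.framework/Framework
        link = os.path.join(build_dir, "Framework")
        rel = make_relative_symlink(binary, link)
        return rel, os.path.islink(link), resolve_for_install(link) == binary
```

Because the link target carries no host build prefix, it stays valid if the directory tree is copied elsewhere, and resolving it first yields the regular file rather than the link.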
[SPIR-V] Encode Atomic metadata as UserSemantic string decoration (#193019)
AMDGPU uses metadata to guide atomic-related optimisations. SPIR-V was
not handling it, which led to significant and spurious performance
differences. This patch fixes this oversight by encoding the metadata as
UserSemantic string decorations applied to the atomic instructions.
x86: rename and clean up __copy_from_user_inatomic_nocache()
From Linus Torvalds
03fd014cd9f3a3d173740ab9c5cbede82fd6322c in linux-6.18.y/6.18.24
5de7bcaadf160c1716b20a263cf8f5b06f658959 in mainline linux
drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size to GPU page size
From Donet Tom
647fb0dc3818733024fc96c1df1ec3af806b0256 in linux-6.18.y/6.18.24
78746a474e92fc7aaed12219bec7c78ae1bd6156 in mainline linux
Fix difftime() result when it is passed a negative value
We need to cast the result of bitwise AND to time_t before the cast
to double in the HI and LO macros. Otherwise, we get a very large
positive floating point value instead of a negative value.
Reported by Xuntao Chi
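The effect described in the difftime() fix above can be modelled with plain integers. This is only a sketch: the actual HI and LO macro definitions are not shown in the log, so the mask value, the 64-bit width, and the assumption that the bitwise AND yields an unsigned value are all illustrative guesses, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1
HI_MASK = 0xFFFFFFFF00000000  # assumed mask for the high half of a 64-bit value


def as_unsigned64(value):
    """Reinterpret an integer as an unsigned 64-bit bit pattern."""
    return value & MASK64


def as_signed64(bits):
    """Reinterpret a 64-bit bit pattern as a signed (two's complement) value."""
    return bits - (1 << 64) if bits & (1 << 63) else bits


def hi_without_cast(t):
    # Models (double)(t & mask) when the AND result is unsigned.
    return float(as_unsigned64(t) & HI_MASK)


def hi_with_cast(t):
    # Models (double)(time_t)(t & mask): back to signed before the float conversion.
    return float(as_signed64(as_unsigned64(t) & HI_MASK))
```

Under these assumptions, `hi_without_cast(-1)` is a huge positive number, while `hi_with_cast(-1)` is negative, which matches the behaviour the commit describes.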
drm/amdgpu: Handle GPU page faults correctly on non-4K page systems
From Donet Tom
6a9f2683c66dc54d3598589684c0b3c5cb2862ad in linux-6.18.y/6.18.24
4e9597f22a3cb8600c72fc266eaac57981d834c8 in mainline linux
[ExpandMemCmp] Pre-collect memcmp calls to improve compile time (#193415)
Avoid restarting the basic block iteration from the beginning of the
function every time a memcmp/bcmp is expanded. Instead, pre-collect all
memcmp/bcmp calls and process them in a single pass.
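A toy model of the two iteration strategies, for illustration only. The real pass works on LLVM IR; here a function is just a list of blocks, each a list of `Inst` objects, and `expansion()` is a made-up replacement sequence. All names are hypothetical.

```python
MEMCMP_LIKE = {"memcmp", "bcmp"}


class Inst:
    def __init__(self, op):
        self.op = op


def expansion():
    """Made-up replacement for one memcmp/bcmp call."""
    return [Inst("load"), Inst("load"), Inst("icmp")]


def expand_with_restart(blocks):
    """Rescan from the first block after every expansion; returns instructions visited."""
    visited = 0
    restart = True
    while restart:
        restart = False
        for block in blocks:
            for pos, inst in enumerate(block):
                visited += 1
                if inst.op in MEMCMP_LIKE:
                    block[pos:pos + 1] = expansion()
                    restart = True
                    break
            if restart:
                break
    return visited


def expand_precollected(blocks):
    """Collect all calls in one scan, then expand each; returns instructions visited."""
    visited = 0
    calls = []
    for block in blocks:
        for inst in block:
            visited += 1
            if inst.op in MEMCMP_LIKE:
                calls.append((block, inst))
    for block, inst in calls:
        pos = block.index(inst)  # Inst defines no __eq__, so this matches by identity
        block[pos:pos + 1] = expansion()
    return visited


def make_function():
    return [[Inst("add"), Inst("memcmp")], [Inst("bcmp"), Inst("ret")]]
```

Both strategies produce the same instructions, but the restart variant revisits earlier instructions after each expansion, so its scan count grows with the number of calls expanded.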
[libc][CndVar] reimplement conditional variable with FIFO ordering (#192748)
This PR reimplements condition variables in two variants:
- futex-based shared condvar with atomic counter for waiters
- queue-based private condvar
Note that a thread-local queue node cannot be reliably accessed across
processes, so a single unified implementation is not possible for the
process-shared case.
POSIX.1-2024 (Issue 8) added atomicity requirements for condition
variables:
- The `pthread_cond_broadcast()` function shall, **as a single atomic
operation**, determine which threads, if any, are blocked on the
specified condition variable cond and unblock all of these threads.
- The `pthread_cond_signal()` function shall, as a **single atomic
operation**, determine which threads, if any, are blocked on the
[41 lines not shown]
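To give a feel for the queue-based private variant mentioned in the condvar change above, here is a minimal sketch of a FIFO condition variable built from per-waiter nodes. It is not the libc implementation: `FifoCondition` and `wake_order_demo` are hypothetical names, and `threading.Event` stands in for whatever per-thread node and wake mechanism the PR actually uses. Callers are assumed to hold the lock around `wait`, `signal` and `broadcast`.

```python
import threading
from collections import deque


class FifoCondition:
    """Toy queue-based condition variable; callers must hold `lock`."""

    def __init__(self, lock):
        self._lock = lock
        self._waiters = deque()  # oldest waiter on the left

    def wait(self):
        node = threading.Event()    # per-waiter node
        self._waiters.append(node)  # enqueued while the lock is still held
        self._lock.release()
        try:
            node.wait()
        finally:
            self._lock.acquire()

    def signal(self):
        if self._waiters:
            self._waiters.popleft().set()  # wake the longest-waiting thread

    def broadcast(self):
        while self._waiters:
            self._waiters.popleft().set()

    def waiter_count(self):
        return len(self._waiters)


def wake_order_demo(names=("a", "b", "c")):
    lock = threading.Lock()
    cond = FifoCondition(lock)
    woken = []

    def worker(name):
        with lock:
            cond.wait()
            woken.append(name)

    threads = []
    for name in names:
        thread = threading.Thread(target=worker, args=(name,))
        thread.start()
        threads.append(thread)
        # Poll until this thread is queued, so the enqueue order is known.
        while True:
            with lock:
                if cond.waiter_count() == len(threads):
                    break
            threading.Event().wait(0.001)

    for thread in threads:
        with lock:
            cond.signal()
        thread.join()
    return woken
```

Since the queue is only touched while the lock is held, each `signal` selects and removes exactly one waiter in a single step, and waiters are woken in the order they arrived.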
[DirectX] Implement lowering of Texture Load and Texture .operator[] (#193343)
Fixes https://github.com/llvm/llvm-project/issues/192546 and
https://github.com/llvm/llvm-project/issues/192558
This PR defines the TextureLoad DXIL Op (opcode 66), and implements
lowering of the texture load (dx_resource_load_level) intrinsic to the
DXIL op.
This PR also implements the transformation of loads from texture
resources (via dx_resource_getpointer) into dx_resource_load_level
intrinsics.
Assisted-by: Claude Opus 4.7