[lldb][Windows] Try falling back to TLS 1.2 before erroring out (#206108)
TLS 1.3 is only supported on Windows Server 2022 and beyond. Windows
Server 2019 only supports up to TLS 1.2.
This causes test failures on CI runners which run on Windows Server
2019.
This patch allows falling back to TLS 1.2 if 1.3 is not available.
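A minimal sketch of the fallback idea, not the code from this patch: try
the preferred protocol version first and move on to the next one only if
the attempt fails. `TlsVersion`, `connectWithFallback` and the
`tryConnect` callback are hypothetical names used for illustration; no
Windows API is modeled here.

```cpp
#include <functional>
#include <optional>
#include <vector>

// Hypothetical protocol versions, for illustration only.
enum class TlsVersion { V1_3, V1_2 };

// Tries each version in the given order and returns the first one that
// `tryConnect` accepts, or std::nullopt if every attempt fails.
std::optional<TlsVersion>
connectWithFallback(const std::vector<TlsVersion> &preferred,
                    const std::function<bool(TlsVersion)> &tryConnect) {
  for (TlsVersion version : preferred)
    if (tryConnect(version))
      return version;
  return std::nullopt; // Only now does the caller report an error.
}
```

Calling it with `{TlsVersion::V1_3, TlsVersion::V1_2}` on a host that
only accepts 1.2 would yield `TlsVersion::V1_2` instead of an error.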
[CostModel][X86] Add more realistic v8i64/v16i32 + v8f64/v16f32 add reduction costs (#206124)
Fixes failure to fold to v16i32 reduction on avx512 targets
We still need to determine better CostKind values - but that can wait until #194621 is complete
[flang][OpenMP] Properly resolve CRITICAL construct names (#205904)
Resolve the names of CRITICAL constructs even if they are reserved
names.
This also limits locator parsing to known reserved names.
Fixes https://github.com/llvm/llvm-project/issues/205855
[libc++] Move compiler-specific configuration into <__configuration/compiler.h> (#205590)
These macros are essentially there to query compiler features, so they
should be moved into `<__configuration/compiler.h>`.
[clang][deps] Avoid `CompilerInvocation` copies (#205632)
When constructing the dependency graph for compilation caching, the
dependency scanner needs to do some extra operations on the compiler
invocations. Historically, these have not utilized the copy-on-write
variant well. This patch takes care to minimize `CompilerInvocation`
copies, which improves incremental scans with populated up-to-date
scanning module cache by 16-18%. Together with
https://github.com/llvm/llvm-project/pull/203350 which operates in the
same space, wall-times are improved by 1.54x and instruction counts by
1.66x.
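As a rough illustration of why copy-on-write matters here, the sketch
below shows the general technique: copies share storage until one of
them asks for mutable access. This is not clang's actual class layout;
`OptionGroup`, `CowInvocationSketch` and their members are hypothetical
names.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for one group of compiler options.
struct OptionGroup {
  std::vector<std::string> includePaths;
};

// Copying this object only copies a shared_ptr. The option storage is
// duplicated lazily, the first time a shared copy asks to mutate it.
class CowInvocationSketch {
  std::shared_ptr<OptionGroup> opts = std::make_shared<OptionGroup>();

public:
  const OptionGroup &options() const { return *opts; }

  OptionGroup &mutableOptions() {
    if (opts.use_count() > 1) // Shared with another copy: detach first.
      opts = std::make_shared<OptionGroup>(*opts);
    return *opts;
  }

  bool sharesStorageWith(const CowInvocationSketch &other) const {
    return opts == other.opts;
  }
};
```

In this model, code that only reads through `options()` never pays for a
deep copy, which is the kind of saving the patch is after.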
[clang][analyzer] Detect use-after-move for 3-arg std::move (#196602)
This implementation detects use-after-move for the three-argument
`std::move` on containers. This PR fixes #137157.
Since my current implementation uses `IteratorModeling`, which is in the
alpha stage, I am marking this PR as a draft.
When both `IteratorModeling` and `MoveChecker` are enabled, my
implementation detects the use-after-move for the three-argument
`std::move` case.
```cpp
std::move(l1.begin(), l1.end(), std::back_inserter(l2));
std::cout << "l1: " << *l1.cbegin() << '\n'; // <--- should have a use-after-move
```
```text
move_iterator.cpp:14:28: warning: Method called on moved-from object 'l1' of
[14 lines not shown]
[lldb] send 0x0 size packet if LLDB_LAUNCH_FLAG_USE_PIPES is set (#206107)
`LLDB_LAUNCH_FLAG_USE_PIPES=1` is used in tests to run lldb without the
ConPTY on Windows. This reduces the flakiness of tests.
This patch ensures that we read the value of
`LLDB_LAUNCH_FLAG_USE_PIPES` when setting up gdbremote tests, to make
sure they don't use the ConPTY.
This fixes `tools/lldb-server/TestGdbRemote_qThreadStopInfo.py` on
https://ci-external.swift.org/job/lldb-windows/job/main/.
[mlir][linalg] Guard pack tensor semantics (#206011)
Added a guard so the structured pack transform reports a normal tiling
failure when the target has already been bufferized, instead of reaching
a tensor-only path and asserting.
Fixes #205744
[libc++] Move _LIBCPP_FOPEN_CLOEXEC_MODE to <fstream> (#205537)
The macro is only required inside `<fstream>`, so we can move it there
instead of having it as a general configuration macro.
[MLIR][XeGPU][VectorToXeGPU] Minor fix for proper handling of 0D memrefs (#195877)
It fixes the following case:
```
vector.transfer_read %arg0[], %0 : memref<f16>, vector<f16>
```
[libc++][NFC] Simplify the implementation of aligned_union (#185449)
Instead of manually calculating the size and alignment of a union, we
can just generate an actual union and take the size and alignment of
that.
Co-authored-by: Louis Dionne <ldionne.2 at gmail.com>
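The idea behind the aligned_union change can be sketched as follows.
This is an illustrative reconstruction, not the libc++ implementation:
`UnionOf` and `AlignedUnionSketch` are hypothetical names, and the
minimum-length parameter of `std::aligned_union` is omitted for brevity.

```cpp
#include <cstddef>

// Recursive helper: a real union with one member per type in the pack,
// so the compiler computes the size and alignment for us.
template <class T, class... Rest>
union UnionOf {
  T head;
  UnionOf<Rest...> tail;
};

template <class T>
union UnionOf<T> {
  T head;
};

template <class... Ts>
struct AlignedUnionSketch {
  static constexpr std::size_t size = sizeof(UnionOf<Ts...>);
  static constexpr std::size_t alignment = alignof(UnionOf<Ts...>);
  // Raw storage suitable for holding any one of Ts.
  struct type {
    alignas(alignment) unsigned char data[size];
  };
};
```

The union is only used inside `sizeof` and `alignof`, so it is never
constructed.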
[flang] Attach a placeholder `acc.var_name` to allocations in recipes. (#205939)
`ACCRecipeMaterialization` can replace the placeholder with the actual
variable name when materializing the recipe.
Assisted-by: Claude Code
[libc] introduce shared compiler-rt builtins (#200094)
Introduce shared compiler-rt builtins to libc and add the addtf3 builtin.
Split from #197950
Part of #197824
[mlir][XeGPU][Transform] Add XeGPU contiguity analysis. (#201684)
Add an AxisInfo-based (borrows the idea from Triton Axis Info analysis)
dataflow analysis that computes, for each `xegpu.load` / `xegpu.store`
gather/scatter, how many elements are contiguous along the innermost
offsets dimension, and stamps that count as a `contiguity` **operation
attribute** (`OptionalAttr<I64Attr>`) on the op.
`contiguity` is a target-independent property of the offsets, not a
request tied to any optimization; a consumer is free to use or ignore
it. The analysis performs no rewrite. Turning the property into a
concrete `lane_layout` / `lane_data` split (which needs the subgroup
size) and the actual memory-message rewrite are consumer concerns,
handled by later layout-propagation steps (subsequent PRs) or, for
testing, by the apply helper
[4 lines not shown]
[mlir][gpu] Fix mgpuLaunchKernel sharedMemBytes type in LevelZero runtime (#206119)
The GPU launch lowering in SelectObjectAttr.cpp declares and calls
`mgpuLaunchKernel` with the dynamic shared memory size argument typed as
`i32`, but the Level Zero runtime wrapper declared the corresponding
parameter as `size_t` (8 bytes on 64-bit targets). Since these are
positional C-ABI arguments, the 4-byte vs 8-byte mismatch shifts the
layout of every following argument (stream, params, extra, paramsCount),
corrupting the call and crashing at launch.
Change the parameter to int32_t to match the codegen, consistent with
the CUDA and ROCm runtime wrappers which already use int32_t smem.
Co-authored-by: Claude Opus 4.8 <noreply at anthropic.com>
[mlir][SCF]: promote one-iteration loops with equal ub and step values (#205826)
Adds a fast-path to `constantTripCount` that returns 1 when the upper
bound and the step are the same value, and enables promotion of
single-iteration loops of the form:
```
scf.for %j = %c0 to %val step %val ... { ... }
```
Signed-off-by: Ege Beysel <beyselege at gmail.com>
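A minimal model of the trip-count fast-path described above. This is a
sketch under stated assumptions, not the MLIR implementation:
`SketchValue` and `constantTripCountSketch` are hypothetical, the lower
bound is assumed to be the constant 0 as in the example, and the step is
assumed to be positive.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for an SSA value: an identity plus an optional
// compile-time constant.
struct SketchValue {
  int id;
  std::optional<int64_t> constant;
};

// Returns the trip count of `for i = lb to ub step step` when it can be
// determined statically, or std::nullopt otherwise.
std::optional<int64_t> constantTripCountSketch(const SketchValue &lb,
                                               const SketchValue &ub,
                                               const SketchValue &step) {
  // Fast path: lb is 0 and ub/step are the same value v, so the count is
  // ceil(v / v) = 1 even though v itself is unknown (assumes v > 0).
  if (lb.constant == 0 && ub.id == step.id)
    return 1;
  if (!lb.constant || !ub.constant || !step.constant || *step.constant <= 0)
    return std::nullopt;
  if (*ub.constant <= *lb.constant)
    return 0;
  int64_t range = *ub.constant - *lb.constant;
  return (range + *step.constant - 1) / *step.constant; // ceil division
}
```

With a known trip count of 1, the loop body can be inlined in place of
the loop, which is what promotion refers to here.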
[AMDGPU] Align to LDS granularity in occupancy calculation (#205637)
Account for LDS allocation granularity by rounding per-workgroup LDS up
to the block size in getOccupancyWithWorkGroupSizes, fixing
overestimated occupancy.
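The effect of the rounding can be sketched with a simplified model. The
function names and all numbers below are hypothetical, and the real
`getOccupancyWithWorkGroupSizes` takes more factors into account than
this.

```cpp
#include <cstdint>

// Rounds `value` up to the next multiple of `granule` (granule > 0).
uint32_t alignToGranule(uint32_t value, uint32_t granule) {
  return (value + granule - 1) / granule * granule;
}

// Number of workgroups whose LDS allocations fit at once, given that
// each allocation is rounded up to the allocation granularity.
// Precondition: ldsPerWorkgroup > 0.
uint32_t workgroupsThatFit(uint32_t totalLds, uint32_t ldsPerWorkgroup,
                           uint32_t granule) {
  return totalLds / alignToGranule(ldsPerWorkgroup, granule);
}
```

With illustrative values of 65536 bytes of LDS and 13000 bytes per
workgroup, ignoring granularity (granule of 1) gives 5 workgroups, while
a granule of 2048 rounds each allocation up to 14336 bytes and gives 4.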
[NFC][clang-tidy] Extend doc-comment of BranchCloneCheck (#206116)
Commit 8ac2b77a11c9db9879557ce1c26e38628e1ef45f extended the check
bugprone-branch-clone with a new feature but forgot to mention this in
the doc-comment at the beginning of BranchCloneCheck.h.
Although I don't think that this comment is read too often, let's still
update it to provide accurate information.