[HLSL][Matrix] Add Matrix Layout Keywords (#192284)
Fixes #192263
HLSL allows per-declaration matrix layout overrides via the row_major
and column_major keywords. Prior to this commit, matrix layout was only
controlled globally via the -fmatrix-memory-layout command-line flag
(LangOptions::DefaultMatrixMemoryLayout). This commit adds the parsing
and semantic analysis infrastructure needed to support per-declaration
layout, matching DXC behavior.
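As a rough illustration of what the two layouts mean for element addressing, the sketch below computes the linear offset of a matrix element under each one. The function name `element_offset` and its parameters are hypothetical and are not part of Clang or DXC.

```python
def element_offset(row, col, rows, cols, layout):
    """Linear index of element (row, col) in a rows x cols matrix."""
    if layout == "row_major":
        # Rows are contiguous: skip `row` full rows, then `col` elements.
        return row * cols + col
    if layout == "column_major":
        # Columns are contiguous: skip `col` full columns, then `row` elements.
        return col * rows + row
    raise ValueError(f"unknown layout: {layout}")


# Element (1, 2) of a 3x4 matrix under each layout.
row_major_offset = element_offset(1, 2, 3, 4, "row_major")
column_major_offset = element_offset(1, 2, 3, 4, "column_major")
```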
Assisted-by: Claude Opus 4.6
[CIR] Remove overly strict end_catch assertion in catch handler flattening (#193796)
The assertion in `flattenCatchHandler` required `end_catch` to be the
last operation before `yield` in catch handlers, with only branches in
between. Complex catch handlers with cleanup code or nested control flow
can have additional operations between `end_catch` and `yield`. The
yield-to-branch replacement does not depend on `end_catch` position.
Made with [Cursor](https://cursor.com)
[flang][cuda] Fix CUFDeviceGlobal duplicate skip and CUFAddConstructor for empty gpu.module (#194290)
**Fix 1:**
```fortran
module m
real, device :: a(3), b(3), c(3)
contains
attributes(global) subroutine kernel()
a(1) = 1.0
end subroutine
end module
```
When a kernel references global `a`, an earlier pass
(`CUFDeviceFuncTransform`) clones it into the `gpu.module`. When
`CUFDeviceGlobal` later processes all device globals, it finds `a`
already exists, and `break` exits the loop — skipping `b` and `c`
entirely.
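The effect of the early `break` can be modeled with a toy loop. This is only a sketch: `clone_missing_globals` and its arguments are hypothetical, and it assumes the intended behavior is to skip just the duplicate and keep going.

```python
def clone_missing_globals(device_globals, gpu_module, skip_with_break):
    """Toy model: copy each device global into gpu_module unless already present."""
    for name in device_globals:
        if name in gpu_module:
            if skip_with_break:
                break  # buggy: abandons every remaining global
            continue  # assumed intent: skip only the duplicate
        gpu_module.append(name)
    return gpu_module


# `a` was already cloned by an earlier pass.
buggy = clone_missing_globals(["a", "b", "c"], ["a"], skip_with_break=True)
fixed = clone_missing_globals(["a", "b", "c"], ["a"], skip_with_break=False)
```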
[25 lines not shown]
workflows/issue-subscriber: Use a GitHub app token (#194073)
This removes one user of the ISSUE_SUBSCRIBER_TOKEN secret, which we
want to eventually remove since secrets are more difficult to maintain.
[analyzer] Fix use-after-free in CheckerContext::getMacroNameOrSpelling (#194174)
This UAF bug was introduced 14 years ago, in 43de767b55c07.
The first use of this API was in
3b754b25bde4914e5ab693e7db9533c3260e926e, roughly around the same era.
It stayed dormant because the first arguments to `socket` and friends
are macros, which didn't trigger this UAF.
I decided against adding a test because I figured it would not be
really meaningful.
Fixes #194136
[MLIR][GPU] Fix async.yield gpu.async.token lowering race (#190717)
Root cause of #170833 (flakiness of `Integration/GPU/CUDA/async.mlir` on
the Tesla T4 mlir-nvidia buildbot).
In `gpu-to-llvm`, two patterns matched `async.yield` with the same
benefit: the structural `ConvertYieldOpTypes` from
`populateAsyncStructuralTypeConversionsAndLegality` (which just retypes
operands), and `ConvertAsyncYieldToGpuRuntimeCallPattern` (which also
creates and records an event on the stream backing each
`gpu.async.token` operand). When the IR contained `gpu.launch_func`, the
dialect-conversion framework picked the structural pattern, silently
dropping the event record. The `async.execute` then yielded a stream
pointer where its consumers expected an event, and the host await ended
up calling `cuEventSynchronize` on a stream pointer. That call returns
an error without waiting, so the host raced against the GPU.
This change implements a fix which registers
`ConvertAsyncYieldToGpuRuntimeCallPattern` with pattern benefit 2 so it
[8 lines not shown]
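The role of pattern benefit can be sketched with a toy selector. This is not the MLIR API: the `Pattern` class and `pick_pattern` function are hypothetical, and breaking ties by registration order is an assumption made only to keep the model deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    benefit: int


def pick_pattern(patterns):
    """Return the highest-benefit pattern; max() keeps the first one on ties."""
    return max(patterns, key=lambda p: p.benefit)


structural = Pattern("ConvertYieldOpTypes", 1)

# Equal benefit: the structural pattern can win, dropping the event record.
before = pick_pattern(
    [structural, Pattern("ConvertAsyncYieldToGpuRuntimeCallPattern", 1)]
)

# Benefit 2: the GPU runtime pattern is always preferred.
after = pick_pattern(
    [structural, Pattern("ConvertAsyncYieldToGpuRuntimeCallPattern", 2)]
)
```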
[lldb] Override UpdateBreakpointSites in ProcessGDBRemote to use MultiBreakpoint
This concludes the implementation of MultiBreakpoint by actually using
the new packet to batch breakpoint requests.
https://github.com/llvm/llvm-project/pull/192910
[CIR] Handle explicit instantiation declaration in getVTableLinkage (#193809)
Replace errorNYI for TSK_ExplicitInstantiationDeclaration with the
correct linkage logic: use discardable ODR linkage for MSVC, and for
Itanium choose between AvailableExternallyLinkage (when the vtable can
be speculatively emitted) or ExternalLinkage.
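The decision described above can be restated as a small sketch. The function and parameter names are hypothetical and the return values are plain strings rather than CIR linkage enumerators.

```python
def vtable_linkage_for_explicit_instantiation_decl(
    is_msvc_abi, can_speculatively_emit
):
    """Toy restatement of the linkage choice for an explicit instantiation declaration."""
    if is_msvc_abi:
        return "discardable ODR linkage"
    # Itanium ABI: depends on whether the vtable can be speculatively emitted.
    if can_speculatively_emit:
        return "AvailableExternallyLinkage"
    return "ExternalLinkage"


itanium_speculative = vtable_linkage_for_explicit_instantiation_decl(False, True)
itanium_plain = vtable_linkage_for_explicit_instantiation_decl(False, False)
```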
Fixes point 3 of #192330.
[flang][NFC] Converted five tests from old lowering to new lowering (part 50) (#194277)
Tests converted from test/Lower: polymorphic-temp.f90,
polymorphic-types.f90, procedure-declarations.f90,
read-write-buffer.f90, real-operations-2.f90
[lldb] Fix has_lldb_codesign check (#194412)
I ran into a problem with this test on WSL. WSL has access to the
Windows file system, and a Windows installation has a `security`
directory (`/mnt/c/Windows/security` from the WSL point of view). So
when we try to run the `security` command, we get a `PermissionError`
instead of a `FileNotFoundError`. This would not be a problem on its
own, but Python evaluates method decorators first, when the class is
defined (see the example below), so `has_lldb_codesign()` is called on
platforms other than Darwin too. Also, I think some Linux distros might
have a `security` command, so the current check is not robust enough.
```python
import unittest
def checker():
    raise Exception("test")
@unittest.skipUnless(False, "")
[5 lines not shown]