[lldb-dap] Refactoring lldb-dap port listening mode to allow multiple connections. (#116392)
This adjusts the lldb-dap listening mode to accept multiple clients.
Each client initializes a new instance of DAP and an associated
`lldb::SBDebugger` instance.
The listening mode is configured with the `--connection` option and
supports listening on a port or, on supported platforms, a Unix socket.
When running in server mode, launch and attach performance should
improve because lldb shares symbols for core libraries between debug
sessions.
PeepholeOpt: Allow introducing subregister uses on reg_sequence (#127052)
This reverts d246cc618adc52fdbd69d44a2a375c8af97b6106. We now handle
composing subregister extracts through reg_sequence.
[flang][cuda] Avoid assign element mismatch when doing data transfer from a constant (#128252)
Currently, when we do a CUDA data transfer from a constant, we embox it
and delegate the assignment to the runtime. When the type of the
constant is not exactly the same as that of the destination descriptor,
the runtime emits an assignment mismatch error.
Convert the constant to the destination type when necessary so the
assignment succeeds.
[RISCV] Assembler support for XRivosVizip (#127694)
This implements assembler support for the XRivosVizip custom/vendor
extension from Rivos Inc. which is defined in:
https://github.com/rivosinc/rivos-custom-extensions (See
src/xrivosvizip.adoc)
Codegen support will follow in a separate change.
[MLIR][ARITH] Adds missing foldings for truncf (#128096)
This patch mainly adds the following `truncf` foldings:
- `truncf(extf(a))` -> `a`, if `a` has the same bitwidth as the result
- `truncf(extf(a))` -> `truncf(a)`, if `a` has a larger bitwidth than
  the result
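
The two rules above can be illustrated with a minimal sketch. This is
not the MLIR implementation: `Value`, `ExtF`, `TruncF` and `fold_truncf`
are hypothetical names, and types are modeled by bit width only, as in
the description above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Value:
    """A leaf floating-point value of a given bit width (hypothetical)."""
    name: str
    bits: int


@dataclass(frozen=True)
class ExtF:
    """Extend `operand` to a wider float of `bits` bits (hypothetical)."""
    operand: "Expr"
    bits: int


@dataclass(frozen=True)
class TruncF:
    """Truncate `operand` to a narrower float of `bits` bits (hypothetical)."""
    operand: "Expr"
    bits: int


Expr = Union[Value, ExtF, TruncF]


def fold_truncf(op: TruncF) -> Expr:
    """Apply the two truncf(extf(a)) rules; return `op` unchanged otherwise."""
    src = op.operand
    if isinstance(src, ExtF):
        a = src.operand
        if a.bits == op.bits:
            # truncf(extf(a)) -> a: the round trip restores the original width.
            return a
        if a.bits > op.bits:
            # truncf(extf(a)) -> truncf(a): the extension is redundant.
            return TruncF(a, op.bits)
    # `a` is narrower than the result, or the operand is not an extf.
    return op
```

For example, with a 32-bit `a`, `TruncF(ExtF(a, 64), 32)` folds to `a`,
and `TruncF(ExtF(a, 64), 16)` folds to `TruncF(a, 16)`.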
[SLP] Fix a crash when checking a scalar in a reordered buildvector node
Check the reordered scalars, not the original ones, so that the
correct scalar is examined.
[MemProf] Display backedges with dotted line in dot graphs (#128235)
Add checking of this behavior in the postbuild dot graphs, facilitated
by PR128226 which marked these edges at the end of the graph building.
AMDGPU: Widen f16 minimum/maximum to v2f16 on gfx950 (#128121)
Unfortunately, we only have the v2f16 vector versions of minimum3
and maximum3. Widen to v2f16 so we can lower as minimum3(x, y, y).
(cherry picked from commit e729dc759d052de122c8a918fe51b05ac796bb50)
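
The lowering relies on minimum3(x, y, y) being equal to minimum(x, y).
A minimal sketch of that identity follows, assuming NaN-propagating
minimum semantics with -0.0 ordered below +0.0; `minimum` and
`minimum3` here are illustrative helpers, not the hardware instructions.

```python
import math


def minimum(x: float, y: float) -> float:
    """Two-input minimum: propagates NaN and treats -0.0 as less than +0.0."""
    if math.isnan(x) or math.isnan(y):
        return math.nan
    if x == y:
        # Equal values may still differ in sign (signed zeros): prefer -0.0.
        return x if math.copysign(1.0, x) < 0 else y
    return x if x < y else y


def minimum3(x: float, y: float, z: float) -> float:
    """Three-input minimum modeled as two chained two-input minimums."""
    return minimum(minimum(x, y), z)


def same(a: float, b: float) -> bool:
    """Equality that treats NaN == NaN and distinguishes -0.0 from +0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)
```

Repeating the second operand does not change the result, so
`same(minimum3(x, y, y), minimum(x, y))` holds for every pair of inputs
under these assumed semantics.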
[Clang] Fix cross-lane scan when given divergent lanes (#127703)
Summary:
The scan operation implemented here only works if there are contiguous
ones in the execution mask that can be used to propagate the result.
There are two solutions to this: either enter 'whole-wave-mode' and
forcibly turn the inactive lanes back on, or do the scan serially. This
implementation does the latter because it is more portable, but first
checks whether the parallel fast path is applicable.
Needs to be backported for correct behavior and because it fixes a
failing libc test.
(cherry picked from commit 6cc7ca084a5bbb7ccf606cab12065604453dde59)
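
The approach described above can be sketched as a host-side simulation.
This is not the Clang implementation: the function names are
hypothetical, lanes are modeled as list indices, the execution mask is
an integer bitmask, and the fast-path condition (active lanes form one
unbroken run) is an assumption made for illustration.

```python
from typing import Dict, List


def active_lanes(mask: int) -> List[int]:
    """Indices of the set bits in the execution mask, lowest first."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def is_contiguous(mask: int) -> bool:
    """True when the set bits of `mask` form a single unbroken run."""
    if mask == 0:
        return False
    # Drop trailing zeros; a contiguous run then looks like 0b0111...1.
    shifted = mask >> ((mask & -mask).bit_length() - 1)
    return (shifted & (shifted + 1)) == 0


def parallel_scan(values: List[int], mask: int) -> Dict[int, int]:
    """Hillis-Steele inclusive sum; valid only for a contiguous mask."""
    lanes = active_lanes(mask)
    first = lanes[0]
    result = {lane: values[lane] for lane in lanes}
    step = 1
    while step < len(lanes):
        prev = dict(result)  # all lanes read before any lane writes
        for lane in lanes:
            if lane - step >= first:
                result[lane] = prev[lane] + prev[lane - step]
        step *= 2
    return result


def serial_scan(values: List[int], mask: int) -> Dict[int, int]:
    """Inclusive sum that visits the active lanes one at a time."""
    result: Dict[int, int] = {}
    running = 0
    for lane in active_lanes(mask):
        running += values[lane]
        result[lane] = running
    return result


def lane_scan(values: List[int], mask: int) -> Dict[int, int]:
    """Use the parallel fast path when applicable, else fall back to serial."""
    if is_contiguous(mask):
        return parallel_scan(values, mask)
    return serial_scan(values, mask)
```

With divergent lanes the parallel version would try to read from an
inactive neighbor, which is why the fallback walks the active lanes in
order instead.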
[libc++] Reduce the dependency of the locale base API on the base system from the headers (#117764)
Many parts of the locale base API are only required when building the
shared/static library, but not from the headers. Document those
functions and carve out a few of those that don't work when
_XOPEN_SOURCE is defined to something old.
Fixes #117630
(cherry picked from commit f00b32e2d0ee666d32f1ddd0c687e269fab95b44)