[AMDGPU][GlobalISel] Add register bank legalization for buffer_load byte and short (#167798)
This patch adds register bank legalization support for buffer load byte
and short operations in the AMDGPU GlobalISel pipeline.
[flang][OpenMP] Implement COMBINER clause
This adds parsing and lowering of the COMBINER clause. It utilizes the
existing lowering code for combiner-expression to lower the COMBINER
clause as well.
[lldb][InstrumentationRuntime] Run sanitizer utility expressions as C (#172019)
The utility expressions in the `InstrumentationRuntime` plugins are just
plain C code, but we run them as `ObjC++`. That meant we were doing
redundant work (like looking up decls in the Objective-C runtime). The
sanitizer tests sporadically time out while looking up function symbols
in the Objective-C runtime. This patch switches the expression language
to `C`.
Didn't find a great way of testing this other than looking at the
expression log.
rdar://165656320
[lldb][nfc] Change ProcessGDBRemote::ParseMultiMemReadPacket signature (#172020)
Instead of returning an `Expected<vector<...>>` it now returns an Error,
and receives a vector argument to fill in. This will be useful to
support a change where ParseMultiMemReadPacket will be called multiple
times in a loop with the same vector; without this change, we would have
to concatenate vectors and copy memory around.
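A minimal sketch of the out-parameter pattern described above, not the actual lldb code. `ParseError`, `parseHexBytes`, and `parseAll` are hypothetical names, and `std::optional<std::string>` stands in for `llvm::Error` so the example has no LLVM dependency. Each call appends to the caller's vector, so a loop over several packets needs no concatenation step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for llvm::Error: std::nullopt means success.
using ParseError = std::optional<std::string>;

// Hypothetical parser: decodes pairs of hex digits and appends the bytes to
// `out`, so repeated calls accumulate into one buffer.
ParseError parseHexBytes(std::string_view packet,
                         std::vector<std::uint8_t> &out) {
  if (packet.size() % 2 != 0)
    return std::string("odd number of hex digits");
  auto nibble = [](char c) -> int {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
  };
  // Validate first so that a failed call leaves `out` untouched.
  for (char c : packet)
    if (nibble(c) < 0)
      return std::string("invalid hex digit");
  for (std::size_t i = 0; i < packet.size(); i += 2) {
    int hi = nibble(packet[i]);
    int lo = nibble(packet[i + 1]);
    assert(hi >= 0 && lo >= 0 && "digits were validated above");
    out.push_back(static_cast<std::uint8_t>(hi * 16 + lo));
  }
  return std::nullopt;
}

// Calls the parser once per packet, reusing the same vector each time.
ParseError parseAll(const std::vector<std::string> &packets,
                    std::vector<std::uint8_t> &out) {
  for (const std::string &p : packets)
    if (ParseError err = parseHexBytes(p, out))
      return err;
  return std::nullopt;
}
```

With a by-value return, `parseAll` would have to merge one temporary vector per packet into the result; here the only allocations are the ones `out` makes as it grows.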
[AArch64][GlobalISel] Renamed GI nodes describing intrinsics with immediate offsets
In SDAG, nodes that expect an immediate offset end with _I. This is now reflected in GISel for the vector shifts.
[mlir][PDL] Relax PDL verification constraints
This commit introduces the following changes:
1. HasParentNotOf: A trait that verifies an operation's parent is not one of
the specified parent operations.
2. Adds a `nonmaterializable` attribute to `pdl.pattern`
that indicates a pattern cannot be directly lowered to pdl_interp. This
allows patterns to contain non-PDL operations (e.g., func.call), and relaxes
other constraints for patterns that require further transformations before
materialization. An example of such a transformation is function inlining on
the example below.
3. Relax parent constraints in PDL. For example, instead of `HasParent<RewriteOp>`
use `HasParentNotOf<PatternOp>`, as the latter has the same intended meaning within
PDL, but allows using the ops outside PDL like in func.func.
4. Add error in PDLToPDLInterp pass for nonmaterializable patterns. It's responsibility
[30 lines not shown]
ValueTracking: Handle amdgcn.rsq intrinsic in computeKnownFPClass (#171837)
We have other target intrinsics already in ValueTracking functions,
and no access to TTI.
[AMDGPU] Add missing cases for V_INDIRECT_REG_{READ/WRITE}_GPR_IDX and V/S_INDIRECT_REG_WRITE_MOVREL (#171835)
A buildbot failure in https://github.com/llvm/llvm-project/pull/170323
when expensive checks were used highlighted that some of these patterns
were missing.
This patch adds `V_INDIRECT_REG_{READ/WRITE}_GPR_IDX` and
`V/S_INDIRECT_REG_WRITE_MOVREL` for `V6` and `V7` vector sizes.
Use `llvm::SmallVector` instead of `OwningArrayRef` in `VTableLayout`. (#168768)
This simplifies the code by removing the manual optimization for size ==
1, and also gives us an optimization for other small sizes.
Accept a `llvm::SmallVector` by value for the constructor and move it
into the destination, rather than accepting `ArrayRef` that we copy
from. This also lets us not have to construct a reference to the
elements of a `std::initializer_list`, which requires reading the
implementation of the constructor to know whether it's safe.
Also explicitly document that the constructor requires the input indexes
to have a size of at least 1.
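A minimal sketch of the "accept by value and move" constructor shape described above. `LayoutSketch` is a hypothetical class, not `VTableLayout`, and `std::vector` is used in place of `llvm::SmallVector` only so the example compiles without LLVM headers.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a layout class.
class LayoutSketch {
public:
  // Takes the indices by value and moves them into the member. A caller
  // passing an rvalue pays for moves only; a caller passing an lvalue pays
  // for one copy at the call site.
  // Precondition: `indices` must contain at least one element.
  explicit LayoutSketch(std::vector<std::size_t> indices)
      : indices_(std::move(indices)) {
    assert(!indices_.empty() && "at least one index is required");
  }

  std::size_t size() const { return indices_.size(); }
  std::size_t front() const { return indices_.front(); }

private:
  std::vector<std::size_t> indices_;
};
```

Because the class owns its elements after the move, it never holds a reference to a caller's temporary, which is the lifetime question the `ArrayRef` overload left to the reader.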
[ROCDL] Added LDS barrier ops to ROCDL (gfx1250) (#171810)
Added `ds.atomic.barrier.arrive.rtn.b64` and
`ds.atomic.async.barrier.arrive.b64` to ROCDL. These are parts of the
LDS memory barrier concept in GFX1250. Also added alias analysis to
`global/flat` data prefetch ops. Extended rocdl tests.
[AArch64][SVE] Fix -msve-vector-bits=256 fixed width vector crash (#171776)
This adds tests for and fixes an issue where v8bf16 ISD::FP_ROUND v8f32
cannot be lowered when -msve-vector-bits=256.
[lldb] improve the heuristics for checking if a terminal supports Unicode (#171832)
This patch improves the way lldb checks if the terminal it's opened in
(if any) supports Unicode or not.
On POSIX systems, we check if `LANG` contains `UTF-8`.
On Windows, we always return `true` since we use the `WriteConsoleW`
API.
This is a relanding of https://github.com/llvm/llvm-project/pull/168603.
The tests failed because the bots support Unicode but the tests expect
ASCII. To avoid different outputs depending on the environment the tests
are running in, this patch always forces ASCII in the tests.
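A minimal sketch of the POSIX check described above, assuming a plain substring test on `LANG`. The function names are hypothetical and this is not lldb's actual implementation.

```cpp
#include <cassert>
#include <cstdlib>
#include <string_view>

// Returns true if a LANG-style value mentions "UTF-8", e.g. "en_US.UTF-8".
bool langIndicatesUtf8(std::string_view lang) {
  return lang.find("UTF-8") != std::string_view::npos;
}

// Hypothetical wrapper: reads LANG from the environment and treats an unset
// variable as "no Unicode support".
bool terminalLikelySupportsUnicode() {
  const char *lang = std::getenv("LANG");
  return lang != nullptr && langIndicatesUtf8(lang);
}
```

Keeping the string test separate from the environment lookup lets it be exercised with fixed inputs, independent of the machine the tests run on. Note that a case-sensitive test like this one would not match spellings such as `en_US.utf8`.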
[X86] combineHorizOpWithShuffle - ensure we handle undef elements from widened shuffle (#172014)
Since #170838 we no longer canonicalise away whole-lane shuffles of
horizontal ops, so we need to better handle cases where widened shuffle
masks might still contain undefs.
Fixes #172010
[MLIR][LLVM] Add pass to update ops with default visibility (#171727)
To support the `-fvisibility=...` option in Flang, we need a pass to
rewrite all the global definitions in the LLVM dialect that have the
default visibility to have the specified visibility. This change adds
such a pass.
Note that I did not add an option for `visibility=default`; I believe
this makes sense for compiler drivers since users may want to tack an
option on at the end of a compile line to override earlier options, but
I don't think it makes sense for this pass to accept
`visibility=default`--it would just be an early exit IIUC.