[flang][OpenMP] More detailed checks for argument list items in clauses (#201334)
For clauses that take lists of variable, locator, or extended list
items, check that the actual arguments meet the corresponding
requirements. The checks are version-based, since clause requirements
have changed over time.
AMDGPU/GlobalISel: RegBankLegalize rules for cvt f16<->fp8/bf8 (#202361)
Small types are implemented using integers in LLVM IR,
so there are no IRTranslator failures.
[SelectionDAG] Promote FPOWI/FLDEXP exponents where possible, and raise an error otherwise (#200621)
PromoteIntOp_ExpOp is reached when the exponent type is illegal.
- When the exponent type was smaller than int, we'd hit an assertion. In
builds where asserts were disabled, we actually ended up doing the right
thing; makeLibCall would sign-extend the value to int.
- When the exponent type was too large, we'd also hit an assertion. In
builds where asserts were disabled, we would *not* do the right thing;
we'd end up silently truncating the value. Now we explicitly raise an
error.
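The two cases above can be sketched outside of SelectionDAG as follows.
This is only an illustration, assuming a 32-bit `int`; the function name
`promoteExponent` and its signature are hypothetical and are not the
legalizer's actual API.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model of the exponent handling, assuming a 32-bit int.
// RawBits holds the exponent's bit pattern, ExpBits its type width.
// Returns the exponent widened to int, or std::nullopt for the case
// that is reported as an error instead of being silently truncated.
std::optional<std::int32_t> promoteExponent(std::uint64_t RawBits,
                                            unsigned ExpBits) {
  constexpr unsigned IntBits = 32;
  if (ExpBits == 0 || ExpBits > IntBits)
    return std::nullopt; // Too wide: refuse rather than truncate.

  // ExpBits <= 32 here, so the shifts below cannot overflow.
  const std::uint64_t Mask = (std::uint64_t{1} << ExpBits) - 1;
  const std::uint64_t SignBit = std::uint64_t{1} << (ExpBits - 1);
  const std::uint64_t Low = RawBits & Mask;

  // Sign-extend: flip the sign bit, then subtract its weight.
  const std::int64_t Extended = static_cast<std::int64_t>(Low ^ SignBit) -
                                static_cast<std::int64_t>(SignBit);
  return static_cast<std::int32_t>(Extended);
}
```

For example, an 8-bit exponent with bit pattern 0xFF is widened to -1,
while a 64-bit exponent yields no value at all.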
[SPIRV] Add support for G_PTRMASK (#201450)
This instruction is generated for the
[llvm.ptrmask](https://llvm.org/docs/LangRef.html#llvm-ptrmask-intrinsic)
intrinsic, which Clang uses for builtins like
[__builtin_align_up](https://clang.llvm.org/docs/LanguageExtensions.html#alignment-builtins).
`libc` uses that builtin, and we are working on building `libc` for
SPIR-V, so we hit this problem.
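For readers unfamiliar with the pattern, here is a rough sketch of why
aligning up reduces to a pointer mask. It models addresses as plain
integers; `ptrMask` and `alignUp` are hypothetical helper names, not the
intrinsic or the builtin themselves.

```cpp
#include <cstdint>

// Integer model of a pointer mask: keep only the address bits that are
// set in Mask.
inline std::uintptr_t ptrMask(std::uintptr_t Addr, std::uintptr_t Mask) {
  return Addr & Mask;
}

// Round Addr up to the next multiple of Align. Align is assumed to be a
// power of two, so ~(Align - 1) clears exactly the low bits.
inline std::uintptr_t alignUp(std::uintptr_t Addr, std::uintptr_t Align) {
  return ptrMask(Addr + (Align - 1), ~(Align - 1));
}
```

With this model, `alignUp(13, 8)` gives 16 and an already aligned
address such as 16 is returned unchanged.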
Signed-off-by: Nick Sarnie <nick.sarnie at intel.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply at anthropic.com>
[lld-macho] Sort LC_LINKER_OPTIONS before processing (#201604)
Previously https://reviews.llvm.org/D157716 brought handling of
LC_LINKER_OPTIONS closer to Apple linker behavior by processing the
options at the end after all object files have been added.
This commit corrects another difference in behavior: frameworks are now
processed before regular libraries (linked with -lFoo), and each group
is processed in sorted order.
Processing an LC_LINKER_OPTIONS can trigger loads of more object files
which in turn may have more LC_LINKER_OPTIONS. We iterate this to a
fixed point, walking this graph in BFS order, processing each "level" of
the graph in the order described above. This graph traversal order
hasn't changed in this commit, only the sorting has.
The diff of the linker map produced for the included test before and
after:
```
[20 lines not shown]
```
[Clang][Driver] default-on include path backslash warning on PS5 (#202300)
There appears to be precedent for using addClangWarningOptions in the
driver to set per-target warning defaults, e.g. in AMDGPU.
These warnings are usually disabled by default to avoid overdiagnosing
common patterns on Windows host+target builds which don't care about
portability. Since PS5 builds are cross-compiled it makes less sense to
assume things about the host, so we want to diagnose portability issues
more eagerly.
[Clang] Set default LTO mode for AMDGCN/SPIR-V targets to full (#201457)
Summary:
Previously we had several layers of if conditions that functionally
amounted to pretending like we were in LTO-mode. The previous changes
moved the LTO settings into the toolchain so we can now override it for
our offloading toolchains. This allows us to respect the LTO mode, where
previously there was no way to override it.
The main artifact of this PR should be trimming up the massive if
statement.
There are some slight by-products on the old-driver path, but the
previous behavior can be recovered with `-fno-offload-lto`, and the old
driver should be deleted in a few months anyway.
[MLIR][NVVM] Add explicit aligned attribute to nvvm.barrier and nvvm.barrier.reduction (#200745)
This PR follows the third PR commitment in #192203.
This PR adds an explicit aligned boolean attribute to `nvvm.barrier`,
defaulting to true to preserve the existing semantic default, and
extends the op's LLVM IR lowering to pick between the `.aligned` and
non-`.aligned` spellings of the `@llvm.nvvm.barrier.cta.*` intrinsic
family.
Notes on using `BoolAttr` instead of UnitAttr: `nvvm.barrier`'s existing
lowering always emits an aligned intrinsic variant. Making aligned a
BoolAttr with default true captures that as the op's default, and the
`custom<Aligned>` described below only emits the keyword when
non-default.
(I am not able to merge branches, please help me)
[SYCL] Error on C inputs when compiling with -fsycl (#200318)
`SYCL` is a `C++`-based programming model and requires `C++` source
files.
Enforce this invariant in the frontend by rejecting `C` inputs when SYCL
mode is active, ensuring that `LangOpts.SYCL` implies
`LangOpts.CPlusPlus` regardless of how the compiler is invoked.
[CodeGen] Add initial multi-def rematerialization support
This significantly improves support for rematerializing registers
with more than one definition. In particular, this includes cases where
different lanes of a register are defined over multiple instructions.
There are still a few restrictions that can hopefully be relaxed in the
future.
- All defining instructions must be part of the same rematerialization
region.
- No pure user of the register (i.e., an MI that doesn't also define a
part of the register) must read the register before its last
definition.
These constraints ensure that the underlying DAG representation
maintained by the rematerializer is still valid, making this a
relatively incremental improvement.
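As a rough illustration of the two restrictions, the sketch below checks
them on a flat, program-ordered list of accesses to one register. The
`Access` struct and `satisfiesMultiDefConstraints` are hypothetical, and
a linear list is a simplification of what the rematerializer tracks.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record of one MI touching the register, in program order.
struct Access {
  unsigned Region; // rematerialization region containing the MI
  bool Defines;    // the MI defines (part of) the register
  bool Reads;      // the MI reads the register
};

// Returns true if all defs share one region and no pure user (reads
// without defining) appears before the last definition.
bool satisfiesMultiDefConstraints(const std::vector<Access> &Accesses) {
  bool SeenDef = false;
  unsigned DefRegion = 0;
  std::size_t LastDef = 0;
  for (std::size_t I = 0; I < Accesses.size(); ++I) {
    if (!Accesses[I].Defines)
      continue;
    if (SeenDef && Accesses[I].Region != DefRegion)
      return false; // defs spread over several regions
    SeenDef = true;
    DefRegion = Accesses[I].Region;
    LastDef = I;
  }
  if (!SeenDef)
    return false; // nothing to rematerialize
  for (std::size_t I = 0; I < LastDef; ++I)
    if (Accesses[I].Reads && !Accesses[I].Defines)
      return false; // pure user reads before the last definition
  return true;
}
```

A def, a def that also reads, then a pure use passes the check; moving
the pure use between the two defs makes it fail.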
[CodeGen] Fine-grained LIS updates on remat and dead-def handling
This replaces the rematerializer's manual bulk LIS update paradigm with
an automated fine-grained one that
1. performs LIS updates as rematerializations happen and
2. handles the removal of dead-definitions properly (this replaces the
prior partial handling of live interval splitting).
The new approach should be less error-prone (clients do not have to
periodically update the LIS, which is now up-to-date at all times from
the clients' perspective) and faster in general (live intervals aren't
fully re-created every time a def or use of a register changes).
Handling dead-definitions (through a `LiveRangeEditor`) adds some
complexity to the rematerializer since unrematerializable MIs can now
also be deleted. This is exposed to listeners through a new event.
Furthermore, rematerializable registers can now become "permanently
dead" if all their users were unrematerializable MIs that became dead as
[11 lines not shown]
[CodeGen][AMDGPU] Prepare rematerializer for subreg remat support (NFC)
This makes some NFC changes to the rematerializer before starting to
improve support for sub-register rematerialization. The main change is
the replacement of the `Rematerializer::Reg::Dependency` type
(essentially a pair of a machine operand index and a register index)
with a simple register index, dropping the machine operand index. The
latter has no current uses and will lose meaning once we allow
rematerializable registers to be defined by multiple MIs. Similarly, and
for the same reason, unrematerializable register dependencies are now
tracked as a register/lanemask pair instead of a machine operand index.
Other minor changes listed below.
- Removal of the `DefRegion` argument to `Rematerializer::recreateReg`.
Registers are always re-created in their original region so there is
no need to set their region again.
- Removal of the unused `InsertPos` argument to
`Rematerializer::postRematerialization`.
[3 lines not shown]
[CodeGen] Fix incorrect rematerialization rollback order
This fixes an issue in the rematerializer's rollbacker wherein adjacent
MIs that were deleted through rematerializations would sometimes
(depending on the exact order in which they were deleted) not be
re-created in their original pre-rematerialization order. While this
does not impact correctness (i.e., use-def relations are always
honored), it goes against the rollbacker's intent to re-create the MIR
exactly as it was before rematerialization (up to slot index changes).
[IR] Preserve pointer-byte bitcasts around addrspacecast (#202454)
This fixes cast-pair elimination for addrspacecast combined with
pointer/byte bitcast.
The LLVM LangRef defines [bN byte
types](https://llvm.org/docs/LangRef.html#byte-type) as raw memory data
in SSA registers, where each bit may be an integer bit, part of a
pointer value, or poison.
The LangRef permits pointer-to-byte bitcast: the [bitcast .. to
instruction](https://llvm.org/docs/LangRef.html#bitcast-to-instruction)
says that if the source type is a pointer, the destination type must be
a pointer or a byte/vector-of-bytes type of the same size.
The same [bitcast .. to
section](https://llvm.org/docs/LangRef.html#bitcast-to-instruction) also
defines byte-to-pointer behavior: when the destination type is a pointer
type, a byte value whose bits all come from the same correctly ordered
[27 lines not shown]
[SPIRV] Support selection of G_CONCAT_VECTORS (#201686)
Implement the G_CONCAT_VECTORS opcode using `OpCompositeConstruct`. The
semantics are similar so the implementation is straightforward.
This opcode is generated only rarely; in this case it seems to have
remained because of the non-power-of-2 vector length ABI.
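The shared semantics can be sketched with ordinary containers: the
result holds every component of every source vector, in order. The
helper `concatVectors` is hypothetical and only models the behavior; it
is not the selector's code.

```cpp
#include <vector>

// Hypothetical model: each source vector is a list of scalar components,
// and the result lists all of them in source order.
template <typename T>
std::vector<T> concatVectors(const std::vector<std::vector<T>> &Sources) {
  std::vector<T> Result;
  for (const std::vector<T> &Src : Sources)
    Result.insert(Result.end(), Src.begin(), Src.end());
  return Result;
}
```

Concatenating two 3-element vectors this way yields one 6-element
vector, matching a composite built from the two sources as constituents.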
Co-Authored-By: Claude Opus 4.8 <noreply at anthropic.com>