[AMDGPU] Be less optimistic when allocating module scope lds (#161464)
Make the test for when additional variables can be added to the struct
allocated at address zero more stringent. Previously, variables could be
added to it (for faster access) even when that increased the lds
requested by a kernel. This corrects that oversight.
The test case diff shows the change from all variables being allocated
into the module lds to only some of them being allocated there. In
particular, uses of the offset table are introduced and some kernels now
use less lds than before.
Alternative to PR 160181
[NVPTX] expand trunc/ext on v2i32 (#161715)
#153478 made v2i32 legal on newer GPUs, but we cannot lower all
operations yet. Expand the `trunc/ext` operations until we implement
efficient lowering.
[InstCombine] Fold icmp with clamp into unsigned bound check (#161303)
Fix #157315
alive2: https://alive2.llvm.org/ce/z/TEnuFV
The equality comparison of `min(max(X, Lo), Hi)` and `X` is actually a
range check on `X`. This PR folds this into an unsigned bound check `(X
- Lo) u< (Hi - Lo + 1)`.
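The equivalence above can be checked exhaustively at a small bit width. The following is a minimal sketch, not the InstCombine implementation: the helper names and the `bits` parameter are hypothetical, the clamp is assumed to be signed with `Lo <= Hi`, and the case where `[Lo, Hi]` covers every value is assumed to be excluded, because `Hi - Lo + 1` then wraps to zero.

```python
def clamp_equals_x(x, lo, hi):
    """Original form: min(max(X, Lo), Hi) == X, on signed values."""
    return min(max(x, lo), hi) == x


def unsigned_bound_check(x, lo, hi, bits):
    """Folded form: (X - Lo) u< (Hi - Lo + 1), with wrap-around at `bits` bits."""
    mask = (1 << bits) - 1
    return ((x - lo) & mask) < ((hi - lo + 1) & mask)


def fold_holds_everywhere(bits):
    """Compare both forms for every signed X, Lo, Hi with Lo <= Hi."""
    smin = -(1 << (bits - 1))
    smax = (1 << (bits - 1)) - 1
    for lo in range(smin, smax + 1):
        for hi in range(lo, smax + 1):
            if hi - lo + 1 > (1 << bits) - 1:
                # Assumption: the full-range clamp is not handled by this
                # fold, since Hi - Lo + 1 would wrap to 0.
                continue
            for x in range(smin, smax + 1):
                if clamp_equals_x(x, lo, hi) != unsigned_bound_check(x, lo, hi, bits):
                    return False
    return True
```

Calling `fold_holds_everywhere(6)` compares the two forms over all 6-bit inputs. The alive2 link above remains the authoritative proof for the actual transform.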
---------
Co-authored-by: Yingwei Zheng <dtcxzyw at qq.com>
[RISCV][GISel] Share an atomic load isel pattern GISel RV64 and SDAG RV32. NFC (#161721)
Use stricter type for RV64 only patterns.
Stores are different because atomic_store doesn't differentiate
truncating and non-truncating stores.
[OFFLOAD] Restore interop functionality (#161429)
This implements two pieces to restore the interop functionality (that I
broke) when the 6.0 interfaces were added:
* A set of wrappers that support the old interfaces on top of the new
ones
* The same level of interop support for the CUDA and AMD plugins
[RISCV] Always use XLenVT for pointer operand in PatLAQ and PatSRL. NFC (#161709)
The vt argument is not used today so it always gets the default XLenVT
which is why this is NFC. I plan to use it in a future patch.
[libc] Implement faccessat (#161065)
#160404
- Implement POSIX function "faccessat"
- Remove redundant param in faccessat syscall in access implementation;
the faccessat syscall does not take a flags arg
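For readers unfamiliar with `faccessat`, its directory-relative lookup can be sketched from Python. `os.access` accepts a `dir_fd` argument, which on POSIX platforms is expected to be backed by `faccessat`. This only illustrates the POSIX semantics and is not the libc implementation from this patch; the helper name `exists_relative_to` is hypothetical.

```python
import os
import tempfile


def exists_relative_to(dir_path, name):
    """Check whether `name` exists, resolved relative to an open directory fd."""
    dir_fd = os.open(dir_path, os.O_RDONLY)
    try:
        # F_OK tests for existence only; R_OK, W_OK and X_OK test permissions.
        return os.access(name, os.F_OK, dir_fd=dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "present.txt"), "w").close()
        print(exists_relative_to(d, "present.txt"))
        print(exists_relative_to(d, "missing.txt"))
```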
[NFC][OpenACC] Remove 'initExpr' from AST/etc. (#161674)
I originally expected that we were going to need the initExpr stored
separately from the allocaDecl when doing arrays/pointers, however after
implementing it, we found that the idea of having the allocaDecl just
store its init directly still works perfectly. This patch removes the
extra field from the AST.
[mlir][Arith] arith.select doesn't need to be emulated for small floats (#161707)
arith.select isn't an arithmetic operation in the sense of things like
addf or mulf, which the emulate-unsupported-floats rewrites using extf
and truncf.
This patch adds select as a legal operation to prevent a pointless
conversion around conditional moves.
Fixes https://github.com/iree-org/iree/issues/22181
[MLIR][XeGPU] Use operand layouts for store scatter (#161447)
The PR changes store scatter to use the layouts from the operands, since
store doesn't have a result.
[LAA,LV] Add early-exit tests with deref assumes and nofree via context.
Add tests with early exits and dereferenceable assumptions that need
proving no-free via the context.
[AArch64][SME] Support split ZPR and PPR area allocation (#142392)
For a while we have supported the `-aarch64-stack-hazard-size=<size>`
option, which adds "hazard padding" between GPRs and FPR/ZPRs. However,
there is currently a hole in this mitigation as PPR and FPR/ZPR accesses
to the same area also cause streaming memory hazards (this is noted by
`-pass-remarks-analysis=sme -aarch64-stack-hazard-remark-size=<val>`),
and the current stack layout places PPRs and ZPRs within the same area,
which looks like:
```
------------------------------------ Higher address
| callee-saved gpr registers |
|---------------------------------- |
| lr,fp (a.k.a. "frame record") |
|-----------------------------------| <- fp(=x29)
| <hazard padding> |
|-----------------------------------|
[54 lines not shown]
[clang] NFCI: Clean up `CompilerInstance::create{File,Source}Manager()` (#160748)
The `CompilerInstance::createSourceManager()` function currently accepts
the `FileManager` to be used. However, all clients call
`CompilerInstance::createFileManager()` prior to creating the
`SourceManager`, and it never makes sense to use a `FileManager` in the
`SourceManager` that's different from the rest of the compiler. Passing
the `FileManager` explicitly is redundant, error-prone, and deviates
from the style of other `CompilerInstance` initialization APIs.
This PR therefore removes the `FileManager` parameter from
`createSourceManager()` and also stops returning the `FileManager`
pointer from `createFileManager()`, since that was its primary use. Now,
`createSourceManager()` internally calls `getFileManager()` instead.
[mlir] [irdl] Add support for regions in irdl-to-cpp (#158540)
Fixes https://github.com/llvm/llvm-project/issues/158034
For the input
```mlir
irdl.dialect @conditional_dialect {
// A conditional operation with regions
irdl.operation @conditional {
// Create region constraints
%r0 = irdl.region // Unconstrained region
%r1 = irdl.region() // Region with no entry block arguments
%v0 = irdl.any
%r2 = irdl.region(%v0) // Region with one i1 entry block argument
irdl.regions(cond: %r2, then: %r0, else: %r1)
}
}
[70 lines not shown]
[HLSL] [SPIR-V] Add counter member for typed buffer (#161414)
This is part 1 of implementing the typed buffer counters proposal:
https://github.com/llvm/wg-hlsl/blob/main/proposals/0023-typed-buffer-counters.md
This patch adds the initial plumbing for supporting counter variables
associated with structured buffers for the SPIR-V backend. It introduces
an `IsCounter` attribute to `HLSLAttributedResourceType` and threads it
through the AST, type printing, and mangling. It also adds a
`__counter_handle` member to the relevant buffer types in
`HLSLBuiltinTypeDeclBuilder`.
Contributes to https://github.com/llvm/llvm-project/issues/137032
[libc] Fix issue with fuzz input too short for atoi diff fuzz (#161705)
The string to integer differential fuzzer assumes at least one byte of
meaningful input, but wasn't explicitly checking that. Now it does.
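The kind of guard described above can be sketched as follows. This is a hypothetical harness, not the libc fuzzer itself: `parse_decimal_prefix`, `reference_parse`, and `fuzz_one` are invented names, and the only point illustrated is the explicit length check before the input is used.

```python
import re


def parse_decimal_prefix(text):
    """Hand-rolled atoi-like parser: skip whitespace, optional sign, digits."""
    i, n = 0, len(text)
    while i < n and text[i] in " \t\n\v\f\r":
        i += 1
    sign = 1
    if i < n and text[i] in "+-":
        sign = -1 if text[i] == "-" else 1
        i += 1
    value = 0
    while i < n and "0" <= text[i] <= "9":
        value = value * 10 + (ord(text[i]) - ord("0"))
        i += 1
    return sign * value


def reference_parse(text):
    """Independent reference: regex match on the same grammar, 0 if no digits."""
    m = re.match(r"[ \t\n\v\f\r]*([+-]?[0-9]+)", text)
    return int(m.group(1)) if m else 0


def fuzz_one(data):
    """Return False if the input is too short to use, True after a comparison."""
    if len(data) < 1:
        # The explicit check: without at least one byte there is nothing to parse.
        return False
    text = data.decode("latin-1")
    assert parse_decimal_prefix(text) == reference_parse(text)
    return True
```

Returning early on an empty input keeps the harness from reading past the buffer in a language like C++; in this Python sketch it simply skips the comparison.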