[PowerPC] Require PPC32 for 32-bit addc/adde/subc/sube (#179186)
Unlike the base add/sub opcodes, which simply overflow, these opcodes
produce incorrect results because the carry operates on the full
64 bits. Trying to use them with i32 operands on PPC64 should result in
a selection failure instead of a silent miscompile, like the one seen in
https://github.com/llvm/llvm-project/pull/178979.
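
To illustrate why the carry goes wrong, here is a minimal C++ sketch that models a carrying add at both widths. The names `AddResult`, `addCarry64` and `addCarry32` are hypothetical and not part of the backend, and the sketch assumes the i32 operands sit zero-extended in 64-bit registers.

```cpp
#include <cassert>
#include <cstdint>

// Result of an add that also reports its carry-out bit.
struct AddResult {
  uint64_t value;
  bool carry;
};

// Models a carrying add on a 64-bit register: the carry comes out of bit 63.
AddResult addCarry64(uint64_t a, uint64_t b) {
  uint64_t sum = a + b; // wraps modulo 2^64
  return {sum, sum < a};
}

// Models what an i32 carrying add must produce: the carry comes out of bit 31.
AddResult addCarry32(uint32_t a, uint32_t b) {
  uint32_t sum = a + b; // wraps modulo 2^32
  return {sum, sum < a};
}
```

With operands `0xFFFFFFFF` and `1`, the 32-bit model reports a carry while the 64-bit model does not, so a following carry-consuming add would see the wrong carry bit.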
[OFFLOAD] Support host plugin on Windows (#180401)
Changes to make the host plugin compile on Windows:
* Change IO code to be portable
* Adjust Makefiles
Allow the plugin to work partially when libffi support is not found
dynamically (compilation works fine even on Windows because of the
wrapper support).
[MLIR] Guard optional operand resolution in generated op parsers (#180796)
Skip resolveOperands for optional operands when they are absent to
avoid out-of-bounds access on the empty types vector.
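
The shape of the guard can be sketched in plain C++. This is not the generated parser code: `resolveOptionalOperand` and its parameters are hypothetical stand-ins, and strings stand in for MLIR types.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Resolves an optional operand against its parsed type list.
// When the operand is absent, `types` is empty and must not be indexed.
void resolveOptionalOperand(bool operandPresent,
                            const std::vector<std::string> &types,
                            std::vector<std::string> &resolved) {
  if (!operandPresent)
    return; // guard: skip resolution entirely for an absent operand

  // A present operand is assumed to come with exactly one parsed type.
  assert(!types.empty() && "present operand must have a type");
  resolved.push_back(types.front());
}
```

Without the early return, the absent case would read the first element of an empty vector, which is the out-of-bounds access described above.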
[AMDGPU] Introduce asyncmark/wait intrinsics (#180467)
Asynchronous operations are memory transfers (usually between the global
memory and LDS) that are completed independently at an unspecified
scope. A thread that requests one or more asynchronous transfers can use
async marks to track their completion. The thread waits for each mark to
be completed, which indicates that requests initiated in program order
before this mark have also completed.
For now, we implement asyncmark/wait operations on pre-GFX12
architectures that support "LDS DMA" operations. Future work will extend
support to GFX12Plus architectures that support "true" async operations.
This is part of a stack split out from #173259:
- #180467
- #180466
Co-authored-by: Ryan Mitchell ryan.mitchell at amd.com
Fixes: SWDEV-521121
[libc++][test] Include `<ios>` and `<ctime>` in tests for `time` locale facets (#179986)
Add inclusion of `<ios>` and `<ctime>` to ensure that the definitions of `std::basic_ios` and `std::tm` are available.
As a drive-by fix, change uses of `tm` to `std::tm`. The latter is guaranteed to be available in `<ctime>`, but the former isn't.
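
A minimal sketch of the qualified usage follows. The helper `makeDate` is hypothetical and does not appear in the tests; it only shows `std::tm` being used with `<ctime>` included directly.

```cpp
#include <cassert>
#include <ctime>

// Builds a broken-down time for a calendar date using the qualified
// name std::tm, which <ctime> is guaranteed to declare.
std::tm makeDate(int year, int month, int day) {
  std::tm t{};             // zero-initialize every field
  t.tm_year = year - 1900; // tm_year counts years since 1900
  t.tm_mon = month - 1;    // tm_mon is zero-based
  t.tm_mday = day;
  return t;
}
```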
[CIR] Add CIRGen support for static local variables with non-constant initializers
This adds CIRGen infrastructure for C++ function-local static variables
that require guarded initialization (Itanium C++ ABI).
Changes:
- Add ASTVarDeclAttr to carry VarDecl AST through the pipeline
- Add emitGuardedInit() to CIRGenCXXABI for guarded initialization
- Add emitCXXGuardedInit() to CIRGenFunction
- Replace NYI in addInitializerToStaticVarDecl() with ctor region emission
- Set static_local attribute on GlobalOp and GetGlobalOp
The global's ctor region contains the initialization code, which will be
lowered by LoweringPrepare to emit the actual guard variable pattern with
__cxa_guard_acquire/__cxa_guard_release calls.
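
For reference, this is the kind of source construct that needs guarded initialization. The names `initCount`, `computeInitialValue` and `getValue` are made up for this sketch and do not come from the patch or its tests.

```cpp
#include <cassert>

// Counts how many times the initializer below actually runs.
int initCount = 0;

// Non-constant initializer: it has a side effect, so it cannot be
// evaluated at compile time.
int computeInitialValue() {
  ++initCount;
  return 42;
}

// The function-local static requires guarded initialization: the first
// call runs the initializer, later calls reuse the stored value.
int getValue() {
  static int value = computeInitialValue();
  return value;
}
```

Calling `getValue()` repeatedly runs `computeInitialValue()` only once, which is the behavior the guard variable pattern has to preserve.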
[NewPM] Port x86-insert-x87-wait (#180128)
Similar to other ports created by @aidenboom154. No specific test
coverage as there are no MIR->MIR tests that exercise this pass. In
keeping with other naming conventions, I renamed WaitInsert to
X86InsertX87WaitLegacy.
[MLIR][Affine] Remove restriction in slice validity check on symbols (#180709)
Remove a restriction in the affine analysis utility that checks slice
validity. The check was still bailing out unnecessarily even after the
underlying methods had been extended. This update enables fusion of
affine nests with symbolic bounds.
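
As a rough picture of what fusion with a symbolic bound means, the C++ sketch below compares two loops over a runtime trip count `n` with their fused form. It is an analogy only, not MLIR, and the function names `unfused` and `fused` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Two separate loops whose trip count is the runtime value n, which
// plays the role of a symbol in the loop bounds.
std::vector<int> unfused(const std::vector<int> &in, std::size_t n) {
  assert(n <= in.size());
  std::vector<int> tmp(n), out(n);
  for (std::size_t i = 0; i < n; ++i)
    tmp[i] = in[i] * 2; // producer loop
  for (std::size_t i = 0; i < n; ++i)
    out[i] = tmp[i] + 1; // consumer loop
  return out;
}

// Fused form: the producer is computed inside the consumer loop, so the
// intermediate buffer disappears.
std::vector<int> fused(const std::vector<int> &in, std::size_t n) {
  assert(n <= in.size());
  std::vector<int> out(n);
  for (std::size_t i = 0; i < n; ++i)
    out[i] = in[i] * 2 + 1;
  return out;
}
```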
Fixes: https://github.com/llvm/llvm-project/issues/61784
Based on and revived from https://reviews.llvm.org/D148559 from
@anoopjs.
[Flang][OpenMP] Fix visibility of user-defined reductions for derived types and module imports (#180552)
User-defined reductions declared in a module were not visible to
programs that imported the module via USE statements, causing valid code
to be incorrectly rejected. The reduction identifier defined in the
module scope wasn't being found during semantic analysis of the main
program.
Ref: OpenMP Spec 5.1
_"If a directive appears in the specification part of a module then the
behavior is as if that directive, with the variables, types and
procedures that have PRIVATE accessibility omitted, appears in the
specification part of any compilation unit that references the module
unless otherwise specified"_
Fixes: https://github.com/llvm/llvm-project/issues/176279
Co-authored-by: Chandra Ghale <ghale at pe31.hpc.amslabs.hpecorp.net>
[clang] Ensure -mno-outline adds attributes
Before this change, `-mno-outline` and `-moutline` only controlled the
pass pipelines for the invoked compiler/linker.
The drawback of this implementation is that, when using LTO, only the
flag provided to the linker invocation is honoured (and any files which
individually use `-mno-outline` will have that flag ignored).
This change serialises the `-mno-outline` flag into each function's
IR/Bitcode, so that we can correctly disable outlining from functions in
files which disabled outlining, without affecting outlining choices for
functions from other files. This matches how other optimisation flags
are handled so the IR/Bitcode can be correctly merged during LTO.
[clang] Add clang::nooutline Attribute
This change:
- Adds a `[[clang::nooutline]]` function attribute for C and C++. There
is no equivalent GNU syntax for this attribute, so no `__attribute__`
syntax.
- Uses the presence of `[[clang::nooutline]]` to add the `nooutline`
attribute to IR function definitions.
- Adds test for the above.
The `nooutline` attribute disables both the Machine Outliner (enabled at
Oz for some targets), and the IR Outliner (disabled by default).
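
A usage sketch, assuming a compiler that may or may not know the attribute. The macro `NO_OUTLINE` and the function `checksum` are hypothetical names for this example; only `[[clang::nooutline]]` comes from this change.

```cpp
#include <cassert>

// Expand to the attribute only when the compiler reports support for
// it; otherwise expand to nothing so other compilers stay quiet.
#if defined(__has_cpp_attribute)
#if __has_cpp_attribute(clang::nooutline)
#define NO_OUTLINE [[clang::nooutline]]
#endif
#endif
#ifndef NO_OUTLINE
#define NO_OUTLINE
#endif

// Outlining is disabled for this function when the attribute is
// available; its behavior is otherwise unchanged.
NO_OUTLINE int checksum(const unsigned char *data, int size) {
  int sum = 0;
  for (int i = 0; i < size; ++i)
    sum += data[i];
  return sum;
}
```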
[outliners] Turn nooutline into an Enum Attribute (#163665)
This change turns the `"nooutline"` attribute into an enum attribute
called `nooutline`, and adds an auto-upgrader for bitcode to make the
same change to existing IR.
This IR attribute disables both the Machine Outliner (enabled at Oz for
some targets), and the IR Outliner (disabled by default).
[AMDGPU] Asynchronous loads from global/buffer to LDS on pre-GFX12 (#180466)
The existing "LDS DMA" builtins/intrinsics copy data from global/buffer
pointer to LDS. These are now augmented with their ".async" version,
where the compiler does not automatically track completion. The
completion is now tracked using explicit mark/wait intrinsics, which
must be inserted by the user. This makes it possible to write programs
with efficient waits in software pipeline loops. The program can now
wait for only the oldest outstanding operations to finish, while
launching more operations for later use.
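
The waiting pattern can be pictured with a toy host-side model in C++. Nothing here is a real builtin or intrinsic: `AsyncTracker` and its methods are hypothetical, and the meaning of the `wait` argument (how many marks may remain outstanding) is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Toy model of mark/wait tracking; not the AMDGPU intrinsics.
class AsyncTracker {
public:
  // Records one asynchronous transfer request in program order.
  void issue() { ++sinceLastMark_; }

  // Places a mark that covers every request issued since the last mark.
  void mark() {
    marks_.push_back(sinceLastMark_);
    sinceLastMark_ = 0;
  }

  // Completes the oldest marks until at most `allowed` marks remain
  // outstanding. Returns the number of transfers that completed.
  std::size_t wait(std::size_t allowed) {
    std::size_t completed = 0;
    while (marks_.size() > allowed) {
      completed += marks_.front();
      marks_.pop_front();
    }
    return completed;
  }

  std::size_t outstandingMarks() const { return marks_.size(); }

private:
  std::deque<std::size_t> marks_; // transfers covered by each mark
  std::size_t sinceLastMark_ = 0; // transfers not yet covered by a mark
};
```

In a software-pipelined loop, each iteration would issue and mark the transfers for a later stage, then wait while leaving the newest marks outstanding, so only the oldest batch has to finish before its data is consumed.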
This change only contains the new names of the builtins/intrinsics,
which continue to behave exactly like their non-async counterparts. A
later change will implement the actual mark/wait semantics in
SIInsertWaitcnts.
This is part of a stack split out from #173259:
- #180467
- #180466
Fixes: SWDEV-521121