[ORC] Flush streams in WaitingOnGraphOpStreamRecorder record ops.
This lets us get useful recordings out of JIT sessions that crash or are kept
alive indefinitely. (Note that an 'end' operation will have to be appended to
the output in these cases.)
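A minimal sketch of the idea, not the actual WaitingOnGraphOpStreamRecorder code: the class name, the line-based text format, and the `finalizeRecording` helper are all hypothetical. It shows why flushing after each record keeps the output usable after a crash, and how a missing 'end' could be appended afterwards.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical recorder that writes one operation per line.
class FlushingOpRecorder {
public:
  explicit FlushingOpRecorder(std::ostream &OS) : OS(OS) {}

  // Write one record and flush right away, so the record reaches the
  // underlying stream even if the process dies before a clean shutdown.
  void recordOp(const std::string &Op) {
    assert(!Op.empty() && "operation record must be non-empty");
    OS << Op << '\n';
    OS.flush();
  }

  // Written only on a clean shutdown (assumed format: a line reading "end").
  void recordEnd() { recordOp("end"); }

private:
  std::ostream &OS;
};

// Append the missing 'end' line to a recording that was cut short.
std::string finalizeRecording(std::string Recording) {
  const std::string End = "end\n";
  const std::string Tail = "\n" + End;
  bool HasEnd =
      Recording == End ||
      (Recording.size() >= Tail.size() &&
       Recording.compare(Recording.size() - Tail.size(), Tail.size(), Tail) == 0);
  if (!HasEnd)
    Recording += End;
  return Recording;
}
```

A recording taken from a crashed session would be passed through `finalizeRecording` before replay; one that already ends with 'end' is returned unchanged.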
[MLIR][Func] Return nullptr for empty ResultAttrs (#185219)
Fixes #185156
When an empty res_attrs is passed manually, we should still return
nullptr to indicate that no results have attributes.
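The intended behavior can be sketched as follows. This is an illustration only: `AttrList` and `getAllResultAttrs` are hypothetical stand-ins, not the MLIR API. The point is that an explicitly empty list and an unset list are both reported as nullptr.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-result attribute storage.
using AttrList = std::vector<std::string>;

// Returns nullptr when no result carries attributes. That covers both an
// unset list (null input) and a list that was manually set to empty.
const AttrList *getAllResultAttrs(const AttrList *ResAttrs) {
  if (!ResAttrs || ResAttrs->empty())
    return nullptr;
  return ResAttrs;
}
```

Callers can then test the returned pointer alone, without an extra emptiness check.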
[OpenMP] Add definitions of FLATTEN and SPLIT to OMP.td
Add the definitions of the "flatten" and the "split" constructs
to the OMP.td file. This will allow the implementation efforts
in clang and flang to proceed independently.
There is no other functionality added in this patch.
[clang-tidy] Fix false negative in `readability-simplify-subscript-expr` when subscripting substituted types (#185570)
This check's bespoke method of avoiding matching
in template instantiations is overeager. This commit
changes it to just rely on IgnoreUnlessSpelledInSource
traversal instead. This is the same problem
as in #185559.
[mlir][OpenMP] Allow tile composition (#185380)
The verifier of the TileOp did not allow composition of multiple
transformations out of precaution. However, composition works, therefore
remove the "currently only supports omp.canonical_loop as applyee" check
and add regression tests.
[CIR][AArch64] Add support for the remaining `vceqz` builtins (#185440)
Implement the remaining CIR lowerings for the AdvSIMD (Neon)
`vceqz` intrinsic group (bitwise equal to zero).
Most variants of `vceqz` were already supported; this patch
completes the rest of the group [1] that was left as a TODO.
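For reference, the lane-wise semantics of an integer compare-equal-to-zero can be modeled in scalar code. This is a sketch of the 8-lane, 8-bit case only; the function name is hypothetical and this is not the CIR lowering itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar reference model: each output lane is all ones (0xFF) if the
// corresponding input lane is zero, and all zeros otherwise.
std::array<uint8_t, 8> vceqzS8Reference(const std::array<int8_t, 8> &V) {
  std::array<uint8_t, 8> R{};
  for (std::size_t I = 0; I < V.size(); ++I)
    R[I] = V[I] == 0 ? 0xFF : 0x00;
  return R;
}
```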
Tests for these intrinsics are moved from:
* test/CodeGen/AArch64/neon_intrinsics.c
* test/CodeGen/AArch64/v8.2a-fp16-intrinsics.c
to:
* test/CodeGen/AArch64/neon/intrinsics.c
* test/CodeGen/AArch64/neon/fullfp16,
respectively.
The implementation largely mirrors the existing lowering in
[3 lines not shown]
[TableGen] Do not order register classes based on heap addresses
Compare registers using their enum values instead, which I
suspect was the intention in the first place, since we already
have lexicographical ordering defined for CodeGenRegisters.
This does not cause any changes in .inc files and is likely
NFC, but it's still best to have it be deterministic.
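A minimal sketch of the difference, using hypothetical stand-in types rather than the real CodeGenRegister classes: comparing pointers would order by heap address, which can vary between runs, whereas comparing enum values depends only on the register definitions.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a register with an assigned enum value.
struct Reg {
  unsigned EnumValue;
};

// Hypothetical stand-in for a register class: an ordered member list.
using RegClass = std::vector<const Reg *>;

// Deterministic: looks at the enum value, never at the pointer itself.
bool lessByEnumValue(const Reg *A, const Reg *B) {
  return A->EnumValue < B->EnumValue;
}

// Order two classes lexicographically by their members' enum values.
bool classLess(const RegClass &A, const RegClass &B) {
  return std::lexicographical_compare(A.begin(), A.end(), B.begin(), B.end(),
                                      lessByEnumValue);
}
```

With this comparator the result is the same regardless of where the `Reg` objects happen to be allocated.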
[libclc][CMake] Append target_name to external-funcs test target name (#185639)
Avoid name conflicts when multiple libraries use the same target
triple.
Reapply "Reapply "[clang][ssaf] Add --ssaf-extract-summaries= and --ssaf-tu-summary-file= options"" (#185616)
This reverts commit 9a1c63230b8ad3f19cb624f0d283f7df10957ab7.
1st attempt: #184421
2nd attempt: #185414
Third time the charm!
rdar://172173836
[SPIRV] Add support for emitting DebugFunction debug info instructions
This commit adds support for emitting SPIRV DebugFunction and
DebugFunctionDefinition instructions for function definitions.
[TableGen] Fix inferring missing sub-classes for various subreg indices
We should not imply artificial registers have sub-registers for a given
index even if the class is known to 'fully support' that index.
Fixes crashes reported in
https://github.com/llvm/llvm-project/pull/183371#discussion_r2905495313
[NFC][SPIRV] Extract helper functions in SPIRVEmitNonSemanticDI
This commit extracts reusable helper functions to improve code
organization and reduce duplication. This is a pure refactoring
that does not change behavior.
These helpers will be used in subsequent commits to refactor
emitGlobalDI and add function-level debug info emission.
[SPIRV] Refactor NonSemantic debug info placement logic.
Refactor the logic for determining which NonSemantic.Shader.DebugInfo.100
instructions should be placed in the global section from a whitelist
to a blacklist approach.
[PowerPC][NFC] Clean up code in RegisterInfo.td (#185520)
Just some cleanup work. Move non-register-related operands to
PPCOperands.td and the PatLeaf def to PPCInstrInfo.td.
[VPlan] Materialize VectorTripCount in narrowInterleaveGroups. (#182146)
When narrowInterleaveGroups transforms a plan, VF and VFxUF are
materialized (replaced with concrete values). This patch also
materializes the VectorTripCount in the same transform.
This ensures that VectorTripCount is properly computed when the narrow
interleave transform is applied, instead of using the original VF
+ UF to compute the vector trip count. The previous behavior generated
correct code, but executed fewer iterations in the vector loop.
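A worked example of the effect, assuming the usual round-down formula (no tail folding, no required scalar epilogue) and hypothetical numbers: if the narrowed loop advances by a smaller step than the original VF * UF, computing the vector trip count from the original step leaves iterations to the scalar remainder that the vector loop could have handled.

```cpp
#include <cassert>
#include <cstdint>

// Number of original iterations covered by the vector loop when it
// advances by Step iterations at a time (assumed round-down formula).
uint64_t vectorTripCount(uint64_t TripCount, uint64_t Step) {
  assert(Step != 0 && "step must be non-zero");
  return TripCount - TripCount % Step;
}
```

With a trip count of 10, an original step of VF * UF = 4 * 2 = 8 gives `vectorTripCount(10, 8) == 8`, while a hypothetical narrowed step of 2 gives `vectorTripCount(10, 2) == 10`. Both results are correct, but the first runs two more iterations in the scalar remainder.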
The change also enables stricter verification to prevent accesses of UF,
VF, VFxUF etc. after materialization, as a follow-up.
Note that in some cases we now miss branch folding, but that should be
addressed separately: https://github.com/llvm/llvm-project/pull/181252
Fixes one of the violations of accessing a VectorTripCount after UF and VF
have been materialized.
PR: https://github.com/llvm/llvm-project/pull/182146
[lldb][test] PlatformDarwinTest.cpp: add more test-cases for script name sanitization
Adds more test-cases since I'm making changes around this area (see https://github.com/llvm/llvm-project/pull/185627).
libclc: Use elementwise exp for exp functions (#185626)
For amdgpu use the exp intrinsic. Really, this should be
the default generic implementation. But we're stuck in a
mess where essentially nothing works. All of the exp
intrinsics work for AMDGPU, but aren't really implemented
for spirv or nvptx. Ideally the intrinsic and/or libm call
would be the default implementation.