[lldb] Have ObjectFile::FindPlugin send a copy of the DE (#185727)
ObjectFile::FindPlugin iterates over plugins to find one that can handle
the binary provided. It currently passes the same DataExtractorSP to
each subclass, but some subclasses may modify this DataExtractor during
their processing, e.g. by calling DataExtractor::SetData on it. I think
it is safer to isolate the plugins by giving each one a copy of the
DataExtractor, so the order in which plugins are tried cannot change
behavior.
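The isolation idea can be sketched outside of lldb. The following is a minimal illustration, not the actual patch: Extractor, PluginFn, ClobberingPlugin, ElfPlugin and both FindPlugin* helpers are hypothetical stand-ins, and only the "copy per plugin" pattern reflects the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a data extractor: it just owns some bytes.
struct Extractor {
  std::vector<uint8_t> bytes;
  void SetData(std::vector<uint8_t> b) { bytes = std::move(b); }
};
using ExtractorSP = std::shared_ptr<Extractor>;

// Hypothetical plugin callback: returns true if it can handle the binary.
using PluginFn = std::function<bool(ExtractorSP)>;

// A plugin that overwrites the extractor's data and then rejects the binary.
bool ClobberingPlugin(ExtractorSP data) {
  data->SetData({});
  return false;
}

// A plugin that accepts data starting with the ELF magic bytes.
bool ElfPlugin(ExtractorSP data) {
  const std::vector<uint8_t> &b = data->bytes;
  return b.size() >= 4 && b[0] == 0x7f && b[1] == 'E' && b[2] == 'L' &&
         b[3] == 'F';
}

// Shared variant: every plugin sees (and may mutate) the same object.
int FindPluginShared(const ExtractorSP &data,
                     const std::vector<PluginFn> &plugins) {
  for (std::size_t i = 0; i < plugins.size(); ++i)
    if (plugins[i](data))
      return static_cast<int>(i);
  return -1;
}

// Isolated variant: each plugin gets its own copy of the extractor.
int FindPluginIsolated(const ExtractorSP &data,
                       const std::vector<PluginFn> &plugins) {
  for (std::size_t i = 0; i < plugins.size(); ++i)
    if (plugins[i](std::make_shared<Extractor>(*data)))
      return static_cast<int>(i);
  return -1;
}

// Runs both variants on the same input; returns the chosen plugin index.
int RunDemo(bool isolated) {
  auto data = std::make_shared<Extractor>();
  data->SetData({0x7f, 'E', 'L', 'F'});
  std::vector<PluginFn> plugins = {ClobberingPlugin, ElfPlugin};
  return isolated ? FindPluginIsolated(data, plugins)
                  : FindPluginShared(data, plugins);
}
```

In this sketch the shared variant finds no plugin because the first plugin wiped the data, while the isolated variant still reaches the second plugin with the original bytes.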
[clang] fix explicit incomplete enum (#184210)
Stop BuildConvertedConstantExpression early for already-broken
expressions to prevent crashes during constant conversion.
Fixes #183887
[mlir][acc] Add ACCComputeLowering pass (#185501)
Introduce a pass that lowers OpenACC compute constructs to a
representation that separates the data environment from the compute body
and prepares for parallelism assignment and privatization at the right
granularity.
- Decompose acc.parallel, acc.serial, and acc.kernels into
acc.kernel_environment and acc.compute_region. Launch arguments
(num_gangs, num_workers, vector_length) are turned into acc.par_width
and passed as compute_region launch operands.
- Convert acc.loop to SCF based on context: unstructured loops to
scf.execute_region; sequential (serial or seq) to scf.parallel with
par_dims=sequential; auto loops to scf.for (with collapse when
multi-dimensional); orphan loops to scf.for; independent loops in
parallel/kernels to scf.parallel with par_dims from the GPU mapping.
---------
Co-authored-by: Scott Manley <rscottmanley at gmail.com>
libclc: Add div_cr utility function
This is a workaround for the modal div operator precision. The
OpenCL default is not correctly rounded, so this provides a backdoor
to get a correctly rounded fdiv. Ideally clang would have a builtin
or some other mechanism to control the precision.
virtio: Restore mb() calls
Until an issue seen on amd64 can be investigated, restore two mb() calls
to virtio.
Reviewed by: andrew
Fixes: c499ad6f997c ("virtio: Use bus_dma for ring and indirect buffer allocations")
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D55766
Use findAllocaInsertPoint when possible and move the affinity packing logic to OpenMPToLLVMIRTranslation
- Move the omp.affinity_list packing logic from OMPIRBuilder to
OpenMPToLLVMIRTranslation so that we have all the omp.affinity_list
allocating logic inside the lambda defined in buildAffinityData
- All the allocation logic for the affinity list now uses
findAllocaInsertPoint when possible (static count)
- `task_affinity_iterator_dynamic_tripcount` in
openmp-iterator.mlir is a regression test added previously for
dynamic tripcount
[lld][WebAssembly] Restore inactive checks in relocatable.ll test. NFC (#185569)
Back in 6474d1b20 this test was updated, removing the NORMAL vs SHARED
distinction in the output checking. However many of the NORMAL-NEXT
lines were left unmodified, making them effectively disabled.
This restores and updates the expectations.
[compiler-rt] Initial support for building profile library on the GPU (#185552)
Summary:
As suggested in https://github.com/llvm/llvm-project/pull/177665, we
should build a GPU version of the compiler-rt profile library instead of
writing it in-line in the lowering. This PR does not define anything GPU
specific; it simply re-uses the baremetal handling. Later PRs will
provide the GPU specific handling we would want to do to optimize
counter handling on the GPU.
Note that this will require using the cache file, or setting these
options manually for existing users. Hopefully if people are using the
cache file as they should it won't break anything.
[AMDGPU] Adds AGPR pressure during candidate init in GCN scheduler.
Scheduling heuristics will now automatically consider AGPR pressure.
AGPRExcessLimit and AGPRCriticalLimit are added. Some of the VGPR
bias and error limits are reused. The added helpers mostly mirror the
existing VGPR logic. A ConsiderAGPR boolean controls whether AGPRs
should be factored in at all during candidate initialization, e.g.
on targets with allocatable AGPRs.
Verified that updated LIT tests use AGPRs.
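As a rough illustration of the gating described above, the sketch below folds AGPR pressure into a candidate's excess only when a ConsiderAGPR flag is set. This is not the scheduler's code: PressureLimits, ExcessOver and CandidateExcess are hypothetical names, and summing the two excesses is a simplifying assumption made for the example.

```cpp
#include <cassert>

// Hypothetical, simplified limits; not the scheduler's real data structures.
struct PressureLimits {
  unsigned VGPRExcessLimit;
  unsigned AGPRExcessLimit;
};

// How far a pressure value is above its limit (0 when within the limit).
unsigned ExcessOver(unsigned Pressure, unsigned Limit) {
  return Pressure > Limit ? Pressure - Limit : 0;
}

// Excess attributed to a candidate. AGPR pressure contributes only when
// ConsiderAGPR is set; adding the two excesses is an assumption of this
// sketch, not a description of the real heuristic.
unsigned CandidateExcess(unsigned VGPRPressure, unsigned AGPRPressure,
                         const PressureLimits &Limits, bool ConsiderAGPR) {
  unsigned Excess = ExcessOver(VGPRPressure, Limits.VGPRExcessLimit);
  if (ConsiderAGPR)
    Excess += ExcessOver(AGPRPressure, Limits.AGPRExcessLimit);
  return Excess;
}
```

With the flag off, a candidate that only exceeds the AGPR limit looks pressure-free; with it on, the same candidate reports the overshoot.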
Originally Authored-by: Nicholas Baron
(https://github.com/llvm/llvm-project/pull/150288)
Modified-by: Dhruva Chakrabarti
Assisted-by: Cursor