[flang] Fix abort on invalid -fdo-concurrent-to-openmp value. (#193929)
We observed that the following command can trigger an assertion failure:
`flang -fopenmp -fdo-concurrent-to-openmp=devic,e <file>`
This happened because `parseDoConcurrentMapping` reported an error but
still called `val.value()` on failure, tripping std::optional
assertions.
The fix is to return false on error and wire the return value into
`createFromArgs`.
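A minimal sketch of the pattern described above, not the actual flang code. The enum, the helper names (`lookupMapping`, `parseMappingSketch`, `createFromArgsSketch`), and the accepted spellings are assumptions made for illustration; only the idea of returning false before touching the empty optional comes from this change.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the mapping kinds; not flang's real type.
enum class MappingKind { None, Host, Device };

// Hypothetical lookup: yields std::nullopt for anything unrecognised.
std::optional<MappingKind> lookupMapping(const std::string &text) {
  if (text == "none") return MappingKind::None;
  if (text == "host") return MappingKind::Host;
  if (text == "device") return MappingKind::Device;
  return std::nullopt;
}

// Reports the error and returns false instead of falling through to
// val.value(), which is what tripped the assertion.
bool parseMappingSketch(const std::string &text, MappingKind &out,
                        std::string &diag) {
  std::optional<MappingKind> val = lookupMapping(text);
  if (!val) {
    diag = "invalid value '" + text + "'";
    return false; // early exit: `out` is left untouched
  }
  assert(val.has_value());
  out = val.value();
  return true;
}

// The caller folds the parser's result into its own success flag.
bool createFromArgsSketch(const std::string &arg, MappingKind &out,
                          std::string &diag) {
  bool success = true;
  success &= parseMappingSketch(arg, out, diag);
  return success;
}
```

With this shape, an input such as `devic,e` produces a diagnostic and a false return rather than an abort.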
CodeGen: Fix double counting bundles in inst size verification
The AMDGPU implementation handles bundles by summing the
member instructions. The size verification was starting with the size
of the bundle instruction, then re-adding all of the same instructions.
This loop is over the iterator, not the instr_iterator, so it should
not look through the bundled instructions. Most of the other
uses of getInstSizeInBytes are also on the iterator, not the
instr_iterator, so the convention seems to be that targets need to
handle BUNDLE correctly themselves.
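The double counting can be illustrated with a toy model. Everything below is hypothetical (the `TopLevel` struct and both `blockSize*` functions are invented for the example and do not mirror LLVM's MachineInstr API); it only assumes, as stated above, that the target's size hook already sums a bundle's members.

```cpp
#include <cassert>
#include <vector>

// Toy top-level entry: a plain instruction, or a bundle header whose
// members are listed in memberSizes (empty for a plain instruction).
struct TopLevel {
  unsigned ownSize;
  std::vector<unsigned> memberSizes;
};

// Models a target hook that handles bundles by summing the members.
unsigned getInstSizeInBytes(const TopLevel &mi) {
  if (mi.memberSizes.empty())
    return mi.ownSize;
  unsigned total = 0;
  for (unsigned size : mi.memberSizes)
    total += size;
  return total;
}

// Walks top-level entries only and trusts the hook for bundles.
unsigned blockSize(const std::vector<TopLevel> &block) {
  unsigned total = 0;
  for (const TopLevel &mi : block)
    total += getInstSizeInBytes(mi);
  return total;
}

// Old behaviour in this model: start from the bundle's size, then add
// every member again.
unsigned blockSizeDoubleCounted(const std::vector<TopLevel> &block) {
  unsigned total = 0;
  for (const TopLevel &mi : block) {
    total += getInstSizeInBytes(mi);
    for (unsigned size : mi.memberSizes)
      total += size;
  }
  return total;
}
```

For a block holding one 4-byte instruction and one bundle of a 4-byte and an 8-byte instruction, the first function yields 16 and the second 28.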
[DirectX] Apply DXIL op fnattrs to declarations (#193622)
We need to apply DXIL op attributes to the functions themselves, and all
DXIL ops should have the `nounwind` attribute. This matches the DXC
behaviour and what consumers like WARP's GPU-based validation expect.
Fixes #193620
[libclc] Make sure PACKAGE_VERSION is set for libclc (#193966)
Summary:
This can be unset because CMake does not expose it as a raw variable
when you use the find_package interface. If it is not set, as in the
case of standalone builds, the clang resource directory won't be found.
Revert "[PreISelIntrinsicLowering] Expand binary elementwise intrinsics (#193552) (#193580) (#193990)
This reverts commit a1d11348aba4b70295ef9ada4a13a722455165d3.
Two problems have been identified:
- The expansion for powi was functionally wrong - the second argument
stays scalar even when the first is a vector.
- AArch64 uses the Expand option on libcalls to select vector libcalls
in some cases. The test coverage for this uses -start-after which
happens to miss this pass.
[MLIR][XeGPU] Remove use-by-broadcast-only restriction for ShapeCast op in Wg-to-Sg distribution pass (#193640)
The WgToSgVectorShapeCastOp pattern previously required that
vector.shape_cast operations expanding unit dimensions could only be
used by vector.broadcast operations. This constraint is no longer
necessary after the recent refactoring.
[compiler-rt][TySan] Add Hexagon target support (#191603)
Add shadow memory mapping for Hexagon (32-bit architecture) and enable
the TySan build for the Hexagon target.
Hexagon uses a 4-byte shadow entry (PtrShift=2) with the shadow region
at 0x80000000-0xBFFFFFFF (1GB). A 28-bit mask (kAppMemMsk) covers 256MB
of app address space; addresses differing only in bits 28-31 alias in
the shadow. kAppAddr is set to 0xC0000000 to size the mmap to exactly
the 1GB shadow region.
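The arithmetic above can be checked with a small sketch. It assumes the usual mask, shift, and offset form of shadow mapping; the names `kShadowBase`, `kPtrShift`, and `appToShadow` are chosen for the example and are not taken from the TySan sources, while the constants are the ones quoted in this message.

```cpp
#include <cassert>
#include <cstdint>

// Constants quoted above; kShadowBase and kPtrShift are assumed names.
constexpr uint32_t kShadowBase = 0x80000000u; // start of shadow region
constexpr uint32_t kAppMemMsk = 0x0FFFFFFFu;  // 28-bit mask, 256MB
constexpr uint32_t kPtrShift = 2;             // 4-byte shadow entries
constexpr uint32_t kAppAddr = 0xC0000000u;    // end of the shadow mmap

// Assumed mapping: mask the app address, scale by the entry size,
// then offset into the shadow region.
constexpr uint32_t appToShadow(uint32_t app) {
  return kShadowBase + ((app & kAppMemMsk) << kPtrShift);
}

// 256MB of app space times 4 bytes per entry is exactly 1GB.
static_assert(((kAppMemMsk + 1u) << kPtrShift) == 0x40000000u,
              "shadow size is 1GB");
static_assert(kAppAddr - kShadowBase == 0x40000000u,
              "mmap covers exactly the shadow region");
// The highest masked address still lands inside 0x80000000-0xBFFFFFFF.
static_assert(appToShadow(0x0FFFFFFFu) == 0xBFFFFFFCu,
              "last shadow entry is in range");
// Addresses that differ only in bits 28-31 share a shadow slot.
static_assert(appToShadow(0x10001234u) == appToShadow(0x00001234u),
              "upper bits alias");
```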
[libc][math] Refactor dmul family to header-only (#182151)
Refactors the dmul math family to be header-only.
Closes https://github.com/llvm/llvm-project/issues/182150
Target Functions:
- dmulf128
- dmull
[AMDGPU][Disassembler] Permit unneeded VOPD3 operands to be non-zero (#193974)
Use ? instead of 0 in the tablegen definitions for those unused operands
of VOPD3 instructions.
This enables the instruction to be disassembled regardless of what bits
are in those fields, which helps diagnose broken code. Previously, the
disassembler would reject these.
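As a simplified model of why this matters, consider a decoder that accepts a word when `(word & mask) == value`. The 16-bit layout, the `Pattern` struct, and the constant names below are invented for illustration and are not the generated AMDGPU decoder tables; the sketch only shows how a fixed-zero field constrains the match while a don't-care field does not.

```cpp
#include <cassert>
#include <cstdint>

// A decode pattern: bits set in `mask` must equal the bits in `value`.
struct Pattern {
  uint32_t mask;
  uint32_t value;
};

constexpr bool matches(Pattern p, uint32_t word) {
  return (word & p.mask) == p.value;
}

// Hypothetical layout: bits [15:8] hold opcode 0xA5, bits [3:0] are an
// operand field that this instruction does not use.
// Unused field written as 0: those bits are part of the mask.
constexpr Pattern kUnusedAsZero{0xFF0Fu, 0xA500u};
// Unused field written as ?: those bits are left out of the mask.
constexpr Pattern kUnusedAsDontCare{0xFF00u, 0xA500u};
```

Under this model the word `0xA503` is rejected by the first pattern and accepted by the second, while a word with a different opcode is rejected by both.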
[lldb][AArch64][Linux] Rename "por" register to "por_el0" (#193983)
As agreed with my Arm colleagues working on GDB.
The suffix means we match the architectural name exactly, which
reduces confusion if you're debugging multiple exception levels, where
there could be a por_el<N> as well.
In the process of updating the tests I found some
"register read" output has changed alignment so I
have fixed that too.
[flang][NFC] Converted five tests from old lowering to new lowering (part 48) (#193889)
Tests converted from test/Lower: pointer-args-caller.f90,
pointer-assignments.f90, pointer-association-polymorphic.f90,
pointer-default-init.f90, pointer-disassociate.f90
[clangd] [C++20] [Modules] Introduce GC for clangd built modules (#193973)
This patch introduces a simple GC for clangd-built module files to keep
the clangd-built module cache from growing without bound.
The strategy is: within a clangd-built module file cache, if a module
file (we assume all PCM files in the clangd cache are built by clangd)
has not been accessed within a threshold (3 days by default, controlled
by --modules-builder-versioned-gc-threshold-seconds), clangd removes it.
The strategy is not perfect. For example, on some systems atime is
reportedly disabled or not updated. But given the trade-off between
usability and maintainability, I feel the current strategy is fine.
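The selection rule can be sketched as a pure function over already-collected access times. This is an illustration under stated assumptions, not the clangd implementation: `CachedModule` and `selectStale` are hypothetical names, and how the last-access time is obtained from the file system is deliberately left out.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <vector>

using Clock = std::chrono::system_clock;

// Hypothetical record for one module file found in the cache.
struct CachedModule {
  std::string path;
  Clock::time_point lastAccess;
};

// Returns the paths whose last access is older than `threshold`.
// The caller would then delete these files.
std::vector<std::string> selectStale(const std::vector<CachedModule> &cache,
                                     Clock::time_point now,
                                     std::chrono::seconds threshold) {
  assert(threshold.count() >= 0);
  std::vector<std::string> stale;
  for (const CachedModule &module : cache)
    if (now - module.lastAccess > threshold)
      stale.push_back(module.path);
  return stale;
}
```

Passing `now` in as a parameter keeps the rule testable without touching the clock or the disk.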
AI assisted.
[libclc] Use 'LLVM_DEFAULT_TARGET_TRIPLE' instead of 'LLVM_RUNTIMES_TARGET' (#193969)
Summary:
The 'LLVM_RUNTIMES_TARGET' variable is the raw value used by the LLVM
CMake. It can contain multilib arguments which will not compile when
used as a triple. The more canonical value is
`LLVM_DEFAULT_TARGET_TRIPLE`, which is used by flang-rt, libc, openmp,
etc.