Revert "[PreISelIntrinsicLowering] Expand binary elementwise intrinsics (#193552) (#193580) (#193990)
This reverts commit a1d11348aba4b70295ef9ada4a13a722455165d3.
Two problems have been identified:
- The expansion for powi was functionally wrong - the second argument
stays scalar even when the first is a vector.
- AArch64 uses the Expand option on libcalls to select vector libcalls
in some cases. The test coverage for this uses -start-after which
happens to miss this pass.
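The powi issue can be illustrated with a small sketch. It models the intended semantics in Python rather than LLVM IR: the base is a vector, while the exponent is a single scalar integer shared by every lane. The helper names `powi_scalar` and `expand_powi` are hypothetical and are not taken from the reverted patch.

```python
def powi_scalar(base: float, exp: int) -> float:
    """Raise base to an integer power by repeated squaring."""
    if exp < 0:
        return 1.0 / powi_scalar(base, -exp)
    result = 1.0
    while exp:
        if exp & 1:
            result *= base
        base *= base
        exp >>= 1
    return result


def expand_powi(base_vec, exp: int):
    """Elementwise expansion: one scalar call per lane of the base vector."""
    # The exponent is NOT indexed per lane; every lane reuses the same scalar.
    return [powi_scalar(x, exp) for x in base_vec]
```

An expansion that tried to extract a per-lane exponent would be wrong here, because there is only one exponent to begin with.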
[MLIR][XeGPU] Remove use-by-broadcast-only restriction for ShapeCast op in Wg-to-Sg distribution pass (#193640)
The WgToSgVectorShapeCastOp pattern previously required that
vector.shape_cast operations expanding unit dimensions could only be
used by vector.broadcast operations. This constraint is no longer
necessary after the recent refactoring.
[compiler-rt][TySan] Add Hexagon target support (#191603)
Add shadow memory mapping for Hexagon (32-bit architecture) and enable
the TySan build for the Hexagon target.
Hexagon uses a 4-byte shadow entry (PtrShift=2) with the shadow region
at 0x80000000-0xBFFFFFFF (1GB). A 28-bit mask (kAppMemMsk) covers 256MB
of app address space; addresses differing only in bits 28-31 alias in
the shadow. kAppAddr is set to 0xC0000000 to size the mmap to exactly
the 1GB shadow region.
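A minimal sketch of how these constants fit together, assuming the usual mask-shift-offset shadow mapping. The constant values come from the message above; the exact form of the formula and the Python names are assumptions for illustration, not the runtime's code.

```python
K_SHADOW_ADDR = 0x80000000  # start of the shadow region
K_APP_MEM_MSK = 0x0FFFFFFF  # 28-bit mask: 256MB of app address space
K_PTR_SHIFT = 2             # each app byte maps to a 4-byte shadow entry


def shadow_for(app_addr: int) -> int:
    """Map an application address to its (assumed) shadow address."""
    return ((app_addr & K_APP_MEM_MSK) << K_PTR_SHIFT) + K_SHADOW_ADDR
```

Under this assumption the highest masked address lands at 0xBFFFFFFC, so the shadow stays inside the 1GB region, and two addresses that differ only in bits 28-31 map to the same shadow entry.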
[libc][math] Refactor dmul family to header-only (#182151)
Refactors the dmul math family to be header-only.
Closes https://github.com/llvm/llvm-project/issues/182150
Target Functions:
- dmulf128
- dmull
[AMDGPU][Disassembler] Permit unneeded VOPD3 operands to be non-zero (#193974)
Use ? instead of 0 in the tablegen definitions for those unused operands
of VOPD3 instructions.
This enables the instruction to be disassembled regardless of what bits
are in those fields, which helps diagnose broken code. Previously, the
disassembler would reject these.
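The effect of switching a bit from 0 to ? can be sketched as a pattern match with don't-care positions. This is an illustrative model only: the 4-bit patterns and the `matches` helper are hypothetical and do not reflect the real VOPD3 encoding or the generated decoder tables.

```python
def matches(pattern: str, word: int) -> bool:
    """Check an encoding against a pattern of '0', '1' and '?' (MSB first)."""
    width = len(pattern)
    for i, ch in enumerate(pattern):
        bit = (word >> (width - 1 - i)) & 1
        # '?' accepts either bit value; '0' and '1' must match exactly.
        if ch != "?" and int(ch) != bit:
            return False
    return True
```

With the pattern "1000" the word 0b1011 is rejected, while "10??" accepts it because the low two bits are no longer constrained.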
[lldb][AArch64][Linux] Rename "por" register to "por_el0" (#193983)
As agreed with my Arm colleagues working on GDB.
The suffix means we match the architectural name exactly, which reduces
confusion when debugging multiple exception levels, where there could
be por_el<N> registers as well.
In the process of updating the tests I found that the alignment of some
"register read" output had changed, so I have fixed that too.
[flang][NFC] Converted five tests from old lowering to new lowering (part 48) (#193889)
Tests converted from test/Lower: pointer-args-caller.f90,
pointer-assignments.f90, pointer-association-polymorphic.f90,
pointer-default-init.f90, pointer-disassociate.f90
[clangd] [C++20] [Modules] Introduce GC for clangd built modules (#193973)
This patch introduces a simple GC for clangd-built module files so that
the clangd-built module cache does not grow without bound.
The strategy: within a clangd module file cache, if a clangd-built
module (we assume all PCM files in the clangd cache are built by
clangd) has not been accessed for some time (3 days by default,
controlled by --modules-builder-versioned-gc-threshold-seconds), clangd
removes it.
The strategy is not perfect. For example, I have heard that on some
systems atime is disabled or not updated. But as a trade-off between
usability and maintainability, I feel the current strategy is fine.
AI assisted.
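The strategy described above can be sketched roughly as follows. This is a Python model for illustration, not clangd's implementation: the function name `collect_stale_modules`, the `.pcm` suffix check, and the recursive directory walk are assumptions.

```python
import os
import time

THREE_DAYS = 3 * 24 * 60 * 60  # default threshold in seconds


def collect_stale_modules(cache_dir, threshold_seconds=THREE_DAYS, now=None):
    """Remove .pcm files whose last access time is older than the threshold."""
    now = time.time() if now is None else now
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if not name.endswith(".pcm"):
                continue
            path = os.path.join(root, name)
            # Relies on atime; if the filesystem does not update it,
            # the file looks stale even when it is in active use.
            if now - os.stat(path).st_atime > threshold_seconds:
                os.remove(path)
                removed.append(path)
    return removed
```

The comment in the loop marks the weakness the message mentions: the check is only as reliable as the filesystem's atime bookkeeping.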
[libclc] Use 'LLVM_DEFAULT_TARGET_TRIPLE' instead of 'LLVM_RUNTIMES_TARGET' (#193969)
Summary:
The 'LLVM_RUNTIMES_TARGET' variable is the raw value used by the LLVM
CMake. It can contain multilib arguments which will not compile when
used as a triple. The more canonical value is
`LLVM_DEFAULT_TARGET_TRIPLE`, which is used by flang-rt, libc, openmp,
etc.
[BOLT][AArch64] Refuse to run JTFootprintReduction pass (#193946)
JTFootprintReduction results in a no-op on AArch64. This is because it
emits createIJmp32Frag() which is unimplemented for AArch64 and is only
overridden by x86.
- Add a guard for non-x86
- Update unsupported-passes.test with expected error message
[LifetimeSafety] Remerge "Add support for `new`/`delete`" (#193776)
This PR extends LifetimeSafety to support heap allocations via
`new`/`delete`.
# Contents
* Adds a new warning emitted for use-after-free.
* Renames `reportUseAfterFree` to `reportUseAfterScope`, since the old
name was misleading (the warnings are still called `use_after_scope`).
* Adds a new `AccessPath::Kind` value, `NewAllocation`, which is used
for loans issued from `new` allocations.
* Adds `VisitCXXNewExpr` and `VisitCXXDeleteExpr`, which handle loan
issuance and origin propagation for `new` and `delete`.
* Comes with extensive test coverage for the new features, including new
[15 lines not shown]
[CIR] Introduce LocalInitOp, & lower static locals (#193576)
During an investigation of something else, I discovered that our
handling of static-local as a ctor/dtor on a GlobalOp meant that it
couldn't actually be initialized with reference to any local, parameter,
or member declarations. This is obviously problematic.
This patch instead introduces a `LocalInitOp`, which is an operation
that represents the location of initialization for the static local.
This is lowered during lowering-prepare, same as we did previously (in
fact, it uses basically the exact same lowering code, with some slight
modifications).
Lowering from AST itself splits slightly from global declarations, but
the two share implementation as closely as possible.
At the moment, this operation only works for static-locals, and has a
handful of asserts to enforce that. It is intended that the
thread-local-storage use the exact same mechanism, with some slight
modifications to the lowering-prepare pass to introduce the different
init behavior.
[mlir][tosa] Fix integer bilinear (quantized) tosa.resize lowering to use floordivsi (#193821)
## Background
`tosa.resize` in bilinear integer (quantized) mode lowers to a
`linalg.generic` body that, for each output pixel, computes a
corresponding input coordinate and blends the four neighboring input
pixels. The mapping is:
```
val = out_coord * scale_d + offset
index = val / scale_n // integer part — which input pixel to start from
delta = val - index * scale_n // fractional part, scaled to [0, scale_n)
```
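A small sketch of why the rounding mode of the division matters for this mapping. It assumes the pre-fix lowering used a truncating signed division, which is what the switch to floordivsi in the title suggests; the helper names are hypothetical.

```python
def trunc_div(a: int, b: int) -> int:
    """Signed division rounding toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def floor_div(a: int, b: int) -> int:
    """Signed division rounding toward negative infinity."""
    return a // b


def index_and_delta(val: int, scale_n: int, div):
    """Split val into a pixel index and a fractional part using div."""
    index = div(val, scale_n)
    delta = val - index * scale_n
    return index, delta
```

For `val = -3` and `scale_n = 4`, truncation gives `index = 0` and `delta = -3`, which falls outside `[0, scale_n)`. Floor division gives `index = -1` and `delta = 1`. For non-negative `val` the two agree.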
`delta` is the interpolation weight toward the next pixel. The bilinear
formula (integer path) is:
[133 lines not shown]
[flang][NFC] Converted five tests from old lowering to new lowering (part 47) (#193886)
Tests converted from test/Lower: namelist-common-block.f90,
nested-where.f90, nullify.f90,
OpenMP/Todo/omp-default-clause-inner-loop.f90, optional-value-caller.f90