[mlir][tosa] Fix select folder when operands are broadcast (#165481)
This commit addresses a crash in the dialects folder. The currently
folder assumes no broadcasting of the input operand happens and
therefore the folder can complain that the returned value was not the
same
shape as the result.
For now, this commit ensures no folding happens when broadcasting is
involved. In the future, folding with a broadcast could likely be
supported by inserting a `tosa.tile` operation before returning the
operand. This type of transformation is likely better suited for a
canonicalization pass. This commit only aims to avoid the crash.
[flang] simplify pointer assignments (#168732)
Pointer assignment lowering was done in different ways depending on
contexts and types, sometimes still using runtime calls when this is not
needed and the complexity of doing this inline is very limited (the
pointer and target descriptors were already prepared inline, the runtime
is just doing the descriptor assignment and ensuring the pointer
descriptor keep its pointer flag).
Slightly extent the inline version that was used for Forall and use it
for all cases.
When lowering without HLFIR is removed, this will allow removing more
code.
[flang-rt] Fix TypeCategory for quad-precision COMPLEX (#168090)
Modify the TypeCategory for quad-precision COMPLEX to
CFI_type_float128_Complex so it matches the TypeCode returned
by SELECT TYPE lowering.
Fixes #134565
[flang][debug] Make common blocks data extraction more robust. (#168752)
Our current implementation for extracting information about common block
required traversal of FIR which was not ideal but previously there was
no other way to obtain that information. The `[hl]fir.declare` was
extended in commit https://github.com/llvm/llvm-project/pull/155325 to
include storage and storage_offset. This commit adds these operands in
`fircg.ext_declare` and then use them in `AddDebugInfoPass` to create
debug data for common blocks.
[MLIR][SCFToGPU] Guard operands before AffineApplyOp::create to avoid crash (#167959)
This fixes a crash in SCF→GPU when building the per‑dim index for mapped
scf.parallel.
**Change**:
- Map step/lb through cloningMap, then run ensureLaunchIndependent.
- If either is still unavailable at launch scope, emit a match‑failure;
otherwise build the affine.apply.
**Why this is correct:**
- Matches how the pass already handles launch bounds; avoids creating an
op with invalid operands and replaces a segfault with a clear
diagnostic.
**Tests**:
- Added two small regressions that lower to gpu.launch and exercise the
affine.apply path.
[2 lines not shown]
[AArch64][PAC] Use enum to describe LR signing condition (NFC) (#168548)
Express the condition of signing the return address in a function using
an `enum class` instead of a pair of `bool`s. Define `enum class
SignReturnAddress` with the values corresponding to the three possible
modes that can be requested via "sign-return-address" function
attribute.
Previously, there were two overloads of `shouldSignReturnAddress`
accepting either `const MachineFunction &` or `bool` argument. Due to
pointer-to-bool conversion, when `shouldSignReturnAddress` was
incorrectly called with `const MachineFunction *` argument, the latter
overload was used instead of reporting a compile-time error.
[mlir][memref] Generalize dead store detection to all view-like ops (#168507)
The dead alloc elimination pass previously considered only subviews when
checking for dead stores. This change generalizes the logic to support
all view-like operations, ensuring broader coverage.
[OpenMP][AIX] Not to create symbolic links to libomp.so in install step (NFC) (#168585)
Commit bb563b1 handles the links in the build directory but
misses the case in the install step. This patch is to link only
the libomp.a on AIX.
[clang][diagnostics] added warning for possible enum compare typo (#168445)
Added diagnosis and fixit comment for possible accidental comparison
operator in an enum.
Closes: #168146
[HLSL] Implement the `fwidth` intrinsic for DXIL and SPIR-V target (#161378)
Adds the fwidth intrinsic for HLSL.
The DXIL path only requires modification to the hlsl headers.
The SPIRV path implements the OpFwidth builtin in Clang and instruction
selection for the OpFwidth instruction in LLVM.
Also adds shader stage tests to the ddx_coarse and ddy_coarse
instructions used by fwidth.
Closes #99120
---------
Co-authored-by: Alexander Johnston <alexander.johnston at amd.com>
[BOLT][BTI] Add MCPlusBuilder::addBTItoBBStart
This function contains most of the logic for BTI:
- it takes the BasicBlock and the instruction used to jump to it.
- then it checks if the first non-pseudo instruction is a sufficient
landing pad for the used call.
- if not, it generates the correct BTI instruction.
Also introduce the isBTIVariantCoveringCall helper to simplify the logic.
[LLVM][CodeGen][SVE] Only use unpredicated bfloat instructions when all lanes are in use. (#168387)
While SVE support for exception safe floating point code generation is
bare bones we try to ensure inactive lanes remiain inert. I mistakenly
broke this rule when adding support for SVE-B16B16 by lowering some
bfloat operations of unpacked vectors to unpredicated instructions.