[clang-doc] Create a partial for HTML <head> sections
The <head> sections of the existing partials are already identical, so
creating a partial will help reduce lines in the templates. Now
changes to <head> sections can easily propagate and can easily be added
to future HTML pages.
[clang-doc] Align indentation in templates (#171667)
Indentation was inconsistent between the namespace and class templates.
This patch assumes that `<body>` is not indented.
[Hexagon] Fix HWBF16 PatLeaf type (#170560)
Correct the definition of `HWBF16` to reference `VecPBF16` rather than
`VecBF16`, aligning it with the existing pattern-leaf conventions for
pointer vector types.
Co-authored-by: Muntasir Mallick <quic_mallick at quicinc.com>
[VPlan] Remove legacy costing inside VPBlendRecipe::computeCost (#171846)
A VPBlendRecipe always emits selects, even when the VF is scalar.
However the legacy cost model always costs all scalar non-header phis as
a phi, and the VPlan cost model has to account for this.
This can make the cost slightly inaccurate. For example, the cost of the
select in @smax_call_uniform is not included, which leads to
unprofitable vectorization.
This patch removes that special case from the VPlan cost model and
instead checks for it in planContainsAdditionalSimplifications.
I considered trying to make the legacy cost model more accurate but I'm
not sure if it's possible. We need information as to whether or not the
scalar VF we are costing is the original loop in which case it's
actually a phi, or if it's a VPBlendRecipe that emits a select,
potentially from a VF=1, UF>=1 VPlan.
[flang][OpenMP] Implement loop construct iterator range (#170734)
Since we're trying to preserve compiler directives in loop constructs,
not every element of the associated parser::Block needs to be a loop or
an OpenMP loop construct. Implement a helper class `LoopRange` to make
it easy to iterate over elements of parser::Block that are loops or loop
constructs.
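The idea of a filtering range can be sketched as follows. This is a
minimal illustration, not the code from the patch: `Kind`, `Element`,
`isLoopLike` and `loopIds` are hypothetical stand-ins, `Block` here is a
plain `std::list` rather than flang's parser type, and the real
`LoopRange` interface may differ.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Stand-in types for illustration only; these are not flang's parser types.
enum class Kind { Loop, OmpLoopConstruct, CompilerDirective };
struct Element {
  Kind kind;
  int id;
};
using Block = std::list<Element>;

// Hypothetical predicate: true for loops and OpenMP loop constructs.
inline bool isLoopLike(const Element &e) {
  return e.kind == Kind::Loop || e.kind == Kind::OmpLoopConstruct;
}

// Sketch of a filtering range: iteration visits only the loop-like
// elements of a Block and skips everything else (e.g. directives).
class LoopRange {
public:
  class iterator {
  public:
    iterator(Block::const_iterator pos, Block::const_iterator end)
        : pos_(pos), end_(end) {
      skip();
    }
    const Element &operator*() const {
      assert(pos_ != end_ && "dereferencing the end iterator");
      return *pos_;
    }
    iterator &operator++() {
      ++pos_;
      skip();
      return *this;
    }
    bool operator!=(const iterator &other) const { return pos_ != other.pos_; }

  private:
    // Advance past elements that are not loop-like.
    void skip() {
      while (pos_ != end_ && !isLoopLike(*pos_))
        ++pos_;
    }
    Block::const_iterator pos_, end_;
  };

  explicit LoopRange(const Block &block) : block_(block) {}
  iterator begin() const { return iterator(block_.begin(), block_.end()); }
  iterator end() const { return iterator(block_.end(), block_.end()); }

private:
  const Block &block_;
};

// Collects the ids of the loop-like elements, in block order.
std::vector<int> loopIds(const Block &block) {
  std::vector<int> ids;
  for (const Element &e : LoopRange(block))
    ids.push_back(e.id);
  return ids;
}
```

With a range like this, callers can use a plain range-based for loop and
do not need to repeat the "is this a loop?" check at every use site.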
ValueTracking: Handle amdgcn.rsq intrinsic in computeKnownFPClass
We have other target intrinsics already in ValueTracking functions,
and no access to TTI.
[mlir][tosa] Add clamp op support to `TosaNarrowI64ToI32` pass (#169308)
This commit allows the narrowing of `tosa.clamp` when the min/max
attributes are within the int32 range.
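The range condition can be sketched in plain C++. This is an
illustration only, not the pass's implementation: `narrowClampBounds` is
a hypothetical helper, and the sketch assumes the clamp bounds are
available as ordered 64-bit integers.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <utility>

// Hypothetical helper, not the pass's actual code: returns the narrowed
// (min, max) pair when both i64 clamp bounds fit in i32, and
// std::nullopt when narrowing would change either bound.
std::optional<std::pair<int32_t, int32_t>>
narrowClampBounds(int64_t min, int64_t max) {
  // Assumption of this sketch: the bounds are ordered.
  assert(min <= max && "clamp bounds must be ordered");
  constexpr int64_t lo = std::numeric_limits<int32_t>::min();
  constexpr int64_t hi = std::numeric_limits<int32_t>::max();
  if (min < lo || max > hi)
    return std::nullopt;
  return std::make_pair(static_cast<int32_t>(min), static_cast<int32_t>(max));
}
```

Returning `std::nullopt` for out-of-range bounds models leaving the op
untouched, since truncating such a bound would change the clamp result.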
[AArch64] Run optimizeTerminators earlier too. (#170907)
Running optimizeTerminators prior to other optimizations like branch
layout can lead to more folding and better codegen, but is not on its
own able to capture all cases. There is benefit to running it in both
places. This adds the existing code from #161508 into the
AArch64RedundantCopyElimination pass, which sounds like a sensible
enough place for it.
This is a recommit with an extra fix for shrink-wrapping domtree use.
[mlir][vector] Remove hooks deprecated pre Release/22 branch (#171829)
As mentioned on Discourse,
* https://discourse.llvm.org/t/psa-vector-standardise-operand-naming
I am removing the deprecated Vector hooks near the creation of the
release/22 branch. These hooks were introduced in #158258 (~September
'25, ~3 months ago), so I assume folks have had enough time to
transition away.
[AMDGPU][True16] remove pack32 pattern from true16 mode (#171756)
Remove pack32 so that isel uses reg_sequence for build_vector in true16
mode. This generates better code.
[X86] mayFoldIntoVector - relax load alignment requirements (#171830)
If we're trying to move big integers to vector types, relax the SSE
alignment requirements - unlike regular uses of mayFoldLoad, we're not
testing to confirm every load will fold into a vector op, just that it
can move to the FPU.
Fixes #144861