[Hexagon] Track type locally in HexagonVectorCombine (#179066)
Replace getAllocatedType calls with tracked types from alloca creation.
The types are known at the CreateAlloca call sites, so we track them
locally instead of re-querying through getAllocatedType, to facilitate
someday possibly removing getAllocatedType from the API of AllocaInst.
Co-authored-by: Claude Sonnet 4.5 <noreply at anthropic.com>
[DirectX] remove getAllocatedType in DXILDataScalarization (#179067)
Update dynamicallyLoadArray to take the allocated type as a parameter
instead of querying getAllocatedType. This is to facilitate removing
other incorrect uses of getAllocatedType, and eventually possibly even
getAllocatedType itself.
Co-authored-by: Claude Sonnet 4.5 <noreply at anthropic.com>
[ELF,test] Improve error/warning message checks
Update tests to include proper `error:` or `warning:` prefixes and
file/section information in CHECK patterns. Add
--implicit-check-not=error: to ensure no unexpected errors are produced.
[libc] Address sincosf size bloat (#179004)
The recent refactoring in #177523 marked some functions as static, which
increased the size of the sinf/cosf functions. Remove the static storage
for these functions to eliminate the bloat, which is especially
problematic in size-constrained baremetal target builds.
[ELF,test] Improve error message checks with proper format
Update tests to use the canonical error message format with `error:`
prefix and file:section information. Add `--implicit-check-not=error:`
to ensure no unexpected errors are produced.
This commit focuses on "out of range" and "not aligned" errors.
[AMDGPU][Scheduler] Simplify scheduling revert logic (#177203)
When scheduling must be reverted for a region, the current
implementation re-orders non-debug instructions and debug instructions
separately; the former in a first pass and the latter in a second pass
handled by a generic machine scheduler helper whose state is tied to the
current region being scheduled, in turn limiting the revert logic to
only work on the active scheduling region.
This makes the revert logic work in a single pass for all MIs, and
removes the restriction that it works exclusively on the active
scheduling region. The latter enables future use cases such as reverting
scheduling of multiple regions at once.
Reapply "[VPlan] Detect and create partial reductions in VPlan. (NFCI) (#167851)"
This reverts commit d1e477b00b49c63ff4dd513eeb14a5b18bc055d7.
Recommit with extra checks making sure extends are VPWidenCastRecipes,
rejecting VPReplicateRecipes.
Original message:
As a first step, move the existing partial reduction detection logic to
VPlan, trying to preserve the existing code structure & behavior as
closely as possible.
With this, partial reductions are detected and created together in a
single step.
This allows forming partial reductions and bundling them up if
profitable together in a follow-up.
PR: https://github.com/llvm/llvm-project/pull/167851
[llvm-lipo] Fix handling of archives in universal binaries (#176448)
When extracting slices from a universal binary, llvm-lipo was not
handling the case where the slice is an archive.
Fixes #90156
[X86] getScalarMaskingNode - FIXUPIMM scalar ops take upper elements from second operand (#179101)
FIXUPIMMSS/SD instructions pass through the upper elements of the SECOND operand, not the first like most (2-op) instructions.
Fixes #179057
[Analysis] Add Intrinsics::CLMUL case to cost calculations to getIntrinsicInstrCost / getTypeBasedIntrinsicInstrCost (#176552)
This patch adds a case in getIntrinsicInstrCost and
getTypeBasedIntrinsicInstrCost in
llvm/include/llvm/CodeGen/BasicTTIImpl.h for Intrinsic::clmul. This
patch uses TLI->isOperationLegalOrCustom to check if the instruction is
cheap. If not cheap, it sums up the cost of the arithmetic operations
(AND, SHIFT, XOR) multiplied by the bit width.
Fixes #176354
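To illustrate why the fallback cost scales with the bit width, here is a minimal C++ sketch of a bitwise carry-less multiply. It is not the code from the patch: `clmul8` and `expandedClmulCost` are hypothetical names, the sketch assumes only the low 8 bits of the product are kept, and the cost formula is one reading of "sum of AND, SHIFT, XOR costs multiplied by the bit width".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical software expansion of an 8-bit carry-less multiply.
// Assumption: only the low 8 bits of the product are kept.
// Each iteration performs one shift, one AND, and one XOR.
constexpr std::uint8_t clmul8(std::uint8_t a, std::uint8_t b) {
  std::uint8_t result = 0;
  for (unsigned i = 0; i < 8; ++i) {
    // All-ones when bit i of b is set, zero otherwise.
    const std::uint8_t mask = static_cast<std::uint8_t>(-((b >> i) & 1u));
    result = static_cast<std::uint8_t>(result ^ ((a << i) & mask));
  }
  return result;
}

// Hypothetical cost model: per-bit cost of AND + SHIFT + XOR,
// multiplied by the bit width of the operation.
constexpr unsigned expandedClmulCost(unsigned bitWidth, unsigned andCost,
                                     unsigned shiftCost, unsigned xorCost) {
  return bitWidth * (andCost + shiftCost + xorCost);
}

// Small sanity check: (x + 1) * (x + 1) = x^2 + 1 over GF(2).
inline void checkClmulExample() {
  assert(clmul8(0x3, 0x3) == 0x5);
}
```

With unit costs for each operation, a 32-bit expansion under this model would cost `expandedClmulCost(32, 1, 1, 1)`, i.e. 96.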
Revert "[clang][bytecode] Use in `Expr::tryEvaluateObjectSize` (#179033)" (#179099)
This reverts commit 756c321c33af2be0bd40707948aae3c06163a0a6.
Test failure in clang/test/AST/ByteCode/builtins.c in CI build
CC @tbaederr
[clang][bytecode] Use in `Expr::tryEvaluateObjectSize` (#179033)
Fixes #138474
Use the new bytecode interpreter in `Expr::tryEvaluateObjectSize`. Reuses
the already existing implementation for `__builtin_object_size` in the
interpreter.
---------
Co-authored-by: Timm Baeder <tbaeder at redhat.com>
[libc][test] Fix TmMatcher and correct tm_yday/tm_wday test values (#179029)
The TmMatcher was using || instead of && to compare tm struct fields,
causing it to match if ANY field was equal rather than ALL fields. This
masked incorrect expected values in the time tests.
Happily, only the tests needed fixing. The code was correct.
Fixed the matcher and corrected all tm_yday and tm_wday values to match
glibc's gmtime_r output.
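The effect of the `||` versus `&&` mistake can be sketched in C++ as follows. This is not the actual libc TmMatcher: `tmMatchesAny`, `tmMatchesAll`, and `makeTm` are hypothetical helpers written only to show how an "any field" comparison hides wrong `tm_yday`/`tm_wday` expectations.

```cpp
#include <cassert>
#include <ctime>

// Buggy form (hypothetical): true as soon as ANY field agrees.
inline bool tmMatchesAny(const std::tm &a, const std::tm &b) {
  return a.tm_sec == b.tm_sec || a.tm_min == b.tm_min ||
         a.tm_hour == b.tm_hour || a.tm_mday == b.tm_mday ||
         a.tm_mon == b.tm_mon || a.tm_year == b.tm_year ||
         a.tm_wday == b.tm_wday || a.tm_yday == b.tm_yday;
}

// Fixed form (hypothetical): true only when ALL fields agree.
inline bool tmMatchesAll(const std::tm &a, const std::tm &b) {
  return a.tm_sec == b.tm_sec && a.tm_min == b.tm_min &&
         a.tm_hour == b.tm_hour && a.tm_mday == b.tm_mday &&
         a.tm_mon == b.tm_mon && a.tm_year == b.tm_year &&
         a.tm_wday == b.tm_wday && a.tm_yday == b.tm_yday;
}

// Build a zero-initialized tm with only the fields of interest set.
inline std::tm makeTm(int year, int yday, int wday) {
  std::tm t{};
  t.tm_year = year;
  t.tm_yday = yday;
  t.tm_wday = wday;
  return t;
}

// A wrong expectation slips past the "any" form but not the "all" form.
inline void checkTmMatcherExample() {
  const std::tm actual = makeTm(70, 0, 4);    // 1970-01-01, a Thursday
  const std::tm wrongExpected = makeTm(70, 5, 2);
  assert(tmMatchesAny(actual, wrongExpected));
  assert(!tmMatchesAll(actual, wrongExpected));
}
```

In this sketch the two values share `tm_year` and the zeroed time fields, so the `||` form reports a match even though `tm_yday` and `tm_wday` differ.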
[lldb] [Process/FreeBSDKernel] Select panicked thread automatically (#178069)
A kernel panic is a special case: there is no signal or exception for
it, so we need to rely on a special workaround called `dumptid`.
The FreeBSDKernel plugin is supposed to find this thread and set it
manually through `SetStopInfo()` in `CalculateStopInfo()`, as the Mach
core plugin does.
Before (we had to find and select the crashed thread manually; otherwise
thread 1 was selected by default):
```
➜ sudo lldb /boot/panic/kernel -c /var/crash/vmcore.last
(lldb) target create "/boot/panic/kernel" --core "/var/crash/vmcore.last"
Core file '/var/crash/vmcore.last' (x86_64) was loaded.
(lldb) bt
* thread #1, name = '(pid 12991) dtrace'
* frame #0: 0xffffffff80bf9322 kernel`sched_switch(td=0xfffff8015882f780, flags=259) at sched_ule.c:2448:26
frame #1: 0xffffffff80bd38d2 kernel`mi_switch(flags=259) at kern_synch.c:530:2
frame #2: 0xffffffff80c29799 kernel`sleepq_switch(wchan=0xfffff8014edff300, pri=0) at subr_sleepqueue.c:608:2
[38 lines not shown]