[RISCV][P-ext] Remove dead code from LowerOperation handling of ISD::STORE. NFC (#194088)
We rely on default type legalization of v2i16 and v4i8 stores for RV64P.
graphics/libplacebo: unbreak GLSLANG build after 1697360b7726
src/glsl/meson.build:52:16: ERROR: C++ static library 'SPIRV' not found
PR: 294727
Reported by: Ivan Rozhuk
[DirectX] Emit `dx.precise` metadata when fast math is not present (#192526)
This patch introduces the ability for DXILOpBuilder to annotate
instructions with `dx.precise` whenever fast math flags are not present.
Fix: https://github.com/llvm/llvm-project/issues/149127
[RISCV] Expand fcanonicalize on vector types (#193842)
This change does a couple of related things:
1) It changes the default expansion strategy for scalable
vectors for fcanonicalize. This switches us from
emitting a loop (directly parallel to unrolling)
to using the already available fmul expansion.
2) Mark RISC-V legal scalable vector types as Expand
to leverage the previous item.
3) Wrap fixed vector types in their corresponding
scalable types to avoid unrolling.
The net effect is to improve the lowering for fixed vector cases and to
no longer crash for the scalable ones. We were crashing because the
scalable cases were marked Legal, not Expand. We could have just fixed
that, but doing everything at once seemed like a good investment.
Note that we could also choose to follow AArch64 and consider a
vfmin-based lowering. I left that until later, though it is worth noting
that's what we do for scalar code.
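As a rough illustration of the fmul-based strategy mentioned above, the sketch below models canonicalization as a multiply by 1.0 applied uniformly to every lane. This is an assumption about the shape of the expansion, not the actual SelectionDAG code; `canonicalize`, `canonicalize_lanes`, and `bits` are hypothetical helper names. Whether the NaN comes back with its quiet bit set depends on the host's floating-point hardware, so no bit pattern is asserted.

```python
import math
import struct


def canonicalize(x: float) -> float:
    """Model fcanonicalize as x * 1.0 (assumed form of the fmul expansion)."""
    return x * 1.0


def canonicalize_lanes(lanes):
    """Apply the same multiply to every lane; no per-lane special casing."""
    return [canonicalize(v) for v in lanes]


def bits(x: float) -> int:
    """Return the raw 64-bit pattern of a double, for inspection."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


# A NaN bit pattern with the quiet bit (bit 51) clear, i.e. a signaling NaN.
snan = struct.unpack("<d", struct.pack("<Q", 0x7FF0000000000001))[0]

lanes_out = canonicalize_lanes([1.5, -0.0, snan])
print([hex(bits(v)) for v in lanes_out])
```

Ordinary values and the sign of zero pass through unchanged; only non-canonical encodings are affected by the multiply.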
[clang][deps] Always initialize module cache out params (#194082)
We did not initialize the out parameters in #192347, causing the
"sanitizer-x86_64-linux-fast" bot to complain with:
```
SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1525:63 in compileModuleImpl(clang::CompilerInstance&, clang::SourceLocation, clang::SourceLocation, clang::Module*, clang::ModuleFileName)
Exiting
==clang==3084515==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x586360f7a604 in compileModuleImpl(clang::CompilerInstance&, clang::SourceLocation, clang::SourceLocation, clang::Module*, clang::ModuleFileName) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1525:63
#1 <...>
```
This PR should fix that.
[llvm-profgen] Add support for ETM trace decoding (#191584)
This patch introduces ETMReader to llvm-profgen,
enabling the reconstruction of execution profiles from ETM formatted
trace data.
- Integrate OpenCSD (CoreSight Decoding Library) as an optional
dependency via the LLVM_ENABLE_OPENCSD flag.
- Implement ETMTraceDecoder in ProfileData to interface with OpenCSD.
- Implement ETMReader, which uses hardware configuration and ELF memory
mapping to decode instruction traces.
- Add the --etm command-line option to specify raw trace inputs.
- Add the --target-triple command-line option to override the target
architecture for the binary.
The implementation targets microcontroller-class (Cortex-M) devices
based on the binary's target triple.
RFC:
https://discourse.llvm.org/t/rfc-add-etm-trace-support-to-llvm-profgen/90525
[dsymutil] Report error when section offsets exceed DWARF32 limit (#193867)
When linking very large binaries, debug section offsets can exceed the 4
GB DWARF32 limit. Previously this caused an assertion in
MCStreamer::emitIntValue when trying to emit an overflowing
DW_FORM_sec_offset value.
Detect the overflow at the point where section offsets are patched in
DWARFStreamer (for .debug_ranges, .debug_rnglists, .debug_loc,
.debug_loclists) and in DWARFLinker (for .debug_line and .debug_addr).
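The check described above amounts to validating each patched offset against the largest value a 4-byte field can hold. A minimal sketch, assuming hypothetical names (`check_sec_offset`, `patch_offsets`, `OffsetOverflowError`) that do not correspond to the dsymutil sources:

```python
DWARF32_MAX_OFFSET = 0xFFFFFFFF  # largest value a 4-byte section offset can hold


class OffsetOverflowError(Exception):
    """Raised instead of letting an oversized offset reach the emitter."""


def check_sec_offset(section: str, offset: int) -> int:
    """Return the offset unchanged if it fits in 32 bits, else report an error."""
    if offset > DWARF32_MAX_OFFSET:
        raise OffsetOverflowError(
            f"{section}: offset {offset:#x} exceeds the DWARF32 limit"
        )
    return offset


def patch_offsets(section: str, base: int, relative_offsets):
    """Rebase each offset by `base`, checking every result at the patch point."""
    return [check_sec_offset(section, base + rel) for rel in relative_offsets]


print(patch_offsets(".debug_loc", 0x10, [0, 4]))
```

Checking at the patch point means the error can name the offending section, rather than surfacing later as an assertion in the emitter.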
rdar://107413300
[LFI][AArch64] Add rewrites for control flow (#192602)
Adds LFI rewrites for control flow instructions (indirect branches and
returns). Indirect branches must go through `x28`, which is always
guaranteed to hold a sandbox address. Modifications to `x30` must guard
`x30` afterwards, to uphold the invariant that `x30` always holds a
sandbox address. As a result, bare return instructions can be used
without any additional rewrites.
[clang][NFC] Linux/Windows Multilib Include Path Tests (#193869)
This adds checks to the tests to show how the include path is changed by
the multilib logic for Linux/Windows added in commit
78820cb91605693b7d768be4ebc8b66181d3e9c3.
Assisted By: Claude
ena: Budget rx descriptors, not packets
We had ENA_RX_BUDGET = 256 in order to allow up to 256 received
packets to be processed before we do other cleanups (handling tx
packets and, critically, refilling the rx buffer ring). Since the
ring holds 1024 buffers by default, this was fine for normal packets:
We refill the ring when it falls below 7/8 full, so even if a large
burst of incoming packets lets it fall by another 1/4 (256 of 1024
buffers) before we consider refilling it, the ring is still
7/8 - 1/4 = 5/8 full.
With jumbos, the story is different: A 9k jumbo (as is used by default
within the EC2 network) consumes 3 descriptors, so a single rx cleanup
pass can consume 3/4 of the default-sized rx ring; if the rx buffer
ring wasn't completely full before a packet burst arrives, this puts
us perilously close to running out of rx buffers.
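The arithmetic above can be checked with a few lines of Python. This is only a model of the numbers quoted in this message (1024-buffer ring, budget of 256 packets, 7/8 refill threshold, 3 descriptors per 9k jumbo); `worst_case_fill` is a hypothetical helper, not a driver function.

```python
from fractions import Fraction

RING_SIZE = 1024                   # default rx ring size, from the text above
RX_BUDGET_PACKETS = 256            # old ENA_RX_BUDGET, counted in packets
REFILL_THRESHOLD = Fraction(7, 8)  # ring is refilled once it drops below this


def worst_case_fill(descriptors_per_packet: int) -> Fraction:
    """Ring fill after one full cleanup pass that starts at the refill threshold."""
    consumed = Fraction(RX_BUDGET_PACKETS * descriptors_per_packet, RING_SIZE)
    return REFILL_THRESHOLD - consumed


normal = worst_case_fill(1)  # one descriptor per normal packet
jumbo = worst_case_fill(3)   # three descriptors per 9k jumbo
print(normal, jumbo)         # prints "5/8 1/8"
```

With a packet-based budget the worst case depends on packet size; budgeting descriptors instead bounds the ring drain per pass regardless of how many descriptors each packet uses.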
This precise failure mode has been observed on some EC2 instance types
within a Cluster Placement Group, resulting in the nominal 10 Gbps
single-flow throughput between instances dropping to ~100 Mbps as a
[19 lines not shown]
ena: Adjust ena_[rt]x_cleanup to return bool
The ena_[rt]x_cleanup functions are limited internally to a maximum
number of packets; this ensures that TX doesn't starve RX (or vice
versa) and also attempts to ensure that we get a chance to refill
the RX buffer ring before the device runs out of buffers and starts
dropping packets.
Historically these functions have returned the number of packets which
they processed, which ena_cleanup compares to their respective budgets
to decide whether to reinvoke them. This is an unnecessary complication;
since the precise number of packets processed is never used, adjust
the APIs of those functions to return a bool indicating whether they
want to be reinvoked (i.e. whether they hit their limits).
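The shape of the new interface can be sketched as follows. This is a simplified Python model, not the driver code: `bounded_cleanup`, `cleanup`, and `max_passes` are hypothetical names, and the queues stand in for the rx/tx rings.

```python
from collections import deque


def bounded_cleanup(queue: deque, budget: int) -> bool:
    """Process up to `budget` items; return True if the limit was hit."""
    while budget > 0 and queue:
        queue.popleft()  # stand-in for handling one packet
        budget -= 1
    return budget == 0


def cleanup(rx: deque, tx: deque, rx_budget: int, tx_budget: int,
            max_passes: int) -> int:
    """Reinvoke both cleanups while either asks for it; return passes used."""
    for passes in range(1, max_passes + 1):
        rx_again = bounded_cleanup(rx, rx_budget)
        tx_again = bounded_cleanup(tx, tx_budget)
        if not (rx_again or tx_again):
            return passes
    return max_passes


rx_queue = deque(range(10))
tx_queue = deque(range(3))
passes_used = cleanup(rx_queue, tx_queue, rx_budget=4, tx_budget=4, max_passes=8)
print(passes_used)  # prints 3
```

The caller never sees a count, only whether another pass is wanted, which is all it needs in order to decide on reinvocation.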
Since ena_tx_cleanup now only uses work_done if diagnostics are
enabled (the ena_log_io macro expands to nothing otherwise), eliminate
that variable and pass its value (ENA_TX_BUDGET - budget) to ena_log_io
directly.
[7 lines not shown]
Merge tag 'trace-ring-buffer-v7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ring-buffer fix from Steven Rostedt:
- Fix accounting of persistent ring buffer rewind
On boot up, the head page is moved back to the earliest point of the
saved ring buffer. This is because the ring buffer being read by user
space on a crash may not save the part it read. Rewinding the head
page back to the earliest saved position helps keep those events from
being lost.
The number of events is also read during boot up and displayed in the
stats file in the tracefs directory. It is used for other
accounting as well. On boot up, the "reader page" is accounted for
but a rewind may put it back into the buffer and then the reader page
may be accounted for again.
Save off the original reader page and skip accounting it when
[4 lines not shown]
Merge tag 'block-7.1-20260424' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- Series for zloop, fixing a variety of issues
- t10-pi code cleanup
- Fix for a merge window regression with the bio memory allocation mask
- Fix for a merge window regression in ublk, caused by an issue with
the maple tree iteration code at teardown
- ublk self tests additions
- Zoned device pgmap fixes
- Various little cleanups and fixes
[22 lines not shown]