[flang][cuda] Add support for cudaStreamDestroy (#183648)
Add a specific lowering and entry point for cudaStreamDestroy. Since we
keep an associated stream for some allocations, we need to reset it when
the stream is destroyed so that we don't use it anymore.
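The bookkeeping described above can be pictured with a small sketch. It is not the flang runtime code: `stream_t`, `alloc_record` and `reset_stream_associations` are hypothetical names, and using NULL to mean "no associated stream" is an assumption made only for this illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for cudaStream_t; not the real CUDA type. */
typedef void *stream_t;

/* Hypothetical record tying an allocation to the stream it was made on. */
struct alloc_record {
    void *ptr;
    stream_t stream; /* NULL here means "no associated stream" (assumption) */
};

/* Clear every association that points at the stream being destroyed,
 * so later operations on those allocations cannot pick up a dead handle.
 * Returns the number of records that were reset. */
size_t reset_stream_associations(struct alloc_record *recs, size_t n,
                                 stream_t dying) {
    size_t cleared = 0;
    if (dying == NULL)
        return 0; /* nothing to reset */
    for (size_t i = 0; i < n; ++i) {
        if (recs[i].stream == dying) {
            recs[i].stream = NULL;
            ++cleared;
        }
    }
    return cleared;
}
```

A destroy entry point would call such a helper before releasing the stream itself.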
[Clang][Hexagon] Add QURT as recognized OS in target triple (#183622)
Add support for QURT as a recognized OS type in the LLVM triple
system, and define the __qurt__ predefined macro when targeting it.
[scudo] Add reallocarray C wrapper. (#183385)
`reallocarray()` is a POSIX extension to the C standard that wraps the
`realloc` function and adds `calloc`-like overflow detection. It is
available in glibc and some other standard library implementations. Add
`reallocarray` to the list of Scudo C wrappers, so that code that
depends on the presence of `reallocarray` will continue to work.
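For readers unfamiliar with the function, the `calloc`-like overflow check can be sketched as follows. This is a minimal illustration, not the Scudo wrapper: `sketch_reallocarray` is a hypothetical name, and it simply forwards to the system `realloc`.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical illustration of reallocarray semantics (not Scudo's code). */
void *sketch_reallocarray(void *ptr, size_t nmemb, size_t size) {
    /* nmemb * size overflows size_t exactly when nmemb > SIZE_MAX / size. */
    if (size != 0 && nmemb > SIZE_MAX / size) {
        errno = ENOMEM;
        return NULL; /* ptr is left untouched, as with a failed realloc */
    }
    return realloc(ptr, nmemb * size);
}
```

With plain `realloc(ptr, nmemb * size)` the multiplication can silently wrap and yield a buffer that is too small; the check above rejects such requests instead.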
NAS-139102 / 26.0.0-BETA.1 / Convert nfs plugin `yaml.safe_load` to `CSafeLoader` (#18308)
Move definition of `safe_yaml_load` from plugins/apps/ix_apps/utils.py
to middleware/utils/yaml.py
Update all current imports of `safe_yaml_load`.
Convert nfs plugin usage of `yaml.safe_load` to `safe_yaml_load`.
CI tests are underway
[here](http://jenkins.eng.ixsystems.net:8080/job/tests/job/api_tests/7815/),
and
`api2/test_300_nfs.py::TestNFSops::test_client_status PASSED [ 7%]`
build: correct `MSVC` and Windows mixup for `CLANG_BUILD_STATIC` (#183609)
The build incorrectly used `MSVC` to determine that we were building for
Windows (MS ABI). This prevents the use of the GNU driver for building
LLVM for Windows. Adjust the condition to `WIN32 AND NOT MINGW` to
correctly identify that we are building for Windows MS ABI.
[scudo] Change header tagging for the secondary allocator (#182487)
When the secondary allocator allocates a new chunk, the allocation is
prepended with a chunk header (shared with the primary allocator)
and a large header (used only by the secondary).
Only the headers are tagged, not the data, and the headers are
tagged individually because they use different tags.
In the current implementation, tagging the large header also tags the
unused area along with it, so the allocator can tag up to a page (in the
worst case), which is costly and brings no security benefit (as the
area is unused).
With this fix we can eliminate around 97-98% of the tagging for
the secondary allocator, measured with random benchmarks.
Co-authored-by: Christopher Ferris <cferris1000 at users.noreply.github.com>
[AArch64] Decompose FADD reductions with known zero elements (#167313)
FADDV is matched into FADDPv4f32 + FADDPv2i32p, but this can be relaxed
when one or more elements (usually the 4th) are known to be zero.
Before:
```
movi d1, #0000000000000000
mov v0.s[3], v1.s[0]
faddp v0.4s, v0.4s, v0.4s
faddp s0, v0.2s
```
After:
```
mov s1, v0.s[2]
faddp s0, v0.2s
fadd s0, s0, s1
```
[2 lines not shown]
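The two instruction sequences above correspond to two reduction shapes, sketched here in scalar C. The function names are hypothetical and this only illustrates the arithmetic, not the DAG combine itself. The shapes agree for ordinary values; an input such as `c = -0.0f` is a corner case where `c + 0.0f` and `c` differ in sign, and the conditions under which the patch applies are not restated here.

```c
/* "Before" shape: lane 3 is forced to zero, then two pairwise adds:
 * (a + b) + (c + 0). */
float reduce_padded(float a, float b, float c) {
    float zero = 0.0f;   /* the known-zero 4th element */
    float lo = a + b;    /* first faddp, lanes 0 and 1 */
    float hi = c + zero; /* first faddp, lanes 2 and 3 */
    return lo + hi;      /* second faddp */
}

/* "After" shape: one pairwise add of lanes 0 and 1, then a scalar add
 * of lane 2: (a + b) + c. */
float reduce_decomposed(float a, float b, float c) {
    float lo = a + b; /* faddp s0, v0.2s */
    return lo + c;    /* fadd s0, s0, s1 */
}
```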
Initial import of devel/zydis, version 4.1.1.
Fast and lightweight x86/x86-64 disassembler and code generation library.
Zydis comes with absolutely no dependencies, making it a perfect
candidate not only for user-mode code, but also for kernel drivers
and exotic build environments like UEFI.
Initial import of devel/zycore-c version 1.5.1.
Internal library for zydis disassembler providing platform independent
types, macros and a fallback for environments without LibC.
[mlir][xegpu] Retain order attribute during load + transpose optimization. (#183608)
As described in the title, the `order` attribute is ignored in this
transformation, causing downstream test failures.
[VPlan] Process instructions in reverse order when widening
It doesn't matter right now because we're using CM's decision, but
https://github.com/llvm/llvm-project/pull/182595 introduces some
scalarization (first-lane-only) opportunities that aren't known in CM,
and supporting those requires reverse iteration order because they are
determined by VPUsers and not operands.
[Hexagon] Fix memory type for vgather intrinsics (#183563)
Some of the Hexagon vgather intrinsics were picking the memory type
(memVT) from a fixed argument position, but for several variants (e.g.
the predicated ones), that argument isn’t actually the data vector being
gathered. As a result, LLVM could end up recording the wrong memory type
or size (e.g. i32 or mask instead of the vector arg). This patch fixes
that by always taking memVT from the last intrinsic argument, which is
the actual data vector.