[MLIR][Linalg][NFC] Simplify tiling canonical pattern (#182909)
Prepare for better composition of canonicalization patterns by splitting
Linalg's own canonicalizers from those added for particular purposes
(e.g. tiling).
Once dialects have their own registration mechanisms, specific passes
can just add more ops/dialects using a yet-to-be-created helper that
would be similar to the existing
`populateLinalgTilingCanonicalizationPatterns`.
[mlir][python] Fix segfault in DenseResourceElementsAttr.get_from_buffer for 0-d tensors (#183070)
When `ndim == 0`, `view->strides[view->ndim - 1]` is an out-of-bounds
access (unsigned underflow to `SIZE_MAX`). Use `view->itemsize` for
alignment instead, since a scalar buffer is trivially aligned to its
element size.
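
A minimal Python model of the problem, not the actual binding code.
`BufferView` is a hypothetical stand-in for the three `Py_buffer` fields
involved. In C the 0-d case reads out of bounds; the model raises
`IndexError` instead.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class BufferView:
    """Hypothetical stand-in for the Py_buffer fields used here."""
    ndim: int
    strides: Tuple[int, ...]
    itemsize: int


def alignment_from_last_stride(view: BufferView) -> int:
    # Old approach: index the innermost stride. A 0-d buffer has no
    # strides at all, so there is nothing valid to read.
    if view.ndim == 0:
        raise IndexError("0-d buffer has no strides")
    return view.strides[view.ndim - 1]


def alignment_from_itemsize(view: BufferView) -> int:
    # Fixed approach: the element size is defined for every rank.
    return view.itemsize


scalar = BufferView(ndim=0, strides=(), itemsize=4)
vector = BufferView(ndim=1, strides=(4,), itemsize=4)
```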
Fixes iree-org/iree-turbine#1312.
[clang] Clean up large clang binaries copied into test temp directories (#182304)
I noticed a couple of tests leave behind copies of clang binaries they
copy into their temp directories. Replicate the cleanup from another
test (clang/test/Driver/clang_f_opts_withspaces.c) to remove these.
[LAA][LV] Allow recognition of strided pointers with constant stride (#171151)
This patch fixes an issue in LoopAccessAnalysis with recognizing strided
pointers that use runtime constants. Loop accesses of the form
`p[base + offset * const]`, where `const` is a runtime constant, should
be considered for vectorization. However, in some cases these access
patterns were not recognized. This patch resolves this by adding an
explicit pattern match within LAA.
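
As an illustration of the access pattern only (not of the LAA change
itself), here is a short Python sketch. It assumes `offset` is the loop
induction variable and `stride` is a value fixed before the loop starts.
All function names are hypothetical.

```python
from typing import List


def strided_indices(base: int, stride: int, n: int) -> List[int]:
    # Indices touched by `for i in range(n): p[base + i * stride]`.
    return [base + i * stride for i in range(n)]


def has_constant_stride(indices: List[int]) -> bool:
    # True when every consecutive pair of indices differs by the same amount.
    deltas = {b - a for a, b in zip(indices, indices[1:])}
    return len(deltas) <= 1


def gather_strided(p: List[int], base: int, stride: int, n: int) -> List[int]:
    # `stride` does not change inside the loop, so it is loop-invariant.
    return [p[i] for i in strided_indices(base, stride, n)]
```

Because the stride is unknown at compile time but invariant in the loop,
consecutive iterations still access memory a fixed distance apart.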
---------
Co-authored-by: Florian Hahn <flo at fhahn.com>
interfaces: remove inconsistent "consistency check" and fix indent
If the VLAN parent isn't there the system has other problems.
Never seen this validation message out in the wild either.
See also: https://github.com/pfsense/pfsense/commit/66bcba1bcd806
[MLIR][NVVM][NVPTX] Support for new mma/mma.sp variants from PTX 9.1 (#182325)
This change adds support for `.scale_vec::4X` with `.ue8m0` as `.stype`
with `.kind::mxf4nvf4` for `mma/mma.sp` instructions introduced in
[PTX ISA 9.1](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=mma%2520sp#ptx-isa-version-9-1).
It also updates the MLIR mma/mma.sp block scale tests to use structs
instead of vectors.
[CIR] Implement 'assume' attribute lowering (#182960)
This attribute applies to null statements and emits an assume-op
sometimes. This patch adds this for statements, which includes the
infrastructure for AttributedStmt lowering.
---------
Co-authored-by: Sirui Mu <msrlancern at gmail.com>
Fix test_arc_max_set and test_firstboot_checks readonly assertion
The switch from /proc/self/mountinfo parsing to statmount separated
per-mount VFS flags from filesystem-specific options. Readonly was
previously included in both mount_opts and super_opts but now only
appears in mount_opts. Update tests to check the correct field.
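
A rough sketch of the kind of check the tests now need. The dictionary
layout and the `"ro"` spelling are assumptions made for illustration;
only the field names `mount_opts` and `super_opts` come from the
description above.

```python
from typing import Dict, List


def is_readonly(mount: Dict[str, List[str]]) -> bool:
    # Per-mount VFS flags are assumed to live in `mount_opts`;
    # `super_opts` is assumed to hold filesystem-specific options only.
    return "ro" in mount["mount_opts"]


# Hypothetical entry, shaped like the post-statmount result described above.
example_mount = {
    "mount_opts": ["ro", "noatime"],
    "super_opts": ["xattr", "posixacl"],
}
```

An assertion that looks for the readonly flag in `super_opts` would fail
on such an entry, which is why the tests check `mount_opts`.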
[IR] Specify alloca with poison element count (#183072)
An alloca with a poison element count is undefined behavior.
This matches existing behavior of optimizations. This also matches the
behavior of llubi for `poison`, but llubi currently does not report
immediate UB for `undef` element counts. A future patch will fix that.
[Hexagon] Avoid contracting predicates in createHvxPrefixPred
The function createHvxPrefixPred should only need to expand a predicate
to match the result's bytes-per-bit. Otherwise, contracting the
predicate may lead to an input that is shorter than 4 bytes, making it
unsuitable for VINSERTW0.
When calling createHvxPrefixPred for vector concatenation, re-group the
inputs to the concat to make sure that the resulting inputs to
createHvxPrefixPred would not need contraction.
Fixes https://github.com/llvm/llvm-project/issues/181362
NAS-139959 / 26.0.0-BETA.1 / Fix dead exception handling in pwenc check (#18288)
This commit fixes an issue where exceptions are never raised from
decrypt() in pwenc.check() because _raise defaulted to false, making the
except blocks and their relevant logging dead code.
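
A minimal sketch of the pattern, assuming a `decrypt()` that swallows
errors unless `_raise` is set. The function bodies, the `ValueError`
type, and passing `_raise=True` as the remedy are illustrative
assumptions, not the actual pwenc code.

```python
import logging

logger = logging.getLogger("pwenc_sketch")


def decrypt(data: str, _raise: bool = False) -> str:
    # Hypothetical stand-in: anything other than "valid" fails to decrypt.
    try:
        if data != "valid":
            raise ValueError("cannot decrypt")
        return "secret"
    except ValueError:
        if _raise:
            raise
        return ""  # error swallowed when _raise is False


def check_dead(data: str) -> bool:
    try:
        decrypt(data)  # _raise defaults to False: nothing propagates
        return True
    except ValueError:
        logger.error("decrypt failed")  # never reached
        return False


def check_fixed(data: str) -> bool:
    try:
        decrypt(data, _raise=True)  # failures now reach the handler
        return True
    except ValueError:
        logger.error("decrypt failed")
        return False
```

With the default, `check_dead` reports success even for undecryptable
input, because the `except` branch can never run.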
[libsycl] Fix for static vars deinit order (libsycl vs liboffload) (#181366)
Both libsycl and liboffload use static variables.
On Linux, a static variable's destructor is called earlier than a
function marked with `__attribute__((destructor(...)))`.
This fix avoids a crash caused by the early destruction of liboffload's
static variables.
The approach relies on the following rule:
"For each local object obj with static storage duration, obj is
destroyed as if a function calling the destructor of obj were registered
with
[std::atexit](https://en.cppreference.com/w/cpp/utility/program/atexit.html)
at the completion of the constructor of obj."
This rule comes from the description of `std::exit`.
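
The ordering this rule implies can be modeled with a toy registry in
Python. `ExitRegistry` is a hypothetical model of last-in, first-out
handler execution, not libsycl or liboffload code: an object constructed
later has its destructor registered later and therefore run earlier.

```python
from typing import Callable, List


class ExitRegistry:
    """Toy model of exit handlers that run last-in, first-out."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[], None]] = []

    def register(self, fn: Callable[[], None]) -> None:
        self._handlers.append(fn)

    def run_all(self) -> None:
        # Pop from the end so the most recently registered handler runs first.
        while self._handlers:
            self._handlers.pop()()


log: List[str] = []
registry = ExitRegistry()

# Constructed first, so its destructor is registered first and runs last.
registry.register(lambda: log.append("destroy liboffload statics"))
# Constructed second, so its destructor runs first.
registry.register(lambda: log.append("shut down libsycl"))

registry.run_all()
```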
In the first call to get_platforms we call liboffload's iterateDevices,
which triggers liboffload's static storage initialization. We then
initialize our own local static variable, so that our shutdown methods
are called earlier, before the liboffload objects are
[8 lines not shown]