[MemProf] Optimize BitcodeReader stack id lookups (#182097)
Introduce StackIdToIndex to ModuleSummaryIndexBitcodeReader to cache the
mapping from module-local stack id indices to the global index in the
ModuleSummaryIndex's StackIds vector. This avoids repeated hash lookups
when processing callsite and allocation records.
This reduced the thin link time for a large target built with memprof
by ~16%.
Also add assertions to ensure STACK_IDS records are processed once and
that the cache is empty initially.
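As a rough illustration of the caching idea (not the actual BitcodeReader code), the sketch below resolves each module-local stack id to its global index once and then serves later records from a plain vector. `GlobalStackIds`, `ModuleReader`, `addOrGetIndex`, `readStackIds` and `globalIndex` are hypothetical names; only `StackIdToIndex` comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the global index: deduplicates stack ids and
// hands out a stable index for each one (one hash lookup per call).
struct GlobalStackIds {
  std::vector<uint64_t> Ids;
  std::unordered_map<uint64_t, unsigned> IdToIndex;

  unsigned addOrGetIndex(uint64_t Id) {
    auto Res = IdToIndex.emplace(Id, static_cast<unsigned>(Ids.size()));
    if (Res.second)
      Ids.push_back(Id);
    return Res.first->second;
  }
};

// Hypothetical per-module reader: pays for the hash lookup once per stack id,
// then answers callsite/allocation record queries by vector indexing.
struct ModuleReader {
  std::vector<unsigned> StackIdToIndex; // module-local index -> global index

  void readStackIds(const std::vector<uint64_t> &ModuleIds,
                    GlobalStackIds &Global) {
    assert(StackIdToIndex.empty() && "STACK_IDS record processed twice");
    StackIdToIndex.reserve(ModuleIds.size());
    for (uint64_t Id : ModuleIds)
      StackIdToIndex.push_back(Global.addOrGetIndex(Id));
  }

  unsigned globalIndex(unsigned LocalIdx) const {
    assert(LocalIdx < StackIdToIndex.size() && "stack id index out of range");
    return StackIdToIndex[LocalIdx];
  }
};
```

With this shape, a record that references the same local stack id many times costs one array access per reference rather than one hash lookup.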
[clang][dataflow] Cache getModeledFields (#180878)
Cache getModeledFields at the DataflowAnalysisContext level. Different
contexts can have different ModeledFields for the same type, and scoping
the cache to the context also helps cap memory usage. This isn't the most
sharing we can get, but it is still effective (~70% hit rate). Otherwise, the
underlying getFieldsFromClassHierarchy is repeated many times and can
end up taking 4.6% of a run (geomean across some benchmarks), compared
to 40% for parsing and 5.3% for querySolver on the same benchmarks. That is
not insignificant, given that we also wonder whether querySolver is expensive.
Also change the return type to a reference, now that it is not a fresh Set
each time (though that copy is minor).
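A minimal sketch of a context-scoped cache that returns a reference, assuming a simplified model: `AnalysisContext`, `FieldSet`, `computeFields` and the hit/miss counters are illustrative names, not clang's API, and types are keyed by name only to keep the example self-contained.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

using FieldSet = std::set<std::string>;

// Hypothetical analysis context that owns the cache, so cached entries die
// with the context instead of accumulating globally.
class AnalysisContext {
public:
  // Returns a reference into the cache. std::map does not relocate its
  // nodes on insertion, so the reference stays valid while the context lives.
  const FieldSet &getModeledFields(const std::string &TypeName) {
    auto It = Cache.find(TypeName);
    if (It != Cache.end()) {
      ++Hits;
      return It->second;
    }
    ++Misses;
    auto Res = Cache.emplace(TypeName, computeFields(TypeName));
    assert(Res.second && "miss path should always insert");
    return Res.first->second;
  }

  unsigned Hits = 0;
  unsigned Misses = 0;

private:
  // Placeholder for the expensive class-hierarchy walk.
  static FieldSet computeFields(const std::string &TypeName) {
    return FieldSet{TypeName + "::a", TypeName + "::b"};
  }

  std::map<std::string, FieldSet> Cache;
};
```

Two contexts each hold their own map here, which mirrors the point that the same type may have different modeled fields in different contexts.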
[WebKit Checkers] Trivial analysis should check destructors of function parameters and local variables (#181576)
This PR fixes a bug in TrivialFunctionAnalysisVisitor: it wasn't
checking the triviality of destructors of function parameters and local
variables. As a result, code that calls a non-trivial destructor such as
RefPtr<T>::~RefPtr<T>, which calls T::~T, was incorrectly treated as
trivial, resulting in false negatives.
To fix this, we manually visit every function parameter and local
variable declaration and check the triviality of its destructor
recursively.
Also fix a bug where we were checking isVirtualAsWritten instead of
isVirtual in IsFunctionTrivial.
---------
Co-authored-by: Balazs Benics <benicsbalazs at gmail.com>
[mlir][LLVM] Fix verifier crash for llvm.blockaddress with missing function (#181519)
### What's the Problem
Fix verifier crash in `llvm.blockaddress` when the referenced function
symbol is missing by guarding null before `dyn_cast`.
Adds regression test using `-verify-diagnostics` to ensure invalid IR
emits an error instead of aborting.
### Why it happened
`SymbolTable::lookupNearestSymbolFrom` may return null, and `dyn_cast`
on a non-existent value triggers an assertion in `mlir-opt`.
### What's the Fix
Split the symbol lookup, returning early if the lookup fails or the symbol
is not an `LLVMFuncOp`.
The verifier now reports “expects an existing block label target” instead of
hitting the `dyn_cast` assert.
Fixes #181451
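The guard-then-cast pattern can be sketched in standard C++ as follows. This is a simplified model, not the MLIR verifier: `Op`, `FuncOp`, `GlobalOp`, `checkedDynCast`, `SymbolTable` and `verifyBlockAddress` are hypothetical stand-ins, and using the same diagnostic text for both failure cases is an assumption.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Op {
  virtual ~Op() = default;
};
struct FuncOp : Op {};
struct GlobalOp : Op {};

// Stand-in for a cast helper that, like the one described above, must not be
// handed a null value.
template <typename T> T *checkedDynCast(Op *O) {
  assert(O && "cast on a null value");
  return dynamic_cast<T *>(O);
}

using SymbolTable = std::map<std::string, Op *>;

// Returns an empty string on success, otherwise the diagnostic text.
std::string verifyBlockAddress(const SymbolTable &Symbols,
                               const std::string &Name) {
  auto It = Symbols.find(Name);
  Op *Sym = It == Symbols.end() ? nullptr : It->second;
  // Step 1: bail out before any cast if the lookup found nothing.
  if (!Sym)
    return "expects an existing block label target";
  // Step 2: only now is it safe to cast; reject symbols of the wrong kind.
  if (!checkedDynCast<FuncOp>(Sym))
    return "expects an existing block label target";
  return "";
}
```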
[Sema][HLSL][Matrix] Make matrices initializable by a single vector and vice-versa (#177486)
Fixes #169561
The problem was that the logic for determining whether or not to convert
constructor syntax into list initialization did not account for cases
where the constructor only had a single argument. Thus, the constructor
for a matrix with a single vector, or a vector with a single matrix did
not convert to list initialization despite being valid (in DXC).
This PR allows constructor syntax to be converted into list
initialization when the number of arguments in the constructor is 1, but
if and only if the destination type is a matrix/vector and the source
type is a vector/matrix.
[DAG] expandCLMUL - unroll vector clmul if vector multiplies are not supported (#182041)
Fixes PowerPC cases reported in #182039.
I'm hoping #177566 can be adapted to improve upon this.
[LLVM][CLANG] Update signal-handling behavior to comply with POSIX (#169340)
The POSIX standard
[POSIX.1-2024](https://pubs.opengroup.org/onlinepubs/9799919799/utilities/V3_chap01.html#tag_18)
specifies how a utility reacts to signals, as quoted below. This includes
clang when it is invoked as a utility such as
[c17](https://pubs.opengroup.org/onlinepubs/9799919799/utilities/c17.html)
```
ASYNCHRONOUS EVENTS
The ASYNCHRONOUS EVENTS section lists how the utility reacts to such events as signals and what signals are caught.
Default Behavior: When this section is listed as "Default.", or it refers to "the standard action" for any signal, it means that the action taken as a result of the signal shall be as follows:
If the action inherited from the invoking process, according to the rules of inheritance of signal actions defined in the System Interfaces volume of POSIX.1-2024, is for the signal to be ignored, the utility shall ignore the signal.
If the action inherited from the invoking process, according to the rules of inheritance of signal actions defined in System Interfaces volume of POSIX.1-2024, is the default signal action, the result of the utility's execution shall be as if the default signal action had been taken.
When the required action is for the signal to terminate the utility, the utility may catch the signal, perform some additional processing (such as deleting temporary files), restore the default signal action, and resignal itself.
```
[9 lines not shown]
[TableGen] Return int32_t from InstrMapping table lookup functions. NFC. (#182079)
Since #182059 there is only one case in which these functions return -1,
so callers no longer need to distinguish between (int64_t)-1 and
(uint32_t)-1, so we can go back to a 32-bit return value like it was
before #180954.
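To illustrate why the two sentinels needed a 64-bit return type to tell apart, the sketch below shows that `(uint32_t)-1` widened to `int64_t` differs from `(int64_t)-1`, while both collapse to `-1` in 32 bits. The helper names are hypothetical and unrelated to the generated TableGen functions.

```cpp
#include <cstdint>

// Widening an unsigned 32-bit all-ones value keeps it positive in 64 bits.
inline int64_t widenUnsigned(uint32_t Raw) { return static_cast<int64_t>(Raw); }

// Reinterpreting the same bits as a signed 32-bit value yields -1
// (guaranteed since C++20; the behavior of mainstream compilers before that).
inline int32_t asSigned32(uint32_t Raw) { return static_cast<int32_t>(Raw); }

// In 64 bits the two sentinels are distinct values.
static_assert(static_cast<int64_t>(-1) !=
                  static_cast<int64_t>(static_cast<uint32_t>(-1)),
              "(int64_t)-1 and (uint32_t)-1 differ once widened");
```

Once only a single "-1" case remains, nothing is lost by returning `int32_t`.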
[ConstantFolding] Fix type mismatch in ConstantFolding for vector types. (#181695)
Drop `Bitcast` case from `IsConstantOffsetFromGlobal` to avoid
misdetections.
[clang][ARM] Refactor argument handling in `EmitAArch64BuiltinExpr` (2/2) (NFC)
Refactor `EmitAArch64BuiltinExpr` so that all AArch64/NEON builtins
handled by this hook _and marked as overloaded_ share a common path
for generating LLVM IR arguments (collected into the `Ops`
`SmallVector<Value*>`) (*). This is a follow-up for #181794 - please
refer to that PR for more context.
As in the previous PR, the key change is implemented in
`HasExtraNeonArgument`, i.e. in the hook that identifies builtins with
the extra argument. In this PR, I am replacing the ad-hoc switch
statement with a more principled approach borrowed from SemaARM.cpp,
namely:
```cpp
uint64_t mask = 0;
switch (BuiltinID) {
#define GET_NEON_OVERLOAD_CHECK
#include "clang/Basic/arm_fp16.inc"
#include "clang/Basic/arm_neon.inc"
[28 lines not shown]
[HLSL] Define CBuffer field alignment for matrix types (#179836)
fixes https://github.com/llvm/llvm-project/issues/179834
This change defines matrix alignment as the buffer row length (16), the
same as for arrays and structs.
It also adds tests for matrices, matrices in structs, and arrays.
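As a small worked example of what a 16-unit row alignment means for field offsets, the sketch below rounds an offset up to the next row boundary. `CBufferRowAlign` and `alignTo` are illustrative names, and treating the "16" as a byte count is an assumption.

```cpp
// Assumed row length of a constant buffer, in bytes.
constexpr unsigned CBufferRowAlign = 16;

// Rounds Offset up to the next multiple of Align (Align must be non-zero).
constexpr unsigned alignTo(unsigned Offset, unsigned Align) {
  return (Offset + Align - 1) / Align * Align;
}

// A matrix field that would otherwise start at offset 4 moves to the next row.
static_assert(alignTo(4, CBufferRowAlign) == 16, "rounded up to a full row");
static_assert(alignTo(16, CBufferRowAlign) == 16, "already aligned");
```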
[clang][ARM] Refactor argument handling in `EmitAArch64BuiltinExpr` (1/2) (NFC)
Refactor `EmitAArch64BuiltinExpr` so that all AArch64/NEON builtins
handled by this hook _and marked as non-overloaded_ share a common path
for generating LLVM IR arguments (collected into the `Ops`
`SmallVector<Value*>`) (*)
Previously, the argument emission loop unconditionally skipped the
trailing argument:
```cpp
for (unsigned i = 0, e = E->getNumArgs() - 1; i != e; ++i)
```
This was originally intended to ignore the extra Sema-only argument
used by overloaded NEON builtins (e.g. the type discriminator passed
by `__builtin_neon_*` intrinsics). However, this logic was applied
unconditionally.
[37 lines not shown]
[HLSL][Cbuffer][Matrix] Add Cbuffer padding and createBufferMatrixTempAddress (#181903)
fixes #181901
This change detects when an HLSL Cbuffer matrix’s layout differs from its
in-memory type and materializes a temporary with the non-padded matrix
type. Matrix elements are copied explicitly from the padded buffer
layout into the temporary, ensuring correct addressing and avoiding
overlapping GEPs or incorrect vector flattening.
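The element-wise copy can be pictured with the sketch below, which assumes each vector of the matrix is padded out to a fixed length in the buffer and copies only the real elements into a densely packed temporary. `unpadMatrix` and its parameters are hypothetical and do not reflect the code generated by `createBufferMatrixTempAddress`.

```cpp
#include <cstddef>
#include <vector>

// Copies a matrix made of Vecs vectors of Len elements each, stored with
// every vector padded to PaddedLen elements, into a packed temporary.
std::vector<float> unpadMatrix(const std::vector<float> &Padded,
                               std::size_t Vecs, std::size_t Len,
                               std::size_t PaddedLen) {
  std::vector<float> Dense;
  Dense.reserve(Vecs * Len);
  for (std::size_t V = 0; V < Vecs; ++V)
    for (std::size_t E = 0; E < Len; ++E)
      Dense.push_back(Padded[V * PaddedLen + E]); // skip the padding slots
  return Dense;
}
```

For a 3x3 matrix padded to four elements per vector, this reads 9 of the 12 stored values and never touches the padding.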
[NFC] Re-enable MSVC C4351 diagnostic (#182082)
From MSDN:
https://learn.microsoft.com/en-us/previous-versions/1ywe7hcy(v=vs.140)
This diagnostic is no longer documented in MSDN and from my local
testing, the diagnostic is not emitted in our source. I believe we no
longer need to disable this diagnostic.
[lldb] Allow tests to share a single build (#181720)
This changes Python API tests to use a single build shared across all
test functions, instead of the previous default behavior of a separate
build dir for each test function.
This build behavior is opt-out: tests can restore the previous behavior of
one individual (unshared) build directory per test function by setting
`SHARED_BUILD_TESTCASE` to False (in the test class).
The motivation is to make the test suite more efficient, by not
repeatedly building the same test source. When running tests on my macOS
machine, this reduces the time of `ninja check-lldb-api` by almost 60%
(sample numbers: from ~492s down to ~207s, a 58% reduction), saving almost
5 minutes.
Each test function still calls `self.build()`, but only the first call
will do a build. In subsequent tests, `make` will be a no-op because
the sources won't have changed.
[CIR][NEON] Add lowering for `vfmah_f16` and `vfmsh_f16` (#181148)
As with other NEON builtins, reuse the existing default-lowering
tests to validate the CIR lowering path.
[MLIR][XeGPU] Add LANE level integration test without XeGPU ops. (#181891)
The XeGPU LANE-level integration tests lack a test that does not use any
XeGPU dialect ops.
Add an integration test without XeGPU dialect ops.