[MemProf] Merge all callee guids for indirect call VP metadata (#170964)
When matching memprof profiles, for indirect calls we use the callee
guids recorded on callsites in the profile to synthesize indirect call
VP metadata when none exists. However, we only do this for the first
matching CallSiteEntry from the profile.
In some cases there can be multiple, for example when the current
function was eventually inlined into multiple callers. Profile
generation propagates the CallSiteEntry from those callers into the
inlined callee's profile as it may not yet have been inlined in the
new compile.
To capture all of these potential indirect call targets, merge callee
guids across all matching CallSiteEntries.
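
A minimal sketch of the merge described above, not the actual MemProf matching code. The `CallSiteEntry` struct, its `LocationHash` field, and `mergeCalleeGuids` are hypothetical stand-ins chosen for illustration; the point is only that the union is taken over every matching entry instead of stopping at the first one.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for a profile callsite record; this is not the
// real MemProf data structure.
struct CallSiteEntry {
  uint64_t LocationHash;             // assumed key identifying the call location
  std::vector<uint64_t> CalleeGuids; // indirect call targets seen in the profile
};

// Return the union of callee GUIDs over every entry that matches the given
// location, rather than only the first matching entry.
std::set<uint64_t> mergeCalleeGuids(const std::vector<CallSiteEntry> &Entries,
                                    uint64_t LocationHash) {
  std::set<uint64_t> Merged;
  for (const CallSiteEntry &E : Entries) {
    if (E.LocationHash != LocationHash)
      continue; // not a match, keep scanning instead of stopping
    Merged.insert(E.CalleeGuids.begin(), E.CalleeGuids.end());
  }
  return Merged;
}
```

With two entries for the same location carrying guids {10, 20} and {20, 30}, the merged set is {10, 20, 30}, so a target seen only through the second inlined caller is still available when synthesizing VP metadata.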
[flang] add simplification for ProductOp intrinsic (#169575)
Add simplification for `ProductOp` by implementing support for it in
`ReductionConversion` and adding it to the pattern list in the
`SimplifyHLFIRIntrinsics` pass.
Closes:
https://github.com/issues/recent?issue=llvm%7Cllvm-project%7C169433
---------
Co-authored-by: Eugene Epshteyn <eepshteyn at nvidia.com>
[FlowSensitive] [StatusOr] [11/N] Assume const accessor calls are stable (#170935)
This is not necessarily correct, but prevents us from flagging lots of
false positives because code usually abides by this.
[RISCV] Remove unnecessary override of getVectorTypeBreakdownForCallingConv. NFC (#171155)
There used to be code in here to make i32 legal on RV64, but it was
removed.
Also remove unnecessary temporary variable from
getRegisterTypeForCallingConv.
[AMDGPU] Add argument range annotations to intrinsics where applicable (#170958)
This commit adds annotations to AMDGPU intrinsics that take arguments
which are documented to lie within a specified range, ensuring that
invalid instances of these intrinsics don't pass verification.
(Note that certain intrinsics that could have range annotations don't,
as their existing behavior is to clamp out-of-range values silently.)
Disclaimer: tests generated by LLM (code is mine)
[NFC] [FlowSensitive] Fix missing namespace in MockHeaders (#170954)
This happened to work because we were missing both a namespace close and
open and things happened to be included in the correct order.
[dsymutil] Remove spurious exit when falling back to fat64 header (#171189)
In #118898 I changed dsymutil to emit a warning instead of an error when
exceeding the 4GB limit for a slice and automatically fall back to using
the fat64 header. However, while doing so, I forgot to remove the return,
which defeats the whole purpose.
rdar://140998416
[sancov] Add -diff option to compute set difference of sancov files
Add a new -diff action that computes the difference between two sancov
coverage files (A - B) and writes the result to a new .sancov file.
The option takes exactly two input .sancov files and requires an
--output option to specify the output file. The output file preserves
the binary format (magic number and bitness) from the first input file.
A warning is emitted if the two input files have different bitness
(32-bit vs 64-bit), though the operation proceeds using the bitness
from file A.
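
The set difference itself can be sketched as follows. This is an illustration only, not the sancov implementation: `diffCoverage` is a hypothetical helper that works on already-decoded addresses and leaves out reading and writing the .sancov header.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Return the addresses present in A but not in B (A - B), sorted and
// deduplicated. Inputs are taken by value so they can be sorted locally.
std::vector<uint64_t> diffCoverage(std::vector<uint64_t> A,
                                   std::vector<uint64_t> B) {
  // std::set_difference requires both ranges to be sorted.
  std::sort(A.begin(), A.end());
  std::sort(B.begin(), B.end());
  // Drop duplicates in A so each address appears at most once in the result.
  A.erase(std::unique(A.begin(), A.end()), A.end());

  std::vector<uint64_t> Result;
  std::set_difference(A.begin(), A.end(), B.begin(), B.end(),
                      std::back_inserter(Result));
  return Result;
}
```

For example, with A = {0x30, 0x10, 0x20, 0x10} and B = {0x20, 0x40}, the result is {0x10, 0x30}.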
Fix VarArgs FixedStack object on AIX. (#170240)
Create a mutable aliased fixed stack object for the va_list when any of
the optional arguments are passed in GPRs. Because the GPRs must be
spilled into the parameter save area, the stack object is not immutable.
And because the values will almost certainly be accessed through the IR
value for a va_list, make the stack object aliased as well.
[DAG] Generate UMULH/SMULH with wider vector types (#170283)
The existing code for generating umulh/smulh was checking that the
getTypeToTransformTo was a LegalOrCustom operation. This only takes a
single legalization step though, so if v4i32 was legal, a v8i32 would be
transformed but a v16i32 would not.
This patch introduces a getLegalTypeToTransformTo that performs
getTypeToTransformTo until a legal type is reached. The umulh/smulh code
can then use it to check if the final resultant type will be legal.
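
The repeated-step idea can be sketched with a toy type model. Everything here is hypothetical: `ToyVT`, `isLegal`, and `typeToTransformTo` only mimic the shape of the real TargetLowering queries, under the assumption that vectors of up to 4 elements are legal and each legalization step halves the element count.

```cpp
#include <cassert>

// Hypothetical stand-in for a vector value type; only the element count
// matters for this illustration.
struct ToyVT {
  unsigned NumElts;
};

// Assumption for the sketch: vectors of up to 4 elements are legal.
bool isLegal(ToyVT VT) { return VT.NumElts <= 4; }

// One legalization step: legal types map to themselves, wider types are
// split in half.
ToyVT typeToTransformTo(ToyVT VT) {
  return isLegal(VT) ? VT : ToyVT{VT.NumElts / 2};
}

// Apply the single-step query repeatedly until a legal type is reached.
ToyVT legalTypeToTransformTo(ToyVT VT) {
  while (!isLegal(VT)) {
    ToyVT Next = typeToTransformTo(VT);
    assert(Next.NumElts < VT.NumElts && "legalization step must make progress");
    VT = Next;
  }
  return VT;
}
```

In this model a single step takes 8 elements to the legal 4, but takes 16 elements only to 8, which is still illegal. The loop reaches 4 in both cases, which is what lets the wider type qualify.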
[DebugInfo][test] Fix llvm/test/ThinLTO/X86/pr35472.ll (NFC) (#170952)
The test is intended to verify lazy loading of debug-location scope
metadata. However, after d5d3eb16b7ab72529c83dacb2889811491e48909,
the DILexicalScope that was expected to be lazily loaded was not used in IR,
so lazy loading did not occur.
This patch fixes that, and adds an extra bcanalyzer check to ensure that
the DILexicalScope record is emitted.