[mlir][spirv] Remove unnecessary assertion (#200137)
The variable was used only in the assertion, so it became unused when
compiling with assertions off, which caused a build failure.
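
For illustration only, the failure pattern can be sketched as follows. The
function and variable names are hypothetical and not taken from the patch,
and the patch itself simply removes the assertion; `[[maybe_unused]]` is
shown here as a common alternative when the check is worth keeping.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not from the patch.
int sumChecked(const std::vector<int> &values) {
  int sum = 0;
  for (int v : values)
    sum += v;
  // With NDEBUG defined, assert(...) expands to nothing, so a variable that
  // is referenced only inside the assertion becomes unused. Under
  // -Werror=unused-variable that turns into a build failure.
  // [[maybe_unused]] (C++17) suppresses the warning in both build modes.
  [[maybe_unused]] bool nonNegative = sum >= 0;
  assert(nonNegative && "expected a non-negative sum");
  return sum;
}
```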
Signed-off-by: Davide Grohmann <davide.grohmann at arm.com>
[NFC][TableGen] Reorganize GlobalISelMatchTable.h
This file was a bit of a kitchen sink, and the implementation of the
match table is sufficiently difficult to get comfortable with already.
I spent the past few weeks looking at it and finding improvements, and
I think a nice way to make it easier to approach is to split up
the file so that the main implementation (Matchers.h/.cpp) only
contains the code pertaining to the Matchers (RuleMatchers, Preds, etc.).
We now have 3 files:
- One for type (LLT) related utilities.
- One for the MatchTable emission logic, which is generic and should not
be tied to any specific implementation. It just has the tools to emit
the opcodes for the table.
- One for the entire Matcher system, including PredicateMatchers and so on.
[LangRef] Specify that syncscopes can affect the monotonic modification order (#189017)
If a target specifies that atomics with mismatching syncscopes appear
non-atomic to each other, there is no point in requiring them to be ordered in
the monotonic modification order. Notably, the [AMDGPU target user
guide](https://llvm.org/docs/AMDGPUUsage.html#memory-scopes) has specified
syncscopes to relax the modification order for years.
So far, I haven't found an example where this less constrained ordering would
be observable (at least with the AMDGPU inclusive scope rules). Whenever a load
would be able to see two monotonic stores with non-inclusive scope, that's
considered a data race (i.e., the load would return `undef`), so it cannot be
used to observe the order of the stores.
Related RFC: https://discourse.llvm.org/t/rfc-clarifying-llvm-irs-concurrent-memory-model/90480
[DirectX] Generate shader debug file name part in llc (#199555)
This change modifies the DXContainerGlobals pass to generate the debug
name (ILDN) part in DXContainer. The ILDN part allows consumers to find
the PDB file containing shader debug info.
Because the ILDB emission PR is not merged yet and PDB file creation is
not upstreamed yet, the debug name is generated from the MD5 hash of the
bitcode module in the DXIL part.
This corresponds to DXC behavior when a shader is compiled with the `/Zi
/Qembed_debug /Zsb` flags (with `/Qembed_debug`, DXC does not produce an
actual PDB file but still emits ILDN; `/Zsb` tells DXC to use the bitcode
from DXIL to compute the hash).
However, here ILDN is emitted for any debug info flag configuration,
assuming that it won't break debug info consumers and that PDB creation
will be added later.
[AArch64] Fix definition of system register move instructions (#185709)
The current implementation of these instructions makes bit 20 of the
encoding part of the system register operand, which is incorrect because the
[specification](https://developer.arm.com/documentation/ddi0602/latest)
requires that bit to be set to 1. This patch removes bit 20 from the
encoding of the operand and makes it a fixed field for these
instructions. It also fixes the parser and codegen by checking that Op0
in the system register name/encoding is correctly constrained to 2 or 3.
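
As a rough sketch of why fixing bit 20 constrains Op0 to 2 or 3, consider
the MRS instruction word. The helper name `encodeMRS` and the 16-bit
operand packing (op0:op1:CRn:CRm:op2) are assumptions made for this
example, not the code from the patch.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical encoder. Assumed operand packing:
//   op0[15:14] op1[13:11] CRn[10:7] CRm[6:3] op2[2:0]
std::optional<uint32_t> encodeMRS(uint16_t sysReg, unsigned rt) {
  unsigned op0 = sysReg >> 14;
  // Bit 20 is a fixed 1, so only the low bit of op0 (o0, bit 19) is
  // encodable. That leaves op0 == 2 or op0 == 3.
  if (op0 != 2 && op0 != 3)
    return std::nullopt;
  if (rt > 31)
    return std::nullopt;
  uint32_t inst = 0xD5300000u;                        // MRS base, bit 20 set
  inst |= static_cast<uint32_t>(sysReg & 0x7FFF) << 5; // o0..op2 -> bits 19:5
  inst |= rt;                                          // Rt -> bits 4:0
  return inst;
}
```

Under these assumptions, `encodeMRS(0xC000, 0)` (op0 = 3, all other fields
zero, i.e. S3_0_C0_C0_0) yields `0xD5380000`, while any operand with op0
of 0 or 1 is rejected.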
Depends on #185970
[TySan] Expose __tysan_set_type_unknown interface (#198800)
This can help work around issues like
[#143587](https://github.com/llvm/llvm-project/issues/143587)
The function is renamed with two leading underscores to match the
naming scheme of the other sanitizers.
[LifetimeSafety] Improve diagnostics for use-after-scope (#200031)
Reuses the function for getting object information that was added in
#199432
Comes as part of completing #186002
Co-authored-by: Utkarsh Saxena <usx at google.com>
Avoid infinite loop when parsing PFKEY replies
In bgpd, iked, isakmpd, ldpd and sasyncd we have similar code to
parse PFKEY replies from the kernel. To avoid an infinite loop on
malformed replies, validate the SADB extension size.
For consistency with the other daemons, rewrite the parsing loop of
iked.
sasyncd already validates the extension size, so no change is needed
there.
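
A minimal sketch of the kind of check described above, not the daemons'
actual code: `ExtHeader` is a stand-in for `struct sadb_ext` (length in
8-byte units, header included, as in RFC 2367), and `walkExtensions` is a
hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for struct sadb_ext; len counts 8-byte units including the header.
struct ExtHeader {
  uint16_t len;
  uint16_t type;
};

// Hypothetical walker: returns the number of extensions seen, or -1 if the
// reply is malformed.
int walkExtensions(const uint8_t *buf, size_t size) {
  int count = 0;
  size_t off = 0;
  while (off < size) {
    ExtHeader ext;
    if (size - off < sizeof(ext))
      return -1;                      // truncated header
    std::memcpy(&ext, buf + off, sizeof(ext));
    size_t extBytes = static_cast<size_t>(ext.len) * 8;
    // A zero length would never advance 'off', which is the infinite loop;
    // an oversized length would run past the end of the reply.
    if (extBytes == 0 || extBytes > size - off)
      return -1;
    off += extBytes;
    ++count;
  }
  return count;
}
```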
ok claudio@ tb@ tobhe@
[AMDGPU] Use separate tables for VOPD3X and VOPDY in `getCanBeVOPD` (NFC) (#199072)
With this change, tables for valid VOPD3X (VOPDX still uses
`getVOPDComponentHelper`) and VOPDY operations are
generated through TableGen. This simplifies the look-up,
leading to a 2-3% compile-time speed-up for tested shaders
where `getCanBeVOPD` is on a hot path.
Assisted-by: Claude Code
[mlir][tosa] Limit consecutive concat rewrite to MAX_TENSOR_LIST_SIZE (#199051)
Previously, folding could produce an operation that validation would
later reject because of its number of operands.
This change adds a check to prevent rewriting consecutive concat
operations if the resulting operation would have more than
MAX_TENSOR_LIST_SIZE operands, based on the selected target environment
level. If no level is specified, folding proceeds as before.
In addition, this change rewrites the concat folder as a
canonicalization pattern, since it is not a fold of constant operands.
The change also consolidates testing in canonicalize.mlir.
[compiler-rt][ARM] Optimized FP -> integer conversions (#179927)
This commit adds a total of 8 new functions, all converting a
floating-point number to an integer, varying in 3 independent choices:
* input float format (32-bit or 64-bit)
* output integer size (32-bit or 64-bit)
* output integer type (signed or unsigned)
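
To make one of the eight combinations concrete (32-bit float to signed
32-bit integer), here is a portable sketch of a truncating conversion done
with integer operations only. It is not the code added by this commit: the
name `floatToInt32` is hypothetical, and saturating on overflow and
returning 0 for NaN are assumptions of this example.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: truncates toward zero. Out-of-range inputs saturate
// and NaN yields 0 (both are assumptions of this sketch).
int32_t floatToInt32(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  bool negative = (bits >> 31) != 0;
  int exponent = static_cast<int>((bits >> 23) & 0xFF) - 127;
  uint32_t fraction = bits & 0x7FFFFFu;
  uint32_t mantissa = fraction | 0x800000u; // restore the implicit leading 1

  if (exponent < 0)
    return 0; // |f| < 1, including zeros and subnormals
  if (exponent >= 31) {
    if (exponent == 128 && fraction != 0)
      return 0; // NaN
    return negative ? std::numeric_limits<int32_t>::min()
                    : std::numeric_limits<int32_t>::max();
  }
  // The value is mantissa * 2^(exponent - 23); shift accordingly.
  uint32_t magnitude = exponent >= 23 ? mantissa << (exponent - 23)
                                      : mantissa >> (23 - exponent);
  return negative ? -static_cast<int32_t>(magnitude)
                  : static_cast<int32_t>(magnitude);
}
```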
virtual_oss(8): Create loopback devices with GID_AUDIO
Make sure the user is part of the audio group to avoid unintended
snooping of loopback audio by unprivileged users.
While here, retire voss_dsp_perm, since we don't use the same value
everywhere now.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Reviewed by: emaste
Pull-Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/26
(cherry picked from commit 5f904cb1b05c94453727abb606d6109fe504b10b)
rc: virtual_oss: Create a loopback device in the default configuration
The loopback device allows us to record desktop sound by reading from
it, or even use it as an input device, for example during a call.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Reviewed by: emaste
Pull-Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/16
(cherry picked from commit 8532b4a436364d04d5c1feb7af5ecd4b5df71a9f)