[HIP][Driver] Forward -fcoverage-mapping flags to device compiler (#198872)
Add `-fcoverage-mapping`, `-fno-coverage-mapping`,
`-fcoverage-compilation-dir=`, `-ffile-compilation-dir=`, and
`-fcoverage-prefix-map=` to the LinkerWrapper `CompilerOptions`
forwarding list. Without this, passing `-fprofile-instr-generate
-fcoverage-mapping` to clang for a HIP program silently omits the
coverage mapping flags from the embedded device recompilation, so
`__llvm_covmap`/`__llvm_covfun` sections are never emitted for device
code.
Be consistent with "built in" vs "built-in"
Use the hyphenated version only when describing, or referring to, one or
more (including the general set of) actual built-in commands (like just there)
in sh.
Use the two-word version in all other contexts, including when describing
functionality (like line editing) that is built in to sh (like just there),
although normally one would write "built into" there if not making the point!
[LoopFusion] reject unsafe scalar flow dependences (#195895)
`loop-fusion` treats any loop-invariant scalar non-anti dependence as
safe to fuse. In the linked issue, it incorrectly allows scalar flow
dependences where the first loop writes a loop-invariant location and
the second loop later reads that same location. Fusion interleaves the
producer and consumer, which changes the value observed by the second
loop.
Example C source would look like:
```C
for (int i = 0; i < N; i++) {
ptr[0] = i;
}
for (int j = 0; j < N; j++) {
out[j] = ptr[0];
}
=>
for (int i = 0; i < N; i++) {
[14 lines not shown]
nfsd: Allow vfs.nfsd.srvmaxio to be up to 4Mbytes
Without this patch, the maximum setting for
vfs.nfsd.srvmaxio was 1Mbyte. This patch increases
that to 4Mbytes.
As with any setting above 128Kbytes, settings up to
4Mbytes require that kern.ipc.maxsockbuf be increased.
(A message generated after setting vfs.nfsd.srvmaxio via
the /etc/rc.conf variable nfs_server_maxio will indicate
the minimum setting, which will be somewhat greater than
four times the setting of vfs.nfsd.srvmaxio.)
(cherry picked from commit b92b9da3300655c86dcd42ea8a5ba45badd90847)
subr_uio.c: Remove a KASSERT() for large NFS server I/O
When the NFS server is set to allow an I/O size greater
than 1Mbyte (not allowed in FreeBSD's main yet), a
KASSERT() in allocuio() can fail when:
zfs_freebsd_write()->zfs_write()->zfs_uiocopy()
->cloneuio()->allocuio()
is called for a large NFS server write.
Since the userland API callers to allocuio() already
check that the size does not exceed UIO_MAXIOV,
there does not seem to be a need for a KASSERT()
here.
Removing the KASSERT() allows NFS server writes
of greater than 1Mbyte to work, once the NFS code
is patched to allow them.
(cherry picked from commit 13d3bd165e225eec9af91b6e3361c2482931f95b)
[Driver] Honor /Fo when deriving the split-dwarf .dwo path (#199613)
SplitDebugName checked -o and /o but not /Fo, so clang-cl /Fo<path> /c
fell through to the cwd-relative fallback and every .dwo landed in cwd
under <source-stem>.dwo regardless of the .obj location.
[PGO][HIP] Stop pulling ROCm.o into every PGO host link (#200101)
PR #177665 added an unconditional `extern` reference to
`__llvm_profile_hip_collect_device_data` from `InstrProfilingFile.c`,
which forces `InstrProfilingPlatformROCm.o` (and its sanitizer_common /
interception dependencies) out of `libclang_rt.profile.a` in every PGO
binary. That breaks bots without `-lpthread` and races dlsym/PLT state
in non-HIP programs via the interceptor constructor.
Fix:
- Declare the hook `COMPILER_RT_WEAK` and gate the call on its address.
No `COMPILER_RT_VISIBILITY`: a hidden weak-undef function would be
non-preemptible and the address test would fold to true.
- Gate `installHipModuleInterceptors` on `dlsym(hipModuleLoad)` so the
constructor is a no-op if `ROCm.o` is still pulled in.
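The weak-reference gating described in the first bullet can be sketched as below. This is a minimal illustration, not the compiler-rt code: `optional_hook`, `call_hook_if_linked`, and `check_hook_absent` are hypothetical names, and the sketch assumes GCC or Clang on an ELF target, where an undefined weak function resolves to a null address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook standing in for the real one. Nothing here defines it.
 * Because the declaration is weak, the link succeeds without a definition,
 * and the reference alone does not pull a member out of a static archive. */
__attribute__((weak)) int optional_hook(void);

/* Returns the hook's result if a definition was linked in, or -1 if not. */
int call_hook_if_linked(void) {
  if (optional_hook != NULL) /* the address test gates the call */
    return optional_hook();
  return -1;
}

/* Holds only while no object in the link defines optional_hook. */
void check_hook_absent(void) {
  assert(call_hook_if_linked() == -1);
}
```

The declaration deliberately carries no hidden-visibility attribute, for the reason given in the bullet: the address comparison has to survive to link time rather than being folded by the compiler.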
Fixes:
- https://lab.llvm.org/buildbot/#/builders/66/builds/31311
- https://lab.llvm.org/buildbot/#/builders/174/builds/36180
[7 lines not shown]
graphics/mesa-devel: unbreak build after 57a95f9faa65
Traceback (most recent call last):
File "src/intel/vulkan/anv_dricrc_gen.py", line 275, in <module>
main()
File "src/intel/vulkan/anv_dricrc_gen.py", line 268, in main
drirc_validate([args.validate], options, driver="anv")
File "src/util/drirc_gen.py", line 201, in drirc_validate
tree = ET.parse(conf_path)
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/xml/etree/ElementTree.py", line 1219, in parse
tree.parse(source, parser)
File "/usr/local/lib/python3.11/xml/etree/ElementTree.py", line 570, in parse
source = open(source, "rb")
^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'src/util/00-mesa@-defaults.conf'
Reported by: pkg-fallout
[clang][AMDGPU] Fix -ast-print crash on expanded predicate builtins (#199963)
ExpandAMDGPUPredicateBuiltIn synthesized an IntegerLiteral typed
_Bool/bool — a shape no other producer creates, and one that
StmtPrinter::VisitIntegerLiteral has no case for. -ast-print on the
resulting if-condition hit llvm_unreachable.
Emit the canonical boolean literal instead:
- C++, C23, OpenCL, HIP: CXXBoolLiteralExpr 'bool'
- pre-C23 C: IntegerLiteral 'int'
In the C case this matches what <stdbool.h>'s true/false macros expand
to.
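As a small illustration of the pre-C23 claim, the sketch below asks the compiler what type the expression `true` has. The function names are hypothetical and the code needs C11 or later for `_Generic`; it says nothing about how the Clang AST is built.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns 1 if the expression `true` has type int, 0 otherwise. */
int true_has_type_int(void) {
  return _Generic(true, int: 1, default: 0);
}

/* Before C23, <stdbool.h> defines true as the integer constant 1, so the
 * expression has type int. From C23 on, true is a keyword of type bool,
 * so the check is skipped for those language modes. */
void check_true_type(void) {
#if __STDC_VERSION__ < 202000L
  assert(true_has_type_int() == 1);
#endif
}
```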
Fixes #199563
[clang] fix getTemplateInstantiationArgs
This implements a new strategy for collecting the template arguments, by
relying on the qualifiers and template parameter lists to navigate the template
context of out-of-line definitions.
This greatly simplifies the signature of that function, by removing a bunch
of workarounds, and simplifying a couple that weren't removed yet.
Since this now relies on qualifiers and template parameter lists,
this patch expends most of its effort making sure these are placed,
transformed and propagated to template instantiations.
Also makes the explicit specialization AST nodes stop abusing the template
parameter lists by storing their own template parameter list in a
dedicated field, similar to partial specializations.
[SelectionDAG] Widen <2 x T> vector types for atomic store
Vector types of 2 elements must be widened. This change does this
for atomic stores of such vector types in SelectionDAG so that it can
translate aligned vectors with more than one element.
[RISCV] Fix incorrect CM.MVSA01/QC_CM_MVSA01 generation with Zdinx (#200000)
The `RISCVMoveMerger` pass was incorrectly forming
`CM_MVSA01/QC_CM_MVSA01` when `Zdinx` was enabled. The pass attempted CM
merge for copy pairs even when the first copy was not an `a0/a1-based`
CM candidate.
Fix by only running `findMatchingInst` when the current copy is a valid
CM candidate.
[RISCV][P-ext] Split v4i16/v8i8 INSERT/EXTRACT_VECTOR_ELT on RV32. (#199917)
With a constant lane index, split the vector and recurse on the
single-GPR half containing Idx (already Custom-lowered).
fix(firewire): fix tcode switch fallthrough on little-endian
The break statements in fwohci_arcv_swap() were inside
#if BYTE_ORDER == BIG_ENDIAN guards, causing all cases to fall
through to the default "Unknown tcode" handler on little-endian
(x86) systems. This meant every received packet was dropped,
breaking bus manager election, split transactions, and all
asynchronous communication.
Move break statements outside the #if guards to match FreeBSD.
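The shape of the bug can be reproduced in a few lines. This is a simplified sketch, not the driver code: `SKETCH_BIG_ENDIAN` is a hypothetical stand-in for the `BYTE_ORDER == BIG_ENDIAN` test (set to 0 to model a little-endian build), and the case values 0 and 1 are arbitrary rather than real tcodes.

```c
#include <assert.h>

/* Hypothetical stand-in for the byte-order test; 0 models little-endian. */
#define SKETCH_BIG_ENDIAN 0

enum { KNOWN = 0, UNKNOWN = -1 };

/* Buggy shape: the break sits inside the guard, so on little-endian it is
 * compiled out and every case falls through to the default handler. */
int classify_buggy(int tcode) {
  int result = KNOWN;
  switch (tcode) {
  case 0:
  case 1:
#if SKETCH_BIG_ENDIAN
    /* byte swapping would happen here */
    break;
#endif
  default:
    result = UNKNOWN;
    break;
  }
  return result;
}

/* Fixed shape: only the swapping is conditional; the break always exists. */
int classify_fixed(int tcode) {
  int result = KNOWN;
  switch (tcode) {
  case 0:
  case 1:
#if SKETCH_BIG_ENDIAN
    /* byte swapping would happen here */
#endif
    break;
  default:
    result = UNKNOWN;
    break;
  }
  return result;
}
```

With the guard false, `classify_buggy(0)` reports `UNKNOWN` while `classify_fixed(0)` reports `KNOWN`; an unlisted value such as 7 is `UNKNOWN` in both.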