[clang-doc] Remove uses of consumeError (#168759)
In BitcodeReader, we were using consumeError(), which drops the error
and hides it from normal usage. To avoid that, we can just slightly
tweak the API to return an Expected<T>, and propagate the error
accordingly.
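As a rough illustration of the pattern described above, the sketch below propagates an error to the caller instead of discarding it. It deliberately does not use the real `llvm::Expected<T>`: `ExpectedLike`, `ErrorMsg`, `parseRecordId`, and `readNextId` are hypothetical stand-ins built on the standard library, and none of them come from the clang-doc sources.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Hypothetical error payload; the real LLVM API uses llvm::Error instead.
struct ErrorMsg {
  std::string Text;
};

// Hypothetical, simplified stand-in for llvm::Expected<T>:
// holds either a value of type T or an error message.
template <typename T> class ExpectedLike {
  std::variant<T, ErrorMsg> Storage;

public:
  ExpectedLike(T Value) : Storage(std::in_place_type<T>, std::move(Value)) {}
  ExpectedLike(ErrorMsg Err)
      : Storage(std::in_place_type<ErrorMsg>, std::move(Err)) {}

  // True when a value (not an error) is stored.
  explicit operator bool() const { return std::holds_alternative<T>(Storage); }

  const T &value() const {
    assert(*this && "no value present");
    return std::get<T>(Storage);
  }
  const ErrorMsg &error() const {
    assert(!*this && "no error present");
    return std::get<ErrorMsg>(Storage);
  }
};

// Hypothetical reader step: parse a decimal record ID, reporting failures
// to the caller rather than swallowing them.
inline ExpectedLike<int> parseRecordId(const std::string &Text) {
  if (Text.empty())
    return ErrorMsg{"empty record"};
  int Id = 0;
  for (char C : Text) {
    if (C < '0' || C > '9')
      return ErrorMsg{"non-numeric record id: " + Text};
    Id = Id * 10 + (C - '0');
  }
  return Id;
}

// The caller checks the result and forwards the error unchanged.
inline ExpectedLike<int> readNextId(const std::string &Text) {
  ExpectedLike<int> Id = parseRecordId(Text);
  if (!Id)
    return Id.error();
  return Id.value() + 1;
}
```

The point of the design is that every layer either handles the error or hands it upward, so a failure in the reader stays visible to whoever called it.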
[WebAssembly] Implement addrspacecast to funcref (#166820)
Adds lowering of `addrspacecast [0 -> 20]` to allow easy conversion of
function pointers to Wasm `funcref`.
When given a constant function pointer, it lowers to a direct
`ref.func`. Otherwise it lowers to a `table.get` from
`__indirect_function_table` using the provided pointer as the index.
Reland Refactor WIDE_READ to allow finer control over high-performance function selection (#165613) (#170738)
[Previous commit had an incorrect default case when
FIND_FIRST_CHARACTER_WIDE_READ_IMPL was not specified in config.json.
This PR is identical to that one with one line fixed.]
As we implement more high-performance string-related functions, we have
found a need for better control over their selection than the big-hammer
LIBC_CONF_STRING_LENGTH_WIDE_READ. For example, I have a memchr
implementation coming, and unless I implement it in every variant, a
simple binary value doesn't work.
This PR gives finer-grained control over high-performance
functions than the generic LIBC_CONF_UNSAFE_WIDE_READ option. For any
function they like, the user can now select one of four implementations
at build time:
1. element, which reads byte-by-byte (or wchar by wchar)
2. wide, which reads by unsigned long
[11 lines not shown]
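To make the difference between the `element` and `wide` variants concrete, here is a minimal sketch of a bounded byte search written both ways. It is not the libc implementation: the names `find_byte_element` and `find_byte_wide` are hypothetical, and the wide version takes an explicit length so that it never reads past the buffer, whereas the option described above may trade that guarantee for speed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Element variant: inspect one byte at a time.
inline std::size_t find_byte_element(const unsigned char *buf, std::size_t n,
                                     unsigned char c) {
  assert(buf != nullptr || n == 0);
  for (std::size_t i = 0; i < n; ++i)
    if (buf[i] == c)
      return i;
  return n; // not found
}

// Wide variant: scan one unsigned long at a time, then pin down the exact
// position (and handle the tail) byte by byte.
inline std::size_t find_byte_wide(const unsigned char *buf, std::size_t n,
                                  unsigned char c) {
  assert(buf != nullptr || n == 0);
  using Word = unsigned long;
  constexpr std::size_t W = sizeof(Word);
  constexpr Word Ones = ~Word(0) / 0xFF; // 0x0101...01
  constexpr Word Highs = Ones * 0x80;    // 0x8080...80
  const Word Pattern = Ones * c;         // c repeated in every byte

  std::size_t i = 0;
  for (; i + W <= n; i += W) {
    Word w;
    std::memcpy(&w, buf + i, W); // well-defined unaligned load
    Word x = w ^ Pattern;        // bytes equal to c become zero
    if ((x - Ones) & ~x & Highs) // nonzero iff some byte of x is zero
      break;
  }
  for (; i < n; ++i)
    if (buf[i] == c)
      return i;
  return n; // not found
}
```

Both functions return the index of the first match, or `n` when the byte is absent, so they can be swapped for one another at build time in the way the per-function selection is meant to allow.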
[mlir][acc] Improve verifier messages for device_type duplicates (#170773)
This improves the acc dialect IR verifier messages when duplicate
device_types are found by also noting which device_type is the one
causing the error.
[acc] Add acc.specialized_routine attribute (#170766)
Introduce a new attribute `acc.specialized_routine` to mark functions
that have been specialized from a host function marked with
`acc.routine_info`.
The new attribute captures:
- A SymbolRefAttr referencing the original `acc.routine` operation
- The parallelism level via the new `ParLevel` enum
- The original function name (since specialized functions may be
renamed)
Example - before specialization:
```
acc.routine @routine_gang func(@foo) gang
acc.routine @routine_vector func(@foo) vector
func.func @foo() attributes {
acc.routine_info = #acc.routine_info<[@routine_gang, @routine_vector]>
[26 lines not shown]
[profcheck] Fix missing profile metadata in ExpandMemCmp (#169979)
This patch fixes missing profile metadata in the `ExpandMemCmp` pass
when it expands `memcmp` calls. This would cause branches between
different blocks to lose their profile data, potentially leading to
suboptimal code generation.
The patch updates the `ExpandMemCmp` pass to set branch weights to a
default `unknown` (50/50 weights) value when a profile is available. This
prevents the expansion from making a previously profiled branch
unprofiled.
The patch also includes updates to the tests to reflect the new branch
weights.
Co-authored-by: Jin Huang <jingold at google.com>
[clang-doc] Use DiagnosticsEngine to handle diagnostic output (#170219)
Right now we use a combination of outs() and errs() to handle tool
output. Instead, we can use existing diagnostic support in clang and
LLVM to ensure our tool has a consistent behavior with other tools.
[CIR][NFC] Add flag support for eh cleanups (#170753)
This adds the `flags` variable to the EHScopeStack::Cleanup class and
routes it through the existing handlers. None of the currently
implemented handlers use these flags, but the flag will be needed for
array and NRVO variable cleanup handling.
AMDGPU: Improve exp10 lowering for f16
For f16, this can be done accurately by converting to f32
with a multiply. Previously this was treated as an f32 case
that we happen to know is not denormal.
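One plausible reading of "converting to f32 with a multiply" is the identity exp10(x) = exp2(x * log2(10)), evaluated in single precision and then rounded back to half. The sketch below shows only that arithmetic and is not the AMDGPU lowering itself: `exp10_via_f32` is a hypothetical name, and because standard C++ has no portable half type, the input is assumed to be an f16 value that has already been extended to `float`.

```cpp
#include <cmath>

// Hypothetical helper: computes 10^x in single precision as 2^(x * log2(10)).
// The caller is assumed to extend the f16 input to float beforehand and to
// round the result back to f16 afterwards.
inline float exp10_via_f32(float x) {
  constexpr float Log2Of10 = 3.321928094887362f; // log2(10)
  return std::exp2(x * Log2Of10);
}
```

Since f32 has far more precision and exponent range than f16, the rounding error from the multiply is small compared with the precision of an f16 result.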