AMDGPU: Move reg_sequence splat handling (#140313)
This code clunkily tried to find a splat reg_sequence by
looking at every use of the reg_sequence, and then looking
back at the reg_sequence to see if it's a splat. Extract this
into a separate helper function to help clean this up. This now
parses whether the reg_sequence forms a splat once, and defers the
legal inline immediate check to the use check (which is really use
context dependent)
The one regression is in globalisel, which has an extra
copy that should have been separately folded out. It was getting
dealt with by the handling of foldable copies in tryToFoldACImm.
This is preparation for #139908 and #139317
[clang-doc] Track if a type is a template or builtin
Originally part of #133161. This patch adds preliminary tracking
for of TypeInfo, by tracking if the type is a builtin or template.
The new functionality is not yet exercised.
Co-authored-by: Peter Chou <peter.chou at mail.utoronto.ca>
[clang-doc] Update clang-doc tool to enable mustache templates
This patch adds a command line option and enables the Mustache template
HTML backend. This allows users to use the new, more flexible templates
over the old and cumbersome HTML output. Split from #133161.
Co-authored-by: Peter Chou <peter.chou at mail.utoronto.ca>
[clang-doc] Extract Info into JSON values
Split from #133161. This patch provides the implementation of a number
of extractValue overloads used with the different types of Info.
The new helper functions extract the relevant information from the
different *Infos and inserts them into the correct fields of the JSON
values that will be used with the specific Mustache templates, which
will land separately.
Co-authored-by: Peter Chou <peter.chou at mail.utoronto.ca>
[clang-doc] Update serializer for improved template handling
This patch updates Serialize.cpp to serialize more data about C++
templates, which are supported by the new mustache HTML template.
Split from #133161.
Co-authored-by: Peter Chou <peter.chou at mail.utoronto.ca>
[clang-doc] Add helpers for Template config
This patch adds or fills in some helper functions related to template
setup when initializing the mustache backend. It was split from #133161.
Co-authored-by: Peter Chou <peter.chou at mail.utoronto.ca>
[clang-doc] Implement setupTemplateValue for HTMLMustacheGenerator
This patch implements the business logic for setupTemplateValue, which
was split from #133161. The implementation configures the relative path
relationships between the various HTML components, and prepares them
prior to their use in the generator.
Co-authored-by: Peter Chou <peter.chou at mail.utoronto.ca>
[AMDGPU][Attributor] Rework update of `AAAMDWavesPerEU` (#123995)
Currently, we use `AAAMDWavesPerEU` to iteratively update values based
on attributes from the associated function, potentially propagating
user-annotated values, along with `AAAMDFlatWorkGroupSize`. Similarly,
we have `AAAMDFlatWorkGroupSize`. However, since the value calculated
through the flat workgroup size always dominates the user annotation
(i.e., the attribute), running `AAAMDWavesPerEU` iteratively is
unnecessary if no user-annotated value exists.
This PR completely rewrites how the `amdgpu-waves-per-eu` attribute is
handled in `AMDGPUAttributor`. The key changes are as follows:
- `AAAMDFlatWorkGroupSize` remains unchanged.
- `AAAMDWavesPerEU` now only propagates user-annotated values.
- A new function is added to check and update `amdgpu-waves-per-eu`
based on the following rules:
- No waves per eu, no flat workgroup size: Assume a flat workgroup size
of `1,1024` and compute waves per eu based on this.
[22 lines not shown]
[mlir][bufferization] implement BufferizableOpInterface for concat op (#140171)
Lowers `tensor.concat` to an alloc with a series of `memref.copy` ops to
copy the operands to the alloc.
Example:
```mlir
func.func @tensor.concat(%f: tensor<8xf32>) -> tensor<16xf32> {
%t = tensor.concat dim(0) %f, %f : (tensor<8xf32>, tensor<8xf32>) -> tensor<16xf32>
return %t : tensor<16xf32>
}
```
Produces
```mlir
module {
func.func @tensor.concat(%arg0: tensor<8xf32>) -> tensor<16xf32> {
[28 lines not shown]
[Frontend] Avoid creating a temporary instance of std::string (NFC) (#140326)
Since getLastArgValue returns StringRef, and the constructor of
SmallString accepts StringRef, we do not need to go through a
temporary instance of std::string.
[lldb-dap] Avoid creating temporary instances of std::string (NFC) (#140325)
EmplaceSafeString accepts StringRef for the last parameter, str, and
then internally creates a copy of str via StringRef::str or
llvm::json::fixUTF8, so caller do not need to create their own
temporary instances of std::string.
[lldb-dap] Take two at refactoring the startup sequence. (#140331)
This is more straight forward refactor of the startup sequence that
reverts parts of ba29e60f9a2222bd5e883579bb78db13fc5a7588. Unlike my
previous attempt, I ended up removing the pending request queue and not
including an `AsyncReqeustHandler` because I don't think we actually
need that at the moment.
The key is that during the startup flow there are 2 parallel operations
happening in the DAP that have different triggers.
* The `initialize` request is sent and once the response is received the
`launch` or `attach` is sent.
* When the `initialized` event is recieved the `setBreakpionts` and
other config requests are made followed by the `configurationDone`
event.
I moved the `initialized` event back to happen in the `PostRun` of the
`launch` or `attach` request handlers. This ensures that we have a valid
[7 lines not shown]
[-Wunsafe-buffer-usage] Fix false warnings when const sized array is safely accessed (array [idx %const]) (#140113)
The -Wunsafe-buffer-usage analysis currently warns when a const sized
array is safely accessed, with an index that is bound by the remainder
operator. The false alarm is now suppressed.
Example:
int circular_buffer(int array[10], uint idx) {
return array[idx %10]; // Safe operation
}
rdar://148443453
---------
Co-authored-by: MalavikaSamak <malavika2 at apple.com>
[CIR] Fix problem with phantom function arguments (#140322)
There was a problem introduced today where sometimes the CIR for
functions with no arguments would be generated with phantom arguments.
This was causing intermittent test failures.
The problem appears to have been that we were using two different
Profile implementations to generate the FoldingSetNodeID for
CIRGenFunctionInfo so occaissionally when we tried to look for a
pre-existing entry for a function with no arguments it would incorrectly
match a CIRGenFunctionInfo entry that had arguments.
To prevent this from happening again, I rewrote one of the two Profile
functions to call the other.