1. onnx.MatMulInteger now converts to aten.matmul instead of aten.mm
2. aten.matmul, for ranks >=2, now allows quantized inputs and will
lower to linalg::quantized_matmul or linalg::quantized_batch_matmul.
3. added AtenMatmulOp to the FuseQuantizeOps rewrite patters
QuantizeOperands, QuantizeTransposedOperands, and QuantizeAccumulator
4. added several tests, including some to test AtenMmOp with varying
quantization signed-ness.
5. a quantized matmul mat-vec test is added to verify the failure to
lower to linalg; cleaned of out-of-date code related to common
torch-mlir lowering xfails.
6. in debugging a real model with quantized matmuls, I found a bug on
the scalarize-shapes pass which resulted from the aten.full op folder
returning an incompatible result type. This is fixed by the small change
here to
[lib/Dialect/Torch/IR/TorchOps.cpp](https://github.com/llvm/torch-mlir/compare/main...zjgarvey:torch-mlir:MatMulIntegerFix?expand=1#diff-dc8ed165c207918e606490eee3984b1ad51d7034e6aac36fc046bf47f6f03f4f).
Previously, it could only handle the situations where outputsize == (1,
1) or outputsize == (input_H, input_W). Now it supports all situations
where input_H % output_H== 0 && input_W % output_W == 0
…ute_reshape_shape
as that `aten.view` support at most one `-1` in dim list. The original
calculation of `numel` is wrong when there is a `-1` in dim list.
This tests COO for more than 2-dim. Note that sparsity should really
propagate into the relu activation and the output, but such cleverness
needs to wait for the pending work in the PyTorch tree.
This PR only performs a lit test. In lieu of an e2e test, https://github.com/nod-ai/SHARK-TestSuite/pull/142 makede sure that the lowering works & the numbers check out.
Co-authored-by: Xida Ren <xida.ren.dev@gmail.com>
This commit also cleans up the OnnxToTorch lowering for the ReduceMean
op and adds the support for handling edge cases.
Signed-Off By: Vivek Khandelwal vivekkhandelwal1424@gmail.com
The `convertTensorToElementType` function expects it's argument to have
a valid tensor type that is not `Torch::NoneType`. This PR checks that
the bias tensor is not of type `Torch::NoneType` before calling
`convertTensorToElementType` on the bias tensor argument in the
`matchAndRewrite` member function of the `ConvertAtenConvolutionOp`
class.
When lowering `torch.aten.convolution`, it is expected that the
'transposed' argument is a torch.constant operation. In some cases, the
argument was a `from_i1` operation converting an `arith.constant`
operation into a torch.bool. This is not wrong semantically, but instead
of generalizing the legality of the `torch.aten.convolution` op, we
canonicalize `arith.constant` ops followed by `from_i1` ops to
`torch.bool` ops.
For example:
```
//===-------------------------------------------===//
Legalizing operation : 'torch.aten.convolution'(0x124705b90) {
%33 = "torch.aten.convolution"(%arg0, %20, %21, %31, %29, %30, %19, %32, %0) : (!torch.vtensor<[1,1,28,28],f32>, !torch.vtensor<[10,1,5,5],f32>, !torch.vtensor<[10],f32>, !torch.list<int>, !torch.list<int>, !torch.list<int>, !torch.bool, !torch.list<int>, !torch.int) -> !torch.vtensor<[1,10,24,24],f32>
* Fold {
} -> FAILURE : unable to fold
* Pattern : 'torch.aten.convolution -> ()' {
** Failure : unimplemented: only constant transposed supported. <-- Resolved by this PR
} -> FAILURE : pattern failed to match
* Pattern : 'torch.aten.convolution -> ()' {
** Failure : not a supported Scalar to Tensor like op
} -> FAILURE : pattern failed to match
* Pattern : 'torch.aten.convolution -> ()' {
** Failure : not a supported elementwise op
} -> FAILURE : pattern failed to match
* Pattern : 'torch.aten.convolution -> ()' {
** Failure : not a supported reduce op
} -> FAILURE : pattern failed to match
} -> FAILURE : no matched legalization pattern
//===-------------------------------------------===//
<stdin>:21:11: error: failed to legalize operation 'torch.aten.convolution' that was explicitly marked illegal
%17 = torch.operator "onnx.Conv"(%arg0, %0, %1) {torch.onnx.dilations = [1 : si64, 1 : si64], torch.onnx.group = 1 : si64, torch.onnx.kernel_shape = [5 : si64, 5 : si64], torch.onnx.pads = [0 : si64, 0 : si64, 0 : si64, 0 : si64], torch.onnx.strides = [1 : si64, 1 : si64]} : (!torch.vtensor<[1,1,28,28],f32>, !torch.vtensor<[10,1,5,5],f32>, !torch.vtensor<[10],f32>) -> !torch.vtensor<[1,10,24,24],f32>
^
<stdin>:21:11: note: see current operation: %33 = "torch.aten.convolution"(%arg0, %20, %21, %31, %29, %30, %19, %32, %0) : (!torch.vtensor<[1,1,28,28],f32>, !torch.vtensor<[10,1,5,5],f32>, !torch.vtensor<[10],f32>, !torch.list<int>, !torch.list<int>, !torch.list<int>, !torch.bool, !torch.list<int>, !torch.int) -> !torch.vtensor<[1,10,24,24],f32>
```
Additionally, we require the canonicalization of `to_i1` operating on a
torch.constant bool to an `arith.constant ... : i1` for the e2e tests to
pass successfully.
In the prior state when I supported mutation of user inputs by treating
them as mutable-tensor SSA values, I had left the case of buffer
mutation only vaguely implemented until a concrete use emerged.
This patch reworks this buffer mutation support by assuming that buffers
must be resolved via the hooks symbolically and treated with load/store
semantics. This is implied in the structure since we have no SSA value
that represents a buffer and we already assume that reading parameters
happens via such a mechanism.
Now there no lowing for `aten.Int.bool` in `convert-torch-to-arith`
pass. this PR add this support.
Below is the UT.
```
func.func @torch.aten.Int.bool(%arg0: !torch.bool) -> !torch.int {
%0 = torch.aten.Int.bool %arg0 : !torch.bool -> !torch.int
return %0 : !torch.int
}
```
* Also adds the basic scaffolding for handling more of these, which will
be needed for cond, while, etc.
* Refactors some of the support in the generic OpOverload emitter so it
can be shared with these other special forms.
This has been on my list for a while, but it just so happens that as
part of upgrading to PyTorch 2.3 and a pure upstream flow in Turbine, we
were using a feature that required integration with auto_functionalized.
This is perhaps the "weirdest" of the higher-order ops and a poor place
to start, but needs must. We have testing for this in Turbine.
Full support in Turbine has an entire custom ops facility. I've reduced
this down to a unit test in torch-mlir.
Reshaping tensors depend on directly matching individual dimensions to
their corresponding dim in the `torch.view` reshape dimensions. This
involves decoupling dynamic dimensions from their static counterparts
and support cleanup / canonicalization.
This commit adds the OnnxToTorch lowering for the Mish, Softplus,
HardSwish, Trilu, ThresholdedRelu op
Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
At some point, this op became kwarg-only instead of arg/kwarg.
Discovered when upgrading to PyTorch 2.3.
Also adds a test as this was untested in-tree (was caught out of tree).
This adds support for converting DynamicQuantizeLinear from torch-onnx
to torch.
I could not get an e2e test to pass, since there seems to be some issues
with uint8 casting somewhere lower in the pipeline. For example
compiling with IREE for llvm-cpu, I would get either the correct zero
point (if zp < 128) or the correct zero-point minus 256 (if zp >= 128).
The output tensor seems to always return a tensor of zeros, which also
occurs when running uint8 examples through QuantizeLinear.
Edit: the first problem can be resolved by casting the output back to
uint8 on output, the second problem is resolved with PR #3018
Reduce mean lowerings did not succesfully lower to `linalg` via torched.
There were two separate paths that could be consolidated to a single
simpler pass. This resulted in a significant improvement in test
coverage.
If the broadcast shape is length-1 at a dim while `?` in the input dim
then we need to broadcast to the dynamic dim. This is equivalent to
taking a max of two dimensions.
This folds small version of the tensor-scalar comparison operators as
they are commonly used for shape computations. This includes le, lt, ge,
gt, eq, and ne.
The current padding operation was not functional for dynamic shapes.
Updated and enabled tests so that onnx.pad tests pass.
Work TBD for reflection padding.
Set PyTorch and TorchVision version to nightly release 2024-03-07.
This commit also removes the deprecated constraints API:
342e7929b8
Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
We can support `onnx.Size` by requesing the size of each dimensions and
taking the product of the results, then packing it into a tensor.
---------
Co-authored-by: Scott Todd <scott.todd0@gmail.com>
This mostly copy-pastes the reduce minimum implementation to reduce max
to improve test coverage. We also improve the aten lowering for min/max
dim for unsigned types.
The addition of an e2e test is actually provided in the Shark-Testsuite.
This adds 2 test cases for the gridsampler e2e test.
Also as intended there were some items found which needed correction, so
the Gridsampler op has also a change.
`getRawBuffer` expects a densely packed vector of `i1` values however
`onnx` does not densely pack the values. Include code to handle the
packing / unpacking.
A handful of operations are commonly used in shape calculations (slice,
concat, broadcast). Added these additional folders to better propagate
simple shape computations.