- Fix pad size to data_rank for dynamic paddingSize Tensor.
- This fix is in accordance with [input
specification](https://onnx.ai/onnx/operators/onnx__Pad.html#inputs) for
onnx.Pad
- Impl will need to be updated for dynamic padSize when support for
`axes` is added.
A choice was made to quantize the return type of Relu with a scale and
zero point copied from the input's quantization scheme. With this
choice, the torch-to-linalg conversion of quantized Relu essentially
computes max(input, zeroPoint) in the elementwise payload.
All e2e iree tests compiled, but they have the run issue of mismatch of
dtype like the following
```
expected:
1x1x2x2xsi32=[[[12 16][24 28]]]
actual:
1x1x2x2xi32=[[[12 16][24 28]]]
```
* promote input to output element-type when lowering to stablehlo, so
that it could satisfy stablehlo's type constraints.
* split promote-to-fp unary ops from fp-only unary ops.
This commit also cleans up the OnnxToTorch lowering for the Squeeze and
Unsqueeze op and adds the support for handling edge cases.
Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
Version number was set too high. Lowered to support more cases allows
more tests to pass.
Co-authored-by: Robert Suderman <rsuderman@Roberts-MacBook-Pro.local>
Previous implementation erroneously mixed up num_outputs with
slice_size. New version correctly computs the slice size and directly
performs slicing rather than leveraging `aten.split.tensor`. This is due
to `onnx` supporting a fixed number of splits making the size
computation more easily computeable when lowering to `aten` rather than
deferring to `aten.split.tensor`.
---------
Co-authored-by: Robert Suderman <rsuderman@Roberts-MacBook-Pro.local>
We can map to `tensor.reshape` for handling multiple output dynamic
shapes. Later we can perform a more complex analysis for indentifying
expand/collapse cases from the tensor.reshape.
Initially we planned to handle this identification at the `torch` level
however it will be easier to handle once converted to core
mlir-dialects.
Decomposition RepeatInterleaveSelfInt with following ops:
```python
def my_repeat_interleave(input, repeats, dim=None):
if dim is None:
# Flatten the input and then repeat
return input.flatten().unsqueeze(-1).tile((1, repeats)).flatten()
else:
# Calculate the shape after repeat
expanded_shape = list(input.shape)
expanded_shape[dim] *= repeats
# Repeat the tensor along the specified dimension
repeat_shape = [1] * (input.dim() + 1)
repeat_shape[dim + 1] = repeats
input = input.unsqueeze(-1)
# Tile and then reshape
tiled = torch.tile(input, repeat_shape)
# Rearrange and reshape
repeated = tiled.reshape(*expanded_shape)
return repeated
```
I passed the tests of stablehlo and linalg. When testing onnx, strange
things happened.
In torch-mlir's CI **torch_nightly** and my own
environment(torch==2.4.0.dev20240318+cpu), it can **pass the pass**.
In torch-mlir's CI **torch_stable**, it **failed**.
The test case is `RepeatInterleaveSelfIntNoDimModule_basic`, the result
shape should be [120].
```python
class RepeatInterleaveSelfIntNoDimModule(torch.nn.Module):
def __init__(self):
super().__init__()
@export
@annotate_args([
None,
([3, 4, 5], torch.float32, True),
])
def forward(self, x):
return x.repeat_interleave(2)
@register_test_case(module_factory=lambda: RepeatInterleaveSelfIntNoDimModule())
def RepeatInterleaveSelfIntNoDimModule_basic(module, tu: TestUtils):
module.forward(tu.rand(3, 4, 5))
```
The error log is as follows:
```
Unexpected outcome summary: (onnx)
****** Failed tests - 1 tests
FAIL - "RepeatInterleaveSelfIntNoDimModule_basic"
@ trace item #0 - call to "forward"
@ output of call to "forward"
ERROR: shape (torch.Size([6, 4, 5])) is not equal to golden shape (torch.Size([120]))
```
@rsuderman
Would you please help me check what's wrong with my PR? Thanks a lot.
Align corner modes which select what the corners mean.
Either the center of the corner points or the edges of the edge points.
---------
Co-authored-by: Rob Suderman <rob.suderman@gmail.com>
The new cases added for quantized matmuls are:
1. vec-vec
2. vec-mat
3. mat-vec
each of which are now lowered to expand(s), quantized_matmul, and
collapse.
1. onnx.MatMulInteger now converts to aten.matmul instead of aten.mm
2. aten.matmul, for ranks >=2, now allows quantized inputs and will
lower to linalg::quantized_matmul or linalg::quantized_batch_matmul.
3. added AtenMatmulOp to the FuseQuantizeOps rewrite patters
QuantizeOperands, QuantizeTransposedOperands, and QuantizeAccumulator
4. added several tests, including some to test AtenMmOp with varying
quantization signed-ness.
5. a quantized matmul mat-vec test is added to verify the failure to
lower to linalg; cleaned of out-of-date code related to common
torch-mlir lowering xfails.
6. in debugging a real model with quantized matmuls, I found a bug on
the scalarize-shapes pass which resulted from the aten.full op folder
returning an incompatible result type. This is fixed by the small change
here to
[lib/Dialect/Torch/IR/TorchOps.cpp](https://github.com/llvm/torch-mlir/compare/main...zjgarvey:torch-mlir:MatMulIntegerFix?expand=1#diff-dc8ed165c207918e606490eee3984b1ad51d7034e6aac36fc046bf47f6f03f4f).
- Added linalg lowering for `AtenFloorDivideScalarOp`
- Needed `AtenDivScalarModeOp` for the decomp.
- Added linalg lowering for `AtenDivScalarModeOp`
- Moved linalg payload logic to `createDivModePayload()` since the logic
was nearly identical for both `AtenDivScalarModeOp` and
`AtenDivTensorModeOp`. Just a template function
- Added `AtenDivScalarModeOp` lowering for stablehlo
Pytorch's
[`torch.floor_divide()`](https://pytorch.org/docs/stable/generated/torch.floor_divide.html)
in a previous version (for a reason unknown to me) preformed a
truncation instead of "floor". The already implemented op
`AtenFloorDivideTensorOp` was done before this change. However, this
wasn't caught because our testcases only tested positive floor division.
I changed this to floor as well as adding a few test cases.
If there is only a single value scattered there can be an implicit batch
dimension. This includes a check for the implicit batch dimension when
reshaping the update tensor. It includes an e2e test to verify
correctness.
1. Changes the linalg lowering for dequantization ops to always sign
cast to float to prevent misrepresenting uint32 overflow on subtraction
with zero point.
2. Adds a basic quantized model test which only quantizes and
dequantizes and now passes with these changes in linalg and onnx
configs.
3. Changes the aten.mm lowering to allow mismatched quantized types.
4. If a quantized matmul arg is uint8, we shift by 128 to faithfully
represent the quantization as a signed i8 quantization. This worked fine
in the AtenMmOp lowering, but I'd be happy to move it to a rewrite in
FuseQuantizedOps.cpp instead if that seems more appropriate.
With the changes 3 and 4, the QuantizedMLP_basic and
QuantizedSingleLayer_basic e2e tests now passes with the onnx config.
…ute_reshape_shape
as that `aten.view` support at most one `-1` in dim list. The original
calculation of `numel` is wrong when there is a `-1` in dim list.
This PR only performs a lit test. In lieu of an e2e test, https://github.com/nod-ai/SHARK-TestSuite/pull/142 makede sure that the lowering works & the numbers check out.
Co-authored-by: Xida Ren <xida.ren.dev@gmail.com>
Squeezes can be ambiguous without the output shape information. For
instance (1, 1, 256) squeezed can be either (1, 256) or (256). We need
to check the resulting shape to know what the shape should look like.
Shapes can be processed as tensors to represent the set of dimensions.
As reshapes take a list of scalars this can result in a single dynamic
dimension blocking the adjacent static dimensions.
This pass attempts to de-couple tensor computations related to shapes
and propagate values to better support lowering scalar tensor
computations.
This commit also cleans up the OnnxToTorch lowering for the ReduceMean
op and adds the support for handling edge cases.
Signed-Off By: Vivek Khandelwal vivekkhandelwal1424@gmail.com