This adds the `generate-runtime-verification` pass into the linalg
refbackend, and moves all tests that now abort at runtime into the crash
set, sorted by their respective errors.
I have fixed on set of errors found that way, which are mismatches
between the static dimensions we cast to and the actual dynamic
dimensions. This was caused by wrong annotations on the test cases, like
in
https://github.com/llvm/torch-mlir/pull/3615/files#diff-48bfbf41fcad5fa01b49197d251114f84a2b8de4f1d87ab938a061aedd1419b1R1931
This patch adds basic support for lowering graphs with per-channel
quantization. Per-channel quantized ops have to be excluded from
`FuseQuantizedOps` for now but can be used in QDQ quantized form.
Using this patch, we're able to import and execute (on the linalg
backend) graphs with per-channel quantization applied using the "new"
PyTorch 2.0 Export Quantization.
The static uneven divisible AdaptiveAvgPool2d means that although the
input size is not an integer multiple of ouput size, but the kernel and
stride size can also be fixed (not dynamic). The derivation logic of
kernel and stride size is consistent with
torch/_decomp/decomposations.py:adaptive_avg_pool2d as described in the
following:
1. Stride Size
Firstly , derive the start index in each reduce operation according to
the output size (`n`), `start_index = ([0, 1, ..., n - 1] * input_size)
// output_size`. For each index `k`, if `k * (input_size % output_size)
< output_size`, then the current and previous stride keeps the same as
`input_size // output_size`. So suppose `(n-1) * (input_size %
output_size) < output_size`, the stride in the whole AdaptiveAvgPool2d
process keeps static, as `input_size // output_size`.
2. Kernel Size
torch/_decomp/decomposations.py:adaptive_avg_pool2d calculates a static
kernel size when the input/output sizes satisfy either of the two
conditions, `input_size % output_size == 0` or `output_size %
(input_size % output_size) == 0`. Here if `input_size % output_size ==
0`, then the kernel size equals `input_size // output_size`, otherwise
`input_size // output_size + 1.`
- Adds support for lowering depthwise + quantized convolution ops to
linalg::DepthwiseConv2DNhwcHwcQOp
- Changed the variable name for groupSize (which is really C/G) to the
more appropriate numGroups (G).
- Discovered in e2e testing that linalg does not accept (Cin = groups &&
Cout = K*groups for K>1) as a "depthwise" conv, so this also updates the
case-checking to reflect this issue.
This PR adds a conversion in the TorchOnnxToTorch pass for the ONNX
Multinomial operation. It also adds a TorchToLinalg lowering for the
`aten.Multinomial` op and does a light refactor of some repeated code
that generates random floating point numbers in
`TorchToLinalg/Random.cpp`.
Fix the pad tensor rearrangement such that we change the representation
from [x1_begin, x2_begin, ..., x1_end, x2_end,...] to [xn_begin, xn_end,
...., x2_begin, x2_end, x1_begin, x1_end] where x1, x2 .. xn are the
dimensions of the pads tensor argument.
---------
Co-authored-by: zjgarvey <zjgarvey@gmail.com>
Co-authored-by: zjgarvey <47986913+zjgarvey@users.noreply.github.com>
This commit adds the support for new data types: uint4, and int4 and
uint8 tensor protos. Also, it moves some tests from failing to crashing.
Fixes https://github.com/llvm/torch-mlir/issues/3507
Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
Addresses an issue with onnx.Gather lowering to linalg:
<https://github.com/nod-ai/SHARK-Turbine/issues/242>
The builder for tensor.expand_shape, without an explicitly provided
output shape, fails to infer an output shape in the case of multiple
dynamic reassociation dims. I tried adding the output shape explicitly
for tensor.expand_shape, but ran into compilation issues later on (see
<https://github.com/iree-org/iree/issues/17760>).
This PR adds support by lowering this op to tensor.reshape when multiple
dynamic reassociation dims are provided.
The `index_put` operation, `input[indices] = values`, allows for the
values to be any shape that is broadcastable to the slice
`input[indices]`. This commit adds broadcasting support to the Linalg
lowering of `IndexPutHackedTwinOp`.
Fixes: #3465
This adds support for a few ops:
- torch.linalg_det
- torch._linalg_det (if the LU and pivot returns are unused)
- onnx.Det
An scf loop is used, since the row reduction algorithm applied here has
some loop-carried dependencies.
The current support being added here is very basic, and only works if no
permutations are required during row reduction, and assumes the matrices
are non-singular.
This adds a torchvision op to torch-mlir and a path from onnx.DeformConv
to torchvision.deform_conv2d.
I'm not implementing the torch->linalg lowering for the torchvision op
yet, but posting this PR to get feedback on some of the choices being
made here and to flesh out the onnx frontend a bit.
Add a new op with shape/dtypes and decompose into
`fake_quantize_per_tensor_affine` when the second result is unused.
The xfail_set change is on ONNX because torch cannot export this op to
ONNX.
Issues was found here https://github.com/nod-ai/SHARK-Turbine/issues/643
- [ONNX] Fix padding attributes for onnx.AveragePool
- [Linalg] Add countIncludePad false support for AtenAvgPool1/2dOp
- [Linalg] Add an avg_pool2d countIncludePad False e2e tests
- [Linalg] Fix conflict with AtenAvgPool3dOp
- [Linalg] Fix e2e crash with AtenAvgPool1dOp
- [Linalg] Add dynamic dim support for AtenAvgPool2dOp
- [Linalg] Fix AvgPool2dDivisorOverrideModule crash
This addresses 7 of the model failures I'm seeing in the test suite. See
[Shark-Turbine issue
#566](https://github.com/nod-ai/SHARK-Turbine/issues/566).
Need the op ```linalg.conv_2d_ngchw_gfchw_q``` to be added upstream
before merging this. See [llvm-project PR #92136
](https://github.com/llvm/llvm-project/pull/92136).
A small additional expansion to operand quantization is included in this
patch to address a model failure that occurs when unblocking the
quantized group convolutions in one of these onnx models.