Addresses ~200 onnx model compile failures in
<https://github.com/nod-ai/SHARK-TestSuite> related to
<https://github.com/iree-org/iree/issues/18631>.
This change simplifies the result of the generated broadcast op
substantially, but reduces the case coverage slightly.
The case which will become unsupported:
- trying to actually broadcast a dynamic dim that is secretly 1.
When does this case appear in practical scenarios?
- for a model where onnx shape inference cannot figure out that a dim
should be 1.
Why do I think we should not support this case for now?
1. For all models with dynamic dim expand ops, the previous path
uniformly generates uglier linalg IR (making it harder for IREE to fuse
properly with other ops).
2. For models failing shape inference castastrophically enough to fail
to see a dim is statically 1, we can try to apply constant folding in
the onnx model before importing.
Leaving this as a draft PR, since it may be more appropriate to fix the
compilation failure in IREE rather than torch-mlir.
### Example of broadcast required in previous path:
```mlir
%300 = linalg.generic {indexing_maps = [#map11], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} outs(%299 : tensor<?x12x?x?xi1>) {
^bb0(%out: i1):
%306 = linalg.index 0 : index
%307 = linalg.index 3 : index
%308 = arith.index_cast %285 : i64 to index
%309 = arith.cmpi eq, %308, %c1 : index
%310 = arith.select %309, %c0, %306 : index
%311 = arith.index_cast %286 : i64 to index
%312 = arith.cmpi eq, %311, %c1 : index
%313 = arith.select %312, %c0, %307 : index
%extracted_79 = tensor.extract %reshape_78[%310, %c0, %c0, %313] : tensor<?x1x1x?xi1>
linalg.yield %extracted_79 : i1
} -> tensor<?x12x?x?xi1>
```
### Example of broadcast with simplified shape list:
```mlir
%409 = linalg.generic {indexing_maps = [#map15, #map11], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} ins(%reshape_135 : tensor<?x1x1x?xi1>) outs(%408 : tensor<?x12x?x?xi1>) {
^bb0(%in: i1, %out: i1):
linalg.yield %in : i1
} -> tensor<?x12x?x?xi1>
```
- When the signal tensor is real, onnx allows its shape to be
`[batch][length]` as well as `[batch][length][1]`.
- Onnx also allows to specify `frame_length` together with `window` (not
empty), given that it matches the window size.
- Adding checks on signal and result shapes.
Previously, if the value was absent, this conversion was creating a
dense resource of value 0 with shape equal to the result shape, then
later re-extracting a splat value. This only works if the shape is
statically known, and even when the shape is known, this is completely
unnecessary since the value's shape should be `[1]` and not the result
shape.
This patch simply sets the `splatvalue` to a `torch.constant.float 0.0`
when the onnx op's `value` attr is absent, and adds `nullptr` checks to
the subsequent conditionals to avoid them in the case where an `attr` is
not given.
Addresses <https://github.com/nod-ai/SHARK-Turbine/issues/831>.
Supports the result with dynamic shape and scalar indices like
```
func.func @test_gather_scalar(%arg0: !torch.vtensor<[3,4,5],f32>, %arg1: !torch.vtensor<[], si64>) -> !torch.vtensor<[?,?],f32> attributes {torch.onnx_meta.opset_version = 13 : si64} {
%0 = torch.operator "onnx.Gather"(%arg0, %arg1) {torch.onnx.axis = 0 : si64} : (!torch.vtensor<[3,4,5],f32>, !torch.vtensor<[], si64>) -> !torch.vtensor<[?,?],f32>
return %0 : !torch.vtensor<[?,?],f32>
}
```
`Torch::AtenSqueezeOp` is referring to the result shape, so it will
failed on lowering if the result shape is dynamic.
The `axis` attribute is optionally available. Added support by computing
the pad based on the axis values.
---------
Signed-off-by: Rob Suderman <rob.suderman@gmail.com>
- This PR adds new (and equivalent) more tensorized impl of
MelWeightMatrix which lowers all the way to linalg.
- [Ref Pytorch
Impl](https://gist.github.com/PhaneeshB/4e6dfcded3007b1b686fbe28f07a67cd)
- Thanks to @rsuderman for pointing out the difficulties [earlier
impl](#3503) posed during lowering to linalg and also for providing a
better numpy impl 🙏
This commit extends the OnnxToTorch lowering for BatchNormalization op
for supporting the case when training=True.
Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
The `layout` attribute was not considered for the `onnx.RNN` operation.
Added support for the attribute to transpose the inputs / outputs of the
RNN when valid.
The saga of aligning onnx and torch padding conventions continues.
```python
onnx_pads = [low_x, low_y, low_z, high_x, high_y, high_z]
torch_pads = [low_z, high_z, low_y, high_y, low_x, high_x]
```
So not only is the lexicographical ordering hierarchy swapped (low/high
x spatial-dim -> spatial-dim x low/high) but the ordering in the the
spatial-dim specification is also reversed.
This patch properly reverses the pad ordering (and actually uses the
`shuffledPadding` to pad).
`onnx.Shape` can select only a subset of indices using attributes. Add
support for these attributes.
---------
Co-authored-by: zjgarvey <47986913+zjgarvey@users.noreply.github.com>
There were two issues related to `ignore_index` being set
(1) the onnx-to-linalg pass as not reading the value correctly (2) the
mean pass was not considering the `ignore_index` value
For (2) when taking the mean we need to know how many of the values were
considered in the sum and therefore we cannot divide by the total number
of elements. Adding a summation across the total number should correct
this issue.
This PR adds a conversion in the TorchOnnxToTorch pass for the ONNX
Multinomial operation. It also adds a TorchToLinalg lowering for the
`aten.Multinomial` op and does a light refactor of some repeated code
that generates random floating point numbers in
`TorchToLinalg/Random.cpp`.
This patch adds a few misc pad op related changes:
1. Addresses issue <https://github.com/llvm/torch-mlir/issues/3457>
2. Addresses issue <https://github.com/llvm/torch-mlir/issues/3442>
3. Fixes the padding order for asymmetrically padded onnx.Conv ops
4. Enables passing quantization through those onnx.Conv op pre-paddings
5. Modifies the torch-to-linalg lowering of AtenReplicationPad2d op to
enable support for input rank != 4
Unfortunately, even with all of these changes, the e2e tests for the
ReplicationPad2d still fail the onnx config, since the torch export
procedure for rearranging the pad order is complicated enough that the
padding ints end up not being able to fold back to constants.
The LpNormalization lowering was previously just computing the norm,
which is incorrect. This computes the norm then divides the input tensor
by it's norm.
I've tested this against some simple onnx models locally. I'll look into
adding a test case for this in an external test suite.
Fix the pad tensor rearrangement such that we change the representation
from [x1_begin, x2_begin, ..., x1_end, x2_end,...] to [xn_begin, xn_end,
...., x2_begin, x2_end, x1_begin, x1_end] where x1, x2 .. xn are the
dimensions of the pads tensor argument.
---------
Co-authored-by: zjgarvey <zjgarvey@gmail.com>
Co-authored-by: zjgarvey <47986913+zjgarvey@users.noreply.github.com>
- Adds limited support for lowering onnx.Loop to primLoopOp
- lower in the pipeline`torch-to-scf` there is a check to see if loop is
for like. A primLoopOp is for like when the input condition is a
`trueBoolConstant`. To adapt the onnx to torch lowering to take
advantage of it, the implementation checks for specific op patterns in
the loodBody region and decides if loop is for like and uses the right
input condition op.
- to adapt the onnxLoopBody to torchLoopBody, we need to adapt the input
block arguments and set the correct output condition variable in the
loop body.
- scanOutput variables are currently not supported.