After running the model tests in SHARK-TestSuite, I noticed a few model
failures due to half-fusion.
Notably, RDN_pytorch_vaiq_int8 had a depth=5 convolution chain with
multiple AtenViewOp's.
This change enables more customization with operand quantization, and
generalizes the patterns QuantizeOperands and QuantizeTransposeOperands
to QuantizeOperandsPastCommutingOps.
This allows for passing quantization through operations which are
functionally unaffected by quantization, such as view-like ops. The
purpose of this change is to address a myriad of quantization issues
seen in quantized onnx models that have some reshape-like operations
sandwiched in between a dequant and something like a matmul (whose other
operand is immediately quantizable).
* Enables assume_strict_symbolic_shapes on fx_importer imported
programs, indicating strict shape semantics.
* Reworks the view->reshape lowering to take advantage of strict mode
and do one of:
* Collapse to 0D
* Flatten/Unflatten when there is an inferred dim.
* Fallback to tensor.reshape
* Splits some test cases up and adds an attribute to control the old
pattern (so new corners can be tested in strict mode in isolation).
* Dynamic inferred mode needs upstream work to generalize expand_shape
(so that case is suppressed here).
* Deletes the assert from the existing tensor.reshape lowering if strict
shape mode is enabled (since the condition it is dynamically asserting
cannot happen).
(1) test full pytorch output for eltwise
(2) use "random" input for LIF, to get general sparse tensor
(3) introduce way to get true sparsity into network (needs backend fix
first)
…cation and sparse tensors.
**NOTE**: This PR _doges_ the issue in buffer-deallocation pass instead
of resolving it. In the future, we need to fix the bug in
buffer-deallocation pass when handling code generated by sparse
compiler.
While waiting for the full resolution of feature request
https://github.com/pytorch/pytorch/issues/117188
(which will propagate sparsity the right way in upstream PyTorch for all
FX Graphs), this minor change allows us to start testing sparsity
"within" a network, rather than just the parameters. Feel free to add
your own rules for testing (but within reason for what will be done
upstream).
Note, two TODOs need to be addressed to work around some pending issues
to make the JIT execution work.
This commit adds the OnnxToTorch support for ReduceSumSquare ops.
---------
Co-authored-by: Ubuntu <archana@archana-cpu.judsoscro3wupi0qm4bjlj5m3b.bx.internal.cloudapp.net>
While playing with TorchDynamo on ResNet18. I notice following issues:
- `prims.convert_element_type` can’t be canonicalized even if the input
and the output share the same type
- `aten.max_pool2d_with_indices` is always used instead of
`aten.max_pool2d`, even if the second returned output (indices) has no
user
This PR fixes above issues by adding a folder to the
PrimsConvertElementTypeOp and a canonicalizer to the
AtenMaxPool2dWithIndicesOp
Lit test:
`cmake --build build --target check-torch-mlir-all`
---------
Co-authored-by: Ze Zhang <ze.zhang@getcruise.com>
This is probably a decent PR for learning about blocks and regions.
If you're here to learn about that, consider also looking at
lib/Conversion/TorchToSCF/TorchToSCF.cpp
While this doesn't include an e2e test, it is tested downstream in
https://github.com/nod-ai/SHARK-TestSuite/blob/main/e2eshark/onnx/operators/If/model.py
---------
Co-authored-by: Xida Ren <xida.ren.dev@gmail.com>