* Fix c10::prim::Constant conversion; Added CAPI for passes; Added passes to base lazy backend
* Update ivalue_importer to use ImportOptions; Added tests for non-value/value tensor types
* Added tests for scalar Constant import; Updated MB::importFunction to use ImportOptions
* Test updates
* Move back module variable name
* Remove RefineTypes from TorchMlirLoweringContext::Build()
* Rename pass; Remove passes from base lazy backend
* Rename pass to VerifyBackendContractPass
* Aligned cmd pass name; Fixed TorchConversion passes registration
* test: allow spaces in path to Python executable
On Windows, the path to the Python binary may contain spaces, so this
patch adds quotes around the path to the python executable.
Thanks to @sstamenova for suggesting the fix!
* python: remove header file that causes Windows build failures
Similar to https://reviews.llvm.org/D125284, we can safely remove this
header file without affecting the build on either Linux. It is
necessary to remove this header file on Windows builds since otherwise
it causes build errors.
* python: drop `TORCH_API` from function defined in Torch-MLIR
`TORCH_API` should apply to functions that are either exported by
libtorch.so or ones that are imported from libtorch.so by its downstream
consumers (like Torch-MLIR). Neither case applies to the
`importJitFunctionAsFuncOp()` function, since it is defined in
Torch-MLIR (and thus outside libtorch.so). This patch fixes the problem
by dropping `TORCH_API` from that function's declaration.
* python: make output of class anotations deterministic
The `class-annotator-repr.py` test checks for class annotations in a
specific order, but prior to this patch, the order was
non-deterministic, since the code iterated on an _unordered_ map.
This patch makes the iteration order deterministic through two changes:
1. using a sorted map
2. using the class qualified name instead of the address of the class in
memory
* test: use Python3_EXECUTABLE as interpreter path for consistency
This ensures that tests use the Python3 version that was detected using
CMake, instead of whichever python version that happens to be in the
PATH variable when invoking the test.
* test: fix RUN string
The parenthesis syntax does not run on Windows (the shell interprets the
`(` character as part of the path). Moreover, the ODR violation in the
comment no longer seems to apply.
* python: port parallel test framework to Windows
Since Windows does not support `fork` natively, Python's
`multiprocessing` module needs to use `spawn` on Windows. However, to
use `spawn`, the multiprocessing module serializes (or pickles) the
worker function and its arguments. Sadly, the multiprocessing module
(both the default one in Python and the one that is extended in PyTorch)
is unable to serialize lambda functions (see
https://stackoverflow.com/a/19985580) for detals.
Unfortunately, given how our tests are structured, we require that the
function under test is passed as an argument to another function, so we
cannot sidestep our use of lambda functions.
To resolve this problem, this patch makes use of the `multiprocess` and
`dill` Python modules, which together offers a multiprocessing mechanism
that can serialize lambda functions. The multiprocess module also
offers a process pool, which simplifies the code for our parallel
testing framework.
This commit adds support for TorchToTosa lowering of
`aten.broadcast_to` op for cases:
1.) When the rank of input and output tensor is equal.
2.) When the rank of input tensor is zero.
Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>
* Propagate parameter name to MLIR
* Add TorchMlirNode Constructor Hook
* Make func_op mutable
- Purpose of this is to allow modification of func_op by subclass
backend
* Clean up unnecessary changes
* Remove unnecessary attribute case
* Address PR comments
This adds a very long and obnoxious option to disable crashing tests.
The right fix here is to use the right multiprocessing techniques to
ensure that segfaulting tests can be XFAILed like normal tests, but we
currently don't know how to implement "catch a segfault" in Python
(patches or even just ideas welcome).
Motivated by #1361, where we ended up removing two tests from *all*
backends due to a failure in one backend, which is undesirable.
Strength the shape inference for aten.arange-like op by
1. registering aten.sub and aten.ceil.Scalar op and design folders for them.
2. register a new constant-like op: Torch::ConstantNumberOp and design canonicalizer for it.
As @oroppas identified, literal strings that are over 16,380 characters
cause the MSVC compiler to throw an error (C2026), eventually causing
the Windows build of Torch-MLIR to fail because the length of the
generated MLIR for the shape library crosses the allowed threshold.
This patch fixes the problem by making the Python script generate one
literal string per line to satisfy the MSVC compiler.
Thanks to @oroppas for the bulk of the effort required to resolve this!
Summary of changes:
- Updated emitAccessorPrefix since the default value has changed
(https://reviews.llvm.org/D133179)
- Updated RefineTypes pass since Lattice::isUninitialized() is removed
(https://reviews.llvm.org/D132800)
- Updated MHLO tag so that it builds with the updated LLVM tag
- Disabled two tests that cause segfaults in the TOSA backend (see Issue
#1361)
* Add aten.frobenius_norm.dim op and init its conversion pattern to linalg and MHLO,
* run symbolic-shape-optimization before hlo-legalize-to-linalg to fit more mhlo e2e tests.
Summary of changes:
- Update the dataflow analysis in RefineTypes.cpp
- Add tosa-to-arith pass after tosa-to-linalg pass, since
tosa-to-linalg (and canonicalizations) can produce tosa.const() ops
- Fixed warning about not making `matchAndRewrite` as override
This commit adds decomposition of `aten.linear` op. Due to limited
support at tosa backend in case of dynamic dimensions, this
decomposition is currently disabled for tosa backend.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
We use it for more than TorchScript testing now. This is a purely
mechanical change to adjust some file paths to remove "torchscript".
The most perceptible change here is that now e2e tests are run with
```
./tools/e2e_test.sh
instead of:
./tools/torchscript_e2e_test.sh
```
Change logic so that we never run the multiprocessing codepath with only
1 worker. That configuration was causing all subsequent tests to
spuriously fail if one test failed with a crash (this was easy to see
after sorting the tests). That configuration was the one used by the CI.
Also, sort tests to make output nicer.
Also, make verbose mode more verbose so that it is easy to see in `-s`
mode which test is crashing.
This commit adds a method to `TestUtils` that generates random integer
tensors with a similar interface to the `TestUtils.rand`. This commit
also replaces with `tu.randint` all test inputs generated with
`torch.randint`.
We were already hitting many cases where backends different in terms of
the legal ops that they wanted. This caused unnecessary coupling between
the backends. Examples:
- https://github.com/llvm/torch-mlir/pull/1161
- https://github.com/llvm/torch-mlir/pull/862
This PR centralizes all compilation to go through `torch_mlir.compile`
so that we can keep the logic centralized there. We should move these
lists closer to each backend. Especially cases like
https://github.com/llvm/torch-mlir/pull/862 where blocking a
decomposition is necessary to avoid a crash emphasize that the set of
decompositions is tightly coupled to the backend, and should be
"controlled by the backend" and not something arbitrarily tweakable.
Also:
- Fix a small bug in the way we passed through the backendLegalOps
option.
- Add better error messages in `torch_mlir.compile` for import errors.
I recently fixed the handling of the `dim` argument in
`sum_mean_dim` (59fccab857). Therefore,
the checks that the `dim` input is `None` or `[]` are no longer needed.
Bumps the shape library:
- Updates the function signature for aten.arange.start_step
- upstream_shape_functions.mean_dim -> upstream_shape_functions.sum_mean_dim
* Propagate device data names
* Address PR comment
* Add example usage
* Add test for device data names
* Make TorchMlirComputation fields protected
* Add lazy backend device data name unit tests
* Disable lazy backend tests if LTC is disabled
* Add comments
* mac m1 cross compile
Add support for M1 cross compile
* Remove redundant ExecutionEngine
It is registered as part of RegisterEverything
* nuke non-universal zstd
disable LTC
follow up #761:
This patch updates the `torch_mlir::convertTensorToMlirElementsAttr()`
method to enable the creation of tensors whose base type is Float16.
This patch also adds a test to validate the IR generation, and it
updates the test for importing tensors of various types.
PyTorch recently added support for `dim=None` in the `torch.var`
(5ca9b2b6fa)
and `torch.std`op (eb0e30e0bc).
This commit adds the corresponding support in torch-mlir.
Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>
In some cases, users know that a traced graph is valid for a wider set
of shapes than they originally traced it with. Provide an option for
users to ignore the shapes in the traced graph when they know it is
legal.
Fixes#997
* Replace CHECK_EQ with TORCH_CHECK_EQ
* Check value of TORCH_MLIR_USE_INSTALLED_PYTORCH during LTC build
* Update LTC XFAIL with NewZerosModule ops
* Explicitly blacklist _like ops
* Automatically blacklist new_/_like ops
* Prune away unused Python dependencies from LTC
* Add flag to disable LTC
* Autogen dummy _REFERENCE_LAZY_BACKEND library when LTC is disabled
* Implement compute_shape_var
* Removed Var tests from XFAIL Set
* XFAIL tests using _local_scalar_dense or index.Tensor
* Add StdDim tests to XFAIL set
* Autogen aten::cat
* Changed Example MLIR backend to Reference MLIR backend
* Moved reference_ltc_backend into csrc
* Merged sys_utils.h
* Renamed reference_ltc_backend to reference_lazy_backend
* Addressed review comments
* Update docs with new library name
* Removed _REFERENCE_LAZY_BACKEND from .gitignore
* Added reference_lazy_backend to the TorchMLIRPythonModules dependency list
Fixed typo in `ltc_examples.md`
Missed instance where `ltc_backend` was used instead of `lazy_backend`.
- Pruned number of xfailed e2e LTC tests from 305 to 134
- Reviewed every failure to ensure the error genuinely warrants an xfail
- Fixed bug where non-tensor outputs of LTC computation had `.to('cpu')` called, which caused a failure and inflated the xfail count
- Fixed bug with `HBC_basic` test where a constant tensor was created in its constructor without being declared as a buffer, which prevented the device from being updated when the parent `torch.nn.Module` got moved to the `lazy` device
- Note that this test is still xfail'd due to some unsupported ops. Left a comment about some potential issues that may arise if it gets reenabled in the future
- Updated autogen `GeneratedTorchOps.td` to reflect the latest set of supported ops
- Renamed `aten.zero.functionalization` to `aten.zero` to reflect upstream PyTorch changes
* Added e2e LTC Torch MLIR tests
* Fix seed for reproducability
* Check if computation is None before getting debug string
* Updated unit tests, and added numeric tests
* Print name of the model layer that fails numeric validation
* Run LTC e2e test with CI/CD
* Set seed in main function, instead of beginning of execution
* Add comment to specify number of digits of precision
* Fixed typo
* Remove tests for LTC example models
* Added LTC option to torchscript e2e
* Implement compile and run for LTC e2e test
* xfail all tests that use ops that aren't currently supported
* Update native function definitions
* Add ops to support bert lowering
- Add empty_strided and as_strided
- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)
- Check for composite implicit ops and add device data IR
- Also fix codegen for functionalization
* Add autogen to CMakeList
* Remove PyTorch submodule
* Reduced BERT model size
* Print Mark Step status in Torch MLIR LTC debug string
* Apply fixes to work with latest upstream/main
- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode
Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor
* Update shape inference functions
- Fixed compute_shape_native_batch_norm when mean and var are uninitialized
Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.
- Implemented compute_shape_mul
- Fixed bug in reshape shape inference error message
* Get MLIR backend more consistent with TS backend
- Remove LazyNativeFunctions::_unsafe_view from autogen
- Blacklist ops to make JIT graph more like output of TS backend
- Print graph when SSA value has mismatch of types and results
- Remove normalize_index from LazyShapeInference
- Fix seeds for LTC example models
* Update and clean up shape inference functions
- Prune shape inference functions
- Add shape inference function for GenerateSlice
- Add shape inference function for GenerateCopy
Co-authored-by: Henry Tu <henry.tu@cerebras.net>