follow up #761:
This patch updates the `torch_mlir::convertTensorToMlirElementsAttr()`
method to enable the creation of tensors whose base type is Float16.
This patch also adds a test to validate the IR generation, and it
updates the test for importing tensors of various types.
PyTorch recently added support for `dim=None` in the `torch.var`
(5ca9b2b6fa)
and `torch.std`op (eb0e30e0bc).
This commit adds the corresponding support in torch-mlir.
Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>
In some cases, users know that a traced graph is valid for a wider set
of shapes than they originally traced it with. Provide an option for
users to ignore the shapes in the traced graph when they know it is
legal.
Fixes#997
* Replace CHECK_EQ with TORCH_CHECK_EQ
* Check value of TORCH_MLIR_USE_INSTALLED_PYTORCH during LTC build
* Update LTC XFAIL with NewZerosModule ops
* Explicitly blacklist _like ops
* Automatically blacklist new_/_like ops
* Prune away unused Python dependencies from LTC
* Add flag to disable LTC
* Autogen dummy _REFERENCE_LAZY_BACKEND library when LTC is disabled
* Implement compute_shape_var
* Removed Var tests from XFAIL Set
* XFAIL tests using _local_scalar_dense or index.Tensor
* Add StdDim tests to XFAIL set
* Autogen aten::cat
* Changed Example MLIR backend to Reference MLIR backend
* Moved reference_ltc_backend into csrc
* Merged sys_utils.h
* Renamed reference_ltc_backend to reference_lazy_backend
* Addressed review comments
* Update docs with new library name
* Removed _REFERENCE_LAZY_BACKEND from .gitignore
* Added reference_lazy_backend to the TorchMLIRPythonModules dependency list
Fixed typo in `ltc_examples.md`
Missed instance where `ltc_backend` was used instead of `lazy_backend`.
- Pruned number of xfailed e2e LTC tests from 305 to 134
- Reviewed every failure to ensure the error genuinely warrants an xfail
- Fixed bug where non-tensor outputs of LTC computation had `.to('cpu')` called, which caused a failure and inflated the xfail count
- Fixed bug with `HBC_basic` test where a constant tensor was created in its constructor without being declared as a buffer, which prevented the device from being updated when the parent `torch.nn.Module` got moved to the `lazy` device
- Note that this test is still xfail'd due to some unsupported ops. Left a comment about some potential issues that may arise if it gets reenabled in the future
- Updated autogen `GeneratedTorchOps.td` to reflect the latest set of supported ops
- Renamed `aten.zero.functionalization` to `aten.zero` to reflect upstream PyTorch changes
* Added e2e LTC Torch MLIR tests
* Fix seed for reproducability
* Check if computation is None before getting debug string
* Updated unit tests, and added numeric tests
* Print name of the model layer that fails numeric validation
* Run LTC e2e test with CI/CD
* Set seed in main function, instead of beginning of execution
* Add comment to specify number of digits of precision
* Fixed typo
* Remove tests for LTC example models
* Added LTC option to torchscript e2e
* Implement compile and run for LTC e2e test
* xfail all tests that use ops that aren't currently supported
* Update native function definitions
* Add ops to support bert lowering
- Add empty_strided and as_strided
- Restore zeros_like to op blacklist (Without this, tensors will be unintentionally created with a CPU device rather than lazy)
- Check for composite implicit ops and add device data IR
- Also fix codegen for functionalization
* Add autogen to CMakeList
* Remove PyTorch submodule
* Reduced BERT model size
* Print Mark Step status in Torch MLIR LTC debug string
* Apply fixes to work with latest upstream/main
- Pass importOptions into getMlirTypeFromTorchType during NodeImporter::importNode
Without this, the tensor type created may have a mismatched type as ImportOptions may cause vtensor to be used instead of tensor
* Update shape inference functions
- Fixed compute_shape_native_batch_norm when mean and var are uninitialized
Previously, the number of shapes returned would be <3 if either mean or val was didn't exist. Instead, we now initialize them with a vector matching the number of channels.
- Implemented compute_shape_mul
- Fixed bug in reshape shape inference error message
* Get MLIR backend more consistent with TS backend
- Remove LazyNativeFunctions::_unsafe_view from autogen
- Blacklist ops to make JIT graph more like output of TS backend
- Print graph when SSA value has mismatch of types and results
- Remove normalize_index from LazyShapeInference
- Fix seeds for LTC example models
* Update and clean up shape inference functions
- Prune shape inference functions
- Add shape inference function for GenerateSlice
- Add shape inference function for GenerateCopy
Co-authored-by: Henry Tu <henry.tu@cerebras.net>
* Assume zero rank tensors are scalar
* Run RefineTypes pass on JIT Graph
* Rollback assumption that zero rank tensors are scalar
* Set numSizes to -1 for non-ranked tensors
* Rename RefineTypes to RefineTupleTypes
* Save InputOutputAliases to TorchMlirComputation
* Implement GetResultShape for TorchMlirLoweringContext
* Use optional return type for GetResultShape
* Remove support for aten::detach
With this op enabled, tensors were being copied, which resulted in incorrect aliasing.
* Add newline before printing I/O alias mapping
* Changed printout to use "Input param" as label instead of "Input"
* Remote shape inference function for aten::detach
* Moved implementation of SetUpAlias to MlirLoweringContext
As part of this change, TorchMlirComputation has been moved to the end of mlir_lowering_context.h so that it can access some new structs in TorchMlirLoweringContext
* Use updated PyTorch API
* Remove GetResultShape
Complements this upstream PyTorch PR: pytorch/pytorch#75828
This PR adds support for mapping input and output tensors which alias each other. (e.g. maps input weight tensor in parameter to the same tensor in output after a training iteration)
MLIR:
func @graph(%arg0: !torch.vtensor<[1,5],f32>, %arg1: !torch.vtensor<[1],si64>, ..., %arg6: !torch.vtensor<[10,5],f32>, %arg7: !torch.vtensor<[10],f32>, ...) {
...
return %arg0, %arg1, %17, %23, ... : !torch.vtensor<[1,5],f32>, !torch.vtensor<[1],si64>, !torch.vtensor<[10,5],f32>, !torch.vtensor<[10],f32>, ...
}
Input/Output Alias Mapping:
Output: 0 -> Input: 0
Output: 1 -> Input: 1
Output: 2 -> Input: 6
Output: 3 -> Input: 7
The aten::detach op has also been disabled in this PR to fix the issue of tensors not aliasing properly due to copying.
* Added JIT to MLIR lowering
Lowering to JIT is performed in a way similar to how it's done in the TS LTC backend. After a jit::Graph is constructed, it gets converted to a jit::Function, which is fed into the existing utility to generate an MlirModule in torch-mlir.
* Renamed `csrc/backend` to `csrc/base_lazy_backend`
This commit fixes the shape calculation for:
1.) aten.mean.dim
2.) aten.var.dim
3.) aten.sum.dim_IntList op
Also, it fixes the lowering of `aten.mean.dim` and
`aten.sum.dim_IntList` for handling the cases of empty dim list.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com
- Includes a canonicalizer for `aten.add.t`needed for successfully lowering the shape function
- Only offers support for statically sized index tensors when there is more than one
- Dynamic shape support remains for single indexing tensors
This enables building Pytorch from source in the CI.
The build should mostly hit the ccache.
Release builds will follow once we have some runtime on the CI.
In the interest of merging upstream LLVM quickly, a previous patch
(7f08169) updated the torch-mlir build to register all dialects and
passes through Python bindings. This patch limits the dialects and
passes to only those that are used in torch-mlir.
Key to this change are the removal of
`MLIRPythonExtension.RegisterEverything` and the introduction of a new
Python module (`_mlir_libs/_site_initialize_0.py`), where we register
the dialects and passes used by torch-mlir.
- Supports cases where the view op expands and collapses dims
simulataneously. This does not handle the case where it is neither
expanding nor collapsing (e.g. [2, 3] -> [3, 2])
- Additionally fixes a previous bug with adding 1-sized dims on both
sides of a tensor with aten.view
This patch makes some rudimentary changes to torch-mlir's use of MLIR
Python bindings to work with the most recent LLVM code. We can perhaps
do better by being more selective in what we link against, instead of
using `MLIRPythonExtension.RegisterEverything`.
This commit adds the support for negative dim cases for `aten.cat`,
`aten.slice.Tensor` and `aten.slice_scatter` op.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
The original conversion pattern for `AtenBatchNormOp` required that
the input rank be greater than 2; however, the only
expectation in the conversion pattern and in Pytorch is that the input
rank is greater than 1, since the second dimension of the input must
match the size of the `weight`, `bias`, `runningMean`, and
`runningVar` inputs. This commit fixes the `inputRank` check.
This commit adds the decomposition for `aten.var.dim` op.
This commit also make changes in the decomposition for `aten.var` op.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
Remove all the libtorch downloads. If the user sets
-DTORCH_MLIR_USE_INSTALLED_PYTORCH=OFF then just build from src.
Doesn't change developer workflow since we still default to local
PyTorch versions.
TEST: Build and verify all tests (except one xfail quant) pass on linux
This commit does three things:
1. Reverts some of the shape lib changes merged in
https://github.com/llvm/torch-mlir/pull/844
2. Updates the signature of `aten.sum_dim_IntList` that was recently
updated in
23bdb570cf
3. Replaces `aten.zero.functional` with `aten.zero`, updated in 960758b0b7
`aten.select_scatter` op.
This commit adds:
1. Lowering of `aten.slice_scatter` op into `tensor.insert_slice`
op.
2. Decomposes the `aten.select_scatter` op into `aten.slice_scater`
op.
Signed-Off-By: Prateek Gupta <gprateek93@gmail.com>
Temporarily revert to using PyTorch binaries until source builds
are ready to land.
TORCH_MLIR_USE_INSTALLED_PYTORCH can be turned to OFF if you want
to link against libtorch and/or source builds.
On my local machine, `unzip` didn't exist (producing a "command not
found" error), but CMake ignored the error. Although the build did
succeed (because it found a previously-built version of libtorch), it
seems better to abort builds on such failures, so this patch checks the
return code of all external process invocations.
Along similar lines, this patch also updates the shell scripts in
`build_tools` to extensively use double-quoting to prevent unintentional
word splitting or globbing. Since some of the scripts execute `rm`
while using shell variables, this patch also adds the preamble `set -u`
to abort execution if an undefined variable is referenced, so that we
reduce the chances of executing `rm -rf /` if the path expression
happens to refer to an undefined variable.
Add an option to cache libtorch/ releases if you don't want to
download the latest. Add an option to enable source builds.
TESTS:
macOS: verify with / without cache downloads
verify source builds -- shared and static
Linux: Build Tests and Release builds
A previous fix to the handling of size-1 dims in
`aten.view` (https://github.com/llvm/torch-mlir/pull/962) resulted in
the wrong grouping of dimensions when size-1 dims where between two
dims of size greater than 1. This commit fixes that.
TorchScript nodes like `prim::Load` and `prim::Store` aren't supported
in torch-mlir because they can't be lowered to backends, but such nodes
can occur in the TorchScript IR.
This patch adds a rudimentary translation from such nodes to
corresponding ops in the Torch dialect. Since we expected such nodes to
go away during lowering because of the SymbolDCE pass, this patch does
not add code to lower these ops beyond the Torch dialect.
This commit lowers `aten.matmul` to `linalg.BatchMatmul` under the
following conditions:
1. The result of matrix multiplication must have batch dimensions,
i.e., rank greater than 2.
2. The resultant matrix must have at most 1 dynamic batch dimension.
It also handles broadcasting of batch dimensions when batch dimensions
of the matrices are broadcastable.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This commit fixes the shape function for `index.Tensor`, adding
support for multiple index tensors and `None`s in the indices
list. This commit also adds support for input tensors of rank greater
than 1. The lowering for `index.Tensor` still has the the limitation
that only a single index tensor along the first dimension of the input
tensor is supported.
Prior to this patch, the torch dialect included `AtenTriuOp` for
computing the upper triangular part of the input matrix, but there was
no code for lowering the op to the linalg dialect.
This patch adds code to generate a `linalg.generic` operation that
compares indices (computed using `linalg.index`) to choose between zero
or the original value (using `arith.select`). The lowering fails if the
number of dimensions are less than two. This patch also adds a few
end-to-end tests.