-- The dtype of the result of `aten.embedding` should match that of
the `weight` operand's (operand[0]) instead of hardcoding to f32.
-- This commit aims to provide a fix for the same.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
pytorch/pytorch@140a3139 reverted a change from yesterday, causing the
RollPyTorch action to break. This patch reverts the corresponding
change in the torch-mlir LTC code.
This patch also re-enables tests that were previously marked as XFAIL.
As [@ezyang suggested](https://github.com/pytorch/pytorch/issues/90276#issuecomment-1339791275),
use `torch._dynamo.optimizations.training.aot_autograd` instead of raw
`make_fx`. This is more future proof and gives us the backward pass and
functionalization. We don't currently get functionalization because of
https://github.com/pytorch/pytorch/issues/90759
This also incidentally fixes the source location handling, which makes
`lockstep_basic.py` give an accurate source location!
This was an experimental attempt at rolling out own op-by-op executor
with `__torch_dispatch__`, but it proved difficult to make it robust.
Op-by-op execution is very easy to implement robustly now with the
PyTorch 2.0 stack, so we don't need eager_mode.
Downstream users were using eager_mode to implement lockstep numerical
accuracy debuggers. We implemented the same functionality with
TorchDynamo in https://github.com/llvm/torch-mlir/pull/1681 so now there
is not much reason to continue maintaining it.
This more accurately reflects what it is. The previous name was
conflating the use of RefBackend (which `linalg`, `tosa`, and `mhlo`
configs all use) with the use of the linalg backend (e.g. TorchToLinalg).
This conflation was artifically giving the linalg backend a "privileged"
position, which we want to avoid. We still keep it as the default
backend, and it remains the most complete, but at least there's not
artificial boosting.
This commit replaces the LCG algorithm that was being used by the
`TorchToLinalg` lowering of `AtenUniformOp` to generate random numbers
with the `squares64` algorithm, for the LCG algorithm was producing
tensors that were highly correlated with one another.
Squares64 algorithm: https://arxiv.org/abs/2004.06278
Closes https://github.com/llvm/torch-mlir/issues/1608
There are a few e2e tests that take several very large tensors as
input, which leads to the e2e test suite leaking too much
memory. Running things locally resulted in a total memory usage of
12.5 GB when running the suite sequentially on the refbackend.
Many of the tests that take large tensors don't actually need
such large tensors to pass, and some that take several large tensors
as input are just doing the same thing multiple times. This commit
reduces the size of some of the tensors and removes repetitive parts
of tests to reduce the memory usage to a total of 3 GB.
Set PyTorch and TorchVision version to nightly release 2022-11-22.
Add failing tests to the xfail set.
Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>
This adds a basic e2e Config for TorchDynamo using
Linalg-on-Tensors/RefBackend.
But TorchDynamo is pretty orthogonal to
various other pieces, so it should compose nicely with variations like:
- Switching out all the backends (Linalg-on-Tensors, TOSA, MHLO)
- PyTorch functionalization and decompositions
- Taking the example inputs and compiling with all dynamic or all static
shapes without duplicating tests.
This adds it to the CI, but there are still a lot of XFAIL's.
This also adds a helper `from torch_mlir.dynamo import
make_simple_dynamo_backend` which simplifies some of the steps for
making a Torch-MLIR-based TorchDynamo backend. We include "simple" in
the name because we are going to be exploring various things next from
the long-term roadmap.
The next steps are:
- Burn down all the XFAIL's.
- Start working on the pieces from the [long-term roadmap](https://github.com/llvm/torch-mlir/blob/main/docs/long_term_roadmap.md).
- Add functionalization/decompositions into the TorchDynamo flow and
remove reliance on the current Torch-MLIR "frontend".
- Write a pure-Python direct FX->MLIR importer.
- Hook up the new PyTorch symbolic shape stuff.
- Explore PrimTorch decompositions for simplifying backends.
The purpose of the test suite is to accelerate the development of the
compiler. However, we had various tests there that were not expected to
work, had no in-progress work being tested by the test, and nobody was
actively working on them. Having such tests in our test suite just adds
clutter and slows down development on the compiler.
-- aten.upsample_nearest2d.vec op is not present
owing to https://github.com/pytorch/pytorch/pull/85638
-- So this commit adds a lowering on aten.upsample_nearest2d.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
This commit makes the following changes needed to update bump LLVM:
- Replace `linalg.init_tensor` with `tensor.empty` (see:
https://reviews.llvm.org/D135129)
- Replace `NoSideEffect` with `Pure` (see
https://reviews.llvm.org/D135505)
- Replace `body` region accessor for `ReduceOp` and `ReduceWindowOp`
with `getBody`
- Fix incorrect use of `tosa::ReduceSumOp` in `AtenNativeLayerNormOp`
conversion pattern. The result type of `tosa::ReduceSumOp` must have
the same rank as the input type. (see:
https://www.mlplatform.org/tosa/tosa_spec.html#_reduce_sum)
Co-authored-by: Ashay Rane <ashay@users.noreply.github.com>
Co-authored-by: Ashay Rane <ashay@users.noreply.github.com>
We originally added these to help bring up more complex models with
heavier dependencies. However, over time it has become clear that these
models usually require more than just heavier dependencies -- they often
require a nontrivial amount of "one-off" code to extract the relevant
parts of the model and compile them. This is not a good fit for a
component in the core Torch-MLIR repo.
However, in the community, nod.ai has developed the ["Shark
Tank"](https://github.com/nod-ai/SHARK/tree/main/tank) which has all the
appropriate code to wrangle these models and organize them. We intend to
more heaviliy lean on that as a community and improve the symbiosis
there to serve the role that these heavydep tests were meant to play.
* build: disable LTC again so that we can bump PyTorch version
When built using PyTorch's master branch, the LTC code has been failing
to build for a few days. As a result, the PyTorch version referenced by
Torch-MLIR is stalled to the one from October 4th.
In an effort to advance to PyTorch version, this patch disables LTC, and
a subsequent patch will advance the PyTorch version.
* update PyTorch version to 1.14.0.dev20221010
Also disables the `UpSampleNearest2dDynamicFactor_basic` e2e test, since
the (PyTorch) oracle differs from the computed value for both the
refbackend and the eager_mode backends.
This commit adds lowering of `aten.div.int` and `aten.bitwise_or.Tensor`
ops. Both these ops are required in order to support bloom_560m model.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>