torch-mlir

Commit Graph

Author	SHA1	Message	Date
Vivek Khandelwal	f9d59eb500	[MLIR][TORCH] Add decomposition for aten.randn_like op This commit decomposes aten.randn_like op into aten.randn.generator op. Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>	2023-01-18 12:09:27 +05:30
Vivek Khandelwal	999fd9036b	[torchdynamo] Add native_group_norm and split op to the decomp list Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>	2023-01-18 10:40:46 +05:30
Jiahao Li	e2698433db	Fix empty tensor when select -1 (#1787 )	2023-01-17 10:14:14 -08:00
Jiahao Li	4f94831fed	[LINALG][TOSA][MHLO] Add e2e support for aten bitwise ops (#1753 )	2023-01-11 14:40:03 -08:00
Vivek Khandelwal	fd236b2c89	[MLIR][TORCH] Add decomposition for prims.var and prims.sqrt op Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>	2023-01-11 17:39:10 +05:30
Vivek Khandelwal	b966733e04	build: manually update PyTorch version Set PyTorch and TorchVision version to nightly release 2023-01-08. Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>	2023-01-11 17:39:10 +05:30
Gleb Kazantaev	c8b867b876	Added support for aten::norm.ScalarOpt_dim (#1774 ) * Added support for aten::norm.ScalarOpt_dim * Disable NormalizeModule_basic for linalg	2023-01-10 13:08:25 -05:00
Jiahao Li	8dc5d985eb	Add e2e support for aten logical or/and/xor/not ops (#1761 )	2023-01-03 18:11:25 -08:00
Ramiro Leal-Cavazos	273664ded6	[custom op] Replace `tanh` dtype function with `expm1` (#1769 ) This commit replaces the `tanh` dtype function, which was being used to test the implementation of dtype functions in `a710237437`, with a dtype function for `expm1`. The dtype function for `expm1` is identical to the `tanh` one, so the same level of testing is maintained. Currently, there are ops getting dtype information from the `RefineTypes` pass and ops getting dtype information from the `TorchDtypeRefinementPipeline`. Since each pass can only propagate dtype information for the ops it knows how to handle, some models with many ops handled in both passes require the two dtype propagation passes to execute many times, reaching the iteration limit set in the `LowerToBackendContractPass`. To temporarily avoid this issue while the migration to `TorchDtypeRefinementPipeline` is finished, this commit switches `tanh` to `expm1`, since the latter is used a lot less in large models.	2023-01-03 14:18:26 -08:00
Srirammaswamy	a88e3766e8	Add E2E support for LeakyRelu and LeakyReluBackward ops (#1733 ) Co-authored-by: srirammaswamy <srirammaswamy@gmail.com>	2023-01-03 08:30:16 -08:00
Ashay Rane	ac780529b4	Revert e2e support for aten logical or/and/xor/not ops (#1757 ) This reverts commit `eaab9be207`, since it is causing the post-merge CI tests to fail, causing subsequent PRs to be blocked. Specifically, the tests `ElementwiseAtenLogicalAndOpPromoteBroadcastModule_basic` and `ElementwiseAtenLogicalXorOpPromoteBroadcastModule_basic` fail because the oracle does not match the computed result. This patch reverts the commit to make the post-merge builds green again.	2022-12-29 21:01:06 -06:00
Shivam Gupta	2f45959f0d	Prelu lowering to linalg (#1712 ) Prelu lowering to linalg	2022-12-28 08:51:33 +05:30
Jiahao Li	eaab9be207	Add e2e support for aten logical or/and/xor/not ops (#1752 )	2022-12-26 10:23:38 +08:00
Ramiro Leal-Cavazos	3260a1ea6e	Allow passing traced `torch.nn.Module`s into `torch_mlir.compile` (#1743 ) This commit adds support for passing to `torch_mlir.compile` the result of running `torch.jit.trace` on a model by relaxing the condition that checks if the model is already in JIT IR to allow any `torch.jit.ScriptModule`. Fixes https://github.com/llvm/torch-mlir/issues/1739	2022-12-22 08:39:55 -08:00
Jiahao Li	60a139271d	Add aten.std.correction op and its decomposition (#1731 )	2022-12-21 21:02:40 -08:00
Jiahao Li	15b249777b	[Torch][MHLO] Decompose aten.copy op. Lower aten.rsqrt & sigmoid to mhlo. (#1734 )	2022-12-22 10:13:59 +08:00
Chi_Liu	b2cefc0b64	[TOSA] Add aten.masked_fill.Tensor/Scalar support (#1735 )	2022-12-21 08:56:07 -08:00
Jae Hoon (Antonio) Kim	1d695239ff	Unrevert #1724 (#1737 ) * Unrevert #1724 * Update pytorch requirements.txt	2022-12-20 11:17:21 -05:00
Abhishek Varma	66d7a412cb	[RefineTypes] Fix knowledge dtype for `aten.embedding` op -- The dtype of the result of `aten.embedding` should match that of the `weight` operand (operand[0]) instead of being hardcoded to f32. -- This commit aims to provide a fix for the same. Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>	2022-12-20 19:56:12 +05:30
Ashay Rane	dd1cf578a6	build: fix LTC code after upstream PyTorch change (#1727 ) pytorch/pytorch@140a3139 reverted a change from yesterday, causing the RollPyTorch action to break. This patch reverts the corresponding change in the torch-mlir LTC code. This patch also re-enables tests that were previously marked as XFAIL.	2022-12-16 13:07:38 -06:00
ataheridezfouli-groq	17ee643aeb	[TORCH] Add Complex Number support (#1673 ) Add Complex number dtype support to torch tensors. Add aten.fft_fft op to test complex numbers.	2022-12-15 21:40:01 +00:00
Jae Hoon (Antonio) Kim	a2a93891ea	Replace asIntArrayRefSlow with macro (#1724 ) * Replace asIntArrayRefSlow with macro * Update pytorch requirements.txt	2022-12-15 11:52:41 -05:00
Prashant Kumar	8ba77ae2a5	Yapf Format `refbackend.py`.	2022-12-15 21:19:52 +05:30
Prashant Kumar	564403e3a1	Add float16 support in the refbackend. This will require https://reviews.llvm.org/D139121 patch to go through.	2022-12-15 21:19:52 +05:30
Sean Silva	af9e8a5e63	[torchdynamo] Move to aot_autograd instead of raw make_fx As [@ezyang suggested](https://github.com/pytorch/pytorch/issues/90276#issuecomment-1339791275), use `torch._dynamo.optimizations.training.aot_autograd` instead of raw `make_fx`. This is more future proof and gives us the backward pass and functionalization. We don't currently get functionalization because of https://github.com/pytorch/pytorch/issues/90759 This also incidentally fixes the source location handling, which makes `lockstep_basic.py` give an accurate source location!	2022-12-15 01:55:50 -08:00
Ahmed S. Taei	b1f6832849	Add aten.slice.Tensor & aten.cat folders (#1691 )	2022-12-13 13:02:47 -08:00
Ramiro Leal-Cavazos	a710237437	[custom op] Generalize shape library logic to work with dtypes (#1594 ) * [custom op] Generalize shape library logic to work with dtypes This commit generalizes the shape library logic, so that dtype rules for ops can also be expressed using the same mechanism. In other words, each op can now have a shape function and a dtype function specified in Python that is imported during lowering to calculate the shapes and dtypes throughout a program. For more information about how to specify a dtype function, see the updated `docs/adding_a_shape_and_dtype_function.md`. For those not familiar with how the shape library works, the file `docs/calculations_lib.md` provides an overview.	2022-12-13 08:25:41 -08:00
Ashay Rane	430737b820	[cleanup] fix naming of private variable according to the style guide (#1704 )	2022-12-12 09:04:46 -06:00
Vivek Khandelwal	d4862ec611	[MLIR][TORCH] Add e2e support for aten.var_mean op Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>	2022-12-12 15:46:54 +05:30
Vivek Khandelwal	f783e19dcb	Revert "[MLIR][TORCH] Fix mean and mean.dim op for large-sized inputs" This reverts commit `55c7e66aa7`.	2022-12-09 19:30:46 +05:30
Sean Silva	7731211d02	Remove eager_mode This was an experimental attempt at rolling our own op-by-op executor with `__torch_dispatch__`, but it proved difficult to make it robust. Op-by-op execution is very easy to implement robustly now with the PyTorch 2.0 stack, so we don't need eager_mode. Downstream users were using eager_mode to implement lockstep numerical accuracy debuggers. We implemented the same functionality with TorchDynamo in https://github.com/llvm/torch-mlir/pull/1681 so now there is not much reason to continue maintaining it.	2022-12-09 03:50:00 -08:00
Gleb Kazantaev	804f9f1f8f	Extended TorchMLIRLoweringContext with virtual CreateComputation method (#1699 ) * Extended TorchMLIRLoweringContext with virtual CreateComputation method * Fix device_data_cast return value	2022-12-08 15:57:07 -05:00
Sean Silva	e8511840c3	[cleanup] Use a single function pipeline for TOSA->Linalg This should run faster and is overall clearer.	2022-12-08 09:02:38 -08:00
Sean Silva	69171c246a	[RefBackend] Add elementwise fusion and buffer deallocation This gives some decent improvements to memory consumption and latency of testing. I would have expected buffer-deallocation to actually make a big difference to the final process RSS but it doesn't appear to. Also running buffer-deallocation later in the pipeline results in miscompiles. I didn't have the time or interest to dig in deeper, but something is off. (numbers below are taken from a single run, but I did do a few runs to make sure that the variance wasn't that great) - Linalg-on-Tensors shows memory consumption improvements and some slight speedups. ``` ./tools/e2e_test.sh -s -v -c refbackend fuse=0 dealloc=0 RSS: 3071.33 MB real 3m58.204s user 6m22.299s sys 0m51.235s fuse=1 dealloc=0 RSS: 2515.89 MB real 3m34.797s user 5m56.902s sys 0m44.933s fuse=1 dealloc=post-bufferize: RSS: 2290.25 MB real 3m42.242s user 6m0.560s sys 0m46.335s ``` - TOSA ResNet18 gets significantly faster and uses significantly less memory. ``` time ./tools/e2e_test.sh -s -v -c tosa -f ResNet18 fuse=0 dealloc=0 rss 1328.56 MB real 0m50.303s user 0m55.355s sys 0m12.260s fuse=1 dealloc=0 rss 859MB real 0m30.454s user 0m35.551s sys 0m11.879s fuse=1 dealloc=post-bufferize: rss 851MB real 0m30.313s user 0m39.889s sys 0m11.941s ``` Big thanks to Ramiro for the methodology here for measuring the RSS with `psutil`: https://gist.github.com/ramiro050/5b5c2501f7389c008d9029210772c3a8	2022-12-08 03:14:42 -08:00
Ramiro Leal-Cavazos	dd35488da5	build: update llvm tag to 798fa4b4 (#1684 ) - Support for non-prefixed accessors has been removed. See: https://reviews.llvm.org/D136727 - Rename `operands` to `methodOperands` in `prim.CallMethod` since the name `operands` overlaps with a builtin method name. See: https://reviews.llvm.org/D136727 - Add passes in refbackend to lower memref.subview. See: https://reviews.llvm.org/D136377 - Replace `CopyToValueTensorOps` first in `RewriteViewLikeSubgraph` in maximize-value-semantics. The current implementation of the `RewriteViewLikeSubgraph` pass in maximize-value-semantics creates temporarily invalid IR. In particular, given a forward slice starting from a `CopyToNonValueTensorOp` and ending in `CopyToValueTensorOp`s, the pass first replaces all uses of the `CopyToNonValueTensorOp` with its operand, which results in all the `CopyToValueTensorOp` users having their operand have type `!torch.vtensor`, which is invalid. The correct way to do things is to first replace all the `CopyToValueTensorOp`s with their operand, and then replace all uses of the `CopyToNonValueTensorOp` with its operand. This only started failing now because the generated accessor `getOperand` for the `CopyToValueTensorOp` now returns a `TypedValue<NonValueTensorType>`, which has an assert checking that the value returned is of the expected type.	2022-12-07 12:20:41 -08:00
Sean Silva	b1f9e09f85	[torchdynamo] Add ResNet18 example with TorchDynamo This is a minor variation on our other resnet18 examples swapping in TorchDynamo. We replicate the refbackend_torchdynamo_backend out of the e2e test config to avoid making that appear like a public API. Also, some minor cleanups to TorchDynamoTestConfig.	2022-12-07 09:25:27 -08:00
Sean Silva	c956c39c86	[cleanup] Remove disabled e2e test This test has been disabled a long time, and since RefBackend is so slow we don't want to add this unnecessarily. I believe it is covered by downstream testing such as the Shark Tank.	2022-12-07 06:36:48 -08:00
Vivek Khandelwal	3e4bb2bd8e	[MLIR][TORCH] Add E2E support for randn and randn.generator op Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>	2022-12-06 22:41:24 +05:30
Sean Silva	485c18bb2f	[torchdynamo] Add "lockstep" numerical accuracy debugger. Thanks to TorchDynamo's great layering and design, this is only about 100 lines of code for a basic lockstep debugger. This should allow us to deprecate eager_mode, since AFAIK the only interesting use case that it was really supporting is for downstream users to write lockstep debuggers. NOTE: The exact reporting and interface here is subject to change. Please try it out and provide feedback (or patches :) ). - make_fx should not drop source locations: https://github.com/pytorch/pytorch/issues/90276 - Report tensors better (huge tensors should be summarized) - Maybe don't abort, but just warn? - Allow customizing atol/rtol. - How best to print the failing node? And include surrounding graph context?	2022-12-06 07:57:45 -08:00
Vivek Khandelwal	ef39b9ebb4	build: manually update PyTorch version Set PyTorch and TorchVision version to nightly release 2022-12-05. Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>	2022-12-05 22:44:32 +05:30
Vivek Khandelwal	f416953600	[MLIR][TORCH] Add TorchConversionToMLProgram and MLProgramBufferize pass This commit changes the `InsertRngGlobalsPass` to `TorchConversionToMLProgram` pass. This commit also adds the `MLProgramBufferize` pass for the bufferization of ml_program dialect ops to run on refbackend. Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>	2022-12-02 13:20:46 +05:30
Sean Silva	88db99946b	[torchdynamo] Use decompositions to support a few ops	2022-12-01 11:25:20 -08:00
Ramiro Leal-Cavazos	b4b92c990e	Replace LCG algorithm with squares64 algorithm in AtenUniformOp (#1633 ) This commit replaces the LCG algorithm that was being used by the `TorchToLinalg` lowering of `AtenUniformOp` to generate random numbers with the `squares64` algorithm, for the LCG algorithm was producing tensors that were highly correlated with one another. Squares64 algorithm: https://arxiv.org/abs/2004.06278 Closes https://github.com/llvm/torch-mlir/issues/1608	2022-12-01 08:30:10 -08:00
Ramiro Leal-Cavazos	0983a7f93a	Fix modulus calculation in LCG algorithm of refbackend (#1658 ) The current implementation sets the `nextSeed` value to `temp & 127`, which is wrong. The last step of the LCG algorithm for the multiplier and increment chosen should be `temp % 2^{64} = temp & ((1 << 64) - 1)`. However, because we are dealing with i64 values, the modulus operation happens automatically, so it is not needed. See Donald Knuth's values for LCG here: https://en.wikipedia.org/wiki/Linear_congruential_generator	2022-11-30 08:46:52 -08:00
Abhishek Varma	c27c1791f1	[MLIR][TORCH] Add e2e support for `aten.amax` op -- This commit adds e2e support for `aten.amax` op. Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>	2022-11-30 17:54:37 +05:30
Tanyo Kwok	bbcdb38d99	Revert "Decompose torch.slice_scatter (#1622 )" (#1659 ) This reverts commit `f3f2f10030`.	2022-11-30 12:47:13 +08:00
Daniel Ellis	e2de20575f	Automatically strip overloads for FX-based models.	2022-11-29 22:19:09 -05:00
Ramiro Leal-Cavazos	a8cbfff95b	Reduce memory usage of e2e tests by reducing input sizes (#1653 ) There are a few e2e tests that take several very large tensors as input, which leads to the e2e test suite leaking too much memory. Running things locally resulted in a total memory usage of 12.5 GB when running the suite sequentially on the refbackend. Many of the tests that take large tensors don't actually need such large tensors to pass, and some that take several large tensors as input are just doing the same thing multiple times. This commit reduces the size of some of the tensors and removes repetitive parts of tests to reduce the memory usage to a total of 3 GB.	2022-11-29 10:03:36 -08:00
Sean Silva	5a488ff085	Remove deprecated np.bool `np.bool is bool` and will never be returned as a dtype of an `np.ndarray`, so we don't need to handle it here. ``` >>> a = np.ndarray([1], dtype=bool) >>> a.dtype.type is np.bool_ True ``` More info here: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations	2022-11-29 01:46:21 -08:00
Sean Silva	5a27f826b8	Fix multiprocessing for `--config=torchdynamo` For reasons that I haven't yet fully tracked down, the TorchDynamo TestConfig seems to result in tensors that cannot be pickled. They seem to be holding some sort of weak handles to a `torch.fx.graph.Graph`. Here is the object structure that leads to the unpickleable object: ``` (<function _rebuild_tensor_v2 at 0x7f56346d56c0>, <class 'torch.Tensor'>, ( 1.0... {<object object at 0x7f557529e6b0>: <WeakKeyDictionary at 0x7f556a3efbb0>} {'data': {<weakref at 0x7f5615372ed0; to 'PythonKeyTracer' at 0x7f556a3ee5c0>: _... <class 'torch.fx.graph.Graph'> <class 'torch._ops.OpOverloadPacket'> TypeError("cannot pickle 'torch._C.FunctionSchema' object") ``` Upstream bug filed: https://github.com/pytorch/pytorch/issues/89626	2022-11-28 04:03:11 -08:00
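
The modulus reasoning in commit `0983a7f93a` above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the refbackend code: the names `lcg_next`, `buggy_next`, and `MASK64` are hypothetical, and the multiplier and increment are the Knuth (MMIX) values from the Wikipedia page linked in that commit. Because Python integers do not wrap, the reduction modulo 2^64 is written out explicitly; the mask that keeps the low 64 bits is `(1 << 64) - 1`, whereas `1 << 63` would keep only the top bit.

```python
# Knuth's MMIX LCG constants (as listed on the Wikipedia LCG page).
MULTIPLIER = 6364136223846793005
INCREMENT = 1442695040888963407
MASK64 = (1 << 64) - 1  # keeps the low 64 bits, i.e. x % 2**64


def lcg_next(seed: int) -> int:
    """Advance the generator one step, emulating 64-bit wraparound."""
    temp = MULTIPLIER * seed + INCREMENT
    return temp & MASK64


def buggy_next(seed: int) -> int:
    """Hypothetical reproduction of the old mask: only 7 bits survive."""
    temp = MULTIPLIER * seed + INCREMENT
    return temp & 127


if __name__ == "__main__":
    seed = 42
    for _ in range(3):
        seed = lcg_next(seed)
        print(hex(seed))
```

With the `& 127` mask the state can take at most 128 distinct values, so the sequence repeats almost immediately; with true 64-bit arithmetic, as in the i64 case the commit describes, no explicit mask is needed at all.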


642 Commits (158c9b540846a428652ca090e7ea24335751bc36)