torch-mlir

Commit Graph

Author	SHA1	Message	Date
Prashant Kumar	8ba77ae2a5	Yapf Format `refbacked.py`.	2022-12-15 21:19:52 +05:30
Prashant Kumar	564403e3a1	Add float16 support in the refbackend. This will require https://reviews.llvm.org/D139121 patch to go through.	2022-12-15 21:19:52 +05:30
powderluv	cd90c0aaf5	Update buildAndTest.yml (#1723 )	2022-12-15 05:42:01 -08:00
Sean Silva	af9e8a5e63	[torchdynamo] Move to aot_autograd instead of raw make_fx As [@ezyang suggested](https://github.com/pytorch/pytorch/issues/90276#issuecomment-1339791275), use `torch._dynamo.optimizations.training.aot_autograd` instead of raw `make_fx`. This is more future proof and gives us the backward pass and functionalization. We don't currently get functionalization because of https://github.com/pytorch/pytorch/issues/90759 This also incidentally fixes the source location handling, which makes `lockstep_basic.py` give an accurate source location!	2022-12-15 01:55:50 -08:00
Ashay Rane	64f9a0e978	ci: print ccache statistics and configuration at end of CI run (#1719 ) There appear to be two problems with the caching layer in our CI runs: (a) the sizes of some of the caches have grown to multiples of the 300 MB limit and (b) caching on Windows seems to provide little to no benefit. To help understand the reasons for these problems, this patch adds a line item to the list of steps run in CI to dump the ccache configuration and statistics just prior to uploading the cache artifact.	2022-12-14 09:50:43 -06:00
Roll PyTorch Action	a29f173a6b	update PyTorch version to 2.0.0.dev20221214	2022-12-14 15:23:09 +00:00
Sean Silva	b60da34f84	[cleanup] Fix a few more llvm::None -> std::nullopt	2022-12-14 05:59:49 -08:00
Sean Silva	8c3774bb2a	Minor fixes for development.md - Mention the rotation doc - Fix minor typos / broken link	2022-12-14 02:55:51 -08:00
Ashay Rane	f63bb9f86c	build: update llvm tag to 3a020527 (#1717 ) Summary of changes: - Replace `llvm::None` with `std::nullopt`, since the former is deprecated (https://reviews.llvm.org/D139763) - Use setter for symbol visibility instead of passing string attribute when creating FuncOp	2022-12-14 02:06:39 -06:00
Ahmed S. Taei	b1f6832849	Add aten.slice.Tensor & aten.cat folders (#1691 )	2022-12-13 13:02:47 -08:00
Ashay Rane	731c313231	ci: run `git pull` before committing pytorch version updates (#1716 ) The RollPyTorch action often takes more than 1.5 hours to finish. During this time, if another PR is merged, then the RollPyTorch action needs to first pull the merged changes before committing the updates to the PyTorch commit hash and version files. This patch adds the required `git pull` statement, without which, the subsequent `git push` statement fails, causing the RollPyTorch action to fail as well.	2022-12-13 13:41:41 -06:00
Daniel Ellis	07a65961dd	Disable pypi publishing. See https://github.com/llvm/torch-mlir/issues/1709	2022-12-13 11:45:41 -05:00
Ramiro Leal-Cavazos	a710237437	[custom op] Generalize shape library logic to work with dtypes (#1594 ) * [custom op] Generalize shape library logic to work with dtypes This commit generalizes the shape library logic, so that dtype rules for ops can also be expressed using the same mechanism. In other words, each op can now have a shape function and a dtype function specified in Python that is imported during lowering to calculate the shapes and dtypes throughout a program. For more information about how to specify a dtype function, see the updated `docs/adding_a_shape_and_dtype_function.md`. For those not familiar with how the shape library works, the file `docs/calculations_lib.md` provides an overview.	2022-12-13 08:25:41 -08:00
Sean Silva	2acf7da63c	[README] Small touch-ups, and mention PT2	2022-12-13 08:06:17 -08:00
Roll PyTorch Action	8d098dc8d5	update PyTorch version to 2.0.0.dev20221213	2022-12-13 14:52:27 +00:00
Chi_Liu	163d19cce6	[TOSA] Add aten.add/sub.Scalar/Tensor si64 type support (#1604 )	2022-12-12 12:13:07 -08:00
Ramiro Leal-Cavazos	73bd32d06c	Make `getTensorRank` safer by changing return to `Optional<unsigned>` (#1707 ) Currently `getTensorRank` returns -1 if it was unable to get the rank of the tensor. However, not every use in the codebase was checking the return value, and in some cases, the return value was cast to unsigned, leading to some infinite loops when an unranked tensor reached a decomposition. This commit changes the return of `getTensorRank` to `Optional<unsigned>` to make it clear to the user that the function can fail. This commit also changes a couple of for loops that iterate a vector in reverse order that can potentially become infinite loops into range-based for loops.	2022-12-12 08:56:28 -08:00
Ashay Rane	430737b820	[cleanup] fix naming of private variable according to the style guide (#1704 )	2022-12-12 09:04:46 -06:00
Sean Silva	a595942033	[cleanup] Use `"` instead of `'` for string literals This is the more predominant style in the codebase. I'm sure there are more in other parts of the codebase but it's hard to search/replace.	2022-12-12 02:40:09 -08:00
Vivek Khandelwal	d4862ec611	[MLIR][TORCH] Add e2e support for aten.var_mean op Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>	2022-12-12 15:46:54 +05:30
Vivek Khandelwal	143a8f378d	build: manually update PyTorch version Set PyTorch and TorchVision version to nightly release 2022-12-11. Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>	2022-12-12 15:46:54 +05:30
Vivek Khandelwal	f783e19dcb	Revert "[MLIR][TORCH] Fix mean and mean.dim op for large-sized inputs" This reverts commit `55c7e66aa7`.	2022-12-09 19:30:46 +05:30
Sean Silva	7731211d02	Remove eager_mode This was an experimental attempt at rolling our own op-by-op executor with `__torch_dispatch__`, but it proved difficult to make it robust. Op-by-op execution is very easy to implement robustly now with the PyTorch 2.0 stack, so we don't need eager_mode. Downstream users were using eager_mode to implement lockstep numerical accuracy debuggers. We implemented the same functionality with TorchDynamo in https://github.com/llvm/torch-mlir/pull/1681 so now there is not much reason to continue maintaining it.	2022-12-09 03:50:00 -08:00
Sambhav Jain	109c91ae9b	[CI] Verify bazel buildifier is run and changes committed (#1700 ) Ensures the buildifier (linter for bazel build files) is run and changes are pushed.	2022-12-08 15:56:57 -08:00
Gleb Kazantaev	804f9f1f8f	Extended TorchMLIRLoweringContext with virtual CreateComputation method (#1699 ) * Extended TorchMLIRLoweringContext with virtual CreateComputation method * Fix device_data_cast return value	2022-12-08 15:57:07 -05:00
Sambhav Jain	f8a2592905	[Bazel] Resolve circular dependency and add targets for conversion to MLProgram dialect (#1694 ) A circular dependency was introduced in `e7edcc62fd`. Specifically, the `makeShapeLLVMCompatible` and `makeShapeTorchCompatible` utilities were being called from `lib/Dialect/Torch/IR/TorchTypes.cpp` and `lib/Dialect/Torch/IR/TorchOps.cpp` defined under the `:TorchMLIRTorchDialect` bazel target, leading it to take a dependency on `:TorchMLIRConversionUtils` which already depends on `:TorchMLIRTorchDialect`, hence creating a circular dependency. This commit resolves the same by moving said utilities from `lib/Conversion/Utils/Utils.cpp` to `lib/Dialect/Torch/Utils/Utils.cpp`. Please LMK if there's a better way to fix this and I will update the code. This commit also adds the required targets to support building the new conversions from Torch to ML Program dialect that was introduced in `f416953600`. Bazel build GHA triggered manually to verify: https://github.com/sjain-stanford/torch-mlir/actions/runs/3645944517	2022-12-08 09:49:54 -08:00
Ramiro Leal-Cavazos	a54b334578	Allow running DecomposeComplexOps more than once (#1671 ) The current implementation of `DecomposeComplexOps` fails if an op expected to be decomposed does not get decomposed in the first iteration of the `createTorchSimplificationPipeline` in `LowerToBackendContractPass`. However, some graphs require multiple iterations of `createTorchSimplificationPipeline` to fully propagate all statically knowable information, such as dtypes and shapes, to the entire graph, sometimes resulting in the need to run `DecomposeComplexOps` more than once. This commit changes `DecomposeComplexOps` to use a greedy algorithm for pattern application and moves the legalization check of ops to the `LowerToBackendContractPass` to allow for the `DecomposeComplexOps` to run more than once.	2022-12-08 09:26:38 -08:00
Sean Silva	e8511840c3	[cleanup] Use a single function pipeline for TOSA->Linalg This should run faster and is overall clearer.	2022-12-08 09:02:38 -08:00
Ramiro Leal-Cavazos	76190e8a3f	Remove unnecessary decompose-complex-ops tests (#1693 ) This commit removes lit tests from the `decompose-complex-ops` that are essentially testing a macro expansion, in accordance with https://github.com/llvm/torch-mlir/blob/main/docs/architecture.md#dos-and-donts-for-unit-vs-end-to-end-testing .	2022-12-08 08:22:08 -08:00
Sean Silva	69171c246a	[RefBackend] Add elementwise fusion and buffer deallocation This gives some decent improvements to memory consumption and latency of testing. I would have expected buffer-deallocation to actually make a big difference to the final process RSS but it doesn't appear to. Also running buffer-deallocation later in the pipeline results in miscompiles. I didn't have the time or interest to dig in deeper, but something is off. (numbers below are taken from a single run, but I did do a few runs to make sure that the variance wasn't that great) - Linalg-on-Tensors shows memory consumption improvements and some slight speedups. ``` ./tools/e2e_test.sh -s -v -c refbackend fuse=0 dealloc=0 RSS: 3071.33 MB real 3m58.204s user 6m22.299s sys 0m51.235s fuse=1 dealloc=0 RSS: 2515.89 MB real 3m34.797s user 5m56.902s sys 0m44.933s fuse=1 dealloc=post-bufferize: RSS: 2290.25 MB real 3m42.242s user 6m0.560s sys 0m46.335s ``` - TOSA ResNet18 gets significantly faster and uses significantly less memory. ``` time ./tools/e2e_test.sh -s -v -c tosa -f ResNet18 fuse=0 dealloc=0 rss 1328.56 MB real 0m50.303s user 0m55.355s sys 0m12.260s fuse=1 dealloc=0 rss 859MB real 0m30.454s user 0m35.551s sys 0m11.879s fuse=1 dealloc=post-bufferize: rss 851MB real 0m30.313s user 0m39.889s sys 0m11.941s ``` Big thanks to Ramiro for the methodology here for measuring the RSS with `psutil`: https://gist.github.com/ramiro050/5b5c2501f7389c008d9029210772c3a8	2022-12-08 03:14:42 -08:00
Sean Silva	29c8823464	[e2e tests] Rename default config from "refbackend" to "linalg" This more accurately reflects what it is. The previous name was conflating the use of RefBackend (which `linalg`, `tosa`, and `mhlo` configs all use) with the use of the linalg backend (e.g. TorchToLinalg). This conflation was artificially giving the linalg backend a "privileged" position, which we want to avoid. We still keep it as the default backend, and it remains the most complete, but at least there's not artificial boosting.	2022-12-08 01:34:46 -08:00
Ramiro Leal-Cavazos	dd35488da5	build: update llvm tag to 798fa4b4 (#1684 ) - Support for non-prefixed accessors has been removed. See: https://reviews.llvm.org/D136727 - Rename `operands` to `methodOperands` in `prim.CallMethod` since the name `operands` overlaps with a builtin method name. See: https://reviews.llvm.org/D136727 - Add passes in refbackend to lower memref.subview. See: https://reviews.llvm.org/D136377 - Replace `CopyToValueTensorOps` first in `RewriteViewLikeSubgraph` in maximize-value-semantics. The current implementation of the `RewriteViewLikeSubgraph` pass in maximize-value-semantics creates temporarily invalid IR. In particular, given a forward slice starting from a `CopyToNonValueTensorOp` and ending in `CopyToValueTensorOp`s, the pass first replaces all uses of the `CopyToNonValueTensorOp` with its operand, which results in all the `CopyToValueTensorOp` users having their operand have type `!torch.vtensor`, which is invalid. The correct way to do things is to first replace all the `CopyToValueTensorOp`s with their operand, and then replace all uses of the `CopyToNonValueTensorOp` with its operand. This only started failing now because the generated accessor `getOperand` for the `CopyToValueTensorOp` now returns a `TypedValue<NonValueTensorType>`, which has an assert checking that the value returned is of the expected type.	2022-12-07 12:20:41 -08:00
Sean Silva	b1f9e09f85	[torchdynamo] Add ResNet18 example with TorchDynamo This is a minor variation on our other resnet18 examples swapping in TorchDynamo. We replicate the refbackend_torchdynamo_backend out of the e2e test config to avoid making that appear like a public API. Also, some minor cleanups to TorchDynamoTestConfig.	2022-12-07 09:25:27 -08:00
Daniel Ellis	98d80a642a	Publish releases to PyPI after build	2022-12-07 10:01:55 -05:00
Sean Silva	c956c39c86	[cleanup] Remove disabled e2e test This test has been disabled a long time, and since RefBackend is so slow we don't want to add this unnecessarily. I believe it is covered by downstream testing such as the Shark Tank.	2022-12-07 06:36:48 -08:00
Sean Silva	d52359a891	[docs] Add info about special e2e testing cases.	2022-12-07 12:53:07 +01:00
Vivek Khandelwal	3e4bb2bd8e	[MLIR][TORCH] Add E2E support for randn and randn.generator op Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>	2022-12-06 22:41:24 +05:30
Sean Silva	485c18bb2f	[torchdynamo] Add "lockstep" numerical accuracy debugger. Thanks to TorchDynamo's great layering and design, this is only about 100 lines of code for a basic lockstep debugger. This should allow us to deprecate eager_mode, since AFAIK the only interesting use case that it was really supporting is for downstream users to write lockstep debuggers. NOTE: The exact reporting and interface here is subject to change. Please try it out and provide feedback (or patches :) ). - make_fx should not drop source locations: https://github.com/pytorch/pytorch/issues/90276 - Report tensors better (huge tensors should be summarized) - Maybe don't abort, but just warn? - Allow customizing atol/rtol. - How best to print the failing node? And include surrounding graph context?	2022-12-06 07:57:45 -08:00
Vivek Khandelwal	ef39b9ebb4	build: manually update PyTorch version Set PyTorch and TorchVision version to nightly release 2022-12-05. Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>	2022-12-05 22:44:32 +05:30
Roll PyTorch Action	6c5360e281	update PyTorch version to 1.14.0.dev20221204	2022-12-04 14:28:48 +00:00
Roll PyTorch Action	8baa9e42e7	update PyTorch version to 1.14.0.dev20221203	2022-12-03 14:37:17 +00:00
Roll PyTorch Action	fcc670d785	update PyTorch version to 1.14.0.dev20221202	2022-12-02 14:50:28 +00:00
Vivek Khandelwal	f416953600	[MLIR][TORCH] Add TorchConversionToMLProgram and MLProgramBufferize pass This commit changes the `InsertRngGlobalsPass` to `TorchConversionToMLProgram` pass. This commit also adds the `MLProgramBufferize` pass for the bufferization of ml_program dialect ops to run on refbackend. Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>	2022-12-02 13:20:46 +05:30
Eric Kunze	3fc27cf6ca	Update LLVM Tag to 2c1fa734 (#1670 ) Summary of changes: - Change ShapedType::kDynamicSize -> ShapedType::kDynamic - llvm::NoneType has been deprecated, change convertScalarToDtype to use llvm::None	2022-12-01 20:38:28 -08:00
Sean Silva	88db99946b	[torchdynamo] Use decompositions to support a few ops	2022-12-01 11:25:20 -08:00
Ramiro Leal-Cavazos	b4b92c990e	Replace LCG algorithm with squares64 algorithm in AtenUniformOp (#1633 ) This commit replaces the LCG algorithm that was being used by the `TorchToLinalg` lowering of `AtenUniformOp` to generate random numbers with the `squares64` algorithm, for the LCG algorithm was producing tensors that were highly correlated with one another. Squares64 algorithm: https://arxiv.org/abs/2004.06278 Closes https://github.com/llvm/torch-mlir/issues/1608	2022-12-01 08:30:10 -08:00
Roll PyTorch Action	e66bf7b8cb	update PyTorch version to 1.14.0.dev20221201	2022-12-01 15:01:09 +00:00
Vivek Khandelwal	e7edcc62fd	build: update llvm tag to 147fe9de Summary of changes: - Replace call to `MemoryEffectOpInterface::hasNoEffect` with `isMemoryEffectFree`. - Make fix for the dynamic dims, since `kDynamicSize` value changed to `std::numeric_limits<int64_t>::min()` from `-1` in llvm - `makeShapeLLVMCompatible` and `makeShapeTorchCompatible` utilities convert shapes in order to remain consistent with the Torch and MLIR semantics. - Update tags llvm: 147fe9de29dc13c14835127b35280c4d95c8e8ba mhlo: 1944b5fa6062ec4c065d726c9c5d64f1487ee8c5 Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>	2022-12-01 13:36:50 +05:30
Abhishek Varma	47f67853ac	[RefineTypes] Add Float16Type dtype knowledge support for trivial ops -- This commit adds Float16Type dtype knowledge support for trivial ops. Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>	2022-12-01 10:22:43 +05:30
Ramiro Leal-Cavazos	0983a7f93a	Fix modulus calculation in LCG algorithm of refbackend (#1658 ) The current implementation sets the `nextSeed` value to `temp & 127`, which is wrong. The last step of the LCG algorithm for the multiplier and increment chosen should be `temp % 2^{64} = temp & ((1 << 64) - 1)`. However, because we are dealing with i64 values, the modulus operation happens automatically, so it is not needed. See Donald Knuth's values for LCG here: https://en.wikipedia.org/wiki/Linear_congruential_generator	2022-11-30 08:46:52 -08:00
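
The modulus point in commit 0983a7f93a can be illustrated with a small sketch. This is not the refbackend code: the names `lcg_next` and `lcg_stream` are hypothetical, and the constants are assumed to be the Knuth MMIX values listed on the linked Wikipedia page. Python integers do not wrap, so the reduction modulo 2^64 that i64 arithmetic performs implicitly has to be written out as a mask here.

```python
# Sketch of one LCG step modulo 2**64 (assumed constants: Knuth's MMIX values).
MASK64 = (1 << 64) - 1  # 2**64 - 1, keeps only the low 64 bits

MULTIPLIER = 6364136223846793005  # assumption: Knuth MMIX multiplier
INCREMENT = 1442695040888963407   # assumption: Knuth MMIX increment


def lcg_next(seed: int) -> int:
    """Advance the generator by one step, reducing modulo 2**64."""
    temp = MULTIPLIER * seed + INCREMENT
    # With fixed-width 64-bit integers this mask is implicit (overflow wraps).
    return temp & MASK64


def lcg_stream(seed: int, count: int) -> list:
    """Return `count` successive states starting from `seed` (hypothetical helper)."""
    values = []
    for _ in range(count):
        seed = lcg_next(seed)
        values.append(seed)
    return values
```

Masking with `127` instead would keep only the low 7 bits, so every state would fall in the range 0 to 127, which is why the original `temp & 127` was a bug rather than a modulus by 2^64.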

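The failure mode described in commit 73bd32d06c (a `-1` sentinel cast to unsigned) can be sketched in Python as follows. All function names here are hypothetical stand-ins rather than the torch-mlir API, and the 32-bit width of `unsigned` is an assumption that holds on common platforms.

```python
from typing import List, Optional, Sequence


def as_unsigned32(value: int) -> int:
    """Mimic a C++ cast of an int to a 32-bit unsigned (assumed width)."""
    return value & 0xFFFFFFFF


def get_rank_sentinel(shape: Optional[Sequence[int]]) -> int:
    """Old style: -1 signals that the rank is unknown."""
    return -1 if shape is None else len(shape)


def get_rank_optional(shape: Optional[Sequence[int]]) -> Optional[int]:
    """New style: None signals that the rank is unknown."""
    return None if shape is None else len(shape)


def reversed_dims(shape: Optional[Sequence[int]]) -> Optional[List[int]]:
    """Iterate dimensions in reverse, bailing out when the rank is unknown."""
    rank = get_rank_optional(shape)
    if rank is None:
        return None  # the caller is forced to handle the failure case
    return list(reversed(range(rank)))
```

With the sentinel style, an unchecked `-1` turns into a very large loop bound once it is treated as unsigned; with the optional style, the missing rank cannot be used as a number without an explicit check.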

1651 Commits (8ba77ae2a519bc8db9f5c0ae9197471845b925cf)