The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

The Torch-MLIR Project

The Torch-MLIR project aims to provide first-class compiler support from the PyTorch ecosystem to the MLIR ecosystem.

This project is participating in the LLVM Incubator process: as such, it is not part of any official LLVM release. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project is not yet endorsed as a component of LLVM.

PyTorch: An open source machine learning framework that accelerates the path from research prototyping to production deployment.

MLIR: The MLIR project is a novel approach to building reusable and extensible compiler infrastructure. MLIR aims to address software fragmentation, improve compilation for heterogeneous hardware, significantly reduce the cost of building domain-specific compilers, and aid in connecting existing compilers together.

Torch-MLIR: Multiple vendors use MLIR as the middle layer, mapping from platform frameworks like PyTorch, JAX, and TensorFlow into MLIR and then progressively lowering down to their target hardware. We have seen half a dozen custom lowerings from PyTorch to MLIR. Canonical lowerings from the PyTorch ecosystem to the MLIR ecosystem would let hardware vendors focus on their unique value rather than implementing yet another PyTorch frontend for MLIR. The goal is analogous to how hardware vendors today add LLVM target support rather than each implementing their own Clang / C++ frontend.

All the roads from PyTorch to Torch MLIR Dialect

We have a few paths to lower down to the Torch MLIR Dialect.

TorchScript: This is the most tested path down to the Torch MLIR Dialect, and the PyTorch ecosystem is converging on using TorchScript IR as a lingua franca.
LazyTensorCore: Read more details here.

Project Communication

#torch-mlir channel on the LLVM Discord - this is the most active communication channel
GitHub issues here
torch-mlir section of LLVM Discourse
Weekly meetings on Mondays 9AM PST. See here for more information.
Weekly op office hours on Thursdays 8:30-9:30AM PST. See here for more information.

Install torch-mlir snapshot

This installs a pre-built snapshot of torch-mlir for Python 3.7/3.8/3.9/3.10 on Linux and macOS.

python -m venv mlir_venv
source mlir_venv/bin/activate
# Some older pip installs may not be able to handle the recent PyTorch deps
python -m pip install --upgrade pip
pip install --pre torch-mlir torchvision -f https://github.com/llvm/torch-mlir/releases --extra-index-url https://download.pytorch.org/whl/nightly/cpu
# This will install the corresponding torch and torchvision nightlies
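
Before creating the virtual environment, it can help to confirm that the interpreter and OS match the snapshot's stated targets. The following is a minimal sketch that encodes only the versions and platforms listed above; the function name `snapshot_supported` is a hypothetical helper, not part of torch-mlir, and the check does not query the release page.

```python
import platform
import sys

# Values taken from the sentence above; nothing here queries torch-mlir itself.
SUPPORTED_PYTHONS = {(3, 7), (3, 8), (3, 9), (3, 10)}
# platform.system() reports macOS as "Darwin".
SUPPORTED_SYSTEMS = {"Linux", "Darwin"}


def snapshot_supported(version=None, system=None):
    """Return True if (major, minor) and OS are in the documented snapshot targets.

    Both arguments default to the running interpreter and host OS.
    """
    if version is None:
        version = sys.version_info
    if system is None:
        system = platform.system()
    return tuple(version[:2]) in SUPPORTED_PYTHONS and system in SUPPORTED_SYSTEMS


if __name__ == "__main__":
    print("snapshot targets this environment:", snapshot_supported())
```

A `False` result only means the environment is outside the documented list; building from source may still be an option.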

Demos

TorchScript ResNet18

Standalone script to convert a PyTorch ResNet18 model to MLIR and run it on the CPU backend:

# Get the latest example if you haven't checked out the code
# (-P saves it under examples/, matching the path used in the run command)
wget -P examples https://raw.githubusercontent.com/llvm/torch-mlir/main/examples/torchscript_resnet18.py

# Run ResNet18 as a standalone script.
python examples/torchscript_resnet18.py

load image from https://upload.wikimedia.org/wikipedia/commons/2/26/YellowLabradorLooking_new.jpg
Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /home/mlir/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
100.0%
PyTorch prediction
[('Labrador retriever', 70.66319274902344), ('golden retriever', 4.956596374511719), ('Chesapeake Bay retriever', 4.195662975311279)]
torch-mlir prediction
[('Labrador retriever', 70.66320037841797), ('golden retriever', 4.956601619720459), ('Chesapeake Bay retriever', 4.195651531219482)]
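
The two result lists above agree up to small floating-point differences. As a minimal sketch of how that agreement could be checked programmatically, the snippet below compares the printed values directly. The helper names `predictions_match` and `max_abs_diff` and the tolerance are illustrative assumptions; they are not part of the example script or of torch-mlir.

```python
import math

# Top-3 predictions copied from the sample output above.
pytorch_top3 = [
    ("Labrador retriever", 70.66319274902344),
    ("golden retriever", 4.956596374511719),
    ("Chesapeake Bay retriever", 4.195662975311279),
]
torch_mlir_top3 = [
    ("Labrador retriever", 70.66320037841797),
    ("golden retriever", 4.956601619720459),
    ("Chesapeake Bay retriever", 4.195651531219482),
]


def predictions_match(expected, actual, abs_tol=1e-3):
    """Return True if labels match in order and scores differ by at most abs_tol.

    abs_tol is an illustrative choice, not a threshold defined by torch-mlir.
    """
    if len(expected) != len(actual):
        return False
    return all(
        e_label == a_label
        and math.isclose(e_score, a_score, rel_tol=0.0, abs_tol=abs_tol)
        for (e_label, e_score), (a_label, a_score) in zip(expected, actual)
    )


def max_abs_diff(expected, actual):
    """Largest absolute score difference between paired predictions."""
    return max(abs(e - a) for (_, e), (_, a) in zip(expected, actual))


if __name__ == "__main__":
    print("match:", predictions_match(pytorch_top3, torch_mlir_top3))
    print("max abs diff:", max_abs_diff(pytorch_top3, torch_mlir_top3))
```

An absolute tolerance is used here because the scores span very different magnitudes; a relative tolerance would be an equally reasonable choice.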

Lazy Tensor Core

View examples here.

Eager Mode

Eager mode with Torch-MLIR is a highly experimental eager-mode backend for PyTorch built on the torch-mlir framework. It works by compiling operator by operator as PyTorch eagerly executes the neural network. If anything in the torch-mlir compilation process fails (e.g., an unsupported operator), this mode falls back to conventional PyTorch. A simple example can be found at eager_mode.py, and a ResNet18 example at eager_mode_resnet18.py.
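
The per-operator compile-then-fall-back behavior described above can be illustrated with a toy sketch. This is not the torch-mlir implementation and uses no torch-mlir or PyTorch APIs: `toy_compile`, `toy_reference`, and `run_op` are hypothetical stand-ins, with plain Python arithmetic playing the role of operators.

```python
import operator

# Toy stand-in for a compiler that supports only some operators (assumption).
_SUPPORTED = {"add": operator.add, "mul": operator.mul}


def toy_compile(op_name):
    """Hypothetical compile step: return a callable kernel or raise."""
    if op_name not in _SUPPORTED:
        raise NotImplementedError(f"unsupported operator: {op_name}")
    return _SUPPORTED[op_name]


def toy_reference(op_name, args):
    """Hypothetical stand-in for conventional (non-compiled) execution."""
    return getattr(operator, op_name)(*args)


def run_op(op_name, args, compile_op=toy_compile, reference_op=toy_reference):
    """Run one operator, falling back if the compile step fails for any reason."""
    try:
        kernel = compile_op(op_name)
    except Exception:
        # Compilation failed (e.g. unsupported operator): use the fallback path.
        return reference_op(op_name, args), "fallback"
    return kernel(*args), "compiled"


# Operators are handled one at a time, in execution order.
program = [("add", (2, 3)), ("sub", (10, 4)), ("mul", (3, 7))]
results = [run_op(name, args) for name, args in program]

if __name__ == "__main__":
    for (name, _), (value, path) in zip(program, results):
        print(f"{name}: {value} via {path}")
```

The point of the sketch is that each operator either succeeds on the compiled path or produces the same result through the fallback, so the caller sees a correct value in both cases.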

Repository Layout

The project follows the conventions of typical MLIR-based projects:

include/torch-mlir, lib: structure for C++ MLIR compiler dialects/passes.
test: holds test code.
tools: torch-mlir-opt and similar tools.
python: top-level directory for Python code.

Developers

If you would like to develop and build torch-mlir from source, please look at Development Notes.