Commit Graph

68 Commits (4a7a7d76f8870cad43a1803312efce7a8ae8643b)

Author SHA1 Message Date
saienduri 8e2e5eeae9
add support for decomposition (#2879)
This commit adds decomposition support into the core aten operators
before importing the module from torch.

Also, this commit deals with the lifted tensor constants in
torch.export.export(). We don't want to add unnecessary placeholder
nodes in the graph (extra args in the block module), and should treat
them like the constants that they are. The unnecessary clone is also
removed for max efficiency.
2024-02-14 21:00:52 -08:00
saienduri bfcf93ea21
Rename torch_mlir.compile APIs and introduce FX based analogs (#2842)
Link to related RFC:
https://discourse.llvm.org/t/rfc-rename-torch-mlir-compile-apis-and-introduce-fx-based-analogs/76646
This commit updates the documentation, tests, CMake files, and API for
the proposed changes in the RFC. There is a new torch_mlir/fx.py for
user level APIs related to importing modules and a corresponding test
for this path can be found at test/python/fx_importer/basic_test.py.

---------

Co-authored-by: MaheshRavishankar <mravisha@amd.com>
2024-02-06 19:07:59 -08:00
Stella Laurenzo 77c14ab22b
[ci] Upgrade to new runners and disable unsupported jobs. (#2818)
Per the RFC and numerous conversations on Discord, this rebuilds the
torch-mlir CI and discontinues the infra and coupling to the binary
releases
(https://discourse.llvm.org/t/rfc-discontinuing-pytorch-1-binary-releases/76371).

I iterated on this to get latency back to about what it was with the old
(much larger and non-ephemeral) runners: About 4m - 4.5m for an
incremental change.

Behind the scenes changes:

* Uses a new runner pool operated by AMD. It is currently set to manual
scaling and has two runners (32-core, 64GiB RAM) while we get some
traction. We can either fiddle with some auto-scaling or use a schedule
to give it an increase during certain high traffic hours.
* Builds are now completely isolated and cannot have run-to-run
interference like we were getting before (i.e. lock file/permissions
stuff).
* The GHA runner is installed directly into a manylinux 2.28 container
with upgraded dev tools. This eliminates the need to do sub-invocations
of docker on Linux in order to run on the same OS that is used to build
wheels.
* While not using it now, this setup was cloned from another project
that posts the built artifacts to the job and fans out testing. Might be
useful here later.
* Uses a special git cache that lets us have ephemeral runners and still
check out the repo and deps (incl. llvm) in ~13s.
* Running in an Azure VM Scale Set.

In-repo changes:

* Disables (but does not yet delete):
  * Old buildAndTest.yml jobs
  * releaseSnapshotPackage.yml
* Adds a new `ci.yml` pipeline and scripts the steps in `build_tools/ci`
(by decomposing the existing `build_linux_packages.sh` for in-tree
builds and modularizing it a bit better).
* Test framework changes:
* Adds a `TORCH_MLIR_TEST_CONCURRENCY` env var that can be used to bound
the multiprocess concurrency. Ended up not using this in the final
version but is useful to have as a knob.
* Changes the default concurrency to `nproc * 0.8 + 1` vs `nproc * 1.1`.
We're running on systems with significantly less virtual memory and I
did a bit of fiddling to find a good tradeoff.
* Changed multiprocess mode to spawn instead of fork. Otherwise, I was
getting instability (as discussed on discord).
* Added MLIR configuration to disable multithreaded contexts globally
for the project. Constantly spawning `nproc * nproc` threads (more than
that actually) was OOM'ing.
* Added a test timeout of 5 minutes. If a multiprocess worker crashes,
the framework can get wedged indefinitely (and then will just be reaped
after multiple hours). We should fix this, but this at least keeps the
CI pool from wedging with stuck jobs.

Functional changes needing followup:

* No matter what I did, I couldn't get the LTC tests to work, and I'm
not 100% sure they were being run in the old setup as the scripts were a
bit twisty. I disabled them and left a comment.
* Dropped out-of-tree build variants. These were not providing much
signal and increase CI needs by 50%.
* Dropped MacOS and Windows builds. Now that we are "just a library" and
not building releases, there is less pressure to test these commit by
commit. Further, since we bump torch-mlir to known good commits on these
platforms, it has been a long time since either of these jobs have
provided much signal (and they take ~an hour+ to run). We can add them
back later post-submit if ever needed.
2024-01-27 18:35:45 -08:00
Stella Laurenzo ccd469ca0d
[fx] Upstream the turbine FxImporter to torch-mlir. (#2681)
Changes made during upstreaming:

* Removed comments attributing some copied code back to torch-mlir
(since it is now repatriated).
* Re-organized imports.
* Inlined RefMapping/RefTracker and TypeSubclassMap from an external
utility module.
* Added FxImporter class comments.
* Updated stack trace extraction to be fail safe.
* Added an entry-point for `import_frozen_exported_program` which uses
the shiny new upstream `torch.export.export()` API (versus the
lower-level/older API that Turbine is presently using). This
necessitated a small FX rewrite to line external state management up
with current conventions.
* Adapted one of Turbine's importer tests to go with this initial
submission. Turbine unfortunately has a lot of more-integration-ey
tests, and I would like to extract those as more of unit tests of the
importer features and upstream them that way vs trying to copy directly.
For now, one overall test with the initial submission gets us moving.

I acknowledge that there are some code quality things that could be
improved in this submission: this was authored over the course of many
months (and often via some trial and error). I would like to keep it
relatively converged with the downstream for the next few steps while
getting the test suite upstreamed. And then it will be easier to take a
hygienic pass through the code.

Including co-authors for contributors in the git log of the original
repository.

Co-authored-by: Ean Garvey <87458719+monorimet@users.noreply.github.com>
Co-authored-by: Avinash Sharma <aviator1994@gmail.com>
Co-authored-by: Arham Khan <arhammkhan@gmail.com>
Co-authored-by: brucekimrokcmu <kwangkyk@alumni.cmu.edu>
Co-authored-by: saienduri <77521230+saienduri@users.noreply.github.com>
2023-12-21 08:40:10 -08:00
Stella Laurenzo ed4df38e8d
[onnx] Add torch-mlir-import-onnx tool. (#2637)
Simple Python console script to import an ONNX protobuf to the torch
dialect for additional processing.

For installed wheels, this can be used with something like:

```
torch-mlir-import-onnx test/python/onnx_importer/LeakyReLU.onnx
```

Or from a dev setup:

```
python -m torch_mlir.tools.import_onnx ...
```
2023-12-12 22:01:30 -08:00
Stella Laurenzo 74f7a0c9d6
Upstream the ONNX importer. (#2636)
This is part 1 of 2, which will also include upstreaming the FX
importer. I started with ONNX because it forces some project layout
updates and is more self contained/easier as a first step.

Deviating somewhat from the RFCs on project layout, I made the following
decisions:

* Locating the `onnx_importer.py` into `torch_mlir.extras` as Maks
already has opened up that namespace and it seemed to fit. Better to
have fewer things at that level.
* Setup the build so that the root project only contains MLIR Python and
pure Python deps (like the importers), but this can be augmented with
the `projects/` adding more depending on which features are enabled.
* The default build continues to build everything whereas in
`TORCH_MLIR_ENABLE_ONLY_MLIR_PYTHON_BINDINGS=1` mode, it builds a
`torch-mlir-core` wheel with the pure contents only.

`onnx_importer.py` and `importer_smoke_test.py` are almost verbatim
copies from SHARK-Turbine. I made some minor local alterations to adapt
to paths and generalize the way they interact with the outer project. I
expect I can copy these back to Turbine verbatim from here. I also
updated the license boilerplate (they have the same license but slightly
different project norms for the headers) but retained the correct
copyright.

Other updates:

* Added the ONNX importer unit test (which also can generate test data)
in lit, conditioned on the availability of the Python `onnx` package. In
a followup once I know everything is stable, I'll add another env var
that the CI can set to always enable this so we know conclusively if
tests pass.
* Moved the ONNX conversion readme to `docs/`.
* Renamed CMake option `TORCH_MLIR_ENABLE_ONLY_MLIR_PYTHON_BINDINGS` ->
`TORCH_MLIR_ENABLE_PYTORCH_EXTENSIONS` and inverted the sense. Made the
JitIR importer and LTC options `cmake_dependent_options` for robustness.
2023-12-12 19:02:51 -08:00
Stella Laurenzo 6961f0a247
Re-organize project structure to separate PyTorch dependencies from core project. (#2542)
This is a first step towards the structure we discussed here:
https://gist.github.com/stellaraccident/931b068aaf7fa56f34069426740ebf20

There are two primary goals:

1. Separate the core project (C++ dialects and conversions) from the
hard PyTorch dependencies. We move all such things into projects/pt1 as
a starting point since they are presently entangled with PT1-era APIs.
Additional work can be done to disentangle components from that
(specifically LTC is identified as likely ultimately living in a
`projects/ltc`).
2. Create space for native PyTorch2 Dynamo-based infra to be upstreamed
without needing to co-exist with the original TorchScript path.

Very little changes in this path with respect to build layering or
options. These can be updated in a followup without commingling
directory structure changes.

This also takes steps toward a couple of other layering enhancements:

* Removes the llvm-external-projects/torch-mlir-dialects sub-project,
collapsing it into the main tree.
* Audits and fixes up the core C++ build to account for issues found
while moving things. This is just an opportunistic pass through but
roughly ~halves the number of build actions for the project from the
high 4000's to the low 2000's.

It deviates from the discussed plan by having a `projects/` tree instead
of `compat/`. As I was thinking about it, this will better accommodate
the follow-on code movement.

Once things are roughly in place and the CI passing, followups will
focus on more in-situ fixes and cleanups.
2023-11-02 19:45:55 -07:00
Matthias Gehre 816880774b
Fix version comparison against stable (#2209) 2023-06-07 10:19:38 +02:00
Maksim Levental c3cd7471b4
Pure-Python FX importer. (#2098)
Co-authored-by: Sean Silva <silvasean@google.com>
2023-05-12 00:46:33 -05:00
Maksim Levental 2eddb3fde7
WIP: No PyTorch dep (#1854) 2023-02-13 14:21:06 -06:00
Ramiro Leal-Cavazos a710237437
[custom op] Generalize shape library logic to work with dtypes (#1594)
* [custom op] Generalize shape library logic to work with dtypes

This commit generalizes the shape library logic, so that dtype rules
for ops can also be expressed using the same mechanism. In other
words, each op can now have a shape function and a dtype function
specified in Python that is imported during lowering to calculate the
shapes and dtypes throught a program. For more information about how
to specify a dtype function, see the updated
`docs/adding_a_shape_and_dtype_function.md`.

For those not familiar with how the shape library works, the file
`docs/calculations_lib.md` provides an overview.
2022-12-13 08:25:41 -08:00
Sean Silva 7731211d02 Remove eager_mode
This was an experimental attempt at rolling out own op-by-op executor
with `__torch_dispatch__`, but it proved difficult to make it robust.
Op-by-op execution is very easy to implement robustly now with the
PyTorch 2.0 stack, so we don't need eager_mode.

Downstream users were using eager_mode to implement lockstep numerical
accuracy debuggers. We implemented the same functionality with
TorchDynamo in https://github.com/llvm/torch-mlir/pull/1681 so now there
is not much reason to continue maintaining it.
2022-12-09 03:50:00 -08:00
Sean Silva 28957adaac [torchdynamo] Initial TorchDynamo support
This adds a basic e2e Config for TorchDynamo using
Linalg-on-Tensors/RefBackend.
But TorchDynamo is pretty orthogonal to
various other pieces, so it should compose nicely with variations like:
- Switching out all the backends (Linalg-on-Tensors, TOSA, MHLO)
- PyTorch functionalization and decompositions
- Taking the example inputs and compiling with all dynamic or all static
  shapes without duplicating tests.

This adds it to the CI, but there are still a lot of XFAIL's.

This also adds a helper `from torch_mlir.dynamo import
make_simple_dynamo_backend` which simplifies some of the steps for
making a Torch-MLIR-based TorchDynamo backend. We include "simple" in
the name because we are going to be exploring various things next from
the long-term roadmap.

The next steps are:
- Burn down all the XFAIL's.
- Start working on the pieces from the [long-term roadmap](https://github.com/llvm/torch-mlir/blob/main/docs/long_term_roadmap.md).
  - Add functionalization/decompositions into the TorchDynamo flow and
    remove reliance on the current Torch-MLIR "frontend".
  - Write a pure-Python direct FX->MLIR importer.
  - Hook up the new PyTorch symbolic shape stuff.
  - Explore PrimTorch decompositions for simplifying backends.
2022-11-24 04:10:25 -08:00
Ashay Rane a9942f343a
Cache PyTorch source builds to reduce CI time (#1500)
* ci: cache PyTorch source builds

This patch reduces the time spent in regular CI builds by caching
PyTorch source builds.  Specifically, this patch:

1. Makes CI lookup the cache entry for the PyTorch commit hash in
   pytorch-version.txt
2. If lookup was successful, CI fetches the previously-generated WHL
   file into the build_tools/python/wheelhouse directory
3. CI sets the `TM_PYTORCH_INSTALL_WITHOUT_REBUILD` variable to `true`
4. The build_libtorch.sh script then uses the downloaded WHL file
   instead of rebuilding PyTorch

* ci: warm up PyTorch source cache during daily RollPyTorch action

This patch makes the RollPyTorch action write the updated WHL file to
the cache, so that it can be later retrieved by CI that runs for each
PR.  We deliberately add the caching step to the end of the action since
the RollPyTorch action never needs to read from the cache, although
executing this step earlier in the process should not cause problems
either.
2022-10-18 00:42:42 -05:00
Henry Tu ba17a4d6c0
Reenable LTC in out-of-tree build (for real this time) (#1205)
* Fix OOT LTC CI build failure

* Disable LTC during macOS package gen

* Add more details about static TorchMLIRJITIRImporter library
2022-08-19 15:25:00 -04:00
nithinsubbiah fde390c766 Re-enable custom op support 2022-08-16 22:49:08 +05:30
powderluv 2342456356
mac m1 cross compile (#1204)
* mac m1 cross compile

Add support for M1 cross compile

* Remove redundant ExecutionEngine

It is registered as part of RegisterEverything

* nuke non-universal zstd

disable LTC
2022-08-10 08:48:39 -07:00
Sean Silva 5618890ca0 development.md: Avoid name collisions with PYTORCH_ variables 2022-08-05 19:41:08 -07:00
Henry Tu e322f6a878
Update LTC CMake hack documentation (#1155)
* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update buildAndTest.yml

* Update setup.py

* Address review comments
2022-08-05 14:12:20 -04:00
Henry Tu 2c3b3606d0 Resolve remaining LTC CI failures (#1110)
* Replace CHECK_EQ with TORCH_CHECK_EQ

* Check value of TORCH_MLIR_USE_INSTALLED_PYTORCH during LTC build

* Update LTC XFAIL with NewZerosModule ops

* Explicitly blacklist _like ops

* Automatically blacklist new_/_like ops

* Prune away unused Python dependencies from LTC

* Add flag to disable LTC

* Autogen dummy _REFERENCE_LAZY_BACKEND library when LTC is disabled

* Implement compute_shape_var

* Removed Var tests from XFAIL Set

* XFAIL tests using _local_scalar_dense or index.Tensor

* Add StdDim tests to XFAIL set

* Autogen aten::cat
2022-07-30 09:40:02 -04:00
Henry Tu 47bb38d180 Reference Lazy Backend (#1045)
* Changed Example MLIR backend to Reference MLIR backend

* Moved reference_ltc_backend into csrc

* Merged sys_utils.h

* Renamed reference_ltc_backend to reference_lazy_backend

* Addressed review comments

* Update docs with new library name

* Removed _REFERENCE_LAZY_BACKEND from .gitignore

* Added reference_lazy_backend to the TorchMLIRPythonModules dependency list

Fixed typo in `ltc_examples.md`

Missed instance where `ltc_backend` was used instead of `lazy_backend`.
2022-07-30 09:40:02 -04:00
Jae Hoon (Antonio) Kim 2f22e2ef40 Add initial LTC backend (#610)
* Add initial LTC backend skeleton

* Disable CI build and move TorchMLIRPyTorch.cmake
2022-07-30 09:40:02 -04:00
powderluv f424930a28
Add option to expose custom PyTorch repo/branch (#1103) 2022-07-24 20:08:48 -07:00
powderluv 31fd812acf
Add linux and macOS source builds in CI (#1070)
This enables building Pytorch from source in the CI.
The build should mostly hit the ccache.
Release builds will follow once we have some runtime on the CI.
2022-07-21 14:16:03 -07:00
Ashay Rane 72dd04cdb3
Revert "python: trim registration and loading of dialects and passes" (#1093)
This reverts commit ad283c1043, since it's
causing nightly build failures for all platforms.
2022-07-21 09:35:42 -07:00
Ashay Rane ad283c1043
python: trim registration and loading of dialects and passes (#1084)
In the interest of merging upstream LLVM quickly, a previous patch
(7f08169) updated the torch-mlir build to register all dialects and
passes through Python bindings.  This patch limits the dialects and
passes to only those that are used in torch-mlir.

Key to this change are the removal of
`MLIRPythonExtension.RegisterEverything` and the introduction of a new
Python module (`_mlir_libs/_site_initialize_0.py`), where we register
the dialects and passes used by torch-mlir.
2022-07-20 18:34:17 -07:00
Ashay Rane 7f08169380
bump llvm tag to 3580daa (#1078)
This patch makes some rudimentary changes to torch-mlir's use of MLIR
Python bindings to work with the most recent LLVM code.  We can perhaps
do better by being more selective in what we link against, instead of
using `MLIRPythonExtension.RegisterEverything`.
2022-07-18 16:49:03 -07:00
powderluv 479a8a8963
Remove libtorch downloads (#1058)
Remove all the libtorch downloads. If the user sets
-DTORCH_MLIR_USE_INSTALLED_PYTORCH=OFF then just build from src.

Doesn't change developer workflow since we still default to local
PyTorch versions.

TEST: Build and verify all tests (except one xfail quant) pass on linux
2022-07-14 17:16:51 -07:00
Maksim Levental 1bb990afc7
Speed up libtorch build. (#1031) 2022-07-11 20:46:49 -05:00
Ashay Rane 874fdb7e42
build: improve robustness of cmake and shell scripts (#1018)
On my local machine, `unzip` didn't exist (producing a "command not
found" error), but CMake ignored the error.  Although the build did
succeed (because it found a previously-built version of libtorch), it
seems better to abort builds on such failures, so this patch checks the
return code of all external process invocations.

Along similar lines, this patch also updates the shell scripts in
`build_tools` to extensively use double-quoting to prevent unintentional
word splitting or globbing.  Since some of the scripts execute `rm`
while using shell variables, this patch also adds the preamble `set -u`
to abort execution if an undefined variable is referenced, so that we
reduce the chances of executing `rm -rf /` if the path expression
happens to refer to an undefined variable.
2022-07-06 14:39:30 -07:00
powderluv 33bfeda4c5
Enable libtorch caching and source builds (#1004)
Add an option to cache libtorch/ releases if you don't want to
download the latest. Add an option to enable source builds.

TESTS:
macOS: verify with / without cache downloads
       verify source builds -- shared and static

Linux: Build Tests and Release builds
2022-07-05 10:25:43 -07:00
powderluv 2b52da951b
Link against libtorch (#955)
This moves torch-mlir to link against libtorch on macOS and linux

TESTS: Tests pass. Tested release builds on linux and macOS
2022-06-30 12:40:17 -07:00
powderluv 8fd084377d
Update CMakeLists.txt 2022-06-14 14:46:52 -07:00
powderluv dfc6f7c547
Update CMakeLists.txt
Emergency fix to unblock the nightly Release builder
2022-06-14 14:38:35 -07:00
Bob Adolf 0a7ba62438
Allow torch-mlir to support PyTorch extensions. (#895)
PyTorch allows new operators to be registered dynamically in modules.
Torch-mlir already makes it fairly straightforward to add support for
new operators, and this commit just extends that support to allow new
PyTorch ops to come from a external module.

This does *not* allow ops to be dynamically loaded into torch-mlir.
Torch-mlir must still be compiled with support built-in.

Add a `_torch_mlir_custom_op_example` subpackage to `torch_mlir` which
registers an demonstration op. It will not be imported by default when
importing torch_mlir. It's strictly for testing and documentation.

Adds an end-to-end test for the `torch_mlir_custom_op_example::identity` op.

With all these changes, we should now be actively testing PyTorch extension
support with all future patches.
2022-06-13 14:51:30 -07:00
Sean Silva 075464fa74 Add a new `torch_mlir.compile` method.
This makes it much easier to convert models and hides all the
ClassAnnotator complexity.

This also adds a new example `torchscript_resnet18_all_output_types.py`
which shows the ResNet18 IR for all output types.

Also,

- This moves `run_pipeline_with_repro_report` to
  `torch_mlir.compiler_utils`.
2022-04-20 10:06:01 -07:00
max fe8ac57e6d This PR implements an eager mode backend for PyTorch through the torch-mlir framework. This is accomplished by overriding the `__torch_dispatch__` class method on wrapper subclass `TorchMLIRTensor(torch.Tensor)`.
Effectively, this mode works by compiling op by op as the NN is eagerly executed by PyTorch. Entailed in that compilation is building a representation of the op that can be `torch.jit.script`ed, importing using `ModuleBuilder`, and then executing (e.g., with `RefBackendLinalgOnTensorsBackend`). This mode includes a fallback to conventional PyTorch if anything in the torch-mlir compilation process fails (e.g., unsupported op).

Currently, all e2e tests pass execpt for two that involve an upstream PyTorch bug (https://github.com/pytorch/pytorch/issues/74400).

High priority next steps:

1. A compile cache in order to speed up reruns of the same NN.
2. Integration with IREE (though not in this repo).
3. Integration with `torch.distributed`.
2022-03-22 14:42:57 -07:00
stephenneuendorffer 614b889dc6
Enable python extensions when building out of tree (#363) 2021-10-27 17:04:12 -07:00
Stella Laurenzo a23d77100b Set some wheel building optimization options.
* Also adds a requirements.txt and updates docs to reference it versus stringy pip install.
* Adds doc with instructions on creating a wheel.

Fixes #370
2021-10-25 18:30:53 +00:00
Sean Silva 4fad753073 Move external/torch-mlir to the root of the repo. 2021-09-27 17:11:08 -07:00
Sean Silva d8f603a4e5 Remove old stuff in prep for move-to-root. 2021-09-27 17:11:08 -07:00
Sean Silva 404bd74ddf Port the bulk of the remaining code to torch-mlir
This leaves no real code outside torch-mlir.

This also renames the "npcomp backend contract" to "linalg on tensors
backend contract" as the name of the abstraction layer that RefBackend
(IREE too) accepts.
2021-09-27 12:48:33 -07:00
Sean Silva a25163fbfa Remove old RefBackend
It is superceded by the new one.
2021-09-22 15:33:28 -07:00
Sean Silva 6d8e7f1bb1 Implement Python relayout from #311
Fixes https://github.com/llvm/mlir-npcomp/issues/311

The key change is that TorchPlugin is folded into
`torch_mlir.dialects.torch.importer.jit_ir` (it imports the PyTorch
JIT's IR, so that's a good, scoped name for it).
The CMake option `-DTORCH_MLIR_ENABLE_JIT_IR_IMPORTER=OFF` disables it,
which allows building without a PyTorch native dependency.
2021-09-21 09:29:40 -07:00
Sean Silva 0eb767ea45 Remove frontends/pytorch directory.
It just contained the e2e testing framework. We now fold it into the
main project to reduce complexity.

- `frontends/pytorch/python/` -> `python/torch_support`
- `frontends/pytorch/e2e_testing -> e2e_testing`
- `frontends/pytorch/examples -> examples`
- `frontends/pytorch/test` -> `python/test`
- `torch_mlir_torchscript` python module -> `npcomp_torchscript`
- `torch_mlir_torchscript_e2e_test_configs` python module ->
  `npcomp_torchscript_e2e_test_configs`

This also changes the license of a handful of files from the
"pytorch-style" license to the regular LLVM/npcomp license. The only
people who committed to those files were myself and Yi.
2021-09-17 09:27:49 -07:00
Sean Silva d94d6800fa Bring CI back to life.
This brings back `check-npcomp-all` and the refbackend e2e tests
coverage.
2021-09-16 12:07:32 -07:00
Sean Silva b6be96d722 [torch-mlir earthmoving (2/N)] Python code movement.
This moves the bulk of the Python code (including the Torch interop)
from `frontends/pytorch` into `torch-mlir/TorchPlugin`. This also
required reconciling a bunch of other Python-related stuff, like the
`torch` dialects.

As I did this, it was simpler to just remove all the old numpy/basicpy
stuff because we were going to delete it anyway and it was faster than
debugging an intermediate state that would only last O(days) anyway.

torch-mlir has two top-level python packages (built into the
`python_packages` directory):

- `torch_mlir_dialects`: `torch` dialect Python bindings (does not
  depend on PyTorch). This also involves building the aggregate CAPI for
  `torch-mlir`.
- `torch_mlir`: bindings to the part of the code that links against
  PyTorch (or C++ code that transitively does).

Additionally, there remain two more Python packages in npcomp (but
outside `torch-mlir`):

- `npcomp_torch`: Contains the e2e test framework and testing configs
  that plug into RefBackend and IREE.
- `npcomp_core`: Contains the low-level interfaces to RefBackend and
  IREE that `npcomp_torch` uses, along with its own
  `MLIR_PYTHON_PACKAGE_PREFIX=npcomp.` aggregation of the core MLIR
  python bindings. (all other functionality has been stripped out)

After all the basicpy/numpy deletions, the `npcomp` C++ code is now very
tiny. It basically just contains RefBackend and the `TorchConversion`
dialect/passes (e.g. `TorchToLinalg.cpp`).

Correspondingly, there are now 4 main testing targets paralleling the
Python layering (which is reflective of the deeper underlying dependency
structure)

- `check-torch-mlir`: checks the `torch-mlir` pure MLIR C++ code.
- `check-torch-mlir-plugin`: checks the code in `TorchPlugin` (e.g.
  TorchScript import)
- `check-frontends-pytorch`: Checks the little code we have in
  `frontends/pytorch` -- mainly things related to the e2e framework
  itself.
- `check-npcomp`: Checks the pure MLIR C++ code inside npcomp.

There is a target `check-npcomp-all` that runs all of them.
The `torch-mlir/build_standalone.sh` script does a standalone build of
`torch-mlir`.

The e2e tests (`tools/torchscript_e2e_test.sh`) are working too.

The update_torch_ods script now lives in
`torch-mlir/build_tools/update_torch_ods.sh` and expects a standalone
build.

This change also required a fix upstream related to cross-shlib Python
dependencies, so we also update llvm-project to
8dca953dd39c0cd8c80decbeb38753f58a4de580 to get
https://reviews.llvm.org/D109776 (no other fixes were needed for the
integrate, thankfully).

This completes most of the large source code changes. Next will be
bringing the CI/packaging/examples back to life.
2021-09-15 13:40:30 -07:00
Sean Silva 28a7738189 [torch-mlir earthmoving (1/N)] C/C++ code movement.
This creates the `external/torch-mlir` directory as an
LLVM_EXTERNAL_PROJECTS-compatible project (analogous to
`iree-dialects`) and completes movement/rename of all pure MLIR C/C++
compiler code into there. The next step will be to move all the Python
code / code that links/includes PyTorch C++ code (which currently lives
in `frontends/pytorch`) into a subdirectory here.

I call this "earthmoving" because it is mostly mechanical changes and
renames. As a quick summary (we can change this down the road easily)
- C++ `mlir::NPCOMP::Torch -> mlir::torch::Torch`
- CAPI `npcompTorchListTypeGet -> torchMlirTorchListTypeGet`
- preprocessor `#ifndef NPCOMP_ -> #ifndef TORCHMLIR_`
- CMake `NPCOMPFoo -> TorchMLIRFoo`

The goal of this is to create a standalone project creating a center of
mass for entry into the MLIR ecosystem from PyTorch, suitable in scope
for eventual inclusion/ownership in PyTorch. The idea is that
`external/torch-mlir` will some day be pulled out into its own
repository, and then npcomp will simply pull it in as a submodule.

Layering-wise, what lives in `torch-mlir` lowers code from PyTorch
(currently TorchScript, but TorchFX or pytorch/xla-style tracing are
possible extensions) down to what we have been calling the "Torch
backend contract" which is cleaned up IR (inlining, simplifcation,
conversion to value tensors, ...) entirely in the `torch` dialect. This
is the branching off point for further lowering, of which npcomp takes
one opinion (outside `torch-mlir` of course!), namely the
`TorchConversion` dialect/transforms which lower to IR suitable for IREE
and other linalg-on-tensors based lower-level compilers.

Summary of changes:
- move `{include,lib,test}/Dialect/Torch` into `torch-mlir`
- move relevant parts of CAPI into `torch-mlir`.
- leave a few things related to the `torch-mlir` Python build commented
  out, which should be resolved in a subsequent change.
2021-09-10 21:44:37 -07:00
Stella Laurenzo 4148f88576 Merge npcomp and mlir python namespaces.
* Now the parts of the MLIR API are directly exported under the npcomp module (i.e. `npcomp.ir`, etc).
* Has required fixes for https://reviews.llvm.org/D108489
* Deletes npcomp.tracing vs fixing it because it was a very early experiment that will not be carried forward.
* This makes the npcomp python distribution completely standalone and separate from an mlir installation.
* Makes most of npcomp itself relocatable for future use as a library.
* Most things are a namespace package now. In the future we can s/torch_mlir/npcomp.frontends.torch/ and have it layer properly.
2021-08-22 21:00:42 -07:00
Sean Silva f168cacd6d Remove TCF and TCP.
These were legacy concepts that are now superceded by direct Torch to
linalg-on-tensors lowering. These were based on some very early thinking
related to the layering of frontends vs codegen, which is now obsolete
because:
- We expected a lot more centralization at the frontend (TCF) level. It
  turns out that frontend needs really vary a lot, and there is no grand
  unifying TCF dialect plausible. The additional layer isn't worth it.
- Linalg-on-tensors obsoletes the primary need for TCP. There are still
  a few things not representable with linalg-on-tensors, but the support
  is growing and the whole "not included in linalg-on-tensors" direction
  needs to be rethought. Our TCP dialect didn't cover any of the
  actually important things in this space (such as sort, FFT, top-k,
  etc.).

See historical [slides](https://drive.google.com/file/d/1iljcpTQ5NPaMfGpoPDFml1XkYxjK_6A4/view) / [recording](https://drive.google.com/file/d/1jSPa8TwPKUt0WuLquGc8OgSUVYJHMvWZ/view)
for more details on the origin story here.

Their presence was confusing users too
[bug](https://github.com/llvm/mlir-npcomp/issues/248).

Also,
- Trim down npcomp-run-mlir testing. It was testing TCF to TCP
  lowering for the most part. The essential stuff is retained and
  rephrased with linalg-on-tensors. (we should probably rename it
  "refback-run" or something, as it is just a way to invoke RefBackend)
- test/Python/Backend/RefJIT/simple_invoke_numpy.py is XFAIL'ed. Our
  "anti-framework" direction seems to be the likely future path.
2021-08-02 12:08:39 -07:00