Commit Graph

47 Commits (97d6d04d41216e73d40b89ffd79620973fc14ce3)

Author SHA1 Message Date
Sean Silva d818043986 Bump llvm-project to d50d7c37a159802c89454a6c53c0ec2e7949d84a
Fixes:
- use `op->(method on Operation)`
- update for MlirIdentifier in signature of mlirNamedAttributeGet
2020-12-14 14:30:51 -08:00
Stella Laurenzo f6d7ee06ef Make torch_mlir compatible with binary PyTorch installations.
* This has been anticipated for a long time in that it is quite hard to keep C++ binary compatibility across a system landscape as diverse as PyTorch, LLVM, and this project. This is why we based the PyTorch extension on the MLIR and NPCOMP C APIs only: that is the only sane linkage story for the entire matrix.
* Removes the few LLVM'isms in torch_mlir that had snuck in, using either STL or PyTorch support utilities. The new rule here is that LLVM C++ includes are forbidden at this level and (as stated in the design), torch_mlir should use the PyTorch runtime and support libraries (not introduce an incidental C++ dependency on LLVM).
* Also deletes mnist-playground as it was proving impossible to keep the grid of PyTorch vs system ABI divisions functioning. I am open to a less drastic course here (optional/disabled by default?)
* This gets us pretty close to just using PyTorch's extension builder API, which will be nice for distribution (i.e. it integrates well with the PyTorch ecosystem for deployment). I ended up just simplifying the in-tree CMake support for now.
* Fixes #138
2020-12-14 09:51:00 -08:00
Sean Silva b2077738ca Bump llvm-project to 444822d77a7fea28aa49edf24533c987efa1b2ee
Fixes:
- renames StandardTypes -> BuiltinTypes
- std.extract_element -> tensor.extract
2020-12-11 14:43:38 -08:00
Phoenix Meadowlark 699bf5df45
Add cos_e2e.py, test_utils and support for tensor inputs (#134) 2020-11-24 19:02:50 -08:00
Stella Laurenzo e2405e3ca8 Add design sketch for aten fallback. 2020-11-24 18:13:35 -08:00
Stella Laurenzo 3937dd14cb Add basicpy.numeric_constant op.
* Going through TODOs on the PyTorch side, this is a big cause of them (not being able to have constants for signed/unsigned).
* Added complex while in here since we're at the phase where it is better to just have things complete than partially done.
2020-11-24 16:44:40 -08:00
Stella Laurenzo b0623b7793 Bump LLVM version to 4f5355ee73626f8b8fe6bf0dd6d167fea7628a2c.
* Incorporates changes around LLVM StringRef.
* Ports fix in upstream pybind11 detection.
* Disables CI hack due to broken pybind detection.
2020-11-24 13:12:04 -08:00
meadowlark@google.com 959c0a79cb Expand pytype coverage for torch_signature_ods_gen.py 2020-11-24 12:42:32 -08:00
Stella Laurenzo f13994fdf7 NFC: Remove TODO about creating an mlirOperationStateDestroy (unnecessary). 2020-11-23 15:01:51 -08:00
Stella Laurenzo 9ffd2556ab Add TorchScript import tests missed in previous change. 2020-11-23 14:43:42 -08:00
Stella Laurenzo 78a3c90758 Add TorchScript graph importer.
* Does not handle all features yet but should conservatively fail on unsupported things.
* Location tracking is still somewhat mismatched between what TorchScript and MLIR do. Likely need a better heuristic for tracking locations from defs for nodes that do not carry location.
* Sets the ground-work for a specialized/generic split but only implements the generic side.
* Had some evidence that this requires a recent bump of PT nightly (within the last month) to pick up pybind11 2.6, which includes some cross-module symbol fixes (vs the previously sync'd version). No source changes, but older versions fail to cast function types at runtime.
2020-11-23 14:20:09 -08:00
Sean Silva 1dfcfa9cd1 Add aten.mm op and "test" it e2e.
Note that unlike aten.matmul which has dynamic behavior
depending on the argument ranks (can do matrix-matrix, matrix-vector,
batch matmul, etc.), aten.mm is just a vanilla matrix
multiply, which can be lowered precisely to tcf.matmul.

The "test" is really just an example that I stared at while getting my
feet wet with this. We probably want something that actually tests this
as part of `ninja check-npcomp`.
2020-11-20 17:21:24 -08:00
harsh-nod 67d6694fdc
Update PYTHON cmake variables to Python3 (#121)
After the recent change of cmake variables
from PYTHON_INCLUDE_DIRS to Python3_INCLUDE_DIRS
and PYTHON_LIBRARIES to Python3_LIBRARIES, there were
a few files that still had references to the old
variables. This patch fixes that.
2020-11-17 16:04:14 -08:00
Stella Laurenzo 6850295ec5 Teach cmake how to find the installed PyTorch.
* In most situations, this eliminates the need to explicitly set a path to the Torch cmake files.
* Also upgrades to new Python3 find package. (should eliminate 2.x mismatches)
* Since PyTorch is located by asking Python where it is, this eliminates a lot of causes of mismatch. (one source of truth)
2020-11-13 17:19:25 -08:00
Stella Laurenzo 47ac80491c Delete old PyTorch 1.3 type dispatch oriented code paths.
* We aren't quite at e2e parity, but we aren't going back and the old path is bit-rotted.
2020-11-12 22:27:05 -08:00
Stella Laurenzo e359167562 Fix dispatch of arange.
* Fixes #107
* I wouldn't say I love what had to be done here. Worth a conversation with the PT devs (probably as part of a rollup of a bunch of this stuff).
2020-11-12 22:07:23 -08:00
Stella Laurenzo b4c7ae1e0c Repurpose numpy-compiler compiler/runtime flow for PyTorch.
* A bit gross because I took the chance to upgrade all of the backend bits to the new MLIR Python bindings and we still co-mingle the old and new for now.
* Since the Python created PassManagers are configured for explicit nesting, I had to upgrade some of the pass pipelines to be explicit.
* The demo in mul_maximum_e2e.py now compiles, runs through PyTorch and through the JIT, prints and asserts the same results.
* I am not claiming that this is the prettiest API in this patch: consider that this is just directly using low-level APIs and there should be an intervening high level API.
2020-11-11 10:38:13 -08:00
Stella Laurenzo e60dc2470e Add aten.maximum op and conversions from aten->tcf.
* Conversions are very simple, suporting mul, maximum and add (alpha=1 only).
* Example added with pass pipeline needed to run.
* Much missing off of the golden path but sufficient for such simple cases.
2020-11-04 17:20:54 -08:00
Stella Laurenzo 6c702b149f Add a number of kernels and new patterns.
* convolution, convolution_backward, _log_softmax, _log_softmax_backward_data, nll_loss_forward, nll_loss_backward, nll_loss2d_forward, nll_loss2d_backward, copy_
* Extends the recognition logic and metadata for handling inplace transformations, optional tensors, ints, lists and dropped args.
* The kernel_calls generated by test_conv_nllloss_grads.py now convert to ATen.
* The result *almost* comes out as a pure tensor program with the exception of the copy_ op, which I will do some followup work to deal with.
* More progress on #97
2020-11-04 14:36:59 -08:00
Harsh Menon c2d3820e48 Fix insertion point bug #102
The current code was inserting all build_list ops
after the last constant op since it was assuming that all
elements being passed in were constants.

This patch replaces that patch with a new function that
inserts the build_list ops before the terminator.

Also modifies test_export_conv2d_fwd.py since its output
no longer matches.

TEST: Added test_export_cat.py which is the code in #102
2020-11-02 16:41:26 -08:00
Stella Laurenzo 0c73c535d6 Capture backward conv and copy_ kernels.
* This is sufficient to capture the forward and backward pass and gradients of a convolutional model with an nllloss.
* As with the forward conv, the backward conv is a special case wrapped in an enigma on the PyTorch side. There aren't many like it, so special casing is just what we do.
* When I traced this, I found that the copy_ op is not yet boxing compatible so I had to map it manually. If there are many more like this, I'll probably do something a bit more clever to reduce duplication.
* This exposes new signature patterns that will need to be handled by the ATen lowering. Will take care of that next: It will be nice to have an e2e of a non-trivial case with full gradients.
* Fixes #97.
2020-10-30 22:59:26 -07:00
Stella Laurenzo 8d98dd4551 Support optional args/returns and other odds and ends.
* None's out Device? args.
* Emits bool tensors if needed.
* Adds some stderr tracing to better see what is going on.
* Test case that exercises NLLLoss.
* This test case emits something for backward calculations but there are some issues still to be worked out, so that part is left out of the test case.
* Progress on #97
2020-10-30 14:50:28 -07:00
Stella Laurenzo c08935a418 Rewrite ATen ODS code generator to be based on new op registry and new signature recognition system.
* Deletes prior code generator from previous attempt (moved some of it into this one).
* Renames old generated tablegen source to "Legacy".
* Generates ODS and import rules for most binary and unary arithmetic ops.
* Removes old generated ops and integration tests that were testing details of the prior setup.
2020-10-28 10:37:37 -07:00
Stella Laurenzo 510f226df2 Expose signature metadata to ops and implement ATenRecognizeKernelsPass pass.
* Two op interfaces, one for querying instance metadata and one for getting static data needed to construct an op from a generic form.
* For torch.generic_kernel ops, metadata is splatted in during capture from Torch (it comes from the op registry, which will work for either device capture or graph import).
* Moved the 'add' out of the generated set so I can experiment on it. It implements the TorchBuildableKernelOpInterface interface which provides its metadata.
* The ATenRecognizeKernelsPass pass generically lowers from a torch.generic_kernel to recognized ops that implement the TorchBuildableKernelOpInterface, handling the various types of transformations that we allow at this stage.
2020-10-26 20:31:45 -07:00
Stella Laurenzo 91fc83d2e7 NFC: Transition ATen passes to tablegen registration. 2020-10-22 17:12:44 -07:00
Stella Laurenzo 9618c2dbf7 NFC: Re-organize ATen directory structure and fix warnings.
* Still some more work to do on the Transforms tree to bring it in line with the others (will do that as I add things).
2020-10-22 14:13:26 -07:00
Stella Laurenzo d09300886a NFC: Use new print with large_elements_limit in tests.
* For tests with large constants, decreases issues with lit pipelines.
* Bumps llvm-project to pick up the update.
2020-10-22 13:04:24 -07:00
Stella Laurenzo 58adb6bd8e Work around various PyTorch issues in support of convolution.
* Enables the conv2d fwd test and ResA (which are both small).
* Deletes resnet18 and vgg, which both run but generate output that crashes FileCheck and lit (or at least makes them take an eternity).
2020-10-21 12:44:31 -07:00
Stella Laurenzo 029815152e Add remaining pieces to capture full example models.
* Adds Basicpy List, Tuple, Dict types and plumbs through C API.
* Started debugging the issues around aten::conv2d capture, but a PyTorch bug is suspected.
* Was able to manually verify that the basic conv2d forward test captures correctly with a workaround.
* Need to resolve some printing issues upstream and move these tests to an integration test target (they take ~seconds to run).
2020-10-19 22:16:59 -07:00
Stella Laurenzo 9e52f6235b More progress on PyTorch acap device capture.
* Now gets far enough to capture batch_norm.
* Has some issues still with in-place ops.
* Can materialize constants.
* Includes an upgrade to PyTorch nightly, which has important bug fixes for fallback and boxed kernel dispatch.
* Fixes #78, #79, #80.
* Will do more testing in a follow-up once further bugs are fixed that facilitate getting at the other features.
2020-10-15 21:43:21 -07:00
Stella Laurenzo abb6fe8aa2 Port prior acap export tests to new dispatcher based versions.
* Sadly, non-trivial ones fail.
* Bugs filed and marked XFAIL.
2020-10-13 16:37:46 -07:00
Stella Laurenzo 30cfc6499f Create public API for torch_mlir python code.
* Adds a trampoline/loader 'torch_mlir' module.
* Plumbs through the MLIR python Context and Module creation, interoping with the MLIR Python API (resolves TODO on creating with own context and accessing the module being built).
* Inter-module Python API interop is still a bit rough but workable via the capsule mechanism. Can be evolved later.
* Exports the frontends/pytorch python sources to the project python/ build directory.
* Requires D89294 to land.
2020-10-13 16:36:49 -07:00
Stella Laurenzo 5c5b8db70f Update test configuration to import mlir from LLVM install location.
* Also adds two lit tests to verify that all of our extensions load without fireworks, which is a good indication that the shared library deps are sane.
* Bumps llvm-project version to use D89167.
2020-10-12 15:25:07 -07:00
Stella Laurenzo af4edb63ae Start reworking towards a shared library build.
* Need to have a dag of shared library deps in order to interop across python extensions (as presented in ODM).
* Introduced add_npcomp_library and friends to mirror the MLIR setup.
* Adds a libNPCOMP.so shared library.
* Redirects tools and extensions to link against libNPCOMP.so (instead of static libs).
* Moves all libraries to lib/, all binaries to bin/ and all python extensions to python/. The invariant is that the rpaths are setup to have a one level directory structure.
* Reworks the _torch_mlir extension to build like the others (still need to come up with a consolidated rule to do this instead of open coded).
* Includes an upstream version bump to pick up needed changes.

Sizes with dynamic linking (stripped, release, asserts enabled):
  libNPCOMP.so: 43M (includes much of the underlying LLVM codegen deps)
  libMLIR.so: 31M
  _npcomp.so: 1.6M (python extension)
  _torch_mlir.so: 670K (python extension)
  npcomp-capi-ir-test: 6.3K
  npcomp-opt: 351K
  npcomp-run-mlir: 461K
  mnist-playground: 530K

Still more can be done to normalize and optimize but this gets us structurally to the starting point.
2020-10-09 16:02:58 -07:00
Stella Laurenzo 3ccc2214a7 Set PyTorch captured function return type.
* Resolves various TODOs that required an LLVM change/bump.
* Bumps LLVM to 4aa217160e5f06a96c6effc4950c3b402374de58
2020-10-07 10:14:34 -07:00
Stella Laurenzo ad3ddb9edb Implement torch.kernel_call capture.
* Had to stop short of modifying the function return signature because of a missing C-API upstream.
* Committing here is good enough for a test and will resolve the various TODOs about upstream APIs next.
2020-10-06 21:54:28 -07:00
Stella Laurenzo e5433e314f Add capture function arguments.
* Adds at::Tensor -> MlirValue tracking.
* Adds conversions for tensor and scalar types to MLIR types.
* Adds npcomp C APIs for constructing custom types.
* Reworks pybind include so as to get Torch pybind helpers (needed to pass at::Tensor type from Python->C++).
2020-10-01 18:59:58 -07:00
Stella Laurenzo ba03ecc652 Add public API for constructing a module/function to capture PyTorch ops.
* Uses the MLIR-C API since that will save us a lot of grief down the road (i.e. will give PyTorch and libMLIR/libNPCOMP the ability to skew version-wise).
* Quite a few TODOs and not yet populating the function in any way.
2020-09-29 14:23:22 -07:00
Stella Laurenzo b5f010284f Add boilerplate to do device capture (pytorch 1.6).
* Uses the new dispatcher API.
* Just prints to the console for the moment when an op is captured.
* Executes the op through the existing implementation.
2020-09-28 10:30:54 -07:00
Stella Laurenzo 0cb28f0b06 Move tests around so we can have dedicated tests for the c10 dispatcher.
* Adds a trivial missing test for _torch_mlir.c10.get_registered_ops()
* Disables the regression tests for now on c10 (until implemented).
2020-09-24 18:28:06 -07:00
Stella Laurenzo 6e6efb2854
Add compatibility notes regarding unpacking quantized weights. (#56)
Co-authored-by: Bryce Arden <arden.bryce@gmail.com>
2020-09-24 17:47:28 -07:00
Stella Laurenzo 0d91885965
Add initial python bindings for c10 dispatcher internals. (#55)
* Exposes the op registry via a get_registered_ops method.
* Moves the aten dialect generation scripts in prep for integrating them with this facility.
2020-09-24 16:26:29 -07:00
Stella Laurenzo 8ac29594df
Explicitly load aten and std dialects when constructing a context. (#47)
* This gets the pytorch frontend broadly working and what is left appears to be legitimate failures in 9 tests.
* Errors noted in #46
2020-09-16 23:06:22 -07:00
Stella Laurenzo 678989a321
Update docker, instructions and some fixes for the pytorch 1.3 build. (#45)
* Includes pybind11 directly (for some reason using the pytorch helper header for this depends on a source file not in the image).
* Installs nnpack into the image.
* Installs new-clang and LLD and configures environment to use it (otherwise, link time is terrible).
* Fixes a gcc compile error (in the off chance you build with default gcc compiler).
* Tests are failing based on some dialect registration stuff that must not have been factored correctly. Will followup with a fix.
2020-09-16 21:57:46 -07:00
Stella Laurenzo de38caa547
Make code that depends on the legacy "type dispatch" mechanism optional. (#32)
* Make code that depends on the legacy "type dispatch" mechanism optional.

* This code is fairly tied to a specific ~1.3 version and uses a legacy dispatch mechanism.
* Moving it and making it optional allows the project to build with PyTorch 1.6 and makes it possible for us to start building out a more modern interface mechanism in parallel.
* Some of the moved code will be brought back into the more modern path, but isolating it now lets this be done incrementally.
* Tests are left failing since the entire frontend is optional and the next step involves reworking the interface mechanism to get them to passing in both regimes.
* Fix a few bogons to get things building
* Add Dockerfile with pytorch

Also, I configure with:
-DCMAKE_PREFIX_PATH="/opt/pytorch/pytorch"

(which is where pytorch is installed in this container)

* Make a dep conditional.

Co-authored-by: stephenneuendorffer <stephen.neuendorffer@xilinx.com>
2020-08-26 12:55:16 -07:00
stephenneuendorffer 31b3041e88
Add pytorch interface to ATen Dialect (#30)
This patch adds a pytorch interface to npcomp.  This interface is modeled
after pytorch_xla and exposes the MLIR-based flow as a virtual device (similar
to a gpu device or the xla backend).  Usage is intended to be something like:

  dev = torch_mlir.mlir_device()
  t0 = torch.randn((4,4), device=dev)
  t1 = torch.randn((4,4), device=dev)
  t2 = t0 + t1
  t2_mlir = torch_mlir.get_mlir( t2 )
  t2_cpu = t2.to('cpu')

In this case t2_cpu would contain the result of the computation, and t2_mlir
contains the mlir description of the computation.  Note that this also
properly returns backward paths synthesized by pytorch.  There are several
parts of this:

1) A tensor type (implemented by tensor.* and tensor_impl.*)
2) The device modeling (aten_mlir_bridge.*, aten_mlir_device.*, aten_mlir_type*)
3) a temporary IR (implemented by ir.cpp)

There is also a reference lowering directly from the ATen dialect to C
function calls consisting of two parts:

1) The driver that uses the IR to generate MLIR, run Passes and compile the
result using mlir::ExecutionEngine (implemented by jit.cpp and
mlir_gen.cpp)
2) A runtime library implemented by lib/aten_ops.cpp.  Most of the operations
are implemented by callbacks into the torch C++ libraries.

Some aspects of this are known to be less than optimal, in particular:
1) There's some function definitions that don't live in the file corresponding
to their declaration.
2) More aspects of this (e.g. the IR) seem like they should be automatically
generated.
3) It's unclear to me how much of the 'IR' is actually necessary, or whether
MLIR could be created on the fly.

Note that this code is licensed in a way similar to pytorch, with the
intention that eventually (when npcomp reaches some maturity) it should be
pushed there.  (see frontends/pytorch/LICENSE)  The code is also structured
much closer to the pytorch coding style than the LLVM coding style.
2020-08-21 11:22:47 -07:00
Stella Laurenzo 77b235f621
Create frontends/pytorch directory. (#31)
* Adds/updates readmes with some notes about code organization and direction.
* Meant to prepare a space for upcoming integration of #30.
2020-08-18 09:43:20 -07:00