torch-mlir

Commit Graph

Author	SHA1	Message	Date
mikeurbach	0f6a65a1c5	Enable building using LLVM_EXTERNAL_PROJECTS. (#152) This allows building NPCOMP as an external project of LLVM, similar to how CIRCT can be built: https://github.com/llvm/circt/pull/227. The CMake options to use this build style look like this: ``` -DLLVM_EXTERNAL_PROJECTS=npcomp \ -DLLVM_EXTERNAL_NPCOMP_SOURCE_DIR=/path/to/mlir-npcomp \ ```	2021-01-26 11:43:43 -07:00
Stella Laurenzo	9e52f6235b	More progress on PyTorch acap device capture. * Now gets far enough to capture batch_norm. * Has some issues still with in-place ops. * Can materialize constants. * Includes an upgrade to PyTorch nightly, which has important bug fixes for fallback and boxed kernel dispatch. * Fixes #78, #79, #80. * Will do more testing in a follow-up once further bugs are fixed that facilitate getting at the other features.	2020-10-15 21:43:21 -07:00
Stella Laurenzo	af4edb63ae	Start reworking towards a shared library build. * Need to have a dag of shared library deps in order to interop across python extensions (as presented in ODM). * Introduced add_npcomp_library and friends to mirror the MLIR setup. * Adds a libNPCOMP.so shared library. * Redirects tools and extensions to link against libNPCOMP.so (instead of static libs). * Moves all libraries to lib/, all binaries to bin/ and all python extensions to python/. The invariant is that the rpaths are set up to have a one level directory structure. * Reworks the _torch_mlir extension to build like the others (still need to come up with a consolidated rule to do this instead of open coded). * Includes an upstream version bump to pick up needed changes. Sizes with dynamic linking (stripped, release, asserts enabled): libNPCOMP.so: 43M (includes much of the underlying LLVM codegen deps) libMLIR.so: 31M _npcomp.so: 1.6M (python extension) _torch_mlir.so: 670K (python extension) npcomp-capi-ir-test: 6.3K npcomp-opt: 351K npcomp-run-mlir: 461K mnist-playground: 530K Still more can be done to normalize and optimize but this gets us structurally to the starting point.	2020-10-09 16:02:58 -07:00
Stella Laurenzo	0cb28f0b06	Move tests around so we can have dedicated tests for the c10 dispatcher. * Adds a trivial missing test for _torch_mlir.c10.get_registered_ops() * Disables the regression tests for now on c10 (until implemented).	2020-09-24 18:28:06 -07:00
Stella Laurenzo	de38caa547	Make code that depends on the legacy "type dispatch" mechanism optional. (#32) * Make code that depends on the legacy "type dispatch" mechanism optional. * This code is fairly tied to a specific ~1.3 version and uses a legacy dispatch mechanism. * Moving it and making it optional allows the project to build with PyTorch 1.6 and makes it possible for us to start building out a more modern interface mechanism in parallel. * Some of the moved code will be brought back into the more modern path, but isolating it now lets this be done incrementally. * Tests are left failing since the entire frontend is optional and the next step involves reworking the interface mechanism to get them to passing in both regimes. * Fix a few bogons to get things building * Add Dockerfile with pytorch Also, I configure with: -DCMAKE_PREFIX_PATH="/opt/pytorch/pytorch" (which is where pytorch is installed in this container) * Make a dep conditional. Co-authored-by: stephenneuendorffer <stephen.neuendorffer@xilinx.com>	2020-08-26 12:55:16 -07:00
stephenneuendorffer	31b3041e88	Add pytorch interface to ATen Dialect (#30) This patch adds a pytorch interface to npcomp. This interface is modeled after pytorch_xla and exposes the MLIR-based flow as a virtual device (similar to a gpu device or the xla backend). Usage is intended to be something like: dev = torch_mlir.mlir_device() t0 = torch.randn((4,4), device=dev) t1 = torch.randn((4,4), device=dev) t2 = t0 + t1 t2_mlir = torch_mlir.get_mlir( t2 ) t2_cpu = t2.to('cpu') In this case t2_cpu would contain the result of the computation, and t2_mlir contains the mlir description of the computation. Note that this also properly returns backward paths synthesized by pytorch. There are several parts of this: 1) A tensor type (implemented by tensor.* and tensor_impl.) 2) The device modeling (aten_mlir_bridge., aten_mlir_device., aten_mlir_type) 3) a temporary IR (implemented by ir.cpp) There is also a reference lowering directly from the ATen dialect to C function calls consisting of two parts: 1) The driver that uses the IR to generate MLIR, run Passes and compile the result using mlir::ExecutionEngine (implemented by jit.cpp and mlir_gen.cpp) 2) A runtime library implemented by lib/aten_ops.cpp. Most of the operations are implemented by callbacks into the torch C++ libraries. Some aspects of this are known to be less than optimal, in particular: 1) There's some function definitions that don't live in the file corresponding to their declaration. 2) More aspects of this (e.g. the IR) seem like they should be automatically generated. 3) It's unclear to me how much of the 'IR' is actually necessary, or whether MLIR could be created on the fly. Note that this code is licensed in a way similar to pytorch, with the intention that eventually (when npcomp reaches some maturity) it should be pushed there. (see frontends/pytorch/LICENSE) The code is also structured much closer to the pytorch coding style than the LLVM coding style.	2020-08-21 11:22:47 -07:00

6 Commits (89d4931324589bf75ed088e98888ff21fe7cd41e)