torch-mlir

Commit Graph

Author	SHA1	Message	Date
Sean Silva	2efda323ff	Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical to do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops (see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1)), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. 
- Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supersede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supersedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. 
in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.	2021-05-19 13:37:39 -07:00
Sean Silva	55c3cc6624	Add recognition/folder/lowering for aten::__is__, aten::ne.int, and aten::dim Interestingly, TorchScript has its own op (`torch::jit::Operator`) registry separate from the dispatcher (it is a superset of the dispatcher). This is where the "prim" ops and some "aten" ops (that should probably be renamed to "prim") live. In particular, `aten::__is__` is in that latter category of "aten but really prim". This registry is also the source of truth for what the TorchScript interpreter calls into when it executes. The bulk of the "not part of the dispatcher" ops live in `09feb5f579/torch/csrc/jit/runtime/register_prim_ops.cpp (L82)` And the registry itself lives in: `09feb5f579/torch/csrc/jit/runtime/operator.cpp (L196)` This fold further reduces the IR of ResNet by folding away some more not-taken branches. These not-taken branches in ResNet require first-class handling of the list type which we don't yet have on any backend.	2021-04-30 10:57:02 -07:00
Sean Silva	f5dfa02523	Add `aten.mm` to linalg lowering. This is our first op with error semantics, and stresses the system. There are a few design notes of special interest: - RefineTypes.cpp's note about shape inference in the presence of code that dynamically produces an error, and it is provable statically. - ATenToLinalg.cpp's notes about future automation of the ATen->linalg path. - The notes in Passes.td about using low-tech `std.assert` ops instead of `shape.assuming`. Note: Doesn't work on IREE yet due to the `std.assert` op (needs to be lowered to `vm.fail` on the IREE side).	2021-04-16 12:03:31 -07:00
Stella Laurenzo	a7ff87a922	Sever C++ level depend on IREE and rebase on exe and python interface. * IREE doesn't have proper install support, so there is some temporary hokey hacking in our CMakeLists.txt to shuttle some symlinks around. * Reworked the original numpy e2e with IREE test to pipe through iree-translate. * Removed all of the C++-level dependencies. * Will generalize and apply to the PyTorch backend in a followup.	2020-11-16 21:32:56 -08:00
Sean Silva	1c7c362e29	[TCP] Replace tcp.matmul with linalg.matmul. This involved adding a `tcp.splatted` op to splat a dynamically sized init tensor. See rationale in TCPOps.td docs. One interesting observation is that when lowering tcf.matmul to linalg.matmul, we need to both 1) create the error checks and 2) calculate a shape transfer function to create the init tensors. Previously, 2) was deferred to bufferizing tcp.matmul later. I'm not sure if this is a conflation of concerns or not. For now, it's not a big burden.	2020-11-10 18:58:28 -08:00
Sean Silva	0427aacb0b	[TCP] Replace elementwise ops with std elementwise ops.	2020-11-10 18:58:28 -08:00
Stella Laurenzo	e60dc2470e	Add aten.maximum op and conversions from aten->tcf. * Conversions are very simple, supporting mul, maximum and add (alpha=1 only). * Example added with pass pipeline needed to run. * Much missing off of the golden path but sufficient for such simple cases.	2020-11-04 17:20:54 -08:00
Stella Laurenzo	af4edb63ae	Start reworking towards a shared library build. * Need to have a dag of shared library deps in order to interop across python extensions (as presented in ODM). * Introduced add_npcomp_library and friends to mirror the MLIR setup. * Adds a libNPCOMP.so shared library. * Redirects tools and extensions to link against libNPCOMP.so (instead of static libs). * Moves all libraries to lib/, all binaries to bin/ and all python extensions to python/. The invariant is that the rpaths are set up to have a one level directory structure. * Reworks the _torch_mlir extension to build like the others (still need to come up with a consolidated rule to do this instead of open coded). * Includes an upstream version bump to pick up needed changes. Sizes with dynamic linking (stripped, release, asserts enabled): libNPCOMP.so: 43M (includes much of the underlying LLVM codegen deps) libMLIR.so: 31M _npcomp.so: 1.6M (python extension) _torch_mlir.so: 670K (python extension) npcomp-capi-ir-test: 6.3K npcomp-opt: 351K npcomp-run-mlir: 461K mnist-playground: 530K Still more can be done to normalize and optimize but this gets us structurally to the starting point.	2020-10-09 16:02:58 -07:00
Sean Silva	dc8afc9271	[RefE2E] Refactor how tcf.add is lowered. It was previously going through this awkward route that prematurely created linalg.generic ops, which was an annoying layering problem since we can't compute a shape transfer function for linalg.generic in the general case. Now we pass it through the same path as tcp.matmul, with the shape transfer function being defined for tcp.add. This also removed the need for TCPToLinalg (now deleted). The equivalent of that is happening in lower-shaped-results-to-memref. One interesting outcome of this: we're basically using linalg as a "Buffer TCP". We might want to look into using named structured ops for more of TCP, but that would be a big velocity hit since then any change to the ODS / verification for those ops would be a change to the upstream structured op ODS generator. After we have more experience defining this manually, we should re-evaluate rebasing TCP on generated named linalg ops.	2020-09-18 15:03:53 -07:00
Stella Laurenzo	97d83f786a	Bump submodule versions. * llvm-project: b5924a8e27536d19dd5c4d302db29fb6163d5faa * mhlo: 848ca244d20f045b7921da55a98a04d95ef94f0e * Multiple breakages that need to be fixed. Fixes: * Refactor dialect registration * Remove all kindof methods (Casting functionality has been added upstream and is implicitly available, see https://llvm.discourse.group/t/removing-kinds-from-attributes-and-types/1547.) * Update dialect registration to comply with https://reviews.llvm.org/D85495. * Remove type kinds and update some changed dialect signatures. * Upgrade ATen dialect to match upstream needs. * Move dialect registration to tablegen. * Register the ListType in tablegen. * Change dialect initialization signature. * Use TypeSwitch in MlirIr location printer. * Remove global registry depends from npcomp-opt. * Change LowerToLLVM to pass an MLIRContext vs an LLVMDialect for type creation. * Remove dep on MLIREDSCInterface that is removed upstream. * Thread through the DialectRegistry for opt and python-like tools. * Modernize pass registration (This was forced because the GEN_PASS_REGISTRATION code now generates inline functions vs literal pass registration statements) Co-authored-by: Marius Brehler <marius.brehler@iml.fraunhofer.de>	2020-09-08 13:26:42 -07:00
Stella Laurenzo	5ceb37c19b	Add NumpyToTCF conversion. * Just for numpy.add right now.	2020-07-08 21:03:57 -07:00
Stella Laurenzo	b21b5322f6	Basicpy conversion to IREE+std skeleton and first conversions. * Conversions to std for numeric binary expressions, numeric to_boolean, and numeric comparisons. * Added folders to constant ops to comply with requirements of the pass system. * Extended the frontend with parameter/result annotation processing for primitives (can specify types for function arguments). * Added (empty) directory/sources for IREEVM conversions. These are only enabled if IREE is enabled.	2020-06-13 23:45:43 -07:00
Sean Silva	e29aef855b	Initial TCF/TCP E2E seed. Very much WIP. This is enough to get tcf.add down to approximately the "linalg.generic on buffers" level of abstraction. (but there are nuances)	2020-05-08 20:20:41 -07:00
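
The naming rule described in commit 2efda323ff (the `(ns,unqual,overload)` triple becomes `torch.ns.unqual.overload`, with the last dotted part skipped when the overload name is empty, and a generic `torch.operator "..."` form used when no such op is registered) can be illustrated with a small sketch. This is not the importer's actual code: the function names `torch_mlir_op_name` and `import_form` and the `registered` set are hypothetical, and the fallback string omits the operand list and types that the real op carries.

```python
def torch_mlir_op_name(ns: str, unqual: str, overload: str = "") -> str:
    """Build the `torch.ns.unqual[.overload]` name from the operator triple.

    Hypothetical helper for illustration only.
    """
    parts = ["torch", ns, unqual]
    # Skip the last dotted part when the overload name is empty.
    if overload:
        parts.append(overload)
    return ".".join(parts)


def import_form(ns: str, unqual: str, overload: str, registered: set) -> str:
    """Return the registered op name, or the generic `torch.operator` form.

    `registered` is an assumed set of known MLIR op names; the real importer
    consults the dialect's op registry instead.
    """
    name = torch_mlir_op_name(ns, unqual, overload)
    if name in registered:
        return name
    # Fallback: quote the dotted name without the leading `torch.` prefix.
    return 'torch.operator "{}"'.format(name[len("torch."):])


if __name__ == "__main__":
    known = {"torch.aten.ne.int"}
    print(torch_mlir_op_name("aten", "tanh"))
    print(import_form("aten", "ne", "int", known))
    print(import_form("aten", "ne", "Tensor_out", known))
```

Keeping the overload name in the op name is what makes the mapping reversible: splitting the dotted name recovers the original operator triple, which is the property the commit message relies on when it removes the older Torch op interfaces.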

13 Commits (5ad144c4feadcf88aad0f5f5acfa8a19c1a45522)