torch-mlir

Commit Graph

Author	SHA1	Message	Date
Yi Zhang	e0ff5248fb	Add TorchList type and prim::ListConstruct #218	2021-06-10 14:31:35 -07:00
Sean Silva	370e3270ab	Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tensor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. 
Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`	2021-06-10 10:56:48 -07:00
Sean Silva	2efda323ff	Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical to do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops (see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1)), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. 
- Rationale for all being in `torch` namespace: the set of operators is so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supersede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supersedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. 
in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.	2021-05-19 13:37:39 -07:00
Sean Silva	99178a167d	Bump llvm-project to 0524a09cc7e1a0797982feacf505825231efbee7 - renames of OwningRewritePatternList -> RewritePatternSet - also `insert` to `add` - RewritePatternSet holds a context now - memref dialect split from std	2021-03-23 14:29:05 -07:00
Bairen Yi	fead0312f1	Revert "Also fallback autograd dispatch keys for torchvision::nms" This reverts commit `30a42dea32`.	2021-03-16 19:37:45 -07:00
Bairen Yi	30a42dea32	Also fallback autograd dispatch keys for torchvision::nms Signed-off-by: Bairen Yi <yibairen.byron@bytedance.com>	2021-03-15 17:58:08 -07:00
Sean Silva	a36113e586	Fix recent break due to PyTorch changes. Tracing now seems to capture a 4-operand version of aten::add instead of 3-operand. I fixed the tests that made sense. One test was XFAIL'ed, as I don't have in cache the exact way to fix it yet (requires touching aten-recognize-kernels stuff). I'll be context switching to work on the kernel recognition stuff soon, and will fix it then.	2021-03-03 18:35:23 -08:00
Sean Silva	c4e4a11e3f	Add support for prim::GetAttr/SetAttr/CallMethod/If This required some invasive surgery to graph_importer.h/cpp, specifically moving most of it into node_importer.h/cpp and relayering it. The abstraction that it had didn't work well in the recursive setting that happens with prim::If. The key observation is that torch::jit::Graph doesn't really correspond directly to anything on the MLIR side. It's a weird combination of a context, builder, and function and just holds a `torch::jit::Block`. It is `torch::jit::Node` and `torch::jit::Block` which form the recursive structure analogous to MLIR's operation/region/block. So node_importer.h/cpp makes sense as a core building block. As part of doing this, I did venture a bit into the AcapController code, and realize now that there is functionality duplicated there with the ivalue importer. Will refactor that soon.	2021-02-04 17:01:47 -08:00
Stella Laurenzo	78a3c90758	Add TorchScript graph importer. * Does not handle all features yet but should conservatively fail on unsupported things. * Location tracking is still somewhat mismatched between what TorchScript and MLIR do. Likely need a better heuristic for tracking locations from defs for nodes that do not carry location. * Sets the ground-work for a specialized/generic split but only implements the generic side. * Had some evidence that this requires a recent bump of PT nightly (within the last month) to pick up pybind11 2.6, which includes some cross-module symbol fixes (vs the previously sync'd version). No source changes, but older versions fail to cast function types at runtime.	2020-11-23 14:20:09 -08:00
Stella Laurenzo	e359167562	Fix dispatch of arange. * Fixes #107 * I wouldn't say I love what had to be done here. Worth a conversation with the PT devs (probably as part of a rollup of a bunch of this stuff).	2020-11-12 22:07:23 -08:00
Harsh Menon	c2d3820e48	Fix insertion point bug #102 The current code was inserting all build_list ops after the last constant op since it was assuming that all elements being passed in were constants. This patch replaces that patch with a new function that inserts the build_list ops before the terminator. Also modifies test_export_conv2d_fwd.py since its output no longer matches. TEST: Added test_export_cat.py which is the code in #102	2020-11-02 16:41:26 -08:00
Stella Laurenzo	0c73c535d6	Capture backward conv and copy_ kernels. * This is sufficient to capture the forward and backward pass and gradients of a convolutional model with an nllloss. * As with the forward conv, the backward conv is a special case wrapped in an enigma on the PyTorch side. There aren't many like it, so special casing is just what we do. * When I traced this, I found that the copy_ op is not yet boxing compatible so I had to map it manually. If there are many more like this, I'll probably do something a bit more clever to reduce duplication. * This exposes new signature patterns that will need to be handled by the ATen lowering. Will take care of that next: It will be nice to have an e2e of a non-trivial case with full gradients. * Fixes #97.	2020-10-30 22:59:26 -07:00
Stella Laurenzo	8d98dd4551	Support optional args/returns and other odds and ends. * None's out Device? args. * Emits bool tensors if needed. * Adds some stderr tracing to better see what is going on. * Test case that exercises NLLLoss. * This test case emits something for backward calculations but there are some issues still to be worked out, so that part is left out of the test case. * Progress on #97	2020-10-30 14:50:28 -07:00
Stella Laurenzo	510f226df2	Expose signature metadata to ops and implement ATenRecognizeKernelsPass pass. * Two op interfaces, one for querying instance metadata and one for getting static data needed to construct an op from a generic form. * For torch.generic_kernel ops, metadata is splatted in during capture from Torch (it comes from the op registry, which will work for either device capture or graph import). * Moved the 'add' out of the generated set so I can experiment on it. It implements the TorchBuildableKernelOpInterface interface which provides its metadata. * The ATenRecognizeKernelsPass pass generically lowers from a torch.generic_kernel to recognized ops that implement the TorchBuildableKernelOpInterface, handling the various types of transformations that we allow at this stage.	2020-10-26 20:31:45 -07:00
Stella Laurenzo	d09300886a	NFC: Use new print with large_elements_limit in tests. * For tests with large constants, decreases issues with lit pipelines. * Bumps llvm-project to pick up the update.	2020-10-22 13:04:24 -07:00
Stella Laurenzo	58adb6bd8e	Work around various PyTorch issues in support of convolution. * Enables the conv2d fwd test and ResA (which are both small). * Deletes resnet18 and vgg, which both run but generate output that crashes FileCheck and lit (or at least makes them take an eternity).	2020-10-21 12:44:31 -07:00
Stella Laurenzo	029815152e	Add remaining pieces to capture full example models. * Adds Basicpy List, Tuple, Dict types and plumbs through C API. * Started debugging the issues around aten::conv2d capture, but a PyTorch bug is suspected. * Was able to manually verify that the basic conv2d forward test captures correctly with a workaround. * Need to resolve some printing issues upstream and move these tests to an integration test target (they take ~seconds to run).	2020-10-19 22:16:59 -07:00
Stella Laurenzo	9e52f6235b	More progress on PyTorch acap device capture. * Now gets far enough to capture batch_norm. * Has some issues still with in-place ops. * Can materialize constants. * Includes an upgrade to PyTorch nightly, which has important bug fixes for fallback and boxed kernel dispatch. * Fixes #78, #79, #80. * Will do more testing in a follow-up once further bugs are fixed that facilitate getting at the other features.	2020-10-15 21:43:21 -07:00
Stella Laurenzo	abb6fe8aa2	Port prior acap export tests to new dispatcher based versions. * Sadly, non-trivial ones fail. * Bugs filed and marked XFAIL.	2020-10-13 16:37:46 -07:00
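
The type syntax listed in commit 370e3270ab can be illustrated with a small parser. This is a minimal Python sketch written only from the examples in that commit message; it is not the dialect's actual C++ parser. The names `TorchTensorType` and `parse_torch_tensor_type` are hypothetical, and the sketch assumes that `*` marks an unranked shape, `?` an unknown dimension, and `unk` an unknown dtype.

```python
import re
from typing import NamedTuple, Optional, Tuple


class TorchTensorType(NamedTuple):
    """Hypothetical container for the pieces of a torch tensor type string."""
    value_semantic: bool                          # True for !torch.vtensor
    sizes: Optional[Tuple[Optional[int], ...]]    # None = unranked, None entry = `?`
    dtype: Optional[str]                          # None = unknown (`unk` or omitted)


# Matches `!torch.tensor`, `!torch.vtensor`, and the `<shape,dtype>` forms.
_TYPE_RE = re.compile(r"^!torch\.(v?tensor)(?:<(\*|\[[^\]]*\]),(\w+)>)?$")


def parse_torch_tensor_type(text: str) -> TorchTensorType:
    """Parse the type spellings shown in the commit message (assumed grammar)."""
    match = _TYPE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognized torch tensor type: {text!r}")
    kind, shape, dtype = match.groups()

    sizes = None
    if shape is not None and shape != "*":
        inner = shape[1:-1].strip()
        # `?` is an unknown dimension; everything else is a static size.
        sizes = tuple(
            None if dim.strip() == "?" else int(dim)
            for dim in inner.split(",")
        ) if inner else ()

    return TorchTensorType(
        value_semantic=(kind == "vtensor"),
        sizes=sizes,
        dtype=None if dtype in (None, "unk") else dtype,
    )
```

Under these assumptions, the bare form `!torch.tensor` parses to the same least-static-information result as an explicit unranked, unknown-dtype spelling.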

19 Commits (ea1dd1cd906dd98f5404d690158a93faa0d06c15)
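
The operator naming rule described in commit 2efda323ff can be sketched in a few lines. This is an illustrative Python sketch, not the importer's real code (which the commit says lives in `torch_to_mlir_utils.cpp`). The function names and the `registered_ops` set are hypothetical, and dropping an empty overload name in the `torch.operator` fallback string is an assumption; the commit only states that rule for the registered-op form.

```python
from typing import Set, Tuple


def split_operator_name(qualified: str) -> Tuple[str, str, str]:
    """Split `ns::unqual.overload` into the (ns, unqual, overload) triple."""
    ns, sep, rest = qualified.partition("::")
    if not sep or not ns or not rest:
        raise ValueError(f"expected 'ns::name[.overload]', got {qualified!r}")
    unqual, _, overload = rest.partition(".")
    return ns, unqual, overload


def mlir_op_name(ns: str, unqual: str, overload: str = "") -> str:
    """Build `torch.<ns>.<unqual>[.<overload>]`, skipping an empty overload."""
    parts = ["torch", ns, unqual]
    if overload:
        parts.append(overload)
    return ".".join(parts)


def import_form(qualified: str, registered_ops: Set[str]) -> str:
    """Return the registered op name, or the generic `torch.operator` fallback.

    `registered_ops` stands in for the set of ops known to the dialect
    (hypothetical; the real check happens against MLIR's op registry).
    """
    ns, unqual, overload = split_operator_name(qualified)
    name = mlir_op_name(ns, unqual, overload)
    if name in registered_ops:
        return name
    # Assumption: the fallback string also omits an empty overload name.
    generic = ".".join(part for part in (ns, unqual, overload) if part)
    return f'torch.operator "{generic}"'
```

For example, with `{"torch.aten.tanh"}` as the registered set, `import_form("aten::tanh", ...)` yields the registered name, while an unregistered operator such as `torchvision::nms` falls back to the generic `torch.operator` form.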