Commit Graph

161 Commits (2e63f4b1e11dfac6b43fbf3fcc87e91a944f7c37)

Author SHA1 Message Date
Sean Silva 370e3270ab Introduce `!torch.tensor` / `!torch.vtensor` types.
This removes our reliance on the numpy dialect and avoids our off-label
use of the builtin tnesor type for modeling unknown dtypes.  The
`!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor.
The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic
tensor. The new types look as follows syntactically:

```
// Least-static-information, non-value-semantic tensor.
!torch.tensor
// Explicit form of least-static-information variant.
!torch.tensor<*,unk>
// Least-static-information, value-semantic tensor.
!torch.vtensor
// Explicit form of least-static-information variant.
!torch.vtensor<*,unk>
// Fixed-set of allowable element types, with first-class support for
// Torch's frontend signedness semantics.
!torch.tensor<*,si32>
// First-class support for unknown dtypes.
!torch.tensor<[?,?,?],unk>
// Standard MLIR representation of `?` for unknown dimensions.
!torch.tensor<[?,2,?,4],unk>
// Statically shaped / dtyped example.
!torch.vtensor<[1,2,3,4],f32>
```

This required fairly significant changes throughout the compiler, but
overall it is a big cleanup. We now have a much clearer layering of "the
Torch frontend lowering" vs "lowering to std + linalg + etc.".

At the C++ level, there is `ValueTensorType`, `NonValueTensorType`.
We also have a helper `BaseTensorType` (kind of like ShapedType) which
interoperates with those two.

Included changes:
- New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for
  creating torch tensor literals in the frontend.
- Consistently use signedness for the types (except i1 which I didn't
  touch -- we need to sort out the situation with !basicpy.BoolType
  there anyway so will be attending to that soon)
- Frontend can annotate whether an argument to the function has value
  semantics. We currently require this, as our backend contract does not
  currently allow us to even model the non-value-semantic case. Before,
  the value-semantic assumption was randomly injected in the middle of
  the pass pipeline.
- Move ArrayToTensor (now called MaximizeValueSemantics) and
  RefinePublicReturn passes to torch dialect.
- The TorchToStd and TorchToLinalg passes are now type conversions from
  `!torch.vtensor` to `tensor` and use the dialect conversion infra.
  The overall conversion pipeline is set up following the best practices
  of the "Type Conversions the Not-So-Hard Way" talk. This required
  introducing `torch-func-builtin-tensorize` and
  `torch-finalizing-builtin-tensorize` passes analogous to the upstream
  bufferization passes with the corresponding names (mostly just
  copypasta from there).
- Misc Torch-level canonicalizations -- we now cleanly layer the
  lowering to std later in the pipeline, so we are gradually lessening
  our reliance on random std constant folding before we get to that
  point.

Recommended review order:
- New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp
- New ops in TorchOps.td / TorchOps.cpp
- Less important / more mechanical stuff
  - Frontend changes.
  - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-06-10 10:56:48 -07:00
Sean Silva b7b7fd4959 Rewrite error reporting of e2e tests.
This now gives [much nicer output](https://gist.github.com/silvasean/f048e0f37b04542dae6469b86802bb3e).
Embarrassingly, we previously couldn't even report failures for two
different tests, and weren't able to report on compilation failures
(besides just crashing).
2021-05-20 11:28:20 -07:00
Sean Silva d66e8fe1f8 Get simple quantized model importing.
This is enough to import the program and get it through the compilation
pipeline. It of course fails at the VerifyBackendContract pass since
there is a lot missing, but the final IR for a simple quantized MLP is
looking pretty decent already:
[IR](https://gist.github.com/silvasean/f76bccd76e9b193d396cfb2f9a11f54d)

Main changes:
- Add support for importing torch quantized tensors, including
  `torch.per_tensor_affine.create` op and `!torch.qint8` element type.
- Add support for importing `LinearPackedParamsBase` (basically a weight
  + optional bias, but requires `torch.linear_params.create` op +
  `!torch.LinearParams` type to model it). This was less painful than I
  expected, as it has the necessary methods to opaquely unpack itself. I
  factored things so it should be easy to extend to other custom classes
  like `ConvPackedParamsBase`.
- Add minimal boilerplate for importing `quantized::*` ops, with
  `quantized::linear` being a motivating example.
- Add e2e test with simple quantized MLP (courtesy of @phoenix-meadowlark).

This is somewhat of an abuse of `!numpy.ndarray` / `tensor`, as
really the proper semantics of `!torch.qint8` dtype on a Torch tensor is
"check the quantizer object of the tensor for side data (scale/offset,
possibly per-channel) that defines the full semantics of the tensor". We
don't have any such notion of "side data" for `!numpy.ndarray` /
`tensor`, let alone anything that would have the associated behavior of
keying off the dtype to determine if the side data is present.
This will be fixed by a proper `!torch.tensor` type.
2021-05-20 11:28:20 -07:00
Sean Silva 0c89296075 Shore up error reporting for TorchScript import.
This code was not exception safe -- it would leave an operation
unattached to anything, which breaks MLIR's C++ data structure
invariants (e.g. it cannot safely erase ops).

Also, print out both the exception and any diagnostics, since they can
both contain useful information.
2021-05-20 11:28:20 -07:00
Sean Silva d50ea8d31e Improve diagnostic handler
It wasn't printing notes or putting the "error:" in front.
2021-05-20 11:28:20 -07:00
Sean Silva 2efda323ff Significantly restructure torch/aten import design.
This is a really major and invasive restructuring of the way we get
torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into
MLIR. Please forgive the challenging review, but due to the sheer
invasiveness, it wasn't really practical do do it in sane smaller
pieces.

This fully replaces everything that was already working on the
TorchScript path (actually, more -- we added tanh support to
TorchToLinalg in order to delete the older code paths). Additionally,
I've kept the lights on for the acap path too, including what little e2e
stuff was working before (for expediency I made a few tiny compromises
along the way that will be easy to undo when we give that path proper
attention).

Overview of the new design:
- The torch operator `somens::someunqualname.someoverloadname` is
  imported as `torch.somens.someunqualname.someoverloadname` (skip the
  last dotted part if the overload name is empty), OR, if we don't have
  such an op registered, it is imported as
  `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`.
  - The addition of the "overload name" is a critical element here, as
    the `(ns,unqual,overload)` triple is unique, which solves a lot of
    problems we were having.
  - This involves having separate MLIR ops for the `trailing_` and
    `.out` variants and all the different overloads. This seemed
    necessary, because the set of overloads is so wild and varied and
    unstructured. The previous design was leaning into some underlying
    structure that just isn't there -- the default situation is
    the "random overload that we want to manage on the MLIR side",
    rather than that being an exception. E.g.  `aten::ne` (not-equal)
    has 21 overloads, only 4 of which are c10 dispatcher ops see
    [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1),
    and the "out" variant is really called `.Tensor_out` instead of
    `.out` as it frequently is for other ops.
  - Rationale for all being in `torch` namespace: the set of operators
    are so varied and unstructured that "dialect per namespace"
    doesn't result in anything resembling the typical MLIR dialect
    boundary expectations. We could maybe draw the boundary at
    dispatcher ops vs non-dispatcher ops, but that doesn't seem to
    really result in very much useful structure at this point in time.
  - Note: within the torch operator registry, we effectively have a
    mini-basicpy subdialect (already type-resolved), which is reasonably
    structured.
  - The existing Torch op interfaces are also removed -- now that we
    track the overload name, we can losslessly find the original
    operator.
- Instead of `ATenRecognizeKernelsPass`, we now have a
  `ReduceOpVariantsPass` that keys off certain traits (and perhaps
  eventually interfaces) to reduce variants of ops to a smaller set,
  ideally operating on immutable tensors and using surrounding ops to
  model the mutability/aliasing aspects.
  - Note: `torch.ns.unqual.overload` ops allow both immutable and
    mutable tensors (unlike the previous hard distinction in the common
    case). This is a premonition for a future change that will introduce a
    bona fide `!torch.tensor` type that will clean up a bunch of stuff.
- `TorchToLinalg` / `TorchToStd` supercede the existing
  "ATen->TCF->TCP->Linalg" path.
- The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`.
  It should look somewhat familiar, but the benefit of hindsight has
  allowed a lot of simplifications.

The overall trend seems to be to make the `torch` dialect a nice layer
independent of anything else. It feels like as a natural result of
various future changes we will be removing the reliance on basicpy+numpy
dialects and have a nice self-contained type system too that properly
models the TorchScript type system (including proper subtyping,
mutable/immutable tensors, optional dtype, etc.).

Recommended review order:
- Start at some of the new import IR, e.g. in
  `frontends/pytorch/test/node_import/prim.py`,
  `frontends/pytorch/test/acap_export/test_export_add3.py`, and other
  tests.
- `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py`
  and associated generated files:
  - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td`
  - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td`
- Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new
  traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h`
- Various code changes in the import path in
  `frontends/pytorch/csrc/builder`. Probably most interesting is the new
  code in `torch_to_mlir_utils.cpp` that has the logic to create the
  `torch.operator` ops or `torch.ns.unqual.overload` ops.

This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe),
just to be able to look at a substantial sample of IR in the new style.
2021-05-19 13:37:39 -07:00
Sean Silva 45ba5fac6c Bump llvm-project to 6d263b6f1c97fe6c45c75443e7daf6cd0c1c4222
Changes:
- representation of arg attributes on functions changed
2021-05-10 18:06:15 -07:00
Sean Silva 3d08c83580 Add flatten op recognition + shape refinement.
This op has complex aliasing semantics, so it is kept mutable for now.

With this, we reduce ResNet18 to a single BB with all aten operators
having rank + dtype:
https://gist.github.com/silvasean/2fcb1c6e4d4ae27461204a43ae9c5031
2021-05-03 09:54:44 -07:00
Sean Silva 122cae2ee3 Add aten::len.t, aten::size, and aten::gt.int primitive ops
Also add some canonicalizations that finally reduce ResNet down to a
single block.
2021-04-30 10:57:02 -07:00
Sean Silva ec6d06aa86 Add some more ResNet ops.
- aten::relu_, aten::max_pool2d, aten::adaptive_avg_pool2d, aten::batch_norm, aten::conv2d

No aten-to-linalg conversion for the latter ones, as they are fairly
substantial. At this point, I'm trying to get shape inference and stuff
working for them and the IR cleaned up.
2021-04-30 10:57:02 -07:00
Sean Silva 9257457d8a Add AllowsTypeRefinement trait and use it to improve RefineTypes
This trait lets us model the semantics of various aten/torch/numpy ops
that are insensitive to type refinements. This replaces
hardcoded/inconsistent checks for this property.

To show usage of this new trait, we fix up some old uses, and improve
RefineTypes to be smarter about rewriting with this trait.
2021-04-30 10:57:02 -07:00
Sean Silva 55c3cc6624 Add recognition/folder/lowering for aten::__is__, aten::ne.int, and aten::dim
Interestingly, TorchScript has its own op (`torch::jit::Operator`)
registry separate from the dispatcher (it is a superset of the
dispatcher).

This is where the "prim" ops and some "aten" ops (that should probably
be renamed to "prim") live. In particular, `aten::__is__` is in that
latter category of "aten but really prim". This registry is also the
source of truth for what the TorchScript interpreter calls into when it
executes.

The bulk of the "not part of the dispatcher" ops live in
09feb5f579/torch/csrc/jit/runtime/register_prim_ops.cpp (L82)

And the registry itself lives in:
09feb5f579/torch/csrc/jit/runtime/operator.cpp (L196)

This fold further reduces the IR of ResNet by folding away some
more not-taken branches. These not-taken branches in ResNet require
first-class handling of the list type which we don't yet have on any
backend.
2021-04-30 10:57:02 -07:00
Sean Silva 7eb36b4ae7 Constant fold through basicpy.bool_cast.
This is the start of a push to getting ResNet running.

This involves throwing in the towel on an O0 pipelinie for now. See note
in the code. We keep an options struct with `optimize` flag, but it
default to true for now.
2021-04-30 10:57:02 -07:00
Sean Silva 179105ca3e Add basic MLP's to the e2e curriculum.
These tests pass on the reference backend.

- Add aten.linear op + shape xfer function + ATen->Linalg lowering.
 - Note: this needs to be more automated, and needs to cover more cases.
 - Current not implemented caveats:
  - size-1 broadcasting for bias vector (either static-size-1 or ? case)
  - higher-rank aten.linear ops (not produced by torch.nn.Linear though)
  - type promotion (still don't even know the exact rules here)
- Add folder for torch.derefine op. Now the inliner can clean it up as
  it inlines. (call boundaries are a main place we need to insert
  torch.derefine) This is brittle -- the other important case is control
  flow which will need to be handled via an extension to
  RefineTypes.cpp (as will more robust call handling). River has an
  in-flight patch to update it to the new dataflow framework so I didn't
  want to do anything intrusive here.
    - Also adjust torch.derefine syntax to use the keyword `to` instead of
      `->`, as most type-only, cast-like ops do.
2021-04-27 12:18:54 -07:00
Sean Silva 3a890aa26c Miscellaneous changes while trying to work on ResNet18
- Move frontend lowering pipelines to c++ (this helps with reproducing
  failures in npcomp-opt)
- Add debugging printouts when compilation fails on RefBackendTestConfig

The experience now when a test fails during MLIR lowering is now like this:
```
NPCOMP TorchScript Object Graph IR -> NPCOMP Backend IR lowering failed with the following diagnostics:
failed to legalize operation 'torch.global_slot'
Module does not conform to npcomp's backend contract. See dialect conversion legality information above.

Error can be reproduced with:
$ npcomp-opt -torchscript-to-npcomp-backend-pipeline /tmp/ResNet18Module.mlir
```

And when TorchScript->MLIR import fails it looks like this:
```
PyTorch TorchScript module -> NPCOMP Object Graph IR import failed with the following diagnostics:
unhandled prim operation: %18 : int = prim::min(%17) # /usr/local/google/home/silvasean/.local/lib/python3.9/site-packages/torch/nn/functional.py:4532:4
```

Also,
- Add `--filter=<regex>` to e2e test harness to filter tests.
- Add a few prim ops that were needed to import ResNet18
- Fix torch.prim.Loop.condition assemblyFormat (it previously would not
  round-trip in the case of no loop-carried variables)
2021-04-27 11:51:11 -07:00
Sean Silva 8f96901943 Add vision models (resnet18 to start).
Also,
- improve error reporting of e2e framework.
2021-04-27 11:51:11 -07:00
Sean Silva 544cb4ef54 Bump llvm-project to 484b6648fdd4b104eaf7a2504dd07b60af2c9f8d
- add_mlir_doc arg order
- fix some dependent dialects on passes that were now causing errors
- "encoding" attribute on mlirRankedTensorTypeGetChecked
2021-04-22 18:12:55 -07:00
Sean Silva b8ad0189ac Add comment about relative import.
This file needs to adopt best practices for how to structure python
"main"'s, but I don't know how to do that yet.
2021-04-20 12:02:34 -07:00
Sean Silva 39d50ccf0d Add end-to-end testing framework for TorchScript.
The E2E tests can be run with
```
npcpy frontends/pytorch/e2e_testing/torchscript/main.py
```

This commit adds a couple items supporting that end, including new sugar
for annotations (no more raw use of ClassAnnotator!).

Recommended review order:

1. `frontends/pytorch/e2e_testing/torchscript/main.py` for
   the harness + `basic.py` in that directory for examples of tests.
2. Annotation sugar in `frontends/pytorch/python/torch_mlir/torchscript/annotations.py`
   and unittest in `frontends/pytorch/test/ivalue_import/annotations/sugar.py`
3. Global test registry / sugar in
   `frontends/pytorch/python/torch_mlir/torchscript/e2e_test/registry.py`
4. `frontends/pytorch/python/torch_mlir/torchscript/e2e_test/framework.py`
   for the meat of the testing framework (start at `run_tests`), and
   looking at the backend configs in
   `frontends/pytorch/python/torch_mlir/torchscript/e2e_test/configs`
   for examples of backends. This is likely the bulk of review time.
5. Unit tests of the framework logic in `frontends/pytorch/test/torchscript_e2e_test`

There's TODO's scattered throughout, but this seems functional enough to
start pulling stuff into and kicking the tires. A few missing pieces:

1. Marking test expected pass/fail per backend.
2. Figuring out how best to fit this into dev workflows.
3. IREE TestConfig.

Also, forgive this Python newbie... Any advice on Python code structure
/ library design would be much appreciated.
2021-04-20 12:00:35 -07:00
Sean Silva c4123d4d4d Add npcomp-verify-backend-contract pass.
This pass verifies that a given module satisfies the contract that we
have for backends. This is phrased as an "allowlist", because we want to
keep this interface tight. Also, this gives much better diagnostics than
a backend randomly crashing or failing to compile would (though they
could still be improved).

This was especially painful because if we had
`tensor<?x!numpy.any_dtype>` slip through, at some point RefBackend
would convert it to a memref type and trip the "verify type invariants"
assertion which gives no location or anything and crashed the process,
which was very unpleasant.

We implement this with the dialect conversion framework, which works
reasonably well and was quick to put together and familiar, but is still
very "op oriented". We probably want to make this hand-rolled
eventually, especially the error reporting (the most useful kind of
error for a dialect conversion user is not necessarily the best for this
use case). Also, in production, these error will go to users, and need
to be surfaced carefully such as "the compiler needs a type annotation
on this function parameter" which in general requires some special
analysis, wordsmithing, and overall awareness of the e2e use case (such
as how much we can lean into certain source locations) to provide a
meaningful user-level diagnostic.

Also, add `inline` to the current frontend lowering pass pipeline to
allow slightly more complicated programs that otherwise would fail on
shape inference.
2021-04-20 12:00:35 -07:00
Sean Silva f5dfa02523 Add `aten.mm` to linalg lowering.
This is our first op with error semantics, and stresses the system.

There are a few design notes of special interest:
- RefineTypes.cpp's note about shape inference in the presence of code
  that dynamically produces and error, and it is provable statically.
- ATenToLinalg.cpp's notes about future automation of the ATen->linalg
  path.
- The notes in Passes.td about using low-tech `std.assert` ops instead
  of `shape.assuming`.

Note: Doesn't work on IREE yet due to the `std.assert` op (needs to be
lowered to `vm.fail` on the IREE side).
2021-04-16 12:03:31 -07:00
Sean Silva 28a0f02746 Add support for compiling through IREE.
Recommended review order:
- Changes in frontends/pytorch/examples/
- Changes in python/npcomp/compiler/pytorch/backend/
- Boilerplate for the `npcomp-iree-backend-lower-linkage` pass.

This change separates out a
`npcomp.compiler.pytorch.backend.frontend_lowering` module that does the
common lowering for all backends. The individual compiler backends
`npcomp.compiler.pytorch.backend.{refjit,iree}` now accept a loosely
defined "TCP + scalar code" IR mix that will be formalized in the
future as the interface to codegen backends.

This also required adding a small pass
`npcomp-iree-backend-lower-linkage` which adds `iree.module.export` onto
functions, and layering that into the frontend flow. The pass doesn't
require a C++-level dependency on IREE, which is nice for now. TBD how
we are going to handle lists (we hope we can get away with sneakerneting
some td files and relying on loose IR compatibility).

Running through IREE requires the ability to import `iree.compiler` and
`iree.runtime`, which can be obtained as follows:
```
python3 -m pip install iree-compiler-snapshot iree-runtime-snapshot -f https://github.com/google/iree/releases/tag/snapshot-20210406.200
PYTHONPATH="${PYTHONPATH}:${MY_IREE_BUILD}/bindings/python/"
```

This patch makes it painfully clear that we don't have any e2e testing
harness to really plug into, and also don't have a usable Python API to
our compiler stack (something usable in a jupyter notebook).
That will be addressed in subsequent commits. We've been flying by the
seat of our pants with this `examples` directory that isn't subject to
any kind of testing or real usability concerns.
2021-04-09 13:15:07 -07:00
Sean Silva 2ab62aec12 MILESTONE: TorchScript unary tanh runs on RefBackend
This revamps the TORCH_TO_TCF_PASSES to reflect the new layering that we
are doing in the compiler. See comments there for the layering.

Also adds `frontends/pytorch/examples/torchscript_tanh_e2e.py` as an
"example". E2E testing story TBD (want to get IREE working first).
2021-04-07 11:06:34 -07:00
Sean Silva c3f1f8ebf4 [cleanup] Put the root class type for exportPath first.
This is more consistent and intuitive -- usually the object being
"indexed" or used as a "context" for a later parameter goes first.
2021-04-01 18:40:03 -07:00
Sean Silva e749074bae Basic infra for annotate shapes and dtypes on arguments.
These allow users to annotate a known "type bound" on the argument,
which can seed shape/dtype inference. We don't rewrite the function
types as part of the import process (it will happen in a
yet-to-be-written pass) because:

1. We would need to interprocedurally rewrite all calls to keep the IR
   consistent. Currently, we have a place after GlobalizeObjectGraph but
   before we convert to tensors where this is convenient to do. Ideally,
   we would do this on the object graph representation.

1. We don't necessarily know that adjusting the function type is a legal
   calling convention change. The pass will have blessed knowledge (by
   the pass pipeline author) that adjusting the argument type based on
   the type bound is safe (which it frequently is).

2. Note that in principle, a type bound could be a fairly general thing
   (such as maximum sizes of dimensions, unions of multiple concrete
   types, etc.). The pass will in principle have logic to interpret the
   type bounds and to determine a suitable "best" (and legal) argument
   type.
2021-04-01 18:40:03 -07:00
Sean Silva b0ac04001d Update README. 2021-03-30 11:33:33 -07:00
Sean Silva 99178a167d Bump llvm-project to 0524a09cc7e1a0797982feacf505825231efbee7
- renames of OwningRewritePatternList -> RewritePatternSet
  - also `insert` to `add`
- RewritePatternSet holds a context now
- memref dialect split from std
2021-03-23 14:29:05 -07:00
Sean Silva 703428eff4 Add support for "trailing_" and "out" variants of various ops.
We already had the `promoteTrailingOutTensor` flag, but weren't using
it. A inplaceVariantKernelName flag needed to be added.

This change is a little dissatisfying, as the conversions done by the
RecognizeKernelsPass are currently non-orthogonal. In particular,
`kDropResultAndAliasArg0` probably won't work as intended if mixed with
these (we probably need to promote kDropResultAndAliasArg0 to not be an
arg-level thing anyway, as we have done with promoteTrailingOutTensor).

This involved adding a new op `numpy.overwrite_array`.

```
numpy.overwrite_array %arg2 overwrites %arg0 : tensor<2x3xf32>, !numpy.ndarray<[2,3]:f32>
```

This models the destructive update behavior. Note that in the above op,
we cannot simply RAUW %arg0 with a suitably conveted %arg2 (for example,
%arg0 might have uses that are not dominated by %arg2, or might have an
alias relation with some other array in the program). In general, we
need a pass analogous to "SSA-formation" which knows how to see through
these to uncover an underlying tensor program.

Also, add tanh_out_e2e.py/div_inplace_e2e.py and fix some bitrot in
refjit.py which is my running example I'm trying to get working.
2021-03-19 10:34:50 -07:00
Sean Silva a53ed850bd Fix signature of unboxed aten::arange for torch HEAD 2021-03-18 17:53:52 -07:00
Bairen Yi fead0312f1 Revert "Also fallback autograd dispatch keys for torchvision::nms"
This reverts commit 30a42dea32.
2021-03-16 19:37:45 -07:00
Sean Silva ba482cbb72 Generate Conv2d definition.
We should generally be using torch_signature_ods_gen.py for generating
these. Somehow this one slipped through manually.

There is no `aten::conv2d_overridable` in the op registry AFAICT so I
removed that alias.
2021-03-16 12:39:28 -07:00
Sean Silva c607efa205 Make ATenOpRegistrations.txt dump more readable.
Also add `is_write` field.
2021-03-16 12:39:28 -07:00
Bairen Yi 30a42dea32 Also fallback autograd dispatch keys for torchvision::nms
Signed-off-by: Bairen Yi <yibairen.byron@bytedance.com>
2021-03-15 17:58:08 -07:00
Sean Silva 2750d2084c Add prim::device and handle derefining for prim::CallMethod 2021-03-11 14:10:09 -08:00
Sean Silva 572d198b68 Refactor prim node imports. 2021-03-11 14:10:09 -08:00
Sean Silva 01b8a01e1b prim::dtype op 2021-03-11 14:10:09 -08:00
Bairen Yi 53b01cb9ba Bump llvm-project to e31c77b1827fa4dd3511f21af11cfab18ecf6d38
Signed-off-by: Bairen Yi <yibairen.byron@bytedance.com>
2021-03-10 11:01:16 -08:00
Bryce Arden b94a859e03
[torch] Add import support for IValue string Type(s) (#179)
* [torch] Add import support for IValue string Type(s)

* [test] Add test for Strings import
2021-03-04 13:08:50 -08:00
Sean Silva a36113e586 Fix recent break due to PyTorch changes.
Tracing seems now now capture a 4-operand version of aten::add instead
of 3-operand.

I fixed the tests that made sense. One test was XFAIL'ed, as I don't
have in cache the exact way to fix it yet (requires touching
aten-recogniz-kernels stuff).  I'll be context switching to work on the
kernel recognition stuff soon, and will fix it then.
2021-03-03 18:35:23 -08:00
Sean Silva 43dba03afd Properly model "derefinement".
In terms of IR structure, TorchScript allows types to vary in many
circumstances where MLIR requires pointer-identical types. In particular,
it is valid to pass any subtype in place of a type. For example, if an
`Optional[int]` is required somewhere in the IR, it is legal to pass a
value of just `int` (but not the other way around; see
`torch.prim.unchecked_cast`). In effect, every *use* can have a different
type.

We introduce a new op `torch.derefine` that models that impedance
mismatch. This op allows casting a value from one type to a type that it
is a subtype of to model this behavior.

Recommended review order:
- TorchOps.td for new torch.derefine (and updated docs for
  `torch.prim.unchecked_cast`)
- new test code in if.py, loop.py, function-derefine.py
- new code in node_importer.cpp for handling derefinement insertion
- function_importer.cpp and utils changes in torch_to_mlir_utils.cpp

Properly handling derefinement on function boundaries required
relayering the code so that graph_importer.cpp/.h is now
function_importer.cpp/.h because only the `torch::jit::Function`
(actually the `c10::FunctionSchema` it holds) knows the derefined types that are
actually needed at the boundary (see `function-derefine.py` for a test).

Annoyingly, this churns all the functions which are now prefixed with
`__torch__.` but that is more correct anyway (that is their linkage name
in the `torch::jit::CompilationUnit`; the previous `mb.import_function`
was actually buggy in the case of functions calling each other as it
would reference their unqualified name).

With this change, we can import `resnet18` from `torchvision` :)
IR: https://gist.github.com/silvasean/6426a5272d8a6c7caae533fce05ab704
2021-03-03 15:09:44 -08:00
Bryce Arden 1736ff0253 [prim] Add TupleIndex support
I could not find a corresponding ListIndex in prim, which seems to
translate to a __get_attr__ under the hood. I think the reason a tuple
Index op can exist is because Tuple's are supposed to be frozen, where
List operands can be mutable.
2021-03-02 17:28:32 -08:00
Bryce Arden 68338eafb7 [chore] Make variable names in prim.py more clear 2021-03-02 17:28:32 -08:00
Bryce Arden ca3a02da28 [prim] Add support for List|TupleUnpack 2021-03-02 17:28:32 -08:00
Sean Silva df4c5764da Add support for `prim::unchecked_cast`.
This arises when casting optionals, which happens a lot especially
around handling of default arguments (python `if arg is None` idiom).

In this case, the offending code for the model is in max_pool2d:
[code link](b3bf08e67f/torch/nn/functional.py (L657))
2021-03-02 16:01:34 -08:00
Sean Silva 939d36906f Add support for prim::Loop op.
This is a funny one. It combines a `for` and `while` loop in one op. We
will need to write some conversions to `scf`.
2021-03-02 16:01:34 -08:00
Sean Silva 7dfd6f697e Add support for prim::RaiseException.
Used by resnet18.

It seems to originate from a helper `_verify_batch_size`:
[code link](b3bf08e67f/torch/nn/functional.py (L2099)).

I couldn't find a way to test `prim::RaiseException` without also having
`prim::Uninitialized`.
2021-03-02 16:01:34 -08:00
Sean Silva c837dbb077 Properly import the entire torch::jit::CompilationUnit
This primarily unlocks proper handling of free functions (that is,
functions that are not methods of any torch.nn.Module).

Recommended review order:
- `ivalue_importer.cpp` + `ivalue_import/functions*.py`
- `GlobalizeObjectGraph.cpp` + test case
- misc other stuff

The `torch::jit::CompilationUnit` is basically a backing store or
"context" holding all the possible functions in the program. The
previous code was not explicitly accessing this data structure, since it
just imported the `torch::jit::Function`'s that it saw attached to
methods.

Subtly, any time a TorchScript module called into a free function, the
free function gets incorporated into the torch::jit::CompilationUnit,
but doesn't show up anywhere when dumping the module, except in the
curious pattern:

```
%5 : Function = prim::Constant[name="adaptive_avg_pool2d"]()
%6 : Tensor = prim::CallFunction(%5, %input.1, %4)
```

That is, calls are indirect calls, and are accessed via `prim::Constant`
materializing a function object. Even stranger, the `name` attribute here
doesn't really even tell the full story -- it doesn't correspond to
anything. It turns out that the c10::FunctionType itself actually holds
a pointer to the `torch::jit::Function` in the compilation unit
directly (so there is actually no indirection in prim::CallMethod,
because any two values of the same FunctionType call the same
function!). E.g. when converting the IR to bytecode, the "name" is
ignored [code link](1d6bd15790/torch/csrc/jit/runtime/interpreter.cpp (L937)).
We do import `prim::CallFunction` as a `std.call_indirect` though
because it's more braindead to do it that way (it gets canonicalized to
a direct call easily).
2021-03-01 12:08:01 -08:00
Sean Silva 59a3f46795 Add support for prim.NumToTensor
With this, we can import BERT!
```
pt_util ~/tmp/bert.pt  --import --exported-name=forward \
| npcomp-opt -torch-globalize-object-graph -inline -symbol-dce
```
https://gist.github.com/silvasean/fe7735ff5d065cc9216f7b0346d0e977

The test case here is a bit unconventional -- it isn't actually valid
Python. To figure out how to generate it I had to go search the PyTorch
codebase for "NumToTensor" and work backward. In this case I found
this
[code](649760e5f1/torch/csrc/jit/frontend/ir_emitter.cpp (L464))
which via a wild guess I was able to turn into a test case.

In this case it didn't take me too long, but when doing this kind of
"add a bunch of trivial stuff to bring up a real model", I'm starting to
think that we might skimp on test cases when it's fairly trivial and not
obvious how to test with a small test.
2021-02-26 10:16:56 -08:00
Sean Silva 7b6fa27838 Rename tests to match the code they test
- `module_import -> ivalue_import`, as it mainly tests ivalue_importer.cpp
- `graph_import -> node_import`, as it mainly tests node_importer.cpp
 - graph_importer.cpp does call into node_importer.cpp, but doesn't do
 much.

This was getting pretty confusing. Also add README.md's in each
directory for more clarity.
2021-02-25 13:31:33 -08:00
Bryce Arden 27a4515de2
Add Conv2D Torchscript Import Support (#167)
Adds support for lowering a torch.nn.Conv2d module to the Torch Dialect through TorchScript import.
Generated IR can be viewed here:
https://gist.github.com/brycearden/6c0f790115c4577249372ef82768e6fd

Required implementing support for tuple in the ivalue importer and list in the node importer.
2021-02-25 12:14:00 -08:00
Sean Silva a375ccf9da Add ability to annotate TorchScript classes.
The first use case is to annotate certain program constructs as either
exported or private. In this commit we plumb it down to
GlobalizeObjectGraph which makes use of this information.

Recommended review order:
1. class_annotator.h/.cpp + `test/module_import/annotations/*`
    - New abstractions to communicate with Python code and annotate.
2. IR changes in TorchOps.td
    - Adding "private" attribute to various things.
3. ivalue_import.cpp changes
    - Module + ClassAnnotator = annotated IR
4. GlobalizeObjectGraph.cpp + tests
    - use new "private" attributes to create "private" IR.
    - also, tweak some of the op deleting mechanics, which was triggering
      some memory errors / assertions

With this, we can run the classifier through and inline it as follows:
```
frontends/pytorch/utils/pt_util.py --import --exported-name forward ~/tmp/classifier.pt \
| npcomp-opt -torch-globalize-object-graph -inline
```
IR: https://gist.github.com/silvasean/32dcad9f6270557f412094a77cecdd69
2021-02-25 11:28:34 -08:00
Sean Silva 8486968925 Add trivial inliner interfaces.
With this + manually setting private visibility on everything, a simple
classifier can be reduced to this IR, which is looking pretty lean and
mean:
https://gist.github.com/silvasean/19e7e2e21a61ff197aeac0dd864d188f

Also, include a utility script for importing `.pt` models.

```
pt_util.py --import classifier.pt | npcomp-opt -torch-globalize-object-graph
```
2021-02-22 10:40:38 -08:00
Sean Silva 1b769f7841 Extend GlobalizeObjectGraph to handle torch.prim.GetAttr returning NnModuleType
This happens in practice. With this, we can globalize slots for the
non-trivial classifier layer obtained from
https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb

This also adds support for tuple return types, which were needed by that
model.
2021-02-19 10:23:25 -08:00
Sean Silva 158c5c484d Implement GlobalizeObjectGraph transformation.
This required restructuring of how we model TorchScript on import. The
main difference is that now we split out a `torch.class_type` that holds
methods and declarations of the types of each slot. This is more
consistent with TorchScript (our previous representation was
"denormalized").

Recommended reading order:
1. check out the description of `torch.class_type` in `TorchOps.td` and
   look at `test/Dialect/Torch/ops.mlir` and
   `frontends/pytorch/test/module_import/` to familiarize with the new
   representation.
   - Just look at the new IR. The diff between the old names and new
     names is confusing.
2. check out `test/Dialect/Torch/globalize-object-graph*.mlir`
   and read along with the pass description in
   `include/npcomp/Dialect/Torch/Transforms/Passes.td`
3. Read the code in `GlobalizeObjectGraph.cpp` and miscellaneous changes
   in `ivalue_importer.cpp`, `TorchOps.cpp`, etc.
2021-02-18 18:18:47 -08:00
Bairen Yi 99d1db18d2 Add NoneType support for ivalue_importer
PyTorch added a Global variable `_is_full_backward_hook` recently.

See https://github.com/pytorch/pytorch/pull/46163

Signed-off-by: Bairen Yi <yibairen.byron@bytedance.com>
2021-02-18 11:20:29 -08:00
Stanley Winata a38b7b72b2 adapt acap_dispatch to latest pytorch nightly ("1.9.0.dev20210215+cpu")
Modify ACAP_Dispatch to work with latest pytorch
-Remove boxed from convolution's m.impl
-Use redispatch and constrainted keyset to replace deprecated
callwithdispatchkey
2021-02-18 11:13:29 -08:00
Sean Silva 498979ad28 Add MLIR diagnostic handler that prints to `sys.stderr`.
This is needed so that output shows up properly in a Jupyter notebook.
2021-02-17 18:50:05 -08:00
Sean Silva 572163dfde Handle object identity correctly.
This required some careful considerations when defining object identity
for tensors. See the comments for how we do it.

This also tracks some basic information for diagnostics.
2021-02-10 15:15:56 -08:00
Sean Silva 7f7bf39551 Add prim::Print and fix prim::CallMethod
For now, we are treating strings as bytes.
2021-02-10 15:15:56 -08:00
Sean Silva c4e4a11e3f Add support for prim::GetAttr/SetAttr/CallMethod/If
This required some invasive surgery to graph_importer.h/cpp,
specifically moving most of it into node_importer.h/cpp and relayering
it. The abstraction that it had didn't work well in the recursive
setting that happens with prim::If.

The key observation is that torch::jit::Graph doesn't really correspond
directly to anything on the MLIR side. It's a weird combination of a
context, builder, and function and just holds a `torch::jit::Block`. It
is `torch::jit::Node` and `torch::jit::Block` which form the recursive
structure analogous to MLIR's operation/region/block. So
node_importer.h/cpp makes sense as a core building block.

As part of doing this, I did venture a bit into the AcapController code,
and realize now that there is functionality duplicated there with the
ivalue importer. Will refactor that soon.
2021-02-04 17:01:47 -08:00
Sean Silva 99b845411d Rename some tests for consistency 2021-02-01 17:01:18 -08:00
Sean Silva 689b40c7a6 Add initial TorchScript module importer
It turns out that this was easiest to structure as a general IValue
importer, since torch module are just one of the possible IValue's.

We import the IValue object graph in a braindead fashion into basicpy
ops and a new `torch.nn_module` op that is used to model the
attributes/methods of a torch::jit::Module IValue. See `Torch/ops.mlir`
for an example, and also check out the .py import tests in
`frontends/pytorch/test/module_import`.

As part of this change, a few housekeeping tasks:
- extract some helpers from graph_importer.cpp
- more helpers around the C API
- misc touchups
2021-01-28 11:55:17 -08:00
mikeurbach 0f6a65a1c5
Enable building using LLVM_EXTERNAL_PROJECTS. (#152)
This allows building NPCOMP as an external project of LLVM, similar to
how CIRCT can be built: https://github.com/llvm/circt/pull/227.

The CMake options to use this build style look like this:

```
  -DLLVM_EXTERNAL_PROJECTS=npcomp \
  -DLLVM_EXTERNAL_NPCOMP_SOURCE_DIR=/path/to/mlir-npcomp \
```
2021-01-26 11:43:43 -07:00
Sean Silva b92d724179 NFC: Rename "graph_export" to "graph_import"
These mainly exercise the `module_builder.import_function` function, so
it makes sense for the directory to be called "graph import".
2021-01-21 12:17:21 -08:00
Sean Silva d818043986 Bump llvm-project to d50d7c37a159802c89454a6c53c0ec2e7949d84a
Fixes:
- use `op->(method on Operation)`
- update for MlirIdentifier in signature of mlirNamedAttributeGet
2020-12-14 14:30:51 -08:00
Stella Laurenzo f6d7ee06ef Make torch_mlir compatible with binary PyTorch installations.
* This has been anticipated for a long time in that it is quite hard to keep C++ binary compatibility across a system landscape as diverse as PyTorch, LLVM, and this project. This is why we based the PyTorch extension on the MLIR and NPCOMP C APIs only: that is the only sane linkage story for the entire matrix.
* Removes the few LLVM'isms in torch_mlir that had snuck in, using either STL or PyTorch support utilities. The new rule here is that LLVM C++ includes are forbidden at this level and (as stated in the design), torch_mlir should use the PyTorch runtime and support libraries (not introduce an incidental C++ dependency on LLVM).
* Also deletes mnist-playground as it was proving impossible to keep the grid of PyTorch vs system ABI divisions functioning. I am open to a less drastic course here (optional/disabled by default?)
* This gets us pretty close to just using PyTorch's extension builder API, which will be nice for distribution (i.e. it integrates well with the PyTorch ecosystem for deployment). I ended up just simplifying the in-tree CMake support for now.
* Fixes #138
2020-12-14 09:51:00 -08:00
Sean Silva b2077738ca Bump llvm-project to 444822d77a7fea28aa49edf24533c987efa1b2ee
Fixes:
- renames StandardTypes -> BuiltinTypes
- std.extract_element -> tensor.extract
2020-12-11 14:43:38 -08:00
Phoenix Meadowlark 699bf5df45
Add cos_e2e.py, test_utils and support for tensor inputs (#134) 2020-11-24 19:02:50 -08:00
Stella Laurenzo e2405e3ca8 Add design sketch for aten fallback. 2020-11-24 18:13:35 -08:00
Stella Laurenzo 3937dd14cb Add basicpy.numeric_constant op.
* Going through TODOs on the PyTorch side, this is a big cause of them (not being able to have constants for signed/unsigned).
* Added complex while in here since we're at the phase where it is better to just have things complete than partially done.
2020-11-24 16:44:40 -08:00
Stella Laurenzo b0623b7793 Bump LLVM version to 4f5355ee73626f8b8fe6bf0dd6d167fea7628a2c.
* Incorporates changes around LLVM StringRef.
* Ports fix in upstream pybind11 detection.
* Disables CI hack due to broken pybind detection.
2020-11-24 13:12:04 -08:00
meadowlark@google.com 959c0a79cb Expand pytype coverage for torch_signature_ods_gen.py 2020-11-24 12:42:32 -08:00
Stella Laurenzo f13994fdf7 NFC: Remove TODO about creating an mlirOperationStateDestroy (unnecessary). 2020-11-23 15:01:51 -08:00
Stella Laurenzo 9ffd2556ab Add TorchScript import tests missed in previous change. 2020-11-23 14:43:42 -08:00
Stella Laurenzo 78a3c90758 Add TorchScript graph importer.
* Does not handle all features yet but should conservatively fail on unsupported things.
* Location tracking is still somewhat mismatched between what TorchScript and MLIR do. Likely need a better heuristic for tracking locations from defs for nodes that do not carry location.
* Sets the ground-work for a specialized/generic split but only implements the generic side.
* Had some evidence that this requires a recent bump of PT nightly (within the last month) to pick up pybind11 2.6, which includes some cross-module symbol fixes (vs the previously sync'd version). No source changes, but older versions fail to cast function types at runtime.
2020-11-23 14:20:09 -08:00
Sean Silva 1dfcfa9cd1 Add aten.mm op and "test" it e2e.
Note that unlike aten.matmul which has dynamic behavior
depending on the argument ranks (can do matrix-matrix, matrix-vector,
batch matmul, etc.), aten.mm is just a vanilla matrix
multiply, which can be lowered precisely to tcf.matmul.

The "test" is really just an example that I stared at while getting my
feet wet with this. We probably want something that actually tests this
as part of `ninja check-npcomp`.
2020-11-20 17:21:24 -08:00
harsh-nod 67d6694fdc
Update PYTHON cmake variables to Python3 (#121)
After the recent change of cmake variables
from PYTHON_INCLUDE_DIRS to Python3_INCLUDE_DIRS
and PYTHON_LIBRARIES to Python3_LIBRARIES, there were
a few files that still had references to the old
variables. This patch fixes that.
2020-11-17 16:04:14 -08:00
Stella Laurenzo 6850295ec5 Teach cmake how to find the installed PyTorch.
* In most situations, this eliminates the need to explicitly set a path to the Torch cmake files.
* Also upgrades to new Python3 find package. (should eliminate 2.x mismatches)
* Since PyTorch is located by asking Python where it is, this eliminates a lot of causes of mismatch. (one source of truth)
2020-11-13 17:19:25 -08:00
Stella Laurenzo 47ac80491c Delete old PyTorch 1.3 type dispatch oriented code paths.
* We aren't quite at e2e parity, but we aren't going back and the old path is bit-rotted.
2020-11-12 22:27:05 -08:00
Stella Laurenzo e359167562 Fix dispatch of arange.
* Fixes #107
* I wouldn't say I love what had to be done here. Worth a conversation with the PT devs (probably as part of a rollup of a bunch of this stuff).
2020-11-12 22:07:23 -08:00
Stella Laurenzo b4c7ae1e0c Repurpose numpy-compiler compiler/runtime flow for PyTorch.
* A bit gross because I took the chance to upgrade all of the backend bits to the new MLIR Python bindings and we still co-mingle the old and new for now.
* Since the Python created PassManagers are configured for explicit nesting, I had to upgrade some of the pass pipelines to be explicit.
* The demo in mul_maximum_e2e.py now compiles, runs through PyTorch and through the JIT, prints and asserts the same results.
* I am not claiming that this is the prettiest API in this patch: consider that this is just directly using low-level APIs and there should be an intervening high level API.
2020-11-11 10:38:13 -08:00
Stella Laurenzo e60dc2470e Add aten.maximum op and conversions from aten->tcf.
* Conversions are very simple, suporting mul, maximum and add (alpha=1 only).
* Example added with pass pipeline needed to run.
* Much missing off of the golden path but sufficient for such simple cases.
2020-11-04 17:20:54 -08:00
Stella Laurenzo 6c702b149f Add a number of kernels and new patterns.
* convolution, convolution_backward, _log_softmax, _log_softmax_backward_data, nll_loss_forward, nll_loss_backward, nll_loss2d_forward, nll_loss2d_backward, copy_
* Extends the recognition logic and metadata for handling inplace transformations, optional tensors, ints, lists and dropped args.
* The kernel_calls generated by test_conv_nllloss_grads.py now convert to ATen.
* The result *almost* comes out as a pure tensor program with the exception of the copy_ op, which I will do some followup work to deal with.
* More progress on #97
2020-11-04 14:36:59 -08:00
Harsh Menon c2d3820e48 Fix insertion point bug #102
The current code was inserting all build_list ops
after the last constant op since it was assuming that all
elements being passed in were constants.

This patch replaces that patch with a new function that
inserts the build_list ops before the terminator.

Also modifies test_export_conv2d_fwd.py since its output
no longer matches.

TEST: Added test_export_cat.py which is the code in #102
2020-11-02 16:41:26 -08:00
Stella Laurenzo 0c73c535d6 Capture backward conv and copy_ kernels.
* This is sufficient to capture the forward and backward pass and gradients of a convolutional model with an nllloss.
* As with the forward conv, the backward conv is a special case wrapped in an enigma on the PyTorch side. There aren't many like it, so special casing is just what we do.
* When I traced this, I found that the copy_ op is not yet boxing compatible so I had to map it manually. If there are many more like this, I'll probably do something a bit more clever to reduce duplication.
* This exposes new signature patterns that will need to be handled by the ATen lowering. Will take care of that next: It will be nice to have an e2e of a non-trivial case with full gradients.
* Fixes #97.
2020-10-30 22:59:26 -07:00
Stella Laurenzo 8d98dd4551 Support optional args/returns and other odds and ends.
* None's out Device? args.
* Emits bool tensors if needed.
* Adds some stderr tracing to better see what is going on.
* Test case that exercises NLLLoss.
* This test case emits something for backward calculations but there are some issues still to be worked out, so that part is left out of the test case.
* Progress on #97
2020-10-30 14:50:28 -07:00
Stella Laurenzo c08935a418 Rewrite ATen ODS code generator to be based on new op registry and new signature recognition system.
* Deletes prior code generator from previous attempt (moved some of it into this one).
* Renames old generated tablegen source to "Legacy".
* Generates ODS and import rules for most binary and unary arithmetic ops.
* Removes old generated ops and integration tests that were testing details of the prior setup.
2020-10-28 10:37:37 -07:00
Stella Laurenzo 510f226df2 Expose signature metadata to ops and implement ATenRecognizeKernelsPass pass.
* Two op interfaces, one for querying instance metadata and one for getting static data needed to construct an op from a generic form.
* For torch.generic_kernel ops, metadata is splatted in during capture from Torch (it comes from the op registry, which will work for either device capture or graph import).
* Moved the 'add' out of the generated set so I can experiment on it. It implements the TorchBuildableKernelOpInterface interface which provides its metadata.
* The ATenRecognizeKernelsPass pass generically lowers from a torch.generic_kernel to recognized ops that implement the TorchBuildableKernelOpInterface, handling the various types of transformations that we allow at this stage.
2020-10-26 20:31:45 -07:00
Stella Laurenzo 91fc83d2e7 NFC: Transition ATen passes to tablegen registration. 2020-10-22 17:12:44 -07:00
Stella Laurenzo 9618c2dbf7 NFC: Re-organize ATen directory structure and fix warnings.
* Still some more work to do on the Transforms tree to bring it in line with the others (will do that as I add things).
2020-10-22 14:13:26 -07:00
Stella Laurenzo d09300886a NFC: Use new print with large_elements_limit in tests.
* For tests with large constants, decreases issues with lit pipelines.
* Bumps llvm-project to pick up the update.
2020-10-22 13:04:24 -07:00
Stella Laurenzo 58adb6bd8e Work around various PyTorch issues in support of convolution.
* Enables the conv2d fwd test and ResA (which are both small).
* Deletes resnet18 and vgg, which both run but generate output that crashes FileCheck and lit (or at least makes them take an eternity).
2020-10-21 12:44:31 -07:00
Stella Laurenzo 029815152e Add remaining pieces to capture full example models.
* Adds Basicpy List, Tuple, Dict types and plumbs through C API.
* Started debugging the issues around aten::conv2d capture, but a PyTorch bug is suspected.
* Was able to manually verify that the basic conv2d forward test captures correctly with a workaround.
* Need to resolve some printing issues upstream and move these tests to an integration test target (they take ~seconds to run).
2020-10-19 22:16:59 -07:00
Stella Laurenzo 9e52f6235b More progress on PyTorch acap device capture.
* Now gets far enough to capture batch_norm.
* Has some issues still with in-place ops.
* Can materialize constants.
* Includes an upgrade to PyTorch nightly, which has important bug fixes for fallback and boxed kernel dispatch.
* Fixes #78, #79, #80.
* Will do more testing in a follow-up once further bugs are fixed that facilitate getting at the other features.
2020-10-15 21:43:21 -07:00
Stella Laurenzo abb6fe8aa2 Port prior acap export tests to new dispatcher based versions.
* Sadly, non-trivial ones fail.
* Bugs filed and marked XFAIL.
2020-10-13 16:37:46 -07:00
Stella Laurenzo 30cfc6499f Create public API for torch_mlir python code.
* Adds a trampoline/loader 'torch_mlir' module.
* Plumbs through the MLIR python Context and Module creation, interoping with the MLIR Python API (resolves TODO on creating with own context and accessing the module being built).
* Inter-module Python API interop is still a bit rough but workable via the capsule mechanism. Can be evolved later.
* Exports the frontends/pytorch python sources to the project python/ build directory.
* Requires D89294 to land.
2020-10-13 16:36:49 -07:00
Stella Laurenzo 5c5b8db70f Update test configuration to import mlir from LLVM install location.
* Also adds two lit tests to verify that all of our extensions load without fireworks, which is a good indication that the shared library deps are sane.
* Bumps llvm-project version to use D89167.
2020-10-12 15:25:07 -07:00
Stella Laurenzo af4edb63ae Start reworking towards a shared library build.
* Need to have a dag of shared library deps in order to interop across python extensions (as presented in ODM).
* Introduced add_npcomp_library and friends to mirror the MLIR setup.
* Adds a libNPCOMP.so shared library.
* Redirects tools and extensions to link against libNPCOMP.so (instead of static libs).
* Moves all libraries to lib/, all binaries to bin/ and all python extensions to python/. The invariant is that the rpaths are setup to have a one level directory structure.
* Reworks the _torch_mlir extension to build like the others (still need to come up with a consolidated rule to do this instead of open coded).
* Includes an upstream version bump to pick up needed changes.

Sizes with dynamic linking (stripped, release, asserts enabled):
  libNPCOMP.so: 43M (includes much of the underlying LLVM codegen deps)
  libMLIR.so: 31M
  _npcomp.so: 1.6M (python extension)
  _torch_mlir.so: 670K (python extension)
  npcomp-capi-ir-test: 6.3K
  npcomp-opt: 351K
  npcomp-run-mlir: 461K
  mnist-playground: 530K

Still more can be done to normalize and optimize but this gets us structurally to the starting point.
2020-10-09 16:02:58 -07:00
Stella Laurenzo 3ccc2214a7 Set PyTorch captured function return type.
* Resolves various TODOs that required an LLVM change/bump.
* Bumps LLVM to 4aa217160e5f06a96c6effc4950c3b402374de58
2020-10-07 10:14:34 -07:00
Stella Laurenzo ad3ddb9edb Implement torch.kernel_call capture.
* Had to stop short of modifying the function return signature because of a missing C-API upstream.
* Committing here is good enough for a test and will resolve the various TODOs about upstream APIs next.
2020-10-06 21:54:28 -07:00