Commit Graph

24 Commits (b2952b12dd9c2f908a29499ae4eef775b5e56ff0)

Author SHA1 Message Date
stephenneuendorffer 3fd9b7789e
Bump LLVM to 881ff4e4ebe8cc0cc045c7c167cffb01f94f27f8 (#539) 2022-01-25 22:16:30 -08:00
Prashant Kumar 5c7ce45c4e Update external llvm to 966b72098363d44adf2882b9c34
The external llvm is updated to point to
https://reviews.llvm.org/rG966b72098363d44adf2882b9c34fcdbe344ff913.
Some of the changes wrt. NamedAttr has been addressed.

Signed-off-by: Prashant Kumar <prashant@nod-labs.com>
2021-12-06 23:33:58 +05:30
Yi Zhang 0902438882 Update llvm-project to a54f4eae0e1d0ef5adccdcf9f6c2b518dc1101aa
This brings in https://reviews.llvm.org/D110797. PRs that are in
progress will need to use scripts provided by
https://llvm.discourse.group/t/psa-removed-arithmetic-ops-from-standard/4455.
2021-10-18 13:36:42 -04:00
Sean Silva 5b6902e31c Dual license the torch-mlir project.
This commit (with approval from all contributors) dual licenses
the torch-mlir project under both the standard LLVM license and the
standard PyTorch license. This will facilitate moving code between
torch-mlir and the two upstream projects.

The standard file comment is now:

```
// This file is licensed under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
// Also available under a BSD-style license. See LICENSE.
```

See `LICENSE` in the project root for the terms of both licenses.
2021-10-01 10:46:08 -07:00
Sean Silva 4fad753073 Move external/torch-mlir to the root of the repo. 2021-09-27 17:11:08 -07:00
Sean Silva 28a7738189 [torch-mlir earthmoving (1/N)] C/C++ code movement.
This creates the `external/torch-mlir` directory as an
LLVM_EXTERNAL_PROJECTS-compatible project (analogous to
`iree-dialects`) and completes movement/rename of all pure MLIR C/C++
compiler code into there. The next step will be to move all the Python
code / code that links/includes PyTorch C++ code (which currently lives
in `frontends/pytorch`) into a subdirectory here.

I call this "earthmoving" because it is mostly mechanical changes and
renames. As a quick summary (we can change this down the road easily)
- C++ `mlir::NPCOMP::Torch -> mlir::torch::Torch`
- CAPI `npcompTorchListTypeGet -> torchMlirTorchListTypeGet`
- preprocessor `#ifndef NPCOMP_ -> #ifndef TORCHMLIR_`
- CMake `NPCOMPFoo -> TorchMLIRFoo`

The goal of this is to create a standalone project creating a center of
mass for entry into the MLIR ecosystem from PyTorch, suitable in scope
for eventual inclusion/ownership in PyTorch. The idea is that
`external/torch-mlir` will some day be pulled out into its own
repository, and then npcomp will simply pull it in as a submodule.

Layering-wise, what lives in `torch-mlir` lowers code from PyTorch
(currently TorchScript, but TorchFX or pytorch/xla-style tracing are
possible extensions) down to what we have been calling the "Torch
backend contract" which is cleaned up IR (inlining, simplifcation,
conversion to value tensors, ...) entirely in the `torch` dialect. This
is the branching off point for further lowering, of which npcomp takes
one opinion (outside `torch-mlir` of course!), namely the
`TorchConversion` dialect/transforms which lower to IR suitable for IREE
and other linalg-on-tensors based lower-level compilers.

Summary of changes:
- move `{include,lib,test}/Dialect/Torch` into `torch-mlir`
- move relevant parts of CAPI into `torch-mlir`.
- leave a few things related to the `torch-mlir` Python build commented
  out, which should be resolved in a subsequent change.
2021-09-10 21:44:37 -07:00
Sean Silva 83b5b5456d Bump llvm-project to da289a174fc6617c7be37be2947480510fd4f02a
- Build adjustments for `.cpp.inc` dialect files.
- Renaming of `memref.dim` to `tensor.dim` for tensor case.

Minor changes:
- Renaming of `mlir::linalg::ReassociationIndices` to
  `mlir::ReassociationIndices`.
- Adjust command line option parsing in npcomp-run-mlir.
2021-07-07 13:57:29 -07:00
Sean Silva 333e07a74e Add `torch.vtensor.literal` op.
This op is much better behaved than the `torch.tensor.literal` op
(which is the new name of the `torch.tensor` op). In particular
`torch.tensor.literal`:
- always has a maximally refined type.
- always has value semantics.
- can be constant folded / CSE'd.

ReduceOpVariants is changed to perform the transformation from
`torch.tensor.literal` to `torch.vtensor.literal` (which in general
involves static information casts and copies.

This new op also allowed tightening up `torch.tensor.literal` to only
accept NonValueTensorType (instead of any tensor type).

This new ".literal" name is more descriptive. It was getting too
confusing seeing an op called just `torch.tensor` (we originally called
it that because that's the name of the similar function in the Torch
Python API, but it just doesn't fit here).
2021-06-17 14:37:04 -07:00
Sean Silva 4a0eb44d17 Add a !torch.float type.
This removes the dependence of the `torch` dialect on the low-level
builtin types.
Now the `torch` dialect is a standalone layer, suitable for targeting
from higher-level Python abstractions without any premature lowering to
primitive types.
2021-06-17 09:24:18 -07:00
Sean Silva f49ebf1690 Add `!torch.int` type.
This replaces the ad-hoc use of `i64` throughout the Torch layer, and
helps to keep it crystal clear the distinction between `!torch.int`
(which is modeling the Python `int` type) and the various types that
serve as dtypes of tensors, which are a totally different type universe.

Changes:
- `!torch.int` type and C bindings.
- Change `torch.constant.int` parser to not need the `: i64` at the end.
- `m_TorchConstantInt` matcher to aid with matching constants.
- BackendTypeConversion changes for `!torch.int` -> `i64` type
  conversion.
- Refactor finalizing patterns in FinalizingBackendTypeConversionPass
  (they were getting very repetitive).
- Mechanical rewriting of `!torch.int` to `i64` in all the tests, and
  `AnyTorchIntType` to `Torch_IntType` in the `.td` files.
2021-06-17 07:28:23 -07:00
Sean Silva 784156a998 Add `!torch.bool` type.
This finishes removing the dependence on the basicpy dialect!

Changes:
- Add `!torch.bool` type and replace use of `!basicpy.BoolType` in
  Torch-related code.
- Rename BuiltinTensorize to BackendTypeConversion since now it handles
  bool conversions (and, when we add !torch.int and !torch.float, it
  will handle those as well), and generalize the related utilities (I
  also moved them to Torch/Transforms since they aren't really part of
  Torch/IR).
  - Add `torch.to_i1` and `torch.from_i1` ops for materializations
- [cleanup] Reorganize `torch.constant.*` ops in TorchOps.td
- Remove dependency of `torch` dialect on `basicpy` dialect and also
  `std` dialect. For `std`, we use some call related ops, but the
  `torch` dialect itself never produces them (we have passes that do
  though).

This is fairly mechanical. Recommended review order:
- New stuff in Torch/IR
- New BuiltinTypeConversion files.
- Mechnical fixups elsewhere.
2021-06-16 13:22:00 -07:00
Sean Silva 3ccf6002af Add `torch.constant.int` and `torch.constant.float`.
- This removes reliance on basicpy.numeric_constant.
- Also, add OpAsmOpInterface to the `torch.constant.none` and
  `torch.constant.str` ops.
2021-06-15 15:29:42 -07:00
Sean Silva 2e850ecb72 Add !torch.str type.
- Remove dependence on `!basicpy.BytesType`.
- Add `torch.constant.str "s"` analogous to `torch.constant.none`.
2021-06-15 10:10:59 -07:00
Sean Silva db282fd1b4 Introduce native `!torch.none` type.
- Add `torch.constant.none` op to construct it (naming is chosen to be
  analogous to Torch's representation of a prim::Constant with
  NoneType, rather than using the "singleton" terminology of Basicpy).
2021-06-14 13:30:58 -07:00
Sean Silva 81bcd7fb12 Move Torch type implementation code into TorchTypes.cpp 2021-06-10 16:46:47 -07:00
Sean Silva 370e3270ab Introduce `!torch.tensor` / `!torch.vtensor` types.
This removes our reliance on the numpy dialect and avoids our off-label
use of the builtin tnesor type for modeling unknown dtypes.  The
`!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor.
The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic
tensor. The new types look as follows syntactically:

```
// Least-static-information, non-value-semantic tensor.
!torch.tensor
// Explicit form of least-static-information variant.
!torch.tensor<*,unk>
// Least-static-information, value-semantic tensor.
!torch.vtensor
// Explicit form of least-static-information variant.
!torch.vtensor<*,unk>
// Fixed-set of allowable element types, with first-class support for
// Torch's frontend signedness semantics.
!torch.tensor<*,si32>
// First-class support for unknown dtypes.
!torch.tensor<[?,?,?],unk>
// Standard MLIR representation of `?` for unknown dimensions.
!torch.tensor<[?,2,?,4],unk>
// Statically shaped / dtyped example.
!torch.vtensor<[1,2,3,4],f32>
```

This required fairly significant changes throughout the compiler, but
overall it is a big cleanup. We now have a much clearer layering of "the
Torch frontend lowering" vs "lowering to std + linalg + etc.".

At the C++ level, there is `ValueTensorType`, `NonValueTensorType`.
We also have a helper `BaseTensorType` (kind of like ShapedType) which
interoperates with those two.

Included changes:
- New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for
  creating torch tensor literals in the frontend.
- Consistently use signedness for the types (except i1 which I didn't
  touch -- we need to sort out the situation with !basicpy.BoolType
  there anyway so will be attending to that soon)
- Frontend can annotate whether an argument to the function has value
  semantics. We currently require this, as our backend contract does not
  currently allow us to even model the non-value-semantic case. Before,
  the value-semantic assumption was randomly injected in the middle of
  the pass pipeline.
- Move ArrayToTensor (now called MaximizeValueSemantics) and
  RefinePublicReturn passes to torch dialect.
- The TorchToStd and TorchToLinalg passes are now type conversions from
  `!torch.vtensor` to `tensor` and use the dialect conversion infra.
  The overall conversion pipeline is set up following the best practices
  of the "Type Conversions the Not-So-Hard Way" talk. This required
  introducing `torch-func-builtin-tensorize` and
  `torch-finalizing-builtin-tensorize` passes analogous to the upstream
  bufferization passes with the corresponding names (mostly just
  copypasta from there).
- Misc Torch-level canonicalizations -- we now cleanly layer the
  lowering to std later in the pipeline, so we are gradually lessening
  our reliance on random std constant folding before we get to that
  point.

Recommended review order:
- New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp
- New ops in TorchOps.td / TorchOps.cpp
- Less important / more mechanical stuff
  - Frontend changes.
  - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-06-10 10:56:48 -07:00
Sean Silva 2efda323ff Significantly restructure torch/aten import design.
This is a really major and invasive restructuring of the way we get
torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into
MLIR. Please forgive the challenging review, but due to the sheer
invasiveness, it wasn't really practical do do it in sane smaller
pieces.

This fully replaces everything that was already working on the
TorchScript path (actually, more -- we added tanh support to
TorchToLinalg in order to delete the older code paths). Additionally,
I've kept the lights on for the acap path too, including what little e2e
stuff was working before (for expediency I made a few tiny compromises
along the way that will be easy to undo when we give that path proper
attention).

Overview of the new design:
- The torch operator `somens::someunqualname.someoverloadname` is
  imported as `torch.somens.someunqualname.someoverloadname` (skip the
  last dotted part if the overload name is empty), OR, if we don't have
  such an op registered, it is imported as
  `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`.
  - The addition of the "overload name" is a critical element here, as
    the `(ns,unqual,overload)` triple is unique, which solves a lot of
    problems we were having.
  - This involves having separate MLIR ops for the `trailing_` and
    `.out` variants and all the different overloads. This seemed
    necessary, because the set of overloads is so wild and varied and
    unstructured. The previous design was leaning into some underlying
    structure that just isn't there -- the default situation is
    the "random overload that we want to manage on the MLIR side",
    rather than that being an exception. E.g.  `aten::ne` (not-equal)
    has 21 overloads, only 4 of which are c10 dispatcher ops see
    [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1),
    and the "out" variant is really called `.Tensor_out` instead of
    `.out` as it frequently is for other ops.
  - Rationale for all being in `torch` namespace: the set of operators
    are so varied and unstructured that "dialect per namespace"
    doesn't result in anything resembling the typical MLIR dialect
    boundary expectations. We could maybe draw the boundary at
    dispatcher ops vs non-dispatcher ops, but that doesn't seem to
    really result in very much useful structure at this point in time.
  - Note: within the torch operator registry, we effectively have a
    mini-basicpy subdialect (already type-resolved), which is reasonably
    structured.
  - The existing Torch op interfaces are also removed -- now that we
    track the overload name, we can losslessly find the original
    operator.
- Instead of `ATenRecognizeKernelsPass`, we now have a
  `ReduceOpVariantsPass` that keys off certain traits (and perhaps
  eventually interfaces) to reduce variants of ops to a smaller set,
  ideally operating on immutable tensors and using surrounding ops to
  model the mutability/aliasing aspects.
  - Note: `torch.ns.unqual.overload` ops allow both immutable and
    mutable tensors (unlike the previous hard distinction in the common
    case). This is a premonition for a future change that will introduce a
    bona fide `!torch.tensor` type that will clean up a bunch of stuff.
- `TorchToLinalg` / `TorchToStd` supercede the existing
  "ATen->TCF->TCP->Linalg" path.
- The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`.
  It should look somewhat familiar, but the benefit of hindsight has
  allowed a lot of simplifications.

The overall trend seems to be to make the `torch` dialect a nice layer
independent of anything else. It feels like as a natural result of
various future changes we will be removing the reliance on basicpy+numpy
dialects and have a nice self-contained type system too that properly
models the TorchScript type system (including proper subtyping,
mutable/immutable tensors, optional dtype, etc.).

Recommended review order:
- Start at some of the new import IR, e.g. in
  `frontends/pytorch/test/node_import/prim.py`,
  `frontends/pytorch/test/acap_export/test_export_add3.py`, and other
  tests.
- `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py`
  and associated generated files:
  - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td`
  - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td`
- Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new
  traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h`
- Various code changes in the import path in
  `frontends/pytorch/csrc/builder`. Probably most interesting is the new
  code in `torch_to_mlir_utils.cpp` that has the logic to create the
  `torch.operator` ops or `torch.ns.unqual.overload` ops.

This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe),
just to be able to look at a substantial sample of IR in the new style.
2021-05-19 13:37:39 -07:00
Sean Silva e749074bae Basic infra for annotate shapes and dtypes on arguments.
These allow users to annotate a known "type bound" on the argument,
which can seed shape/dtype inference. We don't rewrite the function
types as part of the import process (it will happen in a
yet-to-be-written pass) because:

1. We would need to interprocedurally rewrite all calls to keep the IR
   consistent. Currently, we have a place after GlobalizeObjectGraph but
   before we convert to tensors where this is convenient to do. Ideally,
   we would do this on the object graph representation.

1. We don't necessarily know that adjusting the function type is a legal
   calling convention change. The pass will have blessed knowledge (by
   the pass pipeline author) that adjusting the argument type based on
   the type bound is safe (which it frequently is).

2. Note that in principle, a type bound could be a fairly general thing
   (such as maximum sizes of dimensions, unions of multiple concrete
   types, etc.). The pass will in principle have logic to interpret the
   type bounds and to determine a suitable "best" (and legal) argument
   type.
2021-04-01 18:40:03 -07:00
Sean Silva 99178a167d Bump llvm-project to 0524a09cc7e1a0797982feacf505825231efbee7
- renames of OwningRewritePatternList -> RewritePatternSet
  - also `insert` to `add`
- RewritePatternSet holds a context now
- memref dialect split from std
2021-03-23 14:29:05 -07:00
Sean Silva 8486968925 Add trivial inliner interfaces.
With this + manually setting private visibility on everything, a simple
classifier can be reduced to this IR, which is looking pretty lean and
mean:
https://gist.github.com/silvasean/19e7e2e21a61ff197aeac0dd864d188f

Also, include a utility script for importing `.pt` models.

```
pt_util.py --import classifier.pt | npcomp-opt -torch-globalize-object-graph
```
2021-02-22 10:40:38 -08:00
Sean Silva 158c5c484d Implement GlobalizeObjectGraph transformation.
This required restructuring of how we model TorchScript on import. The
main difference is that now we split out a `torch.class_type` that holds
methods and declarations of the types of each slot. This is more
consistent with TorchScript (our previous representation was
"denormalized").

Recommended reading order:
1. check out the description of `torch.class_type` in `TorchOps.td` and
   look at `test/Dialect/Torch/ops.mlir` and
   `frontends/pytorch/test/module_import/` to familiarize with the new
   representation.
   - Just look at the new IR. The diff between the old names and new
     names is confusing.
2. check out `test/Dialect/Torch/globalize-object-graph*.mlir`
   and read along with the pass description in
   `include/npcomp/Dialect/Torch/Transforms/Passes.td`
3. Read the code in `GlobalizeObjectGraph.cpp` and miscellaneous changes
   in `ivalue_importer.cpp`, `TorchOps.cpp`, etc.
2021-02-18 18:18:47 -08:00
Sean Silva 689b40c7a6 Add initial TorchScript module importer
It turns out that this was easiest to structure as a general IValue
importer, since torch module are just one of the possible IValue's.

We import the IValue object graph in a braindead fashion into basicpy
ops and a new `torch.nn_module` op that is used to model the
attributes/methods of a torch::jit::Module IValue. See `Torch/ops.mlir`
for an example, and also check out the .py import tests in
`frontends/pytorch/test/module_import`.

As part of this change, a few housekeeping tasks:
- extract some helpers from graph_importer.cpp
- more helpers around the C API
- misc touchups
2021-01-28 11:55:17 -08:00
Stella Laurenzo 510f226df2 Expose signature metadata to ops and implement ATenRecognizeKernelsPass pass.
* Two op interfaces, one for querying instance metadata and one for getting static data needed to construct an op from a generic form.
* For torch.generic_kernel ops, metadata is splatted in during capture from Torch (it comes from the op registry, which will work for either device capture or graph import).
* Moved the 'add' out of the generated set so I can experiment on it. It implements the TorchBuildableKernelOpInterface interface which provides its metadata.
* The ATenRecognizeKernelsPass pass generically lowers from a torch.generic_kernel to recognized ops that implement the TorchBuildableKernelOpInterface, handling the various types of transformations that we allow at this stage.
2020-10-26 20:31:45 -07:00
Stella Laurenzo 2c9ca79c89 Add boilerplate for Torch dialect. 2020-09-28 15:26:17 -07:00