Commit Graph

2314 Commits (d50d3aa5e77117fbb7078c25831ea2913a1c5566)
 

Author SHA1 Message Date
Yi Zhang 2c2286034b
install pybind11 through pip to get version 2.6 (#173) 2021-02-28 16:19:03 -08:00
Sean Silva 79a3f639bf Give torch.global_slot an initializer region.
This is a much simpler representation than the ad-hoc initializer
function we had before. It is also less general, but given the rationale
in Passes.td it seems like the right tradeoff right now.

We can probably carry this representation for quite a while, and when we
can't, it likely means that TorchScript has fixed their object identity
bug and we probably need to just upgrade to a more general object graph
modeling (more general than GlobalizeObjectGraph).

In particular, we don't want to deal with defining and carrying around
this initializer function concept until we need it. For example, if we
want to constant-fold the global slots into uses, this is a much better
representation, and it plays better with symbol-dce (the initializer
function counts as a "use" of the symbol).

(the alternative would have been to write a pass that converts the
initializer function to this form when possible, but I realized that
lots of information had been lost which made that fairly annoying -- it
was all self-inflicted anyway, so best to just go to the source
(GlobalizeObjectGraph) before the information is lost)

Now symbol-dce works nicely (no more "training" bools)
```
pt_util ~/tmp/classifier.pt --import --exported-name forward \
| npcomp-opt -torch-globalize-object-graph -inline -symbol-dce
```
IR: https://gist.github.com/silvasean/8abe63d70d24e29d6db9170ccc8d512b
2021-02-26 16:24:19 -08:00
Sean Silva 59a3f46795 Add support for prim.NumToTensor
With this, we can import BERT!
```
pt_util ~/tmp/bert.pt  --import --exported-name=forward \
| npcomp-opt -torch-globalize-object-graph -inline -symbol-dce
```
https://gist.github.com/silvasean/fe7735ff5d065cc9216f7b0346d0e977

The test case here is a bit unconventional -- it isn't actually valid
Python. To figure out how to generate it I had to go search the PyTorch
codebase for "NumToTensor" and work backward. In this case I found
this
[code](649760e5f1/torch/csrc/jit/frontend/ir_emitter.cpp (L464))
which via a wild guess I was able to turn into a test case.

In this case it didn't take me too long, but when doing this kind of
"add a bunch of trivial stuff to bring up a real model", I'm starting to
think that we might skimp on test cases when it's fairly trivial and not
obvious how to test with a small test.
2021-02-26 10:16:56 -08:00
Sean Silva 7b6fa27838 Rename tests to match the code they test
- `module_import -> ivalue_import`, as it mainly tests ivalue_importer.cpp
- `graph_import -> node_import`, as it mainly tests node_importer.cpp
 - graph_importer.cpp does call into node_importer.cpp, but doesn't do
 much.

This was getting pretty confusing. Also add README.md's in each
directory for more clarity.
2021-02-25 13:31:33 -08:00
Bryce Arden 27a4515de2
Add Conv2D Torchscript Import Support (#167)
Adds support for lowering a torch.nn.Conv2d module to the Torch Dialect through TorchScript import.
Generated IR can be viewed here:
https://gist.github.com/brycearden/6c0f790115c4577249372ef82768e6fd

Required implementing support for tuple in the ivalue importer and list in the node importer.
2021-02-25 12:14:00 -08:00
Sean Silva a375ccf9da Add ability to annotate TorchScript classes.
The first use case is to annotate certain program constructs as either
exported or private. In this commit we plumb it down to
GlobalizeObjectGraph which makes use of this information.

Recommended review order:
1. class_annotator.h/.cpp + `test/module_import/annotations/*`
    - New abstractions to communicate with Python code and annotate.
2. IR changes in TorchOps.td
    - Adding "private" attribute to various things.
3. ivalue_import.cpp changes
    - Module + ClassAnnotator = annotated IR
4. GlobalizeObjectGraph.cpp + tests
    - use new "private" attributes to create "private" IR.
    - also, tweak some of the op deleting mechanics, which was triggering
      some memory errors / assertions

With this, we can run the classifier through and inline it as follows:
```
frontends/pytorch/utils/pt_util.py --import --exported-name forward ~/tmp/classifier.pt \
| npcomp-opt -torch-globalize-object-graph -inline
```
IR: https://gist.github.com/silvasean/32dcad9f6270557f412094a77cecdd69
2021-02-25 11:28:34 -08:00
Sean Silva c424c24ed8 Bump llvm-project to c68d2895a1f4019b387c69d1e5eec31b0eb5e7b0
- dialect registration
- StringAttr::get: order of context arg
- math dialect
- LogicalResult nodiscard
- error message for invalid broadcast
2021-02-22 12:23:24 -08:00
Sean Silva 8486968925 Add trivial inliner interfaces.
With this + manually setting private visibility on everything, a simple
classifier can be reduced to this IR, which is looking pretty lean and
mean:
https://gist.github.com/silvasean/19e7e2e21a61ff197aeac0dd864d188f

Also, include a utility script for importing `.pt` models.

```
pt_util.py --import classifier.pt | npcomp-opt -torch-globalize-object-graph
```
2021-02-22 10:40:38 -08:00
powderluv cecf1fbba5
Add a CI builder with latest pytorch CPU nightly. Also add AArch64 to the build (#166) 2021-02-21 13:36:06 -08:00
Sean Silva 1b769f7841 Extend GlobalizeObjectGraph to handle torch.prim.GetAttr returning NnModuleType
This happens in practice. With this, we can globalize slots for the
non-trivial classifier layer obtained from
https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb

This also adds support for tuple return types, which were needed by that
model.
2021-02-19 10:23:25 -08:00
Bairen Yi b511158141 Add vim swap file in gitignore
Signed-off-by: Bairen Yi <yibairen.byron@bytedance.com>
2021-02-18 19:06:10 -08:00
Sean Silva 158c5c484d Implement GlobalizeObjectGraph transformation.
This required restructuring of how we model TorchScript on import. The
main difference is that now we split out a `torch.class_type` that holds
methods and declarations of the types of each slot. This is more
consistent with TorchScript (our previous representation was
"denormalized").

Recommended reading order:
1. check out the description of `torch.class_type` in `TorchOps.td` and
   look at `test/Dialect/Torch/ops.mlir` and
   `frontends/pytorch/test/module_import/` to familiarize with the new
   representation.
   - Just look at the new IR. The diff between the old names and new
     names is confusing.
2. check out `test/Dialect/Torch/globalize-object-graph*.mlir`
   and read along with the pass description in
   `include/npcomp/Dialect/Torch/Transforms/Passes.td`
3. Read the code in `GlobalizeObjectGraph.cpp` and miscellaneous changes
   in `ivalue_importer.cpp`, `TorchOps.cpp`, etc.
2021-02-18 18:18:47 -08:00
Bairen Yi 99d1db18d2 Add NoneType support for ivalue_importer
PyTorch added a Global variable `_is_full_backward_hook` recently.

See https://github.com/pytorch/pytorch/pull/46163

Signed-off-by: Bairen Yi <yibairen.byron@bytedance.com>
2021-02-18 11:20:29 -08:00
Stanley Winata a38b7b72b2 adapt acap_dispatch to latest pytorch nightly ("1.9.0.dev20210215+cpu")
Modify ACAP_Dispatch to work with latest pytorch
-Remove boxed from convolution's m.impl
-Use redispatch and constrainted keyset to replace deprecated
callwithdispatchkey
2021-02-18 11:13:29 -08:00
Sean Silva 498979ad28 Add MLIR diagnostic handler that prints to `sys.stderr`.
This is needed so that output shows up properly in a Jupyter notebook.
2021-02-17 18:50:05 -08:00
Sean Silva 572163dfde Handle object identity correctly.
This required some careful considerations when defining object identity
for tensors. See the comments for how we do it.

This also tracks some basic information for diagnostics.
2021-02-10 15:15:56 -08:00
Sean Silva 7f7bf39551 Add prim::Print and fix prim::CallMethod
For now, we are treating strings as bytes.
2021-02-10 15:15:56 -08:00
Sean Silva c4e4a11e3f Add support for prim::GetAttr/SetAttr/CallMethod/If
This required some invasive surgery to graph_importer.h/cpp,
specifically moving most of it into node_importer.h/cpp and relayering
it. The abstraction that it had didn't work well in the recursive
setting that happens with prim::If.

The key observation is that torch::jit::Graph doesn't really correspond
directly to anything on the MLIR side. It's a weird combination of a
context, builder, and function and just holds a `torch::jit::Block`. It
is `torch::jit::Node` and `torch::jit::Block` which form the recursive
structure analogous to MLIR's operation/region/block. So
node_importer.h/cpp makes sense as a core building block.

As part of doing this, I did venture a bit into the AcapController code,
and realize now that there is functionality duplicated there with the
ivalue importer. Will refactor that soon.
2021-02-04 17:01:47 -08:00
Sean Silva 99b845411d Rename some tests for consistency 2021-02-01 17:01:18 -08:00
Sean Silva 786b563085 Bump llvm-project to 46e764a628da81795af3f64bd28970b7bd4115d6
No changes needed.
2021-02-01 16:59:32 -08:00
Aaron J Arthurs 63ee4f268a Import basic TCP pad test 2021-01-28 12:01:35 -08:00
Aaron J Arthurs 484fe0d9bd Reformat code 2021-01-28 12:01:35 -08:00
Aaron J Arthurs c0e14da888 Fix TensorFromElementsOp reference 2021-01-28 12:01:35 -08:00
Aaron J Arthurs fc650c9447 Import TCP pad 2021-01-28 12:01:35 -08:00
Sean Silva 689b40c7a6 Add initial TorchScript module importer
It turns out that this was easiest to structure as a general IValue
importer, since torch module are just one of the possible IValue's.

We import the IValue object graph in a braindead fashion into basicpy
ops and a new `torch.nn_module` op that is used to model the
attributes/methods of a torch::jit::Module IValue. See `Torch/ops.mlir`
for an example, and also check out the .py import tests in
`frontends/pytorch/test/module_import`.

As part of this change, a few housekeeping tasks:
- extract some helpers from graph_importer.cpp
- more helpers around the C API
- misc touchups
2021-01-28 11:55:17 -08:00
mikeurbach 0f6a65a1c5
Enable building using LLVM_EXTERNAL_PROJECTS. (#152)
This allows building NPCOMP as an external project of LLVM, similar to
how CIRCT can be built: https://github.com/llvm/circt/pull/227.

The CMake options to use this build style look like this:

```
  -DLLVM_EXTERNAL_PROJECTS=npcomp \
  -DLLVM_EXTERNAL_NPCOMP_SOURCE_DIR=/path/to/mlir-npcomp \
```
2021-01-26 11:43:43 -07:00
stephenneuendorffer f96c05abd4
Update README.md
Fix more links
2021-01-26 10:02:25 -08:00
stephenneuendorffer 142de3bab3
Update README.md
Links were broken
2021-01-26 09:52:46 -08:00
Stella Laurenzo 72f785c4b2 Update install_mlir.sh to take extra configure flags. 2021-01-22 16:30:23 -08:00
Sean Silva b92d724179 NFC: Rename "graph_export" to "graph_import"
These mainly exercise the `module_builder.import_function` function, so
it makes sense for the directory to be called "graph import".
2021-01-21 12:17:21 -08:00
Sean Silva 1965ac4d67 NFC: mark some methods as `override`
This silences some warnings I was seeing locally.
2021-01-21 11:48:41 -08:00
Sean Silva 3f4161635c Bump llvm-project to be7352c00d51f4358db3a23ed6a077f7cb48eafd
- TensorFromElementsOp -> tensor::FromElementsOp
- `cmpi "eq", ...` -> `cmpi eq, ...`. Same for `cmpf`
- syntax change for private func ops
- some changes to the python bindings
2021-01-21 11:16:55 -08:00
Sean Silva 2549d00d8c Specify Python3_EXECUTABLE explicitly.
Otherwise `MLIR_BINDINGS_PYTHON_ENABLED=ON` won't work.
2021-01-20 18:07:04 -08:00
Sean Silva 6351474382 Bump llvm-project to bc556e5685c0f97e79fb7b3c6f15cc5062db8e36
- `let typeDesription` -> `let description`
- LLVMIntegerType -> IntegerType
2021-01-08 14:18:09 -08:00
Stella Laurenzo 52240e0569 Disable RTTI in the LLVM build.
* It was only required with the old python APIs.
2021-01-08 10:56:57 -08:00
Stella Laurenzo 3f706473fd NFC: Delete npcomp python API and switch to upstream.
* Most updates are mechanical except:
  * python/npcomp/__init__.py and python/NpcompModule.cpp: New init/registration bits to replace some automatic things being done in the old bindings. Also an annoying linkage hack that I'll need to triage next.
  * NpcompModule.cpp: New python helpers for custom types and other hard to reach items (for the new bindings).
  * PybindUtils.h: Extended type casting so that the local extension can directly exchange Mlir* C types.
  * python/npcomp/dialects/*: Build support and ODS bindings for local dialects.
  * mlir_utils.py: Defines an ImportContext to replace the old/bad "Helper" class that tracked locations, and insertion points. This has a number of methods on it that would be good candidates to think about better ways to do them upstream.
* Also hoisted a few stand-alone samples to dedicated unit tests as they covered important things.
* More cleanup can be done, but keeping this patch as mechanical as possible to stay in NFC land (this is big enough).
2021-01-08 10:46:24 -08:00
Stella Laurenzo 951d7ff42c NFC: Add .code-workspace to gitignore. 2021-01-08 10:34:49 -08:00
Sean Silva 97d6d04d41 Bump llvm-project to 16c6e9c58e9ae50a775945e6b407f1891f353d2f
Changes:
- linalg init tensor change (outs+init -> just outs)
- IntegerType::get and other builtin types now take the context as the
  first arg
- LLVMType::* is gone. Now LLVM Types are just regular Type's.
2021-01-05 16:12:11 -08:00
Sean Silva d8261a06d5 Fix scripts to handle the case of nonexistent directory.
Also, touch up the docs.
2021-01-05 14:17:08 -08:00
Brennan Saeta 46d3dd9ddd Update links in features.md
Point to correct files instead of 404's.
2021-01-05 13:52:57 -08:00
powderluv d35724ad0d
Use portable realpath. Its unavailable in !GNU (#145)
realpath is a GNUUtils package that is not available on recent OSX

TEST=Build on OSX systems without GNUutils + zsh

Change-Id: I573855b93a08e1746e0bb214be28b4a3ea8264ca
2020-12-29 13:03:15 -08:00
powderluv 4237172bbf
Fix OSX builds. (#143)
--version_script doesn't work on OSX.
Shared libs are .dylibs on OSX.

TEST=Build on iMac Pro. M1 has other issues will be fixed later

Change-Id: I2bda46349a878b8265e273c05d8db6b46c0df633
2020-12-28 01:30:45 -08:00
Aaron Arthurs 85898aaf10
Add TCF convolutional op with bias addition (#137) 2020-12-15 12:53:12 -08:00
Sean Silva d818043986 Bump llvm-project to d50d7c37a159802c89454a6c53c0ec2e7949d84a
Fixes:
- use `op->(method on Operation)`
- update for MlirIdentifier in signature of mlirNamedAttributeGet
2020-12-14 14:30:51 -08:00
Stella Laurenzo f6d7ee06ef Make torch_mlir compatible with binary PyTorch installations.
* This has been anticipated for a long time in that it is quite hard to keep C++ binary compatibility across a system landscape as diverse as PyTorch, LLVM, and this project. This is why we based the PyTorch extension on the MLIR and NPCOMP C APIs only: that is the only sane linkage story for the entire matrix.
* Removes the few LLVM'isms in torch_mlir that had snuck in, using either STL or PyTorch support utilities. The new rule here is that LLVM C++ includes are forbidden at this level and (as stated in the design), torch_mlir should use the PyTorch runtime and support libraries (not introduce an incidental C++ dependency on LLVM).
* Also deletes mnist-playground as it was proving impossible to keep the grid of PyTorch vs system ABI divisions functioning. I am open to a less drastic course here (optional/disabled by default?)
* This gets us pretty close to just using PyTorch's extension builder API, which will be nice for distribution (i.e. it integrates well with the PyTorch ecosystem for deployment). I ended up just simplifying the in-tree CMake support for now.
* Fixes #138
2020-12-14 09:51:00 -08:00
Sean Silva b2077738ca Bump llvm-project to 444822d77a7fea28aa49edf24533c987efa1b2ee
Fixes:
- renames StandardTypes -> BuiltinTypes
- std.extract_element -> tensor.extract
2020-12-11 14:43:38 -08:00
Sean Silva 251aa6e435 Bump llvm-project to 774f1d3ffd458d6cb82d5039758ef1cf6370957f
Date:   Mon Nov 30 15:20:30 2020 -0800

Changes:
- finalizing-bufferize is stricter now, and we need to pull in a DimOp
  bufferization which was previously working by happenstance. The
  offending DimOp's are actually created by the linalg bufferization
  (which creates dim ops on the original tensor values, not the
  converted memrefs), so the fix is moving std-bufferize after
  linalg-bufferize.
2020-11-30 18:40:13 -08:00
Sean Silva f9b32a99fc Bump llvm-project to 164410324d8bf3b5a99e39f7dfe3c6d6972dab30
Date:   Mon Nov 30 12:44:35 2020 -0800

Fixes:
- func-bufferize is no longer finalizing, so we need to add
  finalizing-bufferize.
2020-11-30 13:58:13 -08:00
Sean Silva 955fd3eeda Add some much-needed comments around refbackrt::invoke.
This code is really tricky, and was not commented.
2020-11-25 15:39:41 -08:00
Sean Silva 46aa6d0a24 [RefBackend] Fix leaks related to ABI boundaries.
Best as I can tell (e.g. from LeakSanitizer), this fixes all the leaks
except for those due to buffers created internally to the codegenned
code itself (up next I'll add the buffer deallocation pass to fix
those).

The main change is that instead of attempting to pass `refbackrt::Tensor`
to the codegenned function directly, we make all the ABI types be
UnrankedMemRef which gets passed awkwardly (but workably) as a
`{size_t rank, void *ptrToDescriptor}` on the ABI. The reason why
refbackrt::Tensor wasn't workable is that is that MLIR doesn't really
have a way to deal with the lifetime of unranked memref descriptors that
happen inside the function, which is inevitably what would happen in the
old code that would emit runtime calls to
`refbackrt.to_memref/refbackrt.from_memref` to convert back and forth to
`refbackrt::Tensor` inside the codegenned code.

So, instead of the `refbackrt.to_memref/refbackrt.from_memref` with no
real sound basis for valid lifetime management, we now have a lovely
piece of code in `refbackrt::invoke` in `Runtime.cpp` that just barely
seems to be sound. We rely on the codegenned code having these
properties, which it seems to have:

- it won't free memref descriptors or their backing buffer for arguments
  of UnrankedMemRef type.

- it will allocate a separate memref descriptor for each result
  UnrankedMemRef (which is ensured by having a separate memref_cast for
  each)

- we can sniff the `allocatedPtr`'s (i.e. the backing buffer pointers)
  to avoid double-freeing in the case of aliasing of the backing buffer
  (including backing buffers for arguments feeding into results)

- to catch the case of statically allocated data (which we need to avoid
  passing to `free`) , check if the `allocatedPtr` is (no joke) equal to
  `0xDEADBEEF`, because there is otherwise no way to distinguish
  statically allocated from malloc'ed data...  (std.global_memref lowering
  to LLVM by happenstance sets the allocatedPtr equal to `0xDEADBEEF`,
  presumably mainly as a debugging thing)

Even with all this, we *still* need to (internally to refbackrt::invoke)
make copies of all inputs/outputs! And the details of how the LLVM-level
ABI gets laid out for e.g. function arguments/returns is still super
tricky.

This really highlights how deficient memref is as the general runtime
type for our use case. It's stewing in my mind how best to improve the
situation. My general gut feeling is that IREE's abstractions for this
are "right", but I need to think more how to distill those aspects of
IREE's design in a "reference" way for RefBackend.

Some implementation notes:

- In terms of how this is implemented, this did catch a bug in our ABI
  wrapper functions in LowerToLLVM.cpp, which I had to fix (it happened to
  work before through some combination of npcomprt::Tensor being passed as
  a single pointer + probably me infinite-monkey-ing it until it worked)

- This actually removes 2 out of the 3 compiler runtime functions (the
  only one left is "abort_if". (most of the memref descriptor code moved
  from CopmilerRuntime.cpp to Runtime.cpp)

  - this also means deleting `refbackrt.from_memref` and
  `refbackrt.to_memref`
2020-11-25 13:09:58 -08:00