593 Commits (2e63f4b1e11dfac6b43fbf3fcc87e91a944f7c37)
 

Author SHA1 Message Date
Sean Silva 2e63f4b1e1 Again try to fix nondeterminism in check-npcomp-all. 2021-09-16 19:52:24 +00:00
Sean Silva 44d615ac1e Try to fix nondeterminism in check-npcomp-all. 2021-09-16 19:30:45 +00:00
Sean Silva d94d6800fa Bring CI back to life.
This brings back `check-npcomp-all` and the refbackend e2e tests
coverage.
2021-09-16 12:07:32 -07:00
Sean Silva b6be96d722 [torch-mlir earthmoving (2/N)] Python code movement.
This moves the bulk of the Python code (including the Torch interop)
from `frontends/pytorch` into `torch-mlir/TorchPlugin`. This also
required reconciling a bunch of other Python-related stuff, like the
`torch` dialects.

As I did this, it was simpler to just remove all the old numpy/basicpy
stuff because we were going to delete it anyway and it was faster than
debugging an intermediate state that would only last O(days) anyway.

torch-mlir has two top-level python packages (built into the
`python_packages` directory):

- `torch_mlir_dialects`: `torch` dialect Python bindings (does not
  depend on PyTorch). This also involves building the aggregate CAPI for
  `torch-mlir`.
- `torch_mlir`: bindings to the part of the code that links against
  PyTorch (or C++ code that transitively does).

Additionally, there remain two more Python packages in npcomp (but
outside `torch-mlir`):

- `npcomp_torch`: Contains the e2e test framework and testing configs
  that plug into RefBackend and IREE.
- `npcomp_core`: Contains the low-level interfaces to RefBackend and
  IREE that `npcomp_torch` uses, along with its own
  `MLIR_PYTHON_PACKAGE_PREFIX=npcomp.` aggregation of the core MLIR
  python bindings. (all other functionality has been stripped out)

After all the basicpy/numpy deletions, the `npcomp` C++ code is now very
tiny. It basically just contains RefBackend and the `TorchConversion`
dialect/passes (e.g. `TorchToLinalg.cpp`).

Correspondingly, there are now 4 main testing targets paralleling the
Python layering (which is reflective of the deeper underlying dependency
structure):

- `check-torch-mlir`: checks the `torch-mlir` pure MLIR C++ code.
- `check-torch-mlir-plugin`: checks the code in `TorchPlugin` (e.g.
  TorchScript import)
- `check-frontends-pytorch`: Checks the little code we have in
  `frontends/pytorch` -- mainly things related to the e2e framework
  itself.
- `check-npcomp`: Checks the pure MLIR C++ code inside npcomp.

There is a target `check-npcomp-all` that runs all of them.
The `torch-mlir/build_standalone.sh` script does a standalone build of
`torch-mlir`.

The e2e tests (`tools/torchscript_e2e_test.sh`) are working too.

The update_torch_ods script now lives in
`torch-mlir/build_tools/update_torch_ods.sh` and expects a standalone
build.

This change also required a fix upstream related to cross-shlib Python
dependencies, so we also update llvm-project to
8dca953dd39c0cd8c80decbeb38753f58a4de580 to get
https://reviews.llvm.org/D109776 (no other fixes were needed for the
integrate, thankfully).

This completes most of the large source code changes. Next will be
bringing the CI/packaging/examples back to life.
2021-09-15 13:40:30 -07:00
Sean Silva 28a7738189 [torch-mlir earthmoving (1/N)] C/C++ code movement.
This creates the `external/torch-mlir` directory as an
LLVM_EXTERNAL_PROJECTS-compatible project (analogous to
`iree-dialects`) and completes movement/rename of all pure MLIR C/C++
compiler code into there. The next step will be to move all the Python
code / code that links/includes PyTorch C++ code (which currently lives
in `frontends/pytorch`) into a subdirectory here.

I call this "earthmoving" because it is mostly mechanical changes and
renames. As a quick summary (we can change this down the road easily):
- C++ `mlir::NPCOMP::Torch -> mlir::torch::Torch`
- CAPI `npcompTorchListTypeGet -> torchMlirTorchListTypeGet`
- preprocessor `#ifndef NPCOMP_ -> #ifndef TORCHMLIR_`
- CMake `NPCOMPFoo -> TorchMLIRFoo`
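
The rename rules above can be illustrated with a minimal Python sketch.
The regular expressions and the `earthmove` helper are hypothetical,
inferred only from the four examples listed; nothing here claims to be
the tooling actually used for this change. The sketch touches only the
patterns shown, so a real header guard's matching `#define` line would
need its own rule.

```python
import re

# Hypothetical rewrite rules inferred from the four examples in the summary.
RULES = [
    # C++ namespace: mlir::NPCOMP::Torch -> mlir::torch::Torch
    (re.compile(r"mlir::NPCOMP::Torch\b"), "mlir::torch::Torch"),
    # CAPI prefix: npcompTorch... -> torchMlirTorch...
    (re.compile(r"\bnpcompTorch"), "torchMlirTorch"),
    # Preprocessor guard: #ifndef NPCOMP_ -> #ifndef TORCHMLIR_
    (re.compile(r"#ifndef NPCOMP_"), "#ifndef TORCHMLIR_"),
    # CMake names: NPCOMPFoo -> TorchMLIRFoo (assumes a CamelCase suffix)
    (re.compile(r"\bNPCOMP(?=[A-Z][a-z])"), "TorchMLIR"),
]


def earthmove(text):
    """Apply each rename rule in order and return the rewritten text."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


for old in ["mlir::NPCOMP::Torch", "npcompTorchListTypeGet",
            "#ifndef NPCOMP_", "NPCOMPFoo"]:
    print(old, "->", earthmove(old))
```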

The goal of this is to create a standalone project creating a center of
mass for entry into the MLIR ecosystem from PyTorch, suitable in scope
for eventual inclusion/ownership in PyTorch. The idea is that
`external/torch-mlir` will some day be pulled out into its own
repository, and then npcomp will simply pull it in as a submodule.

Layering-wise, what lives in `torch-mlir` lowers code from PyTorch
(currently TorchScript, but TorchFX or pytorch/xla-style tracing are
possible extensions) down to what we have been calling the "Torch
backend contract" which is cleaned up IR (inlining, simplification,
conversion to value tensors, ...) entirely in the `torch` dialect. This
is the branching off point for further lowering, of which npcomp takes
one opinion (outside `torch-mlir` of course!), namely the
`TorchConversion` dialect/transforms which lower to IR suitable for IREE
and other linalg-on-tensors based lower-level compilers.

Summary of changes:
- move `{include,lib,test}/Dialect/Torch` into `torch-mlir`
- move relevant parts of CAPI into `torch-mlir`.
- leave a few things related to the `torch-mlir` Python build commented
  out, which should be resolved in a subsequent change.
2021-09-10 21:44:37 -07:00
Sean Silva 28762699b3
Comment out the full wheel build
Last commit was only the last step of that.
2021-09-10 21:43:25 -07:00
Sean Silva 0d8af19550
Temporarily disable wheel building
It will be re-enabled after the torch-mlir excision is completed.
2021-09-10 21:40:16 -07:00
Sean Silva a7252f9a06 Add basic support for lists.
This plumbs through a vertical slice of support for lists.

The main chunk of new code here is AnnotateABIPass which captures the
program signature at the Torch backend contract layer, right before we
start `TorchConversion`. The `TorchConversion` lowering process is lossy
w.r.t. types, so it's necessary to do this for all targets in general.
Like using `!iree.list` directly, we use IREE's ABI annotation
representation for this, although there is nothing very IREE-specific
about it (see
https://github.com/google/iree/blob/main/docs/developers/design_docs/function_abi.md)

We change `ListLiteralModule_basic` to use `!torch.int` because IREE
doesn't support f64 yet (and we don't yet have a way for users to say
that they want `!torch.float` to lower as f32).

Recommended review order:
- AnnotateABIPass and tests
- Arg marshaling in npcomp_backend.py and `iree.py`
- Updates to `list_programs.py` / `xfail_sets.py`
- Moving DeleteDeadIREEListsPass to Backend/Common, so that backends
  that don't support lists can use it. RefBackend uses that pass, for
  example.
2021-09-09 20:48:55 -07:00
Yi Zhang 73d553e168 MT model compilation minor changes
This contains the following changes:
 - Fix optional knowledge propagation. The initial knowledge should
 always be NotNone for the operations we implemented.
 - Add Folder for `prim.dtype`
2021-09-09 19:02:48 -04:00
Sean Silva 5f3eb637c4 Fix lowering of reduce ops
We were not filling the `outs` with the neutral element of the
reduction, which resulted in reading uninitialized values (we were
getting lucky that sometimes the uninitialized buffers were all zeros).
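
To illustrate why the fill matters, here is a minimal plain-Python
sketch (not the actual linalg lowering). The `NEUTRAL` and `COMBINE`
tables and the `reduce_rows` helper are hypothetical names chosen for
this example. The output buffer is seeded with the reduction's neutral
element before accumulating, so the result never depends on whatever
happened to be in the buffer.

```python
import operator

# Neutral (identity) element for each reduction; illustrative only.
NEUTRAL = {"sum": 0.0, "prod": 1.0, "max": float("-inf")}
COMBINE = {"sum": operator.add, "prod": operator.mul, "max": max}


def reduce_rows(rows, kind):
    """Reduce each row of `rows` with the reduction named by `kind`."""
    # The "fill" step: every output slot starts at the neutral element.
    out = [NEUTRAL[kind]] * len(rows)
    for i, row in enumerate(rows):
        for x in row:
            # Accumulate into the output slot, as a reduction body would.
            out[i] = COMBINE[kind](out[i], x)
    return out


print(reduce_rows([[1.0, 2.0], [3.0, 4.0]], "sum"))
print(reduce_rows([[-1.0, -2.0]], "max"))
```

Note that zero is only the right starting value for a sum; a product or
max reduction accumulating into a zeroed buffer would still be wrong.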

Also,
- Slight tweak to error messages in the e2e framework.
2021-09-08 15:30:15 -07:00
Ramiro Leal-Cavazos 6724de7692 Added sum lowering
Added lowering of torch.sum into linalg
2021-09-03 17:37:06 -07:00
Sean Silva ed2afe43e7 Fix TorchToIREE lowering.
We needed to resize the list, not just reserve capacity.
2021-09-03 23:57:54 +00:00
Sean Silva 600cc6b9c7 Fix import in jupyter notebook. 2021-09-03 23:57:54 +00:00
Sean Silva 7a3570e881 Clean up stale examples.
They were confusing users, and most didn't even work anymore.
2021-09-03 22:13:36 +00:00
Sean Silva 1dec561cfd Update llvm-project to 830c0b9023cd0cf91955900e0d96283e7a8c3711
- builder.getSymbolRefAttr is gone.
- OpAsmOpInterface's getAsmResultNames method needs explicit override
- a bunch of churn for builtin.func needing to be made explicit (and
  sometimes implicit?)
- operation printers no longer need to print the operation name
  themselves.
- snuck in beneficial trivial addition to TmpDeleteDeadIREEListsPass to
  test a particular upstream change e2e with my local patchset.
2021-09-03 14:16:38 -07:00
Sean Silva 9cc4fdcaa8 Update iree-dialects to IREE 7d9e4909f5524e275726b2754d3ad050818d56ae 2021-09-03 14:16:38 -07:00
Yi Zhang 3b0e5910a8 Refine types continue.
This should cover all the ops that are left in MT.
2021-09-02 14:39:28 -04:00
dan d9df4bfc95 Add sigmoid lowering
Follows existing conventions for activation functions
2021-08-30 17:32:23 -04:00
Sean Silva 29e1b2fe89 Delete RestrictedCanonicalizer
It doesn't work properly with the new dialect registration framework.
This was latent and only was exposed when running through npcomp-opt.
Not worth investing the brainpower to fix now.
2021-08-27 19:09:29 +00:00
dan d7320f3bda fixed some python imports
Change required to enable
./tools/torchscript_e2e_test.sh --config=iree
2021-08-27 14:58:45 -04:00
Sean Silva 1c53424fe7 Revert "Make verbose testing also report compile/trace/run messages."
This reverts commit d8db41b3b6.

These printouts didn't interoperate well with the reporting structure
since they printed out "immediately" rather than being retained in a
string in the TestResult. Doing so would defeat the purpose though,
because they were being used to determine timing to debug
https://github.com/llvm/mlir-npcomp/issues/287

I think these are best done as local modification when debugging a
particular issue, or we can invest in tracing annotations. Soon these
will all run in parallel, so it makes even less sense to have immediate
printouts.
2021-08-27 18:04:00 +00:00
Yi Zhang d6b9709fa5 Changes to refine types
- Add `!torch.optional` knowledge tracking
- Changes to improve type propagation for branches and terminators. See
examples in `refine-types-branch.mlir`
- Refactor to separate handling of different ops from `visitOperation`
- Add refine types for a few new ops
2021-08-27 11:42:00 -04:00
Yi Zhang bc5eae41ca Add more folders to fold away branches
Added folders to a few binary computing ops, `TupleUnpack`,
`__contains__.str` and `__getitem__.Dict_str`.
2021-08-26 17:37:49 -04:00
Stella Laurenzo d8db41b3b6 Make verbose testing also report compile/trace/run messages.
Helped with #287.
2021-08-23 09:57:19 -04:00
Stella Laurenzo 4148f88576 Merge npcomp and mlir python namespaces.
* Now the parts of the MLIR API are directly exported under the npcomp module (i.e. `npcomp.ir`, etc).
* Has required fixes for https://reviews.llvm.org/D108489
* Deletes npcomp.tracing rather than fixing it, because it was a very early experiment that will not be carried forward.
* This makes the npcomp python distribution completely standalone and separate from an mlir installation.
* Makes most of npcomp itself relocatable for future use as a library.
* Most things are a namespace package now. In the future we can s/torch_mlir/npcomp.frontends.torch/ and have it layer properly.
2021-08-22 21:00:42 -07:00
Stella Laurenzo 177ccdd55b Fix flaky test_export_cat.py lit test (upstream change). 2021-08-22 20:04:47 -07:00
Stella Laurenzo 32f56c67f4 Integrate llvm-project at a8de667af092c9b4b3b4a95827a521602ebf14ed.
* Requires patch https://reviews.llvm.org/D108527
2021-08-22 18:59:59 -07:00
Stella Laurenzo 80ff744c56 Add a few missing deps exposed by stricter linking with BFD. 2021-08-22 11:56:48 -07:00
Sean Silva cab8d922ec Add TorchToIREE and factor out TorchConversion dialect.
This converts a basic list op (torch.prim.ListConstruct) to the IREE
dialect.

```
    def forward(self, x: float):
            return [x, x]
```

turns into:

```
builtin.func @forward(%arg0: !torch.float) -> !torch.list<!torch.float> {
  %0 = torch.prim.ListConstruct %arg0, %arg0 : (!torch.float, !torch.float) -> !torch.list<!torch.float>
  return %0 : !torch.list<!torch.float>
}
```

which turns into:

```
builtin.func @forward(%arg0: f64) -> !iree.list<f64> {
  %c1 = constant 1 : index
  %c0 = constant 0 : index
  %c2 = constant 2 : index
  %0 = iree.list.create %c2 : !iree.list<f64>
  iree.list.set %0[%c0], %arg0 : !iree.list<f64>, f64
  iree.list.set %0[%c1], %arg0 : !iree.list<f64>, f64
  return %0 : !iree.list<f64>
}
```
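
The shape of that expansion can be mocked up in plain Python. This is a
text-only sketch that prints op strings resembling the IR above; it does
not use the MLIR Python bindings, and `lower_list_construct` is a
hypothetical helper, not a function in npcomp.

```python
def lower_list_construct(result, operands, elem_type="f64"):
    """Return IREE-style op strings for one torch.prim.ListConstruct.

    `result` is the SSA name of the list (e.g. "%0") and `operands` are
    the SSA names of the already-converted elements.
    """
    n = len(operands)
    list_type = f"!iree.list<{elem_type}>"
    ops = [
        # Create the list with the number of elements as its size operand.
        f"%c{n} = constant {n} : index",
        f"{result} = iree.list.create %c{n} : {list_type}",
    ]
    for i, operand in enumerate(operands):
        # One index constant and one set per element, in order.
        ops.append(f"%c{i} = constant {i} : index")
        ops.append(
            f"iree.list.set {result}[%c{i}], {operand} : {list_type}, {elem_type}"
        )
    return ops


for op in lower_list_construct("%0", ["%arg0", "%arg0"]):
    print(op)
```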

As part of doing this, I realized that it was time to formalize the IR
form that we reach right before running TorchTo{Linalg,Std,...}. We now
call it the "Torch backend contract". We then lower the "Torch backend
contract" to the "npcomp backend contract", which involves the new
TorchConversion (`torch_c`) dialect, which holds ops that need to
operate on both the npcomp backend types (e.g. builtin tensors, i1, IREE
list, etc.) and the `!torch` types.

This made more sense, as I realized that if I didn't factor out
`torch_c` then the Torch dialect would have a dependency on IREE
dialect (we previously didn't notice this was an issue because we only
depended on `builtin` types), which seemed wrong to me.

Recommended review order:
- TorchToIREE.cpp / `TorchToIREE/basic.mlir`
- Look at the new structure of createTorchScriptToNpcompBackendPipeline.
  It now lives in TorchConversion/Transforms/Passes.cpp and cleanly
  calls into `Torch::createTorchScriptToTorchBackendPipeline` for the
  frontend lowering to the Torch backend contract.
- Mechanical change extracting
  `torch_c.{to,from}_{i1,i64,f64,builtin_tensor,iree_list}` into a new
  TorchConversion dialect, and a few passes specific to the lowering
  from the Torch backend contract to the npcomp backend contract.
- Minor fixes to TorchToLinalg.cpp to use unconverted operands (now that
  we convert lists as part of operand materialization, we need to use
  the original operands). Also added test for AtenMaxPool2dOp and fixed
  m_TorchConstantIntList.
- TmpDeleteDeadIREELists pass. Temporary pass for deleting dead IREE lists that
  are created as part of operand materialization for conv/max pool/avg pool ops
  in TorchToLinalg.
2021-08-16 15:01:58 -07:00
Yi Zhang 85ff8b692b Fix compilation errors from MT model
With the following changes the compilation can continue until
RefineTypes pass:

- Add operators without ODS into `torch_ods_gen.py`
- Add some new optional and list types in `TorchTypes.td`
- Add some folders for aten int type comparator ops
- Modify GlobalizeObjectGraph.cpp. For global slots that are not used,
don't check whether an aliased value is stored in more than one global
slot. This works around a failure where the same tensor is stored
in multiple "version" slots which are not used.
2021-08-16 16:37:23 -04:00
M4tr1xt4ng 78fd07da5f Deal with CMP0116 2021-08-12 09:40:55 -07:00
Sean Silva 0b7dbf5f81 Initial import of iree-dialects.
We plan on using these dialects "natively" as part of the npcomp backend
contract, and provide feedback to evolve them in IREE. Roughly speaking,
we can consider these dialects as "what's missing from upstream that we
think belongs in the general abstraction layer that npcomp's backend
contract targets".

We integrate them by just copying the relevant directory from the IREE
source tree (with `build_tools/update_iree_dialects.sh`). This avoids
adding IREE as a submodule, which is way too heavyweight (including
IREE itself, another copy of LLVM, TensorFlow, ...) and would give the
false impression of a source dependency rather than the lightweight (and
eventually versioned/stabilized) IR-level compatibility that we strive
for.
2021-08-11 13:00:04 -07:00
Sean Silva 37df45ded4 Update IREE xfail sets.
All tests pass after https://github.com/google/iree/pull/6666 :)
2021-08-11 11:19:09 -07:00
Sean Silva 6105b0f851 E2E framework: Add support for list/dict/scalar values
Most of the change is in the reporting code to give error messages that
are useful, and adjusting TraceItem to be semantically correct w.r.t.
Python's modeling of return values.

This allows writing a test like `ListLiteralModule_basic` for list
functionality, which we will soon be hooking up to IREE.

The IR for that test currently gets this far:
```
builtin.func @forward(%arg0: f64) -> !torch.list<!torch.float> {
  %0 = torch.from_f64 %arg0
  %1 = torch.prim.ListConstruct %0, %0 : (!torch.float, !torch.float) -> !torch.list<!torch.float>
  return %1 : !torch.list<!torch.float>
}
```

It should be sufficient to just add a conversion of
`torch.prim.ListConstruct` (+ relevant type conversion) to necessary
IREE primitives.

For lists of *tensors* (rather than scalar floats), it gets more
complicated, as we need to deal with changing their element type to
ValueTensorType first (by default, they will all be NonValueTensorType).
It seems that IREE might have a type we can lower into for non-value
tensors as well, TBD.
2021-08-11 10:55:43 -07:00
Yi Zhang bfc3ee35c6 Import Machine Translation model to MLIR.
This includes the following changes to import the MT model into MLIR. There
is still a lot of work to do for actual compilation.
- Add `torch.dict<>`, `torch.any`, `torch.number` types
- Add `torch.prim.DictConstruct` op
- Fix `torch.prim.TupleConstruct` op assembly format to include resulting types
2021-08-10 15:22:06 -04:00
Sean Silva a3bfd115ee Remove npcomp-iree-backend-lower-linkage pass.
This is no longer needed by IREE.
2021-08-09 15:28:02 -07:00
Sean Silva 902c2e579b Add resnet inference jupyter notebook.
This takes the example from torchscript_resnet18_e2e.py and puts it into
a slightly cleaned up notebook form.

It's still a little rough around the edges. Areas for improvement:
- Installation / setup.
- API usability.

Also,
- Add `npcomp-backend-to-iree-frontend-pipeline` since we will be adding
  more stuff there.
- Slight cleanups.
2021-08-09 14:34:43 -07:00
Sean Silva f71845ea75 Triage e2e IREE failures for npcetest.
ResNet works with static shapes. (our test is not static though).

All tests blocked on https://github.com/google/iree/issues/6629
2021-08-05 12:14:58 -07:00
Sean Silva c464cb107f Add npcomp-lsp-server.
To use, do `ninja npcomp-lsp-server`, copy `build/bin/npcomp-lsp-server`
into your PATH somewhere, and then add
```
"mlir.server_path": "npcomp-lsp-server",
```
to your settings.json.

Also bump llvm-project to 2d9759c7902c5cbc9a7e3ab623321d5578d51687 to
bring in latest `mlir-lsp-server` changes.
2021-08-04 10:01:48 -07:00
Yi Zhang 0342b73bf1 Add torch.aten.flatten.using_ints and aten.MaxPool2d linalg lowering
- torch.aten.flatten.using_ints to linalg lowering
- torch.aten.max_pool2d to linalg lowering
- Support torch.aten.conv2d for more flexible dilation and strides values
2021-08-04 12:00:43 -04:00
Sean Silva 496051163f Rename npcomp-run-mlir to refback-run
This better represents its limited scope. This was causing confusion --
people were feeding it higher level ops that require frontend lowering.
2021-08-03 18:24:24 -07:00
Sean Silva 719f0cd709 Minor wordsmithing to README 2021-08-03 14:59:58 -07:00
Sean Silva a153cf4ef2 Refresh Dockerfile and instructions.
Related to https://github.com/llvm/mlir-npcomp/issues/266
2021-08-03 14:54:39 -07:00
Sean Silva 453e29ea05 Add E2E support for tests with heavy dependencies (heavydep tests).
The tests use the same (pure-Python) test framework as the
normal torchscript_e2e_test.sh, but the tests are added in
`build_tools/torchscript_e2e_heavydep_tests` instead of
`frontends/pytorch/e2e_testing/torchscript`. Any needed dependencies can
easily be configured in generate_serialized_tests.sh.

We add an initial machine translation model with a complex set of
dependencies to seed the curriculum there. I verified that this model
gets to the point of MLIR import (it fails there with a segfault due to
not being able to import the "Any" type).

This required moving a few files from the `torch_mlir` Python module
into multiple modules to isolate the code that depends on our C++
extensions (which now live in `torch_mlir` and
`torch_mlir_torchscript_e2e_test_configs`) from the pure Python code
(which now lives in `torch_mlir_torchscript`). This is an entirely
mechanical change, and lots of imports needed to be updated.

The dependency graph is:
```
       torch_mlir_torchscript_e2e_test_configs
                  /              |
                 /               |
                /                |
               V                 V
torch_mlir_torchscript       torch_mlir
```

The `torch_mlir_torchscript_e2e_test_configs` are then dependency-injected
into the `torch_mlir_torchscript` modules to successfully assemble a
working test harness (the code was already structured this way, but this
new file organization allows the isolation from C++ code to actually
happen).  This isolation is critical to allowing the serialized programs
to be transported across PyTorch versions and for the test harness to be
used seamlessly to generate the heavydep tests.

Also:
- Extend `_Tracer` class to support nested property (submodule) accesses.

Recommended review order:
- "user-level" docs in README.md
- code in `build_tools/torchscript_e2e_heavydep_tests`.
- changes in `torch_mlir_torchscript/e2e_test/framework.py`
- misc mechanical changes.
2021-08-03 14:09:56 -07:00
Sean Silva f168cacd6d Remove TCF and TCP.
These were legacy concepts that are now superseded by direct Torch to
linalg-on-tensors lowering. These were based on some very early thinking
related to the layering of frontends vs codegen, which is now obsolete
because:
- We expected a lot more centralization at the frontend (TCF) level. It
  turns out that frontend needs really vary a lot, and there is no grand
  unifying TCF dialect plausible. The additional layer isn't worth it.
- Linalg-on-tensors obsoletes the primary need for TCP. There are still
  a few things not representable with linalg-on-tensors, but the support
  is growing and the whole "not included in linalg-on-tensors" direction
  needs to be rethought. Our TCP dialect didn't cover any of the
  actually important things in this space (such as sort, FFT, top-k,
  etc.).

See historical [slides](https://drive.google.com/file/d/1iljcpTQ5NPaMfGpoPDFml1XkYxjK_6A4/view) / [recording](https://drive.google.com/file/d/1jSPa8TwPKUt0WuLquGc8OgSUVYJHMvWZ/view)
for more details on the origin story here.

Their presence was confusing users too
[bug](https://github.com/llvm/mlir-npcomp/issues/248).

Also,
- Trim down npcomp-run-mlir testing. It was testing TCF to TCP
  lowering for the most part. The essential stuff is retained and
  rephrased with linalg-on-tensors. (we should probably rename it
  "refback-run" or something, as it is just a way to invoke RefBackend)
- test/Python/Backend/RefJIT/simple_invoke_numpy.py is XFAIL'ed. Our
  "anti-framework" direction seems to be the likely future path.
2021-08-02 12:08:39 -07:00
Sean Silva 7c788dbfec Remove CI pinning. 2021-08-02 11:07:08 -07:00
Yi Zhang 93816ee21a Add an e2e test example for Resnet18
Show an example of classifying image from
https://commons.wikimedia.org/wiki/File:YellowLabradorLooking_new.jpg
with Resnet18
2021-07-30 11:44:44 -04:00
Stella Laurenzo 8494455282 Re-enable integration tests in CI. 2021-07-29 22:57:20 -07:00
Yi Zhang 58b2109898 Lower accuracy to make e2e pass
`Conv2dNoPaddingModule_basic` and `Conv2dWithPaddingModule_basic` started
failing on result accuracy after the conv_2d linalg ops were changed
from tc ops to yaml ops.
2021-07-29 23:33:38 -04:00
Stella Laurenzo 445472c51e Build packages for npcomp-torch.
* Adds a minimal setup.py for frontends/pytorch
* Makes npcomp-core export its headers and libraries
* Adds a script to build packages.
* Adds CI step to package and smoke test.
* Will need some more tweaks and coordination prior to deploying (version locking etc).
2021-07-29 19:58:59 -07:00