Most of the change is in the reporting code to give error messages that
are useful, and adjusting TraceItem to be semantically correct w.r.t.
Python's modeling of return values.
This allows writing a test like `ListLiteralModule_basic` for list
functionality, which we will soon be hooking up to IREE.
The IR for that test currently gets this far:
```
builtin.func @forward(%arg0: f64) -> !torch.list<!torch.float> {
%0 = torch.from_f64 %arg0
%1 = torch.prim.ListConstruct %0, %0 : (!torch.float, !torch.float) -> !torch.list<!torch.float>
return %1 : !torch.list<!torch.float>
}
```
It should be sufficient to just add a conversion of
`torch.prim.ListConstruct` (+ relevant type conversion) to necessary
IREE primitives.
For lists of *tensors* (rather than scalar floats), it gets more
complicated, as we need to deal with changing their element type to
ValueTensorType first (by default, they will all be NonValueTensorType).
It seems that IREE might have a type we can lower into for non-value
tensors as well, TBD.
This includes the following changes to import MT model into MLIR. There
are still a lot of work to for actual compilation.
- Add `torch.dict<>`, `torch.any`, `torch.number` types
- Add `torch.prim.DictConstruct` op
- Fix `torch.prim.TupleConstruct` op assembly format to include resulting types
This takes the example from torchscript_resnet18_e2e.py and puts it into
a slightly cleaned up notebook form.
It's still a little rough around the edges. Areas for improvement:
- Installation / setup.
- API usability.
Also,
- Add `npcomp-backend-to-iree-frontend-pipeline` since we will be adding
more stuff there.
- Slight cleanups.
To use, do `ninja npcomp-lsp-server`, copy `build/bin/npcomp-lsp-server`
into your PATH somewhere, and then add
```
"mlir.server_path": "npcomp-lsp-server",
```
to your settings.json.
Also bump llvm-project to 2d9759c7902c5cbc9a7e3ab623321d5578d51687 to
bring in latest `mlir-lsp-server` changes.
- torch.aten.flatten.using_ints to linalg lowering
- torch.aten.max_pool2d to linalg lowering
- Support torch.aten.conv2d for more flexible dilation and strides values
The tests use the same (pure-Python) test framework as the
normal torchscript_e2e_test.sh, but the tests are added in
`build_tools/torchscript_e2e_heavydep_tests` instead of
`frontends/pytorch/e2e_testing/torchscript`. Any needed dependencies can
easily be configured in generate_serialized_tests.sh.
We add an initial machine translation model with a complex set of
dependencies to seed the curriculum there. I verified that this model
gets to the point of MLIR import (it fails there with a segfault due to
not being able to import the "Any" type).
This required moving a few files from the `torch_mlir` Python module
into multiple modules to isolate the code that depends on our C++
extensions (which now live in `torch_mlir` and
`torch_mlir_torchscript_e2e_test_configs`) from the pure Python code
(which now lives in `torch_mlir_torchscript`). This is an entirely
mechanical change, and lots of imports needed to be updated.
The dependency graph is:
```
torch_mlir_torchscript_e2e_test_configs
/ |
/ |
/ |
V V
torch_mlir_torchscript torch_mlir
```
The `torch_mlir_torchscript_e2e_test_configs` are then dependency-injected
into the `torch_mlir_torchscript` modules to successfully assemble a
working test harness (the code was already structured this way, but this
new file organization allows the isolation from C++ code to actually
happen). This isolation is critical to allowing the serialized programs
to be transported across PyTorch versions and for the test harness to be
used seamlessly to generate the heavydep tests.
Also:
- Extend `_Tracer` class to support nested property (submodule) accesses.
Recommended review order:
- "user-level" docs in README.md
- code in `build_tools/torchscript_e2e_heavydep_tests`.
- changes in `torch_mlir_torchscript/e2e_test/framework.py`
- misc mechanical changes.
These were legacy concepts that are now superceded by direct Torch to
linalg-on-tensors lowering. These were based on some very early thinking
related to the layering of frontends vs codegen, which is now obsolete
because:
- We expected a lot more centralization at the frontend (TCF) level. It
turns out that frontend needs really vary a lot, and there is no grand
unifying TCF dialect plausible. The additional layer isn't worth it.
- Linalg-on-tensors obsoletes the primary need for TCP. There are still
a few things not representable with linalg-on-tensors, but the support
is growing and the whole "not included in linalg-on-tensors" direction
needs to be rethought. Our TCP dialect didn't cover any of the
actually important things in this space (such as sort, FFT, top-k,
etc.).
See historical [slides](https://drive.google.com/file/d/1iljcpTQ5NPaMfGpoPDFml1XkYxjK_6A4/view) / [recording](https://drive.google.com/file/d/1jSPa8TwPKUt0WuLquGc8OgSUVYJHMvWZ/view)
for more details on the origin story here.
Their presence was confusing users too
[bug](https://github.com/llvm/mlir-npcomp/issues/248).
Also,
- Trim down npcomp-run-mlir testing. It was testing TCF to TCP
lowering for the most part. The essential stuff is retained and
rephrased with linalg-on-tensors. (we should probably rename it
"refback-run" or something, as it is just a way to invoke RefBackend)
- test/Python/Backend/RefJIT/simple_invoke_numpy.py is XFAIL'ed. Our
"anti-framework" direction seems to be the likely future path.
`Conv2dNoPaddingModule_basic` and `Conv2dWithPaddingModule_basic` start
failing because of results accuracy after changing conv_2d linalg ops
from tc ops to yaml ops.
* Adds a minimal setup.py for frontends/pytorch
* Makes npcomp-core export its headers and libraries
* Adds a script to build packages.
* Adds CI step to package and smoke test.
* Will need some more tweaks and coordination prior to deploying (version locking etc).
* Change aligned_alloc -> malloc. It can fail (and does on MacOS) and is a bit over-aggressive optimization for a reference backend.
* Fixed a fragile test that prints -0.0 on MacOS.
* Fail the test (not the framework) on failure to trace (Torch on MacOS is missing features).
* Fix .so -> .dylib for compiler runtime.
* Added additional *ToLLVM conversion patterns (they were disaggregated from standard).
* Misc renames.
* Spelling change on ConvNCHW op, and it now expects strides and dilations attributes.
- Build adjustments for `.cpp.inc` dialect files.
- Renaming of `memref.dim` to `tensor.dim` for tensor case.
Minor changes:
- Renaming of `mlir::linalg::ReassociationIndices` to
`mlir::ReassociationIndices`.
- Adjust command line option parsing in npcomp-run-mlir.
This includes IREE and RefBackend.
This includes a fixup to torchscript_e2e_test.sh for handling the
situation where PYTHONPATH was not already exported.
I'm seeing the following error:
```
CMake Error in frontends/pytorch/csrc/CMakeLists.txt:
Imported target "torch" includes non-existent path
"/usr/local/include/breakpad"
in its INTERFACE_INCLUDE_DIRECTORIES.
```
Reported upstream in: https://github.com/pytorch/pytorch/issues/60485
- Add support for "expected failures" in test reporting. The new error
reports look like
[this](https://gist.github.com/silvasean/6ffd95e1d55302b699673da201da210d).
- We will now be able to put these tests into CI, since the harness
understand which tests are expected to pass and fail.
- Refactor RefBackendTestConfig to NpcompBackendTestConfig which
supports both RefBackend and IREE.
- Add instructions for installing IREE dependencies (both from packages
and for local builds of IREE)
- Add `tools/torchscript_e2e_test.sh` for invoking the e2e test
harness (this makes invoking a bit easier, as it doesn't rely on a
loose Python invocation).
We plumb through e2e a fair number of interesting cases:
- unary, binary, ternary elementwise ops
- ops like `torch.aten.add.Tensor` that also take a scalar parameter
- static size-1 broadcasting
We allow the static size-1 broadcasting case, but emit a runtime error
in the case of dynamic size-1 broadcasting. This seems like a sweet spot
subset of things that can be lowered directly to linalg, while not being
overly constraining to users. This is consistent with what IREE is doing
for CHLO->Linalg lowering as well
([code](50bf7a87e4/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp (L1))).
To test the static size-1 case, we added support for the
`torch.aten.unsqueeze` op and lowering for it through
`linalg.tensor_expand_shape`. This involved a generalization of
`MaximizeValueSemantics` able to handle it (the solution there also
works for `torch.aten.flatten.using_ints` which we need for ResNet
anyway)
Also, a few minor additional changes:
- Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a
large class of errors before we get to backend lowering (now that we
are doing dialect conversion, the errors are way nicer if we just emit
them up front rather than in the guts of a random pattern).
- Minor change to RefBackend to allow `linalg.tensor_expand_shape`.
Recommended review order:
- e2e tests in elementwise.py
- `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test
- `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test
- RefineTypes.cpp + tests
- MaximizeValueSemantics changes + test
- VerifyInvariantsBeforeBackendLowering pass + test
This adds a pattern to MaximizeValueSemantics which does a simple
abstract interpretation within a block, which handles simple cases of
`torch.overwrite_tensor`, enough to remove all the unnecessary uses of
non-value tensors in ResNet right now.
Before/after IR:
[gist](https://gist.github.com/silvasean/a3e1ef625b19dfc63579f73cd3b543b6)
Also,
- Split `torch.copy.tensor` into `torch.copy.to_tensor` and
`torch.copy.to_vtensor` which convert between value and non-value
semantic tensors. This is a much cleaner factorization as they have
very separate use cases and properties (e.g. different side effects)
- Remove the various canonicalization patterns they had, which were
confusing because they resulted in limited forms of maximizing value
semantics throughout the pipeline. We should structure our compilation
pipeline such that only MaximizeValueSemantics should be maximizing
value semantics.
- Adjust pass pipeline to only run MaximizeValueSemantics once.
- Make OverwriteTensorOp `$value` always be a value tensor and
`$overwritten` be a non-value tensor.
1. Added a simplified version of torch.aten.batch_norm which only handles
inference and assumes the weight, bias, running_mean, running_var are not
None.
2. Removed the primitive types check in verifyLinalgCompatibleTypes check
since now we have proper type converter to handle torch types conversion.
The checks for RankedTensorType is kept because the type converter
doesn't guarantee the converted builtin tensor type is ranked. A
separate verification pass to verify the invariant expected by later
passes will need to be added before those can be removed as well.
This temporarily works around the CMake error:
```
CMake Error in frontends/pytorch/csrc/CMakeLists.txt:
Imported target "torch" includes non-existent path
"/pytorch/torch/lib"
in its INTERFACE_INCLUDE_DIRECTORIES.
```
This op is much better behaved than the `torch.tensor.literal` op
(which is the new name of the `torch.tensor` op). In particular
`torch.tensor.literal`:
- always has a maximally refined type.
- always has value semantics.
- can be constant folded / CSE'd.
ReduceOpVariants is changed to perform the transformation from
`torch.tensor.literal` to `torch.vtensor.literal` (which in general
involves static information casts and copies.
This new op also allowed tightening up `torch.tensor.literal` to only
accept NonValueTensorType (instead of any tensor type).
This new ".literal" name is more descriptive. It was getting too
confusing seeing an op called just `torch.tensor` (we originally called
it that because that's the name of the similar function in the Torch
Python API, but it just doesn't fit here).
This removes the dependence of the `torch` dialect on the low-level
builtin types.
Now the `torch` dialect is a standalone layer, suitable for targeting
from higher-level Python abstractions without any premature lowering to
primitive types.
This replaces the ad-hoc use of `i64` throughout the Torch layer, and
helps to keep it crystal clear the distinction between `!torch.int`
(which is modeling the Python `int` type) and the various types that
serve as dtypes of tensors, which are a totally different type universe.
Changes:
- `!torch.int` type and C bindings.
- Change `torch.constant.int` parser to not need the `: i64` at the end.
- `m_TorchConstantInt` matcher to aid with matching constants.
- BackendTypeConversion changes for `!torch.int` -> `i64` type
conversion.
- Refactor finalizing patterns in FinalizingBackendTypeConversionPass
(they were getting very repetitive).
- Mechanical rewriting of `!torch.int` to `i64` in all the tests, and
`AnyTorchIntType` to `Torch_IntType` in the `.td` files.