mirror of https://github.com/llvm/torch-mlir
135 lines
7.2 KiB
Markdown
135 lines
7.2 KiB
Markdown
# Roadmap as of beginning of 2021Q3
|
|
|
|
## Project status overview
|
|
|
|
- TorchScript compilation: Significant work has gone into the TorchScript
|
|
compilation workstream. Basic multi-layer perceptrons execute end-to-end, and
|
|
significant strides have been taken towards ResNet and quantized programs.
|
|
Additionally, a full TorchScript'able machine translation model (IDs to IDs;
|
|
including beam search) has been identified as representative of the kind of
|
|
challenging programs that the TorchScript ahead-of-time compilation flow will
|
|
enable.
|
|
|
|
- `acap_dispatch`: Discussions with stakeholders in the npcomp and PyTorch
|
|
community have shifted the `acap_dispatch` workstream to upstream discussions
|
|
(see [bug](https://github.com/pytorch/xla/issues/2854)). Work within npcomp on
|
|
`acap_dispatch` is temporarily on hold.
|
|
|
|
- RefBackend: The RefBackend workstream is temporarily on hold as well. The
|
|
needs of the TorchScript compilation path are too complex (lists, dicts, error
|
|
handling, runtime ABI) and the engineering resources too limited to
|
|
meaningfully bring up an alternative backend. The decision going forward is to
|
|
single-source on IREE as our needs become more complex. This is somewhat
|
|
unfortunate, as the goal of the RefBackend was to somewhat defray the backend
|
|
story and prevent single-sourcing on what at the time (~2020Q1) was perceived
|
|
as a large external dependency. Somewhat mitigating this situation though is
|
|
that in the intervening year, IREE has become significantly "leaner and
|
|
meaner", and while still nontrivial, it has found a much more tightly scoped
|
|
role that leans much more heavily on upstream infrastructure. In fact,
|
|
inclusion of IREE in the LLVM project in some form now seems possible, which
|
|
will make this dependency very natural.
|
|
|
|
## Non-technical project status overview
|
|
|
|
Community contributions have somewhat petered out due to the shifting focus of
|
|
the project. This was somewhat expected as the early aspirations of the project
|
|
met with the reality of available resourcing, ecosystem constraints, and more
|
|
fine-grained understanding of stakeholder needs. We have brought on 1 new full
|
|
time engineer to work on the project though.
|
|
|
|
## Roadmap overview
|
|
|
|
The project has converged on the TorchScript workstream as the primary effort:
|
|
|
|
- TorchScript compilation: The goal of this project is to build the frontend of
|
|
a truly next-generation ahead-of-time machine learning compiler.
|
|
- Why this project is cool: This system is designed from day 1 to support
|
|
features such as dynamic shapes, control flow, mutable variables,
|
|
program-internal state, and non-Tensor types (scalars, lists, dicts) in a
|
|
principled fashion. These features are essential for empowering an
|
|
industry-level shift in the set of machine learning programs that are
|
|
feasible to deploy with minimal effort across many devices (when combined
|
|
with a backend using the advanced compilation techniques being developed
|
|
elsewhere in the MLIR ecosystem).
|
|
|
|
## TorchScript compilation
|
|
|
|
The TorchScript compiler represents the bulk of core compiler effort in the
|
|
npcomp project.
|
|
[TorchScript](https://pytorch.org/docs/stable/jit_language_reference.html) is a
|
|
restricted (more static) subset of Python, but even TorchScript is quite dynamic
|
|
when compared to the needs of lower-levels of the compilation stack, especially
|
|
systems like Linalg. The overarching theme of this project is building out
|
|
compiler components that bridge that gap. As we do so, the recurring tradeoffs
|
|
are:
|
|
|
|
- user experience: we want a fairly unrestricted programming model -- that's
|
|
what users like about PyTorch, and what enables users to deploy without
|
|
significant modifications of their code.
|
|
- feasibility of the compiler: we want a smart compiler that is feasible to
|
|
implement (for our own sanity :) )
|
|
- excellent generated code quality: this is of course dependent on the backend
|
|
which is paired with the frontend we are building, but there are a number of
|
|
transformations that make sense before we reach the backend which strongly
|
|
affect the quality of code generated from a backend.
|
|
|
|
To give a concrete example, consider the problem of inferring the shapes of
|
|
tensors at various points in the program. The more precision we have on the
|
|
shapes, the better code can be emitted by a backend. But in general, users need
|
|
to provide at least some information about their program to help the compiler
|
|
understand what shapes are at different points in the program. The smarter our
|
|
compiler algorithms are, the less information the user needs to provide. Thus,
|
|
all 3 facets are interlinked and there is no single right answer -- we need to
|
|
balance them for a workable system.
|
|
|
|
To accomplish this goal, we are guided by a *model curriculum*, which consists
|
|
of programs of escalating complexity, from a simple elementwise operation all
|
|
the way to a full-blown end-to-end speech recognition program. Our development
|
|
process consists of setting incremental objectives to build out new layers of
|
|
the compiler to a satisfactory level on easier programs in the curriculum, and
|
|
backfilling complexity as needed to extend to the harder programs. Ideally, this
|
|
backfilling does not require deep conceptual changes to components, but is
|
|
simply an application of extension points anticipated in the original design.
|
|
The trick to making that happen is evaluating designs on enough programs from
|
|
the curriculum to ensure that a solution is likely to generalize and satisfy our
|
|
objectives, without getting bogged down in theoretical details.
|
|
|
|
### 2021Q3
|
|
|
|
- Theme: Scale up the programs we can run end-to-end.
|
|
- End-to-end execution of ResNet.
|
|
- Significant strides towards end-to-end execution of the identified
|
|
end-to-end machine translation model.
|
|
- End-to-end execution of simple programs with lists.
|
|
- End-to-end execution of simple stateful programs.
|
|
- Significant strides towards end-to-end execution of two major "classes of
|
|
models". Tentatively: transformer, LSTM.
|
|
- Theme: Start feeling production-ey
|
|
- For the simplest programs at least, get them running on IREE with
|
|
performance competitive with other frontends
|
|
- Stretch: Extend this result to ResNet.
|
|
- Write initial "user manual" (and any supporting tools, packaging) for how to
|
|
use the new frontend (+ backend integration points) to deploy something on
|
|
IREE.
|
|
- Redesign frontend API's as needed to be palatable to document.
|
|
|
|
|
|
### 2021Q4
|
|
|
|
- Theme: Compiler becomes generally functional for a large class of programs
|
|
- End-to-end execution of end-to-end MT (machine translation) program.
|
|
- End-to-end execution of the two major "classes of models" added to the
|
|
curriculum in Q3.
|
|
- End-to-end execution of quantized model.
|
|
- Identify/build TorchScript'able ASR (automatic speech recognition) program.
|
|
- Significant strides towards end-to-end execution of ASR.
|
|
- Bringing up new programs should be fairly quick and mechanical.
|
|
- Theme: Pathfind next phase after initial compiler bringup.
|
|
- Begin talks with potential users / applications to identify a useful
|
|
"real" capstone project.
|
|
- Goal: Demonstrate viability of the tools.
|
|
- Goal: Start rallying support / interest more broadly.
|
|
- Begin looking at training use cases.
|
|
- Begin looking at building "anti-framework" numerical Python compiler layered
|
|
on our TorchScript compiler.
|