mirror of https://github.com/llvm/torch-mlir
Remove old outdated roadmaps. Add placeholder new one.

# Roadmap as of beginning of 2021Q2

## Non-technical project status overview

The project has evolved past the "works on my machine" stage. It's hard to
provide meaningful numbers for an open-source project, but I'm seeing more than
five people regularly active on pull requests, bugs, etc., covering aspects
ranging from acap_dispatch, TorchScript, and RefBackend to build systems and
even CI. This is very promising and feels healthy to me.

## Roadmap overview

The project has grown a number of aspects, but effort has converged on three
workstreams:

- acap_dispatch: The goal of this project is to develop a tracing-based frontend
  for Torch interoperability that takes cues from existing working solutions to
  enable a gateway from PyTorch to MLIR.
  - Why this project is cool: For users who can tolerate the limitations of
    tracing systems, this project enables an MLIR-based frontend for PyTorch on
    a shorter time frame than the TorchScript compilation, letting downstream
    users focus on their value-add. Also, the tracing-based approach has a
    distinct usability advantage for many pure-Python researcher workflows.

- TorchScript compilation: The goal of this project is to build the frontend of
  a truly next-generation ahead-of-time machine learning compiler.
  - Why this project is cool: This system is designed from day 1 to support
    features such as dynamic shapes, control flow, mutable variables,
    program-internal state, and non-Tensor types (scalars, lists, dicts) in a
    principled fashion. These features are essential for empowering an
    industry-level shift in the set of machine learning programs that are
    feasible to deploy with minimal effort across many devices (when combined
    with a backend using the advanced compilation techniques being developed
    elsewhere in the MLIR ecosystem).

- Reference backend (RefBackend): The goal of this project is to develop a
  reference end-to-end flow for the MLIR project, using the needs of our
  frontends as seeds for new feature development and upstreaming.
  - Why this project is cool: Due to its status as an LLVM incubator project,
    npcomp is uniquely positioned to develop an end-to-end flow with a clear
    path toward upstreaming components as their design converges (example:
    bufferization), or rapidly rebasing on newly added upstream components to
    replace homegrown pieces (example: linalg on tensors).
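
To make the parenthetical bufferization example concrete, here is a toy sketch
(plain Python standing in for MLIR; the op names and tuple encoding are invented
for illustration): bufferization rewrites ops on immutable tensor *values* into
ops that write into explicitly allocated *buffers*.

```python
# Toy illustration (not actual MLIR) of the bufferization step: each
# value-semantics op result gets a fresh buffer, and the op becomes an
# in-place write into that buffer.

def bufferize(value_ir):
    """Convert (result, op, operands) value-semantics ops to buffer form."""
    buffer_ir = []
    for result, op, operands in value_ir:
        buffer_ir.append(("alloc", result))                 # %result = alloc()
        buffer_ir.append((op, operands, f"out={result}"))   # op writes in place
    return buffer_ir

value_ir = [("t1", "linalg.add", ("a", "b")),
            ("t2", "linalg.mul", ("t1", "c"))]
for op in bufferize(value_ir):
    print(op)
```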

## acap_dispatch

acap_dispatch is the name of our implementation of a tracing-based PyTorch
program capture system, analogous to the one used by
[pytorch/xla](https://github.com/pytorch/xla). This system is sufficient to
capture a great many programs of interest, and has the benefit of seamlessly
capturing gradients, shapes, and dtypes, while still bottoming out on the same
ATen dialect needed by the TorchScript path. It also trivializes all use of
Python data structures like lists by directly observing their values as
constants.
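
As a rough sketch of how such a tracing-based capture system works (a
hypothetical plain-Python illustration, not the actual acap_dispatch
implementation), each operation applied to a proxy value is recorded into a
graph, and plain Python data is observed directly as constants:

```python
class Node:
    """One captured operation in the traced graph."""
    def __init__(self, op, inputs):
        self.op = op
        self.inputs = inputs

class TracedValue:
    """Proxy object that records every op applied to it."""
    def __init__(self, graph, node):
        self.graph = graph
        self.node = node

    def _record(self, op, other):
        # Plain Python values (ints, list elements, ...) become constants.
        rhs = other.node if isinstance(other, TracedValue) else ("const", other)
        node = Node(op, [self.node, rhs])
        self.graph.append(node)
        return TracedValue(self.graph, node)

    def __add__(self, other):
        return self._record("aten::add", other)

    def __mul__(self, other):
        return self._record("aten::mul", other)

def trace(fn, num_inputs):
    """Run fn on proxy inputs and return the recorded op graph."""
    graph = []
    args = [TracedValue(graph, ("input", i)) for i in range(num_inputs)]
    fn(*args)
    return graph

def model(x):
    scales = [2, 3]                   # a Python list: tracing just sees values
    return x * scales[0] + scales[1]

graph = trace(model, 1)
print([n.op for n in graph])          # ['aten::mul', 'aten::add']
```

Note how the list `scales` never appears in the graph: only the constants `2`
and `3` that tracing observed flowing through the ops.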

Looking a bit longer-term, this flow is a good complement to the TorchScript
flow and has distinct tradeoffs. These are captured nicely in the paper
[LazyTensor: combining eager execution with domain-specific
compilers](https://arxiv.org/abs/2102.13267). In their terminology, our
acap_dispatch path implements "Tracing", while our TorchScript path implements
"Direct compilation". Direct compilation tends to be required for deploying
complex models for inference, edge, or federated learning applications, while
Tracing is the building block for a totally seamless researcher experience when
iterating in Python.
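
The distinction can be sketched in a few lines of plain Python (a hypothetical
illustration, not either system's actual mechanism): tracing only observes the
branch a sample input happens to take, while direct compilation works from the
program text and sees the control flow itself:

```python
import ast

SOURCE = """
def relu_ish(x):
    if x > 0:
        return x
    return 0
"""

# "Tracing": execute on a sample input; only the taken path is observed,
# and the `if` that guarded it is invisible to the capture system.
namespace = {}
exec(SOURCE, namespace)
traced_result = namespace["relu_ish"](5.0)   # only exercises the x > 0 branch

# "Direct compilation": parse the program text; control flow is part of the IR.
tree = ast.parse(SOURCE)
branches = [n for n in ast.walk(tree) if isinstance(n, ast.If)]
print(len(branches))   # 1 -- the compiler sees the branch that tracing misses
```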

### 2021Q2

- Improve robustness of the flow's program capture, ideally to the level of
  `pytorch/xla`.
- Get into a steady state where adding operations is fairly mechanical.
- Support at least a few programs of community interest.
- Identify demand for a more holistic user experience, analogous to
  `pytorch/xla`. For example, building out support for the more runtime-y
  aspects like compiling on the fly, moving tensors in and out of the compiler
  system's runtime, etc. that makes it an actual user experience rather than
  just a way to get compiler IR.

## TorchScript compilation

The TorchScript compiler represents the bulk of the core compiler effort in the
npcomp project.
[TorchScript](https://pytorch.org/docs/stable/jit_language_reference.html) is a
restricted (more static) subset of Python, but even TorchScript is quite dynamic
compared to the needs of the lower levels of the compilation stack, especially
systems like Linalg. The overarching theme of this project is building out
compiler components that bridge that gap. As we do so, the recurring tradeoffs
are:

- user experience: we want a fairly unrestricted programming model -- that's
  what users like about PyTorch, and what enables users to deploy without
  significant modifications to their code.
- feasibility of the compiler: we want a smart compiler that is still feasible
  to implement (for our own sanity :) ).
- excellent generated code quality: this is of course dependent on the backend
  paired with the frontend we are building, but there are a number of
  transformations that make sense before we reach the backend and that strongly
  affect the quality of the code it generates.
To give a concrete example, consider the problem of inferring the shapes of
|
||||
tensors at various points in the program. The more precision we have on the
|
||||
shapes, the better code can be emitted by a backend. But in general, users need
|
||||
to provide at least some information about their program to help the compiler
|
||||
understand what shapes are at different points in the program. The smarter our
|
||||
compiler algorithms are, the less information the user needs to provide. Thus,
|
||||
all 3 facets are interlinked and there is no single right answer -- we need to
|
||||
balance them for a workable system.
|
||||
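
A toy sketch of this interplay (hypothetical; not the actual npcomp shape
inference pass): the pass propagates whatever shape information the user pins
down on the inputs, and unknown dimensions (`None`) flow through wherever
annotations are missing:

```python
def broadcast(a, b):
    """NumPy-style broadcasting of two shapes; None propagates as unknown."""
    out = []
    for x, y in zip(a[::-1], b[::-1]):   # align trailing dims
        if x == 1:
            out.append(y)
        elif y == 1:
            out.append(x)
        elif x is None or y is None:
            out.append(None)             # can't refine without user info
        else:
            assert x == y, "incompatible shapes"
            out.append(x)
    longer = a if len(a) >= len(b) else b
    return list(longer[: len(longer) - len(out)]) + out[::-1]

def infer(program, input_shapes):
    """Propagate shapes through a toy (result, op, lhs, rhs) op list."""
    shapes = dict(input_shapes)
    for result, op, lhs, rhs in program:
        if op in ("add", "mul"):         # elementwise ops broadcast
            shapes[result] = broadcast(shapes[lhs], shapes[rhs])
        elif op == "matmul":             # (m, k) x (k, n) -> (m, n)
            shapes[result] = [shapes[lhs][0], shapes[rhs][1]]
    return shapes

ops = [("t0", "matmul", "x", "w"), ("t1", "add", "t0", "b")]
# Fully annotated inputs give a fully static output shape:
print(infer(ops, {"x": [4, 8], "w": [8, 16], "b": [16]})["t1"])     # [4, 16]
# A partially annotated input leaves the batch dim unknown:
print(infer(ops, {"x": [None, 8], "w": [8, 16], "b": [16]})["t1"])  # [None, 16]
```

The more dimensions the user annotates, the more static the result -- which is
exactly the user-experience vs. code-quality tension described above.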

To accomplish this goal, we intend to be guided by a *model curriculum*, which
consists of programs of escalating complexity, from a simple elementwise
operation all the way to a full-blown end-to-end speech recognition program. Our
development process consists of setting incremental objectives to build out new
layers of the compiler to a satisfactory level on the easier programs in the
curriculum, and backfilling complexity as needed to extend to the harder
programs. Ideally, this backfilling does not require deep conceptual changes to
components, but is simply an application of extension points anticipated in the
original design. The trick to making that happen is evaluating designs on enough
programs from the curriculum to ensure that a solution is likely to generalize
and satisfy our objectives, without getting bogged down in theoretical details.
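
One way to picture the curriculum (a hypothetical sketch; the real curriculum
consists of actual PyTorch programs, and the feature names here are invented) is
as a list of programs paired with the compiler features each one exercises:

```python
# Each entry: (program name, set of compiler features it requires).
# Complexity escalates from elementwise ops to full speech recognition.
CURRICULUM = [
    ("elementwise_add",    {"aten-lowering"}),
    ("mlp",                {"aten-lowering", "shape-inference"}),
    ("resnet",             {"aten-lowering", "shape-inference", "control-flow"}),
    ("quantized_mlp",      {"aten-lowering", "shape-inference", "quantization"}),
    ("speech_recognition", {"aten-lowering", "shape-inference", "control-flow",
                            "lists", "mutable-state"}),
]

def runnable(supported_features):
    """Which curriculum programs a compiler with these features can handle."""
    return [name for name, needs in CURRICULUM if needs <= supported_features]

print(runnable({"aten-lowering", "shape-inference"}))  # ['elementwise_add', 'mlp']
```

Each incremental objective then amounts to growing `supported_features` until
the next program in the list becomes runnable.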

### 2021Q2

- Model curriculum
  - Formalize / publish curriculum to ease collaboration.
  - Incorporate end-to-end ASR (speech recognition) model into curriculum, or
    program of similar complexity.
  - Incorporate representative quantized models into curriculum.
- End-to-end execution of at least the simplest models in the curriculum.
- User annotation infrastructure for users to provide the compiler
  information, such as shapes to seed shape inference.
- Fill out ATen dialect and `aten-recognize-kernels` pass.
- ATen lowering to Linalg-on-tensors
  - Implement a minimal amount of linalg-inspired abstractions in the "TCF"
    dialect.
  - Extend the linalg
    [OpDSL tooling](https://llvm.discourse.group/t/rfc-linalg-opdsl/2966/6) to
    enable us to programmatically emit shape validity checks.
- Shape/dtype inference
  - As needed for other incremental objectives.
  - Build a clear picture of the right place(s) in the longer-term compiler
    pipeline for shape inference.
- Canonicalizations and general compiler optimizations
  - As needed for other incremental objectives.
- Backend choice: RefBackend or IREE candidates.
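
The "programmatically emit shape validity checks" objective can be sketched as
follows (a hypothetical toy, not the actual OpDSL API): given an op's symbolic
indexing signature, the checks fall out of binding each dimension symbol
consistently across operands:

```python
# An OpDSL-style signature for matmul: (m, k) x (k, n) -> (m, n).
MATMUL_SPEC = {"lhs": ("m", "k"), "rhs": ("k", "n"), "out": ("m", "n")}

def shape_checks(spec, shapes):
    """Bind each symbolic dim; any conflicting binding is a validity error."""
    bound = {}
    errors = []
    for operand, dims in spec.items():
        for sym, size in zip(dims, shapes[operand]):
            if sym in bound and bound[sym] != size:
                errors.append(
                    f"{operand}: dim '{sym}' = {size}, expected {bound[sym]}")
            bound.setdefault(sym, size)
    return errors

print(shape_checks(MATMUL_SPEC, {"lhs": (4, 8), "rhs": (8, 16), "out": (4, 16)}))  # []
# A mismatched contraction dim produces one error for 'k':
print(shape_checks(MATMUL_SPEC, {"lhs": (4, 8), "rhs": (9, 16), "out": (4, 16)}))
```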

### 2021Q3

- Start to smell a little production-ey
  - For the simplest models at least, get them running on IREE with performance
    competitive with other frontends.
  - Write initial "user manual" (and any supporting tools) for how to use the
    new frontend (+ backend integration points) to deploy something.
- Extend model support:
  - Vertically integrated spike building out generalized support for list,
    dict, etc. for representative complex models (co-design with RefBackend or
    IREE).
  - Implement coherent shape/dtype inference design based on Q2 insights.
- Scale up of Q2 compiler features to the curriculum
  - Extend user annotation infrastructure as needed.
  - ATen dialect and `aten-recognize-kernels` pass.
  - ATen --> Linalg lowerings.
  - Canonicalizations and other compiler optimizations.

### 2021Q2 Retrospective (added afterwards)

- Model curriculum
  - [✅] Formalize / publish curriculum to ease collaboration
    - Note: See `frontends/pytorch/e2e_testing/torchscript`.
  - [~] Incorporate end-to-end ASR (speech recognition) model into curriculum,
    or program of similar complexity.
    - Note: TorchScript'able machine translation model identified, but not
      formally added.
  - [✅] Incorporate representative quantized models into curriculum.
    - Note: See `frontends/pytorch/e2e_testing/torchscript/quantized_models.py`.
- End-to-end execution of at least the simplest models in the curriculum.
- [✅] User annotation infrastructure for users to provide the compiler
  information, such as shapes to seed shape inference.
  - Note: See `frontends/pytorch/csrc/builder/class_annotator.cpp` and
    `frontends/pytorch/python/torch_mlir/torchscript/annotations.py`.
- [✅] Fill out ATen dialect and `aten-recognize-kernels` pass.
  - Note: Accomplished with significant design shift. See
    [PR](https://github.com/llvm/mlir-npcomp/pull/214).
- [❌] ATen lowering to Linalg-on-tensors
  - Implement a minimal amount of linalg-inspired abstractions in the "TCF"
    dialect.
  - Extend the linalg
    [OpDSL tooling](https://llvm.discourse.group/t/rfc-linalg-opdsl/2966/6) to
    enable us to programmatically emit shape validity checks.
  - Note: Not enough programs / ops brought up yet to generalize.
- [✅] Shape/dtype inference
  - As needed for other incremental objectives.
  - Build a clear picture of the right place(s) in the longer-term compiler
    pipeline for shape inference.
    - Note: Need one major pass of this in the frontend at the torch level to
      get ranks and dtypes, and then one later pass at the linalg level to
      propagate specific sizes as much as possible.
- [✅] Canonicalizations and general compiler optimizations
  - As needed for other incremental objectives.
- [✅] Backend choice: RefBackend or IREE candidates.
  - Note: Used RefBackend this quarter due to simplicity of current programs +
    some blocking IREE issues.

## RefBackend

The npcomp reference backend (or "RefBackend") is perhaps the most confusing
part of the project, since it really has nothing to do per se with compiling
numerical Python programs. The RefBackend's biggest impact is really a strategic
play on two time horizons:

- Short-to-medium term: avoid bad design decisions by avoiding single-sourcing
  on IREE.
  - Although some key contributors to npcomp are closely affiliated with IREE,
    there is a distinct desire to honor the spirit of being an LLVM incubator
    and not have the npcomp project evolve into an extension of IREE. We also
    believe that this kind of design influence results in a better system in
    general.
- Medium-to-long term: give upstream MLIR a more "batteries included"
  end-to-end flow by incubating minimally opinionated components and
  upstreaming them.
  - Context: Due to history, all MLIR-based end-to-end flows of nontrivial
    capability live in downstream repositories, such as TensorFlow, IREE, etc.
    This leads to an awkward situation where sometimes code is added upstream,
    but no load-bearing use case can be exercised with upstream tools (such as
    quantifying performance, building auto-tuning infrastructure, etc.). This
    creates significant drag on MLIR's overall trajectory.

The way we intend to advance these two strategic goals is to incorporate
end-to-end execution on the RefBackend into the end-to-end execution milestones
of the acap_dispatch and TorchScript frontends.

### 2021Q2

- Build out support for PyTorch kernel fallback.
- Help Nicolas build out, and ideally land upstream, his linalg-on-tensors
  [e2e execution sandbox](https://github.com/google/iree/tree/main/experimental/runners),
  with an eye towards rebasing aspects of the RefBackend flow on those
  components.
- Build out better runtime calling convention interop.
- Start thinking about a plan to support list, dict, etc. in the runtime,
  ideally using MLIR infra to make it magically generalize and be minimally
  opinionated.

### 2021Q3

- Using the runtime abstractions built out for list, dict, etc., ditch the
  `memref`-based lowering flow and use the new primitives for the "top-level" of
  the program (use of memref should be isolated from e.g. top-level control
  flow, lifetime management, calling conventions, etc.).
- Use (or help build) upstream linalg-on-tensors abstractions analogous to
  IREE's `flow.dispatch.workgroups` (parallel computation grid) that
  linalg-on-tensors can directly fuse into, avoiding phase-ordering issues
  between fusion, bufferization, and kernel generation.

# Roadmap as of beginning of 2021Q3

## Project status overview

- TorchScript compilation: Significant work has gone into the TorchScript
  compilation workstream. Basic multi-layer perceptrons execute end-to-end, and
  significant strides have been taken towards ResNet and quantized programs.
  Additionally, a full TorchScript'able machine translation model (IDs to IDs,
  including beam search) has been identified as representative of the kind of
  challenging programs that the TorchScript ahead-of-time compilation flow will
  enable.

- `acap_dispatch`: Discussions with stakeholders in the npcomp and PyTorch
  community have shifted the `acap_dispatch` workstream to upstream discussions
  (see [bug](https://github.com/pytorch/xla/issues/2854)). Work within npcomp on
  `acap_dispatch` is temporarily on hold.

- RefBackend: The RefBackend workstream is temporarily on hold as well. The
  needs of the TorchScript compilation path are too complex (lists, dicts,
  error handling, runtime ABI) and the engineering resources too limited to
  meaningfully bring up an alternative backend. The decision going forward is
  to single-source on IREE as our needs become more complex. This is somewhat
  unfortunate, as the goal of the RefBackend was to de-risk the backend story
  and prevent single-sourcing on what at the time (~2020Q1) was perceived as a
  large external dependency. Somewhat mitigating this, though, is that in the
  intervening year IREE has become significantly "leaner and meaner": while
  still nontrivial, it has found a much more tightly scoped role that leans
  much more heavily on upstream infrastructure. In fact, inclusion of IREE in
  the LLVM project in some form now seems possible, which would make this
  dependency very natural.

## Non-technical project status overview

Community contributions have somewhat petered out due to the shifting focus of
the project. This was somewhat expected, as the early aspirations of the project
met the reality of available resourcing, ecosystem constraints, and a more
fine-grained understanding of stakeholder needs. We have, however, brought on
one new full-time engineer to work on the project.

## Roadmap overview

The project has converged on the TorchScript workstream as the primary effort:

- TorchScript compilation: The goal of this project is to build the frontend of
  a truly next-generation ahead-of-time machine learning compiler.
  - Why this project is cool: This system is designed from day 1 to support
    features such as dynamic shapes, control flow, mutable variables,
    program-internal state, and non-Tensor types (scalars, lists, dicts) in a
    principled fashion. These features are essential for empowering an
    industry-level shift in the set of machine learning programs that are
    feasible to deploy with minimal effort across many devices (when combined
    with a backend using the advanced compilation techniques being developed
    elsewhere in the MLIR ecosystem).

## TorchScript compilation

The TorchScript compiler represents the bulk of the core compiler effort in the
npcomp project.
[TorchScript](https://pytorch.org/docs/stable/jit_language_reference.html) is a
restricted (more static) subset of Python, but even TorchScript is quite dynamic
compared to the needs of the lower levels of the compilation stack, especially
systems like Linalg. The overarching theme of this project is building out
compiler components that bridge that gap. As we do so, the recurring tradeoffs
are:

- user experience: we want a fairly unrestricted programming model -- that's
  what users like about PyTorch, and what enables users to deploy without
  significant modifications to their code.
- feasibility of the compiler: we want a smart compiler that is still feasible
  to implement (for our own sanity :) ).
- excellent generated code quality: this is of course dependent on the backend
  paired with the frontend we are building, but there are a number of
  transformations that make sense before we reach the backend and that strongly
  affect the quality of the code it generates.

To give a concrete example, consider the problem of inferring the shapes of
tensors at various points in the program. The more precision we have on the
shapes, the better the code a backend can emit. But in general, users need to
provide at least some information about their program to help the compiler
understand what the shapes are at different points in the program. The smarter
our compiler algorithms are, the less information the user needs to provide.
Thus, all three facets are interlinked and there is no single right answer -- we
need to balance them for a workable system.

To accomplish this goal, we are guided by a *model curriculum*, which consists
of programs of escalating complexity, from a simple elementwise operation all
the way to a full-blown end-to-end speech recognition program. Our development
process consists of setting incremental objectives to build out new layers of
the compiler to a satisfactory level on the easier programs in the curriculum,
and backfilling complexity as needed to extend to the harder programs. Ideally,
this backfilling does not require deep conceptual changes to components, but is
simply an application of extension points anticipated in the original design.
The trick to making that happen is evaluating designs on enough programs from
the curriculum to ensure that a solution is likely to generalize and satisfy our
objectives, without getting bogged down in theoretical details.

### 2021Q3

- Theme: Scale up the programs we can run end-to-end.
  - End-to-end execution of ResNet.
  - Significant strides towards end-to-end execution of the identified
    end-to-end machine translation model.
  - End-to-end execution of simple programs with lists.
  - End-to-end execution of simple stateful programs.
  - Significant strides towards end-to-end execution of two major "classes of
    models". Tentatively: transformer, LSTM.
- Theme: Start feeling production-ey.
  - For the simplest programs at least, get them running on IREE with
    performance competitive with other frontends.
    - Stretch: Extend this result to ResNet.
  - Write initial "user manual" (and any supporting tools, packaging) for how
    to use the new frontend (+ backend integration points) to deploy something
    on IREE.
    - Redesign frontend APIs as needed to be palatable to document.

### 2021Q4

- Theme: Compiler becomes generally functional for a large class of programs.
  - End-to-end execution of the end-to-end MT (machine translation) program.
  - End-to-end execution of the two major "classes of models" added to the
    curriculum in Q3.
  - End-to-end execution of a quantized model.
  - Identify/build a TorchScript'able ASR (automatic speech recognition)
    program.
  - Significant strides towards end-to-end execution of ASR.
  - Bringing up new programs should be fairly quick and mechanical.
- Theme: Pathfind the next phase after initial compiler bringup.
  - Begin talks with potential users / applications to identify a useful
    "real" capstone project.
    - Goal: Demonstrate viability of the tools.
    - Goal: Start rallying support / interest more broadly.
  - Begin looking at training use cases.
  - Begin looking at building an "anti-framework" numerical Python compiler
    layered on our TorchScript compiler.

# 2021Q4 Roadmap

**NOTE**: Under construction. Feedback from the
[Torch-MLIR RFC](https://discuss.pytorch.org/t/torch-mlir-bridging-pytorch-and-mlir-ecosystems/133151)
is expected to influence this!

1. Demonstrate viability of our solutions on industry-standard workloads:
   - {inference, training} x {BERT-L, MaskRCNN, ResNet50}
1. One industry partner migrates (or begins to migrate) to Torch-MLIR for an e2e
   flow on their critical path.
1. Torch-MLIR is seen/accepted as an "off the shelf" solution for Torch/MLIR
   interop. This specifically covers "softer" / "quasi-technical" aspects of the
   project, such as community outreach/recognition, build system integration,
   testing, CI, packaging, documentation.
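
The workload matrix in the first item is a set product; enumerated (a purely
illustrative expansion), it works out to six workload tracks:

```python
from itertools import product

# Expand the {inference, training} x {BERT-L, MaskRCNN, ResNet50} matrix.
workloads = [f"{mode} / {model}"
             for mode, model in product(["inference", "training"],
                                        ["BERT-L", "MaskRCNN", "ResNet50"])]
for w in workloads:
    print(w)   # six mode/model combinations in total
```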