# How to Add Ops to Torch-Mlir
Collected links and contacts for how to add ops to torch-mlir.
## Turbine Camp: Start Here

This document was previously known as `turbine-camp.md` at Nod.ai, where "Turbine Camp" is part of the onboarding process: new Nod.ai folks learn the architecture of our work by adding support for 2 ops to torch-mlir. It now lives in torch-mlir because most of its content is about torch-mlir.

Written & maintained by @renxida.
Guides by other folks that were used during the creation of this document:
### Before you begin...
Nod-ai maintains the pipeline below, which allows us to take an ML model from e.g. Hugging Face and compile it, as an optimized `.vmfb` binary, for a variety of devices including llvm-cpu, rocm, cuda, and more.

- The pipeline begins with a Hugging Face model, or some other supported source like llama.cpp.
- nod-ai/SHARK-Turbine takes a Hugging Face model and exports a `.mlir` file.
- llvm/torch-mlir, which you will be working on in turbine-camp, lowers TorchScript, torch dialect, and torch aten ops further into a mixture of the `linalg` and `math` MLIR dialects (with other dialects occasionally in the mix).
- IREE converts the final `.mlir` file into a binary (typically `.vmfb`) for running on a device (llvm-cpu, rocm, vulkan, cuda, etc.).
The details of how we do it, along with helpful commands for setting up each repo, are in Sungsoon's Shark Getting Started Google Doc.
PS: IREE is pronounced Eerie, and hence the ghost icon.
### How to begin
- You will start by adding support for 2 ops in torch-mlir, to get familiar with the center of our pipeline. Begin by reading torch-mlir's documentation on how to implement a new torch op, and set up `llvm/torch-mlir` using https://github.com/llvm/torch-mlir/blob/main/docs/development.md
- Pick 1 of the yet-unimplemented ops from the following list. Choose something that looks easy to you. Make sure you create an issue by clicking the little "target" icon to the right of the op, thereby marking the op as yours.
- Implement it. For torch -> linalg, see the "How to TorchToLinalg" section below. For ONNX ops, see "How to TorchOnnxToTorch" below.
- Make a pull request and reference your issue. When the pull request is closed, also close your issue to mark the op as done.
## How to TorchToLinalg
You will need to do 4 things:
- Make sure the op exists in `torch_ods_gen.py`, then run `build_tools/update_torch_ods.sh`, and then build. This generates `GeneratedTorchOps.td`, which is used to generate the cpp and h files where op function signatures are defined.
  - Reference: torch op registry
- Make sure the op exists in `abstract_interp_lib_gen.py`, then run `build_tools/update_abstract_interp_lib.sh`, and then build. This generates `AbstractInterpLib.cpp`, which provides the shape and dtype inference functions for the op.
  - Reference: torch shape functions
- Write test cases. They live in `projects/pt1`. See the Dec 2023 example.
- Implement the op in one of the `lib/Conversion/TorchToLinalg/*.cpp` files.
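The abstract interp step above is easier to picture with a concrete example of what a shape function computes. The following is a standalone Python sketch, not actual torch-mlir code: `unary_shape` and `broadcast_shape` are made-up names for illustration, whereas real entries in `abstract_interp_lib_gen.py` follow a special aten naming convention and reuse shared helpers from `upstream_shape_functions` instead of inlining the logic like this.

```python
from typing import List

# Hypothetical sketch of the kind of shape-transfer functions registered in
# abstract_interp_lib_gen.py: given input shapes, compute the result shape.

def unary_shape(self: List[int]) -> List[int]:
    """Elementwise unary ops (tanh, neg, ...) keep the input shape."""
    return list(self)

def broadcast_shape(a: List[int], b: List[int]) -> List[int]:
    """Right-aligned PyTorch-style broadcasting for binary elementwise ops."""
    result: List[int] = []
    for i in range(max(len(a), len(b))):
        da = a[-1 - i] if i < len(a) else 1
        db = b[-1 - i] if i < len(b) else 1
        if da != 1 and db != 1 and da != db:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        result.append(max(da, db))
    return list(reversed(result))

print(broadcast_shape([3, 1, 5], [4, 1]))  # [3, 4, 5]
```

These functions are deliberately trivial; the point is that the abstract interp lib answers "what shape and dtype does this op produce?" without running the op.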
### Reference Examples
- A Dec 2023 example with the most up to date lowering
- Chi's simple example of adding an op lowering, with useful instructions and reference links in the comments to help you understand the op lowering pipeline in torch-mlir
### Resources
- how to set up torch-mlir: https://github.com/llvm/torch-mlir/blob/main/docs/development.md
- torch-mlir doc on how to debug and test: https://github.com/llvm/torch-mlir/blob/main/docs/development.md#testing
- torch op registry
- torch shape functions
## How to TorchOnnxToTorch
- Generate the big folder of ONNX IR. Use https://github.com/llvm/torch-mlir/blob/main/test/python/onnx_importer/import_smoke_test.py . Alternatively, if you're trying to support a certain model, convert that model to ONNX IR with:

  ```shell
  optimum-cli export onnx --model facebook/opt-125M fb-opt
  python -m torch_mlir.tools.import_onnx fb-opt/model.onnx -o fb-opt-125m.onnx.mlir
  ```

- Find an instance of the op that you're trying to implement inside the smoke tests folder or the generated model IR, and write a test case. Later you will save it to one of the files in `torch-mlir/test/Conversion/TorchOnnxToTorch`, but for now feel free to put it anywhere.
- Implement the op in `lib/Conversion/TorchOnnxToTorch/something.cpp`.
- Test the conversion by running `./build/bin/torch-mlir-opt -split-input-file -verify-diagnostics -convert-torch-onnx-to-torch your_mlir_file.mlir`. For more details, see https://github.com/llvm/torch-mlir/blob/main/docs/development.md#testing . Xida usually creates a separate MLIR file to test to his satisfaction before integrating it into one of the files at `torch-mlir/test/Conversion/TorchOnnxToTorch`.
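For the test-case step, a FileCheck test in `torch-mlir/test/Conversion/TorchOnnxToTorch` typically looks like the sketch below. The op (`onnx.Neg`), shapes, and attributes here are illustrative only; compare against the existing files in that folder for the exact conventions before submitting.

```mlir
// RUN: torch-mlir-opt <%s -convert-torch-onnx-to-torch -split-input-file -verify-diagnostics | FileCheck %s

// CHECK-LABEL: @test_neg
func.func @test_neg(%arg0: !torch.vtensor<[3,4,5],f32>) -> !torch.vtensor<[3,4,5],f32> attributes {torch.onnx_meta.opset_version = 13 : si64} {
  // CHECK: torch.aten.neg %arg0 : !torch.vtensor<[3,4,5],f32> -> !torch.vtensor<[3,4,5],f32>
  %0 = torch.operator "onnx.Neg"(%arg0) : (!torch.vtensor<[3,4,5],f32>) -> !torch.vtensor<[3,4,5],f32>
  return %0 : !torch.vtensor<[3,4,5],f32>
}
```

The `CHECK` lines assert that the conversion pass rewrote the `torch.operator` wrapper into the corresponding `torch.aten` op.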
Helpful examples:
## Contacts
People who've worked on this for a while
- Vivek (@vivek97 on discord)
- Chi.Liu@amd.com
Recent Turbine Camp Attendees, from recent to less recent
- Xida.ren@amd.com (@xida_ren on discord)
- Sungsoon.Cho@amd.com
## Links
- Tutorials
- Sungsoon's Shark Getting Started Google Doc
- This document contains commands that would help you set up shark and run demos
- How to implement ONNX op lowering
- Sungsoon's Shark Getting Started Google Doc
- Examples
- A Dec 2023 example with the most up to date lowering
- Chi's Example Lowering
- Github issue and code detailing how to implement the lowering of an op.
- Chi's simple example of adding an op lowering, with useful instructions and reference links in the comments to help you understand the op lowering pipeline in torch-mlir
- If you have questions, reach out to Chi on Discord
- Vivek's example of ONNX op lowering
- Find Ops To Lower
- Torch MLIR + ONNX Unimplemented Ops on Sharepoint
- If you don't have access yet, request it.
- nod-ai/SHARK-Turbine issues tracking op support
## Chi's useful commands for debugging torch-mlir
https://gist.github.com/AmosLewis/dd31ab37517977b1c499d06495b4adc2
## How to write test cases and test your new op
https://github.com/llvm/torch-mlir/blob/main/docs/development.md#testing
## How to set up VS Code and IntelliSense for torch-mlir
Xida: This is optional. If you're using VS code like me, you might want to set it up so you can use the jump to definition / references, auto fix, and other features.
Feel free to contact me on discord if you have trouble figuring this out.
You may need to write something like this into your `.vscode/settings.json` under torch-mlir:

```json
{
    "files.associations": {
        "*.inc": "cpp",
        "ranges": "cpp",
        "regex": "cpp",
        "functional": "cpp",
        "chrono": "cpp",
        "__functional_03": "cpp",
        "target": "cpp"
    },
    "cmake.sourceDirectory": ["/home/xida/torch-mlir/externals/llvm-project/llvm"],
    "cmake.buildDirectory": "${workspaceFolder}/build",
    "cmake.generator": "Ninja",
    "cmake.configureArgs": [
        "-DLLVM_ENABLE_PROJECTS=mlir",
        "-DLLVM_EXTERNAL_PROJECTS=\"torch-mlir\"",
        "-DLLVM_EXTERNAL_TORCH_MLIR_SOURCE_DIR=\"/home/xida/torch-mlir\"",
        "-DCMAKE_BUILD_TYPE=Release",
        "-DCMAKE_C_COMPILER_LAUNCHER=ccache",
        "-DCMAKE_CXX_COMPILER_LAUNCHER=ccache",
        "-DLLVM_ENABLE_PROJECTS=mlir",
        "-DLLVM_EXTERNAL_PROJECTS=torch-mlir",
        "-DLLVM_EXTERNAL_TORCH_MLIR_SOURCE_DIR=${workspaceFolder}",
        "-DMLIR_ENABLE_BINDINGS_PYTHON=ON",
        "-DLLVM_ENABLE_ASSERTIONS=ON",
        "-DLLVM_TARGETS_TO_BUILD=host",
    ],
    "C_Cpp.default.configurationProvider": "ms-vscode.cmake-tools",
    "cmake.configureEnvironment": {
        "PATH": "/home/xida/miniconda/envs/torch-mlir/bin:/home/xida/miniconda/condabin:/home/xida/miniconda/bin:/home/xida/miniconda/bin:/home/xida/miniconda/condabin:/home/xida/miniconda/bin:/home/xida/miniconda/bin:/home/xida/miniconda/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
    },
    "cmake.cmakePath": "/home/xida/miniconda/envs/torch-mlir/bin/cmake" // make sure this is a cmake that knows where your python is
}
```
The important things to note are `cmake.configureArgs`, which specifies the location of your torch-mlir, and `cmake.sourceDirectory`, which indicates that CMake should not build from the current directory and should instead build from `externals/llvm-project/llvm`.
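If you prefer configuring from the command line, the VS Code settings above roughly correspond to a configure invocation like the following. This is a sketch, not the authoritative build recipe; paths assume you run it from your torch-mlir checkout with the llvm-project submodule under `externals/`, and development.md remains the reference.

```shell
# Mirrors cmake.configureArgs / cmake.sourceDirectory from the settings above.
cmake -GNinja -Bbuild \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_PROJECTS=mlir \
  -DLLVM_EXTERNAL_PROJECTS=torch-mlir \
  -DLLVM_EXTERNAL_TORCH_MLIR_SOURCE_DIR="$PWD" \
  -DMLIR_ENABLE_BINDINGS_PYTHON=ON \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLLVM_TARGETS_TO_BUILD=host \
  externals/llvm-project/llvm
```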