How to Add Ops to Torch-Mlir
Collected links and contacts for how to add ops to torch-mlir.
Turbine Camp: Start Here
This document was previously known as `turbine-camp.md` at Nod.ai, where "Turbine Camp" is part of the onboarding process: new nod-ai folks learn about the architecture of our work by adding support for 2 ops to torch-mlir. Welcome to Turbine Camp. I decided to put this document into torch-mlir because a lot of it is about torch-mlir. Written & maintained by @renxida
Guides by other folks that were used during the creation of this document:
Before you begin...
Nod-ai maintains the pipeline below, which allows us to take an ML model from e.g. huggingface and compile it, as an optimized `vmfb` binary, for a variety of devices including llvm-cpu, rocm, cuda, and more.
- The pipeline begins with a huggingface model, or some other supported source like llama.cpp.
- nod-ai/SHARK-Turbine takes a huggingface model and exports a `.mlir` file.
- llvm/torch-mlir, which you will be working on in turbine-camp, will lower torchscript, torch dialect, and torch aten ops further into a mixture of `linalg` or `math` MLIR dialects (with occasionally other dialects in the mix).
- IREE converts the final `.mlir` file into a binary (typically `.vmfb`) for running on a device (llvm-cpu, rocm, vulkan, cuda, etc).
The details of how we do it, along with helpful commands for setting up each repo, are in Sungsoon's Shark Getting Started Google Doc.
PS: IREE is pronounced Eerie, and hence the ghost icon.
How to begin
- You will start by adding support for 2 ops in torch-mlir, to get you familiar with the center of our pipeline. Begin by reading torch-mlir's documentation on how to implement a new torch op, and set up `llvm/torch-mlir` using https://github.com/llvm/torch-mlir/blob/main/docs/development.md
- Pick 1 of the yet-unimplemented ops from the following. You should choose something that looks easy to you. Make sure you create an issue by clicking the little "target" icon to the right of the op, thereby marking the op as yours.
- Implement it. For torch -> linalg, see the How to TorchToLinalg section below. For ONNX ops, see How to TorchOnnxToTorch below.
- Make a pull request and reference your issue. When the pull request is closed, also close your issue to mark the op as done.
How to TorchToLinalg
You will need to do 4 things:
- Make sure the op exists in `torch_ods_gen.py`, then run `build_tools/update_torch_ods.sh`, and then build. This generates `GeneratedTorchOps.td`, which is used to generate the cpp and h files where ops function signatures are defined.
  - Reference torch op registry
- Make sure the op exists in `abstract_interp_lib_gen.py`, then run `build_tools/update_abstract_interp_lib.sh`, and then build. This generates `AbstractInterpLib.cpp`, which holds the shape and dtype functions used to infer the op's result shape and type.
  - Reference torch shape functions
- Write test cases. They live in `projects/pt1`. See the Dec 2023 example.
- Implement the op in one of the `lib/Conversion/TorchToLinalg/*.cpp` files.
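To make the shape-function step a bit more concrete, here is a minimal, self-contained sketch of the kind of logic such a function holds. The names `unary_shape` and `broadcast_shape` are hypothetical and exist only for illustration; real shape functions live in `abstract_interp_lib_gen.py` and have to follow that file's own naming and registration conventions, so treat this as a sketch of the idea and not as code to paste in.

```python
from typing import List


def unary_shape(self: List[int]) -> List[int]:
    """Hypothetical shape function for an elementwise unary op.

    An elementwise unary op produces a result with the same shape as
    its input, so the shape function just copies the input shape.
    """
    return list(self)


def broadcast_shape(lhs: List[int], rhs: List[int]) -> List[int]:
    """Hypothetical shape function for an elementwise binary op.

    Dimensions are aligned from the right; a missing dimension or a
    dimension of size 1 broadcasts against the other operand.
    """
    result: List[int] = []
    for i in range(max(len(lhs), len(rhs))):
        l = lhs[-1 - i] if i < len(lhs) else 1
        r = rhs[-1 - i] if i < len(rhs) else 1
        if l != r and l != 1 and r != 1:
            raise ValueError(f"cannot broadcast {lhs} with {rhs}")
        result.append(r if l == 1 else l)
    return result[::-1]


if __name__ == "__main__":
    print(unary_shape([2, 3, 4]))
    print(broadcast_shape([2, 1, 4], [3, 1]))
```

When you write the real thing, check the referenced torch shape functions first: many common patterns like these already exist upstream and only need to be called.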
Reference Examples
- A Dec 2023 example with the most up to date lowering
- Chi's simple example of adding op lowering, with useful instructions and reference links in the comments to help you understand the op lowering pipeline in torch-mlir
Resources:
- how to set up torch-mlir: https://github.com/llvm/torch-mlir/blob/main/docs/development.md
- torch-mlir doc on how to debug and test: https://github.com/llvm/torch-mlir/blob/main/docs/development.md#testing
- torch op registry
- torch shape functions
How to TorchOnnxToTorch
- Generate the big folder of ONNX IR. Use https://github.com/llvm/torch-mlir/blob/main/test/python/onnx_importer/import_smoke_test.py . Alternatively, if you're trying to support a certain model, convert that model to ONNX IR with

    optimum-cli export onnx --model facebook/opt-125M fb-opt
    python -m torch_mlir.tools.import_onnx fb-opt/model.onnx -o fb-opt-125m.onnx.mlir

- Find an instance of the op that you're trying to implement inside the smoke tests folder or the generated model IR, and write a test case. Later you will save it to one of the files in `torch-mlir/test/Conversion/TorchOnnxToTorch`, but for now feel free to put it anywhere.
- Implement the op in `lib/Conversion/TorchOnnxToTorch/something.cpp`.
- Test the conversion by running `./build/bin/torch-mlir-opt -split-input-file -verify-diagnostics -convert-torch-onnx-to-torch your_mlir_file.mlir`. For more details, see https://github.com/llvm/torch-mlir/blob/main/docs/development.md#testing . Xida usually creates a separate MLIR file to test it to his satisfaction before integrating it into one of the files at `torch-mlir/test/Conversion/TorchOnnxToTorch`.
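For the "find an instance of the op" step, a small script can save some scrolling through the generated IR. The sketch below assumes that imported ONNX ops show up in the `.mlir` text with their name quoted as `"onnx.<OpName>"` (for example inside a `torch.operator` line); verify that against your own generated files before relying on it. The function name `find_onnx_op` is hypothetical.

```python
import re
from pathlib import Path
from typing import List, Tuple


def find_onnx_op(root: str, op_name: str) -> List[Tuple[str, int, str]]:
    """Return (file, line number, line text) for each use of an ONNX op.

    Assumption: the op appears in the IR as the quoted string
    "onnx.<op_name>". The closing quote keeps "Abs" from matching
    longer names that merely start with "Abs".
    """
    pattern = re.compile(r'"onnx\.' + re.escape(op_name) + r'"')
    hits: List[Tuple[str, int, str]] = []
    for path in sorted(Path(root).rglob("*.mlir")):
        with open(path, encoding="utf-8", errors="replace") as f:
            for lineno, line in enumerate(f, start=1):
                if pattern.search(line):
                    hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_onnx_op.py <folder-of-mlir-files> <OpName>
    for file, lineno, text in find_onnx_op(sys.argv[1], sys.argv[2]):
        print(f"{file}:{lineno}: {text}")
```

Copying one of the hits, together with the function it sits in, into a scratch `.mlir` file is usually enough of a starting point for a test case.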
Helpful examples:
List of Tools you may need to use (this will be incorporated into the above instructions later)
- Generate FileCheck tests from MLIR test cases: `torch-mlir-opt -convert-<your conversion> /tmp/your_awesome_testcase.mlir | externals/llvm-project/mlir/utils/generate-test-checks.py`. Please don't just paste the generated tests; use them as a reference for writing your own.
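If you end up running these two commands a lot, wrapping them in a small helper can cut down on typos. The sketch below only uses the commands and flags already quoted in this document; the helper names `opt_command` and `draft_checks` and the default paths are assumptions, so adjust them to wherever your build and checkout actually live.

```python
import subprocess
import sys
from typing import List

# Assumed locations, relative to the root of a torch-mlir checkout.
OPT = "./build/bin/torch-mlir-opt"
CHECKS = "externals/llvm-project/mlir/utils/generate-test-checks.py"


def opt_command(mlir_file: str, conversion_flag: str,
                verify: bool = True, opt: str = OPT) -> List[str]:
    """Build the torch-mlir-opt invocation for one test file."""
    cmd = [opt]
    if verify:
        cmd += ["-split-input-file", "-verify-diagnostics"]
    cmd += [conversion_flag, mlir_file]
    return cmd


def draft_checks(mlir_file: str, conversion_flag: str,
                 opt: str = OPT, checks: str = CHECKS) -> str:
    """Run the conversion and pipe its output into generate-test-checks.py."""
    lowered = subprocess.run(
        opt_command(mlir_file, conversion_flag, verify=False, opt=opt),
        check=True, capture_output=True, text=True,
    ).stdout
    return subprocess.run(
        [sys.executable, checks],
        input=lowered, check=True, capture_output=True, text=True,
    ).stdout


if __name__ == "__main__":
    # Usage: python draft_checks.py your_mlir_file.mlir
    print(draft_checks(sys.argv[1], "-convert-torch-onnx-to-torch"))
```

The output of `draft_checks` is a draft to read and trim, in the same spirit as the note above about not pasting generated tests verbatim.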
Contacts
People who've worked on this for a while
- Vivek (@vivek97 on discord)
- Chi.Liu@amd.com
Recent Turbine Camp Attendees, from recent to less recent
- Xida.ren@amd.com (@xida_ren on discord)
- Sungsoon.Cho@amd.com
Links
- Tutorials
  - Sungsoon's Shark Getting Started Google Doc
    - This document contains commands that would help you set up shark and run demos
  - How to implement ONNX op lowering
- Examples
  - A Dec 2023 example with the most up to date lowering
  - Chi's Example Lowering
    - Github issue and code detailing how to implement the lowering of an op.
    - Chi's simple example of adding op lowering, with useful instructions and reference links in the comments to help you understand the op lowering pipeline in torch-mlir
    - If you have questions, reach out to Chi on Discord
  - Vivek's example of ONNX op lowering
- Find Ops To Lower
  - Torch MLIR + ONNX Unimplemented Ops on Sharepoint
    - If you don't have access yet, request it.
  - nod-ai/SHARK-Turbine issues tracking op support
Chi's useful commands for debugging torch mlir
https://gist.github.com/AmosLewis/dd31ab37517977b1c499d06495b4adc2
How to write test cases and test your new op
https://github.com/llvm/torch-mlir/blob/main/docs/development.md#testing
How to set up VS Code and IntelliSense for torch-mlir
Xida: This is optional. If you're using VS Code like me, you might want to set it up so you can use jump to definition / references, auto fix, and other features.
Feel free to contact me on discord if you have trouble figuring this out.
You may need to write something like this into your `.vscode/settings.json` under torch-mlir:
{
    "files.associations": {
        "*.inc": "cpp",
        "ranges": "cpp",
        "regex": "cpp",
        "functional": "cpp",
        "chrono": "cpp",
        "__functional_03": "cpp",
        "target": "cpp"
    },
    "cmake.sourceDirectory": ["/home/xida/torch-mlir/externals/llvm-project/llvm"],
    "cmake.buildDirectory": "${workspaceFolder}/build",
    "cmake.generator": "Ninja",
    "cmake.configureArgs": [
        "-DCMAKE_BUILD_TYPE=Release",
        "-DCMAKE_C_COMPILER_LAUNCHER=ccache",
        "-DCMAKE_CXX_COMPILER_LAUNCHER=ccache",
        "-DLLVM_ENABLE_PROJECTS=mlir",
        "-DLLVM_EXTERNAL_PROJECTS=torch-mlir",
        "-DLLVM_EXTERNAL_TORCH_MLIR_SOURCE_DIR=${workspaceFolder}",
        "-DMLIR_ENABLE_BINDINGS_PYTHON=ON",
        "-DLLVM_ENABLE_ASSERTIONS=ON",
        "-DLLVM_TARGETS_TO_BUILD=host"
    ],
    "C_Cpp.default.configurationProvider": "ms-vscode.cmake-tools",
    "cmake.configureEnvironment": {
        "PATH": "/home/xida/miniconda/envs/torch-mlir/bin:/home/xida/miniconda/condabin:/home/xida/miniconda/bin:/home/xida/miniconda/bin:/home/xida/miniconda/condabin:/home/xida/miniconda/bin:/home/xida/miniconda/bin:/home/xida/miniconda/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
    },
    "cmake.cmakePath": "/home/xida/miniconda/envs/torch-mlir/bin/cmake" // make sure this is a cmake that knows where your python is
}
The important things to note are the `cmake.configureArgs`, which specify the location of your torch-mlir checkout, and the `cmake.sourceDirectory`, which indicates that CMake should not build from the current directory and should instead build from `externals/llvm-project/llvm`.
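Since the settings above hardcode paths under `/home/xida`, here is a minimal sketch that derives the path-dependent entries from your own checkout location. The helper name `cmake_settings` is hypothetical, and it covers only the settings that depend on where torch-mlir lives; merge its output by hand into your `.vscode/settings.json` alongside the other entries.

```python
import json
from pathlib import PurePosixPath
from typing import Any, Dict


def cmake_settings(checkout: str, cmake_path: str = "cmake") -> Dict[str, Any]:
    """Build the path-dependent CMake Tools settings for a torch-mlir checkout.

    checkout:   absolute path of your torch-mlir clone (an assumption you supply)
    cmake_path: the cmake binary to use; pick one that can find your python
    """
    root = PurePosixPath(checkout)
    return {
        # Configure from the bundled LLVM tree, not from the repo root.
        "cmake.sourceDirectory": [str(root / "externals" / "llvm-project" / "llvm")],
        "cmake.buildDirectory": "${workspaceFolder}/build",
        "cmake.generator": "Ninja",
        "cmake.configureArgs": [
            "-DLLVM_ENABLE_PROJECTS=mlir",
            "-DLLVM_EXTERNAL_PROJECTS=torch-mlir",
            # Tell the LLVM build where the torch-mlir sources are.
            "-DLLVM_EXTERNAL_TORCH_MLIR_SOURCE_DIR=" + str(root),
        ],
        "cmake.cmakePath": cmake_path,
    }


if __name__ == "__main__":
    print(json.dumps(cmake_settings("/path/to/torch-mlir"), indent=4))
```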