mirror of https://github.com/llvm/torch-mlir
9ba77c6e13
This inlines global slots where possible, allowing them to participate in folding, canonicalization, shape inference, etc. Example use cases:

- inlining weights and biases that are read-only during inference
- inlining the "training" bool so that training-only code can fold away

For training use cases (especially an internal training loop), we will need something smarter to get good performance. That would look like an "SSA formation" that promotes the global slots to tensors in the program, flushing them back to the slots at the minimal number of necessary places. We might want to let backends do that transformation, though. This also interacts with shape inference (type bounds on the slots are needed to even lower them to backends in the first place).
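For concreteness, here is a minimal before/after sketch of inlining a read-only slot. The op spellings (`torch.global_slot`, `torch.global_slot.init`, `torch.global_slot.get`) and function syntax are approximations of the dialect's conventions, not IR taken from this commit:

```mlir
// Before: the value is hidden behind a slot, so nothing can fold.
torch.global_slot "private" @training : !torch.bool {
  %true = torch.constant.bool true
  torch.global_slot.init %true : !torch.bool
}
func.func @is_training() -> !torch.bool {
  %0 = torch.global_slot.get @training : !torch.bool
  return %0 : !torch.bool
}

// After inlining: the constant is visible to folding and
// canonicalization, and the now-unused slot can be deleted.
func.func @is_training() -> !torch.bool {
  %true = torch.constant.bool true
  return %true : !torch.bool
}
```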
Directory listing: IR, Transforms, CMakeLists.txt