* mac m1 cross compile
Add support for M1 cross compile
* Remove redundant ExecutionEngine
It is registered as part of RegisterEverything
* nuke non-universal zstd
disable LTC
The torch-mlir bazel build is [failing](https://github.com/llvm/torch-mlir/runs/7737425906?check_suite_focus=true) since [this commit](504de5e701) due to a linker failure (undefined symbol: `mlir::torch::Torch::createEraseModuleInitializerPass()`).
```
ERROR: /home/runner/.cache/bazel/_bazel_runner/db599744cd37f8c161e5034d9b9cd520/external/torch-mlir/BUILD.bazel:845:10: Linking external/torch-mlir/torch-mlir-opt failed: (Exit 1): clang failed: error executing command /usr/lib/llvm-11/bin/clang @bazel-out/k8-fastbuild/bin/external/torch-mlir/torch-mlir-opt-2.params
Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
ld.lld: error: undefined symbol: mlir::torch::Torch::createEraseModuleInitializerPass()
>>> referenced by Passes.cpp
>>> bazel-out/k8-fastbuild/bin/external/torch-mlir/_objs/TorchMLIRTorchPasses/Passes.pic.o:(mlir::torch::Torch::createTorchFunctionToTorchBackendPipeline(mlir::OpPassManager&, mlir::torch::Torch::TorchLoweringPipelineOptions const&))
>>> referenced by Passes.cpp
>>> bazel-out/k8-fastbuild/bin/external/torch-mlir/_objs/TorchMLIRTorchPasses/Passes.pic.o:((anonymous namespace)::registerEraseModuleInitializerPass()::'lambda'()::operator()() const)
clang: error: linker command failed with exit code 1 (use -v to see invocation)
```
This PR adds `lib/Dialect/Torch/Transforms/EraseModuleInitializer.cpp` to `TorchMLIRTorchPasses` library.
At the moment we don't gate torch-mlir PRs with bazel builds. This means bazel builds don't get run on open PRs, and so there's no good way to validate a fix PR which is meant to fix a broken bazel build. This option allows a bazel build to be manually triggered as needed on open PRs.
With llvm/llvm-project@112499f landed, `MLIR_TABLEGEN_EXE` is given as a
cache variable in the MLIR core project. Other external projects, such
as TORCH-MLIR, should not set the variable as this breaks
cross-compilation.
The Torch dialect has an include to `mlir/Dialect/Func/IR/FuncOps.h` and
should therefore have a CMake dependency to the MLIRFuncDialect.
Otherwise, the build can fail since it may occur that
`mlir/Dialect/Func/IR/FuncOps.h.inc` isn't generated yet.
Summary of changes:
- Switch to C++17 (similar to https://reviews.llvm.org/D131348)
- Update MHLO to build with LLVM commit hash 061e0189
- Replace deprecated `hasValue()` and `getValue()` with `has_value()`
and `value()` respectively (https://reviews.llvm.org/D131349)
- Use `TypedAttr` (https://reviews.llvm.org/D130092)
- Use updated assembly format of `mhlo.compare` op (commit
d03ef01e70fbf9afd0fa1976fbb7ed31838929b3 in MHLO repo)
Rather than a per-global-slot initializer region, we now have one for
the whole module. For example, it might look like this:
```
torch.global_slot "private" @tensor : !torch.tensor
torch.global_slot "private" @list : !torch.list<tensor>
torch.global_slot.module_initializer {
%0 = torch.tensor.literal(dense<0.0> : tensor<f32>) : !torch.tensor
%1 = torch.prim.ListConstruct %0 : (!torch.tensor) -> !torch.list<tensor>
torch.initialize.global_slots [
@tensor(%0 : !torch.tensor)
@list(%1 : !torch.list<tensor>)
]
}
```
This new structure allows GlobalizeObjectGraph to create the initializer in a
much simpler way, avoiding the need to reason about whether different slots
alias each other. Reasoning about whether slots alias each other now is the
responsibility of InlineGlobalSlots, which has to do a much more complicated
analysis, implemented using MLIR's dataflow analysis framework.
Recommended review order:
- Check out the new IR constructs in the .mlir files of various passes
- Op definitions (*.td)
- Changes to GlobalizeObjectGraph pass.
- InlineGlobalSlots pass (~total rewrite)
- Misc changes:
- Moving torchMlirAdjustStaticInformation for sharing with C++ code.
- EraseModuleInitializer pass
To make this a bit nicer, it would be good to have a `torch.module` op
with an initializer region attached. That would be more invasive though.
This change has highlighted certain aspects of our project layering
which are worth calling out. None of our backends can handle global
slots, so we enforce that there are no global slots before backend
lowering. At an earlier stage in the project, we had aspirations of
transparently handling mutable global state and such, but for reasons
described below, that is no longer a goal. So really global slots should
be seen as a progressive lowering step as part of inlining all the
IValue's in the original program (GlobalizeObjectGraph is also one such
step).
Over time, with insights from work like IREE-JAX, it has become clear
that there isn't a reliable programming model we can compile for users
where we just transparently handle mutable global state (and some other
things, like lists and dictionaries). There is a need for an "outer
program" that orchestrates more restricted subroutines of the kind we
can handle in our compile flow here. The benefit of that is that it
decouples considerations like shapes, dtypes, etc. from the program
constructs used in the outer program. As long as the outer program can
efficiently invoke (pipelining/async/etc.) high-performance
data-parallel numerical subroutines of the kind we compile in our flow
here, then there is a complete programming model. This is also
consistent with the direction of upstream PyTorch which is becoming more
tracing-based (which inherently loses a lot of program structure, which
then has to be applied back with an "outer program" orchestrating the
traced subroutines).
follow up #761:
This patch updates the `torch_mlir::convertTensorToMlirElementsAttr()`
method to enable the creation of tensors whose base type is Float16.
This patch also adds a test to validate the IR generation, and it
updates the test for importing tensors of various types.
PyTorch recently added support for `dim=None` in the `torch.var`
(5ca9b2b6fa)
and `torch.std`op (eb0e30e0bc).
This commit adds the corresponding support in torch-mlir.
Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>
In some cases, users know that a traced graph is valid for a wider set
of shapes than they originally traced it with. Provide an option for
users to ignore the shapes in the traced graph when they know it is
legal.
Fixes#997
* [MHLO] Support for dynamic shape in basic op conversion by introducing CHLO dialect
Co-authored-by: Bairen Yi <yibairen.byron@bytedance.com>
Co-authored-by: Jiawei Wu <xremold@gmail.com>
Co-authored-by: Tianyou Guo <tianyou.gty@alibaba-inc.com>
Co-authored-by: Xu Yan <yancey.yx@alibaba-inc.com>
Co-authored-by: Ziheng Jiang <ziheng.jiang@bytedance.com>
* [MHLO] Support I32 as shape tensor dtype
* [NFC] Add a 'TODO' annotation
* Replace CHECK_EQ with TORCH_CHECK_EQ
* Check value of TORCH_MLIR_USE_INSTALLED_PYTORCH during LTC build
* Update LTC XFAIL with NewZerosModule ops
* Explicitly blacklist _like ops
* Automatically blacklist new_/_like ops
* Prune away unused Python dependencies from LTC
* Add flag to disable LTC
* Autogen dummy _REFERENCE_LAZY_BACKEND library when LTC is disabled
* Implement compute_shape_var
* Removed Var tests from XFAIL Set
* XFAIL tests using _local_scalar_dense or index.Tensor
* Add StdDim tests to XFAIL set
* Autogen aten::cat