torch-mlir/build_tools/torchscript_e2e_heavydep_tests/main.py

49 lines
1.6 KiB
Python
Raw Normal View History

# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
# Also available under a BSD-style license. See LICENSE.
Add E2E support for tests with heavy dependencies (heavydep tests). The tests use the same (pure-Python) test framework as the normal torchscript_e2e_test.sh, but the tests are added in `build_tools/torchscript_e2e_heavydep_tests` instead of `frontends/pytorch/e2e_testing/torchscript`. Any needed dependencies can easily be configured in generate_serialized_tests.sh. We add an initial machine translation model with a complex set of dependencies to seed the curriculum there. I verified that this model gets to the point of MLIR import (it fails there with a segfault due to not being able to import the "Any" type). This required moving a few files from the `torch_mlir` Python module into multiple modules to isolate the code that depends on our C++ extensions (which now live in `torch_mlir` and `torch_mlir_torchscript_e2e_test_configs`) from the pure Python code (which now lives in `torch_mlir_torchscript`). This is an entirely mechanical change, and lots of imports needed to be updated. The dependency graph is: ``` torch_mlir_torchscript_e2e_test_configs / | / | / | V V torch_mlir_torchscript torch_mlir ``` The `torch_mlir_torchscript_e2e_test_configs` are then dependency-injected into the `torch_mlir_torchscript` modules to successfully assemble a working test harness (the code was already structured this way, but this new file organization allows the isolation from C++ code to actually happen). This isolation is critical to allowing the serialized programs to be transported across PyTorch versions and for the test harness to be used seamlessly to generate the heavydep tests. Also: - Extend `_Tracer` class to support nested property (submodule) accesses. Recommended review order: - "user-level" docs in README.md - code in `build_tools/torchscript_e2e_heavydep_tests`. - changes in `torch_mlir_torchscript/e2e_test/framework.py` - misc mechanical changes.
2021-07-10 03:22:45 +08:00
import argparse
import os
import pickle
import torch
from torch_mlir_e2e_test.torchscript.registry import GLOBAL_TEST_REGISTRY
from torch_mlir_e2e_test.torchscript.framework import SerializableTest, generate_golden_trace
from torch_mlir_e2e_test.torchscript.annotations import extract_serializable_annotations
Add E2E support for tests with heavy dependencies (heavydep tests). The tests use the same (pure-Python) test framework as the normal torchscript_e2e_test.sh, but the tests are added in `build_tools/torchscript_e2e_heavydep_tests` instead of `frontends/pytorch/e2e_testing/torchscript`. Any needed dependencies can easily be configured in generate_serialized_tests.sh. We add an initial machine translation model with a complex set of dependencies to seed the curriculum there. I verified that this model gets to the point of MLIR import (it fails there with a segfault due to not being able to import the "Any" type). This required moving a few files from the `torch_mlir` Python module into multiple modules to isolate the code that depends on our C++ extensions (which now live in `torch_mlir` and `torch_mlir_torchscript_e2e_test_configs`) from the pure Python code (which now lives in `torch_mlir_torchscript`). This is an entirely mechanical change, and lots of imports needed to be updated. The dependency graph is: ``` torch_mlir_torchscript_e2e_test_configs / | / | / | V V torch_mlir_torchscript torch_mlir ``` The `torch_mlir_torchscript_e2e_test_configs` are then dependency-injected into the `torch_mlir_torchscript` modules to successfully assemble a working test harness (the code was already structured this way, but this new file organization allows the isolation from C++ code to actually happen). This isolation is critical to allowing the serialized programs to be transported across PyTorch versions and for the test harness to be used seamlessly to generate the heavydep tests. Also: - Extend `_Tracer` class to support nested property (submodule) accesses. Recommended review order: - "user-level" docs in README.md - code in `build_tools/torchscript_e2e_heavydep_tests`. - changes in `torch_mlir_torchscript/e2e_test/framework.py` - misc mechanical changes.
2021-07-10 03:22:45 +08:00
from . import basic_mt
from . import minilm_sequence_classification
Add E2E support for tests with heavy dependencies (heavydep tests). The tests use the same (pure-Python) test framework as the normal torchscript_e2e_test.sh, but the tests are added in `build_tools/torchscript_e2e_heavydep_tests` instead of `frontends/pytorch/e2e_testing/torchscript`. Any needed dependencies can easily be configured in generate_serialized_tests.sh. We add an initial machine translation model with a complex set of dependencies to seed the curriculum there. I verified that this model gets to the point of MLIR import (it fails there with a segfault due to not being able to import the "Any" type). This required moving a few files from the `torch_mlir` Python module into multiple modules to isolate the code that depends on our C++ extensions (which now live in `torch_mlir` and `torch_mlir_torchscript_e2e_test_configs`) from the pure Python code (which now lives in `torch_mlir_torchscript`). This is an entirely mechanical change, and lots of imports needed to be updated. The dependency graph is: ``` torch_mlir_torchscript_e2e_test_configs / | / | / | V V torch_mlir_torchscript torch_mlir ``` The `torch_mlir_torchscript_e2e_test_configs` are then dependency-injected into the `torch_mlir_torchscript` modules to successfully assemble a working test harness (the code was already structured this way, but this new file organization allows the isolation from C++ code to actually happen). This isolation is critical to allowing the serialized programs to be transported across PyTorch versions and for the test harness to be used seamlessly to generate the heavydep tests. Also: - Extend `_Tracer` class to support nested property (submodule) accesses. Recommended review order: - "user-level" docs in README.md - code in `build_tools/torchscript_e2e_heavydep_tests`. - changes in `torch_mlir_torchscript/e2e_test/framework.py` - misc mechanical changes.
2021-07-10 03:22:45 +08:00
def _get_argparse():
parser = argparse.ArgumentParser(
description="Generate assets for TorchScript E2E tests")
parser.add_argument("--output_dir", help="The directory to put assets in.")
return parser
def main():
args = _get_argparse().parse_args()
serializable_tests = []
for test in GLOBAL_TEST_REGISTRY:
trace = generate_golden_trace(test)
module = torch.jit.script(test.program_factory())
torchscript_module_bytes = module.save_to_buffer({
"annotations.pkl":
pickle.dumps(extract_serializable_annotations(module))
})
serializable_tests.append(
SerializableTest(unique_name=test.unique_name,
program=torchscript_module_bytes,
trace=trace))
for test in serializable_tests:
with open(os.path.join(args.output_dir, f"{test.unique_name}.pkl"),
"wb") as f:
pickle.dump(test, f)
if __name__ == "__main__":
main()