Commit Graph

1065 Commits (b7082a8d4ec1168f05271b6f16f592932e14c640)
 

Author SHA1 Message Date
Stella Laurenzo 3d74337be0 Add a torch.kernel_call op and associated predicates. 2020-09-29 15:10:38 -07:00
Stella Laurenzo ba03ecc652 Add public API for constructing a module/function to capture PyTorch ops.
* Uses the MLIR-C API since that will save us a lot of grief down the road (i.e. will give PyTorch and libMLIR/libNPCOMP the ability to skew version-wise).
* Quite a few TODOs and not yet populating the function in any way.
2020-09-29 14:23:22 -07:00
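
As a concrete flavor of the above, a minimal sketch assuming the present-day MLIR C API surface (mlir-c/IR.h); the actual capture API added in this commit is not shown:

  #include "mlir-c/IR.h"

  int main() {
    // Create a context and an empty module; a capture API would then build a
    // function into the module body as PyTorch ops are recorded.
    MlirContext ctx = mlirContextCreate();
    MlirLocation loc = mlirLocationUnknownGet(ctx);
    MlirModule module = mlirModuleCreateEmpty(loc);
    mlirModuleDestroy(module);
    mlirContextDestroy(ctx);
    return 0;
  }
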
Stella Laurenzo 9722a6ce78 Bump LLVM to e72d792c147ee506e337401e20c0f23042cc43fe.
* Does not bump mhlo, as an upstream integrate on that project has not taken place and there is no incompatibility.
2020-09-28 15:34:01 -07:00
Stella Laurenzo 2c9ca79c89 Add boilerplate for Torch dialect. 2020-09-28 15:26:17 -07:00
Stella Laurenzo fb895173f2 Run format_sources.sh. 2020-09-28 12:04:24 -07:00
Stella Laurenzo b5f010284f Add boilerplate to do device capture (pytorch 1.6).
* Uses the new dispatcher API.
* Just prints to the console for the moment when an op is captured.
* Executes the op through the existing implementation.
2020-09-28 10:30:54 -07:00
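
A hedged sketch of this kind of device capture with the PyTorch dispatcher API (a boxed fallback registered on a spare dispatch key; captureFallback is an illustrative name, and forwarding to the existing implementation is elided):

  #include <torch/library.h>
  #include <ATen/core/dispatch/Dispatcher.h>
  #include <iostream>

  // Print every op routed to the PrivateUse1 key. The commit additionally
  // executes the op through the existing implementation, which is elided here.
  static void captureFallback(const c10::OperatorHandle &op,
                              torch::jit::Stack *stack) {
    std::cout << "captured op: " << op.schema().name() << "\n";
  }

  TORCH_LIBRARY_IMPL(_, PrivateUse1, m) {
    m.fallback(torch::CppFunction::makeFromBoxedFunction<&captureFallback>());
  }
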
Sean Silva 16c26ef57e [RefE2E] Use upstream shape constraint conversion pass.
Now that our pass has landed upstream, the local copy can be removed.
The final pass that landed upstream doesn't do the shape.assuming
canonicalization to legalize that op away, so this adds a
restricted-canonicalizer pass that allows running just the shape dialect
canonicalizations, which delete the shape.assuming.
The pass ended up kind of ugly. See the TODOs on it for some potentially
cleaner directions.
2020-09-28 09:34:44 -07:00
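
A sketch of what such a restricted canonicalizer can look like with present-day MLIR APIs (names have shifted since 2020; this is not the project's actual pass):

  #include "mlir/IR/PatternMatch.h"
  #include "mlir/Pass/Pass.h"
  #include "mlir/Transforms/GreedyPatternRewriteDriver.h"

  using namespace mlir;

  struct RestrictedShapeCanonicalizer
      : public PassWrapper<RestrictedShapeCanonicalizer, OperationPass<>> {
    void runOnOperation() override {
      MLIRContext *ctx = &getContext();
      RewritePatternSet patterns(ctx);
      // Collect canonicalization patterns only for ops in the shape dialect,
      // then apply them greedily; this is enough to delete shape.assuming.
      for (RegisteredOperationName opName : ctx->getRegisteredOperations())
        if (opName.getDialectNamespace() == "shape")
          opName.getCanonicalizationPatterns(patterns, ctx);
      (void)applyPatternsAndFoldGreedily(getOperation(), std::move(patterns));
    }
  };
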
Sean Silva 6ea37cfed6 Bump llvm-project to 9ed1e5873c19eb817fb9e36d0262c7effee5d35e
Date:   Fri Sep 18 13:55:52 2020 -0700

- Update to linalg syntax
- New generated builders are better. Custom builder for
tcp.shaped_results is now redundant.
2020-09-28 09:34:44 -07:00
Sean Silva f9b37c55b7 [RefE2E] Add support for unary ops exp and tanh
This is fairly mechanical.
2020-09-24 18:41:30 -07:00
Sean Silva 6b69beae6a [NFC] Remove stray .dump() that snuck in. 2020-09-24 18:41:30 -07:00
Stella Laurenzo 0cb28f0b06 Move tests around so we can have dedicated tests for the c10 dispatcher.
* Adds a trivial missing test for _torch_mlir.c10.get_registered_ops()
* Disables the regression tests for now on c10 (until implemented).
2020-09-24 18:28:06 -07:00
Stella Laurenzo 6e6efb2854
Add compatibility notes regarding unpacking quantized weights. (#56)
Co-authored-by: Bryce Arden <arden.bryce@gmail.com>
2020-09-24 17:47:28 -07:00
Stella Laurenzo 47c3a9f461 Add docker image/instructions for building against pytorch 1.6. 2020-09-24 17:40:25 -07:00
Stella Laurenzo 0d91885965
Add initial python bindings for c10 dispatcher internals. (#55)
* Exposes the op registry via a get_registered_ops method.
* Moves the aten dialect generation scripts in prep for integrating them with this facility.
2020-09-24 16:26:29 -07:00
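
A hypothetical sketch of the binding shape described above (the _torch_mlir and c10 names follow the _torch_mlir.c10.get_registered_ops() test mentioned elsewhere in this log; the actual traversal of dispatcher internals is elided):

  #include <pybind11/pybind11.h>
  #include <pybind11/stl.h>
  #include <string>
  #include <vector>

  namespace py = pybind11;

  // Illustrative only: return the names of operators known to the c10
  // dispatcher. The real walk of c10::Dispatcher::singleton() is elided.
  static std::vector<std::string> getRegisteredOps() {
    std::vector<std::string> names;
    // ... query the dispatcher's operator registry and push schema names ...
    return names;
  }

  PYBIND11_MODULE(_torch_mlir, m) {
    auto c10m = m.def_submodule("c10", "c10 dispatcher introspection");
    c10m.def("get_registered_ops", &getRegisteredOps);
  }

From Python this would surface as _torch_mlir.c10.get_registered_ops().
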
Sean Silva c69e9fabc5 [RefE2E] Add support for "max".
This cleans up the lowering pipeline to easily allow extending to
multiple binary ops. It looks fairly repetitive at multiple levels, but
I don't want to prematurely generalize. I think that in principle we
could derive a large swath of TCF + TCP from a single linalg-style
specification. Another direction is to use an OpInterface (something
like "buildLinalgGenericBody"). I'm keeping my eye on it.

In a subsequent commit, I'll mechanically add a set of binary ops
modeled off of the std arithmetic ops.
2020-09-22 18:38:32 -07:00
Marius Brehler 681c4e1d4a Inject missing dialects in E2E passes 2020-09-22 08:52:23 +02:00
Sean Silva 7b7f35744b [RefE2E] Add interesting control flow example.
This also required adding a lowering for ForOp in our tensor->memref
conversion.
2020-09-21 12:25:24 -07:00
Stella Laurenzo bc7c852379 Add more ops from the original integration.
* Still need to add a systematic mechanism for discovering gradient ops.
* Work needed on the various _ suffixed inplace ops.
* Other random ops are still not mapped.
* Outside of this commit, I have enough commented/reworked to roughly build, but it will take another handful of commits to get going.
2020-09-18 19:11:18 -07:00
Sean Silva 276f5b80ea [RefE2E] Add assemblyFormat for TCF and TCP ops and tidy up. 2020-09-18 15:03:53 -07:00
Sean Silva dc8afc9271 [RefE2E] Refactor how tcf.add is lowered.
It was previously going through this awkward route that prematurely
created linalg.generic ops, which was an annoying layering problem since
we can't compute a shape transfer function for linalg.generic in the
general case. Now we pass it through the same path as tcp.matmul, with
the shape transfer function being defined for tcp.add.

This also removed the need for TCPToLinalg (now deleted). The equivalent
of that is happening in lower-shaped-results-to-memref. One interesting
outcome of this: we're basically using linalg as a "Buffer TCP". We
might want to look into using named structured ops for more of TCP, but
that would be a big velocity hit since then any change to the ODS /
verification for those ops would be a change to the upstream structured
op ODS generator. After we have more experience defining this manually,
we should re-evaluate rebasing TCP on generated named linalg ops.
2020-09-18 15:03:53 -07:00
Sean Silva d8675f8ad2 [RefE2E] Add support for matmul.
I'm pretty happy with how this turned out. It looks pretty much like it
should -- one change at each layer. This particular op bottoms out on
linalg which takes care of the rest.

- Add tcf.matmul
- Add tcp.matmul
- Add TCF->TCP lowering
- Add tcp.matmul shape transfer function (BypassShapes.cpp)
- Add tcp.matmul -> linalg.matmul lowering (LowerShapedResultsToMemref.cpp)
- Add support to LowerShapeConstraints for lowering the new
shape.cstr_require

This matmul op is pretty limited in its capabilities. There is no
batching and no multidimensional contraction. Certainly more design work
will be needed to find the right abstractions that aren't too general
but also help to canonicalize many cases from frontends. This is mainly
to show that adding a new op needn't be very "scary" once we have the
e2e infra in place.

Also,
- this clears out some exploratory cruft from the TCF dialect now that
this is starting to become real.
2020-09-18 11:31:01 -07:00
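
A rough sketch of the TCF->TCP rewrite step for matmul (op class names and the pattern body are illustrative of ODS-generated classes, not verbatim project code; the shape.cstr_require reification is elided):

  #include "mlir/IR/PatternMatch.h"
  // ... plus the project's TCF/TCP dialect headers ...

  using namespace mlir;

  // Replace a tcf.matmul with a tcp.matmul carrying the same operands and
  // result type. The real lowering also emits the contraction-dimension
  // constraint as shape.cstr_require before rewriting.
  struct LowerMatmul : public OpRewritePattern<tcf::MatmulOp> {
    using OpRewritePattern::OpRewritePattern;
    LogicalResult matchAndRewrite(tcf::MatmulOp op,
                                  PatternRewriter &rewriter) const override {
      rewriter.replaceOpWithNewOp<tcp::MatmulOp>(
          op, op->getResult(0).getType(), op->getOperand(0), op->getOperand(1));
      return success();
    }
  };
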
Sean Silva 62738d3641 [RefE2E] Fix nul-termination bug.
I was seeing some of the error messages come out with some garbage at
the end. This fixes it.
2020-09-18 11:31:01 -07:00
Sean Silva 2284f6b4f1 Bump llvm-project to 7c44651360dd94e17011fd1cd7ec3c755e0363b4
Date:   Thu Sep 17 18:16:41 2020 -0700
2020-09-18 11:31:01 -07:00
Sean Silva 7486befffd Fixes for run_lit.sh
- new build directory layout
- build NPCOMPNativePyExt, now that lit tests use it
2020-09-18 11:31:01 -07:00
Stella Laurenzo 361abebb51 Update README to reference published docker tag. 2020-09-16 23:12:05 -07:00
Stella Laurenzo 8ac29594df
Explicitly load aten and std dialects when constructing a context. (#47)
* This gets the pytorch frontend broadly working; what remains appears to be legitimate failures in 9 tests.
* Errors noted in #46
2020-09-16 23:06:22 -07:00
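
A small sketch of the mechanism: with upstream's opt-in dialect loading, the frontend must load the dialects it emits before creating ops from them (the aten dialect class name below is illustrative; StandardOpsDialect reflects MLIR of this era):

  #include "mlir/IR/MLIRContext.h"
  #include "mlir/Dialect/StandardOps/IR/Ops.h"

  void prepareContext(mlir::MLIRContext &context) {
    // Load dialects explicitly; otherwise creating their ops fails.
    context.loadDialect<mlir::StandardOpsDialect>();
    // context.loadDialect<NPCOMP::aten::ATenDialect>();  // illustrative name
  }
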
Stella Laurenzo 678989a321
Update docker, instructions and some fixes for the pytorch 1.3 build. (#45)
* Includes pybind11 directly (for some reason using the pytorch helper header for this depends on a source file not in the image).
* Installs nnpack into the image.
* Installs new-clang and LLD and configures the environment to use them (otherwise, link time is terrible).
* Fixes a gcc compile error (in the off chance you build with the default gcc compiler).
* Tests are failing based on some dialect registration stuff that must not have been factored correctly. Will follow up with a fix.
2020-09-16 21:57:46 -07:00
Sean Silva 75f57b461e
Totally rework RefE2E tensor to memref flow. (#42)
This now gets the overall "RefE2E" compilation stack to a point that I'm
fairly happy with. We simplify it by mostly embracing the "descriptor"
view of the world.

The overall flow is best understood by reading through the
createE2ELoweringPipeline function in lib/E2E/E2E.cpp
That function creates a pass pipeline that lowers from "TCF" (which is
~numpy level of abstraction) down to LLVM IR.

A brief high-level summary of what happens there:

1. TCF to TCP conversion. This involves reifying error handling in the
form of shape constraints. See test/Conversion/TCFToTCP/basic.mlir

2. Lowering shape constraints. This converts shape constraints into
eager error-handling code. See test/E2E/lower-shape-constraints.mlir
This pass will soon go upstream.
Because this lowers to std.assert, some later passes like
LowerToNpcomprtABI and LowerToLLVM are updated to properly plumb this
through e2e.
See test/npcomp-run-mlir/invalid-broadcast.mlir for an execution test
that properly aborts in case of an error.

3. Lowering tensors to memrefs. This is done via a series of passes
rather than a single mega conversion. Unlike the previous code that
mixed in the npcomprt ABI stuff here, it's now a very clean "pure
memref" conversion.
See test/E2E/lower-*-to-memref.mlir and
lib/E2E/TensorToMemref/
Most of the changes are concentrated here.

4. As part of the above, we use the upstream ConvertShapeToStandard for
lowering shapes.

5. We lower linalg to loops and lower loops to CFG using upstream
passes.

6. Rewrite the "ABI" boundaries of the program to npcomprt data
structures (LowerToNpcomprtABI). This mainly affects ABI boundaries and
how global tensor constants are represented. One of the major
improvements in this commit is that now it's a very clean rewrite that
just replaces memrefs on ABI boundaries with !npcomprt.tensor (before
there was a get_extent function that is not needed).
See test/E2E/lower-to-npcomprt-abi.mlir

7. Lower to LLVM with upstream mlir patterns + some patterns for the
npcomprt lowerings.

One aspect here that is still a remnant of a non-descriptor-based tensor
to memref flow is the BypassShapes + LowerShapedResultsToMemref.
BypassShapes wraps the "tensor compute" ops in a tcp.shaped_results
(basically a "tie_shape" kind of op), and then
LowerShapedResultsToMemref uses those annotations to allocate output
buffers while lowering the "tensor compute ops". Note that there are
very few "tensor compute" ops currently supported (tcp.add +
tcp.broadcast_to), so we just hardcode them in both passes.
Realistically, I expect this to go away as we fully embrace the
descriptor-based approach for simplicity, so don't look too deep into
it.
2020-09-16 17:31:40 -07:00
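
As a reading aid, a hedged sketch of how a pipeline such as createE2ELoweringPipeline strings these stages together with a PassManager. Every create*Pass name below is a placeholder declaration, not the project's or upstream's actual entry point:

  #include <memory>
  #include "mlir/Pass/Pass.h"
  #include "mlir/Pass/PassManager.h"

  // Placeholder declarations standing in for the real pass constructors.
  std::unique_ptr<mlir::Pass> createTCFToTCPPass();
  std::unique_ptr<mlir::Pass> createBypassShapesPass();
  std::unique_ptr<mlir::Pass> createLowerShapeConstraintsPass();
  std::unique_ptr<mlir::Pass> createLowerShapedResultsToMemrefPass();
  std::unique_ptr<mlir::Pass> createLowerToNpcomprtABIPass();
  std::unique_ptr<mlir::Pass> createLowerToLLVMPass();

  void buildRefE2EPipeline(mlir::PassManager &pm) {
    pm.addPass(createTCFToTCPPass());                    // 1. TCF -> TCP, reifying shape constraints
    pm.addPass(createBypassShapesPass());                //    wrap tensor compute ops in tcp.shaped_results
    pm.addPass(createLowerShapeConstraintsPass());       // 2. constraints -> eager error-handling code
    pm.addPass(createLowerShapedResultsToMemrefPass());  // 3. tensors -> memrefs (a series of passes in reality)
    // 4./5. upstream shape-to-standard, linalg-to-loops and loops-to-CFG passes go here
    pm.addPass(createLowerToNpcomprtABIPass());          // 6. rewrite ABI boundaries to !npcomprt.tensor
    pm.addPass(createLowerToLLVMPass());                 // 7. lower to the LLVM dialect
  }
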
Stella Laurenzo a74a98094b
Add a new python script to auto-generate ATen op ODS definitions. (#43)
* Add a new python script to auto-generate ATen op ODS definitions.

* There is still some work needed on some of the ops to annotate correct types.
* The ODS is not actually included in the dialect yet, but I'd like to commit it so that we can track changes.
* Will reconcile this with the ops produced by the existing script in a follow-up. Still need to do some more iteration to reach parity.
2020-09-16 16:21:24 -07:00
Marius Brehler d62f8227c2
Bump LLVM to @7d1ed69 and adapt to upstream namespace handling changes.
* Bump LLVM to llvm/llvm-project@7d1ed69
* Bump MLIR-HLO to tensorflow/mlir-hlo@1880f87
* Adapt to MLIR's changed namespace handling
2020-09-16 15:52:15 -07:00
Stella Laurenzo dd9172fd75 Run clang-format on files that do not comply. 2020-09-15 17:54:58 -07:00
Sean Silva 0f9c6b4a35 Bump llvm-project to 84a6da67e6b2a76b15ad1862f4cbb7625fe318df
That commit is from Thu Sep 10 22:04:58 2020 -0700

That change is required for a PR that I'm going to make soon.
2020-09-14 15:56:01 -07:00
Marius Brehler 843448cde9 Register dialects in E2E passes 2020-09-11 09:33:44 +02:00
Marius Brehler a2fb68059f Remove unused include 2020-09-11 09:33:44 +02:00
Marius Brehler 124bc65a70 Register dialects in ATen lowering pass 2020-09-09 21:55:17 -07:00
Marius Brehler fb2d1a1559 Register dialects in conversion passes 2020-09-09 21:55:17 -07:00
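
These dialect-registration commits hinge on the same upstream mechanism: a pass declares the dialects it may create ops from by overriding getDependentDialects. A sketch with illustrative pass and dialect choices (present-day header layout):

  #include "mlir/Dialect/Linalg/IR/Linalg.h"
  #include "mlir/Dialect/SCF/IR/SCF.h"
  #include "mlir/Pass/Pass.h"

  using namespace mlir;

  struct ExampleConversionPass
      : public PassWrapper<ExampleConversionPass, OperationPass<>> {
    // Without this, creating linalg/scf ops would fail when the pass runs in
    // a context that never loaded those dialects.
    void getDependentDialects(DialectRegistry &registry) const override {
      registry.insert<linalg::LinalgDialect, scf::SCFDialect>();
    }
    void runOnOperation() override {}
  };
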
Stella Laurenzo 81dd571c23 Integrate upstream LLVM at 8d9c13f37d2081c11186718ae8b5aef8b507d152.
* mlir-hlo: 062a3ac4a0671d15b5199ed2cd3a9ce02a5bf077

Fixes:

* numInputs() just returns an int instead of requiring a call to .getLimitedValue()
2020-09-08 20:34:31 -07:00
Stella Laurenzo 97d83f786a Bump submodule versions.
* llvm-project: b5924a8e27536d19dd5c4d302db29fb6163d5faa
* mhlo: 848ca244d20f045b7921da55a98a04d95ef94f0e
* Multiple breakages that need to be fixed.

Fixes:
* Refactor dialect registration
* Remove all kindof methods (Casting functionality has been added upstream and is implicitly
available, see https://llvm.discourse.group/t/removing-kinds-from-attributes-and-types/1547.)
* Update dialect registration to comply with https://reviews.llvm.org/D85495.
* Remove type kinds and update some changed dialect signatures.
* Upgrade ATen dialect to match upstream needs.
  * Move dialect registration to tablegen.
  * Register the ListType in tablegen.
  * Change dialect initialization signature.
* Use TypeSwitch in MlirIr location printer.
* Remove global registry dependencies from npcomp-opt.
* Change LowerToLLVM to pass an MLIRContext vs an LLVMDialect for type creation.
* Remove dep on MLIREDSCInterface that is removed upstream.
* Thread through the DialectRegistry for opt and python-like tools.
* Modernize pass registration (This was forced because the GEN_PASS_REGISTRATION code now generates inline functions vs literal pass registration statements)

Co-authored-by: Marius Brehler <marius.brehler@iml.fraunhofer.de>
2020-09-08 13:26:42 -07:00
Stella Laurenzo 4c37aed841 Update build instructions to use the submodule for llvm.
* Previous instructions referred to the option of using an external llvm-project checkout with a stale version hash.
2020-08-28 16:20:55 -07:00
Stella Laurenzo d1ed6d260e Initial work on a torch op registry.
* This extracts metadata from python invocations (nearly) sufficient to generate ODS and a Torch IR translation table for most of the ops.
* It also has the side effect of creating a data structure with meaningfully runnable examples suitable for an automated regression test.
* There are some ops that are sufficiently complex/weird (like _convolution) that we'll just manually handle those.
* See example output: https://gist.github.com/stellaraccident/60a58457b15e9184e224fa98a2658769
2020-08-28 15:20:55 -07:00
Stella Laurenzo fc4f374345 Format sources. 2020-08-27 14:47:49 -07:00
Stella Laurenzo de38caa547
Make code that depends on the legacy "type dispatch" mechanism optional. (#32)
* Make code that depends on the legacy "type dispatch" mechanism optional.

* This code is fairly tied to a specific ~1.3 version and uses a legacy dispatch mechanism.
* Moving it and making it optional allows the project to build with PyTorch 1.6 and makes it possible for us to start building out a more modern interface mechanism in parallel.
* Some of the moved code will be brought back into the more modern path, but isolating it now lets this be done incrementally.
* Tests are left failing since the entire frontend is optional and the next step involves reworking the interface mechanism to get them passing in both regimes.
* Fix a few bogons to get things building
* Add Dockerfile with pytorch

Also, I configure with:
-DCMAKE_PREFIX_PATH="/opt/pytorch/pytorch"

(which is where pytorch is installed in this container)

* Make a dep conditional.

Co-authored-by: stephenneuendorffer <stephen.neuendorffer@xilinx.com>
2020-08-26 12:55:16 -07:00
stephenneuendorffer 31b3041e88
Add pytorch interface to ATen Dialect (#30)
This patch adds a pytorch interface to npcomp.  This interface is modeled
after pytorch_xla and exposes the MLIR-based flow as a virtual device (similar
to a gpu device or the xla backend).  Usage is intended to be something like:

  dev = torch_mlir.mlir_device()
  t0 = torch.randn((4,4), device=dev)
  t1 = torch.randn((4,4), device=dev)
  t2 = t0 + t1
  t2_mlir = torch_mlir.get_mlir( t2 )
  t2_cpu = t2.to('cpu')

In this case t2_cpu would contain the result of the computation, and t2_mlir
contains the mlir description of the computation.  Note that this also
properly returns backward paths synthesized by pytorch.  There are several
parts of this:

1) A tensor type (implemented by tensor.* and tensor_impl.*)
2) The device modeling (aten_mlir_bridge.*, aten_mlir_device.*, aten_mlir_type*)
3) a temporary IR (implemented by ir.cpp)

There is also a reference lowering directly from the ATen dialect to C
function calls consisting of two parts:

1) The driver that uses the IR to generate MLIR, run Passes and compile the
result using mlir::ExecutionEngine (implemented by jit.cpp and
mlir_gen.cpp)
2) A runtime library implemented by lib/aten_ops.cpp.  Most of the operations
are implemented by callbacks into the torch C++ libraries.

Some aspects of this are known to be less than optimal, in particular:
1) There are some function definitions that don't live in the file corresponding
to their declaration.
2) More aspects of this (e.g. the IR) seem like they should be automatically
generated.
3) It's unclear to me how much of the 'IR' is actually necessary, or whether
MLIR could be created on the fly.

Note that this code is licensed in a way similar to pytorch, with the
intention that eventually (when npcomp reaches some maturity) it should be
pushed there.  (see frontends/pytorch/LICENSE)  The code is also structured
much closer to the pytorch coding style than the LLVM coding style.
2020-08-21 11:22:47 -07:00
Stella Laurenzo 69cda404ef NFC: Fix extra namespace declaration.
* Was causing build break on GCC9.
2020-08-20 16:22:41 -07:00
Stella Laurenzo 77b235f621
Create frontends/pytorch directory. (#31)
* Adds/updates readmes with some notes about code organization and direction.
* Meant to prepare a space for upcoming integration of #30.
2020-08-18 09:43:20 -07:00
Stella Laurenzo a2a36aa8f3
Add mlir-hlo as a submodule and add a script to find versions. (#20)
* I expect that mlir-hlo will be a non-optional dependency of the project, so adding as a sub-module.
* IREE is an optional dependency and I'm keeping this as an out-of-tree checkout for the moment.
* The script will compute the join across both iree and mlir-hlo to find a common LLVM version.
* The script needs some more work (like a flag that says to update the version, etc). Likely needs more testing through an integrate or two.
2020-08-13 16:42:05 -07:00
stephenneuendorffer bb668e6e26
Add ATen Dialect (#16)
This patch adds a dialect intended to be used as a frontend dialect
to facilitate lowering from "A Tensor Library" in torch/pytorch.

This patch includes several passes that are useful in conjunction with the
dialect:

--aten-layer-name: Generates layer names for each operation, which are not
  present in the original pytorch.
--aten-to-std: Lowers the ATen dialect into standard dialect function calls.
--return-elimination-pass: Converts functions (primarily the toplevel function)
  to pass return values by reference.  This simplifies pytorch integration.
--aten-op-report: Generates a textual report about the model.
--liveness-report

Future patches will implement actual integration with the pytorch jit to
intercept and generate MLIR in this dialect, then lower the resulting MLIR
into function calls through aten-layer-name -> aten-to-std ->
return-elimination -> std-to-llvm. The result would then be jitted using the LLVM
jit, linked against a runtime library which makes calls back into pytorch to
implement all the layers.

Co-authored-by: Jeff Fifield <jeff.fifield@xilinx.com>

Co-authored-by: Jeff Fifield <jeff.fifield@xilinx.com>
2020-08-12 19:28:04 -07:00
stephenneuendorffer 14f614396d
Move precommit to 20.04 (#15) 2020-08-07 10:32:02 -07:00
stephenneuendorffer 5beaf4cc01
Fix build again (#14)
The RuntimeShlib.so now lives in /lib.
2020-08-07 08:36:03 -07:00
stephenneuendorffer a5f3b16f92
Fix precommit workflow (#13) 2020-08-06 23:51:05 -07:00