torch-mlir

Commit Graph

Author	SHA1	Message	Date
Sean Silva	276f5b80ea	[RefE2E] Add assemblyFormat for TCF and TCP ops and tidy up.	2020-09-18 15:03:53 -07:00
Sean Silva	d8675f8ad2	[RefE2E] Add support for matmul. I'm pretty happy with how this turned out. It looks pretty much like it should -- one change at each layer. This particular op bottoms out on linalg which takes care of the rest. - Add tcf.matmul - Add tcp.matmul - Add TCF->TCP lowering - Add tcp.matmul shape transfer function (BypassShapes.cpp) - Add tcp.matmul -> linalg.matmul lowering (LowerShapedResultsToMemref.cpp) - Add support to LowerShapeConstraints for lowering the new shape.cstr_require This matmul op is pretty limited in its capabilities. There is no batching and no multidimensional contraction. Certainly more design work will be needed to find the right abstractions that aren't too general but also help to canonicalize many cases from frontends. This is mainly to show that adding a new op needn't be very "scary" once we have the e2e infra in place. Also, - this clears out some exploratory cruft from the TCF dialect now that this is starting to become real.	2020-09-18 11:31:01 -07:00
Sean Silva	75f57b461e	Totally rework RefE2E tensor to memref flow. (#42 ) This now gets the overall "RefE2E" compilation stack to a point that I'm fairly happy with. We simplify it by mostly embracing the "descriptor" view of the world. The overall flow is best understood by reading through the createE2ELoweringPipeline function in lib/E2E/E2E.cpp That function creates a pass pipeline that lowers from "TCF" (which is ~numpy level of abstraction) down to LLVM IR. A brief high-level summary of what happens there: 1. TCF to TCP conversion. This involves reifying error handling in the form of shape constraints. See test/Conversion/TCFToTCP/basic.mlir 2. Lowering shape constraints. This converts shape constraints into eager error-handling code. See test/E2E/lower-shape-constraints.mlir This pass will soon go upstream. Because this lowers to std.assert, some later passes like LowerToNpcomprtABI and LowerToLLVM are updated to properly plumb this through e2e. See test/npcomp-run-mlir/invalid-broadcast.mlir for an execution test that properly aborts in case of an error. 3. Lowering tensors to memrefs. This is done via a series of passes rather than a single mega conversion. Unlike the previous code that mixed in the npcomprt ABI stuff here, it's now a very clean "pure memref" conversion. See test/E2E/lower-*-to-memref.mlir and lib/E2E/TensorToMemref/ Most of the changes are concentrated here. 4. As part of the above, we use the upstream ConvertShapeToStandard for lowering shapes. 5. We lower linalg to loops and lower loops to CFG using upstream passes. 6. Rewrite the "ABI" boundaries of the program to npcomprt data structures (LowerToNpcomprtABI). This mainly affects ABI boundaries and how global tensor constants are represented. One of the major improvements in this commit is that now it's a very clean rewrite that just replaces memrefs on ABI boundaries with !npcomprt.tensor (before there was a get_extent function that is not needed). See test/E2E/lower-to-npcomprt-abi.mlir 7. Lower to LLVM with upstream mlir patterns + some patterns for the npcomprt lowerings. One aspect here that is still a remnant of a non-descriptor-based tensor to memref flow is the BypassShapes + LowerShapedResultsToMemref. BypassShapes wraps the "tensor compute" ops in a tcp.shaped_results (basically a "tie_shape" kind of op), and then LowerShapedResultsToMemref uses those annotations to allocate output buffers while lowering the "tensor compute ops". Note that there are very few "tensor compute" ops currently supported (tcp.add + tcp.broadcast_to), so we just hardcode them in both passes. Realistically, I expect this to go away as we fully embrace the descriptor-based approach for simplicity, so don't look too deep into it.	2020-09-16 17:31:40 -07:00
Stella Laurenzo	a74a98094b	Add a new python script to auto-generate ATen op ODS definitions. (#43 ) * Add a new python script to auto-generate ATen op ODS definitions. * There is still some work on some of the ops to annotate correct types. * The ODS is not actually included into the dialect yet, but I'd like to commit it so that we can track changes. * Will reconcile this with the ops produced by the existing script in a followup. Still need to do some more iteration to reach parity.	2020-09-16 16:21:24 -07:00
Marius Brehler	d62f8227c2	Bump LLVM to @7d1ed69 and fix namespace handling changed upstream. * Bump LLVM to llvm/llvm-project@7d1ed69 * Bump MLIR-HLO to tensorflow/mlir-hlo@1880f87 * Adapt to MLIR's changed namespace handling	2020-09-16 15:52:15 -07:00
Stella Laurenzo	97d83f786a	Bump submodule versions. * llvm-project: b5924a8e27536d19dd5c4d302db29fb6163d5faa * mhlo: 848ca244d20f045b7921da55a98a04d95ef94f0e * Multiple breakages that need to be fixed. Fixes: * Refactor dialect registration * Remove all kindof methods (Casting functionality has been added upstream and is implicitly available, see https://llvm.discourse.group/t/removing-kinds-from-attributes-and-types/1547.) * Update dialect registration to comply with https://reviews.llvm.org/D85495. * Remove type kinds and update some changed dialect signatures. * Upgrade ATen dialect to match upstream needs. * Move dialect registration to tablegen. * Register the ListType in tablegen. * Change dialect initialization signature. * Use TypeSwitch in MlirIr location printer. * Remove global registry depends from npcomp-opt. * Change LowerToLLVM to pass an MLIRContext vs an LLVMDialect for type creation. * Remove dep on MLIREDSCInterface that is removed upstream. * Thread through the DialectRegistry for opt and python-like tools. * Modernize pass registration (This was forced because the GEN_PASS_REGISTRATION code now generates inline functions vs literal pass registration statements) Co-authored-by: Marius Brehler <marius.brehler@iml.fraunhofer.de>	2020-09-08 13:26:42 -07:00
Stella Laurenzo	fc4f374345	Format sources.	2020-08-27 14:47:49 -07:00
stephenneuendorffer	bb668e6e26	Add ATen Dialect (#16 ) This patch adds a dialect intended to be used as a frontend dialect to facilitate lowering from "A Tensor Library" in torch/pytorch. This patch includes several passes that are useful in conjunction with the dialect: --aten-layer-name: Generates layer names for each operation, which are not present in the original pytorch. --aten-to-std: Lower the ATen dialect into standard dialect function calls. --return-elimination-pass: convert functions (primarily the toplevel function) to pass return values by reference. This simplifies pytorch integration. --aten-op-report: generate a textual report about the model --liveness-report Future patches will implement actual integration with the pytorch jit to intercept and generate MLIR in this dialect, then lower the resulting MLIR into function calls through aten-layer-name -> aten-to-std -> return-elimination -> std-to-llvm. The result would then be jitted using the LLVM jit, linked against a runtime library which makes calls back into pytorch to implement all the layers. Co-authored-by: Jeff Fifield <jeff.fifield@xilinx.com> Co-authored-by: Jeff Fifield <jeff.fifield@xilinx.com>	2020-08-12 19:28:04 -07:00
Stella Laurenzo	fc484d1bd8	Rework reference shape lowering based on upstream shape dialect changes. * Primarily, the upstream shape dialect now uses tensor<?xindex> for non-erroring, immediate shape calculations (and will return this for shape_of of a tensor or memref). * In addition, upstream passes do not yet exist for fully lowering to standard ops, so the passes here need to be extended to handle this new convention. * This should be seen as an intermediate state, necessary to integrate a new LLVM version and needs more work and cleanup for generality. * There is a good deal of awkwardness in these conversions. The hope is that additional upstream work will yield better defined conversion paths once out of this intermediate state.	2020-08-03 13:43:49 -07:00
Stella Laurenzo	9d5d802cc8	Fix compilation issues due to llvm-project version bump. * Redundant infer type implementations removed. * Update to the linalg GenericOp build calls.	2020-08-01 15:23:57 -07:00
Stella Laurenzo	9e4a62fc71	Allow JITModule passes to be built separately. * Re-introduces frontend/backend split. * Adds a (very) trivial shape refinement pass.	2020-07-10 22:57:26 -07:00
Sean Silva	df0d3fcaff	Consolidate LLVM definitions of runtime data structures. This required making module descriptors hold a FuncDescriptor* instead of a pointer to array of FuncDescriptors as it previously did, which is innocuous (just requires an llvm.bitcast after the llvm.mlir.addressof).	2020-07-10 17:50:55 -07:00
Sean Silva	e228aa4b11	npcomprt: add support for constants - create tcp.global + tcp.get_global_memref - create npcomprt.global + npcomprt.get_global - LLVM lowering for new npcomprt ops - Runtime: - GlobalDescriptor struct emitted by LLVM lowering - implement __npcomp_compiler_rt_get_global Also, - cleanly isolate all runtime data structure definitions shared by the compiler and runtime into lib/runtime/CompilerDataStructures.h	2020-07-10 17:31:24 -07:00
Stella Laurenzo	efbcf0aa44	Add NumpyPublicFunctionsToTensor pass. * Rewrites public function signatures to operate on tensors (vs ndarray). * Most of our backends presume immutable tensors at public function boundaries.	2020-07-08 22:51:54 -07:00
Sean Silva	b4f0cea8fa	Rework e2e flow to use new "npcomprt" This ~totally reworks the existing "runtime" stuff to be more principled and usable, such as from Python. It's still not fully production-quality, mainly in the department of memory management (e.g. it currently leaks memory; we need to figure out "who frees memrefs" + the analysis and transformation needed to do that (maybe use upstream buffer allocation pass?)). The user API is in include/npcomp/runtime/UserAPI.h, though include/npcomp/JITRuntime/JITModule.h is a friendlier wrapper. The stuff under {include,lib}/runtime is totally firewalled from the compiler and tiny (<6kB, though no attention has gone into optimizing that size). For example, we don't link in libSupport into the runtime, instead having our own bare bones replacements for basics like ArrayRef (the JITRuntime helps with bridging that gap, since it can depend on all common LLVM utilities). The overall feature of npcomprt is that it exposes a module with multiple function entry points. Each function has arguments and results that are tensor-valued, and npcomprt::Tensor is the runtime type that is used to interact with that (and a npcomprt::Ref<T> reference-counting wrapper is provided to wrap npcomprt::Tensor in the common case). From an implementation perspective, an npcomprt module at the LLVM/object/binary level exposes a single module descriptor struct that has pointers to other metadata (currently just a list of function metadata descriptors). All interactions with the npcomp runtime are keyed off of that module descriptor, including function lookups and dispatching. This is done to dodge platform ABI issues and also allow enough reflection to e.g. verify provided arguments. Most of the compiler-side work here was in LowerToNpcomprtABI and LowerToLLVM. Also, - Rename npcomp_rt/NpcompRt to npcomprt/Npcomprt; it was getting annoying to type the underscores/caps. - misc improvements to bash_helpers.sh	2020-07-08 19:36:19 -07:00
Stella Laurenzo	5aa2f0f9f6	Add a trivial copy elision canonicalization on ndarray->tensor. * This elides the very common code the compiler adds for chaining otherwise tensor-related numpy ops together. * More aggressive canonicalizations would require more advanced analysis.	2020-07-05 18:09:43 -07:00
Stella Laurenzo	fae15ec5e7	Allow the ndarray type to carry a shape.	2020-07-05 17:34:03 -07:00
Stella Laurenzo	48a0b0ec7f	NFC: Move CPATypeInference to Typing directory.	2020-07-04 16:56:09 -07:00
Stella Laurenzo	051d088161	NFC: Move CPA typing analysis down a directory.	2020-07-04 16:40:02 -07:00
Stella Laurenzo	6a50efd046	Extend the CPA type inference to work on numpy types/ops. * Adds an op interface for adding CPA constraints. * Adds a type conversion hook for handling built-in types (that we can't have adopt our interface). * Converts tensor<> to object(!Tensor, [e:<type>]) just like NdArray. * Implement a few numpy ops far enough to do dtype inference for simple sequences.	2020-07-03 18:16:34 -07:00
Stella Laurenzo	34861b18f4	Add NdArray type inference conversion.	2020-07-03 16:38:10 -07:00
Stella Laurenzo	4a2f7c0b5f	Add constraint propagation and tracking of node members.	2020-07-03 13:29:52 -07:00
Stella Laurenzo	a257da46e2	Introduce a type interface for mapping to CPA types. * Currently just simplifies the logic for UnknownType -> TypeVar.	2020-07-02 13:56:27 -07:00
Stella Laurenzo	e1839a0d6b	Bump llvm and iree versions. * Gets us past the upstream changes that enable type interfaces. * Adds the ARM backend due to an implicit IREE dependency sneaking in for that (https://github.com/google/iree/issues/2401) * Adds explicit TypeStorage to type base classes per upstream change.	2020-07-02 11:24:05 -07:00
Stella Laurenzo	92190176fb	Add skeleton of pass to do modified CPA type inference.	2020-06-30 20:57:09 -07:00
Stella Laurenzo	046751254f	Refactor old tracing tests and remove deprecated ops. * Old doctests to run under lit. * Old custom filecheck tests -> pytest directory (under lit). * Rename some old ufunc ops in the tracer.	2020-06-29 16:19:03 -07:00
Stella Laurenzo	7ca292ade5	Add partial evaluator for explicit numpy ufuncs. * This enables emission of "numpy.add(a, b)" and several dozen others. * Will deprecate original ufunc infra in a follow-on.	2020-06-29 15:27:39 -07:00
Stella Laurenzo	a4f3ce1ed3	Add value coding for ndarray. * This lets us import arrays from the outer environment, which is the first step to actually handling numpy ops.	2020-06-28 18:42:08 -07:00
Stella Laurenzo	f6721c173d	Add create_array_from_tensor and copy_to_tensor ops.	2020-06-28 17:58:26 -07:00
Stella Laurenzo	efe8915901	Add NdArrayType.	2020-06-28 17:37:20 -07:00
Stella Laurenzo	7bd5733d38	Add "template function" ops and importer code. * This starts to lay down the infra for reasoning about calls * Adds the importer code to generate IR for function calls of compiler recognized static functions.	2020-06-26 18:36:36 -07:00
Stella Laurenzo	529873d13c	Wire up IREE compilation and runtime in a new backend test. * Adds python bindings for invoking flow, HAL, and VM lowering pipelines. * Adds python bindings for translating to VM module flatbuffer. * Adds a new backend_test/iree directory and configure lit to find the IREE python rt bindings. * Open code a simple_invoke.py that exercises the whole pipeline (need real APIs for a lot of this). * Fails when invoking the function because I never implemented argument marshaling for scalars :( * Plenty of stuff to do tomorrow.	2020-06-19 00:30:34 -07:00
Stella Laurenzo	b21b5322f6	Basicpy conversion to IREE+std skeleton and first conversions. * Conversions to std for numeric binary expressions, numeric to_boolean, and numeric comparisons. * Added folders to constant ops to comply with requirements of the pass system. * Extended the frontend with parameter/result annotation processing for primitives (can specify types for function arguments). * Added (empty) directory/sources for IREEVM conversions. These are only enabled if IREE is enabled.	2020-06-13 23:45:43 -07:00
Stella Laurenzo	2ba8296151	Add script tools/format_source.sh and run it on all python and c++ sources.	2020-06-13 14:53:54 -07:00
Stella Laurenzo	e3fd22a035	Add a (very) basic type inference pass for basicpy. For simple programs, this gets us enough typing to lower to real backends.	2020-06-10 19:04:05 -07:00
Stella Laurenzo	3e58d8fe37	Add skeleton of type inference pass.	2020-06-10 14:48:22 -07:00
Stella Laurenzo	432e01fe8f	Move Basicpy and Numpy dialect IR to IR/ folder.	2020-06-09 19:22:24 -07:00
Stella Laurenzo	340f109742	Add implicit return and expression statements where the value is discarded.	2020-06-09 18:34:07 -07:00
Stella Laurenzo	e18e8e0a96	Add boolean/logical operations (and, or, not). * Adds a new to_boolean op to evaluate a value as a truthy i1 * Uses cascading scf.if ops to properly evaluate and/or sequences (short-circuit and original value returning) * Adds a helper to construct select ops and uses it to implement 'not'	2020-06-09 00:01:21 -07:00
Stella Laurenzo	b0a80e04f1	Make binary_expr and binary_compare have similar asm syntax.	2020-06-08 18:29:14 -07:00
Stella Laurenzo	1ef3614682	Add support for short-circuit comparisons with scf.if.	2020-06-08 17:52:07 -07:00
Stella Laurenzo	85b724e70c	Adds ODS and import support for binary_expr and binary_compare ops. * Currently only supports non-short-circuit comparisons.	2020-06-08 13:46:06 -07:00
Stella Laurenzo	72499e0319	Add bytes constants.	2020-06-07 16:00:29 -07:00
Stella Laurenzo	f3829b1d4f	Add string constants.	2020-06-07 15:46:28 -07:00
Stella Laurenzo	869228e316	Add bool constants.	2020-06-07 15:15:19 -07:00
Stella Laurenzo	af4466197e	Add lit test suite for python compiler. * Adds a test for simple constants and fixes issues.	2020-06-07 14:29:39 -07:00
Stella Laurenzo	0cc0a7165e	Add basic AST -> basicpy dialect function extraction. * Extends the bindings to support locations. * Various other things necessary to extract a function with simple numeric expressions.	2020-06-06 21:24:28 -07:00
Sean Silva	cd7258dbd4	Enable warnings by default. The secret here is LLVM_ENABLE_WARNINGS=ON. I also fixed a couple warnings, which gets us to be warning-clean. I noticed also that npcomp-run-mlir/basic.mlir seems to be failing. Maybe something since the latest integrate. My next commit (introduce npcomp mini runtime) will largely rewrite it though, so it'll get fixed then.	2020-06-03 20:39:34 -07:00
Sean Silva	e8b1a07ef4	Initial NpcompRt (npcomp_rt) dialect boilerplate.	2020-06-01 19:07:53 -07:00
Sean Silva	3a09455540	Use upstream shape.from_extents Replace our local `tcp.shape_from_extents` op with the upstream `shape.from_extents` op.	2020-05-21 14:51:01 -07:00
Sean Silva	1fed1cb016	Update llvm-project to 753a21928413f8a7e76978cb1354e09150e114e0	2020-05-21 13:09:06 -07:00
Sean Silva	87aa561c69	Remove RtGetTensorExtentOp. It is unused now, and will be superseded by a proper runtime dialect.	2020-05-21 10:17:49 -07:00
Sean Silva	be1971c4fc	Rename tcp.abort_if to tcp.shape_observe_error This more clearly captures its semantics as a structural "observer" of code that we currently mark as NoSideEffect but eventually lowers to eager error handling code. Also, update LowerRankedShapes to erase it, now that the layering here is clear. That pass reifies the eager error handling code, so the dummy op to keep things alive isn't needed. With this change, we are now ready to start lowering to LLVM! This is the current print-ir-after-all from e2e-lowering-pipeline: https://reviews.llvm.org/P8221	2020-05-18 13:38:47 -07:00
Sean Silva	836a8d4bec	Lower tcp.alloc_memref ops to tcp.get_extent + std.alloc. - tcp.get_extent will be eliminated while lowering shapes - std.alloc is supported by the upstream LLVM lowering.	2020-05-18 12:53:31 -07:00
Sean Silva	1b48d0d80b	Remove the present tcp.island. The idea was half-baked and after some deep thought felt like a solution looking for a problem. What we had here (and is removed in this patch) just wasn't pulling its weight. I cannot think of anything we would want to do with tcp.island as it is removed here beyond just sinking and merging them within a basic block, such that the witness argument is kind of pointless (only matters for hoisting). TCP compute ops like tcp.add and tcp.broadcast_to have the strong invariant of "pure or undefined behavior", which means they are always safe to sink. The island concept as removed here conferred no benefit. Also, I'll note that "islands" are a trick you can only play once in a system (unless they strictly nest). I have some early-stage thoughts on having an island concept that helps with modeling tensor shapes robustly which seems promising (the island would serve a similar role as tie_shape).	2020-05-14 15:19:37 -07:00
Sean Silva	eaeb4011e6	Lower !shape.shape to SSA values. This uses an approach inspired by what is done in IREE. See comments on LowerRankedShapes.cpp for how it works. The basic gist is that we have an op that creates a !shape.shape from a set of SSA values representing the extents, and then iteratively replace any op producing a !shape.shape with instances of that op.	2020-05-13 17:20:23 -07:00
Sean Silva	ef25428fe3	Add lowering from linalg to loops. This also adds a small pass to clean up the `dim` ops that linalg introduces. For now, it only has a trivial pattern that looks for a `tcp.alloc_memref(%shape)` op to get the shape as we currently have an invariant that all memrefs are the result of such ops. But eventually this will need to look through view ops and any other shape-ish stuff that linalg introduces as it lowers to loops, along with any slicing ops introduced by buffer allocation.	2020-05-11 18:54:52 -07:00
Sean Silva	f525d4dbcf	Add custom assembly format for tcp.alloc_memref/tcp.get_extent This makes the IR a bit easier to scan.	2020-05-11 15:28:34 -07:00
Sean Silva	e29aef855b	Initial TCF/TCP E2E seed. Very much WIP. This is enough to get tcf.add down to approximately the "linalg.generic on buffers" level of abstraction. (but there are nuances)	2020-05-08 20:20:41 -07:00
Stella Laurenzo	a91b0bfbe1	Add numpy.get_slice op and wire it up to the tracer.	2020-05-08 16:04:58 -07:00
Stella Laurenzo	bc5ef81d68	Add basicpy.SlotObject type and ops to create/index into it. * This is intended to provide low-level modeling for built-in objects. * It is now possible to trace slice tuples (which are tuples of NoneType|EllipsisType|SlotObjectType<slice, ...>).	2020-05-05 18:16:01 -07:00
Stella Laurenzo	bfd5fedba7	Add central registration for type ranges.	2020-05-05 14:16:39 -07:00
Stella Laurenzo	502ef8f195	Create skeleton for 'Basicpy' dialect. * It is time to start adding more python mechanisms. * Running into this for materializing slice() objects.	2020-05-04 17:48:02 -07:00
Stella Laurenzo	ebb5bcf6af	Handle np.transpose() and ndarray.T shortcut. * Just the form without explicit permutation for now.	2020-05-04 16:20:36 -07:00
Stella Laurenzo	a5f755d406	Implement __array_func__ hook and use it to trace np.dot. * Creates an abstraction/registry around emitters (intended to generalize to AST compilation as well). * Reworks ufuncs to use the same mechanism as array funcs. * Adds the numpy.dot op.	2020-05-04 15:47:01 -07:00
Stella Laurenzo	c89a35f97f	Rework the poc tracer to be structured how intended.	2020-05-02 19:52:21 -07:00
Stella Laurenzo	d3632af675	Add !numpy.any_dtype dialect type.	2020-04-29 18:20:42 -07:00
Stella Laurenzo	b4425fe1d2	Add numpy.ufunc_call op.	2020-04-29 17:49:56 -07:00
Stella Laurenzo	c4a192d5c9	Rename from npcomp::NUMPY to NPCOMP::numpy to follow IREE convention.	2020-04-29 17:10:10 -07:00
Stella Laurenzo	e845db8a20	Add builtin_ufunc and generic_ufunc ops.	2020-04-28 23:51:54 -07:00
Stella Laurenzo	d3b6e1767a	Add stub numpy dialect.	2020-04-26 17:20:58 -07:00

171 Commits (2dbab50444e9c30eabbd3355a47545c0650fd100)