Commit Graph

94 Commits (dc8afc927136fcccc4a89e8e0de092b57ebcaf2f)

Author SHA1 Message Date
Sean Silva dc8afc9271 [RefE2E] Refactor how tcf.add is lowered.
It was previously going through this awkward route that prematurely
created linalg.generic ops, which was an annoying layering problem since
we can't compute a shape transfer function for linalg.generic in the
general case. Now we pass it through the same path as tcp.matmul, with
the shape transfer function being defined for tcp.add.

This also removed the need for TCPToLinalg (now deleted). The equivalent
of that is happening in lower-shaped-results-to-memref. One interesting
outcome of this: we're basically using linalg as a "Buffer TCP". We
might want to look into using named structured ops for more of TCP, but
that would be a big velocity hit since then any change to the ODS /
verification for those ops would be a change to the upstream structured
op ODS generator. After we have more experience defining this manually,
we should re-evaluate rebasing TCP on generated named linalg ops.
2020-09-18 15:03:53 -07:00
Sean Silva d8675f8ad2 [RefE2E] Add support for matmul.
I'm pretty happy with how this turned out. It looks pretty much like it
should -- one change at each layer. This particular op bottoms out on
linalg which takes care of the rest.

- Add tcf.matmul
- Add tcp.matmul
- Add TCF->TCP lowering
- Add tcp.matmul shape transfer function (BypassShapes.cpp)
- Add tcp.matmul -> linalg.matmul lowering (LowerShapedResultsToMemref.cpp)
- Add support to LowerShapeConstraints for lowering the new
shape.cstr_require

This matmul op is pretty limited in its capabilities. There is no
batching and no multidimensional contraction. Certainly more design work
will be needed to find the right abstractions that aren't too general
but also help to canonicalize many cases from frontends. This is mainly
to show that adding a new op needn't be very "scary" once we have the
e2e infra in place.

Also,
- this clears out some exploratory cruft from the TCF dialect now that
this is starting to become real.
2020-09-18 11:31:01 -07:00
Sean Silva 75f57b461e
Totally rework RefE2E tensor to memref flow. (#42)
This now gets the overall "RefE2E" compilation stack to a point that I'm
fairly happy with. We simplify it by mostly embracing the "descriptor"
view of the world.

The overall flow is best understood by reading through the
createE2ELoweringPipeline function in lib/E2E/E2E.cpp
That function creates a pass pipeline that lowers from "TCF" (which is
~numpy level of abstraction) down to LLVM IR.

A brief high-level summary of what happens there:

1. TCF to TCP conversion. This involves reifying error handling in the
form of shape constraints. See test/Conversion/TCFToTCP/basic.mlir

2. Lowering shape constraints. This converts shape constraints into
eager error-handling code. See test/E2E/lower-shape-constraints.mlir
This pass will soon go upstream.
Because this lowers to std.assert, some later passes like
LowerToNpcomprtABI and LowerToLLVM are updated to properly plumb this
through e2e.
See test/npcomp-run-mlir/invalid-broadcast.mlir for an execution test
that properly aborts in case of an error.

3. Lowering tensors to memrefs. This is done via a series of passes
rather than an single mega conversion. Unlike the previous code that
mixed in the npcomprt ABI stuff here, it's now a very clean "pure
memref" conversion.
See test/E2E/lower-*-to-memref.mlir and
lib/E2E/TensorToMemref/
Most of the changes are concentrated here.

4. As part of the above, we use the upstream ConvertShapeToStandard for
lowering shapes.

5. We lower linalg to loops and lower loops to CFG using upstream
passes.

6. Rewrite the "ABI" boundaries of the program to npcomprt data
structures (LowerToNpcomprtABI). This mainly affects ABI boundaries and
how global tensor constants are represented. One of the major
improvements in this commit is that now it's a very clean rewrite that
just replaces memrefs on ABI boundaries with !npcomprt.tensor (before
there was a get_extent function that is not needed).
See test/E2E/lower-to-npcomprt-abi.mlir

7. Lower to LLVM with upstream mlir patterns + some patterns for the
npcomprt lowerings.

One aspect here that is still a remnant of a non-descriptor-based tensor
to memref flow is the BypassShapes + LowerShapedResultsToMemref.
BypassShapes wraps the "tensor compute" ops in a tcp.shaped_results
(basically a "tie_shape" kind of op), and then
LowerShapedResultsToMemref uses those annotations to allocate output
buffers while lowering the "tensor compute ops". Note that there are
very few "tensor compute" ops currently supported (tcp.add +
tcp.broadcast_to), so we just hardcode them in both passes.
Realistically, I expect this to go away as we fully embrace the
descriptor-based approach for simplicity, so don't look too deep into
it.
2020-09-16 17:31:40 -07:00
Stella Laurenzo a74a98094b
Add a new python script to auto-generate ATen op ODS definitions. (#43)
* Add a new python script to auto-generate ATen op ODS definitions.

* There is still some work on some of the ops to annotate correct types.
* The ODS is not actually included into the dialect yet, but I'd like to commit it so that we can track changes.
* Will reconcile this with the ops produced by the existing script in a followup. Still need to do some more iteration to reach parity.
2020-09-16 16:21:24 -07:00
Marius Brehler d62f8227c2
Bump LLVM to @7d1ed69 and fix namespace handling changed upstream.
* Bump LLVM to llvm/llvm-project@7d1ed69
* Bump MLIR-HLO to tensorflow/mlir-hlo@1880f87
* Adopt to MLIR's changed namespace handling
2020-09-16 15:52:15 -07:00
Stella Laurenzo 97d83f786a Bump submodule versions.
* llvm-project: b5924a8e27536d19dd5c4d302db29fb6163d5faa
* mhlo: 848ca244d20f045b7921da55a98a04d95ef94f0e
* Multiple breakages that need to be fixed.

Fixes:
* Refactor dialect registration
* Remove all kindof methods (Casting functionality has been added upstream and is implicitly
available, see https://llvm.discourse.group/t/removing-kinds-from-attributes-and-types/1547.)
* Update dialect registration to comply with https://reviews.llvm.org/D85495.
* Remove type kinds and update some changed dialect signatures.
* Upgrade ATen dialect to match upstream needs.
  * Move dialect registration to tablegen.
  * Register the ListType in tablegen.
  * Change dialect initialization signature.
* Use TypeSwitch in MlirIr location printer.
* Remove global registry depends from npcomp-opt.
* Change LowerToLLVM to pass an MLIRContext vs an LLVMDialect for type creation.
* Remove dep on MLIREDSCInterface that is removed upstream.
* Thread through the DialectRegistry for opt and python-like tools.
* Modernize pass registration (This was forced because the GEN_PASS_REGISTRATION code now generates inline functions vs literal pass registration statements)

Co-authored-by: Marius Brehler <marius.brehler@iml.fraunhofer.de>
2020-09-08 13:26:42 -07:00
Stella Laurenzo fc4f374345 Format sources. 2020-08-27 14:47:49 -07:00
stephenneuendorffer bb668e6e26
Add ATen Dialect (#16)
This patch adds a dialect intended to be used as a frontend dialect
to facilitate lowering from "A Tensor Library" in torch/pytorch.

This patch includes several passes that are useful in conjuction with the
dialect:

--aten-layer-name: Generates layer names for each operation, which are not
  present in the original pytorch.
--aten-to-std: Lower the ATen dialect into standard dialect function calls.
--return-elimination-pass: convert functions (primarily the toplevel function)
  to pass return values by reference.  This simplifies pytorch integration.
--aten-op-report: generate a textual report about the model
--liveness-report

Future patches will implement actual integration with the pytorch jit to
intercept and generates MLIR in this dialect, then lower the resulting MLIR
into function calls through aten-layer-name -> aten-to-std ->
return-elimination -> std-to-llvm. The result would then jitted using the LLVM
jit, linked against a runtime library which makes calls back into pytorch to
implement all the layers.

Co-authored-by: Jeff Fifield <jeff.fifield@xilinx.com>

Co-authored-by: Jeff Fifield <jeff.fifield@xilinx.com>
2020-08-12 19:28:04 -07:00
Stella Laurenzo fc484d1bd8 Rework reference shape lowering based on upstream shape dialect changes.
* Primarily, the upstream shape dialect now uses tensor<?xindex> for non-erroring, immediate shape calculations (and will return this for shape_of of a tensor or memref).
* In addition, upstream passes do not yet exist for fully lowering to standard ops, so the passes here need to be extended to handle this new convention.
* This should be seen as an intermediate state, necessary to integrate a new LLVM version and needs more work and cleanup for generality.
* There is a good deal of awkwardness in these conversions. The hope is that additional upstream work will yield better defined conversion paths once out of this intermediate state.
2020-08-03 13:43:49 -07:00
Stella Laurenzo 9d5d802cc8 Fix compilation issues due to llvm-project version bump.
* Redundant infer type implementations removed.
* Update to the linalg GenericOp build calls.
2020-08-01 15:23:57 -07:00
Sean Silva d0f15d2cec Add -optimize flag to npcomp-run-mlir so that it runs optimizations.
My main interest in this is that tweaking the default of this flag is a
quick way to check for miscompiling canonicalizations / op definitions
not annotated properly (e.g. marked NoSideEffect when in fact it is not
safe to do so).
2020-07-13 16:07:44 -07:00
Stella Laurenzo 29da57e631 Update sample for refjit invocation. 2020-07-10 22:57:26 -07:00
Stella Laurenzo 9e4a62fc71 Allow JITModule passes to be built separately.
* Re-introduces frontent/backend split.
* Adds a (very) trivial shape refinement pass.
2020-07-10 22:57:26 -07:00
Stella Laurenzo aea05d68d7 Initial python plumbing to interface with the refjit backend. 2020-07-10 22:57:26 -07:00
Sean Silva df0d3fcaff Consolidate LLVM definitions of runtime data structures.
This required making module descriptors hold a FuncDescriptor* instead
of a pointer to array of FuncDescriptors as it previously did, which is
innocuous (just requires an llvm.bitcast after the llvm.mlir.addressof).
2020-07-10 17:50:55 -07:00
Sean Silva e228aa4b11 npcomprt: add support for constants
- create tcp.global + tcp.get_global_memref
- create npcomprt.global + npcomprt.get_global
- LLVM lowering for new npcomprt ops
- Runtime:
 - GlobalDescriptor struct emitted by LLVM lowering
 - implement __npcomp_compiler_rt_get_global

Also,
- cleanly isolate all runtime data structure definitions shared by the
compiler and runtime into lib/runtime/CompilerDataStructures.h
2020-07-10 17:31:24 -07:00
Stella Laurenzo efbcf0aa44 Add NumpyPublicFunctionsToTensor pass.
* Rewrites public function signatures to operate on tensors (vs ndarray).
* Most of our backends presume immutable tensors at public function boundaries.
2020-07-08 22:51:54 -07:00
Stella Laurenzo 5ceb37c19b Add NumpyToTCF conversion.
* Just for numpy.add right now.
2020-07-08 21:03:57 -07:00
Sean Silva b4f0cea8fa Rework e2e flow to use new "npcomprt"
This ~totally reworks the existing "runtime" stuff to be more
principled and usable, such as from Python. It's still not fully
production-quality, mainly in the department of memory management (e.g.
it currently leaks memory; we need to figure out "who frees memrefs" +
the analysis and transformation needed to do that (maybe use upstream
buffer allocation pass?)).

The user API is in include/npcomp/runtime/UserAPI.h, though
include/npcomp/JITRuntime/JITModule.h is a friendlier wrapper.

The stuff under {include,lib}/runtime is totally firewalled from the
compiler and tiny (<6kB, though no attention has gone into optimizing
that size). For example, we don't link in libSupport into the runtime,
instead having our own bare bones replacements for basics like ArrayRef
(the JITRuntime helps with bridging that gap, since it *can* depend on
all common LLVM utilities).

The overall features of npcomprt is that it exposes a module that
with multiple function entry points. Each function has arguments and
results that are tensor-valued, and npcomprt::Tensor is the runtime type
that is used to interact with that (and a npcomprt::Ref<T>
reference-counting wrapper is provided to wrap npcomprt::Tensor in the
common case).

From an implementation perspective, an npcomprt module at the
LLVM/object/binary level exposes a single module descriptor struct that
has pointers to other metadata (currently just a list of function
metadata descriptors). All interactions with the npcomp runtime are
keyed off of that module descriptor, including function lookups and
dispatching. This is done to dodge platform ABI issues and also allow
enough reflection to e.g. verify provided arguments.

Most of the compiler-side work here was in LowerToNpcomprtABI and
LowerToLLVM.

Also,
- Rename npcomp_rt/NpcompRt to npcomprt/Npcomprt; it was getting
annoying to type the underscores/caps.
- misc improvements to bash_helpers.sh
2020-07-08 19:36:19 -07:00
Stella Laurenzo 5aa2f0f9f6 Add a trivial copy elision canonicalization on ndarray->tensor.
* This elides the very common code the compiler adds for chaining otherwise tensor-related numpy ops together.
* More aggressive canonicalizations would require more advanced analysis.
2020-07-05 18:09:43 -07:00
Stella Laurenzo fae15ec5e7 Allow the ndarray type to carry a shape. 2020-07-05 17:34:03 -07:00
Stella Laurenzo dc271dfb87 Complete the basic spike to perform dtype inference.
* Correctly infers the unknown dtypes that emit as part of compilation for the simple ufunc case.
* Significant more testing needs to be done on the details now that the pass is minimally functional.
* The actual pass itself is still too hacky/not general, but the underlying analysis is further along.
2020-07-05 16:09:16 -07:00
Stella Laurenzo 86ea90ba84 NFC: Rename Support.(h|cpp) to Types.(h|cpp). 2020-07-04 20:42:37 -07:00
Stella Laurenzo 4a5695ae9c Fix createTensorLikeArrayType() declaration. 2020-07-04 20:37:46 -07:00
Stella Laurenzo 00c791f925 Make common utilities for converting TypeNode <-> IR types.
* Generalizes the conversions from ObjectValueType <-> tensor and ndarray.
* Creates a utility to construct the default type map hook.
2020-07-04 20:33:13 -07:00
Stella Laurenzo 97c92aa264 Remove the existing attached values/ops from CPA types.
This was ad-hoc and needs to be replaced by a more principled track back to the IR.
2020-07-04 17:47:19 -07:00
Stella Laurenzo 48a0b0ec7f NFC: Move CPATypeInference to Typing directory. 2020-07-04 16:56:09 -07:00
Stella Laurenzo 051d088161 NFC: Move CPA typing analysis down a directory. 2020-07-04 16:40:02 -07:00
Stella Laurenzo 6a50efd046 Extend the CPA type inference to work on numpy types/ops.
* Adds an op interface for adding CPA constraints.
* Adds a type conversion hook for handling built-in types (that we can't have adopt our interface).
* Converts tensor<> to object(!Tensor, [e:<type>]) just like NdArray.
* Implement a few numpy ops far enough to do dtype inference for simple sequences.
2020-07-03 18:16:34 -07:00
Stella Laurenzo 34861b18f4 Add NdArray type inference conversion. 2020-07-03 16:38:10 -07:00
Stella Laurenzo 4a2f7c0b5f Add constraint propagation and tracking of node members. 2020-07-03 13:29:52 -07:00
Stella Laurenzo 1a13c38033 More progress on CPA.
* Added transitivity propagation rules.
* Fixed up some copy-n-paste inversions from the old algorithm.
2020-07-02 18:56:05 -07:00
Stella Laurenzo 74b8bed7e3 Unique CPA type and constraints to enable comparison by pointer during propagation. 2020-07-02 17:07:02 -07:00
Stella Laurenzo a257da46e2 Introduce a type interface for mapping to CPA types.
* Currently just simplifies the logic for UnknownType -> TypeVar.
2020-07-02 13:56:27 -07:00
Stella Laurenzo b0604684ba NFC: Move CPA support down into it's own directory. 2020-07-02 11:31:23 -07:00
Stella Laurenzo e1839a0d6b Bump llvm and iree versions.
* Gets us passed the upstream changes that enable type interfaces.
* Adds the ARM backend due to an implicit IREE dependency sneaking in for that (https://github.com/google/iree/issues/2401)
* Adds explicit TypeStorage to type base classes per upstream change.
2020-07-02 11:24:05 -07:00
Stella Laurenzo 92190176fb Add skeleton of pass to do modified PCA type inference. 2020-06-30 20:57:09 -07:00
Stella Laurenzo f1b08a0ef0 Add some support classes for implementing a CPA type inference algorithm. 2020-06-30 18:28:39 -07:00
Stella Laurenzo 046751254f Refactor old tracing tests and remove deprecated ops.
* Old doctests to run under lit.
* Old custom filecheck tests -> pytest directory (under lit).
* Rename some old ufunc ops in the tracer.
2020-06-29 16:19:03 -07:00
Stella Laurenzo 7ca292ade5 Add partial evaluator for explicit numpy ufuncs.
* This enables emission of "numpy.add(a, b)" and several dozen others.
* Will deprecate original ufunc infra in a follow-on.
2020-06-29 15:27:39 -07:00
Stella Laurenzo a4f3ce1ed3 Add value coding for ndarray.
* This lets us import arrays from the outer environment, which is the first step to actually handling numpy ops.
2020-06-28 18:42:08 -07:00
Stella Laurenzo f6721c173d Add create_array_from_tensor and copy_to_tensor ops. 2020-06-28 17:58:26 -07:00
Stella Laurenzo efe8915901 Add NdArrayType. 2020-06-28 17:37:20 -07:00
Stella Laurenzo 7bd5733d38 Add "template function" ops and importer code.
* This starts to lay down the infra for reasoning about calls
* Adds the importer code to generate IR for function calls of compiler recognized static functions.
2020-06-26 18:36:36 -07:00
Stella Laurenzo 529873d13c Wire up IREE compilation and runtime in a new backend test.
* Adds python bindings for invoking flow, HAL, and VM lowering pipelines.
* Adds pythong bindings for translating to VM module flatbuffer.
* Adds a new backend_test/iree directory and configure lit to find the IREE python rt bindings.
* Open code a simple_invoke.py that exercises the whole pipeline (need real APIs for a lot of this).
* Fails when invoking the function because I never implemented argument marshaling for scalars :(
* Plenty of stuff to do tomorrow.
2020-06-19 00:30:34 -07:00
Stella Laurenzo 373878f31f Add _npcomp.backend.iree module.
* Populates with builders for the various path pipelines and translator.
2020-06-18 23:28:30 -07:00
Stella Laurenzo 213041449f Move most python sources to the include and lib tree. 2020-06-18 18:02:39 -07:00
Stella Laurenzo b21b5322f6 Basicpy conversion to IREE+std skeleton and first conversions.
* Conversions to std for numeric binary expressions, numeric to_boolean, and numeric comparisons.
* Added folders to constant ops to comply with requirements of the pass system.
* Extended the frontend with parameter/result annotation processing for primitives (can specify types for function arguments).
* Added (empty) directory/sources for IREEVM conversions. These are only enabled if IREE is enabled.
2020-06-13 23:45:43 -07:00
Stella Laurenzo 2ba8296151 Add script tools/format_source.sh and run it on all python and c++ sources. 2020-06-13 14:53:54 -07:00
Stella Laurenzo 19196f23e1 Make a real library for InitAll and extend it to conditionally initialize dependencies.
* Conditioned on the top level CMake option to enable IREE.
* There is still some warning flags and such that need triage, but it does build/work.
2020-06-11 17:47:14 -07:00