A previous fix to the handling of size-1 dims in
`aten.view` (https://github.com/llvm/torch-mlir/pull/962) resulted in
the wrong grouping of dimensions when size-1 dims where between two
dims of size greater than 1. This commit fixes that.
This commit lowers `aten.matmul` to `linalg.BatchMatmul` under the
following conditions:
1. The result of matrix multiplication must have batch dimensions,
i.e., rank greater than 2.
2. The resultant matrix must have at most 1 dynamic batch dimension.
It also handles broadcasting of batch dimensions when batch dimensions
of the matrices are broadcastable.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This commit decomposes `aten.baddbmm` op into `aten.bmm`,
`aten.mul.Scalar`, and `aten.add.Tensor` op.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
A user might want to avoid the extra layer of multiprocessing libary for
debugging purpose. In such cases, the -s flag can be used to force
sequential execution.
Added the dynamic registration of return function to the execution
engine. This makes sure that different/multiple return types are supported.
Also, updated the .style.yapf indentation to 4.
That way, downstreams don't have to duplicate this list.
Also, remove "external config" feature, since it is subsumed by just
importing the test suite.
This commit adds support for the cases of view op where the rank and
the shapes of the input and result are equal.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
Effectively, this mode works by compiling op by op as the NN is eagerly executed by PyTorch. Entailed in that compilation is building a representation of the op that can be `torch.jit.script`ed, importing using `ModuleBuilder`, and then executing (e.g., with `RefBackendLinalgOnTensorsBackend`). This mode includes a fallback to conventional PyTorch if anything in the torch-mlir compilation process fails (e.g., unsupported op).
Currently, all e2e tests pass execpt for two that involve an upstream PyTorch bug (https://github.com/pytorch/pytorch/issues/74400).
High priority next steps:
1. A compile cache in order to speed up reruns of the same NN.
2. Integration with IREE (though not in this repo).
3. Integration with `torch.distributed`.
- This commit adds decomposition of `aten.dropout` op. It also covers the
training mode of the same op.
- It also adds lowering of `aten.sub.float` op.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This commit adds the op `ValsemVariantAtenCopyOp` that represents
`AtenCopy_Op` without the underscore. This is needed to make sure
that the `ReduceOpVariants` pass turns the in-place op into an op
that takes value tensors as inputs, otherwise the
`MaximizeValueSemantics` pass will not be able to add value
semantics correctly.
This commit also adds the lowering of `ValsemVariantAtenCopyOp`.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
This commit fixes the 2nd and 3rd return types of the `aten.native_layer_norm`.
Previously the mean and rSTD were returned with reduction dims removed.
This commit fixes this and keeps the reduction dims of the results.
Signed-Off-By: Prateek Gupta <prateek@nord-labs.com>
This commit adds the op `ValsemVariantAtenIndexPutImplOp` that represents
`Aten_IndexPutImpl_Op` without the underscore. This is needed to
make sure that the `ReduceOpVariants` pass turns the in-place op
into an op that takes value tensors as inputs, otherwise the
`MaximizeValueSemantics` pass will not be able to add value
semantics correctly.
This commit also adds the lowering of `ValsemVariantAtenIndexPutImplOp` op.
This commit also updates the `torch.bincount` op test cases.
See the documentation in `docs/shape_lib.md` and
`docs/adding_a_shape_function.md` for an overview of the system.
This completely overhauls how we represent shape functions. In
particular, RefineTypes does not infer shapes anymore (only dtypes).
Shape functions are now written in (TorchScript'able) Python.
Recommended review order:
1. Read `docs/shape_lib.md` and `docs/adding_a_shape_function.md`.
1. Code and tests for ReifyShapeCalculations, DropShapeCalculations.
1. Code and tests for SimplifyShapeCalculations.
1. shape_lib_gen.py
1. Code and tests for new RefineTypes pass.
1. Random folders/canonicalizers in TorchOps.cpp and associated test in
`canonicalize.mlir`.
1. New ReadOnly trait inferred from the registry.
1. Any miscellaneous remaining stuff.
Example `-print-ir-after-all` for ElementwiseUnaryModule:
[IR lowering dump](https://gist.github.com/silvasean/e4dc8cbc8d00aac7819602e3cbd8e212).
Example `-print-ir-after-all` for ElementwiseBinaryModule:
[IR lowering dump](https://gist.github.com/silvasean/daf6860ecced732af3568af6b1899113).
This pass is added to lower ops, which can not be lowered
via the TorchToLinalg pass, such as `torch.bincount` op.
This pass also uses torch-mlir's TMTensor Dialect to lower the
complex ops.
Also add torch.bincount op lowering with the help of TMTensor dialect
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
This commit adds support for integer type inputs for
`AtenMaxOp`, `AtenSumOp`, `AtenSumDimIntListOp`.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
- This commit adds E2E support for `aten.rand_like` and
`aten.bernoulli_.Tensor` ops.
- The `aten.bernoulli(x)` was implemented as:
`aten.bernoulli(x) = rand_like(x) < 0.5`, assuming 0.5 as default
probability, whereas according to the pytorch documentation:
https://pytorch.org/docs/stable/generated/torch.bernoulli.html#torch.bernoulli
the input x in `aten.bernoulli(x)` is itself a tensor containing
probabilities to be used for drawing the binary random number.
- So this commit fixes the `aten.bernoulli(x)` implementation as:
`aten.bernoulli(x) = rand_like(x) < x`.
- It also fixes the case where the input to `aten.bernoulli_.float` is
an integer tensor. In this case the input must be casted to float type
before passing it as operand to `aten.rand_like` op.
`aten.bernoulli_.float(x, p) = rand_like(float(x)) < p`.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
The view op allows for the new shape argument to have a -1 value for
one of the dimensions, and the op is expected to deduce the size of
that dimension by looking at the sizes of the other dimensions and
comparing it to the total number of elements in the original
tensor. This commit adds this functionality.
This commit does a couple of things. First, it fixes a bug in the
`linalg.generic` body of the `nll_loss_forward` lowering where the
`ignoreIndex` was being compared with the loop index rather than the
current element of the `target` tensor. This was not being caught by
the tests because they were not testing the case where `ingnoreIndex`
actually corresponds to a value in `target`. This has been fixed.
Second, this commit adds support for the `reduction` argument in
`torch.nll_loss_forward` as well as support for 1-D inputs. In order
to simplify the lowering code, I've refactored the code that creates
the `linalg.generic` ops for elementwise and reduction ops into static
functions, to avoid having boilerplate code for indexing maps, etc
that can be very error prone.
Note: The function `convertScalarToDtype` was moved to before all the
conversion patterns, but nothing in it was modified.
- This commit decomposes the `aten.batch_norm` op into the
`aten.native_batch_norm` op, instead of lowering it to the
`linalg.generic` op.
- It also adds run-time asserts in the `aten.native_batch_norm` lowering
to make sure that the shape of the weight, bias, running_mean, and
running_var must match the num of features.
- Since the `aten.native_batch_norm` op is not supported at TOSA backend,
all the modules that are dependent on the `aten.native_batch_norm` op
will fail and therefore they should be removed from the TOSA `passing`
set.
- It also moves `checkNotNone` to utility.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This commit adds the op `PseudoAtenFillScalarOp` that represents
`AtenFill_ScalarOp` without the underscore. The approach is the same
as in commit dd998fa4d4.
Adding this op allows for a simpler and more consistent version of the
`empty` and `empty_like` op e2e tests.
- This commit adds lowering of `aten.le.Scalar` and `aten.ge.Scalar` ops
as a part of `convert-torch-to-linalg` pass.
- It also creates a new test script `elementwise_comparison.py` for all
element-wise comparison ops.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
- This commit adds lowering of `aten.eq.int` op as a part of
`convert-torch-to-std` pass.
- It also refactors the code for binary comparison ops lowering.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
- This commit adds lowering of `aten.Bool.Tensor` and
`aten.Float.Tensor` op as a part of `convert-torch-to-linalg` pass.
- It also adds support for returning bool types.
- It also fixes lowering of the `aten.Int.Tensor` op for non-zero rank
input tensors.
- If a scalar number is converted to a 0-d tensor and passed on to the
`aten.Float.Tensor` op, it folds to the scalar number.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
- The `self` name is being used as a keyword argument to the
`torch.ops.aten.nll_loss_backward` function call, which produces
name-conflict error with the python keyword `self` which is pointer to
the current object.
- This commit fixes this issue by replacing the keyword argument by
positional argument.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
Some of the lowerings use the result type obtained from the op itself
to tell the `linalg::GenericOp` what the type of the result should be
rather than using the type of the result tensor given to the
`linalg::GenericOp`. This becomes a problem when the result type of
the op has static size information and the result tensor used in
`linalg::GenericOp` has dynamic dimensions, for `linalg::GenericOp`
expects the result type to be equal to the type of the output tensor.
This commit replaces the use of the result type from the op itself
with the type of the result tensor passed to `linalg::GenericOp`.
In order to not create too many dynamic/static versions of the same
e2e test, e2e tests have only been added to the ops that currently
fail when used with static sizes.
* [tosa] Support for AtenNe[Tensor|Scalar]Op, AtenLog2Op,
AtenBitwiseAndTensorOp, AtenSquareOp and AtenThresholdOp
* Fix for Issue #532 - Mixed input types for few ops and updated few
tests to use i32 instead of i64
Signed-off-by: Anup Gangwar <anup.gangwar@arm.com>
Co-authored-by: Anup Gangwar <anup.gangwar@arm.com>
This commit fixes an error in the refine types pass of constant
allocation ops. The function used to set the dtype,
`fillInDtypeGivenDtypeAndDataType`, takes two torch types as arguments,
but a torch type and a standard MLIR type were being passed into it.
This commit also fixes the way the dtype was calculated in
`visitAtenToDtypeOp`. This op was also passing a standard MLIR type as
an argument to the `fillInDtypeGivenDtypeAndDataType`
function. Moreover, since the op `aten.to.dtype` has the dtype
argument as not optional, all that is needed is to match
against the int value to extract the dtype.
- This commit adds support for `aten.native_batch_norm` operation.
- The current implementation only supports inference mode of
`aten.native_batch_norm` op.
Signed-Off-By: Gaurav Shukla <gaurav@nod-labs.com>
The lowering of aten::nll_loss_backward op has been added
from torch to linalg dialect. The changes has been made as
a part of -torch-convert-to-linalg pass.
Signed-off-by: Prashant Kumar prashant@nod-labs.com
This PR include the following pieces:
- Add torch `Generator` type. `Generator` type is converted to i64 in
refbackend type converter.
- Add seed managment support for the default global generator.
`torch_c.getNextSeed` op is used to get the seed. On refbackend, the
`torch_c.getNextSeed` is lowered to load/store from [0] of global
variable `default_generator` memref<i64> in `InsertRngGlobals` pass.
- Add `aten.uniform_` and testing as an example op for RNG ops. Add
`torch.pseudo.aten.uniform` op. It has the same operands and return as
the `aten.uniform_` from the op registry except for value semantics.
The added e2e maxpool testcase from #545 was not getting a static shape
due to an unfolded prim.If when RefineTypes was called. This was because
of unfolded torch.iaten.__is__ and torch.prim.unchecked_cast operators
with torch.derefine operands.
* [tosa] Support for AtenCeilOp and AtenReciprocalOp
* [tosa] Support for comparator ops, Aten[Gt|Lt|Eq][Tensor|Scalar]Op with scalar constant
* [tosa] Support for Scalar variants of Aten[Mul|Div|Add|Sub] Ops with scalar constants
Signed-off-by: Anup Gangwar <anup.gangwar@arm.com>
Co-authored-by: Anup Gangwar <anup.gangwar@arm.com>
Note that to enable folding of the code coming from an example
like the ConstantPad2dStaticModule e2e test, support for other
operations had to be added/improved:
- aten::neg.int
- aten::eq.float
- aten::eq.str
- prim::Uninitialized
This commit adds lowering of `aten.threshold` op
This commit adds lowering of `aten.threshold_backward` op
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
This involes the following 2 parts:
- Change refine type to propagate more static shape info.
- Get as much static shape info as possible when creating the result
tensor when converting to linalg.
- This commit adds E2E support for `aten.ones_like` and
`aten.zeros_like` ops.
- Adds support for non-None `dtype` argument of `aten.empty_like` op.
- All the unit test cases related to constant tensor allocation like ops
are moved to a different file named `constant_alloc.py`.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This commit adds lowering of `aten.arange.start_step` op.
This commit decomposes `aten.arange` and `aten.arange.start` into
`aten.arange.start_step` op.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
- It folds `aten.to.dtype` when the input tensor type and result type
are exactly same.
- It folds `aten.view` when the rank of both the input tensor type and
result type is unity.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
We only handle the expanding OR collapsing cases, we do not handle
expanding And collapsing happening at the same time or cases where
it's neither collapsing nor expanding like view of [2,3] for
3x2 tensor.
It's assumed that if a shape list element is got from
`aten.size(tensor, dim)` the corresponding dim is not splitted or
collapsed. This assumption makes it easier to deal with dynamic shapes.
- Added E2E support for `aten.eq.Tensor` and `aten.lt.Tensor` ops. Both
the operands are expected to be of the same type, i.e., type promotion
is not addressed as a part of this commit.
- Added E2E support for `aten.eq.Scalar` and `aten.lt.Scalar` ops.
Tensor operand type to Scalar operand type promotion has not been
handled in this commit.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
Add the required lowerings and correct test cases.
These op produce zero-d tensors and it was incorrectly mentioned in
refine types to produce 1d tensor of size 1.
- Templatize `aten.zeros` and `aten.ones` ops lowering.
- Add E2E support for `aten.empty` op.
- Add Integer type support in `aten.mul.Scalar` op lowering.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
`aten.gt.Tensor` op has been added in torch dialect and the
lowering of the op has been done to the linalg dialect.
Signed-off-by: Prashant Kumar <prashant@nod-labs.com>
This commit adds support for aten.native_layer_norm operation. Here
the previous code for aten.layer_norm is tweaked a little bit to
accomodate both mean and variance values alongwith the layer norm
value. This commit also adds decomposition of aten.layer_norm into
aten.native_layer_norm, which was previously getting lowered directly
to linalg.
Signed-Off-By: Prateek Gupta<prateek@nod-labs.com>
This commit adds lowering of `aten.squeeze.dim` op into
`linalg.TensorCollapseShape` op. Here, the dim(th) dimension of the
input tensor is not supposed to be dynamic.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This commit adds lowering of `aten.gt.Scalar` and `aten.where.self` as a
part of element-wise ops lowering.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
Support for passing memref of bool types as a function argument
and return is added in ref-backend.
Signed-off-by: Prashant Kumar <prashant@nod-labs.com>
The op lowering has been added as a part of `torch-lower-to-linalg`
pass. This takes care of ignore_index but the weight and reduction
operand is still to be accounted for.
Signed-off-by: Prashant Kumar <prashant@nod-labs.com>
Many reduction ops take as an argument an optional output dtype that
can change the type of the input tensor before the reduction is
performed. This commit adds support for the optional dtype flag that
had been previously ignored.
Test:
/tools/torchscript_e2e_test.sh -f 'ReduceSumDtype'
/tools/torchscript_e2e_test.sh -f 'ReduceSumDImIntListDtype'
This commit adds lowering of `aten.Squeeze` op into
`linalg.TensorCollapseShape` op. The size 1 dynamic dimensions are not
handled as a part of this commit.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
The lowering of aten.fill.Scalar has been added.
The changes have been made as a part of -torch-convert-to-linalg pass.
Signed-off-by: Prashant Kumar <prashant@nod-labs.com>
This commit fixes a type promotion bug when NumToTensor was given a
float as an argument. In particular, the rules for type promotion of a
scalar vary depending on if the scalar is part of a tensor op or
not. NumToTensor falls under the second category, but it was being
treated as part of the first category.
aten.log_softmax_back_data op lowering and required
tests has been added. Some NFC have also been added.
Signed-off-by: Prashant Kumar prashant@nod-labs.com
This commit adds lowering of `aten.mul.Scalar` and also adds
decomposition of `aten.addmm` to `aten.mul.Scalar`, `aten.add.Tensor`
and `aten.mm` ops.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
Now, aten::linear supports rank 3 inputs. This is a fix
for upcoming bert-inference task. The correct way should be
to support broadcasting in `aten.matmul` op and decompose
`aten.linear` into right ops.
This commit adds new operation `aten.gelu_backward` in the aten
dialect and adds lowering of this operation from aten to linalg.
Signed-Off-By: Prateek Gupta <prateek@nod-labs.com>
This change is to unblock the work of some backprop ops returning more
than one tensors. We will need to think of a more scalable approach
in the future if more flexible return types combinations are needed.
- Remove use of conversion construction macros
- Add mul and div op conversions
- Add corresponding tests
Signed-off-by: Suraj Sudhir <suraj.sudhir@arm.com>
This is to facilitate scalar type conversion in the TorchToLinalg. As
part of adding the helper, this PR also:
- Updated `AtenAddTensorOp`, `AtenSubTensorOp` to use the helpers to
support more type variants.
- Added e2e type promotion testing.
- Added i32 memref return/arg type to support e2e testing.
Support for returning elemental types. Previously, only
memref types as returning types was supported. All the hacky ways
to write tests which return elemental types should be taken care of.
Signed-off-by: Prashant Kumar <prashant@nod-labs.com>
The lowering of `aten.Int.Tensor` op has been added.
The changes has been made as a part of `convert-torch-to-linalg` pass.
Signed-off-by: Prashant Kumar <prashant@nod-labs.com>
- This commit adds lowering of `aten.View` to `linalg.TensorExpandShape`.
- This lowering will be successful only when one or more static
dimensions are expanded.
- It also fixes a typo in `ConvertAtenFlattenUsingIntsOp` conversion
pattern.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
Part of #380
Also
- BoolType is not considered as Scalar
- e2e framework fixes for nan handling
- `tu.rand(..., low=, high=)` support
- delete unused variable (fix warning)
- Add IouOfModule from #380 to e2e test suite (this is a common
calculation in vision models)
Your branch is ahead of 'origin/main' by 1 commit.
Lowering of `aten.matmul` op is added from torch to linalg dialect.
The different cases correspond to
https://pytorch.org/docs/stable/generated/torch.matmul.html.
TODO: Broadcasting in case of batch-matmul is yet to be taken care of.
Signed-off-by: Prashant Kumar <prashant@nod-labs.com>
* Print more exception info on error during test execution
* Fix formatting
* Add aten::gelu lowering
Co-authored-by: Boian Petkantchin <boian@nod-labs.com>
Includes a fix to use `add_mlir_public_c_api_library` for Torch-MLIR's CAPI library, which is now required (note: upstream sample has it the right way).
Disabled a TOSA test per discussion: https://github.com/llvm/torch-mlir/issues/379
Summary:
This commit fixes an off-by-one error in how negative dimensiosn were
being handled in the lowering of transpose. This commit also adds
tests to transpose and unsqueeze to test negative dimensions.
- Added a DecomposeComplexOps pass to decompose complex torchOps.
- Refactored `visitAtenArgmaxOp` and `visitAtenAnyDimOp` to
`visitReductionAlongDimIntOp`.
- Moved some helper functions into
torch-mlir/Dialect/Torch/Utils/Utils.h to be shared by multiple files.
- Added support for f64 tensor as argument and return types.
We lower through linalg-on-tensors and use RefBackend to run it.
This adds enough support for a "tanh" op. Adding more ops should be
fairly mechanical now that things are wired up. Run with:
```
./tools/torchscript_e2e_test.sh -c tosa
```
The backend structure is very similar to linalg-on-tensors based E2E
backends and is a nice parallel (see `tosa_backend.py`). Actually, this
forced a nice refactoring to the layering here. We removed
`torchscript-module-to-linalg-on-tensors-backend-pipeline` and instead
require separately running
```
torchscript-function-to-torch-backend-pipeline,torch-backend-to-linalg-on-tensors-backend-pipeline
```
This highlights the step that lowers to the "torch backend contract"
of cleaned up `torch` dialect ops is a critical step in the lowering.
Going forward, that is the key load-bearing contract of the torch-mlir
project, not the linalg-on-tensors backend contract.
Recommended review order:
- `TorchToTosa.cpp` / `TorchToTosa/basic.mlir`
- `python/torch_mlir_e2e_test/torchscript/configs/tosa_backend.py` and
the new `utils.py` file there.
- `python/torch_mlir_e2e_test/tosa_backends/linalg_on_tensors.py` and
`abc.py` in that directory for the TOSA backend e2e interface.
- other misc mechanical changes
This is a simple way for externals to plug their backends into the test
suite. They just implement the `TestConfig` class for their backend and
write a small script that exposes it.
I have a pending PR for iree-samples that successfully integrates this.
This commit (with approval from all contributors) dual licenses
the torch-mlir project under both the standard LLVM license and the
standard PyTorch license. This will facilitate moving code between
torch-mlir and the two upstream projects.
The standard file comment is now:
```
// This file is licensed under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
// Also available under a BSD-style license. See LICENSE.
```
See `LICENSE` in the project root for the terms of both licenses.
This leaves no real code outside torch-mlir.
This also renames the "npcomp backend contract" to "linalg on tensors
backend contract" as the name of the abstraction layer that RefBackend
(IREE too) accepts.
Also contains the following changes:
- Remove derefineOp canonicalizer because it's not safe.
- Support for optional tensor and list tensors in reduceOpVariant. This
only works for some special detected and easy to handle cases. For list,
it covers the case list is got from a `ListConstruct`. For optional, it
covers the case optional is constructed from a `DerefineOp`.
- Remove the `inferReturnTypes` for `FromBuiltinTensorOp` because it's
not safe to deduce types from the input. For example, a built-in tensor
of i8 could be converted to si8 or ui8. It's better to let the user
specify the return type explicitly.
A few remain in examples/docs that will be naturally be updated in due
time.
This regresses the list support and the general direction of more widely
supported control flow, lists/dicts/globals that we were going for with
the TorchScript path. The idea is that we are deferring that work to
make torch-mlir a very clean standalone thing. We will reboot it,
probably using some of the tools of iree_pydm to make it simpler, and in
a more natural place (such as an iree-torch repo that depends on IREE and
torch-mlir to build a working PyTorch frontend solution for IREE -- it
was really weird that npcomp depended on IREE).
`tools/torchscript_e2e_test.sh` is all green.
This needs a few passes I put into torch-mlir/lib/RefBackend (not to be
confused with `npcomp/lib/RefBackend`, which will soon be deleted).
For the sake of review, since this brings together a lot of things, I
split this into its own commit. I temporarily commented out some "list"
stuff that we are going to remove as part of the torch-mlir refocus.
It just contained the e2e testing framework. We now fold it into the
main project to reduce complexity.
- `frontends/pytorch/python/` -> `python/torch_support`
- `frontends/pytorch/e2e_testing -> e2e_testing`
- `frontends/pytorch/examples -> examples`
- `frontends/pytorch/test` -> `python/test`
- `torch_mlir_torchscript` python module -> `npcomp_torchscript`
- `torch_mlir_torchscript_e2e_test_configs` python module ->
`npcomp_torchscript_e2e_test_configs`
This also changes the license of a handful of files from the
"pytorch-style" license to the regular LLVM/npcomp license. The only
people who committed to those files were myself and Yi.