* shape: add shape transfer function for aten.neg
Prior to this patch, the list of shape transfer functions did not
include `aten.neg`, which resulted in errors like the one below.
```
error: unsupported by backend lowering: tensor with unknown rank or dtype
note: see current operation: %0 = "torch.aten.neg"(%arg0) :
(!torch.vtensor<[256,256],f32>) -> !torch.vtensor<*,f32>
note: this is likely due to a missing shape transfer function in shape_lib_gen.py
```
This patch fixes the problem by adding a shape transfer function to
reflect the point-wise nature of this operation.
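As a rough illustration of what "point-wise" means for shapes, the result shape of a unary element-wise op is simply a copy of the operand shape. The sketch below is standalone Python; the names `unary_shape` and `neg_shape` are hypothetical and are not the identifiers used in shape_lib_gen.py.

```python
from typing import List


def unary_shape(self: List[int]) -> List[int]:
    """Shape of a point-wise unary op: identical to the operand's shape."""
    # Return a copy so callers cannot alias the input list.
    return list(self)


def neg_shape(self: List[int]) -> List[int]:
    # Hypothetical name for illustration: negation is point-wise,
    # so it reuses the generic unary rule.
    return unary_shape(self)


if __name__ == "__main__":
    print(neg_shape([256, 256]))
```

With a rule of this form, the `[256,256]` operand in the error above would yield a `[256,256]` result instead of an unranked tensor.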
* linalg: add translation of aten.neg operation
This patch adds a translation rule to lower `aten.neg` operations on
tensors to an `arith.negf` operation wrapped inside a `linalg.generic`
operation. This patch also adds a rudimentary test.
This commit adds the following support to the op `nll_loss_backward`:
- `input` tensor can be rank-1
- `weight` parameter
- `reduction` parameter
- `target`, `grad_output`, `total_weight` can be rank-0
- Checks that input tensors are of the expected type
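For reference, the gradient this op is expected to produce can be sketched in plain Python. This is only an assumed reference model, not the lowering itself: the function name `nll_loss_backward_ref` is hypothetical, it handles a rank-2 input (batch by classes) only, and it leaves out `ignore_index`.

```python
from typing import List, Optional, Union


def nll_loss_backward_ref(
    grad_output: Union[float, List[float]],
    target: List[int],
    num_classes: int,
    weight: Optional[List[float]] = None,
    reduction: str = "mean",
) -> List[List[float]]:
    """Reference gradient of NLL loss w.r.t. a [batch, num_classes] input.

    grad_output is a list (one value per sample) for reduction "none"
    and a single float for "sum" and "mean".
    """
    if reduction not in ("none", "sum", "mean"):
        raise ValueError(f"unknown reduction: {reduction}")
    # A missing weight behaves like a weight of 1.0 for every class.
    w = weight if weight is not None else [1.0] * num_classes
    # For "mean", the forward pass divides by the summed target weights.
    total_weight = sum(w[t] for t in target)
    grad_input = [[0.0] * num_classes for _ in target]
    for i, t in enumerate(target):
        if reduction == "none":
            g = grad_output[i]
        elif reduction == "sum":
            g = grad_output
        else:  # "mean"
            g = grad_output / total_weight
        # Only the entry selected by the target class receives a gradient.
        grad_input[i][t] = -w[t] * g
    return grad_input
```

For example, with `weight=[1.0, 2.0, 3.0]`, `target=[0, 2]`, and `reduction="mean"`, the total weight is 4.0, so an upstream gradient of 1.0 gives -0.25 and -0.75 at the two target positions and zero elsewhere.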
To make sure that the TorchToLinalg conversions leave the graph in a
valid state, the final result of the conversion has to be cast to the
result type of the op. This commit adds this cast to the ops that did
not have it.
This commit fixes the 2nd and 3rd return types of `aten.native_layer_norm`.
Previously the mean and rSTD were returned with the reduction dims removed.
They are now returned with the reduction dims kept.
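A small sketch of the expected result shape, assuming "kept" means each reduced dimension is retained with size 1. The helper name `layer_norm_stat_shape` is hypothetical.

```python
from typing import List


def layer_norm_stat_shape(
    input_shape: List[int], normalized_shape: List[int]
) -> List[int]:
    """Shape of the mean / rSTD results when reduction dims are kept."""
    n = len(normalized_shape)
    axis = len(input_shape) - n  # first normalized (reduced) dimension
    if axis < 0 or list(input_shape[axis:]) != list(normalized_shape):
        raise ValueError("normalized_shape must match the trailing input dims")
    # Leading dims are preserved; each reduced dim collapses to size 1.
    return list(input_shape[:axis]) + [1] * n


if __name__ == "__main__":
    print(layer_norm_stat_shape([2, 5, 2, 2, 3], [2, 2, 3]))
```

Under that assumption, an input of shape `[2, 5, 2, 2, 3]` normalized over `[2, 2, 3]` gives mean and rSTD of shape `[2, 5, 1, 1, 1]` rather than `[2, 5]`.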
Signed-Off-By: Prateek Gupta <prateek@nord-labs.com>
The term "pseudo" is very vague and was getting confusing (I felt I had
to explain it in every comment referencing it). Rework the "pseudo" ops
to be named as follows instead:
- MLIR Syntax: `torch.valsem.*`
- C++ / ODS: `ValsemVariant*Op`
This makes it clear what the concept is, and avoids confusion with other
things that might be called "pseudo", since these are very specific and
should be 100% consistently named w.r.t. the non-valsem-variant ops that
they correspond to.
This helps keep things organized and also exposes more parallelism to
the build system. It seems, though, that most of the compile time is
actually spent in the headers, so the wall time doesn't decrease as much
as I had hoped. Now that the headers are being included multiple times,
the CPU time actually increases a lot, sadly (will try to dig into
this).