torch-mlir

Commit Graph

Author	SHA1	Message	Date
zjgarvey	de28c8540b	[ONNX] add int16 quantization support (#3446) There is currently no int16 quantization support in torch. This patch adds a new mlir type to correspond to the missing "torch.qint16" type, and enables lowering of quantization-related onnx ops using int16 types. In follow-up patches, custom quantization logic for ops like aten.matmul/aten.mm/aten.convolution may need to be revisited to allow support for qint16. The passes in FuseQuantizedOps.cpp may also need slight modifications.	2024-06-12 10:37:22 +05:30
penguin_wwy	6679728c56	Fix deprecated uses of cast/dyn_cast/dyn_cast_or_null/isa (#3243) Like #3130, gradually replace the deprecated code https://github.com/llvm/mlir-www/blob/main/website/content/deprecation/_index.md#deprecated	2024-04-27 14:00:56 -07:00
Rob Suderman	f6f890520b	[torch][quant] Quantized `torch.mm` for linalg with end-to-end test (#2750) This includes custom op matching for decomposed operations and fusing dequantization into dense operations. As a validation we compare to the dequant+mm torch implementation.	2024-01-24 14:02:50 -08:00

3 Commits (5a627c46b76f8cdc737aef3bda1b910836e33d88)