W - data type for out() output
public final class QuantizedMatMulWithBias<W> extends PrimitiveOp
The inputs must be two-dimensional matrices and a 1D bias vector. The inner dimension of `a` (after being transposed if `transpose_a` is non-zero) must match the outer dimension of `b` (after being transposed if `transpose_b` is non-zero). The bias values are then added to the matrix multiplication result with broadcasting. The bias size must match the inner dimension of `b`.
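
The shape rules above can be illustrated with a plain-Java sketch that works on ordinary `float` arrays. It is not the TensorFlow kernel and does no quantization; the class `MatMulWithBiasSketch` and its methods are hypothetical names introduced here. The sketch also assumes that the bias length must equal the number of columns of the result, which is what a broadcast add over the rows requires.

```java
import java.util.Arrays;

/** Float reference for the shape rules only; hypothetical, not the TensorFlow kernel. */
final class MatMulWithBiasSketch {

    /** Returns x transposed when flag is true, otherwise x itself. Assumes x is non-empty. */
    static float[][] maybeTranspose(float[][] x, boolean flag) {
        if (!flag) {
            return x;
        }
        float[][] t = new float[x[0].length][x.length];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < x[0].length; j++) {
                t[j][i] = x[i][j];
            }
        }
        return t;
    }

    /** Computes (a x b) + bias, with bias broadcast across every row of the product. */
    static float[][] matMulWithBias(float[][] a, float[][] b, float[] bias,
                                    boolean transposeA, boolean transposeB) {
        float[][] lhs = maybeTranspose(a, transposeA); // m x k
        float[][] rhs = maybeTranspose(b, transposeB); // k x n
        int m = lhs.length;
        int k = lhs[0].length;
        int n = rhs[0].length;
        if (rhs.length != k) {
            throw new IllegalArgumentException("inner dimension of a must match outer dimension of b");
        }
        if (bias.length != n) {
            // Assumption: bias has one entry per output column.
            throw new IllegalArgumentException("bias size must match the column count of the result");
        }
        float[][] out = new float[m][n];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                float sum = bias[j]; // broadcast add: same bias row for every i
                for (int p = 0; p < k; p++) {
                    sum += lhs[i][p] * rhs[p][j];
                }
                out[i][j] = sum;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] a = {{1, 2, 3}, {4, 5, 6}};   // 2 x 3
        float[][] b = {{1, 0}, {0, 1}, {1, 1}}; // 3 x 2
        float[] bias = {10, 20};
        // prints [[14.0, 25.0], [20.0, 31.0]]
        System.out.println(Arrays.deepToString(matMulWithBias(a, b, bias, false, false)));
    }
}
```

Passing `b` already transposed (2 x 3) together with `transposeB = true` gives the same result, which mirrors the "after being transposed" wording in the description.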
Modifier and Type | Class and Description |
---|---|
`static class` | `QuantizedMatMulWithBias.Options` Optional attributes for `QuantizedMatMulWithBias`. |
Fields inherited from class PrimitiveOp: operation
Modifier and Type | Method and Description |
---|---|
`static <W,T,U,V> QuantizedMatMulWithBias<W>` | `create(Scope scope, Operand<T> a, Operand<U> b, Operand<V> bias, Operand<Float> minA, Operand<Float> maxA, Operand<Float> minB, Operand<Float> maxB, Class<W> Toutput, QuantizedMatMulWithBias.Options... options)` Factory method to create a class wrapping a new QuantizedMatMulWithBias operation. |
`static QuantizedMatMulWithBias.Options` | `inputQuantMode(String inputQuantMode)` |
`Output<Float>` | `maxOut()` The float value that the highest quantized output value represents. |
`Output<Float>` | `minOut()` The float value that the lowest quantized output value represents. |
`Output<W>` | `out()` |
`static QuantizedMatMulWithBias.Options` | `transposeA(Boolean transposeA)` |
`static QuantizedMatMulWithBias.Options` | `transposeB(Boolean transposeB)` |
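
The `minOut()` and `maxOut()` descriptions say which float values the lowest and highest quantized output values represent. A minimal sketch of how such a range could be used to read a quantized value back as a float is shown below. It assumes a plain linear mapping between the two endpoints; the exact rounding and the behavior of the `MIN_FIRST` and `SCALED` modes are not modeled. The class `QuantizedRangeSketch` and the method `dequantize` are hypothetical names, not part of the TensorFlow API.

```java
/** Hypothetical helper: linear interpretation of a quantized value given its float range. */
final class QuantizedRangeSketch {

    /**
     * Maps the integer code q in [qMin, qMax] onto [min, max] linearly, so that
     * qMin maps to min and qMax maps to max. Uses long so that a 32-bit code
     * range (Integer.MIN_VALUE..Integer.MAX_VALUE) does not overflow.
     */
    static double dequantize(long q, long qMin, long qMax, double min, double max) {
        if (qMax <= qMin) {
            throw new IllegalArgumentException("qMax must be greater than qMin");
        }
        double scale = (max - min) / (double) (qMax - qMin); // float units per code step
        return min + (q - qMin) * scale;
    }

    public static void main(String[] args) {
        // Assumed example range: 8-bit unsigned codes representing [-1.0, 1.0].
        System.out.println(dequantize(0, 0, 255, -1.0, 1.0));   // lowest code, prints -1.0
        System.out.println(dequantize(255, 0, 255, -1.0, 1.0)); // highest code, prints 1.0
        // 32-bit signed codes: the lowest code maps to the range minimum.
        System.out.println(dequantize(Integer.MIN_VALUE, Integer.MIN_VALUE, Integer.MAX_VALUE, -4.0, 4.0)); // prints -4.0
    }
}
```

The same mapping applies to `minA`/`maxA` and `minB`/`maxB` on the input side, under the same linearity assumption.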
Methods inherited from class PrimitiveOp: equals, hashCode, op, toString
public static <W,T,U,V> QuantizedMatMulWithBias<W> create(Scope scope, Operand<T> a, Operand<U> b, Operand<V> bias, Operand<Float> minA, Operand<Float> maxA, Operand<Float> minB, Operand<Float> maxB, Class<W> Toutput, QuantizedMatMulWithBias.Options... options)
scope - current scope
a - A matrix to be multiplied. Must be a two-dimensional tensor of type `quint8`.
b - A matrix to be multiplied. Must be a two-dimensional tensor of type `qint8`.
bias - A 1D bias tensor with size matching the inner dimension of `b` (after being transposed if `transpose_b` is non-zero).
minA - The float value that the lowest quantized `a` value represents.
maxA - The float value that the highest quantized `a` value represents.
minB - The float value that the lowest quantized `b` value represents.
maxB - The float value that the highest quantized `b` value represents.
Toutput - the class of `W`, the data type of the `out()` output.
options - carries optional attribute values
public static QuantizedMatMulWithBias.Options transposeA(Boolean transposeA)
transposeA - If true, `a` is transposed before multiplication.
public static QuantizedMatMulWithBias.Options transposeB(Boolean transposeB)
transposeB - If true, `b` is transposed before multiplication.
public static QuantizedMatMulWithBias.Options inputQuantMode(String inputQuantMode)
inputQuantMode - Input data quantization mode. Either MIN_FIRST (default) or SCALED.
public Output<Float> minOut()
Copyright © 2022. All rights reserved.