public final class DecodeProto extends PrimitiveOp
The `decode_proto` op extracts fields from a serialized protocol buffers message into tensors. The fields in `field_names` are decoded and converted to the corresponding `output_types` if possible.
A `message_type` name must be provided to give context for the field names. The actual message descriptor can be looked up either in the linked-in descriptor pool or in a file whose name the caller provides through the `descriptor_source` attribute.
Each output tensor is dense: it is padded to hold the largest number of repeated elements seen in the input minibatch. (The shape is also padded by one to prevent zero-sized dimensions.) The actual repeat counts for each example in the minibatch are available in the `sizes` output. When missing values are not a concern, the output of `decode_proto` is often fed directly into `tf.squeeze`. When using `tf.squeeze`, always pass the squeeze dimension explicitly to avoid surprises.
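The relationship between the padded values and the `sizes` output can be illustrated in plain Java, without TensorFlow. This is a minimal sketch of the idea only, not the op's implementation: the class `DensePaddingSketch`, its method names, and the fill value `0` are hypothetical, and the sketch pads only to the largest repeat count (it does not model the extra padding by one mentioned above).

```java
import java.util.Arrays;
import java.util.List;

public class DensePaddingSketch {
    /** Repeat count of the field in each example; this is the role `sizes` plays. */
    public static int[] sizes(List<long[]> batch) {
        int[] sizes = new int[batch.size()];
        for (int i = 0; i < batch.size(); i++) {
            sizes[i] = batch.get(i).length;
        }
        return sizes;
    }

    /** Pads every example to the largest repeat count in the batch (padValue is an assumption). */
    public static long[][] toDense(List<long[]> batch, long padValue) {
        int width = 0;
        for (long[] example : batch) {
            width = Math.max(width, example.length);
        }
        long[][] dense = new long[batch.size()][width];
        for (int i = 0; i < batch.size(); i++) {
            long[] example = batch.get(i);
            Arrays.fill(dense[i], padValue);
            System.arraycopy(example, 0, dense[i], 0, example.length);
        }
        return dense;
    }

    public static void main(String[] args) {
        // Three examples whose repeated field holds 3, 0, and 1 elements.
        List<long[]> batch = Arrays.asList(new long[] {1, 2, 3}, new long[] {}, new long[] {7});
        System.out.println(Arrays.deepToString(toDense(batch, 0L)));
        System.out.println(Arrays.toString(sizes(batch)));
    }
}
```

A padded `0` in the dense rows cannot be told apart from a real `0` value, which is why the per-example counts are needed to find where the real data ends.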
For the most part, the mapping between Proto field types and TensorFlow dtypes is straightforward. However, there are a few special cases:
- A proto field that contains a submessage or group can only be converted to `DT_STRING` (the serialized submessage). This is to reduce the complexity of the API. The resulting string can be used as input to another instance of the `decode_proto` op.
- TensorFlow lacks support for unsigned integers. The ops represent uint64 types as a `DT_INT64` with the same two's-complement bit pattern (the obvious way). Unsigned int32 values can be represented exactly by specifying type `DT_INT64`, or in two's-complement form if the caller specifies `DT_INT32` in the `output_types` attribute.
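On the Java side, a value decoded this way can be read back as unsigned with the standard library. The following is a small sketch of the bit-pattern convention described above; the class and method names are hypothetical, while `Long.toUnsignedString` and `Integer.toUnsignedLong` are standard Java 8+ methods.

```java
public class UnsignedBitPatterns {
    /** Interprets an int64 bit pattern as the uint64 it encodes. */
    public static String uint64ToString(long bits) {
        return Long.toUnsignedString(bits);
    }

    /** Widens an int32 bit pattern to the exact uint32 value, held in a long. */
    public static long uint32ToLong(int bits) {
        return Integer.toUnsignedLong(bits);
    }

    public static void main(String[] args) {
        // The largest uint64 shares its bit pattern with the signed value -1.
        System.out.println(uint64ToString(-1L)); // prints 18446744073709551615

        // A uint32 above Integer.MAX_VALUE appears negative when stored as int32.
        int bits = (int) 4000000000L;
        System.out.println(bits);               // prints -294967296
        System.out.println(uint32ToLong(bits)); // prints 4000000000
    }
}
```

Requesting `DT_INT64` for an unsigned 32-bit field avoids the conversion step entirely, at the cost of a wider tensor.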
Both binary and text proto serializations are supported; choose between them with the `message_format` attribute.
The `descriptor_source` attribute selects the source of protocol descriptors to consult when looking up `message_type`. This may be:
- An empty string or "local://", in which case protocol descriptors are created for C++ (not Python) proto definitions linked to the binary.
- A file, in which case protocol descriptors are created from the file, which is expected to contain a `FileDescriptorSet` serialized as a string. NOTE: You can build a `descriptor_source` file using the `--descriptor_set_out` and `--include_imports` options to the protocol compiler `protoc`.
- A "bytes://
Modifier and Type | Class and Description |
---|---|
`static class` | `DecodeProto.Options`: Optional attributes for `DecodeProto` |
Fields inherited from `PrimitiveOp`: operation
Modifier and Type | Method and Description |
---|---|
`static DecodeProto` | `create(Scope scope, Operand<String> bytes, String messageType, List<String> fieldNames, List<Class<?>> outputTypes, DecodeProto.Options... options)`: Factory method to create a class wrapping a new DecodeProto operation. |
`static DecodeProto.Options` | `descriptorSource(String descriptorSource)` |
`static DecodeProto.Options` | `messageFormat(String messageFormat)` |
`static DecodeProto.Options` | `sanitize(Boolean sanitize)` |
`Output<Integer>` | `sizes()`: Tensor of int32 with shape `[batch_shape, len(field_names)]`. |
`List<Output<?>>` | `values()`: List of tensors containing values for the corresponding field. |
Methods inherited from `PrimitiveOp`: equals, hashCode, op, toString
`public static DecodeProto create(Scope scope, Operand<String> bytes, String messageType, List<String> fieldNames, List<Class<?>> outputTypes, DecodeProto.Options... options)`
- `scope`: current scope
- `bytes`: Tensor of serialized protos with shape `batch_shape`.
- `messageType`: Name of the proto message type to decode.
- `fieldNames`: List of strings containing proto field names. An extension field can be decoded by using its full name, e.g. EXT_PACKAGE.EXT_FIELD_NAME.
- `outputTypes`: List of TF types to use for the respective field in field_names.
- `options`: carries optional attribute values

`public static DecodeProto.Options descriptorSource(String descriptorSource)`
- `descriptorSource`: Either the special value `local://` or a path to a file containing a serialized `FileDescriptorSet`.

`public static DecodeProto.Options messageFormat(String messageFormat)`
- `messageFormat`: Either `binary` or `text`.

`public static DecodeProto.Options sanitize(Boolean sanitize)`
- `sanitize`: Whether to sanitize the result or not.

`public Output<Integer> sizes()`
Copyright © 2022. All rights reserved.