UnicodeDecodeWithOffsets (JavaCPP Presets for TensorFlow 1.15.5-1.5.8 API)

java.lang.Object
- org.tensorflow.op.PrimitiveOp
- - org.tensorflow.op.strings.UnicodeDecodeWithOffsets<T>

Type Parameters:

T - data type for rowSplits() output

All Implemented Interfaces:

Op
```
public final class UnicodeDecodeWithOffsets<T extends Number>
extends PrimitiveOp
```
Decodes each string in `input` into a sequence of Unicode code points.
The character codepoints for all strings are returned using a single vector `char_values`, with strings expanded to characters in row-major order. Similarly, the character start byte offsets are returned using a single vector `char_to_byte_starts`, with strings expanded in row-major order.
The `row_splits` tensor indicates where the codepoints and start offsets for each input string begin and end within the `char_values` and `char_to_byte_starts` tensors. In particular, the values for the `i`th string (in row-major order) are stored in the slice `[row_splits[i]:row_splits[i+1]]`. Thus:
- `char_values[row_splits[i]+j]` is the Unicode codepoint for the `j`th character in the `i`th string (in row-major order).
- `char_to_byte_starts[row_splits[i]+j]` is the start byte offset for the `j`th character in the `i`th string (in row-major order).
- `row_splits[i+1] - row_splits[i]` is the number of characters in the `i`th string (in row-major order).
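
The layout described above can be sketched in plain Java, without TensorFlow. This is only an illustration of how the three flat vectors relate, not the op's implementation; the class `RowSplitsSketch` and its members are hypothetical names, and the sketch assumes well-formed input strings with offsets counted in UTF-8 bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration of the flattened output layout (hypothetical helper, not the TensorFlow kernel). */
final class RowSplitsSketch {
    final int[] charValues;        // code points of all strings, concatenated
    final long[] charToByteStarts; // UTF-8 byte offset of each character within its own string
    final long[] rowSplits;        // string i occupies [rowSplits[i], rowSplits[i + 1])

    RowSplitsSketch(int[] charValues, long[] charToByteStarts, long[] rowSplits) {
        this.charValues = charValues;
        this.charToByteStarts = charToByteStarts;
        this.rowSplits = rowSplits;
    }

    /** Builds the three vectors for the given strings, assuming well-formed input. */
    static RowSplitsSketch decode(String... inputs) {
        List<Integer> values = new ArrayList<>();
        List<Long> starts = new ArrayList<>();
        long[] splits = new long[inputs.length + 1];
        for (int i = 0; i < inputs.length; i++) {
            long byteOffset = 0; // offsets restart at 0 for every string
            int[] codePoints = inputs[i].codePoints().toArray();
            for (int cp : codePoints) {
                values.add(cp);
                starts.add(byteOffset);
                byteOffset += new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8).length;
            }
            splits[i + 1] = splits[i] + codePoints.length;
        }
        return new RowSplitsSketch(
                values.stream().mapToInt(Integer::intValue).toArray(),
                starts.stream().mapToLong(Long::longValue).toArray(),
                splits);
    }

    /** Returns the slice charValues[rowSplits[i]:rowSplits[i+1]]. */
    int[] codePointsOf(int i) {
        return Arrays.copyOfRange(charValues, (int) rowSplits[i], (int) rowSplits[i + 1]);
    }

    public static void main(String[] args) {
        RowSplitsSketch r = decode("h\u00e9", "\u20acuro");
        System.out.println(Arrays.toString(r.charValues));       // [104, 233, 8364, 117, 114, 111]
        System.out.println(Arrays.toString(r.charToByteStarts)); // [0, 1, 0, 3, 4, 5]
        System.out.println(Arrays.toString(r.rowSplits));        // [0, 2, 6]
        System.out.println(Arrays.toString(r.codePointsOf(1)));  // [8364, 117, 114, 111]
    }
}
```

Here `rowSplits[2] - rowSplits[1]` is 4, the number of characters in the second string, and the second string's byte offsets start again at 0.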

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`UnicodeDecodeWithOffsets.Options` Optional attributes for `UnicodeDecodeWithOffsets`

Field Summary
- Fields inherited from class org.tensorflow.op.PrimitiveOp
  operation

Method Summary

Modifier and Type	Method and Description
`Output<Long>`	`charToByteStarts()` A 1D int64 Tensor containing the byte index in the input string where each character in `char_values` starts.
`Output<Integer>`	`charValues()` A 1D int32 Tensor containing the decoded codepoints.
`static <T extends Number> UnicodeDecodeWithOffsets<T>`	`create(Scope scope, Operand<String> input, String inputEncoding, Class<T> Tsplits, UnicodeDecodeWithOffsets.Options... options)` Factory method to create a class wrapping a new UnicodeDecodeWithOffsets operation.
`static UnicodeDecodeWithOffsets<Long>`	`create(Scope scope, Operand<String> input, String inputEncoding, UnicodeDecodeWithOffsets.Options... options)` Factory method to create a class wrapping a new UnicodeDecodeWithOffsets operation using default output types.
`static UnicodeDecodeWithOffsets.Options`	`errors(String errors)`
`static UnicodeDecodeWithOffsets.Options`	`replaceControlCharacters(Boolean replaceControlCharacters)`
`static UnicodeDecodeWithOffsets.Options`	`replacementChar(Long replacementChar)`
`Output<T>`	`rowSplits()` A 1D tensor of type `T` (int64 by default) containing the row splits.

Methods inherited from class org.tensorflow.op.PrimitiveOp
equals, hashCode, op, toString

Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait

- Method Detail
  - create
```
public static <T extends Number> UnicodeDecodeWithOffsets<T> create(Scope scope,
                                                                    Operand<String> input,
                                                                    String inputEncoding,
                                                                    Class<T> Tsplits,
                                                                    UnicodeDecodeWithOffsets.Options... options)
```
    Factory method to create a class wrapping a new UnicodeDecodeWithOffsets operation.
    
    Parameters:
    
    scope - current scope
    
    input - The text to be decoded. Can have any shape. Note that the output is flattened to a vector of char values.
    
    inputEncoding - Text encoding of the input strings. This is any of the encodings supported by ICU ucnv algorithmic converters. Examples: `"UTF-16", "US ASCII", "UTF-8"`.
    
    Tsplits - data type for the rowSplits() output
    
    options - carries optional attributes values
    
    Returns:
    
    a new instance of UnicodeDecodeWithOffsets
  - create
```
public static UnicodeDecodeWithOffsets<Long> create(Scope scope,
                                                    Operand<String> input,
                                                    String inputEncoding,
                                                    UnicodeDecodeWithOffsets.Options... options)
```
    Factory method to create a class wrapping a new UnicodeDecodeWithOffsets operation using default output types.
    
    Parameters:
    
    scope - current scope
    
    input - The text to be decoded. Can have any shape. Note that the output is flattened to a vector of char values.
    
    inputEncoding - Text encoding of the input strings. This is any of the encodings supported by ICU ucnv algorithmic converters. Examples: `"UTF-16", "US ASCII", "UTF-8"`.
    
    options - carries optional attributes values
    
    Returns:
    
    a new instance of UnicodeDecodeWithOffsets
  - errors
```
public static UnicodeDecodeWithOffsets.Options errors(String errors)
```
    Parameters:
    
    errors - Error handling policy when there is invalid formatting found in the input. The value of 'strict' will cause the operation to produce an InvalidArgument error on any invalid input formatting. A value of 'replace' (the default) will cause the operation to replace any invalid formatting in the input with the `replacement_char` codepoint. A value of 'ignore' will cause the operation to skip any invalid formatting in the input and produce no corresponding output character.
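
    The three policies behave much like the `CodingErrorAction` values of `java.nio.charset.CharsetDecoder`. The sketch below uses that standard-library decoder as an analogy only; the class `ErrorPolicySketch` and its methods are hypothetical, and the op may differ in how many replacement characters it emits for a given malformed sequence.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Analogy for the 'strict' / 'replace' / 'ignore' policies using java.nio (not the op itself). */
final class ErrorPolicySketch {
    /** Maps the op's policy names onto java.nio actions (assumed correspondence). */
    static CodingErrorAction actionFor(String errors) {
        switch (errors) {
            case "strict":  return CodingErrorAction.REPORT;
            case "replace": return CodingErrorAction.REPLACE;
            case "ignore":  return CodingErrorAction.IGNORE;
            default: throw new IllegalArgumentException("unknown policy: " + errors);
        }
    }

    /** Decodes UTF-8 bytes under the given policy; 'strict' failures surface as IllegalArgumentException. */
    static String decodeUtf8(byte[] bytes, String errors) {
        CodingErrorAction action = actionFor(errors);
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(action)
                    .onUnmappableCharacter(action)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("invalid input formatting", e);
        }
    }

    public static void main(String[] args) {
        byte[] bad = {'A', (byte) 0xFF, 'B'}; // 0xFF is never valid in UTF-8
        System.out.println(decodeUtf8(bad, "replace").codePoints().count()); // 3 (A, U+FFFD, B)
        System.out.println(decodeUtf8(bad, "ignore"));                       // AB
        try {
            decodeUtf8(bad, "strict");
        } catch (IllegalArgumentException e) {
            System.out.println("strict rejected the input");
        }
    }
}
```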
  - replacementChar
```
public static UnicodeDecodeWithOffsets.Options replacementChar(Long replacementChar)
```
    Parameters:
    
    replacementChar - The replacement character codepoint to be used in place of any invalid formatting in the input when `errors='replace'`. Any valid Unicode codepoint may be used. The default value is the Unicode replacement character, U+FFFD (decimal 65533).
  - replaceControlCharacters
```
public static UnicodeDecodeWithOffsets.Options replaceControlCharacters(Boolean replaceControlCharacters)
```
    Parameters:
    
    replaceControlCharacters - Whether to replace the C0 control characters (00-1F) with the `replacement_char`. Default is false.
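
    As a rough picture of what this option does to the decoded codepoints, the following plain-Java sketch substitutes a replacement codepoint for every value in the C0 range. The class `ControlCharSketch` and the method `replaceC0` are hypothetical names for illustration and are not part of this API.

```java
import java.util.Arrays;

/** Illustration of C0 control-character replacement on already-decoded code points. */
final class ControlCharSketch {
    /** Default replacement codepoint described for replacementChar: U+FFFD (decimal 65533). */
    static final int DEFAULT_REPLACEMENT = 0xFFFD;

    /** Returns a copy in which every code point in 0x00..0x1F is replaced. */
    static int[] replaceC0(int[] codePoints, int replacementChar) {
        return Arrays.stream(codePoints)
                .map(cp -> (cp >= 0x00 && cp <= 0x1F) ? replacementChar : cp)
                .toArray();
    }

    public static void main(String[] args) {
        int[] decoded = "a\tb".codePoints().toArray(); // [97, 9, 98]
        System.out.println(Arrays.toString(replaceC0(decoded, DEFAULT_REPLACEMENT)));
        // [97, 65533, 98]
    }
}
```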
  - rowSplits
```
public Output<T> rowSplits()
```
    A 1D tensor of type `T` (int64 by default) containing the row splits.
  - charValues
```
public Output<Integer> charValues()
```
    A 1D int32 Tensor containing the decoded codepoints.
  - charToByteStarts
```
public Output<Long> charToByteStarts()
```
    A 1D int64 Tensor containing the byte index in the input string where each character in `char_values` starts.
