SinePositionEncoding layer

SinePositionEncoding class

keras_nlp.layers.SinePositionEncoding(max_wavelength=10000, **kwargs)

Sinusoidal positional encoding layer.

This layer computes the positional encoding as a mix of sine and cosine functions with geometrically increasing wavelengths, as defined and formulated in Attention is All You Need.

The layer takes an embedded token tensor as input, with shape [batch_size, sequence_length, feature_size]. It returns a positional encoding of the same shape as the input, which can be added directly to the embedded token tensor.
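As a rough illustration of the formula from Attention is All You Need, the NumPy sketch below builds an encoding that matches the last two axes of an embedded input and adds it to that input. The helper name `sine_position_encoding` is hypothetical, and the code is a from-scratch approximation of the technique under that formula, not the layer's actual implementation.

```python
import numpy as np


def sine_position_encoding(seq_len, feature_size, max_wavelength=10000):
    """Hypothetical helper: sinusoidal encoding of shape (seq_len, feature_size)."""
    positions = np.arange(seq_len, dtype="float64")[:, None]
    channels = np.arange(feature_size)[None, :]
    # Each sine/cosine pair (2i, 2i + 1) shares one frequency.
    exponents = 2 * (channels // 2) / feature_size
    angles = positions / np.power(float(max_wavelength), exponents)
    # Assumption: even channels use sine, odd channels use cosine, as in the paper.
    return np.where(channels % 2 == 0, np.sin(angles), np.cos(angles))


# Stand-in for an embedded token tensor: [batch_size, sequence_length, feature_size].
embedded = np.zeros((2, 100, 32))
encoding = sine_position_encoding(embedded.shape[1], embedded.shape[2])
# The encoding broadcasts over the batch axis when added to the embeddings.
outputs = embedded + encoding
```

Because the encoding depends only on position and channel, one (sequence_length, feature_size) array serves every item in the batch.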

Arguments

  • max_wavelength: The maximum angular wavelength of the sine/cosine curves, as described in Attention is All You Need. Defaults to 10000.
  • **kwargs: Other keyword arguments passed to keras.layers.Layer, including name, trainable, dtype, etc.
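To make the role of max_wavelength concrete, the sketch below lists the wavelengths implied by the formula in Attention is All You Need for a given feature size. It assumes that formula, with one timescale per sine/cosine pair; the variable names are illustrative only and are not part of the library.

```python
import numpy as np

max_wavelength = 10000
feature_size = 32

# One timescale per sine/cosine pair, following the paper's formula.
pair_index = np.arange(feature_size // 2)
timescales = max_wavelength ** (2 * pair_index / feature_size)
# A curve sin(position / timescale) repeats every 2 * pi * timescale positions.
wavelengths = 2 * np.pi * timescales
# Consecutive wavelengths differ by a constant factor (geometric progression).
ratios = wavelengths[1:] / wavelengths[:-1]
```

Under this assumption, raising max_wavelength stretches the slowest curves, so positions that are far apart still receive distinguishable encodings.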

Call arguments

  • inputs: The embedded token tensor to compute a positional encoding for, with shape (batch_size, sequence_length, hidden_dim).
  • start_index: An integer or integer tensor. The starting position to compute the encoding from. This is useful during cached decoding, where each position is predicted separately in a loop.
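The effect of start_index can be sketched in plain NumPy: offsetting the position counter lets a decoding loop encode one position at a time and still match the encoding of the full sequence. The helper `encode_positions` is hypothetical and follows the formula from the paper; it is not the layer's own code.

```python
import numpy as np


def encode_positions(seq_len, feature_size, start_index=0, max_wavelength=10000):
    """Hypothetical helper: encodings for positions start_index .. start_index + seq_len - 1."""
    positions = (start_index + np.arange(seq_len, dtype="float64"))[:, None]
    channels = np.arange(feature_size)[None, :]
    exponents = 2 * (channels // 2) / feature_size
    angles = positions / np.power(float(max_wavelength), exponents)
    # Assumption: even channels use sine, odd channels use cosine.
    return np.where(channels % 2 == 0, np.sin(angles), np.cos(angles))


# Encoding the whole sequence in one call.
full = encode_positions(seq_len=8, feature_size=16)

# Encoding one position per step, as a decoding loop with a cache would.
steps = [encode_positions(1, 16, start_index=i) for i in range(8)]
stepwise = np.concatenate(steps, axis=0)
```

Without the offset, every single-token step would be encoded as position 0, which is why a start index matters when tokens are processed one at a time.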

Example

import keras
import keras_nlp

# Create a simple embedding layer with sinusoidal positional encoding.
seq_len = 100
vocab_size = 1000
embedding_dim = 32
# Token ids are integers, so the model input uses an integer dtype.
inputs = keras.Input((seq_len,), dtype="int32")
embedding = keras.layers.Embedding(
    input_dim=vocab_size, output_dim=embedding_dim
)(inputs)
positional_encoding = keras_nlp.layers.SinePositionEncoding()(embedding)
# Add the positional encoding to the token embeddings.
outputs = embedding + positional_encoding

References