This specification relates to generating representations of input sequences.
Many data processing tasks involve converting an ordered sequence of inputs into an ordered sequence of outputs. For example, machine translation systems translate an input sequence of words in one language into a sequence of words in another language. As another example, pronunciation systems convert an input sequence of graphemes into an output sequence of phonemes.