Encoder:
- Tokenize the input French phrase using the same tokenizer that was used to train the network.
- Feed the resulting tokens into the input on the encoder side of the network.
- Pass the tokens through the embedding layer, where positional information is also added.
- Feed the embeddings into the multi-headed self-attention layers.
- Outputs of the multi-headed attention layers are processed by a feed-forward network.
- The encoder output is a deep representation of the structure and meaning of the input sequence (a minimal code sketch of this pass follows the list).
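The sketch below shows this encoder pass using PyTorch's built-in Transformer modules, under some assumptions: the vocabulary size, model dimensions, learned positional embeddings, and token IDs are all hypothetical placeholders, since the notes don't specify a tokenizer or architecture.

```python
import torch
import torch.nn as nn

vocab_size, d_model, n_heads, n_layers, max_len = 32000, 512, 8, 6, 512

embedding = nn.Embedding(vocab_size, d_model)
pos_encoding = nn.Parameter(torch.zeros(1, max_len, d_model))  # learned positions (an assumption)
encoder_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)

# Hypothetical token IDs for a tokenized French phrase, e.g. "J'aime l'apprentissage automatique"
token_ids = torch.tensor([[12, 843, 97, 2310, 5]])  # shape: (batch=1, seq_len=5)

x = embedding(token_ids) + pos_encoding[:, : token_ids.size(1)]  # embed tokens, add positions
encoder_output = encoder(x)  # (1, seq_len, d_model): representation of structure and meaning
```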
Decoder:
- Insert the encoder output into the middle of the decoder, where it influences the decoder's attention mechanisms (specifically, the encoder-decoder cross-attention over the input sequence).
- Add a start-of-sequence token to the decoder input.
- The decoder uses contextual understanding from the encoder to predict the next token.
- The output of the decoder's self-attention layers is processed by the decoder feed-forward network.
- Pass the output through a final linear layer and softmax to obtain a probability distribution over the vocabulary, from which the first output token is selected.
- Continue the loop, passing the output token back to the input to predict the next token.
- Repeat until the model predicts an end-of-sequence token.
- The final sequence of tokens can be detokenized into words to obtain the output translation (see the decoding loop sketched below).
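The loop below is a minimal sketch of this autoregressive decoding, reusing the names from the encoder sketch above (d_model, n_heads, n_layers, vocab_size, embedding, pos_encoding, encoder_output). The BOS/EOS token IDs and the maximum length are assumptions; with untrained weights the outputs are of course meaningless.

```python
import torch
import torch.nn as nn

decoder_layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=n_layers)
lm_head = nn.Linear(d_model, vocab_size)  # final projection to vocabulary logits

BOS, EOS, MAX_LEN = 1, 2, 50  # hypothetical special-token IDs and length limit
tokens = torch.tensor([[BOS]])  # start-of-sequence token on the decoder input

for _ in range(MAX_LEN):
    y = embedding(tokens) + pos_encoding[:, : tokens.size(1)]
    causal_mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
    out = decoder(y, encoder_output, tgt_mask=causal_mask)  # cross-attends to encoder output
    logits = lm_head(out[:, -1])                    # logits for the next position only
    probs = logits.softmax(dim=-1)                  # softmax over the vocabulary
    next_token = probs.argmax(dim=-1, keepdim=True) # greedy pick (sampling is an alternative)
    tokens = torch.cat([tokens, next_token], dim=1) # feed the prediction back as input
    if next_token.item() == EOS:                    # stop at the end-of-sequence token
        break
# `tokens` now holds the output IDs; a real tokenizer would detokenize them into words.
```

Taking the argmax at each step is greedy decoding; real translation systems often use sampling or beam search over the same softmax distribution instead.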