Well, this is just my 2-cent. I think you misunderstand the point I am making. First of all, accept that translation is a lossy process. A translation will always lose meaning one way or another, and without making a full essay about an art piece, you will never get the full picture of the art when translated. Think of it this way, does Haiku in Japanese make sense in English? Maybe. But most likely not. So anyone that wanted to experience the full art must either read an essay about said art or learn the original language. But for story, a translation can at least give you the gist of the event that is happening. Story will inherently have event that have to be conveyed. So a loss of information from subtlety can be tolerated since the highlight is another piece (the string of event).
Secondly, how the model works. GPT is a very bad representation for translation model. Generative Pretrained Transformer, well generate something. I’d argue translation is not a generative task, rather distance calculation task. I think you should read more upon how the current machine learning model works. I suggest 3Blue1Brown channel on youtube as he have a good video on the topic and very recently Welch Labs also made a video comparing it to AlexNet, (arguably) the first breakthrough on computer vision task.
One and only Gintama