In the last couple of years, postedit has been applied more and more in the subtitle industry. In case you don’t know what postedit is, it’s the proofreading of MT (machine translation). Subtitling agencies are turning to postedit, obviously in the hope of saving time and, ultimately, money by reducing linguists’ fees. The idea is that, since you don’t have to translate “from scratch” but are “just checking” a “ready text”, you need less time and will therefore be paid less.
But is this actually the case?
The way I see it, this assumption stems partly from the idea that
translation takes longer because you actually have to write the text yourself: you
have to move your fingers, you have to click away at the keyboard – and that is
presumed to take longer than simply reading a text.
But in order to proofread a text, it is not enough to simply read it.
You have to think – think thoroughly and think well.
And thinking is precisely what’s more time-consuming in translation.
Because that is precisely what translation is about. It’s about thinking.
It’s not about writing. Simply writing a text using a keyboard, as in dictation, is
called typing, not translation. When we translate, we take time to read,
understand, assimilate, consider context, and then produce a phrase in the
target language, taking into account all factors involved and, in the case of subtitles,
all technical restrictions. More often than not, we rephrase to adapt to space
or reading speed limitations or to comply with guidelines. We read and re-read,
listen and re-listen, to make sure we have understood all content and have
produced a seamless dialogue. It’s a complex process, a far cry from simple
typing, and very time-consuming.
But one might argue, if you have the phrase readily spelt out before
you, then you just need a glance to check it and then move on, right?
Wrong.
Because for each phrase, we need to devote the exact same thinking time
as when translating from scratch. We cannot trust machine translation to
understand and assimilate context and all the complex factors involved. We
therefore have to give our full attention to each and every phrase, just as if
we were translating.
Still, one might insist, once you have read the phrase and thought it
out, if it is correct, then at least you don’t have to type. Does that not save
you a lot of time?
Not really.
You see, we are pros. That means we type really fast. Personally, I type
faster than I think. I’ve been typing every day for years and years. Practice
makes perfect, and my fingers are so quick I hardly lose any time typing: I
type as I think. So, no, the ready typed phrases don’t help me save much time. In
most cases, not at all.
In fact, in many cases, when a phrase needs correction, even a slight
correction such as erasing a word or a couple of letters, it is actually much quicker
to delete the entire phrase and write over it, than take the time to click in
the subtitle and edit the phrase, erasing just a few letters, changing word order,
having to move back and forth with the mouse or the arrows.
Well, one might say, machine translation is still a great help to you,
is it not? It provides ready solutions for you to work on, does it not?
Rarely.
And when it does, it might actually have a negative effect. It might
provide a mediocre solution which, being readily available, slips into our mind,
occupying the mental space in which our own translation would otherwise take shape, thus
preventing us from coming up with another solution, maybe a more apt one. The mental
process of translation is obstructed. In a sense, we become mentally lazy.
And most of the time, the solutions provided are simply inadequate and
have to be edited, or downright wrong and have to be replaced. And this,
obviously, is neither helpful nor time-saving.
MT vs human translation
Wait a moment! one might say, what about proofreading human
translations? That’s also paid less than translation, far less!* And still, I
don’t see you complaining about that! Is that not also as time-consuming as
postedit?
No, it’s not.
Because with human translation, you can trust the linguist to have listened,
heard and understood the dialogue in a comprehensive way, just like you, as a
human, would do. And machine translation, however good, cannot do that.
There are many important factors that MT cannot take into account.
In the case of subtitling, the most obvious one is the image. MT cannot
see the action on screen. MT cannot know who is talking or to whom they are
talking. MT cannot perceive the dialogue register, formality or style. MT
cannot grasp continuity and consistency.
I will give some simple yet important examples.
Gender
Many languages have gendered grammar and apply articles where other languages, such as English, do not (e.g., “Cameron paid” can be translated into Greek with a masculine or a
feminine article, “πλήρωσε ο Κάμερον” or “πλήρωσε η Κάμερον”). In many cases, MT
has no way of choosing between masculine and feminine for, say, a surname, as in
my example. If MT applies the wrong gender, then the proof-reader has to
correct it throughout the file. This can be very tricky, as this is not a
grammatical or spelling mistake and cannot be caught by a spellchecker. It is
very easy to overlook a “he” instead of a “she”, as it is perfectly natural and
correct grammatically. What’s more, the proof-reader has to spot all cases of
gendered adjectives referring to this person, and this cannot be done by a find
and replace: you have to read through the entire file very carefully, and still,
you may miss some. Therefore, when correcting MT, one has to be
doubly alert for this sort of error.
Formality
Many languages use the third person singular or the second person plural as
a courtesy form. When translating from English into these languages, MT has no
way of knowing if a courtesy form must be used or not. One has to be doubly
alert for this as well, and may have to apply many changes, in verbs and
pronouns, throughout the text. This, also, is very easy to overlook, as the MT
phrase might be perfectly sound grammatically, and therefore can easily escape
our attention, thus creating consistency problems in the file.
The two issues I just mentioned are particularly vexing, because they
need extra attention and concentration, making postedit more time-consuming.
Multiple meanings
Another issue that merits special consideration is that most words and phrases can have more than one translation. This may seem simple and obvious, but it creates insurmountable problems. Even the simplest phrase can have more than one version. For instance, the simple English word “hello”, depending on context, can be rendered into Greek as “γεια”, “γεια σου”, “γεια σας”, “εμπρός;”, “είναι κανείς εδώ;”, “σύνελθε!”… And there is no guarantee that MT will choose the appropriate rendering, as it cannot take context into account.
Historical and cultural factors
Yet another issue is cultural references and historical context, which are also
beyond the grasp of MT. The name “Henry” might be transcribed as “Χένρι” or
translated as “Ερρίκος”, depending on whether this is a contemporary person or
a historical figure. A phrase might be a citation from the scriptures and
therefore one would have to locate and copy the source. And so forth.
I could go on and on.
The fact of the matter is that there is no way to provide a simple, straightforward
translation of a word or phrase that will always apply. You have to consider
context and other factors that simply cannot be perceived by MT as it is now.
And this is the case more often than not.
Modus operandi
In a human translation, normally there are relatively few errors. This
means you need less time to go through it and correct them. In the case of
subtitles, it is usually enough to watch the video once and simply stop when
you spot an error – which is not very often. Then you run a spell check, some
final technical checks, and you’re done. But with MT, this is not enough. You
need to read through the subtitles, one by one, pausing many times to edit or
rewrite, and then when you’re done, you can watch the video and do the checks.
This means you have to spend much more time editing an MT file than you
would editing a human translation file.
More importantly, it means that when you’re done, the file might still
contain some errors.
Why is that, you might ask? Well, when editing a human file, the errors
are relatively few, so you just catch them, correct them and that’s it. Very
rarely does an error escape you. When editing an MT file, the errors are so
many that you have to do a complete overhaul. Basically, you have to rewrite the
file to a great extent. From a certain point onwards, this is no longer
“editing” in any sense of the word, it’s recreation. Essentially, you write
from scratch. You do not simply correct the old text; you are producing a new text.
And of course, the new file should be proofread in its turn, because it’s
impossible to create a perfect text.
If this is the case, you might say, why not proofread it yourself?
Well, for one thing, as we all know, it is next to impossible to
proofread your own work. You will catch some errors, but others will slip by you
in the second and third reading, just as they slipped by you in the first one. It
has happened to all of us at one time or another. In order to effectively
proofread your own work, you would have to distance yourself as much as
possible, by leaving a large time interval and occupying yourself with other
things in the meantime. But this, for one thing, is not always feasible, because
due dates are pressing and you have to deliver, and for another, it obliges you
to dedicate yet more time to the postedit job – yes, this job that
supposedly would take less time than translation!
Dedicating more time is, of course, out of the question – among other
things, because you can’t afford to dedicate more time to a job that is
paid less. Besides, since this entire MT thing is supposed to save time,
it would be ridiculous to end up actually losing time.
What do we do, then?
Some of us simply refuse to do postedit and accept the fact that we won’t
be getting any tasks from agencies that are set on working with MT. Those of
us who accept such tasks simply have to change our workflow and dedicate
less time to each file, in order for the job to be financially sustainable. If
there is no time to both read through the file and watch the video, you simply omit
one of the two. The net result of this, of course, is loss of quality.
I’m not sure whether translation agencies realise this or not. They
either realise it and do not care, or they have no clue what’s going on.
And frankly, I’m not sure which of the two would be worse.
Tech restrictions
Last but not least, in subtitles we have to consider the influence of
technical restrictions, such as character count and reading speed. More often
than not, the phrases provided by MT violate the character count and reading speed
limits, therefore obliging us to edit the phrase and make it shorter. And most
of the time, this edit does not consist in simply deleting a word or two. Most
of the time, we have to rephrase, that is to rethink and recreate – not to
mention rewrite.
In other words, we have to translate from scratch – after having initially
wasted our time reading and considering the MT suggestion.
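To make these two restrictions concrete, here is a minimal sketch in Python of the kind of check a subtitling tool might run on each subtitle. The limits used (42 characters per line, 17 characters per second) are assumptions picked for illustration, since real limits vary by client and language. The names `Subtitle` and `check_subtitle` are hypothetical, and counting spaces but not line breaks is only one possible convention.

```python
from dataclasses import dataclass
from typing import List

# Assumed limits, for illustration only; actual guidelines differ per client.
MAX_CHARS_PER_LINE = 42
MAX_CHARS_PER_SECOND = 17.0


@dataclass
class Subtitle:
    text: str     # one or more lines separated by "\n"
    start: float  # in-time, in seconds
    end: float    # out-time, in seconds


def check_subtitle(sub: Subtitle) -> List[str]:
    """Return a list of violations for one subtitle (empty if it passes)."""
    problems = []
    lines = sub.text.split("\n")

    # Character count: each line must fit within the line limit.
    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(
                f"line {number}: {len(line)} chars > {MAX_CHARS_PER_LINE}"
            )

    duration = sub.end - sub.start
    if duration <= 0:
        problems.append("non-positive duration")
        return problems

    # Reading speed: characters on screen divided by seconds on screen.
    # Spaces are counted, line breaks are not (an assumed convention).
    char_count = sum(len(line) for line in lines)
    cps = char_count / duration
    if cps > MAX_CHARS_PER_SECOND:
        problems.append(
            f"reading speed {cps:.1f} cps > {MAX_CHARS_PER_SECOND:.0f} cps"
        )
    return problems
```

A check like this can only flag an MT phrase that is too long or too fast. It cannot suggest the shorter rephrasing, which is exactly the part that takes thinking time.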
What then?
None of my colleagues, as far as I know, care for MT at all when it
comes to subtitles. Most of them shun postedit, delete the MT text when
available as an option, and avoid MT suggestions when provided as an aid.
But then, is MT completely useless?
Indeed not! It can be very useful in
other types of translations, such as legal or financial documents that have
long repetitions, or medical and technical documents that have extensive
terminology. It can be useful in the sort of text where precision is paramount
and style is secondary. But not in literature, where creative solutions are
most necessary. And subtitles, let me say, are literature. Scripts are
dialogues, like theatrical plays. They are alive, expressive, unique, and
should be treated accordingly.
[EDIT: I have received many comments from translators who have pointed out that the exact same problems exist in other fields of translation, such as legal, financial, medical and technical. Quoting:
"This is verbatim the thoughts going through my mind when I’m thinking about MT - even though I’m a legal and financial translator. Which goes to show that the issues are the same in those fields."
"MT makes so many mistakes with court legalese. And the fact that some texts are repetitive doesn’t really matter, since MT is inconsistent and will translate the same words differently in different sentences…"
"All those pain points that you explain so well, except for the character restriction, apply exactly the same to say medical or technical. The ambiguity caused by English juxtaposing of nouns with no need for prepositions (that are needed in the target language and can alter profoundly the meaning) is something a machine (or someone who is not an expert in the field) cannot get around without asking for trouble. Plus specialized terminology being so niche is not something MT can really master, things are not so black and white. And I won’t even use the card that medical affects people’s lives while subtitling is just for entertainment so errors are more acceptable."
"I have yet not seen a text (financial, legal, medical or technical included) where MT is precise enough and style doesn't matter." "And also matters more, given the potential risks and hazards."]
Translation is not just clicking away at the keyboard. Translation is
thinking, feeling, weighing, assimilating, recreating. And let’s face it, no MT
and no AI is up to the task – yet. Not unless we create an artificial
intelligence that is not only truly intelligent but also sentient, the equal of
human intelligence and sensitivity.
Till then, I’ll stick to human translation, thank you.
...with a side order of hours and minutes.