Urs Wolffers wrote:
Matecat has a new feature: Guess tags. You translate a sentence without seeing any tags, then you press "Guess tags" and Matecat tries to place the tags in the correct position. There are quite a few problems, though (tag mismatches, additional spaces...), which were flagged not by me but by Matecat's own error messages. Now, I am no big Matecat fan, as there is CafeTran :) Do you think such a feature would be possible for CafeTran too? Thanks a lot for your help. Urs
Hi Urs,
I wasn't aware that you're using CafeTran too.
Anyway, it's a very interesting approach that you suggest. Up to a certain level this should be possible, I guess.
CafeTran already offers "Automatic transfer of remaining tags".
Of course, how useful this automation feature is depends on the file format (good results are possible with MS Word documents, for instance).
Now, with this feature enabled, at least the placement of spanning tags could be partially automated, by means of term recognition:
Zwölf Boxkämpfer jagen Viktor zum Spaß quer über den Sylter Deich.
Twelve box fighters chase Victor for fun across the dyke on Sylt.
With a glossary (or TMX termbase) that contains:
boxkämpfer = box fighter
Viktor = Victor
Sylt = Sylt
And with stemming enabled, CafeTran will give you terminology recognition hits for "Boxkämpfer", "Viktor" and "Sylter". A feature like "Wrap recognised terms with (matching) tags" could then auto-tag the translations "box fighters" (the plural is tagged too), "Victor" and "Sylt" (matched from "Sylter" because stemming is enabled).
Interesting thought. Difficult to implement? I have no idea.
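To make the idea a bit more concrete, here is a rough Python sketch of what "Wrap recognised terms with (matching) tags" might do. Everything in it is an assumption for illustration: the numbered tag notation (<1>...</1>), the function name wrap_recognised_terms, the choice of which source words are tagged, and the very crude "stemming" (a tagged source word counts as a hit if it begins with a glossary term). It is not how CafeTran works internally.

```python
import re

# Hypothetical glossary (source term -> target term), from the example above.
GLOSSARY = {"boxkämpfer": "box fighter", "viktor": "Victor", "sylt": "Sylt"}

# Assumed tag notation <1>...</1>; real CAT tool tags look different.
TAGGED_SPAN = re.compile(r"<(\d+)>(.*?)</\1>")


def wrap_recognised_terms(source, target):
    """Copy each source tag pair onto the translation of the term it wraps."""
    for match in TAGGED_SPAN.finditer(source):
        tag_id = match.group(1)
        tagged_text = match.group(2).lower()
        for src_term, tgt_term in GLOSSARY.items():
            # Crude stemming: "Sylter" is a hit for the glossary term "sylt".
            if not tagged_text.startswith(src_term):
                continue
            # Allow trailing letters so the plural "box fighters" is covered.
            pattern = re.compile(r"\b" + re.escape(tgt_term) + r"\w*", re.IGNORECASE)
            target = pattern.sub(
                lambda m: f"<{tag_id}>{m.group(0)}</{tag_id}>", target, count=1
            )
            break
    return target


source = "Zwölf <1>Boxkämpfer</1> jagen <2>Viktor</2> zum Spaß quer über den <3>Sylter</3> Deich."
target = "Twelve box fighters chase Victor for fun across the dyke on Sylt."
print(wrap_recognised_terms(source, target))
# Twelve <1>box fighters</1> chase <2>Victor</2> for fun across the dyke on <3>Sylt</3>.
```

Tags around words that have no glossary hit would still have to be placed by hand, so this would only partially automate the job.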
This approach could also solve a problem that I face quite often: my glossary entries are so long that they often span tags.
E.g.: Zwölf Boxkämpfer = Twelve box fighters
In a sentence like:
Zwölf Boxkämpfer jagen Viktor zum Spaß quer über den Sylter Deich.
CafeTran will add the "opening" tag before "Zwölf Boxkämpfer". With the new approach, it could place the first tag correctly before the noun. Fascinating idea!
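The difference can be sketched in a few lines of Python. This assumes (and it is only my assumption) that the current placement follows the longest glossary hit in the sentence, whereas the new approach would look up only the text that is actually inside the tags. The function names and the glossary are hypothetical.

```python
# Hypothetical glossary: a long entry plus the shorter entry contained in it.
GLOSSARY = {
    "zwölf boxkämpfer": "Twelve box fighters",
    "boxkämpfer": "box fighter",
}


def longest_hit(sentence):
    """Assumed current behaviour: the longest entry found in the sentence wins."""
    hits = [src for src in GLOSSARY if src in sentence.lower()]
    return max(hits, key=len) if hits else None


def hit_for_tagged_text(tagged_text):
    """Proposed behaviour: only entries the tagged text itself begins with count."""
    text = tagged_text.lower()
    hits = [src for src in GLOSSARY if text.startswith(src)]
    return max(hits, key=len) if hits else None


sentence = "Zwölf Boxkämpfer jagen Viktor zum Spaß quer über den Sylter Deich."
print(longest_hit(sentence))              # zwölf boxkämpfer
print(hit_for_tagged_text("Boxkämpfer"))  # boxkämpfer
```

With the second lookup, the tag would be attached to the translation of the noun only, even though the longer entry also matches the sentence.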
[Edited at 2016-10-08 07:57 GMT]