Extremely long segments with HTML code - bad idea
Thread poster: Dalibor Skalník
Dalibor Skalník  Identity Verified
Local time: 22:58
English to Czech
Jun 3, 2019

I would like to share my experience with a badly prepared file.

I got an XTM job. There were lots of fuzzy matches in extremely long segments, often over 1000 words each. Between individual sentences there was HTML code as plain editable text, and there were also many tags, up to 90, displayed as numbers starting from 1. Large segments are never a good idea; add HTML code as plain text and many tags, and you get a nightmare job.

Nightmare number one was the tags. Many of those large segments were 99% fuzzy matches, and often the only problem was tags that had changed position. Sometimes those 99% matches required many tag changes, which is far more work than a 99% match should involve; the penalty should have been much higher.

Nightmare number two was the fuzzy matches containing HTML code. The sentences were often the same; what changed was the code between the words. Editing large segments with many sentences is more demanding than working with properly segmented files, and making changes in HTML code is much more difficult than editing text.

After several segments I knew that I just could not continue: the job contained 86 thousand words, mostly those dreadful fuzzies. I explained the situation to the agency and the translation was stopped. It was a big mistake that the original translation, when these segments were still no-matches, was done with this horrible segmentation in the first place.

So my advice is: if you get a big job with no-matches in very large segments (plus HTML code as plain text), stop working and warn the agency/client that this will cause big problems if those segments are reused in future translations, because the fuzzy matches will be based on those large no-matches. If the client/agency is reasonable, they will do something about it. It will also save the client money. In my case many segments only required changes in the HTML code or tags, with a penalty of 1 regardless of the number of changes.

BTW, in my case splitting the segments was not an option, and even if it had been, it would have required a lot of extra work.
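To illustrate what better file preparation could look like, here is a minimal Python sketch. It is not how XTM or any other CAT tool works internally; the function names, the `{n}` placeholder format and the naive sentence splitter are all assumptions made for the example. It hides plain-text HTML behind numbered placeholders, splits the long segment at sentence ends, and can restore the tags afterwards.

```python
import re

# Matches anything that looks like an HTML tag, e.g. <b>, </b>, <br/>
TAG_RE = re.compile(r"<[^>]+>")
# Naive sentence boundary: whitespace that follows ., ! or ?
# (assumption: no abbreviations such as "e.g." inside the text)
SENTENCE_END_RE = re.compile(r"(?<=[.!?])\s+")


def protect_tags(text):
    """Replace each HTML tag with a numbered placeholder such as {1}.

    Assumes the text does not already contain literal {n} sequences.
    """
    tags = []

    def repl(match):
        tags.append(match.group(0))
        return "{%d}" % len(tags)

    return TAG_RE.sub(repl, text), tags


def split_segment(text):
    """Return (list of sentence-level segments, list of original tags)."""
    protected, tags = protect_tags(text)
    parts = [p.strip() for p in SENTENCE_END_RE.split(protected)]
    return [p for p in parts if p], tags


def restore_tags(text, tags):
    """Put the original HTML tags back in place of the placeholders."""
    return re.sub(r"\{(\d+)\}", lambda m: tags[int(m.group(1)) - 1], text)


sample = "First sentence. <br/>Second one has <b>bold</b> text. Third!"
segments, tags = split_segment(sample)
for seg in segments:
    print(seg)
```

Each short segment then carries only its own few placeholders instead of up to 90 in one block, and joining the restored segments with single spaces gives back the sample text.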


Adam Warren
 
Andrzej Mierzejewski  Identity Verified
Poland
Local time: 22:58
Polish to English
I do agree Jun 3, 2019

I do agree: long sentences are difficult to understand as well as to translate, particularly when you use CAT software. That's why I dare to cut long sentences into pieces. Sometimes one long segment (say 65 words/350 characters including spaces) can be divided into three shorter ones, which are much easier to comprehend.

Once the translation is finished and I have to deliver a text file (TXT, DOCX, etc.), I sometimes (more seldom than often) recombine those short pieces into longer ones, similar to the original. I do this based on my experience and on my own responsibility, and I add a note about it for the PM or client. I've never heard a complaint about my method.

Having said that, I cannot refrain from one small remark: the longest sentence in your post has 53 words/299 characters (spaces included); please correct me if I'm wrong. Would it be easy to translate?


 
Dalibor Skalník  Identity Verified
Local time: 22:58
English to Czech
TOPIC STARTER
Everything is ok now, the file is well segmented Jun 7, 2019

Having said that, I can not refrain from one small notice: the longest sentence in your post counts 53 words/299 signs (spaces included) - pls correct me if I'm wrong. Would it be easy to translate?

Of course it wouldn't, but my sentence was never meant to be translated.
BTW, I got the file for translation in Trados Studio. It is now mostly well segmented, and working with Trados Studio (2017) is much more effective than working with XTM, especially when making changes to previously translated segments. XTM is a pain in the... to work with.

[Edited at 2019-06-07 12:23 GMT]

[Edited at 2019-06-07 12:24 GMT]


 

