Working on xliff files exported from Wordpress WPML (Studio 2019)
Thread starter: Johanne Dupuy

Johanne Dupuy  Identity Verified
France
Local time: 20:46
Member (since 2018)
English to French
May 3

Hi,
One of my clients is sending me some XLIFF files extracted from WordPress using WPML, and they are not usable in Studio.
Here is what I get: one big segment, a lot of text not to be translated, etc.

Does anyone have experience with XLIFF files extracted from WPML/WordPress?

Is there something my client should do during the extraction to avoid this? Would it be better if he sent me another format?

Is there something I can do in order to use such files in Studio? (not too time-consuming, as I get hundreds of those)

Would that work better with another plugin than WPML? (I was told about Polylang).

Thanks for your advice,

Johanne


 

Samuel Murray  Identity Verified
Netherlands
Local time: 20:46
Member (since 2006)
English to Afrikaans
+ ...
@Johanne May 4

Johanne Dupuy wrote:
Here is what I get: one big segment, a lot of text not to be translated, etc.


I don't have any experience in this, but based on what I have read about this in previous years and on what I was able to google about this today, I believe what you're describing does not mean that there is something wrong with the WPML XLIFF files, but that Trados is incapable of processing it without some tinkering.

This post from 2013 complains about the exact same thing:
https://wpml.org/forums/topic/xliff-all-content-in-single-trans-unit/
...but the developers' response makes sense to me. The content of an entire page is put into a single XLIFF segment, and it is up to the CAT tool to split it into further segments, based on the translator's own preference. (It just so happens that most CAT tools create separate segments in XLIFF files for separate sentences, but a good CAT tool should be able to segment further by sentence in its own UI even if the source content (and subsequent translated file) is segmented by paragraph or by page.)

In addition, based on some example files that I have seen, WPML puts the translatable content inside CDATA blocks (which is perfectly acceptable), and this means that some CAT tools are unable to recognise the tags as "tags". I believe Trados refers to such tags as "embedded content". Here is a post about using embedded content processing to deal with WPML tags (it doesn't go into great detail but you could search for similar terms):
https://community.sdl.com/product-groups/translationproductivity/f/studio/25698/translating-xliff-files-from-wpml-in-sdl-studio
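
To see what such a file looks like to a parser, here is a minimal Python sketch. The sample XLIFF is hypothetical, written only to match the description above (one trans-unit for the whole page, content inside CDATA); real WPML exports may differ, and the names `SAMPLE` and `list_units` are my own.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the description above; not a real WPML export.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="post-1" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="body">
        <source><![CDATA[<h2>Welcome</h2>
<p>First sentence. Second sentence.</p>]]></source>
        <target><![CDATA[]]></target>
      </trans-unit>
    </body>
  </file>
</xliff>"""

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def list_units(xliff_text):
    """Return (id, source text, number of child elements) for each trans-unit."""
    root = ET.fromstring(xliff_text)
    units = []
    for unit in root.iterfind(".//x:trans-unit", NS):
        source = unit.find("x:source", NS)
        # CDATA content arrives as plain text, so the HTML is not parsed as elements.
        units.append((unit.get("id"), source.text or "", len(list(source))))
    return units


for unit_id, text, child_count in list_units(SAMPLE):
    print(unit_id, child_count, repr(text))
```

Because the markup sits inside CDATA, the parser hands it over as ordinary text and the `source` element has no child elements. That is presumably why a CAT tool needs an extra embedded-content step before it treats `<h2>` and `<p>` as tags.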

I imagine posting a question about this on the Trados forum would yield some answers.
https://community.sdl.com/product-groups/translationproductivity/f/general

This page in the MemoQ help appears to warn that not all existing target text in a WPML XLIFF file can be trusted to be a translation of the associated source text:
https://docs.memoq.com/current/en/Places/wpml-xliff-filter.html

Johanne Dupuy wrote:
Is there something my client should do during the extraction to avoid this?


The responses to the 2013 post above indicate that a feature to split by paragraph may have been developed since then. If so, your client may have such an option available, but my guess is that the text would still be contained in CDATA blocks anyway (so tags won't be recognised without further tweaking in Trados), and you'd still have the problem with long paragraphs.

Johanne Dupuy wrote:
Is there something I can do in order to use such files in Studio? (not too time-consuming, as I get hundreds of those)


I believe that once you have figured out the correct embedded content processing settings, you can create a project template in Trados to reuse every time you receive such files. Note that, AFAIK, embedded content processing is applied only when the SDLXLIFF files are created; changed settings are not applied to existing SDLXLIFF files. This means that while you are working out the correct settings, you'd have to create new temporary projects over and over until you find the right combination. It may be that someone on the official Trados forum can tell you exactly which settings to use from the get-go, though.

Johanne Dupuy wrote:
Would that work better with another plugin than WPML? (I was told about Polylang).


I doubt that is a solution. Installing and configuring a plugin for WordPress isn't a simple affair, and I have read that setting up a translation plugin is doubly difficult, so switching to e.g. Polylang may be wasted effort. This page says that Polylang Pro uses PO files as its translation format, but I haven't seen sample files, so I can't tell whether its PO files suffer from other issues:
https://polylang.pro/doc/import-and-export-strings-translations/

[Edited at 2021-05-04 08:25 GMT]


 

Johanne Dupuy  Identity Verified
France
Local time: 20:46
Member (since 2018)
English to French
TOPIC STARTER
Thanks for this long and detailed answer! May 4

I'm indeed discussing these questions with SDL on their forum and trying to work out what I should do to make those files easier to translate.
I gather there are two problems with the files I get:
- non-translatable text which appears between
- one main big segment instead of one per sentence
See the screenshot here:

[Image: Studio xliff]

I'm not sure that I'll find a solution for both problems.
I would be curious to know if it works better on other CAT tools.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 20:46
Member (since 2006)
English to Afrikaans
+ ...
@Johanne May 4

I got my hands on a test file and here is what I found.

Trados, WFP3 and WFP6 all open the file as you described. I tried the instructions for embedded content in Trados but could not get it to work.

MemoQ, on the other hand, successfully sub-segmented and tagged the embedded content without the need for any further tweaks. I'm not sure for how long an unregistered version of MemoQ keeps going (it used to be that it reverted to a simpler version after 30 days), but you could roundtrip via MemoQ. It's a few extra steps but not too many.

The process of creating a project in MemoQ is a little weird, but it's not hard. Also, while Trados creates an SDLXLIFF file automatically, MemoQ does not create a file in its own format automatically, so it requires an extra step.
1. In MemoQ, click the little arrow on the "New Project" button and go through the dialog to create a new project (do not choose "New Project from Template").
2. On the second page of the dialog, click "Import" to import the WPML XLIFF file.
3. Later, in the project itself, right-click the file and select Export > Export Bilingual, and select the MQXLIFF format (optionally click "Plain XLIFF"). This creates an MQXLIFF file that you can open and translate in Trados (Trados obviously converts this to SDLXLIFF and then later back to MQXLIFF again).
4. When the translation is finished in Trados, use the usual method of creating the target file (erm... right-click the project).
5. In MemoQ, go to the project (double-click it in the list), select the file, click the "Import" button at the top left of the screen, choose Import and select the translated file to import it.
6. To create the final translated XLIFF file, right-click the file and select Export > Export (Choose Path) or Export (Stored Path). This creates the final WPML XLIFF file.

You can reuse a project over and over; it's super easy to add files to an existing project in MemoQ: just drag and drop.


 

Johanne Dupuy  Identity Verified
France
Local time: 20:46
Member (since 2018)
English to French
TOPIC STARTER
MemoQ might be the solution, then... May 4

Thanks again for testing this.
I'm more and more tempted to work with MemoQ and not only for that reason...
Could you please show me or send me an extract or a picture of what you are getting with MemoQ?


 

Samuel Murray  Identity Verified
Netherlands
Local time: 20:46
Member (since 2006)
English to Afrikaans
+ ...
@Johanne May 4

Johanne Dupuy wrote:
Could you please show me ... a picture of what you are getting with MemoQ?


[Image: the file as displayed in MemoQ]


 

Johanne Dupuy  Identity Verified
France
Local time: 20:46
Member (since 2018)
English to French
TOPIC STARTER
Rather convincing! May 4

Thanks again!

 

Stepan Konev  Identity Verified
Russian Federation
Local time: 21:46
English to Russian
Add a new segmentation rule May 4

Try this:
1. Go to Project Settings > Language Pairs > All Language Pairs > Translation Memory and Automation > select your TM (or the first/top TM in the list if you use more than one TM) > click Settings > click Language Resources > click the pencil icon on Segmentation rules for your source language.
2. In the Segmentation Rules window, click Add...
3. Type the rule name (Description field)
4. Click 'Advanced View'
5. In the left box (Before break) type this:
.[\n]+
6. In the right box (After break) type this:
.
7. Press OK 5 times to complete the process
8. Remove your file from source and re-import it using the same TM.

All these actions apply to your TM which is used for segmentation. If you use more than one TM in your project, the segmenting TM is the top one (the first in the list).
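
For anyone who wants to preview what this rule does before re-importing, here is a rough Python sketch. It only imitates the idea (break wherever the text before matches `.[\n]+` and the text after matches `.`) using Python's `re` module. That Studio's regex engine behaves the same way on your files is an assumption, and `split_on_rule` is a hypothetical helper name, not anything from Studio.

```python
import re

# The two boxes from the steps above.
BEFORE_BREAK = r".[\n]+"  # any character followed by one or more line breaks
AFTER_BREAK = r"."        # any character (i.e. more text follows)


def split_on_rule(text):
    """Split text at every point where BEFORE_BREAK is followed by AFTER_BREAK."""
    pattern = re.compile(f"(?:{BEFORE_BREAK})(?={AFTER_BREAK})")
    segments = []
    start = 0
    for match in pattern.finditer(text):
        segments.append(text[start:match.end()])
        start = match.end()
    if start < len(text):
        segments.append(text[start:])
    # Drop surrounding whitespace and empty pieces for readability.
    return [s.strip() for s in segments if s.strip()]


page = "First paragraph.\n\nSecond paragraph\nThird."
for segment in split_on_rule(page):
    print(repr(segment))
```

Note that this breaks on line breaks only; sentences within a paragraph are still left to the normal full-stop rule.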


 

Stepan Konev  Identity Verified
Russian Federation
Local time: 21:46
English to Russian
Replace untranslatable plain text tags with placeholders May 4

If you want to convert the untranslatable plain text (tags) into real tags/placeholders, go to File > Project Settings > File Types > XLIFF > Embedded content > tick 'Enable embedded content processing' / 'Extract in all paragraphs' > click OK.
*You will need to re-import your file after these changes.
**However, I would only do this if there is no translatable text inside the tags.
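
To illustrate the general idea of turning plain-text tags into placeholders, here is a small Python sketch. It is not how Studio's embedded content processor works internally; the regular expression and the helper names `protect_tags` and `restore_tags` are assumptions made for illustration only.

```python
import re

# Assumed pattern: simple HTML-like opening and closing tags.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^<>]*>")


def protect_tags(text):
    """Replace each tag with a numbered placeholder such as {1}."""
    tags = []

    def replace(match):
        tags.append(match.group(0))
        return f"{{{len(tags)}}}"

    return TAG_PATTERN.sub(replace, text), tags


def restore_tags(text, tags):
    """Put the original tags back in place of the numbered placeholders."""
    for number, tag in enumerate(tags, start=1):
        text = text.replace(f"{{{number}}}", tag)
    return text


source = "<p>Hello <strong>world</strong></p>"
protected, tags = protect_tags(source)
print(protected)
print(restore_tags(protected, tags))
```

Anything inside the angle brackets, attribute values included, is hidden from the translator in this sketch, which is the same reason for the caution above about tags that contain translatable text.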


[Edited at 2021-05-04 20:55 GMT]


 

