how to convert a plain text glossary into a TermBase recognized by memoQ
Thread poster: Angel Llacuna
Angel Llacuna
Spain
English to Spanish
May 11, 2018

I created a glossary of terms using a simple plain text editor such as TextPad.

There is the English term followed by an equal sign, and then the Spanish translation.
For some words, several translations are given, separated by asterisks.

An extract of this glossary looks like this:



How can I convert this plain text glossary into a TermBase that memoQ can recognize while I translate with that tool?


 
Anthony Green
Italy
Italian to English
often quite a lot of steps May 11, 2018

Angel, if I were you I would upload the first 100 items or so, and then we could see how to deal with all the synonyms and tags you have in there.
Off the top of my head I wouldn't like to say "just do X, Y and Z", but there is no doubt that it can be done.


 
Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
@Angel May 11, 2018

Angel Llacuna wrote:
How can I convert this plain text glossary into a TermBase that memoQ can recognize while I translate with that tool?


I'm not a MemoQ user but I checked the MemoQ help file:
http://kilgray.com/memoq/2015-100/help-en/index.html?term_base_csv_import_settings_.html

It would seem that you need to import it as "CSV". I think you should first replace all " = " with a tab, and then replace all " * " with e.g. a semicolon. When importing, select Tab as the delimiter. Tick the option "Split alternatives in field by" and put a semicolon in the box. I'm not sure if renaming your file to something.csv is required.
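The two replacements described above can be sketched in Python. This is only a rough sketch, assuming every glossary line has the shape "term = translation * translation"; the function name `to_tab_delimited` is made up for illustration and is not part of memoQ.

```python
def to_tab_delimited(line: str) -> str:
    """Turn 'term = a * b' into 'term<TAB>a;b'."""
    # Split on the first equals sign only, in case a translation contains one.
    source, _, targets = line.partition("=")
    # Asterisks separate alternative translations (assumed input format).
    alternatives = [t.strip() for t in targets.split("*") if t.strip()]
    return source.strip() + "\t" + ";".join(alternatives)


print(to_tab_delimited("acquire = adquirir * obtener * conseguir * recoger"))
```

Stripping the pieces also removes the spaces around the equals sign and the asterisks, so the result does not depend on the separators being padded consistently.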

I'm not 100% sure how "Split alternatives in field by" works, but it looks as if it would do what you're looking for. If it turns out that we misunderstood what that feature does, then I'm afraid you're going to have to turn those multi-translation items into separate items. In other words, change this:

acquire[tab]adquirir;obtener;conseguir;recoger

into this:

acquire[tab]adquirir
acquire[tab]obtener
acquire[tab]conseguir
acquire[tab]recoger

for each term. Can you figure out how to do this? I'm not 100% sure, but from videos it looks to me as if MemoQ would be okay with term bases that contain multiple entries for one source term.
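If the fallback is needed, the expansion into one row per translation could look roughly like this in Python. It is a sketch that assumes the rows are already tab-delimited with semicolon-separated alternatives, as in the example above; `split_alternatives` is a hypothetical helper name.

```python
def split_alternatives(row: str, sep: str = ";") -> list[str]:
    """Expand 'term<TAB>a;b' into ['term<TAB>a', 'term<TAB>b']."""
    # Only the first tab separates source from target; later columns are not handled here.
    source, targets = row.split("\t", 1)
    return [f"{source}\t{t.strip()}" for t in targets.split(sep) if t.strip()]


for entry in split_alternatives("acquire\tadquirir;obtener;conseguir;recoger"):
    print(entry)
```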

By the way, it looks as if your glossary needs a bit of additional tweaking, e.g. you have "-tech" and "(data)" which I believe would have to go into a third column, if you want MemoQ to realise that they are not part of the target text.


 
Angel Llacuna
Spain
English to Spanish
TOPIC STARTER
Thank you very much, Samuel and also Anthony for replying ... May 11, 2018

That bit of info, in the form of -tech, in my glossary excerpt is context information.
Can I define a third column for it in my CSV file?

accoustic noise declaration = declaración de ruidos -tech


 
Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
@Angel May 11, 2018

Angel Llacuna wrote:
That bit of info, in the form of -tech, on my glossary excerpt, is context information.
Can I define a third column for it on my csv file?


Yes, it is very common in plaintext CAT tool glossary formats that the third column is for extra information (the "comment" column). This means that you should add a tab (or whatever your separator character is) between the target term and the context information.
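Moving the context information into its own column could be sketched as follows. The pattern is an assumption based on the two tags mentioned in this thread ("-tech" and "(data)"): it treats a trailing "-word" or "(word)" after the translation as the comment. A real glossary may need a different pattern, and `add_comment_column` is a made-up name.

```python
import re

# Assumed tag shapes: whitespace, then "-word" or "(word)", at the end of the row.
TAG = re.compile(r"\s+(-\w+|\(\w+\))\s*$")


def add_comment_column(row: str) -> str:
    """Put a trailing context tag into a third, tab-separated column."""
    match = TAG.search(row)
    if match is None:
        return row  # no tag found: leave the two-column row unchanged
    return row[:match.start()] + "\t" + match.group(1)


print(add_comment_column("accoustic noise declaration\tdeclaración de ruidos -tech"))
```

Rows without a tag keep two columns, so the output file mixes two-column and three-column rows; whether a given tool accepts that mix should be checked on a small sample first.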

In my own CAT tool, WFC, I sometimes have separate entries for each translation, but sometimes I have just one entry, and then put the other translations in the comment column, depending on how important it is for the CAT tool to be able to automatically recognise each translation.

My comments also sometimes contain information about where the term came from, etc. In a proper term base, different types of contextual information would be in separate fields (e.g. parts of speech, origin, antonyms, synonyms, definitions), but in CAT tools with a simpler glossary display, all of that can be written in just one column. I also add information in the comment field if a term was changed from a previous version of the term. I know that some CAT tools have more advanced version tracking capabilities (I would not be surprised if MemoQ can do this), but maintaining all this information takes time, and sometimes you just need a quick-and-dirty glossary import.


 
Stepan Konev
Russian Federation
English to Russian
Just a minor addition May 11, 2018

The file may also be *.txt (not exclusively *.csv).
All other steps are the same as described by Samuel.

acquire[tab1]adquirir;obtener;conseguir;recoger[tab2]-tech

Assign 'Import as term' both to F0 (Source language) and F1 (Target language).
Assign 'Import as definition' to F2 (Target language).

F0=text before the first tab
F1=text between the first and the second tab
F2=text after the second tab

'Import as definition' is the simplest way, but you can fiddle with 'Import as other field' and select anything else if you like...
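Before importing, it can help to check how each row will break into F0, F1 and F2. The sketch below only mirrors the field numbering described above; it does not call memoQ, and `label_fields` is a hypothetical name.

```python
def label_fields(row: str) -> dict[str, str]:
    """Map each tab-separated field to F0, F1, F2, ... in order."""
    fields = row.rstrip("\n").split("\t")
    return {f"F{i}": value for i, value in enumerate(fields)}


print(label_fields("acquire\tadquirir;obtener;conseguir;recoger\t-tech"))
```

A row that yields more than three fields, or fewer than two, probably contains a stray or missing tab and is worth fixing before the import.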




[Edited at 2018-05-11 10:32 GMT]


 
Anthony Rudd
German to English
Import glossary May 11, 2018

Import terminology as CSV TB, UTF-8
specify = as delimiter
specify F0 and F1 as "term"
voilà
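If the glossary was not saved as UTF-8, it can be re-saved before the import. This is a sketch under an assumption: the original encoding is guessed to be Windows-1252 ("cp1252"), which is common for text editors on Windows but is not stated anywhere in this thread. `resave_as_utf8` is a made-up helper name.

```python
from pathlib import Path


def resave_as_utf8(src: Path, dst: Path, source_encoding: str = "cp1252") -> None:
    """Read src in its original encoding and write the same text to dst as UTF-8."""
    # source_encoding is an assumption; change it if accented letters come out garbled.
    text = src.read_text(encoding=source_encoding)
    dst.write_text(text, encoding="utf-8")
```

For example, `resave_as_utf8(Path("glossary.txt"), Path("glossary_utf8.txt"))` would leave the original file untouched and write a UTF-8 copy next to it; both file names are placeholders.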


 

