Editing segmentation rules in memoQ – Part 1: Semicolon

Long time no see! This May has been pretty busy (:-)) and I haven’t had much time to write new posts for my blog, but I hope to make up for it in the near future as I am going to have a few days off this week (Thursday-Sunday).

Since I have some free time today, I decided to find a solution to a segmentation problem in one of the projects I am currently working on. The client requires the text to be split into segments after semicolons (;) as well, and sends translation memories that are segmented accordingly. Vendors whose CAT tools are not set up appropriately won’t get 100% matches from the TM, which may result in inconsistencies and will certainly lower their efficiency. I received instructions on how to change these settings in Trados, but as I prefer to use memoQ for this project, I needed to come up with my own solution. After doing some research on the Internet and reading the memoQ user manual, it turned out that the solution is very simple:

      1. You need to open the Resource console window in memoQ.
      2. Go to the tab Segmentation rules.
      3. Click the Create new link to create a new set of segmentation rules for English.
      4. Click the Edit link to modify them.
      5. In the Edit segmentation rule set window, enter the new rule ;#!#[\s] in the text field in the left column (Rules). Here ; is the semicolon, #!# marks the segment break, and [\s] matches a whitespace character such as a space.
      6. Click the Preview button in the upper right corner of the window.
      7. In the Segmentation preview window, insert a semicolon followed by a space anywhere in the sample text and click Preview to check whether the rule works.
      8. To use this segmentation rule set, select it in the Settings > Segmentation rules tab.
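To get a feel for what the rule does before trying it in memoQ, here is a minimal Python sketch that imitates it with an ordinary regular expression. It is only an approximation and not memoQ's own engine: the function name split_on_semicolons is made up for this example, and trimming the whitespace around each segment is my assumption about how the result should be displayed.

```python
import re

# Zero-width position right after ';' and right before a whitespace character.
# This imitates the rule ;#!#[\s] where #!# sits between the two parts.
SEMICOLON_BREAK = re.compile(r"(?<=;)(?=\s)")


def split_on_semicolons(text):
    """Split text after every semicolon that is followed by whitespace.

    Hypothetical helper for illustration only. Surrounding whitespace is
    stripped from each segment (an assumption, not memoQ behaviour).
    Requires Python 3.7+, where re.split accepts zero-width matches.
    """
    parts = SEMICOLON_BREAK.split(text)
    return [part.strip() for part in parts if part.strip()]


if __name__ == "__main__":
    sample = "Switch off the device; remove the cover; replace the filter."
    for segment in split_on_semicolons(sample):
        print(segment)
```

Note that a semicolon with no whitespace after it (as in "a;b") does not trigger a break in this sketch, because the pattern requires a whitespace character right after the semicolon.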

I have tested the rule using the files from this particular project and it works nicely. If you detect any cases in which it doesn’t, please let me know, so that I can modify it or add an exception. If you want, you may download this segmentation rule set by clicking the button below. This week I am going to write another guide on segmentation rules, so stay tuned!


2 thoughts on “Editing segmentation rules in memoQ – Part 1: Semicolon”

  1. The procedure makes sense for the semicolon, because most of the time the first letter after a semicolon won’t be a capital letter, and the default memoQ segmentation rules expect a capital letter after an “ending” symbol.
    But if you’d like to add any other symbol as a segmentation delimiter, you need to perform steps 1-4, and then just go to the “Custom lists” tab, click the “#end#” list and add the desired symbol to the list items (right-hand field).
