ValaCAT. Some design aspects

by Marcos Chavarría Teijeiro

Continuing my series of articles about my participation in Google Summer of Code, this time I’ll talk a little about some design aspects of the new application, so if you are interested you can criticize me :D.

Firstly, I want to say that I have found a name for the new tool which I think is pretty cool and sums up some aspects of the application: ValaCAT, where CAT stands for “Computer Assisted Translation” and Vala (obviously) is the name of the programming language.

Since last week I have been working on a design for this new application, taking the old Gtranslator design as a reference. Here are some aspects that I would like to comment on:

Languages

As a result of conversations with Daniel Mustieles (Spanish language L10N coordinator) I have discovered that some languages have more than two plural forms. For instance, Russian has 3 different plural forms, and “two apples” is translated in a different way than “five apples”.

The information about a language's plural forms is stored in a snippet in the PO file header. This snippet contains the number of plural forms and a C expression that tells the program which plural form to use for a given number. This is an example for Russian.

Plural-Forms:
nplurals = 3;
plural = n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;

As we can see, the first parameter (nplurals) indicates that there are 3 plural forms, and the second parameter (plural) is an expression that returns one of three numbers (0, 1 or 2), one for each plural form. This expression is useful for programs but it’s difficult to understand for a newcomer translator. In the new tool we will create a description for each of these plural forms instead of using numbers. For example, instead of using “1” we will use “more than 10”.
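To make the rule easier to follow, here is a minimal sketch of the same Russian expression written out in Python (not in Vala, just for brevity). The function name and the tag texts are made up for illustration and are not part of the ValaCAT design.

```python
def russian_plural_index(n: int) -> int:
    """Return the plural-form index (0, 1 or 2) for the count n."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ... but not 11
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1  # 2-4, 22-24, ... but not 12-14
    return 2      # everything else: 0, 5-20, 25-30, ...


# Hypothetical human-readable tags, one per index (wording is illustrative).
RUSSIAN_PLURAL_TAGS = [
    "ends in 1 (except 11)",
    "ends in 2-4 (except 12-14)",
    "everything else",
]


def russian_plural_tag(n: int) -> str:
    """Map a count to the descriptive tag of its plural form."""
    return RUSSIAN_PLURAL_TAGS[russian_plural_index(n)]
```

With a table like this the translator sees a description next to each plural entry, while the program keeps using the numeric index internally.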

To do so we will create a Language class that encapsulates information about the different languages and provides these plural tags.

Language Class Diagram

This class will be some kind of singleton, with one shared instance per language: new instances are created dynamically when the get_language_by_name or get_language_by_code methods are called. The tags for the different plural forms can be hardcoded because the number of different combinations is limited.
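A rough sketch of that idea, again in Python rather than Vala. Only the names get_language_by_name and get_language_by_code come from the design; the hardcoded table, the attributes and get_plural_tag are assumptions made for this example.

```python
class Language:
    """Per-language data: code, name and plural-form tags."""

    # Hypothetical hardcoded table: code -> (name, plural tags).
    _KNOWN = {
        "es": ("Spanish", ["singular", "plural"]),
        "ru": ("Russian", ["one", "few", "many"]),
    }
    _instances = {}  # cache of already created instances: code -> Language

    def __init__(self, code, name, plural_tags):
        self.code = code
        self.name = name
        self.plural_tags = list(plural_tags)

    @classmethod
    def get_language_by_code(cls, code):
        """Return the shared instance for code, creating it on first use."""
        if code not in cls._instances:
            name, tags = cls._KNOWN[code]  # KeyError for unknown codes
            cls._instances[code] = cls(code, name, tags)
        return cls._instances[code]

    @classmethod
    def get_language_by_name(cls, name):
        """Look the language up by name and reuse the by-code cache."""
        for code, (known_name, _tags) in cls._KNOWN.items():
            if known_name == name:
                return cls.get_language_by_code(code)
        raise KeyError(name)

    def get_plural_tag(self, index):
        """Return the descriptive tag for a plural-form index."""
        return self.plural_tags[index]
```

Both lookups end up in the same cache, so asking for "ru" and for "Russian" gives back the very same object.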

Iterators

We need a way to traverse the documents in different ways:

  • Visiting only translated messages
  • Visiting untranslated messages
  • Visiting fuzzy messages
  • Following the visible order
  • Visiting messages that match a certain search

In order to do so we create the FileTabIterator class, which stores an ordered list of messages and provides next and previous methods. These two methods simply look up the next or previous item in the list and ask that item to be selected.
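The behaviour described above could look roughly like this. It is a Python sketch, not the real Vala class: the Message stand-in, its state strings and the predicate argument are assumptions for the example, and only FileTabIterator, next and previous come from the design.

```python
class Message:
    """Hypothetical stand-in for a message row shown in the tab."""

    def __init__(self, text, state):
        self.text = text
        self.state = state  # "translated", "untranslated" or "fuzzy"
        self.selected = False

    def select(self):
        # In the real application this would move the UI selection.
        self.selected = True


class FileTabIterator:
    """Ordered list of messages plus a cursor that can move both ways."""

    def __init__(self, messages, predicate=lambda m: True):
        # Keep only the messages this iterator is supposed to visit.
        self.messages = [m for m in messages if predicate(m)]
        self.index = -1  # nothing visited yet

    def next(self):
        """Select and return the next message, or None at the end."""
        if self.index + 1 >= len(self.messages):
            return None
        self.index += 1
        return self._select_current()

    def previous(self):
        """Select and return the previous message, or None at the start."""
        if self.index <= 0:
            return None
        self.index -= 1
        return self._select_current()

    def _select_current(self):
        message = self.messages[self.index]
        message.select()
        return message
```

A "fuzzy only" iterator is then just FileTabIterator(messages, lambda m: m.state == "fuzzy"), and the other traversal modes differ only in the predicate.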

Diagram Iterator

There are two aspects that have not been taken into account yet: one is how to generate/update these iterators and the other is where to store them. I think that storing some iterators in the tab class will be really useful, and the search-related iterator can be tied to some kind of Search class. Updating is a more difficult topic: do we build them from scratch when some change happens, or is it better to implement some kind of smart update, deleting or adding items to the list and updating the index?
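To compare the two update options, here is a small Python sketch. It is only an illustration of the trade-off, not a decision: it assumes the iterator stores document positions of the matching messages in sorted order, and PositionIterator, add, remove and rebuild are hypothetical names.

```python
import bisect


class PositionIterator:
    """Sorted document positions of matching messages plus a cursor."""

    def __init__(self, positions):
        self.positions = sorted(positions)
        self.index = -1  # cursor into self.positions; -1 means "before first"

    def add(self, pos):
        """Smart update: a message at pos now matches the filter."""
        i = bisect.bisect_left(self.positions, pos)
        if i < len(self.positions) and self.positions[i] == pos:
            return  # already present
        self.positions.insert(i, pos)
        if i <= self.index:
            self.index += 1  # keep the cursor on the same message

    def remove(self, pos):
        """Smart update: the message at pos no longer matches."""
        i = bisect.bisect_left(self.positions, pos)
        if i == len(self.positions) or self.positions[i] != pos:
            return  # not present
        del self.positions[i]
        if i <= self.index:
            self.index -= 1  # cursor stays put or falls back one entry


def rebuild(states, wanted):
    """From-scratch option: rescan every message state in the document."""
    return PositionIterator(p for p, s in enumerate(states) if s == wanted)
```

Rebuilding is simpler but rescans the whole document and loses the cursor, while the smart update costs one binary search per change and has to be careful to keep the index pointing at the same message.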

In the coming days I will write more articles about other aspects of this application's design. You can view the whole design in the GitHub repository.