The a10n project feeds elmo with source comparisons between en-US and localizations. It also gathers data from Mozilla’s version control systems (currently hg.mo) and updates the corresponding data in elmo.
The generated data is stored as a summary in the elmo database, and as detailed JSON in Elasticsearch.
Currently, there are two schemes of comparisons supported:
compare-locales uses l10n.ini files to find out what to do. These configuration files are included in the main Mozilla code bases, and are used for Firefox, Firefox for Android, and Thunderbird. They map one or more repositories with the upstream code, including en-US, to one repository per locale. The fullest example is tb_aurora, which compares toolkit etc. from:
|en-US repo||dir||l10n repo||dir|
Projects that don’t use the directory structure of Firefox but need support on the l10n dashboard are handled by compare-dirs, via a helper repository. The prominent example today is gaia, the UI part of Firefox OS. For those, a pure en-US repository is created and compared against a repository per localization next to it. Some features of compare-locales, such as filtering of entries, are not supported in compare-dirs.
|en-US repo||dir||l10n repo||dir|
The tasks can be grouped into three buckets:
These tasks have different concurrency and load characteristics.
The hg poller is mostly waiting for network responses from hg, and is thus well suited for single-threaded asynchronous callback code. It should be able to throttle the load on the server, but also to scale to parallel requests.
This code segment updates the local clones that are shared among the elmo hardware, and updates the database to be in line with that state.
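The throttling requirement can be sketched as follows. This is a minimal illustration written with asyncio coroutines rather than callbacks, not the a10n implementation. The names `poll_all`, `fetch`, and `max_parallel` are hypothetical; `fetch` stands in for whatever request asks hg for new pushes.

```python
import asyncio


async def poll_all(repos, fetch, max_parallel=4):
    """Poll every repository with at most max_parallel requests in flight.

    `fetch` is a hypothetical coroutine function taking a repository name
    and returning the new pushes for it.
    """
    limit = asyncio.Semaphore(max_parallel)

    async def poll_one(repo):
        # The semaphore throttles load on the server while still
        # allowing several requests to wait on the network in parallel.
        async with limit:
            return repo, await fetch(repo)

    results = await asyncio.gather(*(poll_one(repo) for repo in repos))
    return dict(results)
```

Raising `max_parallel` scales out to more parallel requests; lowering it to 1 serializes them.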
The Changesource observes new pushes to repositories, extracts the files for those changes and submits them to the schedulers.
The scheduler observes changes coming from the changesource and decides which automation to run for which of them. Schematically, it’s composed of three decision makers, triggering three automation tasks.
The scheduler is configured by Trees, which associate repositories, directories, and a group of l10n repositories (forest) with a compare-locales (or -dirs) setup.
The first decision is whether to reconfigure the scheduler. This happens if the l10n.ini files change, or if the set of affected locales changes.
If the l10n.inis change,
1. the schedulers stop taking new changes
2. the l10n.inis for the affected trees are reloaded
3. if the changes affected the configuration
+ trigger a rebuild of all locales for that tree with their latest revision
If the all-locales file changes,
1. load the changed file, and parse it
2. if there are new locales
+ trigger a rebuild of each new locale against its latest revision
3. if there are locales dropped
+ update elmo to deactivate that locale’s latest
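The all-locales steps boil down to a set difference between the old and the new file. The sketch below assumes all-locales lists one locale code per line; the function names are hypothetical and only illustrate the decision, not the actual a10n code.

```python
def parse_all_locales(text):
    """Return the set of locale codes, assuming one code per non-empty line."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def diff_locales(old_text, new_text):
    """Return (added, dropped) locale lists between two all-locales contents."""
    old = parse_all_locales(old_text)
    new = parse_all_locales(new_text)
    added = sorted(new - old)    # these need a rebuild against their latest revision
    dropped = sorted(old - new)  # these get deactivated in elmo
    return added, dropped
```

A caller would trigger a rebuild for each entry in `added` and update elmo for each entry in `dropped`.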
The compare-locales jobs are triggered either by changes to one of the repositories holding the en-US files, or to one of the repositories in the forest.
Each tree is associated with a list of repositories and a list of directories. The configuration process gathers and re-sorts those, so the scheduler algorithm can work the opposite way around, from a changed repository to the affected trees.
locales/en-US of any of our directories above. If so, note the tree.
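One way to picture the inverted lookup is an index from repository to (directory, tree) pairs, which is then matched against the changed files of a push. This is a sketch under assumptions: the layout `<dir>/locales/en-US/` is taken from the text above, while `build_index`, `affected_trees`, and the shape of the `trees` mapping are hypothetical.

```python
from collections import defaultdict


def build_index(trees):
    """Invert {tree: (repos, dirs)} into {repo: [(dir, tree), ...]}."""
    index = defaultdict(list)
    for tree, (repos, dirs) in trees.items():
        for repo in repos:
            for directory in dirs:
                index[repo].append((directory, tree))
    return index


def affected_trees(index, repo, changed_files):
    """Return the trees for which a push to `repo` touches en-US files."""
    hits = set()
    for directory, tree in index.get(repo, []):
        prefix = directory + "/locales/en-US/"
        if any(path.startswith(prefix) for path in changed_files):
            hits.add(tree)
    return hits
```

The index is built once per configuration, so each incoming change only needs a lookup by repository and a prefix check per directory.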
Each tree is associated with exactly one forest, with a repository for each locale.
The comparison jobs are two-fold, compare-locales and compare-dirs. They do share some characteristics, though.
The comparisons are done on disk, on checked out files. For all repositories included in the comparison, the repositories need to be checked out with the given revisions, in the same directory structure as they’re on the upstream server. I.e., http://hg.mozilla.org/l10n-central/de/ is checked out to l10n-central/de. After the check-out, version control isn’t needed, thus doing this on hg shares or git clone -s is fine.
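The mapping from upstream URL to check-out directory can be sketched with the standard library. `checkout_path` and its `base` parameter are hypothetical names used for illustration only.

```python
import posixpath
from urllib.parse import urlsplit


def checkout_path(repo_url, base):
    """Mirror the server-side path of a repository below a local base dir."""
    # Drop scheme and host, and strip the surrounding slashes of the path.
    relative = urlsplit(repo_url).path.strip("/")
    return posixpath.join(base, relative)
```

For example, `checkout_path("http://hg.mozilla.org/l10n-central/de/", "/work")` yields `/work/l10n-central/de`.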
If multiple comparisons run on the same machine, there shouldn’t be conflicting check-outs. In most scenarios where jobs are running in parallel, you’re seeing many jobs on the same revision of en-US, so sharing the working dir lets the comparisons benefit from OS disk caches. We should therefore be careful about synchronizing jobs on the same hardware.
A compare-locales job requires a revision for each repository, and the data to get to the entry point l10n.ini, as well as the locale to compare to.
A compare-dirs job requires a revision for both en-US and the locale, the locale, and the forest in which the two repositories reside.
Every now and then, you get more than one change that triggers the same locale for the same tree. Merge days are a perfect example, where we change both en-US and l10n, all around the same time. In those cases, it’s a fair optimization to only compare the latest revisions against each other.
This is fairly trivial as long as all changes are globally ordered: just update the existing request to take a new revision. If you don’t know that the order of the changes is given, things are more tricky, i.e., you need to find out which of the available revisions is newer than the other.
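For the globally ordered case, coalescing can be sketched as a dictionary of pending requests keyed by tree and locale. This only illustrates the idea; the class and method names are hypothetical, and the sketch does not handle changes that arrive out of order.

```python
class PendingComparisons:
    """Keep at most one pending comparison per (tree, locale)."""

    def __init__(self):
        self._pending = {}

    def request(self, tree, locale, repo, revision):
        # Assumes requests arrive in global order, so a later revision
        # for the same repository simply replaces the earlier one.
        revisions = self._pending.setdefault((tree, locale), {})
        revisions[repo] = revision

    def pop_all(self):
        """Return the coalesced jobs and clear the pending set."""
        jobs, self._pending = self._pending, {}
        return jobs
```

On a merge day, several pushes to en-US and to the l10n repository for one locale then collapse into a single job holding the latest revision of each.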
Let’s wrap this up by summarizing the jobs and the affected resources.
|upstream repo||elmo db||local clones||workdir||ES|
This automation is currently implemented based on buildbot 0.7.12+.
The a10n repo will hold an implementation based on queues and celery.