'Try'ing to look at Talos

I've started to hack on the bcp47 patch, and as I add complexity to the chrome registry code path, I'll actually need to look at performance results. Being a good citizen, I'll start with try-server builds. TBH, the prospect of having to do that kept me from starting on this patch for a while, and the patch hasn't come far yet, either. Now, enough ranting: I've created a hack. Open the book of e-ville.

When looking at talos results, I've got two problems: an overwhelming amount of numbers, and, in particular for the try server, finding something reasonable to compare with. All the tools we have so far compare runs that are close in time, but on try, close in time doesn't necessarily mean close in code, or in anything else useful. So here's what I did:

  1. Open the try pushlog, and select a push to investigate.
  2. Load /try/json-info?node= for that changeset and walk up its parents until reaching one that has been pushed to the reference repo, checked via /mozilla-central/json-pushes?changeset=.
  3. Load up to five pushes later than the one found.
  4. Load a total of ten pushes, including the ones above, going back in time.
  5. Ask the graphs-stage API for the perf results.
  6. Average the mozilla-central reference numbers per test and platform, and create a scatter plot for each, scaled so that 1 is the mean, showing the min and max of the reference numbers plus the try result.
  7. Show missing perf results in the try run and the base results.
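
Steps 6 and 7 boil down to a small normalization, which could be sketched roughly like this. This is not the code from the app, just a minimal illustration; the function name, the dict layout keyed by (test, platform), and the sample numbers are all made up for the example.

```python
from statistics import mean


def normalize(reference, try_results):
    """Scale results so that the reference mean is 1 (hypothetical helper).

    reference:   {(test, platform): [values from the reference pushes]}
    try_results: {(test, platform): value from the try push}
    """
    points = {}
    for key, values in reference.items():
        if not values:
            continue  # no reference numbers for this test/platform
        avg = mean(values)
        if avg == 0:
            continue  # can't scale against a zero mean
        entry = {"min": min(values) / avg, "max": max(values) / avg}
        if key in try_results:
            entry["try"] = try_results[key] / avg
        points[key] = entry
    # Step 7: results that only show up on one side.
    missing_in_try = sorted(k for k in reference if k not in try_results)
    missing_in_reference = sorted(k for k in try_results if k not in reference)
    return points, missing_in_try, missing_in_reference


# Made-up numbers, only to show the shape of the data.
reference = {
    ("Ts", "linux"): [90.0, 100.0, 110.0],
    ("Tp4", "linux"): [400.0, 420.0],
}
try_results = {
    ("Ts", "linux"): 105.0,
    ("Txul", "mac"): 3.0,
}
points, missing_in_try, missing_in_reference = normalize(reference, try_results)
```

Each entry in points is then one scatter plot: the reference mean sits at 1, min and max give the spread of the reference pushes, and "try" is the dot for the try push.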

On to the caveats: None of the json APIs above support cross-site XHR requests, AFAICT, so I had to create a full-blown web app to do this. I picked django just because that's what I'm used to. The app is tested on 1.1.1, but should work on 1.2 as well. So far, there are no db requirements.

Even worse, json-info is really slow, so the app as it is right now doesn't even remotely scale. That's mostly the reason why I don't intend to host it anywhere as is.

I don't understand why there is so much noise in which test results actually come up, nor whether graphs-stage is the right graph server to use to begin with.

It looks like it was ripped out of the book of e-ville.

Here are the two screens for your sneak-preview pleasure:

The start screen, where you can select your push:

[Screenshot: first screen of my ughful try talos compare app]

After clicking Go and waiting a while, you'd end up with:

[Screenshot: second screen of my ughful try talos compare app]

You can get to the alert showing test name and platform by clicking on the dots.

If you're interested, the code is triple-licensed and on hg.mozilla.org. Patches welcome.

In theory, I could see solving the perf problems by integrating the pushes app we have for the l10n dashboard, but that's more involved than I feel like spending time on right now, mostly because I don't exactly know how well refreshing the try clone would perform.