Validate results on upload #534

lukebjerring · 2018-09-11T16:10:55Z

Occasionally, the infrastructure executing the tests hits frequent errors and produces a result set that is largely empty, which reduces the value of the homepage on wpt.fyi. We should introduce a verification system that flags bad runs for human intervention.

lukebjerring · 2018-09-11T16:12:13Z

cc @thejohnjansen - as discussed, all browsers exhibit these problems, so we consider the curation of results to be a server-side task that we would like to solve generally.

lukebjerring · 2018-09-12T14:59:58Z

cc @jugglinmike, who has implemented similar behaviour in the master job in BuildBot

jugglinmike · 2018-09-12T20:51:06Z

The results collection project has a long and storied past with incomplete results (e.g. web-platform-tests/results-collection#478, web-platform-tests/results-collection#541, and web-platform-tests/results-collection#466).

My advice to those just getting involved in this space is to be precise with the distinction between WPT "tests" and "subtests". Missing "tests" can be detected because even though "tests" are distinct from "test files" (due to the presence of "multi-global tests"), "tests" are tracked by the WPT test manifest. "subtests" are generated dynamically at runtime. Since some tests intentionally allow the number of subtests to vary, there has been no attempt to enforce subtest consistency or even to track them.
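To make the test/subtest distinction concrete, here is a small sketch over a hand-written report in the wptreport JSON shape (a top-level "results" list whose entries have "test", "status" and "subtests"). The file name example.any.js and its results are made up for illustration; the point is only that one multi-global test file yields several tests, and that each test can carry a different number of subtests.

```python
# Hypothetical report fragment: one multi-global file (/dom/example.any.js)
# shows up as two distinct tests, one per global scope.
report = {
    "results": [
        {
            "test": "/dom/example.any.html",
            "status": "OK",
            "subtests": [
                {"name": "first check", "status": "PASS"},
                {"name": "second check", "status": "FAIL"},
            ],
        },
        {
            "test": "/dom/example.any.worker.html",
            "status": "OK",
            "subtests": [{"name": "first check", "status": "PASS"}],
        },
    ]
}


def subtest_counts(report):
    """Map each test ID in the report to the number of subtests it recorded."""
    return {
        result["test"]: len(result.get("subtests", []))
        for result in report["results"]
    }


counts = subtest_counts(report)
print(len(counts), "tests,", sum(counts.values()), "subtests")
```

The set of keys in counts is what the manifest lets you predict ahead of time; the values are only known after the run.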

On a technical level, the results-collection process guards against incomplete results by rejecting WPT reports (as generated via the command wpt run --log-wptreport) that omit any of the expected tests. The WPT CLI emits a list of the expected tests in its "raw" output (as enabled via the --log-raw CLI argument). This is all done in a Python script named run-and-verify.py. The most salient parts are:

I believe that the folks at Microsoft use their own runner, so this code may only be useful as a general guide.
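The completeness check described above (comparing the tests listed in the raw log against the tests present in the wptreport) can be sketched roughly as follows. This is not the code from run-and-verify.py. It assumes that the raw log is line-delimited JSON whose suite_start entries carry a "tests" field (either a list of test IDs or a mapping from group name to such a list), and that the report has a top-level "results" list with a "test" key per entry. The function names are hypothetical.

```python
import json


def expected_tests_from_raw_log(lines):
    """Collect test IDs announced by suite_start entries in a raw log.

    Assumption: each line is one JSON object, and suite_start entries
    list the expected tests under "tests" (list, or dict of group -> list).
    """
    expected = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("action") != "suite_start":
            continue
        tests = entry.get("tests", [])
        if isinstance(tests, dict):
            for group_tests in tests.values():
                expected.update(group_tests)
        else:
            expected.update(tests)
    return expected


def reported_tests(report):
    """Return the set of test IDs present in a wptreport-style dict."""
    return {result["test"] for result in report.get("results", [])}


def missing_tests(raw_log_lines, report):
    """Return the sorted list of expected tests absent from the report."""
    return sorted(expected_tests_from_raw_log(raw_log_lines) - reported_tests(report))
```

A caller would reject the report whenever missing_tests returns a non-empty list. Subtests are deliberately ignored, for the reasons given earlier.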

When it comes to wpt.fyi enforcing this expectation prior to publication, it would be ideal if this could be performed at upload time. A failed HTTP request is the most direct way to inform the uploader of a problem. That would require a fair amount of processing power, though--both for generating the list of expected tests and for parsing the uploaded results to make the comparison.
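As a rough illustration of what rejecting at upload time could look like, the sketch below maps an uploaded report body to an HTTP status and message. It says nothing about how wpt.fyi actually handles uploads: the function name, the choice of status codes, and the idea that the expected test list is already available as a set are all assumptions made for the example.

```python
import json


def check_upload(body, expected_tests):
    """Return (http_status, message) for an uploaded report body.

    Hypothetical helper: `expected_tests` is assumed to be a precomputed
    collection of test IDs (for example, derived from the WPT manifest).
    """
    try:
        report = json.loads(body)
    except ValueError:
        # Covers malformed JSON as well as undecodable bytes.
        return 400, "report is not valid JSON"

    results = report.get("results") if isinstance(report, dict) else None
    if not isinstance(results, list):
        return 400, "report has no 'results' list"

    reported = {r.get("test") for r in results if isinstance(r, dict)}
    missing = sorted(set(expected_tests) - reported)
    if missing:
        return 422, f"{len(missing)} expected test(s) missing, e.g. {missing[0]}"
    return 200, "ok"
```

The cost mentioned above is visible here: the whole body must be parsed and the expected set must be built before the response can be sent, so a real service might prefer to do this work asynchronously.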

Notifying the uploader asynchronously would be better than not notifying them at all. In gh-378, I suggested how introducing uploader contact information could benefit visitors to wpt.fyi; this is a case where that information would more directly benefit the uploaders themselves.

Hexcles · 2020-01-31T16:01:58Z

We now have a bunch of basic verifications in place in the processor. More heuristics could be added as follow-up work.

lukebjerring assigned Hexcles Sep 11, 2018

jugglinmike mentioned this issue Sep 12, 2018

Feature: "Report issue with data" #378

Open

Hexcles closed this as completed Jan 31, 2020
