Hi folks,
I've just added our provisional ramp-up schedule for continuing English Wikipedia deployment to the Article feedback page.
If you weren't able to make it to the IRC office hours last week, I encourage you to review the notes at m:IRC_office_hours/Office_hours_2011-06-16.
The biggest news item is that we have provisional data dumps for all AFT data available here. Please download them, play with them, and report your findings. (We'll add more documentation about the data format soon.)
We're also looking into working with the toolserver folks to ensure that anonymized tables are replicated there. This may still take us a while to fully sort out, though.
We'd love to see more experiments that correlate AFT data with other forms of article assessment, that compute and tabulate data e.g. for all articles within the scope of a WikiProject, and so forth. We hope that the first data dumps will inspire some of you to play with this data.
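To give a rough idea of the kind of tabulation meant here, below is a minimal Python sketch that averages ratings per article and rating category, then narrows the result to a set of article titles (for instance, the articles within the scope of a WikiProject). Since the dump format isn't documented yet, everything about the input is an assumption: the CSV layout, the column names `page_title`, `rating_category` and `rating_value`, and the sample rows are all made up for illustration and will need adjusting to match the real dumps.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def mean_ratings(rows):
    """Return {(page_title, rating_category): mean rating}.

    `rows` is any iterable of dicts. The keys used here are hypothetical
    column names, not the documented dump format.
    """
    buckets = defaultdict(list)
    for row in rows:
        key = (row["page_title"], row["rating_category"])
        buckets[key].append(int(row["rating_value"]))
    return {key: mean(values) for key, values in buckets.items()}


def tabulate_for_project(means, project_titles):
    """Keep only entries whose article title is in `project_titles`."""
    return {key: value for key, value in means.items() if key[0] in project_titles}


# Invented sample data standing in for a real dump file.
SAMPLE = """page_title,rating_category,rating_value
Example A,trustworthy,4
Example A,trustworthy,2
Example A,objective,5
Example B,trustworthy,1
"""

means = mean_ratings(csv.DictReader(io.StringIO(SAMPLE)))
project = tabulate_for_project(means, {"Example A"})

for (title, category), value in sorted(project.items()):
    print(f"{title}\t{category}\t{value:.2f}")
```

For a real dump you would swap the `io.StringIO(SAMPLE)` reader for an open file. The per-article means could then be joined against another assessment source (e.g. WikiProject quality classes) to look for correlations.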
One of the remaining feature changes that we're working on right now is to add explanatory tooltips for each rating (e.g. "Poor", "Excellent"), which may help with rater baselining.
As a reminder, at this point AFT is still an experiment. We do invite you to analyze the data. It may very well be that AFT in its current form doesn't produce useful data for certain types of articles, that certain rating categories aren't helpful, etc. The analysis we've done to date indicates that AFT data could be useful to 1) detect problem patterns that slip through change patrol, 2) find articles that are candidates for promotion or cleanup, and 3) understand change-over-time patterns. We're continuing to dig into the data, including participation trends on AFT articles vs. non-AFT articles.
The bigger vision here is that this is the first time that we're engaging our readers beyond the edit button. That's a big deal, and we're hoping to build on this first implementation over time. Check out the ideas in Article feedback/Extended review, the early data from our post-rating calls-to-action, etc. There's a lot of potential here to make a transformative, positive difference in engaging our audience and recruiting new editors.
I also want to emphasize that we've taken all requests and concerns expressed here and elsewhere very seriously, and will continue to do so. Modifications that we've made in response to feedback include the per-user hide-ability of the feature, various bugfixes, the ability to blacklist the tool via a blacklist category (e.g. for disambiguation pages), and more.
One of the biggest remaining pieces of feedback that we've not been able to act on yet is that the tool in its current form is too prominent, especially on small pages. I do agree with this in principle, but I don't think it would be wise to haphazardly modify the design. IMO a new design needs to be driven by 1) effectiveness vs. the old design, 2) usability, and 3) any functional modifications that we decide we can make. For example, we might end up removing some rating categories entirely if they end up surfacing largely redundant information (e.g. "trustworthy" vs. "objective"). But that'll still take us a while to figure out.