Code Health Group/projects/CodeReview/meeting notes/20190709

2019-07-09 - Kickoff meeting

Attendees: Jean-Rene Branaa, Lars Wirzenius, Andre Klapper, Gergő Tisza, Holger Knust, James Forrester, Kosta Harlan, Evan Prodromou, Niklas Laxström, Mukunda Modell, Will Doran

Introductions

In the event that we don't already know each other

Logistics

Meeting frequency, schedule, and duration
- Proposal: Once a week for a month, then re-evaluate, at 07:00 SF / 14:00 UTC on Tuesdays.

communication channels within the code review workgroup
- Email:[[1]]
- IRC:#wikimedia-codehealth
- Talk:Code Health Group/projects/CodeReview

Scope

Problem Statement(s)

What do we perceive the problem(s) to be?

Andre: [mostly taken from https://www.mediawiki.org/wiki/User:AKlapper_(WMF)/Code_Review ]

Weak review culture
Lack of enough skillful, available, confident reviewers and mergers
Hard to identify good reviewer candidates

Plus:

1.2 No culture to improve changesets by other contributors

2.1 Unstructured review approach

2.2 Unhelpful reviewer comments

2.3 Poor quality of contributors' patches

3.2 Under-resourced or unclear responsibilities

3.3 Workload of existing reviewers

4.2 Hard to realize a repository is unmaintained

4.3 Hard to find related patches

4.4 Lack of synchronization between teams

Mukunda: releng did a survey which collected a lot of comments about this subject already https://www.mediawiki.org/wiki/Developer_Satisfaction

Kosta:

Technical: Post screenshots in Gerrit, but that's less critical
Main issue is time allocation, recognizing and supporting
Every team has its own triage process

Gergő:

Missing organizational buy-in / valueing
Tooling: No CI support for new devs (apart from previously they got basic PHP linting, but that was killed two months ago)
Lack of culture of improving code review / how to [not] give CR feedback

JamesF:

Some friction by design, e.g. "harder" to get patches in because we've got a system that allows chains of patch dependencies (e.g. across repos)

Mukunda:

History: Deployed code (production) / branches far away from master

Lars:

Protecting production shall be better with new CI system

Evan:

Relying more on CI, Integration testing, unit testing should make CR process more pleasant and maybe better

Lars:

Scientific papers: Code review works (but only up to 200 lines of a diff); automated testing might work but less proven by science
Paper blogged by Greg Wilson, but link missing. Blog is http://neverworkintheory.org/
Let humans focus on what humans are better at doing (as it relates to reviews)

Will:

Hygiene important, what the work of the code review should be (purpose), knowledge sharing in our team, pairing within team, onboarding to areas of code, some anxiety

Andre:

There is a page on mediawiki "getting code reviews" https://www.mediawiki.org/wiki/Gerrit/Code_review/Getting_reviews
We should be "rude" and do abandon patches with a lot of changed lines
No guidelines across teams how do code review (cf. https://www.mediawiki.org/wiki/User:AKlapper_(WMF)/Code_Review#Unstructured_review_approach )
Monthly new patches email... heard that some people feel uncomfortable placing -1 with their comments

Lars:

Should go: does this change make sense, does this architecture make sense, does this code look ok. Actually goes: here is 5000 line patch.

JamesF:

Gerrit allows to say +1 without merging, appreciated. Using -1 for experienced patch authors; sometimes fixing a typo and merging for new patch authors. People should not be afraid of giving CR and giving it with appropriate language tailored to the recipient, it's their job

Lars:

Some don't do as much code review. Some don't know how to do it well. How about having monthly training session.

JamesF:

In 2011–2 there was a policy that staff developers did one day of volunteer code review each week (and 4 days regular work): https://www.mediawiki.org/wiki/Wikimedia_engineering_20%25_policy . It failed.
Code review often does not happen because there isn't anybody comfortable to review that area of code. Code reviewers avoid certain areas of code in afraid of breaking things.

Mukunda:

I might see a simple patch to review, but I am not the person who should make the decision whether it is a good idea. I don't know who would be that person.

Holger:

Code review vs. runtime behavior. Any checklist for code review and quality target for certain area of the code?

James:

Not codified in process documents
Currently a conflict(?) over policy that new features should not be merged into deployed code without an owner. Volunteers surprised, where is that written?

<chat>Suggestion to put documentation/help/how to get review links inline in gerrit</chat>

Gergő:

We are a missing "intent to merge" feature. Currently you get notified when patch is created, and then when it is merged already.

Evan:

Can we set up a metric to track and use that to test our interventions on the process? See https://www.cast-safety.org/apex/f?p=102:1 for example of successful approach.

======= Not discussed in 2019-07-09 meeting as we ran out of time: =======

Is it measurable?

Median patch queue time
Oldest unreviewed patch waiting for review
Volunteer developer retention (does the contributor make more patch requests)
Bugs in production (as a braking metric)
Sentiment analysis on comments (friendly, helpful responses)
Developer happiness surveys?
- Good, BUT: these take a long time, they're infrequent, and self-reported numbers are less reliable than behaviour (making contributions)

Patch cycle times, split out by contributor type?

What we want to accomplish as a group

What do we think we can do to solve the above mentioned problem(s)?

Other Info:

Code Review Workgroup Wiki page: https://www.mediawiki.org/wiki/Code_Health_Group/projects/CodeReview
Code Review Workgroup Workboard: phab:project/board/3766/query/all/
Historical: Code Review Session @ Wikimedia Hackathon 2019: phab:T223607
Historical: 2017 Developer Summit session: phab:T149639
Historical: 2016 Developer Summit session: phab:T114419
Historical: Code Review/Office Hours: Proposed/started in phab:T128371 (2016), abandoned in phab:T177974 (2017)
Historical: Gerrit cleanup day (2015): phab:T88531; Lessons learned
More history: https://www.mediawiki.org/wiki/User:AKlapper_(WMF)/Code_Review
Developer Satisfaction Survey 2018 results: https://www.mediawiki.org/wiki/Developer_Satisfaction
A potential Survey re: Code Reviews: phab:T211741
Fatal Airplane Crashes Drop 65% https://www.nytimes.com/2007/10/01/business/01safety.html