Jump to content

Manual talk:Coding conventions/Flow

About this board

For discussion about PHP coding conventions prior to August 2020, see Manual talk:Coding conventions/PHP/Archive.

Novem Linguae (talkcontribs)

So the limit of 100 characters per line seems like it has some pros and cons. Pros are that folks who don't have word wrapped turned on get a better experience. Cons is developers have to spend time manually doing and fixing comment line breaks. It also seems to me that all modern code editors have word wrap. So I guess my question is, are there still folks that don't have word wrap turned on? Is this a Linux thing? Just curious. Thanks!

DKinzler (WMF) (talkcontribs)

In my experience, anything that uses meaningful indendation becames hard to read when using soft-wrap. Modern code editors have gotten much better at this, but it's still not great. It works fine for a long single line comment. But it really falls apart for a complex doc block.

Here's an example of PhpStorm soft-wrapping the documentation of the LocalFileRepo configuration to 80 characters: https://phabricator.wikimedia.org/F57770695

I much prefer the line onds on the righthand side to be cut off when casually browsing.

If we were to rely on soft wrapping entirely, I suppose a big doc block would "physically" be a single line? That would also make diffs much harder to read, since they are line base.

Nikerabbit (talkcontribs)
Reply to "max-len"

[JS] Docblocks: line break between description and tags

2
Novem Linguae (talkcontribs)

I found another area of divergence between phpcs and eslint. Something like...

	/**
	 * Get the IDs of applicable PageTriage namespaces.
	 * @param Config|null $config
	 * @return int[]
	 */

...would be fine in phpcs, but would give a warning for not having a line break before @param and be autofixed in eslint (by the rule jsdoc/tag-lines). Should we make it so these two rules are in alignment? I have no opinion about which style we pick, but I do think it'd be a good idea to keep PHP and JS rules consistent with each other.

Jdforrester (WMF) (talkcontribs)

I believe that we aren't able to enforce this convention in PHPCS which is why it's not yet been done.

Reply to "[JS] Docblocks: line break between description and tags"

[PHP] Spaces in array indexes

13
DKinzler (WMF) (talkcontribs)

The convention currently states: Do not put spaces in brackets when accessing array elements. However, this is awkward if the index is not a literal, especially if it is an expression that contains spaces. For example:

  $a[0]
  $a['foo']
  $a[$i]
  $a[$i + 1]
  $a['x' . $name]
  $a[$this->get( $something, foo)]

We seem to already deviate from the convention in such cases, I found more than 100 places in core that use spaces in array indexes. I propose to change the convention as follows: Do not put spaces in brackets when accessing array elements if the index is a literal or variable. Do put spaces after the opening bracket and before the closing bracket if the index is an expression that by convention should contain spaces anywhere. For example:

  $a[0]
  $a['foo']
  $a[$i]
  $a[ $i + 1 ]
  $a[ 'x' . $name ]
  $a[ $this->get( $something, foo) ]

I think that is much more readable and intuitive. What do you think?

I'd personally also be fine to always add spaces in array indexes, if that is preferable for consistency.

Jdforrester (WMF) (talkcontribs)

(I've noted in the title that you're talking about Manual:Coding conventions/PHP rather than Manual:Coding conventions because the talk pages were all merged together for visibility.)

I think that we should do this, and for a little bit further and just use spaces always in square brackets; this would align our PHP convention with our JS one, which would make the rule very simple to remember.

The old excuse that this would be a lot of work is not a real problem, as we now have both auto-fixers in the linters and an auto-upgrader bot that will fix every repo for us.

Tgr (WMF) (talkcontribs)

I think the concern with changes like that was less that it's a lot of work and more that it touches a lot of lines (and so is a pain in the ass when doing git blame).


Apparently GitHub supports a default ignore-revs file these days; not sure about other tools.

Jdforrester (WMF) (talkcontribs)
Od1n (talkcontribs)

Agree about using spaces in declarations and accesses, i.e. [ 'foo', 'bar' ], $a[ 0 ], and []. Do not use different rules for declaration vs. access (or destructuring, etc.), nor for index being simple (literal or variable) or an expression.

Novem Linguae (talkcontribs)
Tgr (WMF) (talkcontribs)

That would mean mass changes to the codebase (arrays are pretty common). We generally try to avoid those for minor style issues because they cause a lot of merge conflicts, and make git blame harder to use.

Jdforrester (WMF) (talkcontribs)

We can add any particularly-noisy commits to a .git-blame-ignore-revs file if needed?

Tgr (WMF) (talkcontribs)
Jdforrester (WMF) (talkcontribs)

Hmm, yeah, that's irritating. Thanks for digging.

Od1n (talkcontribs)

When doing a "blaming search", all tools I usually use (SmartGit, GitHub website, Gitiles) let go to the preceding change, thus allowing to pursue the search. Ok, one annoying step is still added, but it doesn't prevent at all to complete the search. For perspective, file moves are much, much more annoying!

So, for this case, and also more generally speaking, I don't think we should avoid doing changes because of Git-related considerations.

Tgr (WMF) (talkcontribs)
Jdforrester (WMF) (talkcontribs)

@Tgr (WMF): That's great, thank you!

Back to the wider topic, now we have a mechanism to ignore noisy commits, do we want to proceed with this?

Reply to "[PHP] Spaces in array indexes"

Trailing commas in PHP arrays

7
Tgr (WMF) (talkcontribs)

Per T222042, it's possible now to enforce trailing commas in the last lines of multiline PHP arrays (with phpcbf including an automatic fix). I propose we enable it by default (add it to the MediaWiki style):

  • it makes code style less ambiguous (which is generally a good thing, because there is less room for reviewer subjectivity, and less room for different repos ending up with different style checks);
  • it simplifies diffs (only the line you added shows up as a change and not the previous one) and reduces the chance of merge conflicts;
  • it makes adding and removing lines at the end of arrays more convenient and less error-prone.
Krinkle (talkcontribs)

LGTM. I support enablement (provided it is auto-fixed) of requiring trailing commas in multi-line arrays.

I find it valuable to remove a choice that shouldn't carry any meaning in the first place (namely, the presence or absence of a trailing comma). Specifically, the choice to prefer presence (instead of absence) of trailing commas I believe improves diffs, as well as eases refactoring when re-ordering or copying code from one place to another.

With regard to Thiemo's point, about communicating the significance of a final item through absence of a trailing comma: I do not consider the presence or absence of a comma to be effective communication between developers. I think this is too subtle. When code has a limited-length return value, it should be described in its doc block, and/or covered by strictly asserting unit test, and/or typed with value shape to Phan, or if otherwise important through an inline comment. See also: omitting the (technically optional) semi-colon after the last property of a CSS rule, or after the last statement in a JavaScript function.

We can either: make the trailing symbol redundantly reflect a pre-existing semantic and train ourselves to remember to type that correctly — or, — we can make it a consistent part of mental boilerplate and remove the part of our brain that will draw attention to this detail every time we read a line of code to determine whether and what it could or should mean.

Thiemo Kreuz (WMDE) (talkcontribs)

Seeing my argument being dissected like this is … weird. Just because the absence of a comma isn't communicating much doesn't imply the opposite – that we must enforce it everywhere. I could start listing arguments again – like that I seriously don't believe such a rule would have any impact on the mental effort required to read diffs. But my point is: I also find the extra comma helpful and almost always add it. I just don't think it's worth bothering anybody with a failing CI job. 😢️

This post was hidden by Thiemo Kreuz (WMDE) (history)
MusikAnimal (talkcontribs)

+1! I have long wanted this rule to be present in our styles, and in fact (at least for JavaScript, think?) the opposite is enforced, which I find frustrating. I suggest we enforce comma-dangle for JS as well, for the same reasons we're doing it for PHP. On that note, there seems to be a lot of stylistic crossover between the two languages and I wish we made more effort to keep the code styles consistent.

Tim Starling (WMF) (talkcontribs)

Nitpicking wastes time whether it is automated or manual. I don't support making this change in core or any other repo. I would support removing the sniff from Wikibase.

Tgr (talkcontribs)

FWIW trailing commas are now mandated by PER-CS (the PSR group's attempt at an official PHP code style).

Reply to "Trailing commas in PHP arrays"

Coding convention for Lua

1
Od1n (talkcontribs)
Reply to "Coding convention for Lua"

Enforce quote styles in PHP

3
Tacsipacsi (talkcontribs)
Arlolra (talkcontribs)
Krinkle (talkcontribs)

I don't think this should be enforced in code review, PHPCS, or otherwise in CI. We have a general guideline to follow, but I don't see it helping us to require code to change back and forth between two formats on a regular basis just due to the addition or removal of a character. Local consistently and productivity are I think more important factors.

When all else is equal, e.g. writing new code, or most strings in a method or file are equally simple, then our convention says to default to single quotes. Beyond that I don't think it's worth worrying about. In my opinion, string quoting isn't the sort of thing where cognitive overhead is induced by having variants or where it adds friction during review/auditing/debugging. You are goint to encounter both variants no matter what.

In my view, the guideline mainly exists as a tie-breaker when there's uncertaintity so that nobody has to discuss which one to use if they're not sure. But when modifying existing code, or once code has been put up for review, it's imho not worth changing or discussing further.

Reply to "Enforce quote styles in PHP"
Nemo bis (talkcontribs)

The guideline says you must add to release notes when you fix a bug, but in practice this is almost never respected, especially because with git it makes merge conflicts almost certain. Automatic updates of release notes or other variants (like herds of slaves doing it in the devs' stead) are currently just dreams.

To the few who care and watch this page: remember that release notes become even more important if you remove the bug numbers from the first line of commit messages.

Krinkle (talkcontribs)

See also Thread "Git release notes merge conflicts" on wikitech-l (April 2012). Several ideas for a format and automated tool have been proposed there.

However the discomfort and some people neglecting to include release notes does not justify anything. Until and unless we have a different tool, one is expected to include release notes. You'll just have to deal with it, whether you like it or not. If you don't then the change doesn't get merged (it is trivial to amend a patch and add release notes).

You claim of it being "almost never respected" is simply untrue, there are only a small number of changes from relatively few people that don't provide release notes when they should. Most people do when they should.

By the way, I don't see how Talk:Gerrit/Commit message guidelines#Bug number in commit messages is relevant? If anything, moving them from the subject to the body keep the raw git-log easier to scan. More on that over on that talk page.

Nemo bis (talkcontribs)

I agree that release notes are important, but your "small number of changes" is just wishful thinking.

As of now there are over 350 bugs marked fixed since 1.20 was branched, and only 80 mentioned in RELEASE-NOTES-1.21 (which seems to have about 150 bullets in total)... of course it's just approximations but it seems clear that only a minority of bug fixes follow the convention.

Nemo bis (talkcontribs)
Bawolff (talkcontribs)

Historically we did not (and presumably still don't) include release notes for a bug fix that was introduced in the same release as it was fixed. Some of these bugs may be that sort of situation.

Nemo bis (talkcontribs)

This should be noted in the conventions, anyway it would leave us with at least 190 bugs fixed as of now (those filed before the branching) plus 27 enhancements (requested after the branching), and 188 reports that may need or not be mentioned.

Reply to "Release notes and bug fixes"

Unicode regular expressions

1
Amire80 (talkcontribs)

There's this patch: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Translate/+/902018/2/src/PageTranslation/TranslatablePageInsertablesSuggester.php

It did what's needed and fixed the bug in question, but I was wondering: Do we usually need the /u flag in regular expressions? If we do, then should it be documented on this page?

And can the automatic tests identify when it's not used?

Reply to "Unicode regular expressions"
Nikerabbit (talkcontribs)
JoKalliauer (talkcontribs)

@Nikerabbit: I sorted the three optimizers when they are invented, and from my personal experience I can recommend scour and svgcleaner more than SVGO.

You are right that I know much about SVG and Commons but to be honest hardly anything about MediaWiki-developing.

Reply to "Audience"

Consistency of nullable types in PHP documentation and type hints?

3
Jdforrester (WMF) (talkcontribs)

In our codebases we have a mix of old-style (Class|null) and new-style (?Class; "new" in PHP 7.1) annotations in our Doxygen blocks and so on; how would people feel about us standardising on one of these rather than having both?

In T250734 there has been some discussion about how to handle the use of nullable types in the more complicated scenario of multiple types (i.e. Class|SecondClass|null etc.), but I don't think we've had an up-front conversation at all about use of the nullable operator yet, and certainly our code is a messy mish-mash of multiple types. (Indeed, I've even seen ?Class $foo = null used.)

Given the lack of a way to handle these for multiple types, I think we should standardise on old-style annotations everywhere to create consistency for now, and write a rule (ideally with an auto-fixer) for the codesniffer tool.

Thoughts?

Duesentrieb (talkcontribs)

Only the new style types work for type hints. Using ?Foo in the type hint and Foo|null in the doc block seems annoying and inconsistent. I'd opt for using the new style everywhere.

By the way: ?Foo $foo = null is different from Foo $foo = null and from ?Foo $foo: The first means "it's optional, but can be null explicitly", the second means "it'S optional, but shouldn't be null if given explicitly", and the third one means "it's required to be provided, but can be null".

Tgr (WMF) (talkcontribs)

+1 for using old-style for consistency. PSR-5 (a draft, and a quite old one at that, but the closest thing to a standard the PHP world has today) has no ?; combining it with unions is illegal in PHP 8, which we'll eventually adopt, at which point it would be very confusing to advertise a mistake in our type docs; and having to switch between ? and |null as types get added / removed would be annoying.

By the way: ?Foo $foo = null is different from Foo $foo = null and from ?Foo $foo: The first means "it's optional, but can be null explicitly", the second means "it'S optional, but shouldn't be null if given explicitly", and the third one means "it's required to be provided, but can be null".

That sounds logical but isn't how PHP actually works. The first two are identical, as a null default implicitly makes the parameter type nullable. The semantics also depend on the context: f(Foo $foo = null, Bar $bar) and f(?Foo $foo, Bar $bar) are identical (though only for B/C with pre-? legacy code).

Reply to "Consistency of nullable types in PHP documentation and type hints?"