API:Client code/Evaluations/mwclient
This page is obsolete. It is being retained for archival purposes. It may document extensions or features that are obsolete and/or no longer supported. Do not rely on the information here being up-to-date.
The mwclient library facilitates interaction with the MediaWiki API. It can make specific API calls directly, or its many built-in methods can be used for common tasks such as retrieving and editing page text, managing user permissions, and more. It can also support MediaWiki installations that lack the write API.
Particularly useful or notable features of mwclient include:
- active development and a community of developers/maintainers
- offers abstraction and a clear and easy way to do tasks (not just a wrapper)
- screenscraping fallback
- full-featured for a basic client library
- decent docs and some tests
In-depth evaluation
Easy to install
- Installation instructions are correct and easy to find
- Library is packaged for installation through appropriate package library (PyPI, CPAN, npm, Maven, rubygems, etc.)
Version 0.6.5 (released in 2011) is available through PyPI. Release notes: https://github.com/mwclient/mwclient/blob/master/RELEASE-NOTES.md. Version 0.7 is in progress.
- Platinum standard: library is packaged for and made available through Linux distributions
Easy to understand
- Well designed--makes all intended API calls available with the intended level of abstraction with no redundancies
Makes many MediaWiki API calls available through built-in methods, and also supports lower-level API requests.
- Platinum standard: makes the Wikidata API available
It may be possible with the available methods, but using the requests library with the Wikidata web API endpoint appears to be an easier way to access the Wikidata API.
- Well documented
- Code is commented and readable
- The development version is mostly PEP8-compliant, and has some comments. However, more consistent comments and more/better docstrings would make it easier to understand how the code works.
- Documentation is comprehensive, accurate, and easy to find
- Documentation is easy to find (README.rst, REFERENCE.md, and the mwclient GitHub wiki) and appears accurate. The documentation is not complete; the section on Generators, in particular, would be very helpful for a less experienced Python developer who is trying to understand the underlying structure of the library.
- Deprecated functions are clearly marked as such
- n/a
- Platinum standard: Documentation is understandable by a novice programmer
- The README and the documentation on the wiki are clearly written and do not assume much background, and there are a handful of tutorials that a new user can expand on.
- Code uses idioms appropriate to the language the library is written in
Library is distinctly Pythonic: for x in y is used to iterate, *args and **kwargs are used for variable numbers of arguments, functions like .iteritems() are used to deal with dictionaries, and Python's generator-type functions are used.
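On the Wikidata point earlier in this section, the following is a minimal sketch of what calling the Wikidata web API directly could look like. It is not part of mwclient. The helper names (wikidata_entity_params, fetch_labels) are hypothetical and chosen for illustration; the endpoint and the wbgetentities action belong to the public Wikidata API. The network call uses the third-party requests library and is defined but not executed here.

```python
from urllib.parse import urlencode

# Public Wikidata API endpoint.
WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def wikidata_entity_params(entity_ids, languages=("en",)):
    """Build query parameters for a wbgetentities request (hypothetical helper)."""
    return {
        "action": "wbgetentities",
        "ids": "|".join(entity_ids),        # several entities in one call
        "props": "labels",
        "languages": "|".join(languages),
        "format": "json",
    }


def fetch_labels(entity_ids, user_agent):
    """Send the request with the requests library (hypothetical helper, not run here)."""
    import requests  # imported lazily so the helper above works without it

    response = requests.get(
        WIKIDATA_API,
        params=wikidata_entity_params(entity_ids),
        headers={"User-Agent": user_agent},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


# Build the parameters and the full URL without touching the network.
params = wikidata_entity_params(["Q42", "Q64"])
url = WIKIDATA_API + "?" + urlencode(params)
```

Keeping parameter construction separate from the network call makes the request easy to inspect or test before anything is sent.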
Easy to use
- Has functioning, simple, and well-written code samples for common tasks
- Demonstrates queries
- Demonstrates edits
- Handles API complications or idiosyncrasies so the user doesn't have to
- Login/logout
- Cookies
- Tokens
- Query continuations
- Requests via https, including certificate validation
- As of version 0.6.5, mwclient uses urllib to make the API calls, which does not validate SSL certificates and is vulnerable to an interception attack. Using a library like requests would make the library more secure and also make the code more readable. However, version 0.7 is being finished and will use requests for better security and compatibility with the latest version of MediaWiki.
- Courteous API usage is promoted through code samples and smart defaults
- gzip compression is used by default
- Examples show how to create and use a meaningful and unique user-agent header (as in https://meta.wikimedia.org/wiki/User-agent_policy)
- Platinum standard: generates a unique user-agent string given name/email address/repository location
- Efficient usage of API calls
- The generator functions iterate over lists, and therefore make multiple API calls for a given list of pages instead of making a single call to retrieve information on multiple titles.
- Can be used with the most recent stable version of the language it is written in (e.g. Python 3 compatible)
Python 2 (2.4+) only.
- Comment on method names
It is confusing that the user calls page.edit() to retrieve the text of a page, and page.save(...) to edit the text of a page. Consider deprecating/renaming these functions for clarity. "Generator"-type classes could clarify whether they refer to/use Python generators or the MediaWiki API's generator module.
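One way the renaming suggested above could be carried out without breaking existing callers is to keep the old name as a thin alias that emits a DeprecationWarning. The sketch below uses a hypothetical stand-in Page class, not mwclient's own, and the replacement name text() is an assumption made for illustration.

```python
import warnings


class Page:
    """Hypothetical stand-in for a wiki page object; not mwclient's class."""

    def __init__(self, text=""):
        self._text = text

    def text(self):
        """Clearly named accessor (assumed name): return the current page text."""
        return self._text

    def edit(self):
        """Old name kept as a deprecated alias that forwards to text()."""
        warnings.warn(
            "Page.edit() is deprecated; use Page.text() instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return self.text()
```

Existing code that calls edit() keeps working and gets a warning that names the replacement, which leaves room to remove the alias in a later release.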
Easy to debug
- Contains unit tests for the longest and most frequently modified functions in the library
There is a unit test for Site.site_init but no others. There are integration tests in basic_edit_test.py and api_upload_test.py. These were last updated in June 2013.
- Platinum standard: Unit tests for many code paths exist and are maintained
- Terrible hacks/instances of extreme cleverness are clearly marked as such in comments
Some things are marked as "BAD." or "hack" but there is no commentary on why this is the case.
- Documentation links to the relevant section/subpage of the API documentation
The documentation links to the API documentation, but not to the relevant subpages.
Easy to improve
- Library maintainers are responsive and courteous, and foster a thoughtful and inclusive community of developers and users
Yes. Maintainers and contributors are courteous, responsive, and interested in improving the project.
- Platinum standard: Project sets clear expectations for conduct for spaces where project-related interactions occur (mailing list, IRC, repository, issue tracker). It should:
- State desired attitudes and behaviors
- Provide examples of unwelcome and harassing behavior
- Specify how these expectations will be enforced
- Pull requests are either accepted or rejected with reason within 3 weeks (Platinum standard: 3 business days)
Recent pull requests were merged within 3 days.
- Issues/bugs are responded to in some manner within 3 weeks (Platinum standard: 3 business days) (but not necessarily fixed)
Responsive maintainers and a library in active development.
- The library is updated and a new version is released within 3 weeks (Platinum standard: 3 business days) when breaking changes are made to the API
n/a; breaking changes have not affected mwclient
- Platinum standard: library maintainers contact MediaWiki API maintainers with feedback on the API's design and function
- Library specifies the license it is released under
MIT License.[1]
Suggested TODOs
- Code-related
- Ensure that version 0.7 is packaged for PyPI and any other desired package repositories
- Add and improve docstrings and expand on terse comments (i.e. explain why a line is "BAD.")[2]
- Add unit tests for the most commonly used and frequently updated portions of the library[3]
- It is confusing that the user calls page.edit() to retrieve the text of a page, and page.save(...) to edit the text of a page. Consider deprecating/renaming these functions to improve usability.[4] "Generator"-type classes should clarify whether they refer to/use Python generators or the MediaWiki API's generator module.
- Use the requests library for http requests (this is in progress for v. 0.7)
- Make mwclient compatible with Python 3[5]
- Iterating over a list and calling the API for each item is an inefficient use of API calls. Efficiency in API usage is an important feature of a gold standard library. If you are interested in gold standard status, consider making this more efficient by combining API calls as much as possible (e.g. using generators and combining results: titles=title1|title2|...). One option may be a constructor method that collects Page requests and enables larger, less frequent API calls. It may be possible to take advantage of the database-like structure of the MediaWiki API and help users save bandwidth.[6]
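The batching idea in the last item can be sketched as follows. This is an illustration of the technique only, not mwclient code: batch_titles and query_params are hypothetical helpers, and the default batch size of 50 assumes the MediaWiki limit on titles per query for clients without higher limits.

```python
def batch_titles(titles, batch_size=50):
    """Yield pipe-joined title strings with at most batch_size titles each."""
    titles = list(titles)
    for start in range(0, len(titles), batch_size):
        yield "|".join(titles[start:start + batch_size])


def query_params(titles, batch_size=50):
    """Yield one action=query parameter dict per batch of titles."""
    for joined in batch_titles(titles, batch_size):
        yield {
            "action": "query",
            "prop": "info",
            "titles": joined,  # many titles in a single request
            "format": "json",
        }


# 120 pages need 3 requests instead of 120.
pages = ["Page %d" % i for i in range(120)]
requests_needed = list(query_params(pages))
```

Each parameter dict can then be passed to whatever low-level request function the client exposes, so the number of round trips grows with the number of batches rather than the number of pages.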
- Process-related
- Improve documentation:
- Link to relevant API subpages in method documentation[7]
- Include information on Generators[8]
- Include examples of individual user/contact information in the user-agent (also consider writing a function to make this easier for logged-out users)[9]
- Add a basic description of the workflow this library expects[10]
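For the user-agent item above, a helper along these lines could make it easier to produce a descriptive header. This is a minimal sketch: build_user_agent is a hypothetical function, not something mwclient provides, and the layout of the string loosely follows the example format in the Wikimedia User-Agent policy (tool name and version, contact information in parentheses, then library details).

```python
import platform


def build_user_agent(tool_name, version, contact, library="mwclient"):
    """Return a User-Agent string that identifies the tool and its operator.

    Hypothetical helper. `contact` should be a URL and/or email address.
    """
    if not tool_name or not contact:
        raise ValueError("tool_name and contact are both required")
    return "{}/{} ({}) {} Python/{}".format(
        tool_name, version, contact, library, platform.python_version()
    )


ua = build_user_agent(
    "ExampleBot", "0.1", "https://example.org/bot; bot@example.org"
)
```

Requiring contact information up front means a tool cannot accidentally ship with a generic, unidentifiable header.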
If these issues are addressed, mwclient will meet the gold standard and will be listed as such on API:Client code.