Chemical Markup support for Wikimedia Commons/Internship Report
This page is obsolete. It is being retained for archival purposes. It may document extensions or features that are obsolete and/or no longer supported. Do not rely on the information here being up-to-date. This was a Google Summer of Code/2014 project/proposal.
minimum viable product and goals
communication plan
- daily IRC check-in and weekly hangouts with mentors
- plan to do a lot of community involvement by pinging interested parties as soon as a minimum viable product is available in a test environment
- this is to find issues but also to make a better product by gathering feature requests
lessons learned since 21 April
- issues are often more complex than they look
- while someone is doing code review, it's often good to have a means of real-time communication
- evaluating different tools took a considerable amount of time
- not so surprising, as I am a long-term member:
- there's a lot of setup work, and there are helpful build services as well as deployment options like vagrant and wmf-labs
- the above doesn't always work
setting up a working environment
Operating system | Server | IDE | performance | notes |
---|---|---|---|---|
Vagrant/Ubuntu 12.x as a VBox guest | LAMP | - (you can presumably use vim) or just your preferred editor on the host system | OK, patience required | easy to set up |
openSUSE 12.x as a VBox guest | LAMP | - (using kate) | OK | easy to set up, even if LAMP was not chosen during installation |
Ubuntu 14.04 as a VBox guest | LAMP | - (using gedit) | OK | easy to set up (apt-get, ubuntu software center etc.) |
Windows 8.1 as a VBox guest | (W)AMPPS | PHPStorm | Too slow | easy to set up; PHPStorm has an automated spell checker, spots errors etc., and offers break-point debugging; however, you have to limit HDD bandwidth if you are not on an SSD and want to be able to use your host system (I guess this is related to the Windows installer's post-installation optimization services) |
Windows 7 | EasyPHP (WAMP) | Eclipse (pdt) | OK | easy to set up; break-point debugging works fine; Eclipse has several helpful plug-ins [not all fully tested]; however, image converters/scalers are not on board on Windows; code style templates: php, js; Running PHP from CLI; CLI debugging; Important: Adjust the default char encoding. It should be UTF-8. |
week 1
- 00:48, 20 May 2014 (UTC) cloned paged tiff handler extension to hack it; gerrit:133069
- 22:45, 21 May 2014 (UTC) MIME-type detection does not detect ....
- 23:21, 23 May 2014 (UTC)
- Eclipse + XDebug allowed me to conveniently debug MimeMagic.php
- finfo_file is returning text/plain
- There is no extension-hook for overwriting the MIME detected by the fileinfo module
- Consequently it requires changes to the core
- doGuessMimeType can be augmented (the proper way; specification doc available, however not required for a minimum viable product)
- improveTypeFromExtension currently used
- New challenge: In table image, the field img_major_mime is of type enum, which does not include the non-standard chemical/* as suggested by the American Chemical Society (ACS)
- Bawolff suggested in IRC augmenting the enum by the non-standard chemical. Let's see what "core-DB-people" tell me when I submit a patch to do so.
- SQLite doesn't have something fancy like enum; running the local wiki sets correct values in the DB
- But MSSQL support had some bugs (bugzilla:65757) that I need to work around or fix first
- 10:35, 27 May 2014 (UTC)
- Installed MSSQL Server 2014 (important: the full-text index feature must be installed; otherwise installation of MediaWiki fails)
- MSSQL: Learned about T-SQL and that stored procedures are not a good means for updating, because creating and running them may require privileges other than those the updating user has
- Result: gerrit:135714
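The extension-based fallback described in the 23 May entry (content sniffing reports text/plain, so the type is refined from the file extension) can be sketched roughly as follows. This is a minimal Python illustration of the idea, not MediaWiki's PHP code: the function name only echoes improveTypeFromExtension, and the extension table and the chemical/x-mdl-* type names are assumptions based on common chemical MIME conventions.

```python
# Hypothetical extension-to-MIME table; the chemical/* names are assumed
# here for illustration and are not taken from the extension's source.
CHEMICAL_TYPES_BY_EXTENSION = {
    "mol": "chemical/x-mdl-molfile",
    "rxn": "chemical/x-mdl-rxnfile",
}

# Generic results that content sniffing tends to report for text-based formats.
GENERIC_TYPES = {"text/plain", "application/octet-stream"}


def improve_type_from_extension(detected_mime, filename):
    """Return a more specific MIME type when the detected one is generic."""
    if detected_mime not in GENERIC_TYPES:
        # A specific detection result is trusted as-is.
        return detected_mime
    _, dot, extension = filename.rpartition(".")
    if not dot:
        # No extension at all (e.g. a temporary upload file): nothing to improve.
        return detected_mime
    return CHEMICAL_TYPES_BY_EXTENSION.get(extension.lower(), detected_mime)
```

Note that this approach cannot help for files without an extension, which is why detection by content is still needed in some upload scenarios.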
week 2
starting 2 June 2014
- gerrit:135756 augments major MIME types by "chemical"
- $wgMediaHandlers can be used to hook up the extension. Obviously I have to create a class that inherits from the abstract MediaHandler class. Hence, some functions must be re-implemented.
- What I want to find out is under which conditions and when metadata extraction happens, because it *should* happen *after* the intermediate SVG is created from the Chemical Table Files, so that I can properly build on the SVG scaling logic.
week 3
starting 9 June 2014
- There are a couple of challenges:
- Temporary files as stored by PHP after uploading do not have an extension. In certain scenarios, rendering could fail because of that, since I didn't do MIME type detection by content.
- MIME type detection by content failed for small files: gerrit:138737
- Bugs in the updater: It applies patches even though tables.sql has been updated in the meantime.
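Since uploaded temporary files carry no extension, the type has to be recognised from the content. The following Python sketch shows one possible heuristic for MDL molfiles, assuming the usual CTfile layout (three header lines, a counts line ending in a version tag, and a terminating "M  END" line). The function name is hypothetical and this is not the detection logic from the Gerrit change.

```python
def looks_like_molfile(data):
    """Heuristically decide from the bytes alone whether data is an MDL molfile."""
    # latin-1 never fails to decode, so binary input simply fails the checks below.
    lines = data.decode("latin-1").splitlines()
    # Three header lines, one counts line and "M  END" are the bare minimum.
    if len(lines) < 5:
        return False
    # The fourth line is the counts line; it ends with the format version.
    has_version = lines[3].rstrip().endswith(("V2000", "V3000"))
    # The connection table is closed by an "M  END" line.
    has_end = any(line.rstrip() == "M  END" for line in lines[4:])
    return has_version and has_end
```

A check like this works on the file body only, so it does not depend on the name PHP gives the temporary upload file.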
week 4
starting 16 June 2014
- Building a living example on tools.wmflabs.org just to have a proof-of-concept
- Requesting new project on wikitech:New Project Request/MediaHandler tests for testing in larger-scale environment for the following goals:
- Giving contributors the option to test MolHandler with reasonable response times
- Measuring the server load that indigo-depict causes
- Having detailed logs to spot errors
- Having an environment that is similar to the WMF cluster (i.e. with simulation of dedicated "image scalers")
week 5
starting 23 June 2014
- Submitting extension code produced so far to gerrit
- gerrit:140732: Foundation for the extension
- Did simple performance measurement with Linux tools (that I am not used to) like /usr/bin/time -v and learned about what ulimit does and how.
- Profiling with valgrind might also be interesting.
- gerrit:141241: Hooks intercepting with MimeHandler.php
- Created instance in new project at labs (puppet fails on trusty so no auto-config, yet)
- Security group for web access - note that after instance creation, security groups can't be changed for an instance
- Proxy creation for web access - mol.wmflabs.org <-> instance:80
- Installed extension ConfirmEdit and created some geeky questions to prevent spam bots and stupid spam users creating accounts
- Several other extensions to support content, layout and markup
- Imported and translated JavaScript for taking screenshots and uploading them
- GuidedTour extension for getting started
- Zillions of micro-commits to the labs-deploy repo
- LocalSettings config: Allow everyone to create accounts.
- Imported some content to play with.
- Working on trouble reporter: Gadget that allows picking elements, drawing them on a canvas, getting a PNG from the canvas, dumping the HTML of the element, and including both in an error report. Expected to be completed on Friday. Possibly a new extension could be derived from this work. It's pretty handy to have something that allows users to quickly report issues without having to jump through hoops to re-type everything.
- Not yet completed.
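The simple performance measurement mentioned for this week (/usr/bin/time -v) can be approximated from a script. The sketch below is a Python illustration under assumptions, not the procedure actually used: it runs a command and reads the child resource counters that the operating system keeps. The function name and the returned keys are hypothetical, and the resource module is only available on Unix-like systems.

```python
import resource
import subprocess
import sys


def measure_command(argv):
    """Run argv and report exit code, CPU time and peak memory of child processes."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    completed = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "exit_code": completed.returncode,
        # CPU time is cumulative over all waited-for children, so take the delta.
        "user_seconds": after.ru_utime - before.ru_utime,
        "system_seconds": after.ru_stime - before.ru_stime,
        # Peak resident set size over all children so far (kilobytes on Linux).
        "max_rss": after.ru_maxrss,
    }
```

For example, measure_command([sys.executable, "-c", "sum(range(10**6))"]) returns a dictionary with the four values; a real measurement would pass the renderer's command line instead.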
week 6
starting 30 June 2014
- still working on trouble reporter: UI and information I wanted to gather implemented; it just needs to upload the file and append a pretty-formatted report to a page
- Found something very neat: Port forwarding through SSH. This allows me to access a webserver's administration interface running on a different port without having to expose it to the public behind one of the WMF-Proxies.
- SVGEdit's JavaScript is not state-of-the-art (leaving aside the TODOs and shortcomings like if(e.data.substr(0,4)==='SVGe'){ //because svg-edit is too longish ). If it should be deployable to WMF, it would take a lot of effort to clean it up.
- Created a WMF-Labs instance for static files (svg-edit as of now) for security when it comes to editing SVG files, as cross-domain-scripting through iframes won't work, and for performance (almost cookie-free domain). It runs a Cherokee server, which I found easy to configure.
week 7
starting 7 July 2014
- Demo wiki now has all the features I anticipate people need for evaluation, testing and giving feedback, including a tool for dragging & dropping possible features into categories like "minimum viable product".
- Iterating on the code using feedback provided on Demo, Commons, Commons mailing list post, Multimedia mailing list post. Nemo_bis suggested contacting WikiProjects (WP) and Wikibooks shelves (or whatever they are called).
- Options for meeting the coding conventions evaluated.
week 8
starting 14 July 2014
- Goal for today (14 July 2014): Find default settings for indigo-depict that meet de:Wikipedia:Wie erstelle ich Strukturformeln?/Tutorial Strukturformeln#Maße and en:Wikipedia:Manual of Style/Chemistry/Structure drawing#Suggested molecule editor settings
- Goal for this week: Create code to allow these parameters to be set site-wide and per-image.
- Goal not achieved.
week 9
starting 21 July 2014
week 10
starting 28 July 2014
- Hooks to improve MIME type detection of Chemical table files written. This was necessary because these files are not detected by a default MediaWiki installation. System administrators could install a package to allow the identify command to detect these types correctly, but this involves extra work.
- Made memory limit configurable on a per tool basis.
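A per-tool memory limit of the kind mentioned above can be illustrated with a short Python sketch that caps the address space of an external command before it starts, similar in spirit to ulimit -v. The limit table, its values and the function name are assumptions for illustration only and do not reflect the extension's actual configuration or code. It relies on the Unix-only resource module.

```python
import resource
import subprocess
import sys

# Hypothetical per-tool limits in bytes; the values are made up for this sketch.
MEMORY_LIMITS = {"indigo-depict": 512 * 1024 * 1024}
DEFAULT_LIMIT = 256 * 1024 * 1024


def run_with_memory_limit(tool, argv):
    """Run argv with the address-space limit configured for tool; return the exit code."""
    limit = MEMORY_LIMITS.get(tool, DEFAULT_LIMIT)

    def apply_limit():
        # Runs in the child process just before exec.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = limit
        if hard != resource.RLIM_INFINITY:
            # The soft limit may never exceed the existing hard limit.
            soft = min(limit, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    completed = subprocess.run(
        argv,
        preexec_fn=apply_limit,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode
```

Keeping the limits in a table keyed by tool name means a memory-hungry renderer can get a higher cap than other helpers without touching the calling code.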
week 11
starting 04 August 2014
- Documentation clean up in core.
- MolHandler is now available through MediaWiki vagrant. This allows developers to easily play with MolHandler and to deploy it to WMF labs instances by just ticking a check box.
week 12
starting 11 August 2014
week 13
starting 18 August 2014
week 14
starting 25 August 2014