Jump to content

Talk:Requests for comment/Abstract table definitions

About this board

Mattflaschen (talkcontribs)

Are there any similar existing tools out there? I realize there may be nothing that fits well. However, since this would be a major change regardless, it's worth looking into.

Doctrine has a database abstraction layer that can do schema changes. We may be able to use that even if we don't use other parts of Doctrine.

This post was posted by Mattflaschen, but signed as Superm401.

Dantman (talkcontribs)

Actually even if we don't use Doctrine (I'm not sure how Doctrine's DB coverage compares to our edge cases) that brings up an alternate idea. If someone didn't want to build a DSL or use arrays (which have so much syntax overhead it becomes hard to read the database information) writing the database layout in PHP would be an alternate idea.

Individual PHP based migrations as (classes?) in different files like Ruby's ActiveRecord's migrations which iirc some of the python SQL abstractions do too.

<?php
namespace MediaWiki\DatabaseMigrations;

class DummyMigration extends DatabaseMigration {
  public function up() {
    $foo = $this->table('foo', 'f');
    $foo->addColumn( 'extra', 'bytes' )
      ->after( 'data' );
    $foo->changeColumn( 'ts', 'time', 'timestamp' )
      ->nullable( true );
  }
}
Mattflaschen (talkcontribs)
Saper (talkcontribs)

I didn't know about this thread and I have started a new discussion about trying Doctrine's database abstraction layer on Discourse.

Reply to "Existing tools"
Legoktm (talkcontribs)
Reply to "Insert a table"
Sharihareswara (WMF) (talkcontribs)

This RfC is due to be discussed briefly on April 9th; join us!

Dantman (talkcontribs)

I'm going to be leaving around 1PM for some family plans, I won't have an internet connection around 3PM.

Sharihareswara (WMF) (talkcontribs)

Discussion from yesterday's meeting:

22:17:17 <sumanah> #topic Abstract table definitions
22:17:20 <sumanah> #link https://www.mediawiki.org/wiki/Requests_for_comment/Abstract_table_definitions
22:17:24 <sumanah> #info This should be quick. I'm assuming the comment from Tim, "Seek comment on DSL details." from late July 2013, still holds. I don't think this one needs any discussion, I just wanted to give people a heads-up that it exists, in case anyone wants to collaborate with Daniel Friesen on it.
22:17:40 <MaxSem> okay, my opinion about it was requested and I'm sceptic
22:17:55 <sumanah> "... this is a proposal for an abstract language for defining tables, indexes, alters, etc... that; Can be sanely written and read. Is not raw sql (we'll probably PEG parse it) and can be used to build the database for any database engine we support. And can be used by extensions too to define their database pieces abstractly so the extension will work in other databases."
22:18:14 <csteipp> parent5446: bug 41201
22:18:25 <MaxSem> what's abstract in proposed "table hitcounter {
22:18:25 <MaxSem> @ENGINE(HEAP)
22:18:25 <MaxSem> @MAX_ROWS(25000)
22:18:25 <MaxSem> hc_id uint
22:18:25 <MaxSem> }" ?
22:18:37 <awight> I think it's much cleaner to use a basic SQL syntax and transform that to make use of extensions when possible. But parsing is a pain.
22:18:42 <MaxSem> it's just MySQL-specific DDL in a different wrap
22:18:51 <parent5446> csteipp: Thanks. That's for the the ClientSession part
22:19:14 <brion> so the last time i tried something like this was for StatusNet
22:19:23 <brion> using structured arrays rather than a DSL so no parsing ;)
22:19:35 <awight> Yeah, Drupal uses that approach. It's pretty heinous...
22:19:37 <brion> but i also tried to be clever about auto-applying schema updates, and that was hell
22:19:48 <MaxSem> I guess StatusNet didn't need to support holy Oracle?
22:19:50 <brion> explicit updates are probably better
22:19:53 <saper> we are close to getting structural arrays in DatabaseUpdaters
22:20:03 <brion> no, oracle could take a flying &$^# for all we cared ;)
22:20:11 <MaxSem> ...together with SQL server and a bunch of FLOSS DBs
22:20:13 <sumanah> ok, so I was wrong and this does inspire discussion :-)
22:20:21 <parent5446> Has anybody looked at Doctrine DBAL's schema generator?
22:20:41 <sumanah> #info skepticism.
22:20:41 <MaxSem> NO FRAMEWORKS IN CORE!!1 :P
22:20:45 <brion> nope, got a link handy?
22:20:49 <parent5446> It's not the same as this RFC, but provides some of the same deliverables
22:20:52 <parent5446> http://docs.doctrine-project.org/projects/doctrine-dbal/en/latest/reference/schema-representation.html
22:21:07 <awight> Hehe. we'll probably need a code generation step in core anyway
22:21:21 * brion kind of hews to the idea that mysql *is* a database abstraction layer…. from SQL to myisam/innodb/etc ;)
22:21:44 <sumanah> Since Daniel isn't here, I feel like I should just point him to this log and ask whether he has any responses
22:21:52 <MaxSem> can we just ditch myisam support?
22:22:14 <brion> people keep wanting postgres and sometimes oracle and mssql, so it seems like there’d be interest in having consistent support, but without the abstraction nobody maintains them
22:22:14 <sumanah> #action sumanah to point Daniel Friesen to log
22:22:39 <saper> MaxSem: do we support myisam at all?
22:22:50 <MaxSem> last time i checked
22:23:04 <brion> parent5446: looks mostly code-based, gets pretty verbose?
22:23:06 <sumanah> anyone in the WMF SF office? can you see whether Steven Walling &/or Ori Livneh are around? because we're about to move on to their RfC
22:23:25 <awight> swalling isn't here
22:24:06 <parent5446> Yeah it's mostly code-based
22:24:27 <parent5446> But it's a lot easier to integrate than inventing our own SQL replacement
22:24:33 <brion> my main worry there is verbosity kills maintenance
22:24:37 <brion> true :D
22:24:42 <awight> re: doctrine, isn't there a pure YAML schema syntax?
22:24:44 <sumanah> I'm about to change topic - if people wanna keep talking about the wonderful world of SQL, ORMs, etc., take it to #wikimedia-dev ? :)
22:24:45 <brion> scary DSLs also kill maintenance ;)
22:24:52 <brion> ok i’m done :)
22:25:06 <parent5446> http://doctrine-orm.readthedocs.org/en/latest/reference/yaml-mapping.html
22:25:12 <saper> I'd love we had record locking under control instead... every second SELECT has now FOR UPDATE or so :(
Reply to "April 9th discussion"
Tim Starling (talkcontribs)

What is the reason for the arbitrary changes from SQL to this proposed language, for example, the addition of braces? I could imagine a rationale for using a completely different language, like XML, but I don't really see the rationale for taking SQL and changing the grammar in arbitrary ways that make it more difficult to port the existing patch files.

Dantman (talkcontribs)

The aim was to have a well defined, simple, and readable DSL for this. The braces are just a flavor you could bike-shed over but the general idea is to be able to simply declare say a table with individual lines nested inside of it defining the columns. It's not SQL, it's just declaring things.

I didn't use SQL because that would basically involve re-implementing parsing of SQL in PHP just to map to Database class methods. It would be unnecessarily complex to implement and wouldn't properly cover the important subtleties to our database structure. And it wouldn't be very readable or cover things like the fact that practically every column is NOT NULL making declaring that excessive and unfriendly to people defining new columns and tables.

While declaration can be done with something like XML or PHP arrays that kind of thing has already been attempted. And I found that these just end up extremely verbose. They are extremely hard to read and understand what's going on, even worse than our SQL files.

As for porting existing things. That will never be easy. These table definitions and modifiers aren't completely equivalent to any single sql patch or file we have. Our tables have differences between database engines that go beyond simple SQL syntax differences and simple type differences. We use references/foreign keys in some engines but not in others. And it's impossible to declare a mapping using types from our declarations for any one database engine that will consistently map to the types we use in other database engines. So porting migrations and tables will always require thought and effort no matter what the syntax.

Tim Starling (talkcontribs)

It's defensible, I suppose. I'd like to hear other opinions on the details of the language from people who have dealt with this area of the code, such as Chad Horohoe, OverlordQ and Max Semenik.

MaxSem (talkcontribs)

I indeed attempted this with a PHP array before, together with Chad. We ended up abandoning this effort to avoid delaying the new installer infinitely.

My main concern is that this change should make things easier, not harder. The more elaborate the DSL is the greater is the chance that we will simply move the complexity to other part of the software in the longer run - DB updates might become easier but now we'll have to maintain a huge DSL abstraction layer. Before supporting this RFC, I would like an assesment of how cumbersome will the implementation be.

Reply to "SQL vs DIY"
There are no older topics