Extension:ORES

From mediawiki.org
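MediaWiki extensions manual
ORES
Release status: stable
Description: Integrates data from ORES into the recent changes interface.
Author(s): Kunal Mehta, Amir Sarabadani, Adam Roses Wight
MediaWiki: >= 1.43
Database changes (tables added):
ores_classification
ores_model
License: GNU General Public License 3.0 or later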
Configuration parameters:
  • $wgOresBaseUrl
  • $wgOresLiftWingAddHostHeader
  • $wgOresModelVersions
  • $wgOresAggregatedModels
  • $wgOresWikiId
  • $wgOresLiftWingRevertRiskHostHeader
  • $wgOresExcludeBots
  • $wgOresUseLiftwing
  • $wgOresModels
  • $wgOresFrontendBaseUrl
  • $wgOresCacheVersion
  • $wgOresRevisionsPerBatch
  • $wgOresLiftWingBaseUrl
  • $wgOresEnabledNamespaces
  • $wgOresFiltersThresholds
  • $wgOresUiEnabled
  • $wgOresModelClasses
Quarterly downloads: 12 (ranked 124th)
Translate the ORES extension on translatewiki.net
Vagrant role: ores
Issues: Open tasks · Report a bug

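The ORES extension integrates data from the ORES service into the recent changes interface.

Currently, the ORES backend service is configured only for use by Wikimedia wikis. Setting it up for a third-party MediaWiki installation requires a substantial amount of work.

ORES is currently deployed on a small number of Wikimedia Foundation sites, but it is no longer being deployed to new ones. For newer work on machine learning in Wikimedia, see Machine Learning/Modernization.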

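Screenshots

Edits that "need review" and may be damaging are highlighted on the recent changes special page.
A "needs review" flag is added to the legend of the recent changes special page.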

Installation

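  • Download the files and move the extracted ORES folder into your extensions/ directory.
    Developers and code contributors should instead install the extension from Git, using:
    cd extensions/
    git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/ORES
  • Add the following code at the bottom of your LocalSettings.php: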
    wfLoadExtension( 'ORES' );
    
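  • Done: navigate to Special:Version on your wiki to verify that the extension was installed successfully.

Once the extension is deployed, run the maintenance script CheckModelVersions.php (after that, you can also run PopulateDatabase.php).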

To set up a local development environment for MediaWiki with the ORES extension, follow the ORES extension local development guide.

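Configuration variables

The following variables can be configured. Each is shown with its default value and a brief description.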

// URL of the ORES service
$wgOresBaseUrl = 'https://ores.wikimedia.org/';
// Whether to exclude edits made by bots from scoring
$wgOresExcludeBots = true;
// Models to score
$wgOresModels = [
	'damaging' => [ 'enabled' => true ],
	'goodfaith' => [ 'enabled' => true ],
	'reverted' => [ 'enabled' => false ],
	'articlequality' => [
		'enabled' => false,
		'namespaces' => [ 0 ],
		'cleanParent' => true,
		'keepForever'=> true
	],
	'wp10' => [
		'enabled' => false,
		'namespaces' => [ 0 ],
		'cleanParent' => true,
		'keepForever'=> true
	],
	'draftquality' => [
		'enabled' => false,
		'namespaces' => [ 0 ],
		'types' => [ 1 ],
	],
];
// Whether to fetch scores from Lift Wing instead of ORES
$wgOresUseLiftwing = false;
// URL for Lift Wing - skipped if null
$wgOresLiftWingBaseUrl = null;
// Thresholds for the different sensitivities in ORES
$wgOresDamagingThresholds = [ 'soft' => 0.7, 'hard' => 0.5 ];
// Namespaces that ORES should score. An empty array means all namespaces.
// If not empty, only edits in the given namespaces are scored.
// Specify namespaces like [ 0 => true, 120 => true ].
$wgOresEnabledNamespaces = [];
// Database ID for the ORES service. If not set, the database name is used.
// You can choose 'testwiki', for which the ORES service sends back the last two digits of rev_id, flipped.
// For example: https://ores.wikimedia.org/v1/scores/testwiki/damaging/12345
$wgOresWikiId = null;
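
As an illustration of overriding these defaults, the fragment below is a minimal sketch of a LocalSettings.php section. The chosen values are assumptions made for the example, not recommendations: namespace 120 simply reuses the number from the comment above, and 'testwiki' is only meaningful for testing against the ORES service.

```php
// LocalSettings.php fragment (illustrative values only)
wfLoadExtension( 'ORES' );

// Score edits in the main namespace (0) and in namespace 120 only.
$wgOresEnabledNamespaces = [ 0 => true, 120 => true ];

// Also score edits made by bots (the default excludes them).
$wgOresExcludeBots = false;

// Use the 'testwiki' ID instead of the database name (testing only).
$wgOresWikiId = 'testwiki';
```

Variables that are not set here keep the defaults listed above.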


Debugging an ORES extension deployment

After deploying the extension with either ORES or Lift Wing as a backend, follow the steps in the ORES extension debugging guide to verify that it is working correctly.

ORES service response

The ORES extension is little more than an interface to the ORES service. The service returns the probability that an edit is damaging, like this (API v1):

{
  "724030089": {
    "damaging": {
      "prediction": false,
      "probability": {
        "false": 0.8917716518085119,
        "true": 0.10822834819148802
      }
    }
  }
}

This means that the edit (diff=724030089) has a probability of about 11% of being damaging. Note that a 90% probability does not mean that 9 out of 10 such edits will be vandalism. Thresholds should be chosen by analysing recall (the percentage of vandalism that is caught) or the false positive rate. In ORES, the "soft" threshold is the one at which recall is 75% (meaning it includes 75% of all damaging edits), and the "hard" threshold is the one at which recall is 90%. You can get the thresholds from the model info (an example).
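
To make the response format and the notion of recall concrete, here is a minimal PHP sketch. It is not the extension's code: damagingProbability() and recallAt() are hypothetical helpers, and the labelled scores in $labelled are made up for the example. The first helper reads the "true" probability from a v1 response; the second computes which fraction of truly damaging edits a given threshold would catch.

```php
<?php
// Hypothetical helper: extract the probability that a revision is damaging
// from an API v1 response body. Returns null if the score is missing.
function damagingProbability( string $json, string $revId ): ?float {
	$data = json_decode( $json, true );
	return $data[$revId]['damaging']['probability']['true'] ?? null;
}

// Hypothetical helper: recall at a threshold, i.e. the share of truly
// damaging edits whose score is at or above the threshold.
// $labelled is a list of [ score, isDamaging ] pairs.
function recallAt( array $labelled, float $threshold ): float {
	$damaging = 0;
	$caught = 0;
	foreach ( $labelled as [ $score, $isDamaging ] ) {
		if ( !$isDamaging ) {
			continue;
		}
		$damaging++;
		if ( $score >= $threshold ) {
			$caught++;
		}
	}
	return $damaging === 0 ? 0.0 : $caught / $damaging;
}

// The response shown above, as a single string.
$response = '{"724030089":{"damaging":{"prediction":false,'
	. '"probability":{"false":0.8917716518085119,"true":0.10822834819148802}}}}';
$p = damagingProbability( $response, '724030089' );

// Made-up labelled data: four damaging edits and two good ones.
$labelled = [
	[ 0.9, true ], [ 0.8, true ], [ 0.6, true ], [ 0.3, true ],
	[ 0.7, false ], [ 0.2, false ],
];
$recallLow = recallAt( $labelled, 0.5 );  // threshold 0.5 catches 3 of 4
$recallHigh = recallAt( $labelled, 0.7 ); // threshold 0.7 catches 2 of 4
```

On this toy data, lowering the threshold from 0.7 to 0.5 raises recall from 0.5 to 0.75, which mirrors why the "hard" threshold, with its higher recall, is the lower number.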

Database schema

The ORES extension introduces two new tables: ores_model and ores_classification. See the full database schema description.


Extension workflow

Scoring

Once an edit is made, the extension triggers a job that queries the service and stores the results in the ores_classification table. This means that edits made before the deployment have no scores. To fill the database, you can run the maintenance script PopulateDatabase.php. It queries the service and stores the scores for the last 5,000 edits. You can run it several times if needed.

Once a model is updated to a newer version, the CheckModelVersions.php maintenance script needs to be run to update the ores_model table. This marks the scores already stored in the ores_classification table as deprecated. You can remove these obsolete scores by running the PurgeScoreCache.php maintenance script.

Interface

The extension does not show anything immediately after deployment. Instead, it registers itself as a beta feature (Extension:BetaFeatures is a dependency of this extension). Once a user enables it, the extension uses hooks in ChangesList (RecentChanges, Watchlist, and RelatedChanges), in both old and enhanced mode, and highlights an edit when its score exceeds the given threshold.
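
The highlighting rule can be sketched as follows. This is a simplified illustration and not the extension's actual hook code: shouldHighlight() is a hypothetical helper, and the thresholds are the default "soft" and "hard" values from the configuration section.

```php
<?php
// Default thresholds, copied from the configuration section.
$thresholds = [ 'soft' => 0.7, 'hard' => 0.5 ];

// Hypothetical helper: decide whether an edit should be highlighted for a
// given sensitivity ('soft' or 'hard'). Unknown sensitivities never highlight.
function shouldHighlight( float $damagingScore, string $sensitivity, array $thresholds ): bool {
	if ( !isset( $thresholds[$sensitivity] ) ) {
		return false;
	}
	// Highlight only when the score exceeds the chosen threshold.
	return $damagingScore > $thresholds[$sensitivity];
}

// A score of 0.6 is highlighted at the 'hard' sensitivity but not at 'soft'.
$hard = shouldHighlight( 0.6, 'hard', $thresholds );
$soft = shouldHighlight( 0.6, 'soft', $thresholds );
```

Because "hard" has the lower threshold, it flags more edits than "soft" for the same set of scores.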