Reading/Web/Lazy loading of images on Bengali Wikipedia
In progress
On May 9th 2016 (4pm PST) we rolled out lazy loading of images to all users of Bengali Wikipedia. The purpose was to collect data on page views and global NavigationTiming results from one project, to guide a future rollout.
Analysis
This section is currently a draft. Material may not yet be complete, information may presently be omitted, and certain parts of the content may be subject to radical, rapid alteration.
Rollout of lazy loaded images on bn.m.wikipedia.org suggests a nontrivial speed improvement in page fully loaded time (initial lag excluded) on HTTP1, likely no speed improvement on HTTP2, and again a significant reduction in image bytes shipped per pageview, leading to lighter weight pages.
Note that the x-axis in the speed graphs is presented in log2(n) increments.
To reduce noise, navigation timing events are filtered to ensure key fields are present, with an emphasis on anonymous, non-redirected, plain article pageviews.
To simplify the bytes shipped calculations, eligible source HTML pages are constrained to a relative path beginning with /wiki/ (on language variant wikis this path would be different, but that's out of scope here) for a given wiki, with no colon ":" character in the remainder of the path. Responses must also be HTTP 200s, to avoid overcounting 30x, 40x, or other such spurious responses. This narrows the analysis to requests likely to be plain article pageviews. Image bytes are constrained to those served from upload.wikimedia.org with a Referer that meets the same restriction as page paths.
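The path constraint can be sketched as a small predicate. This is a minimal illustration in Python, assuming the same pattern that the queries in this page apply to `uri_path`; the function name `is_plain_article_path` is hypothetical and not part of the original analysis.

```python
import re

# Same pattern the webrequest queries apply to uri_path: /wiki/ followed by
# one or more characters, none of which is a colon.
ARTICLE_PATH = re.compile(r"^/wiki/([^:])+$")


def is_plain_article_path(uri_path: str, http_status: str) -> bool:
    """Hypothetical helper: True for HTTP 200 responses on colon-free /wiki/ paths."""
    return http_status == "200" and ARTICLE_PATH.match(uri_path) is not None


print(is_plain_article_path("/wiki/Dhaka", "200"))           # True
print(is_plain_article_path("/wiki/Special:Search", "200"))  # False (colon in path)
print(is_plain_article_path("/wiki/Dhaka", "301"))           # False (not a 200)
```

Excluding any path with a colon drops namespaced pages such as Special: or File: pages, at the cost of also dropping the occasional article whose title contains a colon.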
Changes introduced to support lazy loaded images required modified (increased) JavaScript/CSS/HTML.
bnwiki
Navigation timing data for bnwiki were sparse, making analysis difficult.
Based on data from 5-11 May 2016, slightly before the initial lazy loaded images went into force, and 23-29 June 2016, the latest Thursday-to-Wednesday week with the most up-to-date loading technique (the same week used as the latter week for ukwiki and fawiki in the analysis of those wikis), pages with lazy loaded images loaded faster at the 10th, 50th (median), and 90th percentiles on HTTP1. On HTTP2, however, pages without lazy loaded images loaded faster at the 10th, 50th (median), and 90th percentiles.
A certain level of chaos in the events is evident in the line chart.
Comparison of 14 April 2016 - 11 May 2016 (prior to lazy loaded images) versus 2-29 June 2016 (well after lazy loaded images were implemented) paints a slightly more complete picture, at the expense of more general time-based trends potentially complicating the data.
Taking the data at face value, it again appears that HTTP1 lazy loaded pages loaded faster at the 10th, 50th (median), and 90th percentiles. On HTTP2 lazy loaded pages loaded faster at the 90th percentile, but slower at the 10th and 50th (median) percentiles.
Percentile | HTTP1 w/o lazy (ms) | HTTP1 lazy (ms) | HTTP2 w/o lazy (ms) | HTTP2 lazy (ms) | HTTP1 lazy improvement | HTTP2 lazy improvement |
---|---|---|---|---|---|---|
10% | 1675.4 | 1552.5 | 1217.6 | 1334 | 7.34% | -9.56% |
50% | 6672 | 5825 | 2930 | 3239 | 12.69% | -10.54% |
90% | 23301.4 | 21870 | 8288.6 | 8162 | 6.14% | 1.53% |
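The improvement columns appear to be the relative reduction in load time between the non-lazy and lazy figures. A minimal sketch of that arithmetic, assuming the load times are in milliseconds and using a hypothetical helper named `improvement`:

```python
# Load times copied from the table above: (without lazy, with lazy).
rows = {
    "10%": {"HTTP1": (1675.4, 1552.5), "HTTP2": (1217.6, 1334.0)},
    "50%": {"HTTP1": (6672.0, 5825.0), "HTTP2": (2930.0, 3239.0)},
    "90%": {"HTTP1": (23301.4, 21870.0), "HTTP2": (8288.6, 8162.0)},
}


def improvement(before: float, after: float) -> float:
    """Relative reduction in load time; negative means lazy pages were slower."""
    return (before - after) / before


for percentile, protocols in rows.items():
    for protocol, (before, after) in protocols.items():
        print(f"{percentile} {protocol}: {improvement(before, after):.2%}")
```

Results may differ from the table in the last digit, depending on whether values are rounded or truncated.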
Something interesting occurred with bnwiki. The relative amount of HTTP1 traffic was considerably greater (72.94% of share) on lazy loaded images than without them (61.13%) for the 4 week window comparison. This same trend was observed with comparison windows closer to the switchover of varying lengths. This suggests that perhaps lazy loaded images had a larger impact on the relatively slower connections for bnwiki (twice as slow at the median prior to the change), many originating from Bangladesh.
Data transfer comparisons were more straightforward. Based on examination of two 3-day periods (3-5 May 2016 vs 28-30 June 2016), image bytes per pageview were reduced by about 40.24%. This contributed to a decrease of about 22.73% in bytes shipped for the modified JavaScript/CSS/HTML plus images as compared to the baseline JavaScript/CSS/HTML and images.
Data Transfer
month | day | uri_host | Page & scoped rl bytes | image bytes | scoped pageviews | Page & rl per pv | image per pv | total avg | total reduction | image reduction |
---|---|---|---|---|---|---|---|---|---|---|
5 | 3 | bn.m.wikipedia.org | 1501832097 | 2700722786 | 85180 | | | | | |
5 | 4 | bn.m.wikipedia.org | 1555299806 | 2758045895 | 87959 | | | | | |
5 | 5 | bn.m.wikipedia.org | 1576132685 | 2824312174 | 89299 | 17654.701636196 | 31562.0483885718 | 49216.7500247678 | | |
6 | 28 | bn.m.wikipedia.org | 2087510493 | 2017188349 | 106948 | | | | | |
6 | 29 | bn.m.wikipedia.org | 1947149236 | 1930773662 | 101976 | | | | | |
6 | 30 | bn.m.wikipedia.org | 1889671190 | 1882166445 | 100171 | 19166.6992963328 | 18861.9306556237 | 38028.6299519565 | 0.227323422761173 | 0.402385725305036 |
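The per-pageview averages and reductions in the table above can be reproduced from the daily totals. The sketch below assumes each 3-day period is pooled (bytes summed, then divided by summed pageviews), which matches the figures shown; the variable and function names are illustrative.

```python
# (page & scoped rl bytes, image bytes, scoped pageviews) per day, from the table above.
may_days = [
    (1501832097, 2700722786, 85180),
    (1555299806, 2758045895, 87959),
    (1576132685, 2824312174, 89299),
]
june_days = [
    (2087510493, 2017188349, 106948),
    (1947149236, 1930773662, 101976),
    (1889671190, 1882166445, 100171),
]


def per_pageview(days):
    """Pool a period and return (page & rl bytes, image bytes) per pageview."""
    page_bytes = sum(day[0] for day in days)
    image_bytes = sum(day[1] for day in days)
    pageviews = sum(day[2] for day in days)
    return page_bytes / pageviews, image_bytes / pageviews


before_page, before_image = per_pageview(may_days)
after_page, after_image = per_pageview(june_days)

image_reduction = 1 - after_image / before_image
total_reduction = 1 - (after_page + after_image) / (before_page + before_image)

print(f"image reduction: {image_reduction:.2%}")
print(f"total reduction: {total_reduction:.2%}")
```

Note that page and ResourceLoader bytes per pageview went up between the two periods, so the total reduction is smaller than the image reduction.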
Queries
The following query was used to derive the lag-excluded load time, roughLoadTimeInitialLagExcluded.
select
left(timestamp,8) as ts,
event_lazyLoadImages,
event_isHttp2,
event_loadEventEnd-event_responseStart as roughLoadTimeInitialLagExcluded,
event_responseEnd-event_responseStart as roughNetworkTimeInitialLagExcluded,
event_requestStart,
event_responseStart,
event_responseEnd,
event_firstPaint,
event_domInteractive,
event_domComplete,
event_loadEventStart,
event_loadEventEnd,
webHost,
event_originCountry,
event_mediaWikiVersion
from NavigationTiming_15485142
where
timestamp > '20160412'
and timestamp < '20160630'
and event_action = 'view'
and event_isAnon = true
and event_mobileMode = 'stable'
and event_namespaceId = 0
and event_redirectCount is null
and event_loadEventEnd is not null
and event_domComplete is not null
and event_domInteractive is not null
and event_responseStart is not null
and wiki in ('bnwiki')
order by
wiki,
event_lazyLoadImages,
event_isHttp2,
ts
;
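The query returns one row per event, so the percentiles have to be computed afterwards. The sketch below shows one way to do that in Python, grouping on the lazy loading and HTTP2 flags. It is not the tooling used for the original analysis: the function name is hypothetical, and the interpolation method of `statistics.quantiles` may differ from whatever produced the table.

```python
from collections import defaultdict
from statistics import quantiles


def percentiles_by_group(rows):
    """Group load times by (lazy, http2) and return 10th/50th/90th percentiles."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["event_lazyLoadImages"], row["event_isHttp2"])
        groups[key].append(row["roughLoadTimeInitialLagExcluded"])

    result = {}
    for key, times in groups.items():
        # n=10 yields nine cut points: the 10th, 20th, ..., 90th percentiles.
        cuts = quantiles(times, n=10)
        result[key] = {"p10": cuts[0], "p50": cuts[4], "p90": cuts[8]}
    return result


# Made-up values purely for illustration; not taken from the real dataset.
sample_rows = [
    {
        "event_lazyLoadImages": 1,
        "event_isHttp2": 0,
        "roughLoadTimeInitialLagExcluded": load_time,
    }
    for load_time in range(100, 1200, 100)
]

print(percentiles_by_group(sample_rows))
```

Each group needs at least two events for `quantiles` to work, which matters for sparse data such as bnwiki's.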
The following query was used to derive image bytes transferred using the constraints described above.
select
month,
day,
substr(referer,1,26),
sum(response_size)
from
webrequest
where
year = 2016
and ((month = 5 and day in (3, 4, 5)) or (month = 6 and day in (28, 29, 30)))
and uri_host = 'upload.wikimedia.org'
and referer rlike '^https://(bn).m.wikipedia.org/wiki/([^:])+$'
and content_type rlike '^image'
and agent_type = 'user'
and http_status = '200'
group by
month,
day,
substr(referer,1,26)
;
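In the image bytes query, `substr(referer,1,26)` groups rows by the scheme and host of the Referer, since `https://bn.m.wikipedia.org` is 26 characters long. A small Python sketch of the same filter and grouping key, with a hypothetical function name:

```python
import re

# Same pattern the query applies to the Referer header.
REFERER = re.compile(r"^https://(bn).m.wikipedia.org/wiki/([^:])+$")
PREFIX_LENGTH = 26  # mirrors substr(referer, 1, 26)


def referer_group(referer: str):
    """Return the grouping key for an eligible Referer, or None if ineligible."""
    if REFERER.match(referer) is None:
        return None
    return referer[:PREFIX_LENGTH]


print(len("https://bn.m.wikipedia.org"))  # 26
print(referer_group("https://bn.m.wikipedia.org/wiki/Dhaka"))
print(referer_group("https://bn.m.wikipedia.org/wiki/File:Example.jpg"))  # None
```

The dots in the pattern are unescaped, as in the query, so they match any character; for this host that is unlikely to matter in practice.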
The following query was used to derive page and JavaScript/CSS bytes (pre- and post-modification for lazy loading) transferred using the constraints described above.
select
month,
day,
uri_host,
sum(response_size)
from
webrequest
where
year = 2016
and ((month = 5 and day in (3, 4, 5)) or (month = 6 and day in (28, 29, 30)))
and (
(
uri_host = 'bn.m.wikipedia.org'
and ((uri_path rlike '^/wiki/([^:])+$') or (uri_path = '/w/load.php' and uri_query rlike 'skins\.minerva\.icons\.images\.scripts&skin=minerva' and referer rlike '^https://bn.m.wikipedia.org/wiki/([^:])+$'))
)
)
and agent_type = 'user'
and http_status = '200'
group by
month,
day,
uri_host
;
The following query was used to derive pageviews using the constraints above. Practically all matching records were qualified as pageviews, largely ruling out the possibility of image byte transfer counts with proper Referer values being derived from anything other than qualified pageviews.
select
month,
day,
uri_host,
content_type,
is_pageview,
count(1)
from
webrequest
where
year = 2016
and ((month = 5 and day in (3, 4, 5)) or (month = 6 and day in (28, 29, 30)))
and (
(
uri_host = 'bn.m.wikipedia.org'
and uri_path rlike '^/wiki/([^:])+$'
)
)
and agent_type = 'user'
and http_status = '200'
group by
month,
day,
uri_host,
content_type,
is_pageview
;
Caveats
As with any time series data gathered across a myriad of devices and environments, the data are subject to fluctuation. However, the data transfer savings are unambiguous, and the larger event sampling pools for ukwiki and fawiki lend a degree of confidence that pages are actually loading faster.