The revelation that four errant measurement whiteboxes skewed the leaderboard in the latest Measuring Broadband Australia report (“ACCC revises errors in broadband measurement scheme”, CommsDay 22-Apr-2021) is disconcerting, but it raises a bigger question – how large would the sample size need to be to yield a robust (i.e. statistically significant) answer that is not so sensitive to errant measurements? The answer, unfortunately, is that the required sample size may simply be infeasible.
Having spent considerable time analysing around 17 million speed-test measurements collected via the Google M-Lab platform over a two-year period, covering 12 ISPs in 3 countries, we rapidly reached the conclusion that comparing averages across ISPs was largely meaningless, because the measured differences were smaller than the “noise” inherent in the data [1]. To understand why, compare the 98.7% average (as a percentage of plan speed) for Optus against the 98.9% average for TPG reported in the original MBA report – even on a 100Mbps plan, this difference corresponds to a mere 0.2Mbps, well below the noise introduced by test factors such as latency to the test server, the number of concurrent TCP threads, the accuracy of the clock timing the test duration, whether the test traverses the 100M or 1G interface of the home router, and so on. Correcting for the significant noise introduced by these “biasing factors” turns out to be non-trivial [2], even using sophisticated data analytics techniques such as multivariate matching, and even when millions of data points are available (as was the case with M-Lab). Comparing RSPs using average measured speed is therefore an exercise in futility.
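To give a sense of scale, the back-of-envelope power calculation sketched below shows how many tests would be needed just to reliably distinguish a 0.2Mbps difference in means between two RSPs. The 10Mbps per-test noise figure is our illustrative assumption, not a number from the MBA report or from [1][2]; the point is the order of magnitude.

```python
# Illustrative back-of-envelope calculation only: the 10 Mbps per-test noise
# figure is an assumption for the sake of argument, not a number taken from
# the MBA report or from [1][2].
from math import ceil
from scipy.stats import norm

delta = 0.2      # difference in mean speed we want to detect (Mbps)
sigma = 10.0     # assumed per-test measurement noise, standard deviation (Mbps)
alpha = 0.05     # significance level
power = 0.80     # desired statistical power

z_alpha = norm.ppf(1 - alpha / 2)   # ~1.96
z_beta = norm.ppf(power)            # ~0.84

# Standard two-sample formula: n per group = 2 * (z_alpha + z_beta)^2 * (sigma / delta)^2
n_per_group = ceil(2 * (z_alpha + z_beta) ** 2 * (sigma / delta) ** 2)
print(n_per_group)   # roughly 39,000 tests per RSP, per plan tier
```

Even then, a larger sample only averages away random noise; it does nothing for the systematic biasing factors listed above, which is why correcting for them remains non-trivial even with millions of data points [2].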
As was outlined in CommsDay on 31-Mar-2021 (“Canopus claims AI tech will revolutionise broadband measurement outputs, economics”), the only meaningful approach moving forward is to report on “experience” rather than speed. For starters, consumers would struggle to comprehend what a 0.2Mbps speed difference means to them, but they certainly know how it feels to watch a Netflix spinning wheel or suffer a gaming lag spike! Equally importantly, experience can be measured for real applications in their natural environment, unlike speed tests, which generate unnatural blasts of synthetic traffic. Finally, with recent developments in Software Defined Networking and Artificial Intelligence capabilities, experience can be measured 24×7 for a broad basket of applications, at large scale and at low cost.
If ‘speed testing’, as referenced in the article, is the key determinant of “a telco leaderboard and a significant marketing tool”, then the validity of the leaderboard itself is seriously in question, with material business implications for every service provider. Relying on a relatively small number of ‘speed measurement boxes’ at the end of the link for active speed testing is passé and misleading. It is time to step up and embrace passive experience measurement at scale.
References
[1] X. Deng, J. Hamilton, J. Thorne, V. Sivaraman, “Measuring Broadband Performance using M-Lab: Why Averages Tell a Poor Tale”, Proc. International Telecommunication Networks and Applications Conference (ITNAC), Sydney, Australia, Dec 2015.
[2] X. Deng, Y. Feng, T. Sutjarittham, H. Habibi Gharakheili, B. Gallego, V. Sivaraman, “Comparing Broadband ISP Performance using Big Data from M-Lab”, arXiv:2101.09795, https://arxiv.org/abs/2101.09795