Source: The Guardian
Originally published: 6 December 2020
Google’s artificial intelligence company DeepMind won an international competition that asked entrants to predict how proteins fold in three dimensions given only the sequences of their chemical links, or amino acids.
Progress in the “Critical Assessment of protein Structure Prediction” (CASP) race, set up in 1994, had almost come to a halt. Many in the field had given up hope that they would live to see a solution. However, in 2018 DeepMind won the “protein Olympics” by some distance. This year its AlphaFold2 software lapped the opposition. While it was a great leap for CASP, it seemed a small step for humanity. DeepMind trained a neural network on protein-structure databases to learn what proteins look like. It did so by rapidly learning what evolutionary adaptations had occurred over millennia and using those insights in its guesses.
DeepMind’s achievement answers one big scientific question but raises more fundamental ones for society. Part of a profit-seeking company, DeepMind pays large salaries for scarce AI talent. Its groundbreaking news was announced in a company press release. It has yet to submit a paper describing its work to a peer-reviewed journal, though it has this year published one on its 2018 CASP entry. A simple idea underpins science: results should always be subject to challenge from experiment. Commercial firms may want to be trusted more than scrutinised.
If DeepMind represents a paradigm shift in biology, it is the impact of artificial intelligence on the field. In 2020, it is thought there will be 21,000 scientific papers involving AI methods in this branch of science – and this number is growing at 50% a year. The field is also dominated by tech giants whose code is their intellectual property, making it particularly opaque. Only 25% of AI papers publish their code. DeepMind, say experts, regularly does not. This impairs accountability and reproducibility and ultimately may hamper progress. There are ongoing attempts to share proprietary data while respecting its highly confidential nature. It would be better if the industry adopted a more open-source attitude.
Science has traditionally progressed by freely distributing knowledge. The underlying concern is that DeepMind, like its rival OpenAI, may opt to commercialise its deep learning model instead of making it freely available. Some argue that price is not a problem and AlphaFold2 is cheap. But DeepMind’s advances rest in part on state-backed breakthroughs – from the evolutionary insight of Spanish bioinformatician Alfonso Valencia to the computational work of scientists such as UCL’s David Jones. It would be strange if in years to come university researchers used government cash to pay DeepMind for a system built on government-funded insights. DeepMind’s discovery should be celebrated, and it should be dedicated largely to the public good rather than wholly to the pursuit of profit.