The second lecture I was able to attend at ACL 2022 was about the state of deep nets (large, complex neural networks) in the modern world. The presenters raised many interesting points about accessibility, the cost of producing these models, and even carbon emissions, and I’ll discuss some of their key points here.
One of the most important ideas the presenters brought up was the tension between chasing state-of-the-art (SOTA) results and ease of use. Currently, many computer scientists and computational linguists alike are focused on building new deep nets that are marginally more accurate than their predecessors. That is certainly important: the more accurate these deep nets are, the better the applications of NLP in technology and research will be. However, as these deep nets become more and more accurate, they also become more and more complex, making their pre-training extremely expensive and their accessibility very low. This creates two issues. First, it limits academia’s ability to compete in building optimized deep nets, since educational institutions rarely have the money to access the massive compute clusters needed to pre-train complex deep nets for extended periods of time, and the high costs discourage experimentation because mistakes are expensive. Second, less focus is being placed on ease of use, which prevents researchers from easily entering the field and makes it harder to find the optimal algorithms for particular tasks.
Thus, the presenters argued that pre-training should be left to industry, and academia should focus on fitting pre-trained models to specific data sets and testing the results. It’s also important that results are shared so that researchers can learn from each other’s mistakes. In summary, they proposed that members of the field focus on making models as accessible as possible, as transparent as possible (meaning the commands needed to run them are legible to the untrained eye), and as consistent as possible, keeping terminology consistent with older papers.
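To make the "fine-tune, don’t pre-train" division of labor concrete, here is a minimal sketch of what the academic side of that workflow can look like. It assumes the Hugging Face transformers and datasets libraries; the checkpoint and data set names are illustrative choices of mine, not anything the presenters prescribed.

```python
# Sketch: start from an industry-released pre-trained checkpoint and fit it
# to a specific task-level data set, rather than pre-training from scratch.
# Checkpoint and data set are illustrative assumptions, not from the lecture.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "distilbert-base-uncased"  # publicly released pre-trained model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# A small, task-specific data set: this is the part academia can afford to iterate on.
dataset = load_dataset("imdb")
tokenized = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=tokenized["test"].shuffle(seed=42).select(range(500)),
    tokenizer=tokenizer,  # enables dynamic padding of variable-length batches
)
trainer.train()
print(trainer.evaluate())  # share these numbers so others can learn from them
```

The point of the sketch is the cost asymmetry: the expensive pre-training step is inherited from a released checkpoint, while the fine-tuning and evaluation loop runs on a data subset small enough for a single GPU, which is exactly the kind of work the presenters argued academia is well positioned to do.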