BLOG@CACM
Architecture and Hardware

Is AI Security Work Best Done In Academia or Industry? Part 2

Reasons to favor academia for AI security research.


We left off in Part 1 with arguments for why industry has become the place for major leaps in AI, though with several notable exceptions, both historical and ongoing. In Part 2, we consider the counterpoints that make academia’s lanes and bylanes attractive for AI security work.

Doubtless, a best-of-both-worlds approach is also a winner here: an academic researcher who holds an affiliation with an industrial organization and thus enjoys the three advantages discussed in Part 1 (vast compute resources, vast amounts of data, and a higher compensation structure). That lane, though, is a narrow one, and it has become progressively narrower for organizational reasons; someone with an industrial affiliation is often not free to speak her mind about weak spots in that organization’s commercial offerings. Take the famous case of Geoffrey Hinton, who stepped back from Google in 2023 in order to speak freely about the potential dangers of artificial intelligence, or the more recent departure, in March 2025, of the prolific and respected AI security researcher Nicholas Carlini from Google DeepMind.

So here are the counterpoints favoring AI security work in academia.

Vast Compute Resources

Training foundation models from scratch is indeed well-nigh impossible in academic circles. We have several industrial organizations to thank for releasing fully trained, open-source models: Llama from Meta, Gemma from Google, and Mistral from Mistral AI, to name a few. Bless their souls, because without them, academic work on LLMs would indeed have crawled. These are not toy models either; Llama 3’s 70B-parameter variant, while decidedly smaller than the leading proprietary models, is nothing to sneeze at. We can tinker with these open-source models to our hearts’ content. Further, we can fine-tune a large model to suit our needs, and that takes but a fraction of the cost of training a model from scratch. (Pay attention to the licenses of these open-source models, though, if you plan to step outside the realm of academic research and into the realm of commercial use.)
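To make that concrete, here is a minimal sketch of how an academic group might fine-tune an open model with parameter-efficient adapters (LoRA) using the Hugging Face transformers, peft, and datasets libraries. The model identifier, the local data file, and the hyperparameters are illustrative assumptions on my part, not a recipe endorsed by any of the model providers.

```python
# Hedged sketch: parameter-efficient (LoRA) fine-tuning of an open model.
# The model identifier, data file, and hyperparameters are assumptions for
# illustration only; check each model's license before any commercial use.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

model_name = "meta-llama/Meta-Llama-3-8B"    # assumed open-weight base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token    # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Attach low-rank adapters so only a small fraction of the weights is updated.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Hypothetical local corpus with one {"text": ...} record per line.
data = load_dataset("json", data_files="security_corpus.jsonl", split="train")
data = data.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                batched=True, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=1,
                           num_train_epochs=1),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The point of the sketch is the cost structure, not the particulars: the bill here is a fine-tuning bill that a single well-equipped lab machine can plausibly pay, not a pretraining bill that only a hyperscaler can.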

Vast Amounts of Data

Regarding the need for data, we have, in part, the same trend noted above to thank. The initial training of these open-source models consumed seemingly limitless troves of data. Once that de novo training is done, though, fine-tuning takes far more modest amounts of data.

Another positive trend in our technology world has been the release of open-source datasets. We are hooked on collecting data to document every nook of our lives: pictures and videos, wearable data with a continuous record of our physiology, and sensor data giving a precise, up-to-the-minute record of our physical environments. A good many of us are likewise hooked on releasing such data, a data analog of the dopamine hit of contributing to a Wikipedia article. Thus, even the problem of data scarcity is being mitigated to some extent.

Arguably, the world-changing dataset in the field of image processing and computer vision, now part of technical folklore, is ImageNet. A similar favorable pattern of open-source dataset creation has taken hold in the LLM world as well: OpenSubtitles, DailyDialog (for chatbots), The Pile (a diverse, catch-all dataset for various LLM tasks), and FRAMES (for reasoning and knowledge tasks) are hugely popular.
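For readers who want to kick the tires, here is a minimal sketch of pulling two of the datasets mentioned above through the Hugging Face hub; the hub identifiers and split names are assumptions on my part, and the canonical releases may live elsewhere.

```python
# Hedged sketch: loading open datasets for LLM work via the Hugging Face hub.
# The hub identifiers and splits below are assumptions; verify them (and the
# licenses) against the official releases before building on them.
from datasets import load_dataset

# DailyDialog: multi-turn, open-domain dialogues often used for chatbot training.
daily_dialog = load_dataset("daily_dialog", split="train", trust_remote_code=True)
print(daily_dialog[0]["dialog"][:2])      # first two turns of the first dialogue

# FRAMES: a benchmark aimed at retrieval-augmented reasoning and knowledge tasks.
frames = load_dataset("google/frames-benchmark", split="test")
print(frames.column_names)
```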

The Compensation Structure

Indeed, a large fraction of AI talent, once trained, gravitates toward the opulence of well-resourced industrial organizations. However, the tale of academic penury vs. industrial opulence is overblown. Driven by market trends, academic salaries have risen for those in the “hot areas” (and AI is glowing, furiously red-hot), and many of us are regularly courted by our friends in industry as a result. Plus, many academics in these hot areas hold part-time industrial appointments. So plenty of my colleagues in academia are leading thinkers in AI, and doers too (lest you harbor the antiquated notion of academics just preaching and never doing).

Just as importantly, we have the benefit of a continuous flow of fresh talent, a pristine stream that is, seemingly by magic, continually replenished. In the U.S., we have long been the beacon for talented students and researchers from across the world. Such talented youngsters are fast learners and have the necessary naïveté of the untrained to propose breakthrough ideas. They have the bug to learn and to grow their educational qualifications, so a far lower salary than industry offers is not an impediment. This holds only during the early phase of their careers, but that is enough to keep our pipeline in academia humming along.

No Muzzles Allowed

Towering above all else, and bridging the three points above, is the one factor that makes academia the place where security research in AI makes those big leaps: the lack of a muzzle. We are free to speak our minds, find vulnerabilities in the models, their training protocols, or the compute infrastructure on which they run, and communicate them to our industrial colleagues so that our race to more intelligent AI models is also a race toward more secure AI models.

Often explicitly and sometimes implicitly, industrial practitioners are forbidden from poking holes in their AI products, the cash cows that must not be slowed down. In the ‘wild, wild west’ climate that has persisted in AI since its earliest days, there will be vulnerabilities in the software, especially given the frenetic pace of development. Yet practitioners are not incentivized to look too carefully for these vulnerabilities. The tech blogosphere and tech social media channels get regular doses of news items in which some influential AI person has been eased out of their company because they sounded an early alarm that somehow became public.

For the most part, we in academia are free from such constraints. We make a name by finding vulnerabilities and, going further, by suggesting mitigations. Real impact arises from instantiating the mitigations in the actual software or model, which almost invariably involves harmonious cooperation between academic and industry personnel.

To Sum

Academia in the U.S. has been the fertile soil where new ideas take root and flourish, including in the fast-moving, society-upturning field of AI. Specifically for security in AI, there are fundamental forces that favor academia as the place where many significant advances will sprout. These forces have shaped the trend for several years now, and I expect it to last. So here’s an energetic hurray for us AI security researchers in academia, and an even heartier toast to synchronized efforts between academic and industry researchers and practitioners.

This post was originally published on Distant Whispers.

Saurabh would like to thank Rama Govindaraju of Nvidia for providing insightful comments on a draft of this article. The views in the article, however, are Saurabh’s own.

Saurabh Bagchi

Saurabh Bagchi is a professor of Electrical and Computer Engineering and Computer Science at Purdue University, where he leads a university-wide center on resilience called CRISP. His research interests are in distributed systems and dependable computing, and he and his group have the most fun making and breaking large-scale usable software systems for the greater good.
