You go to war with the data you have: next-generation AI for national security

Angela M. Sheffield

Artificial intelligence is the most powerful technology in generations with the potential to impact U.S. security, welfare and global leadership. U.S. national security agencies must develop and integrate AI-enabled capabilities to compete and defend in the AI era. However, standard methods and AI technologies fall short for the high-consequence and specialized missions of national security. The U.S. Department of Energy’s (DOE) National Nuclear Security Administration (NNSA) and National Laboratories are developing the Next-Generation of AI — innovative methods and technologies designed for national security challenges and operational concepts. National security agencies should leverage NNSA’s Next-Generation AI research and development to accelerate AI innovation and enable an AI-ready force.

The AI revolution is dominated by deep learning-based systems trained using vast datasets of generic images and text examples. While powerful, these techniques are designed for use cases where it is possible to curate millions of commonplace examples and where the insights the model derives are of little consequence. The deep neural networks in these AI systems rely on assumptions of balanced datasets with uniformly distributed examples: bias is assumed away because the datasets are so large and data elements are represented in even proportion. In short, deep learning requires big, boring datasets. More interesting datasets, like those typical in national security, push these technologies beyond their performance envelope and the bounds of their intended use.

NNSA’s Office of Defense Nuclear Nonproliferation Research and Development (DNN R&D) is advancing the state-of-the-art in AI to fulfill requirements for robust, ethical and secure models. Research in DNN R&D’s Next-Generation AI portfolio is laser-focused on the “Achilles’ heels” that limit the usefulness of standard AI models for national security. While DNN R&D’s mission is to develop technologies and science-based capabilities for nuclear proliferation detection missions, AI opportunities and requirements are shared for missions across national security. National security agencies should adopt as standard practice Next-Generation AI techniques, models and technologies in building AI systems for national security missions.

Research in the Next-Generation AI portfolio does not assume that the data is good. Rather, Next-Generation AI researchers know it is not: data is sparse, messy, incomplete and biased. In national security, you go to war with the data you have. The national security enterprise cannot wait for good data before developing methods to build useful AI systems. The data will never be good enough. Next-Generation AI researchers are developing methods to build robust, useful models from bad and biased data, including domain-aware AI models and architectures, methods to fuse data from multiple sources, and approaches to leverage synthetic data.

Conventional data science and machine learning techniques require that the data alone provide all information necessary to describe relevant phenomena or features accurately and completely. Rather than relying strictly on data, AI for national security must use domain-aware AI methods: computational techniques that combine data with modeled predictions and simulations, information from human operators or subject matter experts, and information from the operational environment. These methods include custom loss functions, which penalize the model when it violates a mission-informed constraint, and custom model architectures, like a domain adversarial neural network, which train the model to be more sensitive to the right signals.
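
To make the idea of a constraint-penalizing loss concrete, here is a minimal sketch in Python. It is an illustration only, not a method taken from the DNN R&D portfolio: the data values, the non-negativity constraint (standing in for a domain rule such as "a measured rate cannot be negative"), and every function name are assumptions chosen for the example. It fits a straight line to four noisy points, once with an ordinary squared-error loss and once with an added penalty on negative predictions.

```python
# Hypothetical sparse, noisy measurements (illustrative values only).
xs = [0.5, 1.0, 2.0, 3.0]
ys = [-0.4, 0.9, 2.1, 2.8]

# Operating points where the assumed domain rule applies:
# the predicted quantity must not be negative.
constraint_xs = [0.0, 0.25, 0.5]


def predict(params, x):
    a, b = params
    return a * x + b


def gradients(params, lam):
    """Gradient of: mean squared error + lam * mean squared constraint violation."""
    a, b = params
    ga = gb = 0.0
    for x, y in zip(xs, ys):
        residual = predict(params, x) - y
        ga += 2.0 * residual * x / len(xs)
        gb += 2.0 * residual / len(xs)
    for x in constraint_xs:
        violation = max(0.0, -predict(params, x))  # > 0 only when the rule is broken
        ga += lam * 2.0 * violation * (-x) / len(constraint_xs)
        gb += lam * 2.0 * violation * (-1.0) / len(constraint_xs)
    return ga, gb


def fit(lam, lr=0.01, steps=5000):
    """Plain gradient descent from (0, 0); lam = 0 disables the domain penalty."""
    params = (0.0, 0.0)
    for _ in range(steps):
        ga, gb = gradients(params, lam)
        params = (params[0] - lr * ga, params[1] - lr * gb)
    return params


def worst_violation(params):
    return max(max(0.0, -predict(params, x)) for x in constraint_xs)


plain = fit(lam=0.0)         # data-only fit
constrained = fit(lam=10.0)  # data plus domain penalty

print("data-only worst violation:   ", round(worst_violation(plain), 3))
print("domain-aware worst violation:", round(worst_violation(constrained), 3))
```

The data-only fit predicts a negative value at the low end of the range because nothing in four noisy points forbids it. The penalized fit gives up a little accuracy on the training points in exchange for predictions that better respect the domain rule. The weight `lam` sets that trade-off and would need to be tuned for any real mission constraint.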

Researchers in the Next-Generation AI portfolio are developing methods to combine and jointly analyze data from air, land, maritime, space and cyberspace. Data fusion requires sophisticated mathematical and computer science approaches. Successful techniques combine disparate data sources within an underlying framework like an ontology, graph-based model, computational science or predictive model, or decision intelligence framework. Computational models and unsupervised techniques are often the best choice because they can be implemented at scale. The Next-Generation AI portfolio is advancing the science of data fusion to enable the rapid and continuous integration of multidomain data for mission demands in the modern era, where data fusion is the new paradigm.
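
As one small illustration of fusing disparate sources within a graph-based framework, the Python sketch below treats each report as a node and links reports that share an identifier, then merges each connected group into a single fused entity. The record layout, source names and identifiers are hypothetical, and the union-find grouping is a generic technique chosen for this example rather than a description of the portfolio's methods.

```python
from collections import defaultdict

# Hypothetical reports from different domains; identifiers are made up.
records = [
    {"source": "maritime", "identifiers": {"vessel:A1"}},
    {"source": "space", "identifiers": {"vessel:A1", "port:P9"}},
    {"source": "cyber", "identifiers": {"host:10"}},
    {"source": "land", "identifiers": {"port:P9"}},
]


def fuse_records(records):
    """Group records that are linked, directly or transitively, by shared identifiers."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group's root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    first_seen = {}  # identifier -> index of the first record that mentioned it
    for idx, rec in enumerate(records):
        for ident in rec["identifiers"]:
            if ident in first_seen:
                union(idx, first_seen[ident])
            else:
                first_seen[ident] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)

    return [
        {
            "sources": sorted({records[i]["source"] for i in members}),
            "identifiers": sorted(
                {ident for i in members for ident in records[i]["identifiers"]}
            ),
        }
        for members in groups.values()
    ]


fused = fuse_records(records)
for entity in fused:
    print(entity)
```

In this toy case the maritime, space and land reports collapse into one entity because the space report shares an identifier with each of the other two, while the cyber report stays separate. Real multidomain fusion must also handle uncertain or conflicting links, which exact identifier matching ignores.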

The Next-Generation AI portfolio generates curated and synthetic data to overcome sparse, biased and incomplete data. The primary focus is on relevant data types beyond images and text, including signals, measurements, sensor data, and scientific and technical data. Additionally, the Next-Generation AI portfolio curates specialized datasets that capture attributes and behaviors relevant to national security, in order to train models that are useful for these missions.
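
A simple way to picture synthetic data for a sparse class is interpolation between real examples. The Python sketch below is a deliberately simplified illustration with made-up measurement vectors. It is loosely similar in spirit to SMOTE but omits that method's nearest-neighbor step, and it is not a description of how the portfolio generates data.

```python
import random

# Hypothetical measurement vectors for a rarely observed class.
rare = [[1.0, 0.2], [1.4, 0.1], [0.9, 0.35]]


def synthesize(samples, n_new, seed=0):
    """Create n_new points, each on the line segment between two distinct real samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(samples, 2)  # pick two different real examples
        t = rng.random()               # interpolation position along the segment
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic


synthetic = synthesize(rare, n_new=5)
for point in synthetic:
    print([round(v, 3) for v in point])
```

Each synthetic point stays within the range spanned by the real examples, so this approach adds volume without adding new information about behavior outside what was observed. That limitation is one reason simulation and domain knowledge matter when real data is scarce.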

The United States national security enterprise must integrate AI to win in competition and conflict with China, Russia and emerging threats. While this transformation is daunting, the United States has an invaluable resource: NNSA and the DOE National Laboratories are already developing solutions to build AI systems for national security missions. National security agencies should turn to the Next-Generation AI capabilities of the DOE National Laboratories in the race among global powers to develop and deploy artificial intelligence.

Angela M. Sheffield is the senior program manager for data science and artificial intelligence at the National Nuclear Security Administration’s Office of Defense Nuclear Nonproliferation Research and Development in the Department of Energy. She leads the U.S. government’s premier program to develop artificial intelligence systems to transform national security and fulfill mission requirements across the U.S. government to prevent nuclear weapons proliferation and reduce the threat of nuclear terrorism.

Courtesy: (c4isrnet)