The amount of data available to intelligence and security agencies today is astounding. As data continues to grow, so do the challenges of protecting and using it efficiently so that it can be turned into actionable insights that strengthen national security. The possibilities of using AI to improve public sector agencies are endless—if you have the right infrastructure to help organize unstructured data from multiple sources, quickly process it, and protect it against adversarial attacks, no matter where it lives.
Artificial intelligence and machine learning frameworks rely on a mind-boggling amount of data. Public sector data is usually distributed, comes from multiple sources, is constantly changing, and arrives in a variety of formats: audio, video, images, logs, and more.
Effective AI models for the public sector need to be trained on large amounts of data. That data must be gathered, organized, labeled, structured, and prepared so that AI models can use it. The accuracy and performance of AI and ML models depend on both the quantity and the quality of training data, making it the most essential element in the entire AI workflow. In a national security context, for example, an ID verification model using computer vision and trained on 100 passports would not perform as well or be as accurate as a model trained on 10,000 or 100,000 passports.
The National Security Commission on Artificial Intelligence reports that “a very small percentage of current AI research goes toward defending AI systems against adversarial efforts.” Adversarial ML and AI attacks are very real, and they happen on a large scale. The problem is so big that Microsoft teamed up with the MITRE Corporation to create MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems). ATLAS is “a knowledge base of adversary tactics, techniques, and case studies for machine learning (ML) systems based on real-world observations.”
Adversarial AI and ML attacks often target data or applications used for data preparation and training. As mentioned earlier, this data is one of the most essential elements in the AI workflow because it directly affects the outcome of the AI model. If that data is altered or manipulated in any way by an attack, the adverse results may not be detected before inference and production, when it’s usually too late to remedy the problem.
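To make the risk concrete, here is a minimal, hypothetical sketch (not from the original article) of how a label-flipping data poisoning attack can silently change a model's predictions. A toy nearest-centroid classifier stands in for a real model; the data and labels are invented for illustration.

```python
import numpy as np

def nearest_centroid_predict(X, y, x_test):
    # Toy 1-D classifier: assign the test point to the nearest class centroid.
    c0 = X[y == 0].mean()
    c1 = X[y == 1].mean()
    return 0 if abs(x_test - c0) <= abs(x_test - c1) else 1

# Clean training data: class 0 clustered near 0.5, class 1 near 4.5.
X = np.array([0.0, 0.5, 1.0, 4.0, 4.5, 5.0])
y = np.array([0, 0, 0, 1, 1, 1])

print(nearest_centroid_predict(X, y, 2.0))  # clean data: predicts class 0

# Label-flipping attack: the attacker relabels two class-0 points as class 1.
# The feature values are untouched, so the tampering is easy to miss.
y_poisoned = y.copy()
y_poisoned[[1, 2]] = 1

print(nearest_centroid_predict(X, y_poisoned, 2.0))  # poisoned: predicts class 1
```

The poisoned labels shift both centroids, so the same test point is now misclassified, and nothing in the feature data itself reveals the attack.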
A multiprong strategy is required to protect against these attacks. Training data goes through at least three stages—data gathering, labeling, and training—before inference and production. To protect datasets against data poisoning, insider and outsider threats, model evasion, and model stealing attacks, organizations need to establish data integrity and traceability throughout the process (data at rest, in motion, and in processing). This protection can be achieved through Zero Trust environments by encrypting data in all stages, creating tamper-proof datasets, and implementing multifactor authentication combined with insider threat detection tools.
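One building block of the data integrity and traceability described above can be sketched with cryptographic hashing: a per-file manifest makes a dataset tamper-evident at each stage of the pipeline. This is an illustrative example using Python's standard library, not a description of any specific NetApp product feature.

```python
import hashlib
import pathlib

def build_manifest(dataset_dir: str) -> dict:
    """Compute a SHA-256 digest for every file in the dataset directory.

    Storing this manifest separately (ideally on write-once media) gives a
    tamper-evident baseline for the data-gathering and labeling stages.
    """
    digests = {}
    for path in sorted(pathlib.Path(dataset_dir).rglob("*")):
        if path.is_file():
            digests[path.name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests

def verify_manifest(dataset_dir: str, saved_manifest: dict) -> bool:
    """Re-hash the dataset and compare against the saved manifest.

    Any modified, added, or deleted file changes the result, so silent
    tampering between pipeline stages is detected before training.
    """
    return build_manifest(dataset_dir) == saved_manifest
```

Before each stage hands data to the next, the receiving stage re-verifies the manifest; a mismatch stops the pipeline before poisoned data reaches training.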
The National Institute of Biomedical Imaging and Bioengineering of the NIH has conducted large studies to improve the interpretation of tumors by using AI and deep learning. The National Cancer Institute of the NIH also conducts research into AI-aided imaging for cancer prevention, diagnosis, and monitoring.
The following adversarial example shows how an image that was classified by an AI system as a benign tumor can be manipulated so that the same AI system classifies it as malignant, even though the changes are not perceptible to the human eye.
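The mechanics of such an attack can be sketched in a few lines. The example below is a deliberately simplified, hypothetical stand-in: a linear "classifier" with invented weights, attacked in the style of the fast gradient sign method (FGSM), where each input feature is nudged a bounded amount in the direction that most increases the malignant score.

```python
import numpy as np

# Hypothetical linear classifier: score > 0 means "malignant", else "benign".
# The weights are invented for illustration, not from any real model.
w = np.array([1.0, -2.0, 0.5])

def classify(x):
    return "malignant" if w @ x > 0 else "benign"

x = np.array([0.2, 0.4, 0.1])   # score = 0.2 - 0.8 + 0.05 = -0.55 -> benign
print(classify(x))

# FGSM-style perturbation: for a linear model, the gradient of the score
# with respect to the input is simply w, so step each feature along sign(w),
# keeping every change within a small budget eps.
eps = 0.3
x_adv = x + eps * np.sign(w)

print(classify(x_adv))                # the same model now says "malignant"
print(np.max(np.abs(x_adv - x)))      # every feature moved by at most eps
```

In a real image classifier the same idea applies per pixel, and the budget eps is chosen small enough that the perturbed image looks unchanged to a human reviewer.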
Considering the prevalence of these attacks in public sector IT systems, Microsoft and MITRE have classified them into seven categories in the Adversarial ML Threat Matrix.
Here is a summary of the AI and ML attacks described in the matrix.
With NetApp® AI solutions for the public sector, you get built-in data protection, compliance, and secure access for your distributed, diverse, and dynamic data on premises and across clouds. NetApp enables you to integrate, organize, protect, and secure your data pipeline from edge to core to cloud. With solutions like NetApp Cloud Insights, NetApp DataOps Toolkit, and NetApp SnapLock® compliance software, we help public sector agencies manage, organize, and use their data while helping to defend against adversarial AI and ML attacks. With these threats averted, governments can use the power of AI to make quick and confident decisions that strengthen national security.
To learn more, visit our AI for public sector webpage.
Dejan is a visionary and a leader whose innovative, out-of-the-box thinking has earned him a reputation as a creative solutions wizard. Dejan serves on several SNIA (Storage Networking Industry Association) committees, and he is regularly invited to chair conferences and to present on the latest technologies.
Currently, Dejan is a Sr. Product Manager at NetApp, where he leads initiatives related to ONTAP and hybrid cloud adoption, among other things. Dejan has over 25 years of industry experience in storage, cloud, HPC, and AI technologies, and he holds an MBA and a master's degree in information technology.