Threats to Machine Learning

April 29, 2019

By: Jason M. Pittman, Sc.D.

Our last discussion focused on the integrity of our training data. More precisely, we explored the idea that training data could be poisoned. The poisoning we looked at skewed our output model. While data poisoning is rather nefarious, it is not the only cybersecurity threat. There is a host of potential cybersecurity threats to machine learning, and there is no reason to think the threat landscape has stopped evolving. How, then, do we provide availability, confidentiality, and integrity for our algorithms? Let’s find out!

Categorizing Threats


The best way to plan our defense is to start with a map of how the offense may manifest. Broadly, cybersecurity threats to machine learning can be categorized by type. Such types result from triangulating three things: the type of machine learning algorithm, how the threat acts against the machine learning, and the threat's intended result. I am not certain this form of triangulation is hard and fixed. However, it is a solid guideline when treated as one input to a higher-order decision-making process.

Remember, we have four general algorithm categories. Algorithms can be supervised or unsupervised. Further, algorithms use classification or regression to produce their output models. In fairness, there is a growing list of subtypes as well, which directly factor into the threat conversation. I’ll save subtypes such as generative adversarial networks for a future discussion.

Suffice it to say, threats align not only with the type of algorithm but also with whether the threat targets something specific about the machine learning or acts in a sweeping fashion. Technically, a traditional threat such as a buffer overflow could be introduced into our machine learning implementation, but I don’t see those as a targeting type. Ignoring threats that are not specific to machine learning allows us to concentrate on how machine learning threats act. As a consequence, we can also categorize a threat by its intended result.

Relative to intended result, threats come in all shapes and sizes. In fact, I think this is the area with the most growth potential in the coming years. Intended results include possibilities such as rendering algorithms inoperable, hiding data, and flooding a model with false positives or false negatives. We need to start somewhere, though. Because we already went into poisoning, I feel we can connect the threat categorization dots by illustrating that specific threat.
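To make the triangulation concrete, here is a minimal sketch of how the three axes could be captured in code. The enum values and the ThreatProfile structure are my own illustrative naming, not an established taxonomy; I am simply encoding the categories described above.

```python
from dataclasses import dataclass
from enum import Enum, auto

# Axis 1: the kind of machine learning algorithm being threatened.
class Learning(Enum):
    SUPERVISED = auto()
    UNSUPERVISED = auto()

class Output(Enum):
    CLASSIFICATION = auto()
    REGRESSION = auto()

# Axis 2: how the threat acts against the machine learning.
class Action(Enum):
    TARGETED = auto()   # aimed at a specific dataset, model, or pipeline
    SWEEPING = auto()   # works against a common feature or function

# Axis 3: the intended result of the threat.
class IntendedResult(Enum):
    INOPERABLE_MODEL = auto()
    DATA_HIDING = auto()
    FALSE_POSITIVE_FLOODING = auto()
    FALSE_NEGATIVE_FLOODING = auto()
    SKEWED_MODEL = auto()

@dataclass
class ThreatProfile:
    """One threat, triangulated along the three axes."""
    name: str
    affected_learning: set[Learning]
    affected_output: set[Output]
    action: Action
    intended_result: IntendedResult

# Example: data poisoning, as categorized in the illustration below.
poisoning = ThreatProfile(
    name="training data poisoning",
    affected_learning={Learning.SUPERVISED, Learning.UNSUPERVISED},
    affected_output={Output.CLASSIFICATION, Output.REGRESSION},
    action=Action.TARGETED,
    intended_result=IntendedResult.SKEWED_MODEL,
)
```

A profile like this is only a bookkeeping device, but it hints at how a defensive plan could later be keyed to each axis rather than to individual threats.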

An Illustration

As we saw, poisoning targets a specific training dataset. Thus, any given poison will affect only a particular (although not singular) machine learning implementation. I consider poisoning to be a targeted threat, as opposed to a sweeping threat, which would work against a common feature or function. At the same time, because poisoning targets training data, it works against supervised or unsupervised algorithms and against classifiers or regression algorithms alike. I’d call that a universal threat. As far as intended result, recall that poisoning skews the resulting model. There are no errors per se, and the output will otherwise appear valid.
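As a quick, hedged sketch of that behavior, the snippet below flips a fraction of labels in a toy training set and fits a classifier on both the clean and poisoned versions. The dataset, the 25% flip rate, and the logistic regression model are all illustrative assumptions, not a recipe for any particular attack; the point is only that the poisoned model trains without errors and still returns valid-looking predictions, even though its accuracy has been quietly skewed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Toy binary classification data (illustrative only).
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Poison the training labels: silently flip 25% of them.
y_poisoned = y_train.copy()
flip_idx = rng.choice(len(y_poisoned), size=len(y_poisoned) // 4, replace=False)
y_poisoned[flip_idx] = 1 - y_poisoned[flip_idx]

clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)

# Both models train cleanly and emit plausible-looking predictions;
# only held-out accuracy reveals the skew.
print("clean accuracy:   ", clean_model.score(X_test, y_test))
print("poisoned accuracy:", poisoned_model.score(X_test, y_test))
```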

In the Future

We now have a robust categorization framework for machine learning threats. As such, I think we can envision specific attack scenarios for our machine learning. I don’t want to go into attack details before we have an opportunity to develop a defensive framework as well. Join me in the future when I lay out a defensive plan for these threats to machine learning.