Humans have a tendency to treat emerging technologies as catch-all solutions to endemic problems. Information technology companies actively reinforce this notion in their marketing. Google, Amazon, IBM, and other firms do not just offer software and hardware; they offer “solutions.” Unfortunately, technologies are never solutions in isolation. Technologies are tools, and these tools should form part of an organization’s strategic approach to solving enduring problems. They are not solutions on their own, and they are certainly not strategies.
Selecting the wrong tool—or misapplying the right one—does more harm than just squandering resources. Adopting one technological tool often precludes the use of another because of the time and money it takes to incorporate the tool into workflows, train staff on its use and maintenance, and acquire equipment or services that support it. The opportunity costs are substantial.
As defense leaders look toward artificial intelligence and machine learning (AI/ML) to address organizational problems, they must know more than the hype. They must be familiar with the strengths and weaknesses of AI/ML techniques, consider other tools and steps that might solve their issues, and understand the resources required to incorporate AI/ML tools into their approaches. AI/ML algorithms can have great effects on organizational processes, but only if their strengths align with the problems within those processes. Even then, AI/ML algorithms may not be the best tool for the task.
Military professionals need a starting point for thinking through these requirements, and several rules of thumb can help leaders gauge whether their organizations should use AI/ML as a tool in their approaches.
Pattern Matching
Current AI/ML techniques create algorithms with the ability to detect patterns. This seems trite, but pattern matching is fundamental to intelligence. From birth, each of us develops pattern-detection skills that lead to our ability to identify objects, make predictions, draw conclusions, learn language, and interact with our environment. With time, experience, and exposure to exhaustive datasets, our minds construct the heuristics and algorithms that some of the world’s brightest minds want to replicate through computation.
As babies and toddlers, humans gain the ability to identify everyday things by discerning common characteristics in classes of objects. Our ability to make finer distinctions rises with exposure to more examples (data). To a one-year-old, every truck or van is a car. By three years old, children may recognize individual brands, colors, and models. Parents can supervise this learning by intentionally identifying examples of cars, or they can allow the learning to occur unsupervised as children watch others identify and interact with automobiles.
With appropriate amounts of data, machine-learning algorithms can replicate this pattern recognition. Like humans, algorithms “learn” to recognize more detailed characteristics when trained on more data (but also like humans, not indefinitely). Algorithms can exceed human capacity to identify objects or patterns at scale, but not necessarily in accuracy. For problem sets that require the triage of masses of data (x-rays, magnetic resonance imagery, satellite images, radar tracks, sentiment, etc.), machine-learning algorithms can help if definable signatures exist. AI/ML models trained to detect tumor growth are already showing promising results. Most interestingly, algorithms can run in concert with one another to perform more complex tasks in ways that simulate general intelligence.
Machine-learning algorithms only reliably identify patterns when used in operational environments consistent with their training environment. Today’s algorithms cannot dependably generalize. An algorithm must experience, while being trained, whatever variances users expect it to encounter in operation. An algorithm trained to identify human faces in profile against a black background is just that. Something as simple as changing the background color could harm the algorithm’s effectiveness. Any deviation in operational conditions from training conditions impacts algorithmic accuracy. A 2019 paper highlighted the failure of a Google algorithm to reliably identify objects outside of their normal orientations; the algorithm could easily identify vehicles driving on the road, but as soon as the vehicles were overturned, the algorithm made startling errors. Humans can generalize after a few examples. Algorithms cannot. They need explicit instruction and training, at least until techniques like zero-shot learning mature.
Algorithmic inability to generalize explains why algorithms can successfully identify tumors on MRIs—because the data used to train the algorithm closely matches the operational data—but struggle in complex, live environments. Machine-learning algorithms are wonderful pattern matchers, but only when conditions for operational use match well with training conditions. If a problem set exists in a tightly controlled environment and the training data contains the same variances the algorithm will experience operationally, it will likely work well. AI/ML algorithms trained to play games outperform human competitors for this very reason—the conditions of a game or simulator are controlled, so the operational and training environments are identical. If an algorithm’s operational environment contains more variance than its training data, grievous errors become likely.
The accuracy of algorithms in pattern recognition, including classifiers that identify objects, does not necessarily make them good or bad. For some applications, an algorithm with 70 percent accuracy would still add value and reduce human labor. In other applications, it would not. Leaders must understand the level of accuracy necessary to add value and identify the levels that would be hazardous. Individuals who work with an algorithm must know its approximate accuracy in various scenarios to use it well. This allows operators to avoid abdicating judgment in situations where the algorithm is prone to fail.
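As a rough illustration of what knowing an algorithm’s accuracy “in various scenarios” might look like in practice, the sketch below breaks a hypothetical classifier’s evaluation results out by operating condition. The dataset, column names, and conditions are invented for illustration.

```python
# Hypothetical sketch: break a classifier's accuracy out by operating condition
# so operators know where to trust the model and where to apply extra judgment.
import pandas as pd

# Assumed evaluation log: one row per prediction, with the true label, the
# model's output, and the scenario in which the image was collected.
results = pd.DataFrame({
    "condition": ["day", "day", "night", "night", "fog", "fog"],
    "truth":     ["tank", "truck", "tank", "truck", "tank", "truck"],
    "predicted": ["tank", "truck", "truck", "truck", "truck", "truck"],
})

# Accuracy per scenario rather than a single headline number.
accuracy_by_condition = (
    (results["truth"] == results["predicted"])
    .groupby(results["condition"])
    .mean()
)
print(accuracy_by_condition)  # day 1.0, fog 0.5, night 0.5
```

An operator handed this kind of breakdown knows to treat nighttime or foggy detections with more skepticism than daytime ones.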
Acting on Information
When an algorithm can identify objects in an environment, it can form rules to interact with them, much as humans do. Once children can identify animals like bees, fire ants, or wasps, they form rules to avoid them for safety. Once machine-learning algorithms can identify objects, engineers can either prescribe rules for interacting with those objects or allow other algorithms to develop rules for interacting with that object type. A self-driving car might use a rules-based algorithm that tells it to stop when an onboard sensor (likely powered by machine learning) detects a red light, or it might employ an algorithm trained with the goal of avoiding car-on-car collisions.
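A minimal, hypothetical sketch of that division of labor: a learned detector supplies the perception, and a hand-written rule decides how to act on it. The detector interface, labels, and threshold below are assumptions for illustration, not any particular vendor’s system.

```python
# Illustrative sketch: a learned perception step feeding a rules-based control step.
# `detect_objects` stands in for any trained detector; the rule itself is hand-written.

def detect_objects(camera_frame):
    """Hypothetical ML detector returning labels with confidence scores."""
    # In a real system this would run a trained model on the frame.
    return [("red_light", 0.97), ("pedestrian", 0.12)]

def decide(detections, confidence_threshold=0.9):
    """Prescribed rule: stop when the detector is confident it sees a red light."""
    for label, confidence in detections:
        if label == "red_light" and confidence >= confidence_threshold:
            return "STOP"
    return "PROCEED"

print(decide(detect_objects(camera_frame=None)))  # -> STOP
```

Keeping the rule outside the model makes it auditable: an engineer can read and change the stopping behavior without retraining anything.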
Algorithms that identify objects should not run unsupervised unless the probable harm caused by a mistake is small relative to the utility gained, the harm is reversible, or an operator can minimize risk through control measures. Spam filters are a good example. If an email service’s spam filter misidentifies an email from a friend as junk mail and automatically places it in a spam folder, the harm is small and reversible, and the user can make rules that prevent it from occurring again (control measures). From a utilitarian perspective, the good supplied by a spam filter generally outweighs the loss of an occasional email.
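The “control measure” in that example can be as simple as a user rule layered on top of the model’s output, as in this hypothetical sketch (the addresses and threshold are illustrative):

```python
# Hypothetical sketch: a user-defined rule layered on top of an ML spam score.
ALLOWLIST = {"friend@example.com"}   # user-created control measure
SPAM_THRESHOLD = 0.8                 # assumed operating point

def route_email(sender, spam_probability):
    """Allowlisted senders bypass the model; everyone else is filtered by score."""
    if sender in ALLOWLIST:
        return "inbox"
    return "spam" if spam_probability >= SPAM_THRESHOLD else "inbox"

print(route_email("friend@example.com", 0.95))  # -> inbox (rule overrides model)
print(route_email("unknown@spam.biz", 0.95))    # -> spam
```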
Alternatively, if the US Navy employed naval mines that used an algorithm to discern combatant from noncombatant vessels, the consequence of an algorithmic mistake would be much greater. Many nations would only use this type of weapon if its error rate were exceedingly small. Unfortunately, predicting the error rate of an identification algorithm in all circumstances taxes human prescience, even with extensive testing. Algorithmic reasoning is often inscrutable to human beings. Neural networks mimic human neurology, but they are not artificial duplicates. Paradoxically, thinking machines do not lessen the need for human judgment but increase it.
Furthermore, algorithms that make a mistake lack the general intelligence, perception, or context to stop making it. A human driving a car would immediately stop after striking a pedestrian; audible signals and the collision itself would tell the driver that something had gone wrong. A self-driving car might keep driving without human intervention. A machine-learning algorithm will continue selling or buying equities without regard to market context. AI/ML algorithms lack the general context to detect mistakes and change their behavior without explicit training.
When You Need Machine Learning—and When You Do Not
Machine-learning algorithms add the most benefit when used to conduct pattern analysis at a speed or scale that manpower constraints make prohibitive, in situations where analysis must occur continuously, or where the number of variables exceeds human capacity to analyze. Analyzing satellite imagery with machine learning is an excellent application of the technology. Government and commercial satellites generate torrents of images that require analysis by professionals at all hours. Humans can analyze the images, but an appropriately trained algorithm can do so at a speed that reduces organizational manpower costs and frees human analysts for higher-level tasks. Moreover, running an AI/ML-powered image classifier over satellite imagery can turn pixels into objects and transform an imagery database into something searchable by text: “Find all T-72 tanks identified on imagery located within 20 miles of City X between 1/2/2001 and 1/2/2002.” Employment of machine learning in this context exploits its advantage in performing analysis at speed and scale and adds secondary advantages that can improve organizational performance or processes.
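Once a classifier has turned pixels into structured detections, a query like that one reduces to ordinary filtering. A hypothetical sketch, assuming the detections have already been written to a table (the column names, distances, and dates are invented):

```python
# Illustrative sketch: querying classifier output as structured data.
# Assumes detections have already been written to a table with these columns.
import pandas as pd

detections = pd.DataFrame({
    "object_type": ["T-72", "T-72", "BMP-2"],
    "miles_from_city_x": [12.0, 35.0, 8.0],   # distances precomputed for illustration
    "image_date": pd.to_datetime(["2001-03-15", "2001-06-01", "2001-04-10"]),
})

matches = detections[
    (detections["object_type"] == "T-72")
    & (detections["miles_from_city_x"] <= 20)
    & detections["image_date"].between(pd.Timestamp("2001-01-02"),
                                       pd.Timestamp("2002-01-02"))
]
print(matches)  # the single T-72 detected 12 miles from City X in March 2001
```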
Frequently, leaders want to incorporate machine learning when rules-based algorithms will accomplish the same task. Recently, a senior Army officer proposed that the US Department of Defense create a machine-learning algorithm to identify individual soldiers capable of mobilizing to support civil authorities during natural disasters. A machine-learning algorithm could perform this task, but using one is more laborious than programming a rules-based query. The Department of Defense knows exactly what kind of soldier it needs to mobilize. He or she should be physically ready (which the military quantifies with metrics), be trained in a codified military skill set (which each service tracks with a system of codes reflecting occupational specialties and additional skills), be outside of the immediate disaster area, and have no administrative bars to mobilization (e.g., an injury or attendance at a military education course). Pulling this information is no more complicated than a database query: “Find all soldiers with [specialty code or skill identifier], AND who are medically ready, AND have no administrative restrictions on mobilization, AND live outside of a 50-mile radius of the disaster area.” There is no need for machine learning to predict or identify what kind of soldiers would best meet the needs of the mission. The Department of Defense has already done the thinking, and it has structured data to query for an answer. Leaders should only reach for AI/ML models if other techniques cannot meet their needs as quickly, accurately, or cost-effectively.
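A minimal sketch of that rules-based filter shows how little machine learning the task actually requires. The field names below are invented stand-ins for real personnel systems.

```python
# Hypothetical sketch: the mobilization "algorithm" is an ordinary filter, not a
# learned model. The field names are invented stand-ins for real personnel data.
import pandas as pd

soldiers = pd.DataFrame({
    "name": ["Soldier A", "Soldier B", "Soldier C"],
    "specialty_code": ["12B", "12B", "68W"],
    "medically_ready": [True, False, True],
    "admin_restriction": [False, False, False],
    "miles_from_disaster": [120, 300, 30],
})

eligible = soldiers[
    (soldiers["specialty_code"] == "12B")      # required skill set
    & soldiers["medically_ready"]              # physically ready
    & ~soldiers["admin_restriction"]           # no bars to mobilization
    & (soldiers["miles_from_disaster"] > 50)   # outside the disaster area
]
print(eligible["name"].tolist())  # -> ['Soldier A']
```

The same filter could be expressed as a single SQL query against existing personnel databases; either way, no model training is involved.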
If a competent data scientist can solve your problem, you likely do not need an AI/ML model. As you assemble the data, transform it, and analyze it to create a machine-learning model, you will often arrive at an answer before you ever set the AI to work. Generally, you need an AI/ML model when the quantity of data overwhelms the manpower available to accomplish the task, when a model must continuously evaluate data for changes (e.g., credit card fraud detection), or when the number of variables exceeds human capacity to analyze (e.g., games, simulations, genome decoding).
What Does It Take to Make a Model?
While AI/ML models can take large amounts of computational power to create, train, and run, the most important logistical requirement for creating an AI/ML model is acquiring data that matches the conditions in which the model will operate. This is more difficult than it sounds.
Imagine a lawn mower engine manufacturer wants to create a predictive maintenance algorithm. The company might gather maintenance records from repair shops around the globe and aggregate them to see if an algorithm can pick out patterns of mechanical failure. Those repair records will exist in different formats, with different data fields, free-text problem descriptions, different languages, and varying degrees of accuracy. The engine manufacturer would need data-literate engineers to comb through the reports and fit free-text descriptions of mechanical failures into a formal ontology of mechanical problems, an arduous task.
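As a hedged sketch of one small slice of that standardization work, the snippet below maps free-text failure descriptions onto a fixed ontology with simple keyword rules. The categories and keywords are assumptions, and real records would demand far more care.

```python
# Illustrative sketch: forcing free-text repair notes into a fixed failure ontology.
# Keyword rules like these are a crude starting point, not a finished solution.
FAILURE_ONTOLOGY = {
    "carburetor_failure": ["carb", "carburetor", "vergaser"],  # multilingual notes
    "starter_failure":    ["pull cord", "recoil", "starter"],
    "blade_damage":       ["blade", "bent blade"],
}

def categorize(free_text):
    """Return the first ontology category whose keywords appear in the note."""
    text = free_text.lower()
    for category, keywords in FAILURE_ONTOLOGY.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "uncategorized"  # flag for human review

print(categorize("Carb gummed up, engine won't idle"))      # -> carburetor_failure
print(categorize("Customer says it makes a weird noise"))   # -> uncategorized
```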
Even if the company finds sufficient records and standardizes them, the model might be less insightful than a competent statistician. Good data hygiene and data-science techniques will often provide the same answer as machine learning. For the algorithm to add value, the model needs access to data from inside engines as they fail in real time so that users can predict when parts in individual engines need replacement. The company would need to redesign its product with gauges and sensors to supply the necessary data. Even then, the algorithm would require periodic updates. Algorithms, like people, require retraining when their environment changes, and engineers must design or redesign hardware with AI in mind.
Finding training data that closely simulates operating conditions is the hardest task in algorithm creation. An image classifier needs thousands of hand-labeled images to develop an initial capability. These images should be at the same resolution, from the same perspective(s), and within the same spectrum as images in the operational environment. The number of hand-labeled images required will likely grow with the complexity of the classification task.
As you consider use of an AI/ML tool, ask yourself the following: Do I have access to appropriate data to train the algorithm? Is this data the same as the data the algorithm will evaluate in the operational environment? How many man-hours will it take to acquire, label, and format the data? Is there a simpler tool than AI/ML that achieves the same effect?
Heuristics for Heuristics
As with many new technologies, the hype behind AI/ML threatens to interfere with its objective, fact-centered implementation. Social media posts and business presentations barrage senior leaders with buzzwords that imply machine learning is a panacea. This marketing is hard to overcome. As leaders consider incorporating AI/ML technologies into their strategies for solving hard problems, the following rules of thumb can act as a guide.
- Don’t Be the First: If another organization has not already done something like what you would like to do with AI/ML, let someone else work through the problem first. If you are not a research and development organization, let a federally funded research and development center, university-affiliated research center, or Department of Defense lab prove the concept.
- No Data, No Algorithm: If you cannot find ample training data that matches the data of the algorithm’s operational environment, your algorithm will struggle.
- Know Thyself: The algorithmic accuracy necessary to gain efficiencies varies with organizational need. Fraud-detection algorithms can add value even with a relatively low accuracy rate; self-driving-vehicle algorithms cannot, because the cost of an error is so high. You must know how accurate or precise an algorithm needs to be to help your organization. This allows you to evaluate the costs and benefits of its creation and implementation.
- Don’t Overcomplicate It: Never use a more complicated tool when a simpler one will render the same result for less effort over the life of a program. Many organizational issues do not require AI/ML for resolution. In some situations, traditional, rules-based software can provide just as much of a productivity increase as machine-learning tools.
- Narrow and at Scale: At present, AI/ML algorithms perform best on narrow tasks in controlled or constrained environments. As the number and complexity of scenarios an algorithm may encounter grows, performance declines. Algorithms provide the largest productivity gains when asked to match patterns in a narrow context at a speed, scope, or complexity that exceeds human capacity.
AI/ML technologies offer defense leaders incredible tools for attacking longstanding problems, but they are not solutions in and of themselves. Prudent leaders must carefully examine the costs of developing AI/ML algorithms, their likelihood of success in implementation, and their advantages and disadvantages relative to other tools. If defense budgets decline, provident selection of AI/ML projects will become increasingly important.
Jules “Jay” Hurst is an Army strategist. He previously worked on the Algorithmic Warfare Cross-Functional Team (Project Maven).
The views expressed are those of the author and do not reflect the official position of the United States Military Academy, Department of the Army, or Department of Defense.
Image credit: Mike MacKenzie