Opinion

The real test of AI is not the chatbot. It’s the emergency call center

Most people judge AI by convenience. In emergency communications, the question is different: can it reduce uncertainty, shorten time to action, and support human judgment when panic is part of the signal?

Most people encounter AI as a tool for convenience. It writes emails, summarizes meetings, and helps developers move faster. Those gains are real. But they have also shaped the wrong expectation: that AI should be judged mainly by how smooth or impressive it feels in a low-stakes setting.
The real test begins in a very different environment, where the input is not a clean prompt but a frightened person on the other end of an emergency call. In the United States, telecommunicators answer roughly 240 million 911 calls each year, and 911.gov has highlighted 2023 survey findings that more than half of U.S. 911 centers were facing a genuine staffing emergency. In that setting, AI cannot afford to be slow, vague, or overly confident. It either reduces cognitive load, shortens time to action, and supports human judgment, or it gets in the way.
Gal Peretz. (Axon 911)
I work on AI in emergency communications, and that reality changes the way I think about this technology. The challenge is not how to make AI look impressive in a demo. It is how to make it useful in the messiest, noisiest, highest-pressure moments.
A 911 call is not a prompt
A 911 call is not a polished interaction with a machine. The caller may be injured, whispering, speaking out of order, or struggling to explain what is happening. They may repeat the least important detail three times and mention the most important one only once. There may be background noise, language gaps, incomplete location information, or a level of panic that makes the situation hard to describe clearly.
The operator still has to figure out what is happening, what matters first, and what needs to happen next. There is no time for prompt engineering in an emergency call center, and no opportunity to ask a frightened caller to refine the request until the system gets it right.
That changes the engineering problem. The goal is not to generate a fluent answer. The goal is to recover structure from chaos and help a trained professional move faster with better context.
This is where much of the public conversation around AI misses the point. In consumer AI, a weak answer is frustrating. In emergency response, a delayed answer, a misleading summary, or false confidence can distort judgment at exactly the wrong moment.
Three rules when seconds matter
The first is that speed is part of the safety model.
In most software, latency hurts the user experience. In emergency response, latency can hurt the outcome. The purpose of AI in this environment is not to produce long, polished responses. It is to surface the right signal early enough to matter: a key fact, a clearer summary, a faster path to action, or a reduction in the manual work that pulls attention away from the call itself. In this context, speed is not a nice-to-have. It is part of what makes the system operationally useful.
The second is that uncertainty must be visible.
One of the biggest risks in generative AI is not hallucination alone. It is false confidence. A fluent answer can sound trustworthy even when the evidence is thin or incomplete. In a high-stakes environment, that is dangerous. AI should help narrow ambiguity, not hide it. It should make it easier to distinguish between what is known, what is inferred, and what still requires human judgment. In critical systems, humility is not a weakness in the product. It is one of the things that makes the product trustworthy.
The third is that systems must be built for stressed humans.
A lot of AI is still built for ideal users: people sitting at a screen, with time to think, time to refine, and time to ask again. Emergency response is the opposite. People under stress do not speak in clean prompts. They speak in fragments. They jump between details. They focus on what feels urgent, which is not always what is operationally important. The best systems do not assume calm, perfect input. They are designed for confusion, pressure, and incomplete information, and they help make progress anyway.
Doing nothing is also a decision
Much of the discussion around AI in high-stakes environments focuses, understandably, on what can go wrong. That scrutiny is necessary. But there is another risk that gets less attention: the cost of leaving overloaded systems exactly as they are.
When call volumes are high, staffing is tight, and administrative burden keeps growing, the status quo is not neutral. Every unnecessary manual step takes attention away from the human interaction that matters most. Every extra layer of documentation steals time from situational understanding. Every fragmented workflow increases the chance that context gets lost between one moment and the next.
The answer is not blind automation. It is assistive systems with structure, guardrails, and human accountability built in from the start. Human oversight cannot be a slogan in a product presentation. It has to be part of the workflow itself.
In public safety, the highest-value AI is not the system that tries to replace the professional. It is the system that helps that professional stay focused on the hardest part of the job: understanding the situation, exercising judgment, and moving faster with less cognitive load.
This is where AI will be judged
Consumer AI introduced the world to what these systems can do. Critical systems will determine what they need to become.
That is a harder benchmark. It is not about entertainment. It is not about novelty. It is not about whether a model sounds smart in a controlled setting. It is about whether the technology can perform responsibly in noisy, time-sensitive, high-accountability environments without pretending to have certainty it has not earned.
For Israeli tech, that should be both a challenge and an opportunity. We know how to build fast and operate under pressure. The next test is whether we can build AI that holds up when the environment is messy, the user is under stress, and the human on the other side needs support, not theater.
The future of AI will not be decided only by bigger models or better demos. It will be decided by whether these systems can help humans act sooner, with clearer context and better judgment, when seconds matter.
Gal Peretz is Head of AI at Axon911.