AI That Can See, Listen, and Coach You Live
Why the next AI shift may not be machines writing for us, but machines helping us think while
we work.
By Mustafa Tameez
CEO & President of Outreach Strategists
6/9/2026
I got a number of emails, texts, and calls about my essay arguing that what glasses did for my eyesight, AI can do for cognition. The questions you asked were really insightful.
In that piece, I argued that AI may not replace thinking as much as extend it, the way glasses do not replace eyesight but help bring the world into focus.
The post hit sometime after commencement speakers at many graduations around the country were booed for talking about AI in front of graduating classes. That reaction was telling. Young people are not just curious about AI. Many are anxious, skeptical, or angry about what it may mean for their future.
In three of my five conversations with CEOs this week, they raised AI with me. That was not the nature or purpose of the meetings. We were not scheduled to talk about technology, productivity, labor, or the future of work. But AI came up anyway.
Leaders are trying to get their arms around AI while the public conversation is more focused on data centers, job loss, energy demand, and opportunities for young people. Those concerns are real. They should not be dismissed.
But in the middle of all of this, I instinctively want to drive the conversation toward what matters most: What is actually going to drive change? How will people use AI? And how do we use it to benefit the most people?
The answer may not be the part of AI most people are talking about.
For the last two years, most of the public conversation has been about content generation. AI can write an email. AI can summarize a report. AI can draft a memo. AI can create an image. AI can help with a social media post.
That is useful. It is already changing workflows. It is already being used by students, writers, marketers, lawyers, consultants, researchers, public agencies, nonprofits, and businesses of every size.
But I do not think content generation is the most important long-term shift.
The most important AI shift may not be machines writing for us. It may be machines coaching us while we work.
The next major shift is what I would call real-time multimodal AI.
In plain English, it is AI that can see, listen, and coach you live.
That phrase matters because every company is calling it something different. OpenAI offers this through ChatGPT’s voice mode with video and screen sharing. Google calls its version Gemini Live with camera and screen sharing. Microsoft calls its version Copilot Vision. Meta is moving in a related direction through AI-enabled smart glasses.
Different names. Same direction.
If you have not tried these tools yet, it is easy to miss how different they feel from the AI most people know today. For the first time, AI is moving beyond generating content on demand and becoming something closer to a real-time collaborator. Instead of waiting for a prompt, it can observe what you are working on, follow along with your reasoning, and offer feedback in the moment.
That shift may sound subtle, but I think it is far more significant than another improvement in writing, image generation, or search.
The AI is no longer just waiting for you to type a question into a box. It can look at what you are looking at. It can listen while you talk. It can watch what is on your screen. It can respond while the work is happening.
That is a very different kind of tool.
Content generation is asking AI to produce something.
Real-time multimodal AI is asking AI to participate with you while you are doing something.
That difference is enormous.
Right now, most people using AI are still using it in the old way. They are asking it to write, rewrite, summarize, brainstorm, explain, organize, or research. That is the dominant use case.
The best public data shows generative AI use is growing quickly, but almost none of it separates basic content generation from live camera or screen-sharing AI. So here is my best read: content generation is probably ten to thirty times more common today than real-time visual AI use.
That means we are still very early.
Most people know AI can write something. Far fewer know AI can watch them write something and give feedback while they work.
That is where the change gets interesting.
A student writing a paper can share the screen and talk through the argument. Instead of asking AI to write the paper, the student can use it to test the thesis, find repetition, strengthen the evidence, and improve the conclusion.
That is not cheating. That is coaching.
A young employee preparing for a client meeting can practice with the deck on screen. The AI can point out where the presentation drags, where the client will care most, which slide needs a clearer headline, and where the strongest argument is buried.
That is not replacing the employee. That is accelerating judgment.
A public servant reviewing a long agenda, budget item, or procurement document can use AI to identify what matters, what is unclear, where the risk is, and what questions should be asked before a decision is made.
That is not automation replacing democracy. That is better preparation for decision-making.
A small business owner trying to understand a contract, insurance policy, or vendor proposal can show the AI the screen and ask for a plain-language explanation.
That is not just convenience. That is access.
This is why I keep coming back to the glasses analogy.
Before I got glasses, the world was not missing. It was just harder for me to see. I could function. I could compensate. I could make it work. But once my eyesight was corrected, I realized how much effort had been wasted just trying to bring the world into focus.
AI may do something similar for cognition.
Not because it makes us smarter in some magical way.
But because it can reduce the friction between what we are trying to understand and what we are able to process in the moment.
For people with dyslexia, ADHD, language barriers, low confidence, visual limitations, or limited access to professional coaching, this could be profound.
For years, the advantage in professional life has belonged to people who had help. People with editors. Tutors. Assistants. Analysts. Consultants. Coaches. Parents who could review essays. Bosses who took time to mentor. Friends who knew how systems worked.
Real-time AI can become a kind of accessible second set of eyes.
It can help people see what they missed, ask better questions, organize their thoughts, and practice before walking into the room.
That does not eliminate inequality. It does not solve every structural problem. It does not replace education, judgment, experience, or relationships.
But it may give more people access to the kind of support that used to be available only to the well-connected, well-resourced, or already confident.
That is why this technology may grow faster than people expect.
People do not adopt technology because it is impressive. They adopt it because it solves a pain point.
Content generation solved the blank page.
Real-time multimodal AI solves something bigger: the feeling of being stuck while you are trying to do something.
I may be wrong about the timeline, but I do not think I am wrong about the direction.
This is why I think it will move quickly. A student can use it while struggling through a paper. A public servant can use it while reviewing a complicated agenda item. A young professional can use it before walking into a client meeting. A nurse, teacher, field worker, parent, or older adult can use it while trying to make sense of information in real time.
That is enough to change behavior.
So much of opportunity is not just intelligence. It is preparation. It is feedback. It is pattern recognition. It is knowing what good looks like.
Real-time AI can help more people know what good looks like.
That is the promise.
The risk, of course, is that we become passive. We let AI think for us instead of with us. We outsource judgment. We allow the tool to become a crutch instead of a coach.
That is why the way we talk about this matters.
I do not pretend to have all of this figured out. None of us do. But I know the difference between a tool that impresses people and a tool people will actually use.
I do not want the next generation to think of AI only as a machine that writes papers for them or steals jobs from them. I want them to understand it as a tool that can help them see, think, prepare, practice, and compete.
The difference is not small.
One path weakens people. The other strengthens them.
Leaders have a responsibility to learn this technology before they regulate it, dismiss it, fear it, or blindly adopt it. Public agencies, schools, hospitals, universities, nonprofits, and businesses should not only ask whether AI will replace work. They should ask how AI can improve human capacity.
- Can it help a nurse spend less time fighting paperwork?
- Can it help a student understand a concept before giving up?
- Can it help a small business compete with larger firms?
- Can it help a public official prepare better questions?
- Can it help a person with dyslexia write with more confidence?
- Can it help an older adult navigate a digital world that was not designed for them?
Those are the questions that interest me.
The public conversation around AI is still too narrow. It is either hype or fear. It is either miracle or threat. It is either job creator or job killer.
But the real story may be more human than that.
The real story may be whether AI becomes a tool that expands human ability.
That is why real-time multimodal AI matters.
Not because it can generate more content.
Because it can sit with us while we are trying to understand the world.
It can watch the work as it happens.
It can hear the question before we know how to write it.
It can help us see what we are missing.
And for many people, that may be the difference between being overwhelmed and being capable.
That is the part of AI I want to keep talking about: not the machine replacing the person, but the tool helping the person see more clearly.
Frequently Asked Questions
What is real-time multimodal AI?
Real-time multimodal AI refers to AI tools that can process more than one type of input at once,
including voice, video, images, documents, and screen activity. These tools can respond while
someone is actively working, instead of only answering typed prompts. (See OpenAI, Microsoft Copilot
Vision, or Stanford HAI.)
How is real-time AI different from generative AI?
Generative AI is often used to create content, summarize information, draft text, or brainstorm ideas.
Real-time AI adds live context by allowing the tool to listen, see, or follow along while someone is
trying to complete a task. (See OpenAI or Microsoft Copilot Vision.)
Why does AI live coaching matter for the workplace?
AI live coaching could help workers prepare for meetings, review documents, improve presentations,
and ask better questions before making decisions. Workplace adoption also depends on trust,
training, and whether people understand how to use AI responsibly. (See Pew Research Center or
OECD.)
How could real-time AI support students?
Real-time AI could help students talk through a thesis, organize a paper, identify weak evidence, or
understand a difficult concept before giving up. The stronger use case is learning support, not
replacing the student’s own work. (See Stanford HAI.)
Why is accessibility important in AI adoption?
Accessibility matters because real-time AI may help people facing barriers related to dyslexia, ADHD,
language, vision, confidence, or limited access to coaching. When designed and used responsibly, AI
can help reduce friction in how people understand information and participate in work, school, and
civic life. (See NIST or OECD.)
What are the risks of using AI as a live coach?
The biggest risk is overreliance. AI should support human judgment, not replace it. Organizations
need clear guidance, human oversight, privacy protections, and responsible use standards. (See NIST
AI Risk Management Framework.)
Why should leaders understand real-time AI before adopting or regulating it?
Leaders need to understand how people actually use AI before making decisions about policy,
training, risk, or implementation. The most useful strategies will focus on practical adoption, trust,
accessibility, and human capacity. (See Stanford HAI, NIST, or Pew Research Center.)
Sources and Further Reading
VP’s Take
Real-time AI raises an important communications challenge: ensuring that people understand these technologies well enough to use them effectively and responsibly. For public agencies, healthcare
organizations, schools, and employers, success depends on clear communication, audience
education, and thoughtful engagement that help people understand how AI can strengthen human
decision-making while preserving accountability, judgment, and trust. – Keri Myrick, PhD, VP of
Healthcare
Related OS Services
Research & Strategy
Strong communication starts with understanding the audience. OS helps clients use research,
sentiment, polling, and strategy to build messaging that reflects what people need, value, and trust.
Public Relations
When new technology changes public expectations, organizations need clear, credible
communications that explain what is changing and why it matters. OS helps clients shape messages,
strengthen reputation, and communicate with confidence.
Advertising & Marketing
AI adoption, digital literacy, and public awareness depend on reaching the right audiences across the
right platforms. OS helps clients build campaigns that increase visibility, connect with communities,
and support measurable growth.
Public Affairs
Technology, workforce, education, and public service decisions often require communication across
stakeholders, institutions, and communities. OS helps clients move ideas forward with
research-informed messaging and public affairs strategy.