Contextualizing AI is vital, but how do we ensure effective information sharing?

Last week I wrote a post called Are prompt engineers actually needed?, to which Mohammed Lubbad, PhD, responded with a question.
It was a really thought-provoking question, and I thought it would make an interesting intellectual experiment to ponder. I design massively scalable information systems using all sorts of tech wizardry, from basic RESTful APIs to complex shardable event-driven architectures, but is that how the mega AI agents of tomorrow will communicate?
When I started writing this I thought it would be a short post, but the more I write, the weirder things get. Perhaps I will make this a series. DM/email me if you want to see more like this.
“Have your people call my people” will soon be “Have your AI agent schedule something with my AI agent”. Now imagine that I am using Gemini and you are using ChatGPT, and each agent controls its human’s calendar. How will they figure out the optimal time to meet?
I suppose they could chat in English, slinging emails back and forth like cavemen, but is that the most effective way? Or would we define some protocol for LLMs to do scheduling?
Let's say you are trying to get on my calendar. You would need some kind of address/URL for my agent that your agent can send a calendar request to. From there, my agent would send your agent my available dates. Then your agent would select one, confirm it with my agent, and add it to your calendar.
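The exchange above can be sketched as a few structured messages. This is just a thought-experiment shape, not a real protocol — every field name, message type, and the `agent://` address scheme are my own assumptions:

```python
import json

# Hypothetical message shapes for the agent-to-agent scheduling
# exchange described above. All names here are invented for illustration.

def meeting_request(requester_agent: str, target_agent: str) -> dict:
    """Step 1: your agent asks my agent for availability."""
    return {
        "type": "meeting.request",
        "from": requester_agent,
        "to": target_agent,
    }

def availability_response(slots: list[str]) -> dict:
    """Step 2: my agent replies with open slots (ISO 8601 timestamps)."""
    return {"type": "meeting.availability", "slots": slots}

def confirmation(slot: str) -> dict:
    """Steps 3-4: your agent picks a slot and confirms it with mine."""
    return {"type": "meeting.confirm", "slot": slot}

# Simulated round trip between the two agents:
req = meeting_request("agent://you.example", "agent://me.example")
offer = availability_response(["2025-07-01T10:00", "2025-07-02T14:00"])
chosen = confirmation(offer["slots"][0])
print(json.dumps(chosen))
```

The point of the structure is that each step is machine-checkable, which matters later when we get to hallucinations.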
Let's make it more complicated by making it an in-person meeting. Now we also have to select a venue, perhaps even make reservations. The LLM agents would need to factor in their respective humans' dietary restrictions and culinary tastes.
What would that protocol look like? Anyone who has designed a RESTful API or any kind of communication protocol can quickly see how this becomes a complex problem to solve.
First off, security comes to mind. I don’t want random people on the internet to be able to query my schedule. This is easy enough to lock down: the agent being queried can queue a task to ask you whether it's okay to expose your calendar to the querying agent.
It could ask, “Should this be just this once, or any time they want to get on your calendar?” I feel like this could use its own swim-lane diagram at this point.
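In lieu of a swim-lane diagram, here is a minimal sketch of that consent check. The "once vs. always" scopes mirror the question above; the class and method names are mine, and `ask_human` stands in for however the agent actually pings you (push notification, chat, etc.):

```python
from enum import Enum

class Scope(Enum):
    DENY = "deny"
    ONCE = "once"      # answer this one query, don't remember the grant
    ALWAYS = "always"  # standing permission for this requester

class CalendarAgent:
    """Illustrative agent that gates calendar queries behind human consent."""

    def __init__(self):
        self.grants: dict[str, Scope] = {}  # requester address -> granted scope

    def handle_query(self, requester: str, ask_human) -> bool:
        if self.grants.get(requester) == Scope.ALWAYS:
            return True                  # standing permission, no re-ask
        decision = ask_human(requester)  # e.g. "agent://x wants your calendar?"
        if decision == Scope.ALWAYS:
            self.grants[requester] = decision
        return decision != Scope.DENY

agent = CalendarAgent()
agent.handle_query("agent://stranger", lambda r: Scope.ONCE)   # allowed, not stored
agent.handle_query("agent://friend", lambda r: Scope.ALWAYS)   # allowed, stored
agent.handle_query("agent://friend", lambda r: Scope.DENY)     # allowed without asking
```

Even this toy version shows the state you'd have to persist per requester, before you ever touch revocation or expiry.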
Personal/geographic information is another interesting one. Should the agent expose your current location to a complete stranger? Ideally no, but what if you are trying to book an Uber ride via the agent? Then perhaps yes.
What if your creepy ex-boyfriend figured out a way to jailbreak the LLM so he could stalk you?
That leads me to hallucinations and jailbreaking. What if the LLM agent hallucinates your availability? This is where I think plain English wouldn’t be the best way to handle communications.
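This is the practical argument for structured messages over prose: if the agent must emit availability as data, the other side can verify each slot against the actual calendar and reject anything hallucinated. A toy sketch, where `validate_offer` and the calendar set are purely illustrative:

```python
from datetime import datetime

# Ground truth: the openings that actually exist in my calendar.
real_calendar_openings = {
    "2025-07-01T10:00",
    "2025-07-02T14:00",
}

def validate_offer(offered_slots: list[str]) -> list[str]:
    """Keep only slots that parse as real timestamps AND exist in the
    ground-truth calendar; a hallucinated slot fails one check or the other."""
    valid = []
    for slot in offered_slots:
        try:
            datetime.fromisoformat(slot)
        except ValueError:
            continue  # not even a syntactically valid timestamp
        if slot in real_calendar_openings:
            valid.append(slot)
    return valid

# One real slot, one hallucinated one (hour 25 doesn't exist):
print(validate_offer(["2025-07-01T10:00", "2025-07-09T25:00"]))
# With free-form English ("I'm free Tuesday-ish") there is nothing to check.
```

An English email saying "I'm free Tuesday afternoon" gives the receiving agent nothing mechanical to validate against.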
Perhaps something like AWS Lex could do a primitive version of this. Lex is essentially a task-oriented conversational service: you say something like “Schedule a meeting with Jim next week”, Lex recognizes that you want to do a scheduling task, and it triggers a Lambda that knows the address of Jim’s agent and kicks off the steps outlined above.
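A fulfillment Lambda for that flow might look roughly like this. The event parsing follows the general shape of the Lex V2 Lambda contract as I understand it, but the `ScheduleMeeting` intent, the `Person`/`TimeFrame` slots, the agent registry, and `forward_to_agent` are all placeholders I made up for this sketch:

```python
# Hypothetical agent directory; in reality this lookup is its own hard problem.
AGENT_REGISTRY = {"Jim": "agent://jim.example/calendar"}

def forward_to_agent(address: str, timeframe: str) -> str:
    # Placeholder: this is where the human-written scheduling
    # back-and-forth from earlier in the post would actually run.
    return f"Sent a meeting request to {address} for {timeframe}"

def lambda_handler(event, context):
    """Fulfillment hook invoked when Lex resolves the scheduling intent."""
    intent = event["sessionState"]["intent"]
    slots = intent["slots"]
    person = slots["Person"]["value"]["interpretedValue"]
    when = slots["TimeFrame"]["value"]["interpretedValue"]

    message = forward_to_agent(AGENT_REGISTRY[person], when)

    # Close the Lex session and report fulfillment back to the user.
    return {
        "sessionState": {
            "dialogAction": {"type": "Close"},
            "intent": {"name": intent["name"], "state": "Fulfilled"},
        },
        "messages": [{"contentType": "PlainText", "content": message}],
    }
```

Note that the LLM-ish part ends at intent and slot extraction; everything after that is deterministic human-written code, which is exactly the trade-off discussed next.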
But then the AI agent isn’t doing the scheduling; it’s just triggering human-written code to do the back and forth. That’s safer for now, but it lacks the potential upside of AI being able to learn and adapt faster than humans. It’s more of a wrapper for human-written code. Is that a good thing or a bad thing? Well, if it gets the job done and doesn’t cause any harm, does it matter?
This is all hurting my head. I could go on, and likely will in later posts. For now, if you are interested in learning more about the AWS Lex approach, you can get on the waiting list for my new book, The CTO's Guide to AI/ML on AWS.
Question:
What do you think communications protocols between AI agents would look like?