How do you merge the knowledge of 2 LLM agents so you can share data at scale


What if company A is using one LLM Agent and company B, which uses an entirely different LLM agent, buys company A and wants to merge the two knowledge bases?

This is fascinating because in the future these technical decisions will likely directly affect the valuation of your business to a possible acquirer. If the data is trapped in LLM agent A then it is much less valuable to the acquiring company if it is difficult or impossible to transfer to their own LLM agents. So make these decisions with care so you don’t cut a few zeros off the valuation of your business in the future.

You might say, "Just put LLM Agent A in a chat room with LLM Agent B and let them talk." You would burn a lot of GPU hours doing this, and depending on your setup, it might not work at all.

It depends on how the LLMs are set up. If you just take a generic LLM with no fine-tuning and slap it on top of your data source, migration is not difficult. I will show you how to do this with AWS at the end of the post.

But if you have a fine-tuned LLM, it's much more complicated. Neural networks are tangled messes, so pulling out the exact functionality you want to keep, leaving the rest behind, and transferring it to an LLM that is not structured with the exact same input/output neurons would be difficult, time-consuming, and likely not feasible.

I suppose it would be possible to fine-tune sub-modules for highly specialized tasks that the more generic agentic LLM calls only when it needs that task done, and those sub-modules would be easy enough to transfer.
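As a rough sketch, that delegation pattern can be as simple as a registry mapping task names to specialized models. Everything here is hypothetical (the task names, the stand-in model functions); the point is that the registry plus the sub-models are the transferable unit, not the whole tangled network.

```python
def generic_model(prompt):
    # Stand-in for the general-purpose agentic LLM.
    return f"generic answer to: {prompt}"

def legal_review_model(prompt):
    # Stand-in for a sub-module fine-tuned only for contract review.
    return f"legal review of: {prompt}"

# The registry is the part you could hand to an acquirer:
# specialized sub-models keyed by the task they were tuned for.
SPECIALISTS = {"legal_review": legal_review_model}

def route(task, prompt):
    # Fall back to the generic agent when no specialist exists.
    model = SPECIALISTS.get(task, generic_model)
    return model(prompt)
```

A new parent company could plug these sub-modules into its own generic agent by re-pointing the registry, without touching either base model's weights.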

Another option, assuming the LLM itself was fine-tuned, would be to take the same training data you used to fine-tune agent A and use it to fine-tune agent B. This could work, but it would require some expensive GPUs to re-train the second agent's model. You would also have to merge agent A's training data with agent B's original training data, and you risk overtraining the model so that it forgets what it knew before (catastrophic forgetting).
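Merging the two training sets is the mechanical part. A minimal sketch, assuming each dataset is a list of prompt/response dicts (the schema is hypothetical; real fine-tuning formats vary by provider), is just a concatenation with exact-duplicate removal:

```python
import json

def merge_training_data(agent_a_examples, agent_b_examples):
    """Combine two fine-tuning datasets, dropping exact duplicates.

    Each example is assumed to look like
    {"prompt": ..., "response": ...} -- a made-up schema for illustration.
    """
    seen = set()
    merged = []
    for example in agent_a_examples + agent_b_examples:
        # Serialize with sorted keys so identical examples compare equal.
        key = json.dumps(example, sort_keys=True)
        if key not in seen:
            seen.add(key)
            merged.append(example)
    return merged
```

The hard part is not this merge, it is the re-training run afterwards and validating that agent B still performs on its original tasks.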

Have I sufficiently melted your brain? Let’s move on to a more concrete setup that we can achieve today fairly easily.

Simple AWS Setup:

I just started digging into AWS Kendra for my new ebook, The CTO's Guide to AI/ML on AWS, and it is a pretty amazing piece of technology. You can tie AWS Q, which is AWS's business intelligence agent, to AWS Kendra and get access to basically any data source on AWS or beyond.

The data still persists in its original data store, so swapping LLMs long term shouldn't be such an issue. For developers, think of the LLM like the application layer: imagine you decided one day to change the application layer from PHP to Java. Both application layers can still query the data sources and get all the data. And just as the two application layers might treat that data differently based on how they were coded, there may be some nuances to how the new agent processes the old data based on how it was trained.
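The application-layer analogy can be sketched in a few lines: the data store is the stable asset, and either agent is a swappable layer on top of it. The names and the record below are purely illustrative.

```python
# Illustrative sketch: the data store is the durable asset;
# agents are interchangeable layers that query it.

DATA_STORE = {"q3_revenue": "$1.2M"}  # made-up record

class AgentA:
    """Company A's original 'application layer' over the data."""
    def answer(self, key):
        return f"Agent A says: {DATA_STORE[key]}"

class AgentB:
    """The acquirer's agent -- a different layer, same data source."""
    def answer(self, key):
        return f"Agent B says: {DATA_STORE[key]}"

# Swapping AgentA for AgentB never touches the data layer.
```

This is the property that protects your valuation: the acquirer replaces the layer, not the knowledge.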

Wrapping it Up:

Just a reminder: I do NOT have a PhD in machine learning; I am just the guy who has to host this stuff at scale on AWS. If you are an AI/LLM expert, I would love to hear your thoughts on this, so please feel free to email, comment, etc.

Question:

How are you designing your LLM agents?

PS:

If you want to chat more about this you should come check out my Free Tech Talk on How To Host AI/ML on AWS this Friday at 1PM CT.