The answer to the token problem, he said, is the same: “Anything in systems can be solved with caching and indirection.”
DevRev, for example, is building a memory layer that sits between AI agents and primary data sources, such as Salesforce or ERP records, to cut token load and make data movement more efficient. The layer holds a knowledge graph with answers to common agent questions and runs on cheaper CPUs, avoiding more costly GPU cycles.
Sending agents straight at systems like ServiceNow and Salesforce “will burn a lot more tokens. It’s also not precise. And finally, it’s not safe enough where I can roll it back in case an agent has committed a mistake,” Pandey said.
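To make the caching-and-indirection idea concrete, here is a minimal sketch in Python. It is not DevRev's implementation: the `MemoryLayer` class, the `slow_crm_lookup` function, and the question-normalization rule are all hypothetical, and a plain dictionary stands in for the knowledge graph. It shows only the general pattern, in which an agent asks the layer rather than the primary system, and repeated questions are answered from memory.

```python
from typing import Callable, Dict


class MemoryLayer:
    """Hypothetical cache that sits between an agent and a primary data source."""

    def __init__(self, fetch_from_source: Callable[[str], str]) -> None:
        # Indirection: the agent never calls the source directly, only this layer.
        self._fetch = fetch_from_source
        # Assumption: a dict stands in for the knowledge graph of common answers.
        self._answers: Dict[str, str] = {}
        self.source_calls = 0

    def ask(self, question: str) -> str:
        # Normalize case and whitespace so near-identical questions share an entry.
        key = " ".join(question.lower().split())
        if key not in self._answers:
            # Cache miss: pay for one expensive lookup, then remember the answer.
            self.source_calls += 1
            self._answers[key] = self._fetch(question)
        return self._answers[key]


def slow_crm_lookup(question: str) -> str:
    # Placeholder for a costly call to a system of record (hypothetical).
    return f"record for: {question}"


layer = MemoryLayer(slow_crm_lookup)
first = layer.ask("Who owns account 42?")
second = layer.ask("who owns   account 42?")  # served from the cache
```

In this sketch the second question never reaches the system of record, so `layer.source_calls` stays at 1. A real layer would also need invalidation when the underlying records change, which is omitted here.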

