Centralization vs. Decentralization in AI: Navigating the Trade-offs
Reply to a question from @meow about concerns that AI is famously centralized. Our response:
Yes, AI is very centralized right now! As we’ve been building this project, the question keeps coming up: where is the line between the inputs to the AI (the data we crawl), the augmentation (vector DBs), the training, and the actual operation of the AI (e.g. GPUs, llama.cpp, ONNX, etc.)?
We’re addressing this progressively and trying to walk a practical line. There are certainly projects out there that aspire to purely distributed AI training. We think that is a laudable goal, but probably not practical yet: AI technology is evolving very quickly, fully distributed software is very hard to get right, and there’s an impedance mismatch between the two right now. We fully expect the space will get there eventually, but likely years from now, an eternity in Web3 terms. Furthermore, when it comes to foundational models, a lot is already being done in the open, if not yet in the Web3 space. Open-source models that have “learned to speak” are very, very good, so there is not much value in teaching a model to do so from scratch. We don’t just mean this in the Web3 sense: it makes sense for almost no AI company to train models completely from scratch these days.
On the other hand, the inputs are among the most difficult parts to acquire, and we can confidently produce them today. With these inputs we can build specializations and augmentations that effectively teach the models new things, both via fine-tuning (e.g. LoRAs) and via retrieval augmentation (RAG). Right now we run these parts on centralized infrastructure, but we are strongly motivated to run them on end users’ devices. The world has already produced truly spectacular tools for running advanced models on modest hardware (llama.cpp, ONNX, etc.).
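To make the augmentation piece concrete, here is a minimal sketch of the retrieval side of RAG in Python. It is a toy illustration rather than our pipeline: the bag-of-words “embedding” and in-memory index stand in for a real embedding model and vector DB, and the documents are invented examples.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a neural embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# The "customer artifacts": documents the base model was never trained on.
docs = [
    "Our API rate limit is 100 requests per minute per key.",
    "Support hours are 9am to 5pm UTC, Monday through Friday.",
]
index = [(doc, embed(doc)) for doc in docs]  # stand-in for a vector DB

def augment(question: str, k: int = 1) -> str:
    # Retrieve the top-k most relevant documents and prepend them to the prompt,
    # effectively teaching the model new facts without any retraining.
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    context = "\n".join(doc for doc, _ in ranked[:k])
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

print(augment("What is the API rate limit?"))
```

The assembled prompt then goes to whatever generation backend is in use; nothing about the model itself changes, which is what makes augmentation such a natural first candidate for moving onto user devices.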
As a team we want to focus on delivering. Right now the benefit of running the generation phase of the model locally is smaller than the benefit of running the augmentation phase locally (which lets the “customer artifacts” live locally with the customer). This keeps the customer-specific parts away from the general-purpose parts.
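Here is a sketch of what that split looks like in practice, with the assumptions labeled loudly: the retrieval index stays on the user’s device, and only the assembled prompt is posted to a centralized generation service. The endpoint URL and the response shape below are hypothetical placeholders, not a real API.

```python
import requests

GENERATION_URL = "https://api.example.com/v1/generate"  # hypothetical placeholder

def augment_locally(question: str, local_index: list[str]) -> str:
    # Stand-in for on-device retrieval (see the RAG sketch above); the
    # customer-specific documents never leave the device.
    context = "\n".join(local_index)  # a real client would rank and take the top-k
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

def answer(question: str, local_index: list[str]) -> str:
    # 1. Augmentation runs locally, keeping customer artifacts with the customer.
    prompt = augment_locally(question, local_index)
    # 2. Only the general-purpose generation step touches centralized infrastructure.
    resp = requests.post(GENERATION_URL, json={"prompt": prompt}, timeout=30)
    resp.raise_for_status()
    return resp.json()["text"]  # assumed response shape
```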
The next step (not immediately on the roadmap, but clearly in our future) is to run generation locally too, if for no other reason than that it would save us a lot of money not to be grinding out text responses on our own infrastructure, on top of the obvious user benefits in both speed and privacy. Yet, in the choice between keeping up momentum and delivering versus the complexity of making generation run reliably on thousands of different device configurations, delivering great products and gaining users wins right now.
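For a sense of what local generation looks like today, here is a minimal sketch using llama-cpp-python, the Python bindings for the llama.cpp project mentioned above. The model path is a placeholder; any GGUF-quantized model small enough for the device would do.

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# Placeholder path: any GGUF-quantized model the device can hold.
llm = Llama(model_path="models/example-7b.Q4_K_M.gguf", n_ctx=2048)

prompt = (
    "Context:\nOur API rate limit is 100 requests per minute per key.\n\n"
    "Question: What is the API rate limit?\nAnswer:"
)
out = llm(prompt, max_tokens=64, stop=["\n\n"])
print(out["choices"][0]["text"])
```

The hard part is not this happy path but, as noted above, making it behave across thousands of device configurations with wildly different memory and compute budgets.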
Finally, more complex training is probably out of scope for the foreseeable future. We are certainly following this space closely, and if distributed training becomes more practical, or more necessary for some reason, we would elevate it. For now, though, we see fully distributed training as the sort of trap that can mire a project in incredible technical complexity for an unclear and likely unnecessary outcome.
So, when it comes to the laudable goal of fully decentralized AI, we’re focusing on delivering real and achievable gains in the near term, while staying open to decentralizing additional parts as that becomes practical.
The aggregation and abstraction of DePIN make it much easier for the average user (not the average crypto expert, but the average user) to simply tap install and have a gateway to these ecosystems on their device, living in their pocket 24/7. And in the process, we onboard new users to owning a wallet and experiencing the empowering benefits of Web3. We assume the advantages here are clear, but we’re happy to expand on them if you like.