Running models on edge devices is a win-win for users and providers

Generative AI is expensive, very expensive, for providers to run. This is largely due to the march towards ever bigger models delivering ever better results, and the relative neglect of the efficiency of those models. However, we have also seen indications that there are a lot of improvements to be had in memory efficiency, which opens up the opportunity to run accurate models on edge devices. Indeed, we are seeing this happen with Microsoft's Copilot+ PCs and Apple's AI initiatives.

Running models on the device does not only benefit providers by offloading the cost onto users. It also benefits users: they no longer have to send potentially sensitive queries to the provider, the experience improves because network latency is cut out, and the models keep working offline or over poor connections.

Moreover, with the seemingly renewed interest in product offerings, I hope this can spark more interest in researching the topic, which I think is currently under-researched. Caring about the efficiency of models also has other benefits, such as lower power consumption, which eases the transition to green energy.
