NimbleEdge at PyTorch Conference 2025: Accelerating On-Device AI for Billions
Last week, our NimbleEdge team attended PyTorch Conference 2025. The two-day event brought together researchers, developers, startups, hardware vendors, and ecosystem builders, all dedicated to the future of AI deployed at the edge.
We are proud that NimbleEdge was selected for the Startup Showcase, and that our poster session, “Speedy Neurons with Dynamic Sparsity for Fast, Lean LLMs”, generated meaningful technical interest. These achievements reinforce our mission: scaling AI to billion+ users via on-device solutions.
Here are some of the key announcements, themes and moments that align directly with our mission.
On October 22nd, the PyTorch team announced the ExecuTorch 1.0 release in a blog post titled “Introducing ExecuTorch 1.0: Powering the next generation of edge AI”. The post describes how ExecuTorch enables production-ready deployment of PyTorch models directly onto mobile, embedded, and desktop devices, across a wide range of hardware backends (CPU, GPU, NPU) and model types.
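For a sense of what that deployment path looks like in practice, here is a minimal sketch of the export flow as described in the public ExecuTorch documentation. The toy model, input shapes, and file name are illustrative assumptions, and exact API details may vary across versions.

```python
import torch
from torch.export import export
from executorch.exir import to_edge

# Illustrative toy model; any export-compatible nn.Module works here.
class TinyModel(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x @ x.T)

model = TinyModel().eval()
example_inputs = (torch.randn(4, 4),)

# Capture the model graph, lower it to the Edge dialect, then
# serialize a .pte file that the on-device runtime can load.
exported = export(model, example_inputs)
edge_program = to_edge(exported)
et_program = edge_program.to_executorch()

with open("tiny_model.pte", "wb") as f:
    f.write(et_program.buffer)
```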
The release blog also includes “Success Stories” from companies deploying edge AI with ExecuTorch. Among them, NimbleEdge was highlighted as a startup leveraging the runtime to accelerate on-device conversational AI and optimize inference for a large-scale user base.
Technical tracks such as “Deep Learning Compilers, Kernel Authoring & Accelerators” and “Edge: mobile, embedded, IoT” emphasized hardware and edge AI as a core focus of the PyTorch ecosystem.
The Startup Showcase (Oct 21) offered an opportunity for emerging AI-first companies to engage with the ecosystem and highlight their solutions.
Ray joined the PyTorch Foundation as a hosted project, providing a unified open source AI compute stack and reducing the complexity of running AI in production.
The focus on on-device inference and frameworks like ExecuTorch validates our narrative: taking conversational and generative AI off the cloud and onto the device for privacy, latency, and scale advantages.
Our poster on contextual sparsity aligns closely with the highlighted themes around compilers/kernels and hardware/edge deployment.
Being featured as a success story in the ExecuTorch 1.0 announcement adds third-party validation of our technical direction and ecosystem relevance.
Startup Showcase: Being selected among a short list of promising AI startups gave NimbleEdge a visible platform in the PyTorch/edge-AI ecosystem.
Poster Session: Our poster, “Speedy Neurons with Dynamic Sparsity for Fast, Lean LLMs”, drew interest from engineers, hardware vendors, and ecosystem partners; a sketch of the core idea follows below.
Networking & Ecosystem Engagements: We connected with multiple runtime/accelerator vendors and fellow startups building on PyTorch/ExecuTorch, signaling a noticeable shift from cloud to on-device AI.
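For readers curious about the idea behind the poster: contextual sparsity exploits the observation that, for any given input, only a small subset of a feed-forward layer’s neurons contribute meaningfully, so the rest can be skipped. Below is a minimal PyTorch sketch of that selection step. It is a toy illustration, not our production kernel: real systems use a lightweight predictor to guess the active neurons before the matmul, whereas this sketch uses an exact top-k for clarity, and all shapes and the value of k are assumptions.

```python
import torch

def contextual_sparse_ffn(x: torch.Tensor,
                          w_up: torch.Tensor,    # (d_model, d_ff)
                          w_down: torch.Tensor,  # (d_ff, d_model)
                          k: int) -> torch.Tensor:
    """Run an FFN block activating only the k neurons most relevant to x."""
    pre_act = x @ w_up                   # pre-activations for all d_ff neurons
    idx = pre_act.abs().topk(k).indices  # keep the k strongest for this input
    h = torch.relu(pre_act[idx])         # activate only the selected neurons
    return h @ w_down[idx]               # project back with the matching rows

# Example: only 64 of 2048 hidden neurons fire for this input.
x = torch.randn(512)
w_up, w_down = torch.randn(512, 2048), torch.randn(2048, 512)
y = contextual_sparse_ffn(x, w_up, w_down, k=64)  # y has shape (512,)
```

In a real deployment, predicting the active set up front lets both projections skip inactive rows entirely, which is where the latency wins come from on memory-bound edge hardware.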
Attending PyTorch Conference 2025 reaffirmed our conviction: the AI ecosystem is moving decisively toward efficient, on-device, real-time AI at scale, and NimbleEdge is leading that shift. Our mission to scale AI to billion+ users under a privacy-first, latency-first on-device model is timely, credible, and increasingly relevant.
Thank you to the PyTorch team and all the people we met at the conference. Let’s build on the momentum.

The introduction of Large Language Models (LLMs) and Generative AI (GenAI) has been a major milestone in the field of AI. With models encapsulating vast amounts of world knowledge, their ability to reason and understand human language has unlocked unprecedented possibilities.

Critical use cases served by on-device AI across industries

It is a valid question (isn’t it?): why should we put effort into reducing the size of an SDK when mobile storage capacities keep increasing? Surely a few MBs hardly matter when the device has multiple hundreds of gigabytes of storage?