How NimbleEdge enables optimized real-time data ingestion with on-device event stream processing

Neeraj Poddar
Published on
December 18, 2024

Introduction

In our previous blog, we covered how NimbleEdge captures event streams with our on-device data warehouse capabilities, including provisions to easily transfer data to cloud storage. We also discussed how these capabilities enable AI teams to quickly build training datasets for session-aware personalization models, which drive significant conversion and engagement uplift in apps across verticals.

In this blog, we continue building on that foundation and cover how event stream capture can be further optimized using on-device data processing and filtering before transferring to cloud storage. 

Potential issues: Coarse-grained events and large event payloads

NimbleEdge’s on-device data warehouse solves many key challenges in data collection for session-aware models and is usually adequate on its own for customers with well-classified user event streams, where the size of each event payload is not very large. 

However, for many apps, user events are coarsely defined and payloads can be very large. For example, when event payloads are responses from backend APIs, they may include extensive metadata or detailed product catalogues that are irrelevant to session-aware personalization, leading to oversized event payloads. In such cases, transferring event streams to cloud servers becomes both costly and inefficient, replicating some of the same issues involved in data transfer from CDPs: high costs (in this case for storing large event payloads) and additional pre-processing steps before the data can be used to train session-aware models (e.g. filtering out irrelevant data points). In these pre-processing steps, AI teams often want to create new event types containing only the payload required for experimentation. The key is to give AI teams the flexibility to iterate quickly and deliver value, without forcing them through multiple hoops that slow down experimentation while end users continue to suffer degraded experiences.
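
To make the problem concrete, here is a hypothetical example of what an oversized event payload might look like when it is simply the response of a backend API call; only a handful of fields are useful for session-aware personalization. The event and field names are assumptions for illustration, not an actual customer schema.

```python
# Hypothetical oversized event payload sourced from a backend API response;
# only a few fields matter for session-aware personalization.
raw_event = {
    "type": "restaurant_viewed",
    "payload": {
        # Fields a session-aware model actually uses:
        "restaurant_id": "r_1042",
        "rating": 4.3,
        "timestamp": 1734518400,
        # Irrelevant bulk that inflates transfer and storage costs:
        "full_menu": [],        # hundreds of catalogue entries in practice
        "api_metadata": {},     # request tracing, cache headers, etc.
        "static_assets": [],    # image URLs, banners
    },
}

# The subset actually worth transmitting to cloud storage:
relevant_fields = ["restaurant_id", "rating", "timestamp"]
```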

New NimbleEdge feature: On-device Event Stream Processing & Filtering

To address this challenge, NimbleEdge introduces its latest feature: on-device event stream processing & filtering, driven by Python scripts, before relay to cloud storage!

For apps with large event payloads, NimbleEdge now also enables on-device processing of event streams before transfer to cloud storage. This feature ensures that only the essential data points are transmitted, reducing data transfer volumes significantly and resulting in lower cloud storage and processing costs. Additionally, preprocessing data on-device simplifies the downstream workflow for AI teams by providing structured, analysis-ready datasets.

With the introduction of this new feature, AI teams can now:

  • Use customizable Python scripts to process and filter event streams on-device before transmitting to cloud storage, allowing AI teams to define tailored preprocessing rules directly on user devices. A sample script in the context of food delivery apps is sketched after this list.
  • Easily access and explore captured events on the NimbleEdge user portal, enabling intuitive navigation and streamlined data discovery.
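
Below is a minimal sketch of such a script in a food-delivery context. The hook name process_event, the event names, and the retained field lists are illustrative assumptions made for this example; the actual NimbleEdge scripting interface and your app's event schema will differ.

```python
# Illustrative on-device event processing script (hypothetical hook and
# event names; the actual NimbleEdge scripting interface may differ).

# Events relevant to session-aware personalization; everything else is dropped.
RELEVANT_EVENTS = {"order_item_clicked", "restaurant_viewed", "cart_updated"}

# Fields to retain per event type, so bulky API-response payloads
# (full catalogues, metadata blobs) are never shipped to cloud storage.
FIELDS_TO_KEEP = {
    "order_item_clicked": ["item_id", "restaurant_id", "price", "timestamp"],
    "restaurant_viewed": ["restaurant_id", "cuisine", "rating", "timestamp"],
    "cart_updated": ["cart_value", "item_count", "timestamp"],
}


def process_event(event_type: str, payload: dict):
    """Filter and trim a raw event before it is relayed to cloud storage.

    Returns None to drop the event on-device, or a dict describing a new,
    compact event with only the fields needed for experimentation.
    """
    if event_type not in RELEVANT_EVENTS:
        return None  # drop coarse-grained or irrelevant events on-device

    keep = FIELDS_TO_KEEP[event_type]
    trimmed = {field: payload[field] for field in keep if field in payload}

    # Emit a new, experiment-ready event type with a compact payload.
    return {"type": f"{event_type}_slim", "payload": trimmed}
```

In this sketch, returning None drops an event entirely on-device, while returning a dict relays only a trimmed, experiment-ready payload; both the drop list and the per-event field whitelist can be tailored to each experiment.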

Together, these capabilities maximize data collection flexibility for AI teams, significantly reducing the effort and cost involved in building training datasets for session-aware models.

Impact

Processing user events on-device before sending to cloud storage drives various advantages for AI teams, including:

  • Session-aware modeling: With easy access to processed user event streams, AI teams can readily build session-aware models, enabling an 8-10% improvement in ranking model performance and driving uplift in conversion and order value
  • Faster iteration and lower engineering bandwidth: Data ingested through NimbleEdge is ready to use for analysis, with no further processing needed to convert it into a usable format. Session-aware use cases can therefore be brought into production much faster, with far less effort spent readying data for analysis
  • Minimized transfer and processing costs: Directly transferring clickstream data from user devices to cloud servers already circumvents the massive transfer costs charged by CDPs. On-device event stream processing additionally avoids the large cost of storing unprocessed event streams, as well as the cost of processing those streams into the formats required for training session-aware models

This new functionality underscores NimbleEdge's dedication to providing highly scalable, cost-efficient solutions for session-aware personalization, and to empowering AI teams to experiment and build models rapidly.

To learn more about how NimbleEdge drives real-time, AI-driven personalized experiences at scale, visit nimbleedge.com or reach out to contact@nimbleedge.com.
