How NimbleEdge enables on-device event stream capture to power session-aware AI

Nilotpal Pathak

What is session-aware personalization?

Mobile apps across verticals today offer a staggering variety of choices: thousands of titles on OTT apps, hundreds of restaurants on food delivery apps, and dozens of hotels on travel booking apps. This abundance should be a convenience for users, but the sheer breadth of the assortment often causes choice paralysis, contributing to low conversion and high customer churn for apps.

Many apps have turned to personalization to tackle this challenge, delivering personalized listings on the app homepage, in search results, and in recommendations, based on the customer’s historical preferences. This approach has been partially successful, but most enterprises rely only on historical customer data for personalization, missing out on rich insights from real-time user interactions that more accurately indicate a user’s immediate intent.

That is where session-aware personalization comes in. Session-aware personalization leverages users’ real-time, in-session interactions (e.g. clicks, search queries, cart additions) to understand their current intent, and accordingly tailor their app experience. Leading apps like Netflix, Instacart, and Alibaba’s Taobao have deployed such systems, driving strong conversion and engagement benefits. At NimbleEdge, we take this further by enabling session-aware personalized experiences directly on users’ mobile devices using our on-device AI platform, making real-time personalization efficient and endlessly scalable.

Building session-aware personalization models

Session-aware personalization is performed using AI/ML models that take the user’s real-time clickstream data as input and return outputs such as personalized rankings. These outputs form the basis for real-time tailored user experiences (e.g. top search results for a query, top items to display on a user’s homepage feed). Creating such models is therefore the first step towards enabling session-aware personalization; the models are then deployed and executed on users’ mobile devices using NimbleEdge.
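
To make the input/output contract concrete, here is a minimal sketch of session-aware re-ranking. It is a toy heuristic, not a trained model and not NimbleEdge’s method: the event weights, the `decay` factor, and the `session_intent` and `rerank` helpers are all assumptions made up for illustration. Recent, high-intent events (such as cart additions) boost candidates in the same category.

```python
from collections import defaultdict

# Hypothetical weights: stronger signals count more toward current intent.
EVENT_WEIGHTS = {"click": 1.0, "search": 1.5, "add_to_cart": 3.0}


def session_intent(events, decay=0.8):
    """Turn an ordered list of (event_type, category) pairs into category scores.

    The most recent event has weight multiplier 1; each step further back
    in the session is multiplied by `decay` once more.
    """
    scores = defaultdict(float)
    for age, (event_type, category) in enumerate(reversed(events)):
        scores[category] += EVENT_WEIGHTS.get(event_type, 0.0) * decay ** age
    return dict(scores)


def rerank(candidates, events, decay=0.8):
    """Sort candidates by historical base score plus in-session intent."""
    intent = session_intent(events, decay)
    return sorted(
        candidates,
        key=lambda item: item["base_score"] + intent.get(item["category"], 0.0),
        reverse=True,
    )


# In-session events, oldest first.
session = [("click", "pizza"), ("search", "sushi"), ("add_to_cart", "sushi")]
candidates = [
    {"id": "r1", "category": "pizza", "base_score": 2.0},
    {"id": "r2", "category": "sushi", "base_score": 1.0},
    {"id": "r3", "category": "burgers", "base_score": 2.5},
]
ranked_ids = [item["id"] for item in rerank(candidates, session)]
print(ranked_ids)
```

A production model would learn these weights from clickstream data rather than hard-coding them, which is why the training data requirements discussed in this article matter.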

Building such a model involves several challenges, which make session-aware modeling a highly time- and resource-intensive effort:

  • Need for massive volume of data:
    • Session-aware modeling exercises require a huge amount of granular user clickstream data - such as clicks, search queries, cart additions, purchase completions, and more - for training.
    • This data captures the sequential nature of user behavior, enabling the model to understand user intent and how it correlates with user interactions in and across sessions.
    • At scale, this can mean capturing billions of interactions for apps with high traffic, such as e-commerce apps processing millions of user sessions daily.
    • A large training dataset is essential to ensure the model can generalize effectively across diverse behaviors and deliver personalized, relevant responses that enhance user engagement and experience.
  • Data accuracy and pre-processing:
    • Once collected, the data must be cleaned to remove incomplete or erroneous records (e.g. accidental clicks).
    • The massive raw user event dataset then needs to be transformed into a format suitable for further analysis, for example by filtering user events down to the data points of interest, or by enriching the data with contextual information such as product categories or prices.
  • Bandwidth and cost-intensiveness:
    • Performing these accuracy checks and transformations is highly time-consuming for ML engineers, whose time and expertise come at a premium. Combined with infrastructure expenses for storing and transferring large datasets, these processes can quickly overwhelm budgets and timelines.
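
The cleaning and transformation steps listed above can be sketched in a few lines. This is an illustrative example under stated assumptions, not a prescribed pipeline: the field names, the `dwell_s` attribute, the one-second threshold for "accidental" clicks, and the `clean_and_enrich` helper are all hypothetical.

```python
# Assumed event schema; real clickstream schemas will differ.
REQUIRED_FIELDS = ("user_id", "session_id", "event_type", "ts")
FOCUS_EVENTS = {"click", "search", "add_to_cart", "purchase"}
MIN_DWELL_SECONDS = 1.0  # assumption: shorter clicks are treated as accidental


def clean_and_enrich(raw_events, catalog):
    """Drop incomplete or irrelevant events, then attach catalog context."""
    cleaned = []
    for event in raw_events:
        # 1. Cleaning: skip records with missing required fields.
        if any(event.get(field) is None for field in REQUIRED_FIELDS):
            continue
        # 2. Filtering: keep only the event types of interest.
        if event["event_type"] not in FOCUS_EVENTS:
            continue
        # 3. Cleaning: skip clicks that look accidental (very short dwell time).
        #    Clicks without a dwell measurement are kept.
        dwell = event.get("dwell_s", MIN_DWELL_SECONDS)
        if event["event_type"] == "click" and dwell < MIN_DWELL_SECONDS:
            continue
        # 4. Enrichment: add product category and price where known.
        item = catalog.get(event.get("item_id"), {})
        cleaned.append(
            {**event, "category": item.get("category"), "price": item.get("price")}
        )
    return cleaned


catalog = {"p1": {"category": "shoes", "price": 59.0}}
raw_events = [
    {"user_id": "u1", "session_id": "s1", "event_type": "click", "item_id": "p1", "ts": 1, "dwell_s": 5.0},
    {"user_id": "u1", "session_id": "s1", "event_type": "click", "item_id": "p2", "ts": 2, "dwell_s": 0.2},
    {"user_id": None, "session_id": "s1", "event_type": "purchase", "item_id": "p1", "ts": 3},
    {"user_id": "u1", "session_id": "s1", "event_type": "scroll", "item_id": None, "ts": 4},
    {"user_id": "u1", "session_id": "s1", "event_type": "add_to_cart", "item_id": "p1", "ts": 5},
]
cleaned = clean_and_enrich(raw_events, catalog)
print(len(cleaned), [e["event_type"] for e in cleaned])
```

Even this small example needs four separate rules, which hints at how much engineering time the same work takes on billions of raw events.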

In the next section, we explore how existing tools for clickstream data collection can exacerbate these challenges, often falling short in flexibility and customization.

Data collection for session-aware models and the challenge with CDPs

We’ve already established that creating large, accurate, clickstream datasets is essential for building session-aware personalization models. We’ll now shift our attention to a popular current approach for data collection (i.e. using CDPs), and associated challenges.

Customer Data Platforms (CDPs), such as Segment, Amperity, and CleverTap, consolidate and organize data from various sources to create unified user profiles. They integrate with customers’ apps, websites, CRM software, and marketing automation tools, and keep a record of users’ demographic data, marketing campaign data, and behavioral data such as product-user interactions. This data is then primarily used to inform customer segmentation, personalization, and analytics for marketing campaigns.

Given these use cases, the users and buyers of these platforms are usually marketing teams. While CDPs collect clickstream data that is valuable for ML engineers, they are not purpose-built to support session-aware modeling, which leads to several major challenges when using CDPs for this task:

  • Data transfer costs: Transferring data from CDPs to your own cloud storage is often slow as well as prohibitively expensive for the large datasets required to train session-aware models
  • Format inflexibility: The data formats for event streams in CDPs are not optimized for analysis by ML teams, requiring significant effort on transformation or ETL tasks to make them usable for training
  • Low customizability: CDPs offer ML teams limited control over data formats, as their primary users are marketing and front-end teams. Since data collection exercises for ML use-cases often run only for short periods and needs vary from use-case to use-case, CDP vendors are especially reluctant to offer customizations for ML teams.

To illustrate these challenges, we share a quote below by the VP of AI and Data Science at a leading food delivery app, highlighting why they have struggled to use clickstream data from their CDP for session-aware modeling use-cases.

[Image: quote from the VP of AI and Data Science at a leading food delivery app]

The solution: On-device clickstream data collection with NimbleEdge

Given our focus on session-aware personalization, NimbleEdge offers an on-device data warehouse that can help circumvent the challenges associated with clickstream data collection from CDPs. This on-device data warehouse captures and securely stores real-time user interactions directly on users’ devices. These interactions, such as clicks, search queries, and cart additions, can then be transferred to the cloud storage of your choice and used to train session-aware models.

Purpose-built for ML engineers, this solution enables ML teams to quickly collect the requisite clickstream data with high accuracy, while eliminating the CDP-related costs of transferring data to cloud storage.
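
To illustrate the general capture-then-transfer pattern, here is a minimal sketch of a local event buffer built on Python’s standard `sqlite3` module. It is not the NimbleEdge SDK or its API: the `LocalEventStore` class, its method names, and the table layout are hypothetical, and the actual upload to cloud storage is left out.

```python
import json
import sqlite3
import time


class LocalEventStore:
    """Hypothetical on-device event buffer (illustration only)."""

    def __init__(self, path=":memory:"):
        # On a real device this would be a file in the app's private storage.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "event_type TEXT NOT NULL, payload TEXT NOT NULL, ts REAL NOT NULL)"
        )

    def add_event(self, event_type, payload):
        """Record one user interaction locally."""
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT INTO events (event_type, payload, ts) VALUES (?, ?, ?)",
                (event_type, json.dumps(payload), time.time()),
            )

    def next_batch(self, limit=100):
        """Return the oldest pending events, ready to send to cloud storage."""
        rows = self.db.execute(
            "SELECT id, event_type, payload, ts FROM events ORDER BY id LIMIT ?",
            (limit,),
        ).fetchall()
        return [
            {"id": r[0], "event_type": r[1], "payload": json.loads(r[2]), "ts": r[3]}
            for r in rows
        ]

    def mark_uploaded(self, batch):
        """Delete events once the upload has been confirmed."""
        with self.db:
            self.db.executemany(
                "DELETE FROM events WHERE id = ?", [(e["id"],) for e in batch]
            )

    def pending_count(self):
        return self.db.execute("SELECT COUNT(*) FROM events").fetchone()[0]


store = LocalEventStore()
store.add_event("click", {"item_id": "p1"})
store.add_event("search", {"query": "running shoes"})
store.add_event("add_to_cart", {"item_id": "p1"})

batch = store.next_batch(limit=2)
# Upload `batch` to your cloud storage here (transport is out of scope).
store.mark_uploaded(batch)
print(store.pending_count())
```

Deleting events only after a confirmed upload means a failed transfer can simply be retried with the same batch.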

In the diagram below, we share a high-level overview of how this system operates:

[Diagram: high-level overview of on-device clickstream data collection with NimbleEdge]

Impact

NimbleEdge’s on-device data warehouse lets ML teams collect user event stream data at scale, unlocking the following key benefits:

  • Lower data transfer costs: With NimbleEdge’s on-device data warehouse enabling direct transfer of event stream data from end-user devices to cloud storage, ML engineers can eliminate the data transfer costs typically incurred when using CDPs

  • Control over data format: ML teams can select the user event stream data points they want to ingest using the NimbleEdge SaaS platform, which provides flexibility in data collection and minimizes the time required to pre-process collected data into a format suitable for further analysis

  • Faster time to market: With simplified user event stream data collection in the requisite formats, ML teams can train session-aware models faster, as well as iterate quicker, cutting down the time to deployment

This enhancement reinforces NimbleEdge’s commitment to delivering efficient, cost-effective, and scalable solutions for session-aware personalization, even in the most demanding data environments.

To learn more about how NimbleEdge drives real-time AI-driven personalized experiences at scale, visit nimbleedge.com or reach out to support@nimbleedge.com
