Calculate DAU with the raw data via Python

There are many reasons why you may want to validate the raw data, especially for Daily Active Users (DAU). This guide covers the isSession flag, possible discrepancies, and how to calculate DAU correctly.

isSession flag

The raw data includes a boolean isSession flag. Our backend uses this flag to decide whether to include the user in the DAU calculation. If the flag is false, the user is not included in the DAU count.

This flag can be set to false if:

  1. Events are fired out of order
  2. The user has gone offline
  3. You are sending us offline events via our REST API

We don't include these users in the DAU count because they may not actually be interacting with the app, so they should not count as active users.

📘

We disregard the isSession flag when tracking events, because an event can come from an API call made while the user was outside the app. In that case, the user has no session but still triggered an event.

To sum up, DAU counts include only records where isSession is true. Event counts disregard the flag, since users do not need a real session to trigger some events.
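The rule above can be sketched in Python as follows. This is a minimal illustration, not the backend's implementation: it assumes each raw record is a dict with `userId`, `isSession`, and `time` fields, that `time` is in epoch seconds, and that days are bucketed in UTC. Check the field names, time unit, and timezone against your own export before relying on it. The `count_dau` function name is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone


def count_dau(sessions):
    """Return {ISO date: number of distinct users}, counting only isSession records."""
    users_by_day = defaultdict(set)
    for session in sessions:
        # Records with isSession false (or missing) are left out of DAU.
        if session.get("isSession") is not True:
            continue
        # Assumption: "time" is epoch seconds; days are bucketed in UTC.
        day = datetime.fromtimestamp(session["time"], tz=timezone.utc).date().isoformat()
        users_by_day[day].add(session["userId"])
    return {day: len(users) for day, users in sorted(users_by_day.items())}


sample = [
    {"userId": "a", "isSession": True, "time": 1700000000},
    {"userId": "a", "isSession": True, "time": 1700003600},   # same user, same day
    {"userId": "b", "isSession": False, "time": 1700000000},  # excluded from DAU
    {"userId": "c", "isSession": True, "time": 1700090000},   # next day
]
dau = count_dau(sample)
```

Using a set per day means a user with several sessions on the same day is counted once.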

Data from offline users

When you request data from us, we can only return the data that is available at that time. Keep in mind that users on offline devices cannot send data to Leanplum until they reconnect, so some data may arrive late (trailing data).

When these users come back online, their data is sent to Leanplum in a batch at that point. A user's data can appear on the Leanplum server up to 7 days after the day of the event, depending on the user's internet connectivity and when they return to the app.
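One practical consequence is that a DAU figure for a recent day may still grow. A minimal sketch, assuming you recompute DAU yourself and want to know which days can still receive trailing data, is shown below. The helper names are hypothetical, and the 7-day window is taken from the paragraph above.

```python
from datetime import date, timedelta

TRAILING_DATA_DAYS = 7  # late data can arrive up to 7 days after the event day


def is_day_settled(event_day, today):
    """True once the late-arrival window for event_day has fully passed."""
    return today > event_day + timedelta(days=TRAILING_DATA_DAYS)


def days_still_open(today):
    """Days (oldest first) whose DAU may still change and should be recomputed."""
    return [today - timedelta(days=n) for n in range(TRAILING_DATA_DAYS, -1, -1)]


today = date(2024, 3, 9)
settled = is_day_settled(date(2024, 3, 1), today)
open_days = days_still_open(today)
```

Recomputing DAU for every day in `open_days` on each run keeps earlier figures in line with late-arriving data.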

📘

Note: if you use our automated export feature via S3 buckets, offline data is accounted for, because each export includes any new data received since the last export.

Python script to get raw data

To see the actual code that counts DAU, you can use this script. The script assumes that you are also using the dataExport.py script, though you are free to change the function's input.
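As a rough sketch of how the input could be swapped out, the following assumes the exported files contain one JSON object per line. That is an assumption about the export format, not something this guide confirms. The `iter_sessions` helper is hypothetical and is not part of dataExport.py. The `userId` and `isSession` field names are likewise assumed.

```python
import json


def iter_sessions(lines):
    """Yield one dict per non-empty line, skipping lines that are not JSON objects."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate a truncated or corrupt line
        if isinstance(record, dict):
            yield record


raw_lines = [
    '{"userId": "a", "isSession": true, "time": 1700000000}',
    '',
    'not json',
    '{"userId": "b", "isSession": false, "time": 1700000000}',
]
sessions = list(iter_sessions(raw_lines))
# Only users with isSession true count toward DAU.
active_users = {s["userId"] for s in sessions if s.get("isSession") is True}
```

Because `iter_sessions` accepts any iterable of strings, an open file handle can be passed in place of `raw_lines`, which lets large exports be processed one line at a time.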