OneSignal Databricks integration

Overview

The OneSignal + Databricks integration supports two-way data sync:
  • Export OneSignal message events to Databricks for analytics, dashboards, and reporting.
  • Import custom events from Databricks to OneSignal to trigger Journeys and personalized campaigns.
Use this integration to unify your engagement and behavioral data across platforms and drive data-informed messaging strategies.

Export OneSignal events to Databricks

Sync push, email, in-app, and SMS events from OneSignal into your Databricks lakehouse for near real-time analytics and visibility.

Requirements
  • Professional Plan
  • Custom Events enabled (for event imports)
  • Databricks Platform: AWS, Azure, or GCP
  • Databricks Plan: Premium or higher
  • Databricks Unity Catalog (recommended for governance)
  • Databricks SQL Warehouse for querying
  • Delta Lake event tables (for custom event import)
Setup Steps

1. Collect SQL warehouse details

  1. Log in to your Databricks workspace and go to SQL Warehouses.
  2. Select your warehouse and open the Connection details tab.
  3. Save the following details:
    • Server Hostname
    • Port
    • HTTP Path

Databricks SQL connection details for Fivetran setup.
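
For reference, these values follow a standard shape. The example below uses illustrative placeholders (an AWS-style hostname), not real workspace values:

Server Hostname: dbc-a1b2c3d4-e5f6.cloud.databricks.com
Port: 443
HTTP Path: /sql/1.0/warehouses/1234567890abcdef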

2. Create a service principal

  1. Go to Workspace Settings > Identity and Access > Service Principals.
  2. Click Add Service Principal, then Add New.
  3. Name it (e.g., onesignal-sync).

Modal for adding a service principal, with the 'Add new' option highlighted.

3. Generate a secret

  1. Click into the created principal.
  2. Go to the Secrets tab.
  3. Click Generate Secret and save it securely. The secret is only visible once, so store it safely.

Databricks 'Generate secret' modal showing OAuth secret and client ID for API authentication.

4. Assign permissions

  1. Navigate to your Catalog and open the Permissions tab.
  2. Click Grant.
  3. Assign the following permissions to the service principal:
    • USE CATALOG
    • USE SCHEMA
    • SELECT
    • MODIFY
    • CREATE SCHEMA
    • CREATE TABLE

Privilege assignment screen for a Databricks principal with custom catalog permissions selected.
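
If you prefer SQL over the UI, an equivalent grant looks like the sketch below. The catalog name and the service principal's application ID are placeholders, so substitute your own:

-- Grant the catalog-level privileges the integration needs; the service
-- principal is referenced by its application ID (placeholder shown)
GRANT USE CATALOG, USE SCHEMA, SELECT, MODIFY, CREATE SCHEMA, CREATE TABLE
ON CATALOG main
TO `12345678-abcd-1234-abcd-1234567890ab`;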

5. Connect OneSignal

  1. In OneSignal, navigate to Data > Integrations > Databricks and activate the integration.
  2. Enter the details:
    • Server Hostname
    • Port
    • HTTP Path
    • Catalog Name
    • Schema Name
    • Service Principal credentials (ID + Secret)

OneSignal Databricks Configuration form with fields for catalog, hostname, HTTP path, and OAuth credentials.

  3. Configure the integration:
    • Sync Frequency: as often as every 15 minutes
    • Dataset/Table Names: pre-set as onesignal_events_<app-id> and message_events (editable)
    • Event Types: choose which to sync; select all or just what you need
  4. Select the events you want to receive in your Databricks catalog.

OneSignal event export settings screen showing sync status, dataset configuration, and selected message event types.

  5. Click Save and wait for the success confirmation.

Initial data sync can take 15–30 minutes to appear in Databricks. While you wait, send messages via push, email, in-app, or SMS to trigger the events you selected.

6. View data in Databricks

  1. Open your Catalog in Databricks.
  2. Once syncing completes, your configured schema will appear.
  3. Access and query the message_events table.

    Databricks Catalog view showing OneSignal message events table under a production schema.

  4. Click into tables for a sample data preview.

    Sample data from the message_events_1 table with synced OneSignal event fields.
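
Once events start arriving, a quick sanity check like the following confirms the sync is working. The catalog and schema names here are placeholders for your own configuration:

-- Count the last day's events by kind to confirm data is flowing
SELECT event_kind, COUNT(*) AS events
FROM main.onesignal.message_events
WHERE event_timestamp >= current_timestamp() - INTERVAL 1 DAY
GROUP BY event_kind
ORDER BY events DESC;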

If you run into issues like missing schemas, permission errors, or malformed events, contact support@onesignal.com.

Message events and properties

Message event kinds

Property: event_kind (Type: String). The kind of message and event (e.g., message.push.received, message.push.sent).
Message Event (OneSignal) | event_kind | Description
Push Sent | message.push.sent | Push notification successfully sent.
Push Received | message.push.received | Delivered push (see Confirmed Delivery).
Push Clicked | message.push.clicked | User clicked the push.
Push Failed | message.push.failed | Delivery failure. See message reports.
Push Unsubscribed | message.push.unsubscribed | User unsubscribed from push.
In-App Impression | message.iam.displayed | In-App message shown.
In-App Clicked | message.iam.clicked | In-App message clicked.
In-App Page Viewed | message.iam.pagedisplayed | In-App page shown.
Email Sent | message.email.sent | Email delivered.
Email Received | message.email.received | Email accepted by recipient’s mail server.
Email Opened | message.email.opened | Email opened. See Email Reports.
Email Link Clicked | message.email.clicked | Link in email clicked.
Email Unsubscribed | message.email.unsubscribed | Recipient unsubscribed.
Email Marked Spam | message.email.reportedasspam | Marked as spam. See Email Deliverability.
Email Bounced | message.email.hardbounced | Bounce due to permanent delivery failure.
Email Failed | message.email.failed | Delivery failed.
Email Suppressed | message.email.suppressed | Suppressed due to suppression list.
SMS Sent | message.sms.sent | SMS sent.
SMS Delivered | message.sms.delivered | SMS successfully delivered.
SMS Failed | message.sms.failed | SMS failed to deliver.
SMS Undelivered | message.sms.undelivered | SMS rejected or unreachable.
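
These event kinds make channel funnel metrics straightforward. As a sketch, the query below computes a push click-through rate; the catalog and schema names are placeholders:

-- Push click-through rate over the last 7 days
SELECT
  COUNT(IF(event_kind = 'message.push.clicked', 1, NULL)) * 100.0
    / NULLIF(COUNT(IF(event_kind = 'message.push.sent', 1, NULL)), 0) AS push_ctr_pct
FROM main.onesignal.message_events
WHERE event_timestamp >= current_timestamp() - INTERVAL 7 DAYS;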

Event data schema

For each message event generated by a user, the following metadata is attached to the record.
Column Name | Type | Description
event_id | UUID | Unique identifier for the event
event_timestamp | Timestamp | Time of event occurrence
event_kind | String | The Event Kind
subscription_device_type | String | Device type (e.g., iOS, Android, Web, Email, SMS)
language | String | Subscription language code
version | String | Integration version
device_os | String | Device operating system version
device_type | Number | Numeric device type
token | String | Push token, phone number, or email
subscription_id | UUID | Subscription ID
subscribed | Boolean | Subscription status
onesignal_id | UUID | OneSignal user ID
last_active | String | Last active timestamp
sdk | String | OneSignal SDK version
external_id | String | External user ID that should match the integration user ID
app_id | UUID | App ID from OneSignal
template_id | UUID | Template ID (if applicable)
message_id | UUID | Message batch/request ID
message_name | String | Name of the message
message_title | String | Message title (English only)
message_contents | String | Truncated message body (English only)
_created, _id, _index, _fivetran_synced | Internal use | Fivetran sync metadata
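
Because external_id mirrors your own user identifier, you can join message events to first-party tables. A sketch, where the users table and its columns are hypothetical:

-- Join events to a hypothetical first-party users table via external_id
SELECT u.plan_tier, e.event_kind, COUNT(*) AS events
FROM main.onesignal.message_events AS e
JOIN main.analytics.users AS u
  ON e.external_id = u.user_id
GROUP BY u.plan_tier, e.event_kind;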

Notes

  • Syncs after saving/activating may take an additional 15–30 minutes to complete.
  • Deactivating may still result in one final sync after deactivation.
  • To keep synchronization efficient, our system automatically creates and manages staging datasets named with a pattern like fivetran_{two random words}_staging. They temporarily store data during processing before it is integrated into your main schema. Do not delete them; they are required for the workflow and will be automatically recreated.

Import events from Databricks

Send behavioral event data from Databricks to OneSignal to:
  • Trigger Journeys based on user activity
  • Personalize messaging based on behavioral data
Requirements
  • Databricks workspace with SQL Warehouse or compute cluster
  • Personal Access Token with appropriate permissions
  • Event data tables containing behavioral data in Delta Lake format
  • Unity Catalog (recommended for data governance)
Setup Steps

1. Create a Databricks Personal Access Token

Generate a Personal Access Token for OneSignal to access your Databricks workspace:
  1. Navigate to User Settings in your Databricks workspace
  2. Click Developer tab and then Access tokens
  3. Click Generate new token
  4. Enter a comment like “OneSignal Integration” and set an expiration (90 days recommended)
  5. Save the generated token (you’ll need this for OneSignal)

2. Configure SQL Warehouse access

Ensure OneSignal can query your event data via SQL Warehouse:
  1. Navigate to SQL Warehouses in your Databricks workspace
  2. Select or create a SQL Warehouse for OneSignal access
  3. Note the Server Hostname and HTTP Path from the connection details
  4. Ensure the warehouse has access to your event data tables

3. Grant table permissions

Grant OneSignal read access to tables containing event data:
-- For Unity Catalog enabled workspaces
GRANT SELECT ON TABLE catalog.schema.event_table TO `onesignal@yourdomain.com`;

-- For Hive metastore tables
GRANT SELECT ON TABLE database.event_table TO `onesignal@yourdomain.com`;
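
To confirm the grant took effect, you can list the table's grants, for example:

-- Verify the principal now appears in the grant list
SHOW GRANTS ON TABLE catalog.schema.event_table;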

4. Add integration in OneSignal

In OneSignal, go to Data > Integrations and click Add Integration. Select Databricks and provide:
  • Server Hostname: Your Databricks SQL Warehouse hostname
  • HTTP Path: SQL Warehouse HTTP path
  • Personal Access Token: Token created in step 1
  • Catalog (optional): Unity Catalog name if using Unity Catalog

5. Configure event data source

Specify the Databricks table containing your event data:
  • Database/Schema: Database or schema name containing event tables
  • Table: Table name with event records (e.g., user_events)
  • Event Query: Optional SQL query to filter or transform event data
Your event table should contain the following columns; a minimal example table follows this list:
  • Event name/type (String)
  • User identifier (String)
  • Event timestamp (Timestamp)
  • Additional event properties
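
A minimal Delta table matching that shape might look like this; the table and column names are illustrative:

-- Illustrative event table; adjust names to your data model
CREATE TABLE IF NOT EXISTS main.analytics.user_events (
  event_name      STRING,
  user_id         STRING,
  event_timestamp TIMESTAMP,
  event_data      STRING -- JSON-encoded additional event properties
)
USING DELTA;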

6. Test the connection

Click Test Connection to verify OneSignal can access your Databricks workspace and read event data.

Event data mapping

Map your Databricks event fields to OneSignal’s custom events format:

OneSignal Field | Databricks Field | Description | Required
name | event_name | Event identifier | Yes
external_id | user_id | User identifier | Yes
timestamp | event_timestamp | When the event occurred | No
properties | event_data | Additional event properties | No
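
An Event Query implementing this mapping could look like the following; the table and column names are the illustrative ones from the example table above:

-- Shape rows into OneSignal's expected fields
SELECT
  event_name      AS name,
  user_id         AS external_id,
  event_timestamp AS `timestamp`,
  event_data      AS properties
FROM main.analytics.user_events
WHERE event_timestamp >= current_timestamp() - INTERVAL 1 DAY;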

Advanced configuration

Unity Catalog Integration

Leverage Unity Catalog for governed data access:
-- Pull the last 7 days of events and pack properties into a JSON payload
SELECT
  event_name,
  user_id,
  event_timestamp,
  to_json(
    named_struct(
      'product_id', product_id,
      'purchase_amount', purchase_amount,
      'category', category
    )
  ) AS payload
FROM catalog.schema.user_events
WHERE event_timestamp >= current_timestamp() - INTERVAL 7 DAYS

Delta Lake Optimization

Optimize event tables for better query performance; a maintenance sketch follows this list:
  • Partitioning: Partition by date (event_date) for faster time-based queries
  • Z-Ordering: Z-order by user_id and event_name for better filtering
  • Delta Lake Features: Use liquid clustering for automatic optimization
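
For example, periodic maintenance statements along these lines apply the Z-ordering recommendation; the table name is illustrative, and note that Z-ordering and liquid clustering are alternatives rather than companions:

-- Co-locate rows by common filter columns, then clean up stale files
OPTIMIZE main.analytics.user_events ZORDER BY (user_id, event_name);
VACUUM main.analytics.user_events;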

Streaming Event Processing

For real-time event processing, consider the following options; an ingestion sketch follows this list:
  • Structured Streaming: Process events as they arrive
  • Delta Live Tables: Build robust event processing pipelines
  • Auto Loader: Continuously ingest new event files
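
As one sketch of the Auto Loader approach, a streaming table can continuously ingest new event files; the source path and table name are placeholders:

-- Continuously ingest JSON event files into a streaming Delta table
CREATE OR REFRESH STREAMING TABLE raw_user_events
AS SELECT *
FROM STREAM read_files(
  's3://your-bucket/events/',
  format => 'json'
);
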
Ensure your SQL Warehouse has sufficient compute resources to handle OneSignal’s queries without affecting other workloads.

FAQ

Why do I see different message IDs with the same content?

This happens when the same message is sent more than once, for example via a transactional flow or a message template reused across multiple sends.

How often does OneSignal sync events from Databricks?

OneSignal syncs event data based on your configured schedule, with a minimum interval of 15 minutes.

Can I use Databricks notebooks for event processing?

Yes, you can use notebooks to process and prepare event data, then expose it via tables that OneSignal can query.
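
For instance, a scheduled notebook cell could materialize a curated table for the integration to read; all names here are illustrative:

-- Materialize a curated events table for OneSignal to query
CREATE OR REPLACE TABLE main.analytics.onesignal_events AS
SELECT event_name, user_id, event_timestamp, event_data
FROM main.analytics.raw_events
WHERE event_name IN ('purchase', 'cart_abandoned');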

What about cost optimization for event queries?

Consider using serverless SQL Warehouses for cost-effective, on-demand compute that automatically scales based on query load.