Streaming Events to External Data Lakes with AWS Firehose and EdgeTag

March 28, 2025

The introduction of the Event Sink provider channel in Blotout has unlocked a powerful new capability: streaming events directly into your own external data lake.

In this guide, we’ll walk through how to configure Blotout to forward events into AWS Firehose, using a Cloudflare Worker as a secure relay.

Why Stream Events to a Data Lake?

Streaming event data into a centralized repository gives your organization:

  • Ownership of first-party data
  • Flexibility to analyze and transform events however you like
  • Compatibility with BI tools, ML pipelines, and custom analytics

With Blotout + AWS Firehose, you get a scalable, cost-effective event pipeline.

Introducing the Event Sink Provider Channel

The Event Sink provider channel acts as a customizable endpoint where Blotout can deliver raw events in real time. By pointing it to a Cloudflare Worker, you can then authenticate and relay those events into AWS Firehose.

Setting Up a Cloudflare Worker Endpoint

To securely forward events into AWS Firehose, we need a Cloudflare Worker that holds AWS credentials.

Configure Secrets

In your Worker settings, add the following secrets:

  • AWS_REGION – e.g. us-east-1
  • AWS_FIREHOSE_STREAM – stream name in AWS
  • AWS_ACCESS_ID – credential ID
  • AWS_ACCESS_KEY – credential secret
  • BEARER_TOKEN – random string for Blotout authentication

Use type: Secret so values are hidden after configuration.
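If you prefer the CLI to the dashboard, Wrangler can store the same secrets from your project directory; each command prompts for the value so it never lands in your shell history:

wrangler secret put AWS_REGION
wrangler secret put AWS_FIREHOSE_STREAM
wrangler secret put AWS_ACCESS_ID
wrangler secret put AWS_ACCESS_KEY
wrangler secret put BEARER_TOKEN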

Building and Deploying the Worker with Wrangler

Step 1: Initialize the Worker

wrangler init aws-firehose-example

Choose:

  • Hello World Starter
  • Worker only
  • TypeScript
  • No Git (optional)
  • No immediate deployment
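Once the scaffold finishes, the project should look roughly like this (the exact files vary by Wrangler version):

aws-firehose-example/
  src/index.ts              # Worker entry point
  wrangler.jsonc            # Worker configuration
  worker-configuration.d.ts # type definitions for Env
  package.json
  tsconfig.json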

Step 2: Install Packages

npm i aws4

(The Worker below uses the built-in fetch API, so no separate HTTP client such as axios is needed.)

Add Node.js compatibility in wrangler.jsonc:

"compatibility_flags": ["nodejs_compat"]‍

Step 3: Define Environment Variables

// worker-configuration.d.ts

declare namespace Cloudflare {
  interface Env {
    AWS_ACCESS_ID: string;
    AWS_ACCESS_KEY: string;
    AWS_FIREHOSE_STREAM: string;
    AWS_REGION: string;
    BEARER_TOKEN: string;
  }
}

interface Env extends Cloudflare.Env {}
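Secrets set in the dashboard are not reflected in generated types, which is why we declare them by hand here. If you later add bindings or vars to wrangler.jsonc, you can regenerate this file with:

wrangler types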

Step 4: Implement the Worker

import aws4 from 'aws4'
// Buffer is available via the nodejs_compat flag; used to base64-encode the record
import { Buffer } from 'node:buffer'

const getEndpoint = (region: string) =>
  `https://firehose.${region}.amazonaws.com`

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    if (request.method === 'POST') {
      const authHeader = request.headers.get('Authorization') || ''
      if (authHeader !== `Bearer ${env.BEARER_TOKEN}`) {
        return new Response('Not allowed', { status: 403 })
      }

      const endpoint = getEndpoint(env.AWS_REGION)
      const payload = await request.text()

      // Build the Firehose PutRecord request body. The Firehose JSON API
      // expects the event wrapped in Record.Data and base64-encoded.
      // (Optionally append '\n' to payload to keep records newline-delimited.)
      const body = JSON.stringify({
        DeliveryStreamName: env.AWS_FIREHOSE_STREAM,
        Record: { Data: Buffer.from(payload).toString('base64') },
      })

      const url = new URL(endpoint)
      const firehoseRequest: any = {
        host: url.host,
        path: url.pathname || '/',
        method: 'POST',
        headers: {
          // Firehose uses the JSON 1.1 protocol; the X-Amz-Target header
          // selects the PutRecord operation. fetch computes Content-Length itself.
          'Content-Type': 'application/x-amz-json-1.1',
          'X-Amz-Target': 'Firehose_20150804.PutRecord',
        },
        body,
        service: 'firehose',
        region: env.AWS_REGION,
      }

      // Sign the Firehose request (aws4 mutates the object and returns it)
      aws4.sign(firehoseRequest, {
        accessKeyId: env.AWS_ACCESS_ID,
        secretAccessKey: env.AWS_ACCESS_KEY,
      })

      try {
        // Use fetch (works in Cloudflare Workers)
        const response = await fetch(endpoint + (firehoseRequest.path || ''), {
          method: firehoseRequest.method,
          headers: firehoseRequest.headers,
          body: firehoseRequest.body,
        })

        const text = await response.text()
        console.log('Event sent to Firehose', response.status, text)

        // Mirror upstream status + body to client (or return simple OK if you prefer)
        return new Response(text, { status: response.status })
      } catch (err) {
        console.error('Error while sending data to Firehose', err)
        return new Response(`Error: ${(err as Error).message}`, { status: 500 })
      }
    }

    return new Response('Hello World!')
  },
} satisfies ExportedHandler<Env>
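Before deploying, you can exercise the Worker locally. wrangler dev serves it at http://localhost:8787 by default; supply the secrets for local runs in a .dev.vars file, and note that the Firehose call still hits the real AWS endpoint. The token below is a placeholder:

wrangler dev

curl -X POST http://localhost:8787 \
  -H "Authorization: Bearer <BEARER_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"event":"test"}'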

Step 5: Build & Deploy

wrangler build

This bundles the Worker into a single JavaScript file. To deploy through the dashboard, copy the generated bundle into the Cloudflare editor and deploy from there.
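Alternatively, if you'd rather skip the dashboard editor, Wrangler can publish directly from the project directory:

wrangler deploy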

Connecting Blotout to AWS Firehose

In EdgeTag’s Event Sink channel form:

  • Set the Worker’s public URL
  • Add the Bearer token from your Worker secrets

Once saved, events will automatically begin streaming through your Worker into AWS Firehose.

Verifying and Debugging Your Setup

  • Check Cloudflare Worker logs in real time to confirm event delivery (see the command below).
  • Debug errors by updating Worker code and redeploying.
  • Monitor AWS Firehose metrics to ensure events are flowing into your data lake.
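For the first point, Wrangler can stream production logs straight to your terminal; the Worker name here matches the example project:

wrangler tail aws-firehose-example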

FAQs: Streaming Events to AWS Firehose

Q1: Why use a Cloudflare Worker instead of calling Firehose directly?
A Worker acts as a secure relay: your AWS credentials stay server-side as Worker secrets rather than being exposed in client-side code.

Q2: Can I route events to multiple streams?
Yes—your Worker logic can branch by event type, channel, or metadata.
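As a minimal sketch, assuming the incoming payload has an event_name field and that purchase-events and default-events are streams you have created (both names are hypothetical), the Worker could pick a stream per event:

// Hypothetical routing: choose a delivery stream per event type
const pickStream = (event: { event_name?: string }): string =>
  event.event_name === 'Purchase'
    ? 'purchase-events' // hypothetical stream for purchase events
    : 'default-events'  // hypothetical catch-all stream

// ...then use pickStream(JSON.parse(payload)) in place of env.AWS_FIREHOSE_STREAM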

Q3: Is this setup HIPAA/GDPR compliant?
It can be, provided your Worker strips or redacts sensitive fields before forwarding events to Firehose, and your broader data-handling practices meet the relevant requirements. A minimal sketch follows.
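This example assumes a flat JSON payload, and the field names are illustrative; adapt both to your actual event schema:

// Drop sensitive fields before the event is forwarded to Firehose
const SENSITIVE_FIELDS = ['email', 'phone', 'ip_address'] // illustrative names
const scrub = (event: Record<string, unknown>): Record<string, unknown> => {
  const copy = { ...event }
  for (const field of SENSITIVE_FIELDS) delete copy[field]
  return copy
}

// In the handler: forward JSON.stringify(scrub(JSON.parse(payload))) instead of payload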

Conclusion: Your Data, Your Way

With EdgeTag’s Event Sink channel and Cloudflare Workers, you can build custom pipelines to AWS Firehose-backed data lakes—with security, flexibility, and full ownership of your event data.

👉 This is just the beginning. Future posts will explore advanced transformations and multi-cloud event streaming.