Picket: Automated Threat Intelligence Integration for AWS

Most AWS environments consume threat intelligence the same way: someone downloads a CSV, reformats it, uploads it to a GuardDuty threat list or a WAF IP set, and hopes they remember to do it again next week. The lists go stale. The coverage is partial. The process doesn’t scale.

Picket automates the entire pipeline — from feed ingestion to AWS service integration — so that threat intelligence actually reaches the controls that can act on it.

The project is live now: dashboard and GitHub repository.

The problem

AWS provides the enforcement mechanisms. GuardDuty accepts custom threat intelligence sets for IP and domain matching. WAF accepts IP sets for request blocking. Security Hub accepts custom findings for centralised triage. The services work — the gap is getting data into them reliably and continuously.

Threat intelligence feeds publish thousands of indicators of compromise daily: malicious IPs, command-and-control domains, malware hashes. The data is freely available from sources like Abuse.ch ThreatFox, Abuse.ch Feodo Tracker, and AlienVault OTX. Commercial feeds from CrowdStrike, Recorded Future, and others add higher-confidence indicators.

The challenge is operational. Each feed has a different format: JSON APIs, CSV blocklists, STIX/TAXII. Each AWS service has a different ingestion mechanism: S3 files for GuardDuty, API calls for WAF, BatchImportFindings for Security Hub. Each service has different limits: GuardDuty allows six threat intel sets of up to 250,000 entries each, while a WAF IP set holds at most 10,000 addresses. Indicators also differ in lifespan and reliability.

Manually managing this across multiple feeds and multiple AWS services is a maintenance burden that most teams either avoid entirely or do inconsistently. The result is the same: threat intelligence that exists but never reaches the services that could use it.

How Picket works

Picket is a serverless pipeline built on Lambda, DynamoDB, SQS, and EventBridge. It runs in three stages.

Ingest. EventBridge schedules invoke a feed poller Lambda on a configurable interval — every 15 minutes by default. The poller dispatches to feed-specific adapters that handle the format differences. Each adapter fetches indicators, normalises them into a common IOC schema (type, value, source, tags, MITRE tactics), and publishes them to an SQS queue.
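
As a rough illustration of the adapter pattern, here is a minimal sketch in Python. The `IOC` dataclass mirrors the common schema fields named above; the raw record keys (`kind`, `indicator`, `labels`, `tactics`), the `example_adapter` function, and the `example-feed` source name are hypothetical and do not reflect any real feed's response format or Picket's actual code.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class IOC:
    """Common schema: type, value, source, tags, MITRE tactics."""
    type: str
    value: str
    source: str
    tags: tuple = ()
    mitre_tactics: tuple = ()


def example_adapter(raw_records, source="example-feed"):
    """Map feed-specific records (hypothetical shape) onto the common schema."""
    return [
        IOC(
            type=rec["kind"].lower(),
            value=rec["indicator"].strip(),
            source=source,
            tags=tuple(rec.get("labels", [])),
            mitre_tactics=tuple(rec.get("tactics", [])),
        )
        for rec in raw_records
    ]


def to_sqs_bodies(iocs):
    """Serialise IOCs to JSON strings suitable for use as SQS message bodies."""
    return [json.dumps(asdict(ioc), sort_keys=True) for ioc in iocs]
```

Under this shape, supporting a new feed would only require another function that returns `IOC` objects; serialisation and everything after the queue stay unchanged.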

Three feeds are integrated in the initial release: Abuse.ch ThreatFox (malware IOCs across IP, domain, URL, and hash types), Abuse.ch Feodo Tracker (banking trojan C2 infrastructure), and AlienVault OTX (community-sourced pulses covering all indicator types). Adding a feed means writing one adapter function — the pipeline handles everything downstream.

Normalise. A normaliser Lambda consumes from the SQS queue, validates each indicator (IP format, domain structure, hash length), deduplicates against DynamoDB, and computes a confidence score. The scoring model weights indicators by source reliability — an indicator seen by multiple independent feeds scores higher than one from a single source. IOCs receive a time-to-live (default 30 days) that refreshes each time the indicator reappears in a feed, so actively reported threats persist while stale indicators age out automatically.
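
The validation step could look something like the sketch below. The type names (`ipv4`, `ipv6`, `domain`, `hash`), the accepted hash lengths, and the deliberately simplified domain pattern are assumptions for illustration; the actual rules in Picket may differ.

```python
import ipaddress
import re

# Hex digest lengths for MD5, SHA-1 and SHA-256 (assumed set of accepted hashes).
HASH_LENGTHS = {32, 40, 64}
HEX_RE = re.compile(r"^[0-9a-fA-F]+$")
# Simplified pattern: dot-separated labels plus an alphabetic TLD.
# It intentionally rejects punycode TLDs and other edge cases.
DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)


def validate(ioc_type, value):
    """Return True if value is well-formed for the given indicator type."""
    if ioc_type in ("ipv4", "ipv6"):
        try:
            addr = ipaddress.ip_address(value)
        except ValueError:
            return False
        return addr.version == (4 if ioc_type == "ipv4" else 6)
    if ioc_type == "domain":
        return DOMAIN_RE.match(value.lower()) is not None
    if ioc_type == "hash":
        return len(value) in HASH_LENGTHS and HEX_RE.match(value) is not None
    return False  # unknown types are rejected rather than passed through
```

Rejecting unknown types by default keeps malformed or unexpected records out of the table instead of letting them reach a distributor.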

Distribute. Distributor Lambdas run on a separate schedule and push qualified indicators — those above a configurable confidence threshold — to the appropriate AWS services:

IPv4 and IPv6 addresses route to GuardDuty threat intel sets (via S3) and WAF IP sets (via API). GuardDuty uses them for network flow and DNS log correlation. WAF uses them to block requests at the edge.
Domains route to GuardDuty threat intel sets for DNS-based detection.
File hashes route to Security Hub as custom findings for centralised visibility.

Each distributor handles the service-specific mechanics — S3 file generation for GuardDuty, optimistic concurrency with lock tokens for WAF, batch import for Security Hub — so the rest of the pipeline doesn’t need to care.
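
To make the WAF lock-token mechanic concrete, here is a hedged sketch. It assumes `waf` is a boto3 `wafv2` client passed in by the caller; the function name, retry count, and error handling are illustrative and not taken from Picket's source. WAF returns a `LockToken` with each read and rejects an update whose token is stale, so the loop re-reads and retries.

```python
def replace_waf_ip_set(waf, name, scope, ip_set_id, addresses, max_attempts=3):
    """Replace an IP set's contents, retrying when the lock token is stale.

    `waf` is assumed to be a boto3 "wafv2" client. Addresses must be in
    CIDR notation, e.g. "203.0.113.50/32". Returns the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        # Each read returns a fresh LockToken tied to the current version.
        current = waf.get_ip_set(Name=name, Scope=scope, Id=ip_set_id)
        try:
            waf.update_ip_set(
                Name=name,
                Scope=scope,
                Id=ip_set_id,
                Addresses=sorted(addresses),
                LockToken=current["LockToken"],
            )
            return attempt
        except waf.exceptions.WAFOptimisticLockException:
            # Another writer updated the set between our read and write.
            continue
    raise RuntimeError(f"could not update IP set {name} after {max_attempts} attempts")
```

Passing the client in, rather than creating it inside the function, keeps the retry logic testable with a stub.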

An expiry worker runs hourly, querying DynamoDB for indicators past their TTL and marking them for removal from all downstream services. Indicators that are no longer reported by any feed don’t persist indefinitely.
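
The refresh-and-expire behaviour can be sketched in a few lines. The `expires_at` attribute name and the in-memory list are assumptions for illustration; in the pipeline itself the lookup is a DynamoDB index query rather than a list comprehension.

```python
TTL_SECONDS = 30 * 24 * 3600  # default 30-day time-to-live


def refreshed_expiry(last_seen_epoch, ttl_seconds=TTL_SECONDS):
    """Expiry time is always measured from the most recent sighting."""
    return last_seen_epoch + ttl_seconds


def expired(items, now_epoch):
    """Return the items whose expiry time has passed."""
    return [item for item in items if item["expires_at"] <= now_epoch]
```

For example, an indicator last seen 31 days ago is returned by `expired`, while one seen yesterday is not, because its expiry was pushed 30 days past that sighting.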

Design decisions

DynamoDB single-table design. All IOC data lives in one DynamoDB table with a composite key structure. The partition key encodes IOC type and value (IOC#ipv4#203.0.113.50), and the sort key encodes the source (SOURCE#abusech-threatfox). Two global secondary indexes support the distribution and expiry query patterns: one for querying by type and confidence score, one for querying by TTL expiry time. Each distributor can therefore retrieve exactly the indicators it needs without scanning the full table.
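
A small sketch of how such keys might be built and parsed. The attribute names `PK` and `SK` and both helper functions are assumptions; only the `IOC#type#value` and `SOURCE#name` formats come from the description above.

```python
def ioc_keys(ioc_type, value, source):
    """Build the composite key for one indicator as reported by one source."""
    return {"PK": f"IOC#{ioc_type}#{value}", "SK": f"SOURCE#{source}"}


def parse_pk(pk):
    """Split a partition key back into (type, value).

    maxsplit=2 keeps any '#' inside the value (e.g. a URL fragment) intact.
    """
    prefix, ioc_type, value = pk.split("#", 2)
    if prefix != "IOC":
        raise ValueError(f"not an IOC partition key: {pk!r}")
    return ioc_type, value
```

Because the source lives in the sort key, reports of the same indicator from different feeds share one partition, which makes counting corroborating sources a single-partition query.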

Confidence-based prioritisation. AWS service limits are real constraints. WAF IP sets hold 10,000 addresses. Rather than failing when a feed produces more indicators than a service accepts, Picket queries the top N indicators by confidence score. The highest-confidence threats — those corroborated by multiple sources — always make it into the enforcement layer.
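
The selection logic amounts to a thresholded top-N. The sketch below does it in memory with `heapq.nlargest`; the function name and the `confidence` field are assumptions, and the real pipeline presumably gets the same result from the confidence-sorted index rather than from a Python list.

```python
import heapq

WAF_IP_SET_LIMIT = 10_000  # addresses per WAF IP set


def select_for_service(indicators, limit=WAF_IP_SET_LIMIT, min_confidence=0):
    """Keep the highest-confidence indicators that fit within a service limit."""
    qualified = (i for i in indicators if i["confidence"] >= min_confidence)
    return heapq.nlargest(limit, qualified, key=lambda i: i["confidence"])
```

When more indicators qualify than the service accepts, the lowest-scoring ones are dropped instead of the whole update failing.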

Source-weighted scoring. Not all feeds are equal. The confidence model assigns base weights by source type (commercial feeds score higher than community OSINT) and boosts indicators seen by multiple independent sources. A single Abuse.ch indicator starts at 70. The same indicator independently reported by OTX boosts to 80. If CrowdStrike also reports it, it climbs higher. The formula is deterministic and transparent — no opaque ML scoring.
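
One deterministic formula consistent with those numbers is "highest base weight plus a fixed boost per additional source". The sketch below is only an illustration of that idea: the Abuse.ch weight of 70 and the boost to 80 come from the text, while the OTX and CrowdStrike weights, the cap at 100, and the formula itself are assumptions.

```python
BASE_WEIGHTS = {
    "abusech": 70,      # from the text
    "otx": 60,          # assumed
    "crowdstrike": 85,  # assumed
}
CORROBORATION_BOOST = 10  # per additional independent source (assumed)


def confidence(sources):
    """Score = best base weight + boost per extra known source, capped at 100."""
    known = {s for s in sources if s in BASE_WEIGHTS}
    if not known:
        return 0
    base = max(BASE_WEIGHTS[s] for s in known)
    return min(100, base + CORROBORATION_BOOST * (len(known) - 1))
```

With these weights, `confidence(["abusech"])` is 70 and `confidence(["abusech", "otx"])` is 80, matching the figures above. Sources are deduplicated first, so a feed that repeats an indicator does not corroborate itself.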

Serverless and event-driven. No servers to patch, no containers to manage, no idle compute. Lambda functions execute on schedule or in response to SQS messages, scale to zero between runs, and cost fractions of a cent for typical workloads. At moderate feed volumes, the entire platform runs within the AWS Free Tier.

The dashboard

Picket includes a static dashboard at picket.awsmatt.com built with Astro and served via CloudFront. It provides a read-only view of the platform’s operational state: active IOC counts by type, feed health status, and a breakdown of how indicators route to AWS services.

The dashboard fetches data from a lightweight API Gateway + Lambda backend that queries DynamoDB directly. It’s a monitoring surface, not a control plane — all configuration lives in Terraform, all automation runs on schedule.

Security posture

The platform is designed for a security-conscious deployment:

No long-lived credentials. GitHub Actions authenticates via OIDC, locked to the main branch. Lambda roles follow least-privilege with resource-scoped IAM policies — no Resource: "*".
Infrastructure as code. Every resource is defined in Terraform with modules for each subsystem. State is remote with locking. IAM policies are scoped to specific ARNs.
Input validation at every boundary. IOC values are validated against type-specific rules before entering DynamoDB. The API is read-only with CORS locked to the dashboard origin.
Defence in depth on the dashboard. CloudFront enforces HSTS with preload, X-Frame-Options DENY, X-Content-Type-Options nosniff, and a strict referrer policy. TLS 1.2 minimum. HTTP/2 and HTTP/3 enabled.

What comes next

The initial release covers the core pipeline: ingest, normalise, distribute, expire. The architecture is designed to extend in several directions.

More feeds. The adapter pattern makes adding feeds straightforward. STIX/TAXII sources, MISP feeds from national CERTs, and commercial feeds like CrowdStrike and Recorded Future are natural additions. Each new feed strengthens the confidence scoring through multi-source corroboration.

Network Firewall integration. AWS Network Firewall accepts domain and IP rule groups that can block traffic at the VPC level. Adding a distributor for Network Firewall extends coverage from edge (WAF) and detection (GuardDuty) to network-layer enforcement.

Richer dashboard. The current dashboard shows aggregate counts. Future iterations could surface IOC timelines, feed reliability metrics, confidence score distributions, and a searchable indicator table. Historical trend data would help teams understand how their threat landscape evolves.

Multi-account distribution. Organisations running AWS at scale operate across many accounts. Distributing threat intelligence from a central security account to member accounts is a natural scaling step, whether via a GuardDuty delegated administrator, AWS Firewall Manager policies for WAF, or cross-account Security Hub.

STIX/TAXII output. Picket consumes threat intelligence. It could also produce it — publishing a curated, confidence-scored TAXII feed that other tools and organisations consume. The DynamoDB data model already captures the metadata needed for STIX bundle generation.

The connection to D3FEND-AWS

Picket and D3FEND-AWS address different layers of the same problem. D3FEND-AWS maps what defensive controls exist for each AWS attack technique. Picket operationalises one category of those controls — threat intelligence integration — by automating the data flow from external feeds into AWS enforcement services.

In D3FEND-AWS terms, Picket implements defensive techniques across the Detect tactic (GuardDuty threat intelligence correlation) and the Harden tactic (WAF IP blocking today, with Network Firewall rules to follow once that distributor exists). The structured mapping tells you which controls matter. Picket makes one set of those controls work continuously without manual intervention.

Try it

The dashboard shows the live system, and the source code is open. Deploy it in your own AWS account with terraform apply.