Project Overview
A production-grade serverless application built on AWS to capture, process, and analyze product visit data. This project demonstrates end-to-end cloud architecture including event-driven processing, data streaming, API integration, and automated compliance monitoring.
Architecture Highlights
- Frontend: Static web form hosted on S3 with CloudFront distribution
- API Layer: API Gateway with Lambda proxy integration for request processing
- Message Queue: SQS for reliable, decoupled message handling
- Data Storage: DynamoDB for fast writes with on-demand capacity
- Data Lake: S3-based data lake fed by DynamoDB Streams
- Compliance & Monitoring: CloudTrail, EventBridge, and SNS for change tracking
- Observability: CloudWatch Logs for troubleshooting and verification
Technologies Used
AWS Services: Lambda (Node.js 22.x), DynamoDB, DynamoDB Streams, SQS, S3, API Gateway, CloudTrail, EventBridge, SNS, IAM, CloudWatch Logs
Development Tools: AWS CLI, CloudShell, API Gateway JavaScript SDK
Part 1: Event-Driven Data Processing Pipeline
Objective
Build the core data pipeline: an SQS queue receives product order messages and triggers a Lambda function, which writes to a DynamoDB table. Establish CloudWatch Logs as the primary troubleshooting tool.
Implementation Details
DynamoDB Table Configuration
- Table Name: `ProductVisits`
- Partition Key: `ProductVisitKey` (String)
- Capacity Mode: On-demand (auto-scaling for unpredictable workloads)
- Indexes: 0 (simple key-value store pattern)
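For reference, a table with this configuration can be created from the AWS CLI along these lines (a minimal sketch; region and profile flags omitted):

```bash
# Create the ProductVisits table with a single string partition key
# and on-demand (PAY_PER_REQUEST) billing, matching the configuration above.
aws dynamodb create-table \
  --table-name ProductVisits \
  --attribute-definitions AttributeName=ProductVisitKey,AttributeType=S \
  --key-schema AttributeName=ProductVisitKey,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST
```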
SQS Queue Setup
- Queue Name: `ProductVisitsDataQueue`
- Type: Standard (high throughput, at-least-once delivery)
- Encryption: Amazon SQS key (SSE-SQS)
- Purpose: Decouples message ingestion from processing, smooths traffic spikes
Lambda Function
- Function Name: `productVisitsDataHandler`
- Runtime: Node.js 22.x
- Trigger: SQS queue with batch size of 1
- IAM Role: `lambdaRoleForSQSPermissions`
- Permissions: DynamoDB (PutItem, GetItem, Scan, UpdateItem, DeleteItem), SQS (ReceiveMessage, DeleteMessage, GetQueueAttributes), CloudWatch Logs (CreateLogGroup, CreateLogStream, PutLogEvents)
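A minimal sketch of what this handler looks like, assuming AWS SDK v3 (bundled with the Node.js 22.x runtime) and a hypothetical composite key scheme for `ProductVisitKey`; the project's actual code may differ:

```javascript
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

export const handler = async (event) => {
  // Batch size is 1, but iterating keeps the handler correct if that changes.
  for (const record of event.Records) {
    const visit = JSON.parse(record.body);
    await ddb.send(new PutCommand({
      TableName: "ProductVisits",
      Item: {
        // Hypothetical key scheme: composite of product ID and visit time.
        ProductVisitKey: `${visit.ProductId}#${visit.TimeOfVisit}`,
        ...visit,
      },
    }));
    console.log(`Stored visit for product ${visit.ProductId}`); // visible in CloudWatch Logs
  }
};
```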
Testing & Validation
Used AWS CLI from CloudShell to send JSON-formatted product order messages to the SQS queue. Each message contained product details (ProductId, ProductName, Category, Price, CustomerId, CustomerName, TimeOfVisit).
The Lambda function successfully processed messages from the queue and wrote them to DynamoDB. Verified data integrity by scanning the table, confirming that 5 product orders were stored correctly.
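An illustrative version of that test flow (the queue URL, region, and field values are placeholders, not the project's actual identifiers):

```bash
# Send a JSON-formatted product order to the queue
aws sqs send-message \
  --queue-url https://sqs.<region>.amazonaws.com/<account-id>/ProductVisitsDataQueue \
  --message-body '{"ProductId":"P100","ProductName":"Widget","Category":"Tools","Price":"19.99","CustomerId":"C200","CustomerName":"Jane Doe","TimeOfVisit":"2025-01-15T10:30:00Z"}'

# Confirm the items landed in DynamoDB
aws dynamodb scan --table-name ProductVisits --select COUNT
```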
Key Learnings
- SQS provides reliable decoupling between message producers and consumers
- Lambda scales automatically to handle queue depth
- CloudWatch Logs is essential for debugging serverless applications
- DynamoDB on-demand capacity eliminates the need for capacity planning
Part 2: Data Lake Integration with DynamoDB Streams
Objective
Enable real-time data streaming from DynamoDB to an S3 data lake. Capture item-level changes using DynamoDB Streams and trigger a Lambda function to write the data to S3 in a partitioned folder structure.
Implementation Details
DynamoDB Streams Configuration
- View Type: New image (captures the entire item after modification)
- Purpose: Stream changes from the `ProductVisits` table to downstream consumers
S3 Data Lake
- Bucket Name: `product-visits-datalake-98329713`
- Folder Structure: `data/YYYY/MM/DD/` (date-based partitioning)
- Purpose: Store raw product visit records for analytics and reporting
Lambda Function for Stream Processing
- Function Name: `productVisitsDatalakeLoadingHandler`
- Runtime: Node.js 22.x
- Trigger: DynamoDB Stream with batch size 1
- IAM Role: `productVisitsLoadingLambdaRole`
- Permissions: DynamoDB Streams (GetRecords, GetShardIterator, DescribeStream, ListStreams), S3 (PutObject on the specific bucket), CloudWatch Logs
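A sketch of the loading logic, assuming AWS SDK v3; the stream record's `NewImage` is unmarshalled from DynamoDB's attribute-value format before being written as JSON, and the object-naming scheme is an assumption:

```javascript
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { unmarshall } from "@aws-sdk/util-dynamodb";

const s3 = new S3Client({});
const BUCKET = "product-visits-datalake-98329713";

export const handler = async (event) => {
  for (const record of event.Records) {
    if (!record.dynamodb?.NewImage) continue; // REMOVE events have no new image
    const item = unmarshall(record.dynamodb.NewImage);
    const now = new Date();
    const yyyy = now.getUTCFullYear();
    const mm = String(now.getUTCMonth() + 1).padStart(2, "0");
    const dd = String(now.getUTCDate()).padStart(2, "0");
    // Date-partitioned key: data/YYYY/MM/DD/<stream-event-id>.json
    await s3.send(new PutObjectCommand({
      Bucket: BUCKET,
      Key: `data/${yyyy}/${mm}/${dd}/${record.eventID}.json`,
      Body: JSON.stringify(item),
      ContentType: "application/json",
    }));
  }
};
```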
IAM Policy Configuration
The custom-managed policy `productVisitsLoadingLambdaPolicy` initially had incorrect S3 permissions. CloudWatch Logs revealed an AccessDenied error on the S3 PutObject call; the fix was to update the policy with the exact bucket ARN, as sketched below.
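The corrected statement likely resembled the following (a sketch, not the project's exact policy document; note the `/*` suffix, which scopes PutObject to objects inside the bucket):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::product-visits-datalake-98329713/*"
    }
  ]
}
```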
Testing & Validation
Re-sent messages to the SQS queue via CloudShell to trigger DynamoDB writes, which in turn triggered the stream.
Verified that the S3 bucket contains date-partitioned folders with product visit data.
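A quick way to spot-check the partitioned layout from the CLI (illustrative; the actual object listing will vary):

```bash
aws s3 ls s3://product-visits-datalake-98329713/data/ --recursive
```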
Key Learnings
- DynamoDB Streams enable real-time data replication and ETL workflows
- CloudWatch Logs is critical for diagnosing permission issues
- IAM policies must specify exact resource ARNs for S3 bucket access
- Date-based partitioning in S3 simplifies data organization and query performance
Part 3: Web Frontend with API Gateway Integration
Objective
Build a static web frontend that submits product visit data through API Gateway, which triggers Lambda to send messages to SQS, completing the end-to-end data flow.
Implementation Details
Lambda Function for API Integration
- Function Name: `productVisitsSendDataToQueue`
- Runtime: Node.js 22.x
- Purpose: Receives HTTP requests from API Gateway and sends messages to SQS
- IAM Role: `productVisitsSendMessageLambdaRole`
- Permissions: SQS SendMessage, CloudWatch Logs
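A plausible shape for this function, assuming Lambda proxy integration and a hypothetical `QUEUE_URL` environment variable (the env var is not confirmed by the project):

```javascript
import { SQSClient, SendMessageCommand } from "@aws-sdk/client-sqs";

const sqs = new SQSClient({});

export const handler = async (event) => {
  // With proxy integration, the browser's JSON payload arrives as a string in event.body.
  await sqs.send(new SendMessageCommand({
    QueueUrl: process.env.QUEUE_URL, // hypothetical env var holding the ProductVisitsDataQueue URL
    MessageBody: event.body,
  }));
  return {
    statusCode: 200,
    headers: { "Access-Control-Allow-Origin": "*" }, // CORS header for browser callers
    body: JSON.stringify({ message: "Product visit queued" }),
  };
};
```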
IAM Policy for SQS Access
Custom-managed policy `productVisitsSendMessageLambdaPolicy` grants the function permission to send messages to the queue; a sketch of the likely statement follows.
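The statement presumably looked something like this (region and account ID are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sqs:SendMessage",
      "Resource": "arn:aws:sqs:<region>:<account-id>:ProductVisitsDataQueue"
    }
  ]
}
```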
API Gateway REST API
- API Name: `productVisit`
- Type: REST API (Regional)
- Resource: `/productVisit`
- Method: `PUT` with Lambda proxy integration
- CORS: Enabled (required for browser requests)
Static Website Hosting
- S3 Bucket: `product-visits-webform-48466948`
- Configuration: Static website hosting enabled
- Files: HTML form, API Gateway JavaScript SDK (apigClient.js), required libraries
- Deployment: `aws s3 sync` from CloudShell
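The deployment step amounts to something like the following (the local directory name is a placeholder):

```bash
aws s3 sync ./webform s3://product-visits-webform-48466948/
```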
Web Form
A simple HTML form collects the product visit fields (ProductId, ProductName, Category, Price, CustomerId, CustomerName) and submits them through the API Gateway JavaScript SDK; a minimal sketch follows.
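Assumed markup and wiring, not the project's exact page; `apigClientFactory.newClient()` and the `productVisitPut` method name follow the standard conventions of the SDK that API Gateway generates for a `PUT` on `/productVisit`:

```html
<form id="productVisitForm">
  <input name="ProductId" placeholder="Product ID" required>
  <input name="ProductName" placeholder="Product Name" required>
  <input name="Category" placeholder="Category">
  <input name="Price" type="number" step="0.01" placeholder="Price">
  <input name="CustomerId" placeholder="Customer ID">
  <input name="CustomerName" placeholder="Customer Name">
  <button type="submit">Submit Visit</button>
</form>
<script src="apigClient.js"></script>
<script>
  const apigClient = apigClientFactory.newClient();
  document.getElementById("productVisitForm").addEventListener("submit", (e) => {
    e.preventDefault();
    const body = Object.fromEntries(new FormData(e.target).entries());
    body.TimeOfVisit = new Date().toISOString(); // hypothetical: stamp the visit time client-side
    apigClient.productVisitPut({}, body, {});
  });
</script>
```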
Testing & Validation
Submitted product visits through the web form. Data flowed through API Gateway → Lambda → SQS → Lambda → DynamoDB. Verified 18 total items in DynamoDB (5 from CLI testing + 13 from the web form).
Key Learnings
- API Gateway with Lambda proxy integration simplifies request/response handling
- CORS must be enabled for browser-based API calls
- API Gateway JavaScript SDK handles request signing and authentication
- S3 static website hosting provides cost-effective frontend hosting
- End-to-end serverless architecture eliminates server management overhead
Part 4: Compliance Monitoring & Alerting
Objective
Implement automated change tracking and real-time alerting for S3 bucket configuration changes. Use CloudTrail for audit logging, EventBridge for event detection, and SNS for email notifications.
Implementation Details
CloudTrail Configuration
- Trail Name: `management-events`
- Scope: Multi-region (captures API calls across all regions)
- Event Type: Management events only (no data events)
- S3 Bucket: `aws-cloudtrail-logs-760660567681-c11ca38b`
- Purpose: Centralized audit log for all AWS API calls
SNS Topic for Email Alerts
- Topic Name: `MyEmailNotification`
- Type: Standard
- Subscription: Email (confirmed)
- Purpose: Deliver real-time alerts to administrator inbox
EventBridge Rule
- Rule Name: `BucketChangesAlert`
- Event Pattern: S3 bucket configuration changes
  - PutBucketPolicy
  - DeleteBucketPolicy
  - PutBucketVersioning
  - PutBucketPublicAccessBlock
- Target: SNS topic (MyEmailNotification)
- Purpose: Trigger email alerts on bucket configuration changes
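The event pattern behind the rule above likely resembled this sketch (matching CloudTrail-sourced S3 management events; the project's exact pattern may differ):

```json
{
  "source": ["aws.s3"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["s3.amazonaws.com"],
    "eventName": [
      "PutBucketPolicy",
      "DeleteBucketPolicy",
      "PutBucketVersioning",
      "PutBucketPublicAccessBlock"
    ]
  }
}
```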
Testing & Validation
Deleted the bucket policy on the product-visits-webform bucket (then immediately restored it). CloudTrail captured the API call, EventBridge matched the event pattern, and SNS delivered the email alert within seconds.
Key Learnings
- CloudTrail provides comprehensive audit logging for compliance and security
- EventBridge enables event-driven automation based on AWS service events
- SNS delivers real-time notifications with minimal latency
- Automated alerting helps maintain security posture and detect unauthorized changes
- Event pattern filtering reduces alert noise by focusing on meaningful changes
Architecture Diagram
Data Flow:
- User submits product visit form → S3 static website
- JavaScript calls API Gateway → `/productVisit` PUT method
- API Gateway triggers `productVisitsSendDataToQueue` Lambda
- Lambda sends message to `ProductVisitsDataQueue` SQS
- SQS triggers `productVisitsDataHandler` Lambda
- Lambda writes to `ProductVisits` DynamoDB table
- DynamoDB Stream captures change
- Stream triggers `productVisitsDatalakeLoadingHandler` Lambda
- Lambda writes to `product-visits-datalake` S3 bucket
- CloudTrail logs all API calls
- EventBridge monitors S3 configuration changes
- EventBridge triggers SNS for email alerts
Components:
- 3 Lambda functions (Node.js 22.x)
- 1 DynamoDB table with Streams enabled
- 1 SQS queue (Standard)
- 2 S3 buckets (static website + data lake)
- 1 API Gateway REST API
- 1 CloudTrail trail (multi-region)
- 1 EventBridge rule
- 1 SNS topic with email subscription
- 3 IAM roles with custom policies
- CloudWatch Logs for observability
Skills Demonstrated
- Serverless architecture design and implementation
- Event-driven processing with SQS and Lambda
- API design and integration with API Gateway
- Data streaming with DynamoDB Streams
- Data lake design with S3 partitioning strategy
- IAM policy creation and least-privilege access control
- CloudWatch Logs for troubleshooting and debugging
- CloudTrail for audit logging and compliance
- EventBridge for event-driven automation
- SNS for real-time notifications
- AWS CLI and CloudShell for testing and deployment
- Static website hosting on S3
- Cross-service integration and orchestration
- Error handling and iterative debugging
Lessons Learned
- CloudWatch Logs is essential: Every Lambda invocation, error, and permission issue was diagnosed through CloudWatch Logs
- IAM policies require precise ARNs: Generic wildcards often fail; specify exact resource ARNs for S3 and other services
- SQS decoupling is powerful: Separating message ingestion from processing allows independent scaling and error handling
- DynamoDB Streams enable real-time ETL: Stream processing patterns are ideal for building data lakes and analytics pipelines
- API Gateway + Lambda simplifies API development: No servers to manage, automatic scaling, and built-in authentication options
- Event-driven monitoring reduces manual overhead: Automated alerts catch configuration drift immediately
- Incremental testing is critical: Building and testing each component separately simplifies debugging
Future Enhancements
- Add Amazon Cognito for user authentication and authorization
- Implement API Gateway request validation and throttling
- Add DLQ (Dead Letter Queue) for failed message processing
- Build analytics dashboards with QuickSight or custom visualizations
- Implement data retention policies and lifecycle management
- Add CloudWatch alarms for Lambda errors and DynamoDB throttling
- Convert infrastructure to Infrastructure as Code (Terraform or CloudFormation)
- Add CI/CD pipeline for automated deployments
- Implement data encryption at rest and in transit
- Add X-Ray tracing for distributed request tracking
Conclusion
This serverless application demonstrates a complete, production-ready architecture for capturing, processing, and analyzing product visit data. By leveraging AWS managed services, the solution achieves:
- Scalability: Managed services scale automatically with traffic volume
- Reliability: Built-in redundancy and automatic retries
- Cost Efficiency: Pay-per-use pricing with no idle infrastructure costs
- Observability: Comprehensive logging and monitoring
- Security: Least-privilege IAM policies and automated change detection
- Maintainability: Serverless architecture eliminates server patching and management
The project showcases expertise in AWS cloud services, event-driven architecture, IAM security, and modern application development patterns. It represents the type of scalable, cost-effective solutions that drive business value in cloud-native environments.