The journey of creating a serverless service to scrape contents from a web page
5 min read · Apr 30, 2022
In this article, I will share my experience of using Serverless Framework to build a service to extract the HTML content of a web page and store it in an S3 bucket. This article covers the tech stack, unit tests and CI/CD.
It is a very simple service, but it gave me an opportunity to experiment with a few ideas that I would like to share.
Serverless Framework
The reasons I chose the Serverless Framework are:
Infrastructure as code
It allows us to specify all the AWS services to be used in a file called serverless.yml. The syntax is straightforward and is also very flexible.
- Following the separation of concerns principle, we can put related configuration in separate files and reference them from the main configuration file. For example, I keep the IAM configuration for the S3 bucket and the S3 bucket definition itself in two separate files.
- We can use variables in the serverless.yml file to fill in configuration values dynamically. For example, I use variables to reference other resources defined either in the main configuration file or in the separate configuration files.
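To make the two points above concrete, here is a minimal sketch of how such a layout could look. It is not the actual configuration from my project: the service name, file paths, bucket name, and the `PagesBucket` resource ID are all hypothetical, chosen only for illustration. It relies on the Serverless Framework's `${file(...)}`, `${self:...}` and `${opt:...}` variable syntax.

```yaml
# serverless.yml (main configuration; every name and path here is hypothetical)
service: page-scraper

provider:
  name: aws
  # Use the --stage CLI option if given, otherwise fall back to 'dev'
  stage: ${opt:stage, 'dev'}
  iam:
    role:
      statements:
        # The IAM statement for the bucket lives in its own file
        - ${file(./resources/s3-bucket-iam.yml)}

custom:
  # Built from other values in this same file
  bucketName: ${self:service}-${self:provider.stage}-pages

resources:
  Resources:
    # The bucket definition also lives in its own file
    PagesBucket: ${file(./resources/s3-bucket.yml)}
---
# resources/s3-bucket.yml (hypothetical separate file)
Type: AWS::S3::Bucket
Properties:
  BucketName: ${self:custom.bucketName}
---
# resources/s3-bucket-iam.yml (hypothetical separate file)
Effect: Allow
Action:
  - s3:PutObject
Resource: arn:aws:s3:::${self:custom.bucketName}/*
```

The three sections are separate files, shown together here for brevity. Because both resource files read `${self:custom.bucketName}` from the main file, the bucket name is defined in exactly one place.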