This post demonstrates the iterative approach, an integral part of agile development, applied to my personal project. The project started small as a minimum viable product (MVP) and evolved by adding new features and refining existing ones.
The centrepiece of this post is my Photo of the Day bot, which started as a simple script invoked manually on a local machine. The bot evolved into a dockerized application running in the cloud, continuously built and deployed.
The idea goes a long way back. At the beginning, there was no clear vision of which features and technologies would be used. Over time, I picked some technologies from my consulting projects and others out of curiosity. Either way, starting on a greenfield is a different experience from using a technology at a client’s site. One needs to dig deeper, start from scratch, tweak the configuration, customize and troubleshoot.
The simple script evolved into a continuously built and deployed dockerized application. It would not only post a photo to Twitter but also extract EXIF information, post to Telegram, integrate with Amazon’s S3 and ECR, and auto-generate hashtags using one of TensorFlow’s basic models.
The best learning experience is applying what one has learned to something practical. I only got this far because I wanted to make something for myself and the people around me that would cheer our days up. ❤️
From the beginning …
During my travels, I shoot lots of beautiful photos, but only a handful of them ever see the light of day. If some do, it might be just for a moment while showing them to my friends at a cafe. And that’s a shame.
MVP – Minimum Viable Product
The first version of the script would scan a photo folder, pick a random photo and post it to Twitter via the API. Within a few hours, I had a working MVP.
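A minimal sketch of what that first version might have looked like (the original code isn’t shown in this post, so the `tweepy` client, the placeholder keys and the function names are my assumptions):

```python
import random
from pathlib import Path


def pick_photo(folder):
    """Scan a folder and pick a random .jpg photo to post."""
    photos = sorted(Path(folder).glob("*.jpg"))
    return random.choice(photos) if photos else None


def post_to_twitter(photo_path, status=""):
    """Post the photo via the Twitter API.

    tweepy is one common client library; the actual library and
    credentials used by the bot are not given in the post.
    """
    import tweepy  # assumed dependency

    auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
    auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
    api = tweepy.API(auth)
    api.update_with_media(str(photo_path), status=status)
```

Run daily (or manually, as the MVP was), this is already the whole feature: scan, pick, post.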
The Twitter API only accepts image files smaller than 3 MB. Keeping an eye on the file size manually would not be practical, so in the next iteration I used the Pillow library to keep resizing the image until the file size drops under 3 MB. Pillow can also extract a file’s metadata, transform images and much more. My bot extracts the date, GPS coordinates and camera information.
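The resize loop and the EXIF lookup can be sketched roughly like this (the function names and the 10%-per-step shrink factor are illustrative, not the bot’s actual code):

```python
from io import BytesIO

from PIL import ExifTags, Image


def shrink_under_limit(img, max_bytes=3_000_000, quality=85):
    """Re-encode as JPEG, downscaling ~10% per pass until under max_bytes."""
    while True:
        buf = BytesIO()
        img.convert("RGB").save(buf, format="JPEG", quality=quality)
        if buf.tell() <= max_bytes:
            return buf.getvalue()
        # still too big: shrink both dimensions and try again
        img = img.resize((max(1, int(img.width * 0.9)),
                          max(1, int(img.height * 0.9))))


def read_exif(img):
    """Map numeric EXIF tag ids to readable names (date, GPS, camera)."""
    return {ExifTags.TAGS.get(tag, tag): value
            for tag, value in img.getexif().items()}
```

The returned bytes can be uploaded directly, so the on-disk original is never modified.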
Another iteration focused on rewriting the spaghetti script with its few methods into an object-oriented design. I also refactored a lot of the code and added Telegram posting support.
Some time later, I saw someone using TensorFlow for image recognition. What better way to try a new technology out than to implement it in the bot and present the outcome as hashtags! It’s 2020, and I’m definitely not going to tag photos manually. 😀🤦🏼‍♀️
As of 2021, instead of replatforming to TensorFlow 2.0, I switched to Rekognition and got rid of a good number of libraries.
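A hedged sketch of how Rekognition labels could be turned into hashtags (the wrapper and helper names are hypothetical; only the `detect_labels` API call itself is real boto3):

```python
def labels_to_hashtags(labels, limit=5):
    """Turn label names into lowercase, space-free hashtags."""
    return ["#" + label.replace(" ", "").lower() for label in labels[:limit]]


def detect_labels(bucket, key, min_confidence=80):
    """Ask Rekognition for labels of an image already sitting in S3."""
    import boto3  # assumed available in the bot's environment

    client = boto3.client("rekognition")
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=min_confidence,
    )
    return [label["Name"] for label in response["Labels"]]
```

Because Rekognition is a managed API, the TensorFlow model and its dependencies drop out of the image entirely.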
Because the image recognition model was bundled with TensorFlow, I decided to containerize the application to make building and running it a more automated and coherent process. Later, containerizing paid off, as it made cloud hosting a relatively simple task.
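A Dockerfile for a bot like this might look roughly as follows (the file names `requirements.txt` and `bot.py` are assumptions, not taken from the project):

```dockerfile
FROM python:3.8-slim

WORKDIR /app

# install dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["python", "bot.py"]
```

One image now carries the code, the libraries and (originally) the model, so `docker run` behaves the same locally and in the cloud.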
S3 – Simple Storage Service
At this point, the bot relied on image data present in its own folder structure. This tight coupling would be impractical for a cloud-hosted scheduled task, so to decouple the data from the code, the images had to live in storage independent of the code. An ideal use case for AWS S3! The bot uses the boto3 library to retrieve an image from S3 and post it. Easy!
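The S3 retrieval can be sketched with boto3 like this (the bucket layout and helper names are illustrative; `list_objects_v2` and `get_object` are the real boto3 calls):

```python
import random

IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png")


def choose_photo(keys):
    """Pick a random image key from an S3 object listing."""
    candidates = [k for k in keys if k.lower().endswith(IMAGE_SUFFIXES)]
    return random.choice(candidates) if candidates else None


def fetch_random_photo(bucket):
    """List the bucket, pick one image and download its bytes."""
    import boto3  # assumed available in the bot's environment

    s3 = boto3.client("s3")
    listing = s3.list_objects_v2(Bucket=bucket).get("Contents", [])
    key = choose_photo([obj["Key"] for obj in listing])
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    return key, body
```

The container now needs nothing on disk but credentials; the photo library lives entirely in the bucket.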
ECS – Elastic Container Service
Up to this point, running the bot was a manual process that brought no added value. On my way back from a conference, I sat next to an AWS architect on the aeroplane. During our conversation, I asked whether he had any ideas on hosting a dockerized application in AWS, and he recommended taking a look at ECS, namely the Fargate launch type.
The Fargate launch type allows one to run Docker images without maintaining infrastructure or EC2 cluster node instances. It’s a lightweight launch type where one specifies a Task Definition (what to run, with resource quotas) and creates a task on the cluster.
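A Fargate Task Definition for such a bot could look roughly like this (the account ID, region, role and names are placeholders, and the smallest CPU/memory combination is my assumption):

```json
{
  "family": "photo-bot",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "photo-bot",
      "image": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/photo-bot:latest",
      "essential": true
    }
  ]
}
```

The quotas declared here (0.25 vCPU, 512 MB) are also exactly what Fargate bills for while the task runs.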
Pricing is based on the requested CPU and memory resources per hour. My bot runs as a scheduled task every day for approximately two minutes, so my bill is around 0.70 USD per month.
Azure DevOps Pipelines
The last piece of the puzzle was implementing a continuous deployment process. At the time, I was part of a project using Azure DevOps pipelines for infrastructure and data platform deployment.
We had a DevOps team that did all the heavy lifting, and my team was only making minor adjustments to their foundation. I wanted to get some hands-on experience, so I set up a continuous build pipeline for my bot.
My build pipeline is hooked to the bot’s master branch. When there’s a new commit, the pipeline pulls the repository content, inserts secrets (Twitter, Telegram and S3 API keys), invokes the Docker build task and pushes the image to the Elastic Container Registry (ECR).
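An `azure-pipelines.yml` along those lines might look like this (the variable names, region and image name are assumptions; the secret-insertion step from the actual pipeline is omitted):

```yaml
trigger:
  - master

pool:
  vmImage: ubuntu-latest

steps:
  - script: docker build -t photo-bot:$(Build.BuildId) .
    displayName: Build the image

  - script: |
      aws ecr get-login-password --region eu-west-1 |
        docker login --username AWS --password-stdin $(ECR_REGISTRY)
      docker tag photo-bot:$(Build.BuildId) $(ECR_REGISTRY)/photo-bot:$(Build.BuildId)
      docker push $(ECR_REGISTRY)/photo-bot:$(Build.BuildId)
    displayName: Push to ECR
    env:
      AWS_ACCESS_KEY_ID: $(AWS_ACCESS_KEY_ID)
      AWS_SECRET_ACCESSKEY: $(AWS_SECRET_ACCESS_KEY)
```

The AWS credentials live in the pipeline’s secret variables, so nothing sensitive is committed to the repository.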
Building the application was a great experience. Throughout the process, I got to learn new technologies, find ways of applying them to the problem and make them work together. Where at the beginning stood only a Python script with third-party libraries, there now stand Docker, AWS with S3, ECR and ECS, and Azure DevOps.
The bot’s ECS Scheduled Task is invoked and posts to Twitter every day at 6 am CET.
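For reference, CloudWatch Events schedule expressions are evaluated in UTC, so 6 am CET (UTC+1) corresponds to a rule like this (the rule name is a placeholder, and the fixed expression does not adjust for daylight saving):

```shell
# 05:00 UTC = 06:00 CET; runs once a day
aws events put-rule \
  --name photo-bot-daily \
  --schedule-expression "cron(0 5 * * ? *)"
```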