I Built a Visa Requirement Change Tracker for Fun


Have you ever wondered how visa requirements between countries change over time? I certainly have. It all started when I was planning an international trip and needed to check if I needed a visa for the country I was visiting. A quick Google search gave me the answer, but it sparked a bigger question: How have visa requirements evolved over the years?

I was wondering if countries are getting more relaxed about international travel or if things are getting stricter. Also, I wanted to keep track of when and how visa rules shift between countries in the future, for example:

Recent Visa requirement changes for United States (more examples later)

Surprisingly, I couldn't find a good source for historical visa requirement data online. So, I figured I’d have fun creating something simple to track it. Hopefully, it will stick around for years to come!

Finding the Right Data Source

My first step was to find a credible source. In this case, I turned to the Henley Passport Index, the same site where I initially checked visa requirements.

A quick inspection of the network requests revealed two useful API endpoints:

Inspecting the network tab from the website
  1. api.henleypassportindex.com/api/v3/countries
  2. api.henleypassportindex.com/api/v3/visa-single/:country_iso_code

These APIs provided current visa requirements and historical passport strength data for each country — perfect!

I always prefer using APIs instead of scraping HTML. API data is way more organized and doesn’t tend to break when the website design changes.
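To give a feel for what hitting those two endpoints might look like, here's a minimal stdlib-only sketch. The URLs come from the network tab above; the response shapes are undocumented, so the helper names and the parsing here are my own assumptions:

```python
import json
import urllib.request

BASE = "https://api.henleypassportindex.com/api/v3"

def visa_url(iso_code: str) -> str:
    # Endpoint for a single country's visa requirements
    # (assumption: the API expects upper-case ISO codes)
    return f"{BASE}/visa-single/{iso_code.upper()}"

def fetch_json(url: str):
    # Plain stdlib fetch; a real scraper would add retries and error handling
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)

# countries = fetch_json(f"{BASE}/countries")
# singapore = fetch_json(visa_url("sg"))
```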

Designing the Database Schema

I picked SQLite because it's super easy to use and I am familiar with it.

Based on the available data and the questions I wanted to answer, I settled on this schema:

```mermaid
erDiagram
    Country ||--o{ CountryRanking : has
    Country ||--o{ VisaRequirement : "issues/receives"
    Country {
        text code PK
        text name
        text region
    }
    CountryRanking {
        text country_code PK, FK
        int year PK
        int rank
        int visa_free_count
    }
    VisaRequirement {
        text from_country PK, FK
        text to_country PK, FK
        date effective_date PK
        text requirement_type
    }
```

I hope I don't end up regretting using natural keys here, but honestly, I think it makes sense in this case.
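Translated into SQLite DDL, the diagram above might look roughly like this (a sketch, not the project's actual migration code; note that SQLite has no native DATE type, so dates are stored as ISO-8601 text):

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS Country (
    code   TEXT PRIMARY KEY,
    name   TEXT NOT NULL,
    region TEXT
);
CREATE TABLE IF NOT EXISTS CountryRanking (
    country_code    TEXT REFERENCES Country(code),
    year            INTEGER,
    rank            INTEGER,
    visa_free_count INTEGER,
    PRIMARY KEY (country_code, year)
);
CREATE TABLE IF NOT EXISTS VisaRequirement (
    from_country     TEXT REFERENCES Country(code),
    to_country       TEXT REFERENCES Country(code),
    effective_date   TEXT,  -- ISO-8601 string, e.g. '2024-01-01'
    requirement_type TEXT,
    PRIMARY KEY (from_country, to_country, effective_date)
);
"""

def init_db(path: str = "passportindex.db") -> sqlite3.Connection:
    # Create the tables if they don't exist yet and return the connection
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```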

💬

If you tried hitting the /v3/countries endpoint above, you'd notice a field called openness. I'm not exactly sure what it means, but it seemed to have the same value for every single country, so I'm omitting it here.

Designing the System

My goals were simple:

  1. Keep my cost as low as possible (ideally free to host)
  2. Make it easy to maintain (e.g. simple code, minimizing the number of interacting components)
  3. Make it easy to share with others, i.e. host it on the Internet

The Cron Job

I decided to use a technique I call "GHActions Scraping", which I've detailed in my previous post.

Basically, the idea is to use:

  1. GitHub Actions as a cron job for scraping tasks
  2. Workflow Artifacts for storing the SQLite database, eliminating the need for a separate database server

SQLite is perfect for this use case because it's just a flat file that can be easily uploaded to and downloaded from GitHub Artifacts. Anyway, here's what the project would look like:

```mermaid
graph TB
    subgraph Railway
        deployment[Datasette]
        class deployment vercel;
    end
    subgraph GitHub
        subgraph Actions
            scraper[scrape.py]
        end
        subgraph Artifacts
            db[(passportindex.db)]
            class db artifacts;
        end
    end
    subgraph Henley Passport Index
        api[API]
    end
    subgraph DockerHub
        dockerhub[Docker Hub]
    end
    db --> |1: Download| scraper
    api --> |2: Fetch Data| scraper
    scraper --> |3: Upload| db
    scraper --> |4: Publish to Docker Hub| dockerhub
    dockerhub --> |5: Pull Image and Deploy| deployment
    deployment --> |6: View/Access Data| client[User]
```

💡

You can see the full GitHub Actions workflow I've set up for this project here.

The Scraper

Having written several scrapers before (from finding cheap craft beers to Esports schedules), I opted for a simple Python script. No frameworks whatsoever.

The logic was straightforward:

  1. Fetch the list of all 227 countries and travel destinations
  2. For each country code, fetch its visa requirements
  3. Parse and store this data in SQLite

The script is designed to update existing records if they've changed and only add new ones when necessary to avoid duplication.
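One way to get that update-or-insert behavior in SQLite is an upsert on the composite primary key. Here's a minimal sketch (the actual script on GitHub may handle this differently; the function name is mine):

```python
import sqlite3

def upsert_requirement(conn: sqlite3.Connection, from_c: str, to_c: str,
                       effective_date: str, req_type: str) -> None:
    # Insert a new row, or update requirement_type if this key already
    # exists, so re-running the scraper never creates duplicate rows.
    conn.execute(
        """
        INSERT INTO VisaRequirement
            (from_country, to_country, effective_date, requirement_type)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (from_country, to_country, effective_date)
        DO UPDATE SET requirement_type = excluded.requirement_type
        """,
        (from_c, to_c, effective_date, req_type),
    )
```

The `ON CONFLICT ... DO UPDATE` clause (available since SQLite 3.24) keys off the same composite primary key as the schema, so one scrape run is safely idempotent.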

You can find the entire Python script on GitHub.

Hosting and Display

For the cron job, I use GitHub Actions. Since I plan to run the job only twice a month, it's essentially free.

To display the data, I chose Datasette hosted on Railway. While Datasette may not be the fanciest looking choice, it gets a lot done without requiring extensive frontend work which I am not really good at.

Continuous Deployment (CD) With Railway Docker Image Source

While setting up CD with Railway, I ran into a little hiccup. Whenever I scrape new data and update my SQLite DB, I have to build a new Docker image. The problem is that we're using a Docker image as our deployment source.

Right now, Railway has no way of knowing when a new Docker image is published on Docker Hub, so it doesn’t automatically deploy the latest one (the railway up command doesn’t do the trick for redeploying Docker images as it tries to build with Nixpacks instead). To get around this, I had to check the last successful deploy ID and then use that ID to trigger a redeployment to make CD work properly for my project (example). If you're hosting Datasette on Vercel, it’s way easier with the datasette-publish-vercel plugin!

The Results (Some Screenshots)

If you're curious about what everything looks like on Datasette, here are some of the interesting findings that I've gathered:

Top 10 countries with the most improved passport rankings (circa 2024)
Average visa-free counts by region for the last 5 years (circa 2024)

My personal favorite is the non-reciprocal visa requirements by country:

Visa Requirement Reciprocity Summary
Non-reciprocal visa relationships refer to situations where two countries have different visa requirements for each other's citizens (circa 2024)
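That reciprocity view boils down to a self-join on the requirements table. A sketch of how the query might look (assuming one current row per country pair; the Datasette query behind the screenshot may differ):

```python
import sqlite3

def non_reciprocal(conn: sqlite3.Connection):
    # Pair each requirement with the opposite-direction row and keep
    # only pairs where the two requirements differ.
    return conn.execute("""
        SELECT a.from_country, a.to_country,
               a.requirement_type AS outbound,
               b.requirement_type AS inbound
        FROM VisaRequirement a
        JOIN VisaRequirement b
          ON a.from_country = b.to_country
         AND a.to_country   = b.from_country
        WHERE a.requirement_type <> b.requirement_type
          AND a.from_country < a.to_country  -- report each pair once
    """).fetchall()
```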

Concerns and Caveats

I've thought about some things that could go wrong with this project. Here they are:

API Problems

I'm using an API that isn't officially documented. This means it could stop working as is at any time. Or maybe, the data structure might just change without notice. If that happens, I'll need to update the script or find a new data source, which would be a pain.

Losing Interest

I might get bored or tired of fixing this if it breaks. I've kept my old projects running so far, but it gets harder as I make more things. If it becomes too much work, I might have to shut it down — which is something I sometimes think about.

Increasing Costs

I hope the costs stay low. I could switch to a static site to potentially save money (Observable Framework is a strong contender here), but Datasette is just so good!

That’s It

The website is now up and running!

I plan to come back to this periodically to see how resilient the setup is and how visa policies change. Who knows what interesting patterns we might find over time?