Show HN: Automated, SEO optimized anime figure list with AI generated content
myfigurelist.com

Hey HN, I'm the author of this post.
I wanted to learn about SEO, performance, and page ranking. As an anime fan, I decided to build a platform that discovers anime figure products, writes accurate descriptions of the product and the character, and lists the price at every shop, all through a fully automated process.
I just finished the project and would like to share what I've learned about SEO and performance, along with the interesting tech aspects behind this result.
Here are all the steps to discover new figures and create a relevant page for each:
1) My crawler looks for figures that don't exist in my DB. All Japanese figures carry a unique identifier called a "jancode" (a Japanese Article Number, the EAN-13 barcode used in Japan), so the crawler visits official figure websites and hunts for jancodes it hasn't seen yet (see the spider sketch after this list).
2) Once a new figure is discovered, it is added to a queue for further processing.
3) Using image AI, I analyze the product photos and extract labels that describe the figure (for example accessories such as a hat or a staff, the pose, etc.). With these image labels and basic info about the figure, such as the original series name or character, I use ChatGPT to generate a visual description of the figure and explain the background of the original series (see the generation sketch after this list).
4) Once I have all the needed information about the figure, I use the jancode to look it up on multiple shops and compare prices (see the price-collection sketch after this list).
5) Then a page is published to the platform with all the data.
6) Finally, a sitemap.xml generator exposes all figure pages in real time so search engine crawlers can discover newly added pages (see the sitemap sketch after this list).
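To give an idea of step 1, here is a minimal Scrapy spider sketch. The site URL, CSS selectors, and the DB-lookup stub are all my assumptions for illustration; the real spiders target the actual manufacturer sites:

    import re
    import scrapy

    class FigureSpider(scrapy.Spider):
        # Illustrative only: start URL, selectors, and the DB check are made up.
        name = "figures"
        start_urls = ["https://example-figure-maker.jp/new-releases"]

        def parse(self, response):
            for href in response.css("a.product::attr(href)").getall():
                yield response.follow(href, callback=self.parse_product)

        def parse_product(self, response):
            # JAN codes are EAN-13 barcodes; Japanese prefixes start with 45 or 49.
            match = re.search(r"\b4[59]\d{11}\b", response.text)
            if match and not self.jancode_exists(match.group(0)):
                yield {"jancode": match.group(0), "url": response.url}

        def jancode_exists(self, jancode):
            # Stub standing in for the real "is it already in my DB?" query.
            return False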
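For step 3, here is a rough sketch of the generation call. The model name, prompt wording, and the generate_description helper are my own choices for the example, not necessarily what the site uses:

    from openai import OpenAI

    client = OpenAI()  # expects OPENAI_API_KEY in the environment

    def generate_description(character, series, image_labels):
        # Prompt and model are illustrative, not the production ones.
        prompt = (
            f"Write a short visual description of an anime figure of {character} "
            f"from {series}. Details detected in the photos: {', '.join(image_labels)}. "
            "Then add one paragraph about the background of the series."
        )
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    print(generate_description("Megumin", "KonoSuba", ["hat", "staff", "kneeling pose"]))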
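Step 4 boils down to querying each shop with the same jancode and keeping whatever answers. The endpoints and response shape below are hypothetical; in practice every shop needs its own scraper or API client:

    import requests

    # Hypothetical endpoints; each real shop needs a dedicated integration.
    SHOPS = {
        "shop-a": "https://shop-a.example/api/search?jan={jan}",
        "shop-b": "https://shop-b.example/api/search?jan={jan}",
    }

    def collect_prices(jancode):
        prices = {}
        for shop, template in SHOPS.items():
            try:
                resp = requests.get(template.format(jan=jancode), timeout=10)
                resp.raise_for_status()
                prices[shop] = resp.json()["price"]  # assumed response shape
            except (requests.RequestException, KeyError, ValueError):
                continue  # shop is down or doesn't list this figure
        return prices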
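And for step 6, the sitemap is essentially the figure table rendered as XML on request. My backend is Java, so take this Python sketch as pseudocode for the same idea:

    from xml.sax.saxutils import escape

    def render_sitemap(figures):
        # figures: iterable of (slug, lastmod_iso) rows streamed from the DB.
        urls = "".join(
            f"<url><loc>https://myfigurelist.com/figures/{escape(slug)}</loc>"
            f"<lastmod>{lastmod}</lastmod></url>"
            for slug, lastmod in figures
        )
        return (
            '<?xml version="1.0" encoding="UTF-8"?>'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
            + urls + "</urlset>"
        )

One detail worth remembering here: the sitemap protocol caps a single file at 50,000 URLs, so past that point you serve a sitemap index pointing at paginated sitemaps.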
Besides this automated process for adding new figures, users can explore and manage their own figure lists and even share reviews and images of a figure; this community content matters more than the initial AI-generated content.
In terms of tech stack, I use Java for the backend and Scrapy for the web crawler. SEO and performance are also very important, so the front-end is built with Next.js. Everything is deployed automatically using Docker Swarm and GitHub Actions.