Show HN: CoolQLCool – Turn Websites into GraphQL Accessible APIs

coolql.cool

124 points by gavino 7 years ago · 22 comments

lachenmayer 7 years ago

I've previously written a very similar project called "graphql-scraper" (which is arguably a far less cool name...). You can check it out at http://github.com/lachenmayer/graphql-scraper

It works very similarly, with only superficial differences under the hood (e.g. I used jsdom, and this uses cheerio). The `waitForSelector` feature is very cool!

You can see a live demo of the HN example using graphql-scraper at https://graphqlbin.com/v2/lxNohP

This example is deployed on Glitch - you can easily spin up your own using https://github.com/lachenmayer/graphql-scraper-server (with 1-click deploys to Heroku, Now & Glitch)

Of course (as mentioned already) there is also https://github.com/syrusakbary/gdom which uses Python+Graphene.

  • gavinoOP 7 years ago

    I remember seeing GDOM a while back when I first started this project, but forgot to write it down as a source of inspiration. I'm gonna add all of these as alternatives, because they're all great :D

    • syrusakbary 7 years ago

      So happy to read that :) (and so glad it's served as a source of inspiration for your project, keep up the good work!)

maio 7 years ago

Nice! There is also a similar project, GDOM (https://github.com/syrusakbary/gdom), written in Python.

bryanrasmussen 7 years ago

Are you planning to build anything on top of this, like a service or a company? I was thinking it would be a good way to build an API for some projects I've been thinking of working on, although I would probably want to switch out cheerio for https://github.com/intoli/remote-browser/

  • gavinoOP 7 years ago

    Nah, I don't really plan on turning it into a company. I'd gladly accept any PR to swap out cheerio, I haven't touched that part in close to a year :D

canadev 7 years ago

This is a tangent but they link to a serverless deployment service where you upload your code as a function and they execute it. Pretty interesting.

pdxandi 7 years ago

I've been looking for something like this! I'm trying to play around with it but can't seem to get the selector right. How do I grab a table `td` by its nth selector (tried `td:nth-of-type(n)` to no avail)?
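For reference, in standard CSS a literal `n` in `td:nth-of-type(n)` matches every cell, while a concrete index such as `td:nth-of-type(2)` matches the second `td` among its siblings. Whether CoolQLCool's selector handling supports this is not settled in the thread. The sketch below only illustrates the intended semantics, using Python's standard `html.parser`; `NthCellParser`, `nth_cells`, and `SAMPLE_TABLE` are hypothetical names made up for this illustration, and nested tables are not handled.

```python
from html.parser import HTMLParser
from typing import List


class NthCellParser(HTMLParser):
    """Collect the text of the n-th <td> (1-based) in every <tr>."""

    def __init__(self, n: int) -> None:
        super().__init__()
        self.n = n
        self.cell_index = 0      # position of the current <td> within its row
        self.capturing = False   # True while inside the n-th <td>
        self.cells: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.cell_index = 0  # each row restarts the sibling count
        elif tag == "td":
            self.cell_index += 1
            if self.cell_index == self.n:
                self.capturing = True
                self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.capturing = False

    def handle_data(self, data):
        if self.capturing:
            self.cells[-1] += data


def nth_cells(html: str, n: int) -> List[str]:
    """Return the stripped text of the n-th cell of each table row."""
    parser = NthCellParser(n)
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells]


# Made-up table standing in for a scraped page.
SAMPLE_TABLE = """
<table>
  <tr><td>1.</td><td>First story</td><td>120 points</td></tr>
  <tr><td>2.</td><td>Second story</td><td>45 points</td></tr>
</table>
"""

print(nth_cells(SAMPLE_TABLE, 2))  # ['First story', 'Second story']
```

Asking for an index past the end of a row simply yields an empty list, which mirrors a selector that matches nothing.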

VMG 7 years ago

Awesome name, awesome project!

conceptpad 7 years ago

Great project! I can imagine this may greatly improve certain classes of web scraping. @gavino I'm curious what tooling and architecture you used to put this together?

  • gavinoOP 7 years ago

    Sure! The backend is actually pretty straightforward: it's a NextJS app deployed on Now with a few added endpoints to handle the incoming GraphQL queries.

    Then, for actually turning the query into a digestible output, I used the GraphQL schema builder, which accepts HTML nodes from the requested page and grabs the right variables.
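The idea described above (requested fields in, matching HTML nodes out) can be sketched roughly as follows. This is not CoolQLCool's code, which is a NextJS app using a GraphQL schema builder and cheerio; it is a stdlib-only Python illustration under stated assumptions. The query shape (field name mapped to a tag and an optional class), `FieldExtractor`, `resolve`, and the sample markup are all hypothetical, and void elements such as `<img>` are not handled.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple

# Hypothetical query shape: field name -> (tag, class or None).
Query = Dict[str, Tuple[str, Optional[str]]]


class FieldExtractor(HTMLParser):
    """Collect the text of every element matching each requested field."""

    def __init__(self, query: Query) -> None:
        super().__init__()
        self.query = query
        self.result: Dict[str, List[str]] = {name: [] for name in query}
        self.open_captures: List[dict] = []  # elements currently being read

    def handle_starttag(self, tag, attrs):
        # Track nesting so <span><span></span></span> closes at the right tag.
        for capture in self.open_captures:
            if capture["tag"] == tag:
                capture["depth"] += 1
        classes = (dict(attrs).get("class") or "").split()
        for name, (want_tag, want_class) in self.query.items():
            if tag == want_tag and (want_class is None or want_class in classes):
                self.result[name].append("")
                self.open_captures.append({
                    "field": name,
                    "tag": tag,
                    "depth": 1,
                    "index": len(self.result[name]) - 1,
                })

    def handle_endtag(self, tag):
        for capture in self.open_captures:
            if capture["tag"] == tag:
                capture["depth"] -= 1
        self.open_captures = [c for c in self.open_captures if c["depth"] > 0]

    def handle_data(self, data):
        for capture in self.open_captures:
            self.result[capture["field"]][capture["index"]] += data


def resolve(html: str, query: Query) -> Dict[str, List[str]]:
    """Return a dict shaped like the query, filled with extracted text."""
    extractor = FieldExtractor(query)
    extractor.feed(html)
    extractor.close()
    return {name: [v.strip() for v in values]
            for name, values in extractor.result.items()}


# Made-up page fragment and query for illustration.
SAMPLE_PAGE = """
<ul>
  <li><a class="storylink" href="https://example.com/a">First story</a>
      <span class="score">120 points</span></li>
  <li><a class="storylink" href="https://example.com/b">Second story</a>
      <span class="score">45 points</span></li>
</ul>
"""
SAMPLE_QUERY: Query = {"titles": ("a", "storylink"), "scores": ("span", "score")}

print(resolve(SAMPLE_PAGE, SAMPLE_QUERY))
```

The response mirrors the shape of the request, which is the main appeal of putting a GraphQL-style interface in front of scraped HTML.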

simonhamp 7 years ago

Didn’t Yahoo do something like this many years ago, effectively a SQL for web pages?

nurettin 7 years ago

Not sure what to make of this. How does it handle throttling or captchas?

halfjew22 7 years ago

If this is a community reference I’m going to be very happy.

jarjar12 7 years ago

Sorry, a dumb question: what are the use cases? Thx

  • ralusek 7 years ago

    1.) You have a website with data you'd like to consume.

    2.) That website doesn't expose an API, but returns statically rendered HTML.

    3.) You don't like parsing statically rendered HTML for the data you're looking for, and you'd prefer getting the data using a GraphQL interface.

    • acct1771 7 years ago

      Page isn't loading in Materialistic (HN reader on Android/F-Droid repo), but: do you have this exact verbiage on it? It's very concise!

hokumguru 7 years ago

This is very cool.

powerslacker 7 years ago

Give this man an internet.
