I scraped 1.94M Airbnb photos for opium dens, pet cameos, and messy kitchens

burla-cloud.github.io

69 points by jmp1062 11 days ago · 43 comments

nickjantz 11 days ago

Am I missing something other commenters are seeing about this not being an ad? The domain is on Burla, which hosted the compute needed for this. There's a giant airbnb x burla logo at the top. People are saying there's a lawsuit pending, it's against guidelines, what's the point, etc.

It's content marketing, plain and simple, for Burla, aimed at people who view this site. It was highly likely done by employees at both Burla and Airbnb together as a joint project.

  • jperryjperry 11 days ago

    One of the Burla founders here. Not a joint project with Airbnb. I’ve been experimenting with giving agents access to Burla clusters and letting them run with analysis ideas I find interesting. This was one of the results.

    The branding is a bit much, fair call, but the intent here was just to explore what these agents can actually build when you give them access to large amounts of compute.

GrinningFool 11 days ago

I'm struggling a bit with how the 'funniest' ranked reviews are genuine descriptions of people's miserable (and sometimes unsafe) experiences. Where's the funny?

As an experitisement, I guess it gets the name out there but not in any way I'd want for my business.

  • jperryjperry 11 days ago

    Personally I find those experiences really funny, especially in my own life. Looking back, I think most people find humor in it. I could be wrong? I don't think so though.

    • GrinningFool 11 days ago

      Sure but it's not your life, right? This is other people's misfortunes, and these reviews weren't written to convey their entertainment at an old story.

devmor 11 days ago

The author makes some pretty insane leaps in logic for classification, and it’s apparent in the photos.

“Drug-Den vibes” apparently means the owner is poor or a photo is obscured or badly lit.

danhon 11 days ago

"Looking at every public Airbnb listing in Inside Airbnb's open data dump, all at once, on Burla"

This Inside Airbnb?

Community Guidelines

Please:

Only take the data you need. Do not scrape data from the site, if you would like to subscribe to the data directly, please email data@insideairbnb.com

  • yodon 11 days ago

    >Everything was parallelized on Burla, on a single dynamic cluster that scaled to ~1.7K CPU workers for photo download and CLIP, with 20 A100 GPUs running embedding clusters in parallel on the same cluster.

    That's a lot of budget - would have been nice if they'd made an actual donation to the project, instead of pounding the project's servers and bandwidth when there are much better ways to interact with the data.

    • jperryjperry 11 days ago

      Totally fair callout. I should’ve been more careful here and leaned on the provided datasets / bulk access instead of pulling things at scale. That’s on me.

      I’ll make a donation to support the project regardless. Appreciate you raising it.

      • danhon 11 days ago

        ... so you'd only end up making a donation if you ended up "stressing the project's infra more than expected"?!

wheelerwj 11 days ago

This thing is ripe for a lawsuit and has terrible methodology as far as I can tell.

  • smrtinsert 11 days ago

    On what grounds is there a lawsuit? Hasn't scraping been classified as legal?

    • happyopossum 11 days ago

      Calling someone’s apartment an opium den is potentially libel, and if it results in a material financial impact, you’ve got a lawsuit.

    • wheelerwj 11 days ago

      classifying people's businesses as an "opium den" using a shitty LLM prompt seems like a pretty good way to piss some people off.

      • bot403 11 days ago

        I don't necessarily agree with labeling them drug dens. But certainly the hosts showed zero or negative effort in keeping the room clean and suitable to rent. They do deserve some shaming.

dwroberts 11 days ago

“Drug den vibes” and they’re mostly just small rooms?

  • tart-lemonade 11 days ago

    I found one in Istanbul [0] (which now 404s) that somewhat fits the label and looks like it could have been a set on The Wire, but most of the "drug den" ones are just cramped, taken by someone who doesn't know how to take pictures and doesn't care to learn (blurry, bad lighting, noisy, poor staging), or both.

    Most of the bad TV placement ones are also boring because they're just over a fireplace. Technically correct, but not noteworthy. However, I did find one that was truly spectacular [1] (still live for now) and left me with more questions than answers.

    [0]: https://www.airbnb.com/rooms/988178752120341661 / https://archive.is/xnvC5

    [1]: https://www.airbnb.com/rooms/41725492 / https://archive.is/IyMvT

  • jperryjperry 11 days ago

    some are more psychedelic drug vibes and others are just insanely messy.

    I've had shitty and small apartments many times and that doesn't prevent me from cleaning them, especially if I'm going to rent them out.

  • guywithahat 11 days ago

    I feel like floor mattresses, trash, and peeling paint were also at play. They're all sort of unsafe rooms people wouldn't want to go to unless they felt like they had to (i.e. doing drugs)

  • nickthegreek 11 days ago

    Apparently if your resting place lacks a headboard, you abuse chemicals.

htrp 11 days ago

This seems like an advertisement for an open source package

>Scale Python across 1,000 CPUs or GPUs in 1 second. Burla is a high-performance parallel processing library with an extremely fast developer experience. Scale batch processing, vector embeddings, inference, or build pipelines with dynamic hardware.

Edit: Author comment was flagged dead. They work at Burla, which is a managed cloud service for parallelizing Python.

  • andai 11 days ago

    Looks like it was hit by some sort of automated ChatGPT detector.

xrd 11 days ago

Airbnb was actually started by two guys who created an opium den for Obama's convention so this doesn't surprise me.

gavmor 11 days ago

These are amazing! Some are probably offensive, because I saw a cozy, if kitschy, British den labeled as "did-someone-just-leave" vibes which... unfair.

xikrib 11 days ago

Ah yes, let's price the world out of the real estate market and then use insanely powerful AI models to systematically mock the living conditions of the poors.

guywithahat 11 days ago

This is pretty great, the reviews at the bottom are the best part. I'm impressed they were able to scrape so much data

NoLinkToMe 11 days ago

What a waste of energy (money/resources)... Scraping and AI-scanning 2 million photos to identify animals in the advertisement pictures? What's the point?

As an exercise a sample of 1000 photos would've been enough. As a database, knowing a listing has a cat in the picture or a funny review doesn't offer any real value.

I wonder what the footprint is of such an exercise.

  • jperryjperry 11 days ago

    The pet detection part isn’t the point, that’s just a visible output. The actual goal was to stress test agents + distributed compute on something non-trivial.

  • ericmcer 11 days ago

    I dunno there are literally 100s of millions (billions?) of people who spend more than an hour per day just scrolling through social media feeds.

    How much does it cost to send a billion people an hour of video every day? Almost all of the resources tech uses are for pointless or even negative things.

    What % of compute/bandwidth do you think is used for "real value"? I would guess it is well below 1%.

add-sub-mul-div 11 days ago

This vanity scraping is fucking up the internet for everyone else.

It's hardly the only thing, but it's part of the problem.

  • jperryjperry 11 days ago

    Fair feedback. Definitely more backlash than I expected. The intent was to experiment with large-scale analysis, not add noise or put strain on shared resources. I’ll be more thoughtful about this kind of thing going forward.

    • wiiwj 11 days ago

      Man you are so disconnected from reality. It’s actually insane. I’m convinced many like you don’t go outside.
