Why does searching Google for random hex lead to car dealers? [video]

tmp.tonybox.net

133 points by bonyt 2 years ago · 79 comments

CommieBobDole 2 years ago

Looking at this very briefly, the results seem to always be inventory pages for the dealerships, which use long strings of hex or just random numbers as identifiers for the vehicles they have for sale.

For example, a search for "ca7112b7167c15e621412c0fbc0a6c97" brings up the URL "https://www.premierclearancecenterofstbernard.com/inventory/...", which has a gallery of vehicles at the bottom whose image names are of the format "9b362510c100095f02cf3cad9e365ea6.jpg".

I assume something inside the Google black box is saying "well, there's no exact match but this site has a bunch of strings with most of the same characters, so here you go".

Edit: And to add to this, I'd surmise that the reason you see a lot of car dealerships in these results is that they sell a lot of one-offs - instead of having a list of SKUs in inventory, they sell a unique vehicle just once, so the inventory systems need to account for that by using long strings as item IDs and the like. Also there's probably a limited number of inventory systems out there, so a bunch of random dealerships are probably all using the same one.
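
A rough illustration of the "most of the same characters" guess, assuming a character-trigram Jaccard score. This is only a sketch of one possible fuzzy measure, not a description of how Google ranks pages, and `trigrams` and `jaccard` are hypothetical helper names:

```python
def trigrams(text: str) -> set[str]:
    """Return the set of 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the trigram sets of a and b (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


# Both strings are the ones quoted in the comment above.
query = "ca7112b7167c15e621412c0fbc0a6c97"
image_name = "9b362510c100095f02cf3cad9e365ea6"

print(jaccard(query, query))       # identical strings score 1.0
print(jaccard(query, image_name))  # two unrelated hex strings share few trigrams
```

Under this particular measure two unrelated hex strings barely overlap, so if something like the guess above is happening, the matching is probably looser than trigram overlap.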

  • cedws 2 years ago

    Back when Google search was good this query would have returned no results. As it should do. Now it desperately tries to dig up anything it can find just so the number of results is not zero. Somebody at Google wanted to increase the search 'hit rate' KPI and this is the result.

    • SquareWheel 2 years ago

      If you put quotes around the string (the "exact match" operator), the only results are this very thread. So it seems to be working as intended.

      Basically, you did a fuzzy search and got a fuzzy result. Usually that's what people want. Quotes will let you fine-tune results. Or if you want all results to be strict by default, use verbatim mode. I tested that with the above string and again, only this thread showed up.

      • underwater 2 years ago

        But it’s clearly not what people want. Ask any person if a search for a hex encoded ID should be a fuzzy match for a different ID and the answer will be no.

        As technical people, it’s easy to infer what’s happening under the hood and make excuses for the weirdness. But good product design is about having strong opinions about what should happen, and ignoring our biases around the limitations of the tech or the status quo.

        In an age where I can have an entire conversation with a computer or generate a video from text the world’s greatest search engine still doesn’t understand that you can’t fuzzy match an ID? It increasingly feels like Google search is stuck in the past.

        • fragmede 2 years ago

          Who is this "any person" who's searching for random hex, and how much do you think they care if Google shows them a car instead of whatever thing they're not even actually looking for?

          the idea that this mythical "any person" even cares about the difference between a useless car result and a page that says no results and then they just move on with their lives is projecting a lot of your own biases onto a hypothetical.

          • ipsum2 2 years ago

            Obviously no one would search a completely random hex, but it may represent an ID somewhere, and they want to find out more information about it. e.g. a SHA or MD5 hash.

            • fragmede 2 years ago

              Agreed, you can look up d41d8cd98f00b204e9800998ecf8427e for example. The question is how critical is it that Google returns a non-useful result vs a page that says there are no results. I think most people don't care, it's not useful either way.
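
              For context, that example string is the MD5 digest of empty input, which is why it turns up on so many pages. A quick check with Python's standard `hashlib`:

```python
import hashlib

# MD5 digest of zero bytes of input, as a lowercase hex string.
empty_md5 = hashlib.md5(b"").hexdigest()
print(empty_md5)
```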

          • behringer 2 years ago

            I search id's all the time. Google is becoming more and more worthless.

        • semi 2 years ago

          > But it’s clearly not what people want. Ask any person if a search for a hex encoded ID should be a fuzzy match for a different ID and the answer will be no.

          in a search field explicitly for hex encoded IDs it shouldn't be

          In a generic web search that has to guess if my term was a hex encoded id ('cafe' is but almost certainly isn't intended as one..).. it's less obvious.

          in the case of a clear hex encoded id of sufficient length, i would like to know there are zero exact results, but as long as it's still fast I would love some fuzzy matches after in case there was a typo in my term or in the indexed document.

        • Dalewyn 2 years ago

          >the world’s greatest search engine still doesn’t understand that you can’t fuzzy match an ID?

          No, not without telling it to run an Exact Match search by enclosing the string in quotes.

      • plorg 2 years ago

        Meanwhile if I search for a specific Bosch solenoid part number there's a 50% chance that any one result will point to some different part number that contains 90% of the digits - even though the specific part number actually exists!

        • gravescale 2 years ago

          Same for electronic part numbers. Search engines will just go "eh, pretty close" and mix in results for, say, TPS562201 with those for TPS56221.

      • xp84 2 years ago

        I get that that's the default now, but can't help but hate it. When you search for like `dog house` to have a bunch of results for just house (marked "Missing: ~dog~" ) it's so dumb. Why would I have typed dog unless that was important to me??

      • OwenFM 2 years ago

        This sets things up for all sorts of problems when people don't notice that the IDs aren't exactly the same.

        At the moment, Google happens to be choosing car dealers as a fallback, but what if it instead fell back to a page "transaction a67cedf has been confirmed"?

    • noqc 2 years ago

      Garbage in garbage out is fine here, no? I hate google quite as much as the next person here, but this seems like a non-issue. If I type in a random string, it should be assumed that I'm searching for something.

      • ryanianian 2 years ago

        Sometimes you really do want exactly that "random" string. This is common with error messages, model numbers, build hashes, etc. If I'm searching for B9GDSIGH as the model number for my refrigerator, I really don't want to see B9GDSIGY.

        • kimixa 2 years ago

          But if it links to the B9GDSIG series refrigerator, which has the 240v H and 120v Y subtypes, then it would be correct in suggesting that?

          Same with error messages - they often have timestamps, or local object IDs/memory addresses, which you also want to be fuzzy-matched.

          I think the issue is the de-emphasis of "power" modifiers for google - it's less obvious how to say "This part of the string needs exact match, this can be fuzzy"

        • dylan604 2 years ago

          In that case, click the "must contain" link and it resubmits with the query wrapped in quotes. Or, just quote the query yourself on the first go if you know it must match

          • ryanianian 2 years ago

            Google no longer respects quotes (and hasn't for a while). It's very hard to get Google to actually say there aren't any results, even when in fact there are no matching results.

            • dylan604 2 years ago

              They respect it when they submit it then, as every time I've used that function to see them update the query with quotes it comes back with different results. I've never cared to look at the search query in the URL, so maybe they also add an additional parameter that tells the back end specifically to obey the quotes on this resubmitting???? So at some point, the quotes aren't ignored

            • fragmede 2 years ago

              that's not my experience.

              https://www.google.com/search?q=%22kgirbudidndijrjjr%22 gives me "Your search - "kgirbudidndijrjjr" - did not match any documents.", at least it will until they index this comment and find kgirbudidndijrjjr
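
              The `%22` in that URL is just the percent-encoded double quote. A minimal sketch of building such an exact-match URL with the standard library, where `exact_match_url` is a hypothetical helper name:

```python
from urllib.parse import quote_plus


def exact_match_url(term: str) -> str:
    """Build a Google search URL with the term wrapped in double quotes."""
    return "https://www.google.com/search?q=" + quote_plus(f'"{term}"')


print(exact_match_url("kgirbudidndijrjjr"))
```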

          • thfuran 2 years ago

            Quotes are more like guidelines these days.

          • aldousd666 2 years ago

            on the advanced search, there's still the option to specify that it 'must contain' something, but I'm not sure if it's just a suggestion like quotes or not.

            • dylan604 2 years ago

              I "love" how we've reached a point where we so distrust this company specifically, and dark pattern UIs in general, that we almost anticipate placebo-like buttons.

      • bravetraveler 2 years ago

        One man's trash is another man's treasure. Search is ambiguous enough by nature IMO. No liberty zone!

        Agree with the peer - specificity matters. Model numbers are a good example. I feel like I've developed a weak form of dyslexia because I can't trust Google like I once did.

        Things I want fuzzy searches for... will be presented fuzzy. Not as an opaque string of usually-quoted characters, but wrapped in keywords

        A reply makes a good point - double quotes don't seem as effective any more.

    • Galatians4_16 2 years ago

      I miss when Google had thousands of results, and you could browse past page 5. Now it just lies to you.

    • duxup 2 years ago

      Is there any way we can somehow find out that is true?

      I could have sworn google always was happy to return some odd url matches, typically when the given results weren't great.

      • rurp 2 years ago

        I remember when Googlewhacks[0] used to be a thing. Zero result search queries weren't interesting enough because they were too easy to find.

        [0]https://en.m.wikipedia.org/wiki/Googlewhack

      • dylan604 2 years ago

        I've seen it come back with something along the lines of "it looks like there's not a lot of matches" with some useless cartoon graphic.

        I see this a lot when searching for phone numbers. I've also seen the opposite like the forced "find something no matter how terrible of a match to avoid no results" as being described. You search for a number and no exact matches, but it returns things with different area codes same prefix different numbers. Or same area code, different prefix, same numbers. Or some such randomness that I can't even venture a guess as to why it thought the not one number matches would be interesting to me. Unless you're brave, I'd suggest not searching for random phone numbers with Safe Search off as you'll find some very interesting pages displayed that have absolutely nothing to do with the number being searched.

      • beardyw 2 years ago

        There was at one time a kind of game where you tried to find a search term that would return only say 3 results. It was hard, but some did get found.

        Having said that I have recently had some kind of "nothing found" result on several occasions. So it still happens.

        --edit--

        In fact I just tried "ca7112b7167c15e621412c0fbc0a6c9" (omitting the last digit to avoid HN) and got:

        Your search - "ca7112b7167c15e621412c0fbc0a6c9" - did not match any documents.

        Suggestions:

        Make sure that all words are spelled correctly. Try different keywords. Try more general keywords.

      • cedws 2 years ago

        Unless you have a time machine there's only anecdotal evidence, but there's plenty of it on HN. Seen many comments here reporting the same thing.

      • Gormo 2 years ago

        Just do an image search for "google search returned no results screenshot". Plenty of examples.

    • shadowgovt 2 years ago

      I can't tell you the number of times I've searched for random serial numbers and gotten the exact product I seek. I'm glad Google indexes this random crap.

    • 1vuio0pswjnm7 2 years ago

      An experiment would be to create high quality, non-commercial websites with pages containing these hex strings and see if the pages appear in Google SERPs.

      The fact that Google returns car dealerships when the user is searching for hex strings is telling.

    • refulgentis 2 years ago

      That doesn't sound right to me: Google used to suppress results with string matches?

      Why?

      If so, would that be a good thing?

      Why shouldn't I be able to find the vehicle via its ID?

  • libria 2 years ago

    > no exact match but this site has a bunch of strings with most of the same characters

    I suspect it's something similar, but more like partial string match which may score as "close enough to display". I get consistent results with the same hex string - dealerships - but if I quote it (exact match), I get no matches.

  • alanh 2 years ago

    I DO NOT BUY IT. Plenty of sites use unique identifiers and other random hex strings all over, e.g., fingerprinted assets. If your explanation were accurate I would expect more kinds of sites to show up

  • shadowgovt 2 years ago

    Additionally, the user is doing the search in a non-Incognito session, so the system will bias based on assumption of user preferences. "Hm, I see this random hex identifier in three pages... Oh, but this user likes cars. Let's give 'em the car result first."

  • wlesieutre 2 years ago

    > Edit: And to add to this, I'd surmise that the reason you see a lot of car dealerships in these results is that they sell a lot of one-offs - instead of having a list of SKUs in inventory, they sell a unique vehicle just once, so the inventory systems need to account for that by using long strings as item IDs and the like.

    If only there were some sort of standardized identification number for vehicles

  • confused_boner 2 years ago

    Bing search results for that are interesting

libria 2 years ago

Repro'd in an incognito window so it's not a history thing. 1st 3 of OPs strings if anyone else is experimenting (remove spaces):

    3344cfb4 78ead204a49b88 1da6079adf8a
    e2c75c64 eef8087f6f36df 57
    eb944335 73626fe9b73550 b02a651620d8
--

Shoot, depending on crawling, this may end up causing this page to match. I'm injecting spaces above to deter this, but maybe it'll also prove out the partial string match theory...
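
For anyone experimenting, a small sketch that strips the injected spaces to recover the strings listed above:

```python
spaced = [
    "3344cfb4 78ead204a49b88 1da6079adf8a",
    "e2c75c64 eef8087f6f36df 57",
    "eb944335 73626fe9b73550 b02a651620d8",
]

# Remove the injected spaces to recover the original search strings.
queries = [s.replace(" ", "") for s in spaced]
for q in queries:
    print(q, len(q))
```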

  • 1970-01-01 2 years ago

    I'm only getting back 2 results: Citi.com and FDIC.gov

    Clicking on the 3 dots gives me this info:

         Your search & this result
         This result seems relevant even though this search term may not appear: 
         3344cfb478ead204a49b881da6079adf8a

dtagames 2 years ago

Most likely some part of the string matches the VIN number. Dealers are legally required to post the VIN of an actual vehicle in any advertisements that have a price, as a way of preventing bait-and-switch.

  • OptionOfT 2 years ago

    Funny, in Europe that's absolutely not the case.

    I watched some government sale and they posted a PDF of vehicles for sale that were forfeited.

    The VINs were there but parts of them were blacked out.

    It was a PDF. I copy-pasted the text behind the black box and got the full VIN.

    • londons_explore 2 years ago

      In Europe VINs of cars are treated a little like SSNs are treated in the US. Some governments assume that just because you know the VIN of a vehicle, you must be its owner, despite many vehicles having the VIN written on every bit of glass and visible without even unlocking the car...

      • JohnFen 2 years ago

        In the US, the VIN must be visible from the outside of the vehicle, through the windshield on the driver's side. Covering the VIN from view is illegal.

    • dylan604 2 years ago

      > It was a PDF. I copy-pasted the text behind the black box and got the full VIN.

      You're such a hacker. As the world turns now, I'd expect some legislation that says if you copy the text from a badly created PDF, then you are the one to blame and not the one that made the bad document. You're clearly circumventing the intent. You you...criminal.

  • dawnerd 2 years ago

    And yet they still bait and switch. Most recently-ish with added markups not in their online price.

    • cratermoon 2 years ago

      Or just claiming the vehicle is currently unavailable or not yet for sale because it's in the shop/in use as a loaner/the manager has a hold on it or some BS, but here's a very similar vehicle that we'd love to unload on you!

      It's very technically legal because they do have the vehicle in their inventory, and you can test drive and buy it, but just not right then.

qnleigh 2 years ago

Good guesses in the comments so far: VIN number partial matches and targeted search. Anyone going to test what's correct?

Ideas: 1. VIN numbers are 17 characters and don't contain I, O or Q, to prevent confusion with other characters. If you throw in lots of these, always spaced by less than 17 characters, do you get fewer hits?

2. Does a VPN and/or private browsing affect the results?

A third possibility is that Google has a cheaper ad category for search queries that they can't categorize. This doesn't explain the diversity of dealerships though.
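
Idea 1 can be sketched as follows, assuming "VIN-shaped" means any run of 17 characters drawn from digits and letters other than I, O and Q. `has_vin_shaped_window` and `break_vin_windows` are hypothetical helper names, and this says nothing about whether Google actually does such matching:

```python
import re

# 17 consecutive characters from digits and letters other than I, O and Q.
VIN_SHAPED = re.compile(r"[0-9A-HJ-NPR-Z]{17}")


def has_vin_shaped_window(text: str) -> bool:
    """True if any 17-character run in text uses only VIN-legal characters."""
    return VIN_SHAPED.search(text.upper()) is not None


def break_vin_windows(text: str, every: int = 16, filler: str = "q") -> str:
    """Insert filler after every `every` characters so no legal run reaches 17."""
    chunks = [text[i:i + every] for i in range(0, len(text), every)]
    return filler.join(chunks)


plain = "3344cfb478ead204a49b881da6079adf8a"
print(has_vin_shaped_window(plain))                     # True
print(has_vin_shaped_window(break_vin_windows(plain)))  # False
```

Plain hex only uses 0-9 and a-f, so any hex string of 17 or more characters is VIN-shaped by this definition. Searching for the broken-up variant and comparing the hit counts would be one way to run the experiment.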

  • joe_the_user 2 years ago

    Sure, it's matching VINs. But in the vast expanse of the net, surely there are many strings of random hex out there. Why this source of random digits?

  • chuckadams 2 years ago

    Mercedes uses 18-digit VINs, tho I believe it’s the same format and checksum algorithm for the first 17 digits so it’s really more “17 + 1 digits”. Still drives validators nuts tho.
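
    For reference, a sketch of the check-digit calculation commonly used for 17-character North American VINs, where position 9 holds the check character. Whether it carries over to the 18-character case described above is not verified here, and `vin_check_digit` and `is_valid_vin` are hypothetical helper names:

```python
# Letter-to-number transliteration; I, O and Q never appear in a VIN.
TRANSLIT = dict(zip(
    "ABCDEFGHJKLMNPRSTUVWXYZ",
    [1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 7, 9, 2, 3, 4, 5, 6, 7, 8, 9],
))
# Per-position weights; position 9 (the check character) has weight 0.
WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]


def vin_check_digit(vin: str) -> str:
    """Compute the check character for a 17-character VIN."""
    vin = vin.upper()
    if len(vin) != 17:
        raise ValueError("expected 17 characters")
    total = 0
    for ch, weight in zip(vin, WEIGHTS):
        value = int(ch) if ch.isdigit() else TRANSLIT[ch]
        total += value * weight
    remainder = total % 11
    return "X" if remainder == 10 else str(remainder)


def is_valid_vin(vin: str) -> bool:
    """True if position 9 matches the computed check character."""
    return vin_check_digit(vin) == vin.upper()[8]


print(is_valid_vin("11111111111111111"))  # True
```

    A validator written like this rejects anything that is not exactly 17 characters, which matches the complaint above about longer identifiers.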

ww520 2 years ago

The word embeddings computed from the hex values and the car dealership's inventory ID's probably have close similarity in Google's vector db.

  • rerdavies 2 years ago

    I like that theory, but with one slight modification.

    There's a single word embedding for DARNED_IF_I_KNOW, and, statistically, automobile listings outnumber other pages with the DARNED_IF_I_KNOW token.

lambdaxyzw 2 years ago

Weird premise. I search for random hex literally all the time (checking hashes and guessing algorithms as a part of my reverse engineering work) and I don't remember car dealers coming up especially often. I suspect it's just the author who - because of their location or the previous search history - gets more targeted car dealership ads.

joe_the_user 2 years ago

I think this is notable just because it's a result of Google now having every single search result set be trying to sell you something. That's different from simply having targeted ads and rather disturbing.

  • cratermoon 2 years ago

    Google is now a glorified Yellow Pages, assuming that every search is a search for a business.

omoikane 2 years ago

I see that digits is between 10 and 19:

    DIGITS=$((10 + $RANDOM % 10))

If it was always an even number, I would have expected some checksum files to be matched (16 for md5sum, 20 for sha1sum, etc).

cratermoon 2 years ago

I'm going to guess that google makes more money from car dealer ads than it does for programmers searching for hex codes. Also probably just because Google's search is more and more giving irrelevant results.

jimbobthrowawy 2 years ago

I tend to get variations on cryptocurrency block explorer websites mostly.

It's annoying when I want to search for a btih or something exact.

Terr_ 2 years ago

Weird... Maybe Google thinks that the closest inexact match is a VIN number?

worik 2 years ago

Search bubble?

hi-v-rocknroll 2 years ago

Perhaps Google trolls anyone in security or torrenting, and would instead prefer to show CPM/CPC ads to charge instead of nothing because money. /s
