Techniques which allow the sharing of data whilst keeping it secure

85 points by jinnko 3 years ago · 33 comments


nl 3 years ago

There are other techniques that aren't generally included in the "Zero Knowledge Proofs" set of techniques that are perhaps more practical for general development.

For example, I find private set intersection[1] as implemented by OpenMined a really useful primitive a bunch of privacy enhancing applications can be built on top of.

My colleagues and I recently published a pre-print[2] showing how to use this to find the locations you and another person have in common, without either of you being able to see the other's remaining locations. The paper talks about a social network built around this, but I also think there are useful applications in things like real-world games (scavenger hunts, etc.).

[1] https://github.com/OpenMined/PSI/blob/master/private_set_int...

[2] https://arxiv.org/abs/2210.01927
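To make the primitive concrete, here is a minimal sketch of one classic way to build private set intersection: both parties hash their items into a group and blind them with secret exponents, so equal items collide only after both secrets have been applied. This is an illustration under stated assumptions, not the OpenMined API. The modulus `P` and the helpers `hash_to_group`, `blind`, and `intersect` are hypothetical names, and the parameters are toy-sized.

```python
import hashlib
import secrets

# Toy parameters, NOT secure: a real deployment would use a vetted
# elliptic-curve group and a reviewed library.
P = 2**127 - 1  # a Mersenne prime used as the modulus


def hash_to_group(item: str) -> int:
    """Map an item to a square modulo P (toy hash-to-group)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return pow(int.from_bytes(digest, "big") % P, 2, P)


def blind(values, secret):
    """Raise every value to a party's secret exponent."""
    return [pow(v, secret, P) for v in values]


def intersect(alice_items, bob_items):
    """Simulate both parties in one process; returns Alice's view of the overlap."""
    a = secrets.randbelow(P - 3) + 2  # Alice's secret exponent
    b = secrets.randbelow(P - 3) + 2  # Bob's secret exponent

    # Alice -> Bob: H(x)^a for each of her items, order preserved.
    alice_once = blind([hash_to_group(x) for x in alice_items], a)
    # Bob -> Alice: her values raised to b, plus his own items as H(y)^b.
    alice_twice = blind(alice_once, b)
    bob_once = blind([hash_to_group(y) for y in bob_items], b)
    # Alice raises Bob's values to a. Since (H^b)^a == (H^a)^b, equal items collide.
    bob_twice = set(blind(bob_once, a))
    return {x for x, v in zip(alice_items, alice_twice) if v in bob_twice}


shared = intersect(["cafe", "park", "library"], ["park", "museum", "library"])
```

Alice learns which of her own items Bob also holds, while Bob's other items only ever reach her in blinded form.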

sjducb 3 years ago

If you're trying to use brute force to break this system it's important to realise that the search space is not all possible hashes. The search space is all likely names. This makes it trivial to break, especially for the CIA case where there are really only a few hundred likely adversaries.

The encryption/hashing doesn't really add anything beyond empty marketing. The trusted party who people report to could easily work out all of the names if they wanted to.
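A rough sketch of the attack described here, assuming the stored values are unsalted, deterministic SHA-256 hashes of normalized names. That is an assumption for illustration, not a statement about how any real service stores data. The names and the helpers `name_hash` and `dictionary_attack` are made up.

```python
import hashlib


def name_hash(name: str) -> str:
    """Deterministic hash of a normalized name (assumed storage format)."""
    normalized = " ".join(name.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dictionary_attack(stored_hashes, candidate_names):
    """Hash every likely name once, then look the stored hashes up."""
    lookup = {name_hash(n): n for n in candidate_names}
    return {h: lookup[h] for h in stored_hashes if h in lookup}


# What the trusted party holds: hashes only.
stored = [name_hash("Alex Example"), name_hash("Sam Placeholder")]

# The search space is the list of likely names, not all possible hashes.
candidates = ["Jo Bloggs", "Alex Example", "Pat Sample", "Sam Placeholder"]
recovered = dictionary_attack(stored, candidates)
```

The cost is one hash per candidate name, which is why a small pool of likely names makes the hashes reversible in practice.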

franknstein 3 years ago

This is a common misconception. The truth is that in HE every plaintext can be encrypted to (exponentially, iirc) many different ciphertexts. During encryption one of those is chosen randomly. This makes dictionary attacks practically impossible.
Edit: The HE scheme (LWE) works on individual bits, meaning there are only two plaintexts (0 and 1). Each has exponentially many ciphertexts, of which one is chosen at random. They also share the ciphertext space, meaning any given ciphertext could be either an encrypted zero or an encrypted one.
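The randomized-encryption point can be illustrated with a toy symmetric LWE-style bit encryption. This is a sketch with deliberately tiny, insecure parameters; `N`, `Q`, `NOISE`, and the function names are illustrative choices, not any particular library's scheme.

```python
import secrets

N = 16      # secret dimension (toy size)
Q = 3329    # modulus
NOISE = 2   # errors are drawn from [-NOISE, NOISE]


def keygen():
    return [secrets.randbelow(Q) for _ in range(N)]


def encrypt(secret, bit):
    """Fresh random vector and noise on every call, so ciphertexts differ."""
    a = [secrets.randbelow(Q) for _ in range(N)]
    e = secrets.randbelow(2 * NOISE + 1) - NOISE
    b = (sum(x * s for x, s in zip(a, secret)) + e + bit * (Q // 2)) % Q
    return a, b


def decrypt(secret, ciphertext):
    a, b = ciphertext
    noisy = (b - sum(x * s for x, s in zip(a, secret))) % Q
    # Near 0 (mod Q) decodes to 0; near Q/2 decodes to 1.
    return 1 if Q // 4 < noisy < 3 * Q // 4 else 0


def add(c1, c2):
    """Adding ciphertexts component-wise XORs the underlying bits."""
    (a1, b1), (a2, b2) = c1, c2
    return [(x + y) % Q for x, y in zip(a1, a2)], (b1 + b2) % Q


key = keygen()
first, second = encrypt(key, 1), encrypt(key, 1)
```

Here `first` and `second` both decrypt to 1 but are different ciphertexts, so an attacker cannot precompute a table of "the" encryption of a given value. The `add` function shows the homomorphic part: the sum decrypts to the XOR of the two bits as long as the accumulated noise stays small.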
- sjducb 3 years ago
  
  Maybe I'm missing something, but surely a dictionary based attack will work because you have to be able to know that your key has already been submitted by another user. That's the point of the application.
  1) Initial report is filed.
  2) Second report is filed by a user who only knows the attacker's details.
  3) Match is found
  Therefore you can just keep iterating through names till you get a match.
  Another way of saying it is that the application won't work if a second user can't tell that the first user has entered an attacker's name.
  The vulnerability is in the application specification, not HE.
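The three steps above can be sketched as a probing attack against the match signal itself. This assumes a hypothetical service whose only observable output is "match" or "no match"; `MatchingService` and `probe` are invented for illustration and say nothing about how any real system rate-limits or vets reports.

```python
class MatchingService:
    """Hypothetical escrow that reveals only whether a name was already filed."""

    def __init__(self):
        self._filed = set()  # could be encrypted at rest; the answer still leaks

    def report(self, name: str) -> bool:
        key = name.strip().lower()
        matched = key in self._filed
        self._filed.add(key)
        return matched


def probe(service, candidates):
    """File one report per distinct candidate and keep those that match."""
    return [name for name in candidates if service.report(name)]


service = MatchingService()
first_report_matched = service.report("Alex Example")  # step 1: initial report

# Steps 2 and 3: an attacker who knows nothing iterates over likely names.
leaked = probe(service, ["Jo Bloggs", "Alex Example", "Pat Sample"])
```

However the stored value is protected, the boolean the application must return is enough to confirm a guess, which is why the weakness sits in the specification rather than in the encryption.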
mattdesl 3 years ago

Depends on how the database and application were set up. In a properly formed ZK application you cannot obtain any reasonable information by brute force hashing names.
Edit: I realize you’re probably talking about Callisto, which does seem like simple hashing of names, but I wanted to note that this is not always the case in apps using ZK proofs and FHE, which the article touches on a bit later.

ericalexander0 3 years ago

What's the limiting factor of homomorphic encryption? Is it that it's provable? Is it the compute overhead? Is it too magical for governance?

lucgommans 3 years ago

Right now? For broad applicability it is the computational overhead. There are some applications that are doable now, such as reading data anonymously from a small-ish dataset (or a low-volume service where you can afford to wait a minute for an answer, or using really expensive hosting).
An example of that was a Wikipedia server that someone made which would serve you pages without the server knowing which page your client was actually after (https://news.ycombinator.com/item?id=31668814 4 months ago, 119 comments). It's still not really efficient; you can't simply swap out the real Wikipedia for this system and expect it to simply work.
> the server [needs] to scan through the entire encrypted dataset [for every request] (this is unavoidable, otherwise its I/O patterns would leak information)
Imagine Wikipedia's servers needing to read every byte written on Wikipedia and operate on it before being able to formulate an answer to a random pageload. Additionally, if I remember correctly, things like autocomplete worked by just downloading the entire list of articles and doing that locally. None of it is impossible, but it's not a drop-in solution.
And then when you have a situation where you can practically apply it, there aren't popular/trusted/already-audited software packages out there for you to just use with confidence.
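A small sketch of why the full scan quoted above is unavoidable, using a toy Paillier-style additively homomorphic scheme rather than the lattice-based scheme that demo used. The primes, the helper names, and the integer "pages" are all assumptions for illustration, and the key size is far too small to be secure.

```python
import math
import secrets

# Toy key built from two known Mersenne primes; insecure, but readable.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)
PHI_INV = pow(PHI, -1, N)  # modular inverse, needs Python 3.8+


def encrypt(m: int) -> int:
    """Randomized encryption: c = (1 + N)^m * r^N mod N^2."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    return (pow(c, PHI, N2) - 1) // N * PHI_INV % N


def client_query(index: int, size: int):
    """Encrypted one-hot vector: E(1) at the wanted row, E(0) elsewhere."""
    return [encrypt(1 if i == index else 0) for i in range(size)]


def server_answer(database, query):
    """The server touches EVERY record; skipping one would reveal it was not wanted."""
    answer = 1
    for record, selector in zip(database, query):
        answer = answer * pow(selector, record, N2) % N2
    return answer


pages = [101, 202, 303, 404]          # stand-ins for page contents
query = client_query(2, len(pages))   # client wants row 2
fetched = decrypt(server_answer(pages, query))
```

The server's work grows with the size of the whole dataset for every single request, which is the overhead described in this comment.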

xhkkffbf 3 years ago

Privacy enhancing technologies are neat, but they're not exactly new. I've been following them since the 90s. There was a book called _Translucent Databases_ and dozens of good papers back then. Since then, there are entire conferences.

It's good that the Guardian is covering this, but it's not exactly new.

nl 3 years ago

Translucent Databases[1] is not the same thing at all. The approach they use requires the person doing the querying to have the same encryption key that the data is encrypted with (e.g., they recommend hashing data and then querying with the hashed data[2]).
This is a great technique that improves security.
But it isn't the same as zero-knowledge techniques, which are comparatively new. The maths for them was first developed in the mid-to-late 1980s, but zkSNARK (which made it useful in computer science) wasn't developed until 2012[3].
[1] https://www.wayner.org/node/15
[2] https://www.researchgate.net/publication/301174908_Transluce...
[3] https://dl.acm.org/doi/10.1145/2090236.2090263
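A minimal sketch of the hash-then-query idea described above, assuming a keyed hash (HMAC) whose key is shared by writers and readers. The key, the in-memory `table`, and the helper names are hypothetical; the book's own examples may differ.

```python
import hashlib
import hmac

# Hypothetical shared key: anyone who queries must hold the same key as the writer.
SHARED_KEY = b"hypothetical key known to writers and readers"


def blind_index(value: str) -> str:
    """Keyed one-way transform of an identifier."""
    message = value.lower().encode("utf-8")
    return hmac.new(SHARED_KEY, message, hashlib.sha256).hexdigest()


table = {}  # blinded identifier -> record; the raw identifier is never stored


def insert(identifier: str, record: dict) -> None:
    table[blind_index(identifier)] = record


def lookup(identifier: str):
    """Query by hashing the search term the same way the data was hashed."""
    return table.get(blind_index(identifier))


insert("alice@example.com", {"balance": 10})
found = lookup("Alice@Example.com")
missing = lookup("bob@example.com")
```

Someone who copies `table` without the key sees only opaque digests, but anyone holding the key can test guesses, which is the difference from zero-knowledge techniques drawn in this comment.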
- lupire 3 years ago
  
  Translucent Databases covers ZKP also.
  https://www.wayner.org/node/39
  Of course implementation takes work and especially so on modern large databases.
  - nl 3 years ago
    
    Nice!
mattdesl 3 years ago

Fast, general purpose, succinct, and non-interactive proofs are pretty new; see SNARK (2013), STARK (2018) and Halo2 (2020). Even more interesting are zkVMs like RISC Zero. Practical and open source applications of FHE feel even more nascent than ZK.
woojoo666 3 years ago

It's not new, but it is undergoing a bit of a revolution because of growing interest in privacy and decentralization. Monero didn't exist in the 90s.

cortesoft 3 years ago

> “Maybe one person doesn’t have a case, but two people do.”

Except they don’t have any way to contact each other, or for anyone else (like the police) to contact them… so how exactly are they going to have a case?

nl 3 years ago

> Callisto employees have no access to the name of the perpetrator within Callisto Vault. When two or more survivors have entered the same unique identifier of the same perpetrator and a "match" occurs, each survivor is contacted by a Legal Options Counselor (LOC). The LOC is an attorney and all information discussed with survivors is protected by attorney client privilege. The LOC will discuss all legal options with survivors to find the right coordinated action to take. The only way a perpetrator will find out they have been entered into Callisto Vault is if a survivor tells them or moves forward with legal action against the perpetrator and Callisto Vault is disclosed during legal proceedings.
https://www.projectcallisto.org/
Thorrez 3 years ago

The victims can contact their own assigned lawyers. The article doesn't explicitly say this, but I think the 2 lawyers can contact each other.

nopenopenopeno 3 years ago

>they can discover if their abuser is a repeat offender without identifying themselves to the authorities

Nonsense. An anonymous accusation is all but meaningless, and is in no way similar to a conviction. This is some truly garbage journalism.

chrisweekly 3 years ago

Agreed.
Also not to pile it on -- it's a complete tangent, really -- but every time I read the word "whilst" I cringe and wonder why the author didn't write "while". IMHO it adds no additional information or nuance, it's an archaic word that's long since departed from spoken use, and its presence in a sentence serves only to signal a failed attempt to sound sophisticated. Maybe it's just me, but for some reason it always triggers this reaction of "oh get off it, stop being pompous". /rant /tangent
- Nursie 3 years ago
  
  It’s still in common use in the UK, and the author is British.
- youngNed 3 years ago
  
  I can only imagine it's akin to the reaction a British person has when they see an American pluralise Lego.
  Whilst is a perfectly cromulent word in British English.
nl 3 years ago

> Nonsense. An anonymous accusation is all but meaningless, and is in no way similar to a conviction. This is some truly garbage journalism.
Neither the article nor Project Callisto claims it is anything like a conviction. The article itself points out that even when multiple people accuse the same person, the lawyers who are contacted do not (and cannot) get access to the accused person's name via the system.
- nopenopenopeno 3 years ago
  
  The article’s claim is simply nonsense. As I quoted, the article literally says “they can discover if their abuser is a repeat offender” but in fact they can only discover if any anonymous accusations have been registered.
  - nl 3 years ago
    
    Seems like splitting hairs. If the person using it is registering their assault and the perpetrator is already in the database, then the person entering that name will be notified.
    The article doesn't need to make the presumption of innocence - it's not a trial, it's showing how this is helpful to a victim of assault. There's no "two sides" argument here.
    If you check out the Project Callisto website, they outline the protections there are for the accused, namely that they can never be named to the lawyers who get involved by the project. That has to be done by the individuals making the accusation to the lawyers.
    It is - as the article says - a way of letting the victims know they are not alone.
    
    nopenopenopeno 3 years ago
    
    >The article doesn't need to make the presumption of innocence
    Nor does the article have any business making the presumption of guilt, but it does exactly that.
    I am criticizing what the article actually does, not expecting it to do something more.
    All it takes is a person registering a name, then receiving notification that others have registered the same name, then conflating that with evidence of guilt in the same exact manner the article does.
    If the article didn’t set such a good example of how to ruin an innocent person’s life, I wouldn’t be concerned.
    
    nl 3 years ago
    
    But the person entering the details already thinks the person they are entering is guilty, so it's unclear why getting notified would be "conflating that with evidence of guilt in the same exact manner the article does".
    > If the article didn’t set such a good example of how to ruin an innocent person’s life
    But it doesn't? The accused person is never named publicly - a lawyer is involved to help decide on next steps.
    It seems you are concerned about fraudulent accusations. But this doesn't seem to do anything to make fraudulent accusations easier in any way because the public accusation is out-of-scope of what it does.
  - lupire 3 years ago
    
    If a person is being accused, they offended the accuser in some way.
    
    Eleison23 3 years ago
    
    A zero-knowledge database like that is guaranteed to have no method of oversight and likewise to be immediately gamed and trolled by anyone with a grudge about anyone.
    
    nl 3 years ago
    
    It's unclear how this would work, given the thoughtful way the project has implemented it.
    
    nopenopenopeno 3 years ago
    
    It’s very simple. A person registers a name, then receives notification that others have registered the same name, then conflates that with evidence of guilt in the same exact manner the article does.
    
    Eleison23 3 years ago
    
    All you need is someone's social media handle to silently accuse them of some sexual offense without need for proof or evidence. So a few dozen women could coordinate an attack on Elon Musk or basically anyone they want who they know on social media.
    
    nl 3 years ago
    
    But if they choose, they can do this without this tool, and this does nothing to make that easier or to amplify them.
    All it does is get a lawyer to contact them - which is something they could do already if they are making something up.
