Attested Audio with ZKP
ddkang.github.io
I don't understand how this avoids a simple attack, for either audio or video:
- Alter/manipulate the audio/video as much as you want.
- Play it.
- Have an "attested" microphone/camera record the manipulated audio/video.
For video, maybe you can try to prove that it's recording a screen and not the real event, but that seems much harder for audio: you can just say the echoes/distortions came from the environment...
It doesn't. It's called the "analog hole" and has caused much head shaking with regards to the media cartels' attempts at digital restrictions management on the output side.
Trying to secure that regime on the input side would seem to be even more fraught with problems. At least using the analog hole for the output side causes quality degradation from reencoding the content (the most common goal is digital redistribution). Whereas on the input side, the content begins in the digital domain so it's not even adding an extra analog step.
I think in this case the point would be to trust the source publishing the audio, not the content itself.
I can see a GPG-like registry where news stations publish their public keys so that anyone can verify that their audio/video snippets have not been tampered with.
Yeah, this is my understanding also. I don't understand how this is better than the publisher signing the content with their own key though, rather than relying on a special microphone.
I wish the article was more clear on what threat model they're trying to address.
I would say there is a better way of doing this. In the forensic community it has long been known that audio can be timestamped and fingerprinted by the 50/60 Hz electrical hum in the background. In many countries this hum is recorded so it can be used later as evidence.
Unless counter-forensics are applied, AI audio is not going to have the right hum.
It’s not cryptographically secure, but it gives good assurance and doesn’t require tamper-resistant hardware (which makes the cryptographic security dependent on how secure the resident keys are).
https://en.m.wikipedia.org/wiki/Electrical_network_frequency...
But isn't it really easy to apply those counter-forensics? It seems like it's much easier to fake a hum on a digitally created audio file than to change or remove the hum from an existing file.
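To make the point about counter-forensics concrete, here is a minimal sketch in plain Python. It is not a real ENF analysis: actual forensic matching tracks small frequency fluctuations over time against a logged grid reference, which this does not attempt. The function names (`goertzel_power`, `add_hum`, `dominant_mains`), the 8 kHz sample rate, and the 440 Hz test tone are all assumptions made for illustration. It only shows that mixing a mains-like tone into synthetic audio takes one line, and that a naive detector then sees it.

```python
import math

def goertzel_power(samples, sample_rate, freq):
    """Squared DFT magnitude at the bin nearest `freq` (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * freq / sample_rate)          # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2

def add_hum(samples, sample_rate, freq, amplitude=0.01):
    """Mix a constant-frequency sine (a crude stand-in for mains hum) into the audio."""
    return [x + amplitude * math.sin(2.0 * math.pi * freq * i / sample_rate)
            for i, x in enumerate(samples)]

def dominant_mains(samples, sample_rate):
    """Naive check: report whichever of 50 Hz or 60 Hz carries more power."""
    p50 = goertzel_power(samples, sample_rate, 50.0)
    p60 = goertzel_power(samples, sample_rate, 60.0)
    return 50 if p50 > p60 else 60

SAMPLE_RATE = 8000  # assumed sample rate, chosen so 50/60 Hz fall on exact bins
# One second of "generated" audio: a pure 440 Hz tone with no hum at all.
clean = [0.5 * math.sin(2.0 * math.pi * 440 * i / SAMPLE_RATE)
         for i in range(SAMPLE_RATE)]
# The counter-forensic step: add a 50 Hz hum after the fact.
forged = add_hum(clean, SAMPLE_RATE, 50.0)

print(goertzel_power(clean, SAMPLE_RATE, 50.0))   # close to zero
print(goertzel_power(forged, SAMPLE_RATE, 50.0))  # clearly non-zero
print(dominant_mains(forged, SAMPLE_RATE))
```

A constant tone like this would probably not survive comparison against a recorded grid log, since the real hum drifts. But if the grid log for the claimed time is available to the forger, the same approach could in principle replay the logged frequency instead.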
> The hardware manufacturer can destroy the private key after it is placed on the device. By doing so, the private key is inaccessible!
I'm pretty sure the private key is very accessible, if it's used at runtime which it is. Just not easily.
Yeah. People have been hoping for a signed "Photoshop-proof camera" since digital image manipulation was invented, but it has the same limitations as any form of DRM. It only slows people down a bit. There's also the analog hole, i.e. just stick your "attested" microphone in front of a speaker.
Canon tried to do this with their ODD system, and they got kinda close, but it is possible to extract the key from a given camera, and forge signatures.
Their verification system also seems to include a smartcard or SD card-like thing (which might be doing something special, or might just be DRM)
or rip out the wires and attach whatever speaker you want, not really detectable
It is detectable if the other components are cryptographically paired, like they are in newer iPhones.
Go try to swap a camera or screen between two identical iPhone models; it won't work.
Extracting the private key out of a modern HSM enclave is essentially impossible for anything less than NSA-level capabilities.
> In our setting, we can compute a function that computes the hashes of the inputs and outputs the edited audio from the inputs. By revealing the hashes, we can be assured that the inputs match the recorded audio!
I don't understand how this works, or what exactly is being proven here. For instance, you could silence the given inputs and inject some other unrelated audio, so the fact that your output hash incorporates the input hashes doesn't seem very meaningful.
I figure I must be missing something here.
The editing can be checked. Basically, the verifier is given the input hash, the editing function (e.g. remove noise), and the output. By using a zk-SNARK, the verifier can be convinced that the output is indeed the result of applying that editing function to the input, and that the input hash is the result of hashing that same input.
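As a rough illustration of the statement being proven, here is the same relation checked in the clear with plain Python. This is not a zk-SNARK and has no zero-knowledge property: here the checker sees the original input, whereas the proof system would convince the verifier of the same relation without revealing it. The names `relation_holds`, `trim`, and `edit`, and the byte strings standing in for audio, are hypothetical.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def trim(audio: bytes, start: int, end: int) -> bytes:
    """Hypothetical edit function: keep only bytes [start, end)."""
    return audio[start:end]

def relation_holds(input_audio: bytes, input_hash: str, edit, output_audio: bytes) -> bool:
    """The relation a proof would attest to:
    hash(input) == input_hash AND edit(input) == output."""
    return sha256_hex(input_audio) == input_hash and edit(input_audio) == output_audio

# Stand-in for the recording; its hash is what the device would attest to.
recorded = bytes(range(256)) * 4
attested_hash = sha256_hex(recorded)

def edit(audio: bytes) -> bytes:
    # The public, agreed-upon edit: trim to a fixed window.
    return trim(audio, 100, 400)

published = edit(recorded)

honest = relation_holds(recorded, attested_hash, edit, published)
injected = relation_holds(recorded, attested_hash, edit, b"unrelated audio")
print(honest, injected)
```

Because the verifier is told which editing function was applied, an "edit" that silences the input and substitutes other audio would have to be declared as exactly that, and the verifier can decide whether such an edit is acceptable.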