Show HN: Fincher, a steganography tool for text

github.com

40 points by m4xm4n 7 years ago · 12 comments

fredley 7 years ago

Very interesting tool, although storing the payload as typos does seem a bit visible and prone to mistaken 'correction'. Other approaches to consider might be:

* Swapping punctuation for visually identical but distinct characters. This would not work for printed documents, however.

* Encoding only 'believable' typos, e.g. it's/its. You could encode a binary stream across all instances of it(')s, or across other such substitutions.

* Encoding the stream in whitespace, e.g. two spaces vs. one space after a full stop. Printed documents would be lossy (full stops at line endings would be ambiguous), though error detection/correction schemes can help with that.
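
A minimal sketch of the whitespace idea from the last bullet, assuming one space after a full stop means 0 and two spaces mean 1. The names `embed_bits` and `extract_bits` are hypothetical and are not taken from Fincher or Snow. Full stops at line endings are skipped because the gap after them cannot be read back reliably.

```python
import re

# Spaces that follow a full stop and precede more text on the same line.
# A full stop at a line ending has no readable gap, so it is never matched.
SENTENCE_GAP = re.compile(r"(?<=\.) +(?=\S)")


def embed_bits(text: str, bits: str) -> str:
    """Hide a string of '0'/'1' characters in the gaps after full stops."""
    capacity = len(SENTENCE_GAP.findall(text))
    if len(bits) > capacity:
        raise ValueError(f"need {len(bits)} gaps, cover text has {capacity}")
    remaining = iter(bits)

    def gap(_match: re.Match) -> str:
        # Gaps past the end of the payload are normalised to a single space.
        return "  " if next(remaining, "0") == "1" else " "

    return SENTENCE_GAP.sub(gap, text)


def extract_bits(text: str, count: int) -> str:
    """Read back the first `count` bits from the gaps after full stops."""
    gaps = SENTENCE_GAP.findall(text)
    return "".join("1" if len(g) >= 2 else "0" for g in gaps[:count])


cover = "One. Two. Three. Four. Five."
stego = embed_bits(cover, "1011")
print(repr(stego))
print(extract_bits(stego, 4))
```

The receiver has to know the payload length (here 4), since unused gaps read back as zeros. The capacity is one bit per mid-line sentence gap, so anything beyond a short payload needs a long cover text.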

  • bambax 7 years ago

    Typical OCR errors would be interesting too: confusing the letter "n" with the letters "ri", for example.

    It would be visually challenging to detect (and also, maybe, difficult for an OCR engine).

  • nrjames 7 years ago

    Snow is interesting and uses white space instead. http://www.darkside.com.au/snow/

  • m4xm4n (OP) 7 years ago

    Yeah, I need to work on making the displacements and replacements a bit more context-aware (& probably linguistically aware). There are cases where it can "replace" a character with the same character, for example.

    I do like your idea about visually similar but distinct character replacement. That would be a really fun one to implement.

wstuartcl 7 years ago

I worked on something very similar. My version also mutated punctuation, swapped common phrases/words for synonyms, and re-ordered sentences. Instead of steganography, the purpose was to create identifiable mutations in the text, acting as a canary to tie disclosures back to specific recipients. Each party receiving a confidential document got slight mutations unique to their copy, so a copy/paste of even a fairly small fragment (or fragments) could be used to identify the owner of that version.
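
A rough sketch of how such a per-recipient canary could work, assuming a document template whose variable slots each offer two interchangeable wordings. The template text, the recipient names, and the functions `render` and `suspects` are all hypothetical and only illustrate the general technique, not the tool described above.

```python
# Hypothetical template: plain strings are fixed text, tuples are slots
# with two interchangeable wordings (variant 0 and variant 1).
TEMPLATE = [
    "The board ", ("approved", "signed off on"), " the merger ",
    ("on Tuesday", "this Tuesday"), ". Staff will be ",
    ("told", "informed"), " ", ("next week", "in the coming week"), ".",
]


def render(code: int) -> str:
    """Build one recipient's copy; bit i of `code` picks the wording of slot i."""
    out, slot = [], 0
    for part in TEMPLATE:
        if isinstance(part, tuple):
            out.append(part[(code >> slot) & 1])
            slot += 1
        else:
            out.append(part)
    return "".join(out)


def suspects(fragment: str, recipients: list[str]) -> list[str]:
    """Return every recipient whose copy contains the leaked fragment."""
    # Each recipient's code is simply their position in the list.
    return [name for code, name in enumerate(recipients)
            if fragment in render(code)]


recipients = ["alice", "bob", "carol", "dave"]
print(render(2))
print(suspects("approved the merger this Tuesday", recipients))
print(suspects("Staff will be told", recipients))
```

With four slots this distinguishes at most 16 recipients. A fragment only narrows the list when it spans enough varied slots: the second query above identifies carol alone, while the third matches all four copies.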

  • matt_the_bass 7 years ago

    This seems like a useful tool. Is it a product?

    • wstuartcl 7 years ago

      No, sorry. It was built to catch an employee leaking confidential company information to the media. I do not know how you could make this into a product and still maintain its reliability: the more widely known the mutations are, the easier it would be to mitigate the watermarking.

sehugg 7 years ago

I did one of these many years ago, basically just abusing lex/flex: https://github.com/countrygeek/stegparty/blob/master/stegpar...

josephcar 7 years ago

This is similar to steganos (https://github.com/fastforwardlabs/steganos), which tries to limit itself to changes that do not change the meaning of the text.

  • m4xm4n (OP) 7 years ago

    Oh, very cool! I like the data model for the changes. I've been thinking about adding an analysis pass using something similar to make it possible to implement more sophisticated strategies. The tricky bit will be retaining the stream-based approach.

awinter-py 7 years ago

First Crystal codebase I've seen! Niccce.
