Convert standard Geiger counter to RNG (github.com)

36 points by propmaster 2 years ago · 41 comments

anfractuosity 2 years ago

https://www.fourmilab.ch/hotbits/how3.html is one algorithm I've seen before.

I'm just wondering: with a very active source and this algorithm, if a 'pulse' of activity from the source was detected at each interval, could you potentially get a sequence of 0 - 15 being generated? (I could well be misunderstanding.)
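
For reference, a rough sketch of the pairwise-interval scheme that page describes: time two successive inter-click intervals, compare them, and flip the sense of the comparison on alternate bits to cancel systematic timer bias. (wait_for_click() is a hypothetical hook that blocks until the detector registers a pulse and returns a high-resolution timestamp.)

  def wait_for_click():
      # Hypothetical: block until the detector registers a pulse,
      # then return a high-resolution timestamp in seconds.
      raise NotImplementedError

  def random_bits(n):
      bits, flip = [], 0
      t0 = wait_for_click()
      while len(bits) < n:
          t1 = wait_for_click()
          t2 = wait_for_click()
          d1, d2 = t1 - t0, t2 - t1
          if d1 != d2:                           # ties are simply discarded
              bits.append((1 if d1 > d2 else 0) ^ flip)
              flip ^= 1                          # reverse the comparison sense each bit
          t0 = t2
      return bits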

LeoPanthera 2 years ago

I'm not a math or a CS expert, but I naively "designed" a PRNG which was simply repeatedly doing hash(random_seed+counter).

Obviously you have to keep random_seed secure, and use a hashing algorithm that does not have easy collisions, but other than that, is there any actual downside to this method?
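
A minimal sketch of that construction, assuming SHA-256 as the hash:

  import hashlib

  class CounterPRNG:
      # hash(random_seed + counter), as described above; the seed must stay secret.
      def __init__(self, seed: bytes):
          self.seed, self.counter = seed, 0

      def next_block(self) -> bytes:
          out = hashlib.sha256(self.seed + self.counter.to_bytes(8, "big")).digest()
          self.counter += 1
          return out

  prng = CounterPRNG(b"replace with a real 256-bit secret")
  print(prng.next_block().hex())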

  • er4hn 2 years ago

    You do have a few details to worry about, such as the size of random_seed and how much you can output before you need to get a new seed. There's also the problem that if the state of the RNG is exposed at any point, all past generated values can be determined (oh, it's currently on {random_seed="foobar", ctr=5}; I can predict the prior values).

    But it is very close to the US-Fed-approved Hash_DRBG, under NIST SP 800-90A, section 10.1.1 here: https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.S... .

    The big problem with PRNG algorithms, imo, is that determining how good they are feels like a bit of a black art. You can easily have outputs that measure as high entropy yet come from very predictable inputs or can be easily broken. It's always been a nit of mine when trying to analyze how good a PRNG is.

    • eternityforest 2 years ago

      Is there actually a limit to how much data you can output? Modern state sizes are so big you'll never get a repeat in billions of years.

      We also mix entropy in continually. Even if your sources only give you 1 bit per minute, within a few hours you'll have way too much entropy for attackers to decode anything before that window.

      Plus, if you're using hashing, you can't figure out previous states from the current state.

      Then on top of that, people generally also use the hardware RNGs built into all modern chips. On Linux I believe the kernel doesn't trust the hardware RNG by itself but mixes it with other sources.

      Actually building a PRNG yourself would still be hard though!

      It would be cool if modern devices had a solid state radiation detector for this purpose. Not so much because they actually need it, but anything security related often sells well, and it might let them put radiation sensing in consumer devices. I wonder how much stuff people would find!

      • stcredzero 2 years ago

        Actually building a PRNG yourself would still be hard though!

        Well, coming up with a crappy one isn't hard. Coming up with a new good one, with good efficiency, is hard, though.

        It would be cool if modern devices had a solid state radiation detector for this purpose.

        Don't current chips already do this with noise from a diode? Quick google: Yes. There are quite a number of these sorts of methods: https://en.wikipedia.org/wiki/Hardware_random_number_generat...

        Nothing is perfect, it turns out.

        • eternityforest 2 years ago

          They do, but a regular diode noise source can't double as an actual radiation detector. I was thinking of something like a SPAD, so everyone could have background radiation monitoring at all times and get an alert if someone mixed a cobalt source into the scrap metal used to make their furniture, or the nearby coal plant was puffing out radiation or something.

    • kloch 2 years ago

      Analysis of PRNG's is difficult and very important:

      https://www.pcg-random.org/posts/does-it-beat-the-minimal-st...

      • tptacek 2 years ago

        This isn't a security analysis, and PCG isn't a secure random number generator, but it is the case that you can fashion a crude secure random number generator out of a sufficiently large state and a cryptographic hash.

  • lxgr 2 years ago

    One important thing this does not achieve is forward and backward secrecy.

    A proper PRNG periodically irreversibly mutates its internal state (which gives you the property of not being able to retrieve past outputs in case of a point-in-time system compromise) and also incorporates additional entropy into its internal state whenever it becomes available (which makes it impossible to predict all future outputs after a point-in-time compromise).

    Backward secrecy can be achieved by completely replacing your current state with some hash function of it, i.e. completely deterministically; forward secrecy needs additional entropy, so it's not always feasible.
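
    A minimal sketch of both properties, assuming SHA-256 and some external entropy source:

      import hashlib

      class RatchetedPRNG:
          def __init__(self, seed: bytes):
              self.state = hashlib.sha256(seed).digest()

          def next_block(self) -> bytes:
              out = hashlib.sha256(b"out" + self.state).digest()
              # Backward secrecy: irreversibly ratchet the state, so a later
              # compromise can't be run backwards to recover past outputs.
              self.state = hashlib.sha256(b"ratchet" + self.state).digest()
              return out

          def mix_entropy(self, fresh: bytes) -> None:
              # Forward secrecy: fold in new entropy whenever it is available,
              # so a past compromise stops predicting future outputs.
              self.state = hashlib.sha256(self.state + fresh).digest()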

  • aljarry 2 years ago

    Depends on the usage. Commonly available "legit" PRNGs have a uniform distribution, even over a long range of numbers.

    Wikipedia has a list of interesting tests you could run to confirm your PRNG: https://en.wikipedia.org/wiki/Pseudorandom_number_generator#... where the first two are interesting for a non-cryptographic PRNG.

  • Verdex 2 years ago

    IIRC there's a PRNG talk for videogames where they mention your method:

    https://www.youtube.com/watch?v=LWFzPP8ZbdU&t=4s

    There are downsides, but for videogames, the speaker decided that this wasn't that bad of an idea.

stcredzero 2 years ago

Not a crypto expert, but here's my understanding of the problem. The system just has to gain bits of entropy from the physical randomness source faster than the hardware and software leaks them. So one could feed the entropy into a stream cipher with the right characteristics.

What if one just counted the number of clicks, modulo 2? Then one could take the output of a ChaCha20 or Salsa20 stream cipher and, every N outputs of the stream cipher, increment the stream cipher's counter 0 or 1 times. One would also have to re-key the stream cipher periodically. (Maybe after gathering 128 bits of output from the Geiger counter in another register, then using that.)

  • tptacek 2 years ago

    I don't understand what you mean by "leak" here. You can definitely make a secure random bit generator with a stream cipher. Once it's fully initialized/seeded, it's practically never going to run out of random bits.

    • stcredzero 2 years ago

      I don't understand what you mean by "leak" here.

      Again, not a crypto expert, but my understanding is that all stream ciphers leak some bits of the key with enough output. This is why differential cryptanalysis is possible against them. This is also why RC4 could be broken.

      EDIT: Perhaps a better algorithm: Accumulate 128 bits of entropy and use that to key Salsa20. Then, whenever another 128 bits are accumulated, simply re-key the stream cipher.
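
      A sketch of that accumulate-and-rekey idea, using ChaCha20 from the `cryptography` package as a stand-in for Salsa20 (which it doesn't ship; note its ChaCha20 wants a 256-bit key rather than 128). geiger_bytes() is a hypothetical source of debiased detector bits:

        import os
        from cryptography.hazmat.primitives.ciphers import Cipher, algorithms

        def geiger_bytes(n: int) -> bytes:
            # Hypothetical: n bytes of debiased bits from the detector.
            return os.urandom(n)   # stand-in for the real source

        class RekeyedStream:
            REKEY_AFTER = 1 << 20   # rekey after this much keystream

            def __init__(self):
                self._rekey()

            def _rekey(self):
                key = geiger_bytes(32)   # freshly accumulated entropy as the key
                self._enc = Cipher(algorithms.ChaCha20(key, os.urandom(16)),
                                   mode=None).encryptor()
                self._produced = 0

            def read(self, n: int) -> bytes:
                if self._produced + n > self.REKEY_AFTER:
                    self._rekey()
                self._produced += n
                return self._enc.update(b"\x00" * n)   # keystream = encrypted zeros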

      • tptacek 2 years ago

        It’s why RC4 is broken, not a thing we accept from ciphers that aren’t comically broken.

        • stcredzero 2 years ago

          Isn't that a matter of degree? RC4 leaks a lot. "Acceptable" ciphers leak so little, it's still not feasible to break the full version. But apparently this scales with increasing or reducing rounds. So it seems unlikely it ever goes to zero. It just gets near enough to zero, that there are no feasible attacks for some length of message.

          • tptacek 2 years ago

            No, it's not. A reasonable cryptanalytical model of a modern stream cipher (AES in a stream mode, or Salsa20, or whatever) is that --- within the birthday bounds of the underlying cipher (exabytes? you'd look it up, whatever you're using), and, in the case of something like an AEAD (or maybe just in the idiosyncratic case of GCM), the bounds of your nonce width --- they're not leaking _anything_. "Comical" really is the right word to use with respect to what RC4 did. It is kind of amazing that a cipher that broken remained in common use for as long as it did.

            • stcredzero 2 years ago

              within the birthday bounds of the underlying cipher (exabytes?

              So modern stream ciphers do leak, just so slowly, it shouldn't ever matter in a practical sense if you're using them correctly? That was more or less my understanding to begin with, but maybe expressed ineptly.

              the bounds of your nonce width --- they're not leaking _anything_

              This reminds me of unicity distance. So under a certain number of outputs, one simply can't infer the internal state of the algorithm? It could be any number of internal states? That makes sense to me.

              • tptacek 2 years ago

                I don't think "leak" is the right way to think about it. There are birthday bounds on the transform and, for nonce-based modes, on the nonces used for successive encryptions. I'm not sure it's really meaningful to think about the incremental marginal degree of "leakage" with each successive ciphertext up to that limit. We're coming up to the limit of what I'm comfortable talking about though; this is normally the point where 'pbsd jumps in to explain why what I'm saying is dumb.

      • adrian_b 2 years ago

        The term "stream cipher" is much more general than how you understand it.

        There is a huge number of stream ciphers that have never been broken.

        There is also a great number of stream ciphers that have been broken, but almost all of them belong to a class of ciphers that deserves the name "cheap stream ciphers".

        Such "cheap stream ciphers" have been designed so that they can have a much cheaper implementation than the ciphers that are based on pseudorandom functions similar in complexity to DES and its successors, with the hope that despite being cheaper they will still be hard enough to break.

        In most cases, sooner or later these hopes have been shown to be unfounded.

        The term "stream cipher" just means that it encrypts a stream of plaintext symbols by combining a stream of encryption mask symbols with them.

        There are many kinds of stream ciphers based on how the stream of encryption mask symbols are generated and based on the methods used to combine them with the plaintext.

        Even if more general combination functions are possible when the only condition is to be able to decrypt the ciphertext, when an additional condition is imposed, that the statistical properties of the plaintext must not leak into the ciphertext, then the encryption mask symbols must be combined with the plaintext symbols using a quasigroup operation (a.k.a. a Latin square operation).

        Most frequently, it is preferred that the quasigroup operation is a group operation, because this makes it simpler to implement. Moreover, usually it is preferred that the group operation is an additive group operation, to have an even cheaper implementation. Such stream ciphers are called additive stream ciphers.

        When implemented in software, there are many additive group operations with equal complexity. However, when the encryption is implemented in hardware, addition modulo 2 is the cheapest so it is preferred.

        The use of addition modulo 2 as a combination method has been invented by Gilbert S. Vernam in 1918. The term "binary additive stream cipher" should have been synonymous with "Vernam cipher". Unfortunately, in another example of mistaken historical attribution, the term "Vernam cipher" is used to mean any stream cipher where the encryption mask stream is truly random, which is a different invention, first described by Frank Miller in 1882 and rediscovered and popularized by Joseph Mauborgne, a user of Vernam machines.

        Based on how the encryption mask stream is generated, there are many kinds of stream ciphers, but most of the practical ones fall into two classes. In synchronous stream ciphers, the initial state, the transition function and the output function of the automaton that generates the encryption mask symbols do not depend on the plaintext or on the ciphertext. In self-synchronizing stream ciphers, the encryption mask symbols are generated either as a pseudorandom function of the ciphertext symbols (a.k.a. ciphertext autokey stream ciphers) or as a pseudorandom function of the plaintext symbols (a.k.a. plaintext autokey stream ciphers; these were abandoned a long time ago, because they are less secure). The two kinds of self-synchronizing stream ciphers are dual, i.e. starting from one such cipher a dual cipher is obtained by using the encryption algorithm for decryption and the decryption algorithm for encryption.

        The stream ciphers that are most frequently used nowadays are binary additive synchronous stream ciphers, but there is a great variety of possible structures for the automaton that generates the encryption mask symbols. When implemented in software on a modern CPU, the simplest such automaton is made by using AES in counter mode (CTR mode).
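
        As a concrete illustration of that last point, a minimal AES-CTR keystream sketch with the `cryptography` package (assumed available): the counter automaton is fed through AES, and the resulting mask is XORed, i.e. added modulo 2, with the plaintext.

          import os
          from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

          key = os.urandom(32)       # AES-256 key
          counter0 = os.urandom(16)  # 128-bit initial counter block

          enc = Cipher(algorithms.AES(key), modes.CTR(counter0)).encryptor()
          ciphertext = enc.update(b"attack at dawn") + enc.finalize()

          dec = Cipher(algorithms.AES(key), modes.CTR(counter0)).decryptor()
          assert dec.update(ciphertext) + dec.finalize() == b"attack at dawn"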

        • stcredzero 2 years ago

          The term "stream cipher" is much more general than how you understand it.

          This strikes me as a non-sequitur? Could you provide me with a quote where I misunderstood the term "stream cipher" and how I misunderstood it?

          There is a huge number of stream ciphers that have never been broken.

          I understand this to mean, that the full version of the algorithm doesn't have any feasible attacks. That the difficulty of the known attacks is so high, there is virtually zero value to them over using brute force.

          Based on how the encryption mask stream is generated, there are many kinds of stream ciphers

          If you read the thread carefully enough, you can deduce which kind of stream cipher I was thinking of.

          • tptacek 2 years ago

            I don't really follow the misunderstandings here, but on the off chance this re-level-sets the thread: you don't continuously inject new "entropy" (ie: rekey) a random bit generator out of concern that the underlying primitive is "leaking" (if it leaks, you fix the leak). Rather, you do so for post-compromise security, because there is basically no reason not to have some measure of post-compromise security such that if the state of your DRBG is compromised (by some systems vulnerability, not by cryptanalysis), in some relatively short interval security will be "restored" by rekeying.

            I think that's pretty much it, the whole reason sophisticated CSPRNGs have entropy pools and stuff.

          • adrian_b 2 years ago

            I quote here what you have said:

            "my understanding is that all stream ciphers leak some bits of the key with enough output"

            This is a false statement.

            Nowadays, the most frequently used encryption method is to use a stream cipher (more precisely a binary additive synchronous stream cipher), where the automaton that generates the encryption mask stream is implemented with a counter and with an output function that is an unpredictable pseudorandom function, usually either AES or ChaCha20.

            With an appropriate pseudorandom function, such a stream cipher does not leak any bits of the key, even with enough output. There are methods of breaking a cipher with enough computation and memory, but those are specific to the pseudorandom function that happens to be used, and they are not limited to stream ciphers; they are equally applicable to block ciphers that use the same pseudorandom function. Moreover, block ciphers constructed with the known methods are usually easier to break than good stream ciphers that use the same pseudorandom function.

            It is obvious that when you have written "all stream ciphers", what you had in mind were not "all" such ciphers, but certain stream ciphers with very low implementation cost, but also with inadequate strength, which have been specified in many older communication protocols, e.g. in those for mobile phones (because the communication equipment manufacturers only cared about a low production cost, not about security, and the so-called encryption was only added to fool the customers, not to prevent professional attacks). Such cheap stream ciphers have been easily broken by anyone who has studied them seriously.

            The ciphers that you have designated by "all stream ciphers" are an extremely small fraction of what is meant by "all stream ciphers" in the cryptographic literature, as I have explained in my previous post.

            Your confusion might have been increased by hearing people referring to AES as being a "block cipher". That is very wrong, because AES is not a block cipher. It is not even a cipher. AES is an unpredictable pseudorandom function, which maps an encryption key and an input symbol to an output symbol. Alternatively, AES can be viewed as a parametrized family of bijective functions (a.k.a. invertible functions), where each of the bijective functions is identified by a parameter that is the encryption key.

            With an unpredictable pseudorandom function, it is possible to build various kinds of ciphers, either block ciphers or stream ciphers. Nowadays it is much more frequent to use stream ciphers based on AES, than block ciphers based on AES.

            General disadvantages of block ciphers are that the message must be padded to a multiple of the block length and that both the direct and the inverse forms of the invertible pseudorandom function must be implemented.

            In the past, because authenticated encryption was not used, block ciphers (e.g. the so-called CBC mode) were preferred, because they made it difficult for amateurs to tamper with the message, even if, in the absence of authentication, they had little chance of stopping a professional attacker.

            After the use of authentication became ubiquitous, stream ciphers became preferable for most applications, even if there remain a few niches where block ciphers have advantages.

            • stcredzero 2 years ago

              There are methods of breaking a cipher with enough computation and memory, but those are specific to the pseudorandom function that happens to be used

              But doesn't this constitute leakage of the algorithm's internal state in the output? (Albeit very slowly.)

              and they are not limited to stream ciphers

              I never said any of the above was limited to stream ciphers.

  • lxgr 2 years ago

    > So one could feed the entropy into a stream cipher with the right characteristics.

    Certainly, but your proposed scheme does not feed entropy back into the stream cipher, it just offsets its output. This can be trivially broken if the stream cipher key ever leaks.

    A better solution would combine (ideally securely hash) the stream cipher key with a relatively large number of fresh bits (i.e. on the order of > 100).
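
    In sketch form, assuming SHA-256, with fresh_bits standing in for the >100 bits of new detector entropy:

      import hashlib, os

      old_key = os.urandom(32)
      fresh_bits = os.urandom(16)   # stand-in: >100 bits of new detector entropy

      # Fold the entropy into the key itself rather than offsetting the output.
      new_key = hashlib.sha256(old_key + fresh_bits).digest()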

    • tptacek 2 years ago

      I mean this is true, but it's also true that every cipher can be trivially broken if its key leaks. Yes: you want to rekey regularly (and that's not complicated). But whatever circumstance allowed your state to leak in the first place is most probably repeatable by an attacker, so you likely have much bigger problems than PRNG design.

      If you want to snipe at people for freelancing their own CSPRNGs, the better point to make is that people should be relying on the OS's secure random generator to the exclusion of everything else; there are systems security reasons to avoid userland CSPRNGs.

      But I think for the most part these are just people enjoying working out how a CSPRNG works. Which, mazel tov!

    • stcredzero 2 years ago

      Certainly, but your proposed scheme does not feed entropy back into the stream cipher, it just offsets its output.

      Which effectively introduces entropy into the output, as well as to the internal state of the stream cipher.

      This can be trivially broken if the stream cipher key ever leaks.

      This is a trivial restatement of how symmetric ciphers are supposed to work.

gus_massa 2 years ago

Have you tested how good the random numbers are?

I think they should be good, if the interval between clicks is much larger than the interval for the counter, but I may be missing something. Also, some sources emit two particles, and I don't know if there are interesting cascades of decays.

If two consecutive clicks are close enough, I expect an uneven distribution with increasing sequences like 13478AC023489BC...

Also, for very high click rates I expect missed double clicks.

  • auspiv 2 years ago

    The linked previous project does:

    https://github.com/gbonacini/nuclear_random_number_generator

    "I tested the randomness both testing the bytes and the single bits in the binary file created from the ASCII file from the appliance console ( see Appendix here or test directory for full result text).

    Accordingly with ENT man page: "If the percentage is greater than 99% or less than 1%, the sequence is almost certainly not random. If the percentage is between 99% and 95% or between 1% and 5%, the sequence is suspect. Percentages between 90% and 95% and 5% and 10% indicate the sequence is “almost suspect”", so having 85.17 percent in the bytes test and 60.52 percent in the bits test should be fine."

    Entropy = 7.998386 bits per byte.

    Value  Char  Occurrences  Fraction
    0            413451       0.500284
    1            412981       0.499716

    Total:       826432       1.000000

    Entropy = 1.000000 bits per bit.
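
    For anyone who wants to reproduce the bytes figure without ENT, Shannon entropy of the byte histogram is only a few lines (random.bin is a hypothetical capture of the generator's output):

      import math
      from collections import Counter

      def bits_per_byte(path):
          data = open(path, "rb").read()
          counts = Counter(data)
          n = len(data)
          # Shannon entropy of the byte distribution, as ENT reports it.
          return -sum(c / n * math.log2(c / n) for c in counts.values())

      print(bits_per_byte("random.bin"))   # ideally close to 8.0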

  • mike_hock 2 years ago

    How about this?

    Preparation: Pick some measurement interval in which you'd expect a bunch of clicks, e.g. a minute. Count how many clicks you get in that interval. Then take 138% of that time and divide it by the number of clicks you counted. That will be the interval for your timer.

    RNG: Tune the counter so it rolls over approximately once per time interval you've computed above. We'll take two measurements per interval, b0 and b1. Every time the counter rolls over, reset b0 and b1 to zero. When a click occurs, set b0 to one if the counter is less than half its maximum value, otherwise set b1 to one (if multiple clicks occur, the respective bit just stays at 1). When the counter rolls over, check b0 and b1. If they're equal, do nothing (discard the measurement). If they're different, take b0 as a random bit that you've just created (discard b1, and then reset b0 and b1 for the next round). You can then combine multiple successive bits into random numbers of the width that you want.
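
    A simulated sketch of that scheme, with a Poisson click train standing in for the detector. (138% is close to 2 ln 2 ≈ 1.386, which makes the probability of at least one click per half-interval almost exactly 1/2; discarding equal halves is then a von Neumann debiasing step.)

      import random

      def click_times(rate, t_end):
          # Simulated Poisson click train (stand-in for the real detector).
          t, out = 0.0, []
          while True:
              t += random.expovariate(rate)
              if t >= t_end:
                  return out
              out.append(t)

      def debiased_bits(nbits, rate=10.0):
          interval = 1.38 / rate   # timer period: 138% of mean click spacing
          clicks = iter(click_times(rate, 100 * nbits * interval))
          pending = next(clicks, None)
          bits, t0 = [], 0.0
          while len(bits) < nbits and pending is not None:
              b0 = b1 = 0
              while pending is not None and pending < t0 + interval:
                  if pending < t0 + interval / 2:
                      b0 = 1   # click in the first half
                  else:
                      b1 = 1   # click in the second half
                  pending = next(clicks, None)
              if b0 != b1:     # unequal halves: keep b0 as the random bit
                  bits.append(b0)
              t0 += interval   # counter rollover: start the next round
          return bits

      print(debiased_bits(16))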

  • nonrandomstring 2 years ago

    The source of radiation surely makes a difference.

    Directing the sensor at a small point source makes clustering higher than if pointing it up into space [0] and receiving the "planar wave" background radiation of the Universe.

    It took me a while to figure this out when measuring the audio clicks and crackles from fire. I expect similar statistics to apply at the nuclear level (physicists please correct me). Within small, local sources, common effects can be linked to common instigators. In other words, events beget events: a local variation in some variable sets off a bunch of related things [1].

    As you sum this over an ever larger number of suitably decoupled sources, the central limit theorem starts to apply, and what was a bad uniform/even distribution becomes a good Gaussian. In practice you need more than the theoretical >12 sources, but once you approach a few hundred uncorrelated sources the bell curve gets pretty good.
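
    (I'm guessing the ">12" refers to the classic Irwin-Hall trick: the sum of 12 independent uniform(0,1) samples has mean 6 and variance 1, so subtracting 6 approximates a standard normal. A quick check:)

      import random

      def approx_gaussian():
          # Irwin-Hall: sum of 12 U(0,1) samples has mean 6 and variance 1.
          return sum(random.random() for _ in range(12)) - 6

      samples = [approx_gaussian() for _ in range(100_000)]
      mean = sum(samples) / len(samples)
      var = sum((s - mean) ** 2 for s in samples) / len(samples)
      print(round(mean, 2), round(var, 2))   # ≈ 0.0, ≈ 1.0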

    [0] probably assume the sky/space is an infinite number of sources all at very large distances.

    [1] In the (dangerous) limit imagine a near critical mass of Uranium emitting bursts of chain cascades.

    • pxx 2 years ago

      No, this is all wrong. Fire crackles clearly depend on state. Nuclear decay is memoryless; decays are independent events.

      https://en.m.wikipedia.org/wiki/Radioactive_decay#Mathematic...

      > theoretical >12 sources

      What?

      • nonrandomstring 2 years ago

        Thanks for the correction on nuclear state. It's hard to imagine anything having absolutely no state - but I guess a nucleus is not a "thing" as such, even if surrounded by lots of similar non-things. Does state not subsist in the total assembled mass in my example of near-critical collection?

        • nonrandomstring 2 years ago

          Just answering myself after some reading today.

          No, not really, because fission caused by free neutrons (which are rare) is something different from the more common "spontaneous" radioactive decay.

          That does raise the question: what cause, that we don't yet understand, lies behind little bits of the universe falling apart?

  • ashleyn 2 years ago

    I'm assuming the source is just ordinary background radiation. I'm wondering if, at such low levels, the geiger counter can accurately determine it enough to get a truly random output (as opposed to, some internal squelch on the unit creating predictable patterns). You probably would get far better randomisation ripping out americium from an old smoke detector and putting it by the geiger counter. (Just take note that residue on your hands from handling it may be toxic if ingested.)

    • robotguy 2 years ago

      Much safer to buy Uranium on Amazon. That’s what I did. It really freaks people out when you pull out a small metal can with a RADIOACTIVE sticker on it.

      You can also get uranium glass beads and radioactive watch hands on Electronics Goldmine.

  • pxx 2 years ago

    Edit: removed previous post.

    The increment happens as fast as the clock allows, which means any reasonable click interval is much longer than the counter's cycle time. It delays 250us once it hears a click.

    I'm not quite sure why we bother with resetting the counter to 0 when it hears a click though.
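
    The general shape of that technique, as a rough sketch (not the project's actual code): sample a fast free-running counter at the unpredictable instant of each click and keep only the low bits. On this reading the reset to zero shouldn't matter, since the counter wraps many times between clicks either way.

      import time

      def wait_for_click():
          # Hypothetical: block until the detector pulses.
          raise NotImplementedError

      def counter_nibble():
          # The click arrives at an unpredictable phase of the fast clock,
          # so the low 4 bits of the count are effectively random.
          wait_for_click()
          return time.perf_counter_ns() & 0xF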

sethaurus 2 years ago

Slightly off-topic: I’ve been thinking about getting a Geiger counter as a hobbyist toy. Anyone got a recommendation?

filterfiber 2 years ago

Several of the GQ counters have serial access; does anyone know if you can get the individual events from those?
