Compression
Would a compression algorithm then alleviate this weakness, and would perfect compression remove it completely? (The idea being that compression increases the entropy density of the data, and perfect compression would push it up to one bit of entropy per bit.)
Encryption systems may be a lot less secure than we thought, according to new research into the maths underpinning today's cryptography. Boffins in the US and Ireland have managed to poke holes in modern information theory, an area of mathematics used to prove the strength of cryptographic systems before they are trusted and …
There's no way to completely eliminate the weaknesses inherent in encryption schemes, assuming you want someone, including yourself, to be able to read the data again. All of encryption is not based around 'unbreakable' but around 'really, really, really time consuming to break'.
Just like any other aspect of security, there is no end game or perfect, final solution; you can only keep everything moving forward. As ways to speed up the breaking of one method are found, another method will have to be developed.
The beauty is that a message encoded with a one-time pad contains not just any message, but every message. You can (and agencies do) transmit the encoded message via short-wave radio with the whole world listening in; with the wrong pad they get the wrong message, but would never even know it was wrong.
It's actually very revealing to do a proper wartime one-time-pad encipherment - you can make a pad pair with a set of dice and a lot of patience. Once you do it yourself by hand you see how perfect the encryption is, and why. The reason doing it electronically is less perfect is that you have to share the pads securely at some point, and to do that, instead of using impossible-to-break encryption, we use hard-to-break encryption.
"All of encryption is not based around 'unbreakable' but around 'really, really, really time consuming to break'."
Mm. At some level you are simply setting random monkeys to write Shakespeare... if your private and unshared secret is as big as the data itself, or bigger.
And you might never know that what you had decrypted was in fact still encrypted.
For example you might encrypt a plain text message that in itself was an encrypted message. The NSA would stop right there, thinking the job was in fact done.
The deficiencies of most encryption systems are that they are public/private key, and the encryption algorithms are known.
Encrypting your own hard disk using the entire text of 'Jude the Obscure' as a key, and an algorithm of your own fiendish devising, would pose a much bigger challenge.
Plug in the memory stick on which algorithm and key are held, and the computer works. Remove it, and its disk is a jumble of meaningless noise.
So you're saying randomize and then encrypt normally? (I assumed it was just a dumb OTP encryption scheme.) But it still adds no extra security.
Without your scheme:
1. guess key K
2. attempt decrypt using K.
With your scheme:
1. guess key K.
2. decrypt first four bytes using K, seed PRNG with them.
3. attempt decrypt using K, remembering to xor the bytes with output of PRNG.
Essentially you have replaced encryption function E(xi)
with a new function:
E'(xi) = E(xi) xor PRNG(i)
As far as I can see, it doesn't provide any additional protection against brute force attacks. It would provide protection against weaknesses in the encryption - e.g. if the algorithm performs badly when handed a string of zeros.
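A minimal Python sketch of the scheme as analysed above - the 4-byte seed layout matches the steps, while E() is a toy self-inverse XOR stand-in for a real cipher:

```python
import hashlib
import os
import random

def E(data: bytes, key: bytes) -> bytes:
    # Toy stand-in 'cipher': XOR with a SHA-256 counter stream (self-inverse).
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def encrypt(plaintext: bytes, key: bytes) -> bytes:
    # Whiten the plaintext with a PRNG keyed by a random 4-byte seed,
    # then encrypt seed + whitened payload: E'(xi) = E(xi xor PRNG(i)).
    seed = os.urandom(4)
    rng = random.Random(int.from_bytes(seed, "big"))
    whitened = bytes(b ^ rng.randrange(256) for b in plaintext)
    return E(seed + whitened, key)

def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    # Exactly step 2 of the attack outline: decrypt the first four bytes,
    # seed the PRNG with them, then un-xor the rest.
    data = E(ciphertext, key)            # toy E is its own inverse
    seed, whitened = data[:4], data[4:]
    rng = random.Random(int.from_bytes(seed, "big"))
    return bytes(b ^ rng.randrange(256) for b in whitened)
```

As the sketch makes plain, anyone who guesses K recovers the seed for free, which is why the whitening adds no brute-force resistance.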
"As far as I can see, it doesn't provide any additional protection against brute force attacks... It would provide protection against weaknesses in the encryption."
Exactly.. and since the article is about an analytical attack, not key guessing, Destroy All Monster's PRNG suggestion seems perfectly reasonable. In fact, "XOR against a good PRNG" is pretty much a synonym of "encrypt by stream cipher". Not really practical on a block device of course but on a block device any of the decent chaining modes should be employed and will be delivering exactly the same effect anyway.
Until very recently there wasn't a trustworthy stream cipher, but for the kind of pre-encryption suggested I'd imagine even crappy oldies like the Mersenne Twister or Blum Blum Shub should be more than adequate for any conceivable amount of data. For the truly paranoid, there's suddenly an embarrassment ( http://www.ecrypt.eu.org/stream/ ) of high quality stream ciphers to choose from - all designed to deliver ciphertext indistinguishable from true randomness... but as properly applied block ciphers also produce pseudorandom ciphertext, such a block cipher would serve just as well as a stream cipher.

So this latest scare is another great ad for TrueCrypt and cipher cascades in general: the first cipher serves as the pseudorandomiser, feeding unpredictable pseudorandom cleartext to the second cipher... and on again to the third if required. The first layer of encryption might be "exponentially easier" to attack, but to deploy that attack you'd have to get through one or two layers of what is effectively encrypted randomness. Unless you believe in the "evil cipher" bogey man, of course. ;)
"good PRNG"
If it's a pseudo-random number generator, then its output isn't random! That's the very definition of the "pseudo" part of it. That's why a certain brand of Las Vegas blackjack machines was cracked. You want an output that looks as if it is totally random, for a non-random input. Say you have a camera focused on a busy street. Each frame, you XOR the bytes, and that's your new random number. There's no predictable algorithm generating the raw numbers, so the output is random. A pseudo-random number generator uses an algorithm, so the sequence repeats at an interval. That's what makes it pseudo-random.
That's why a plain-text attack is so brutal on encrypted text. Once you get a guess on that key, you narrow your search down until you've got it. We won't need quantum computers to break encryption because we'll have cheap CUDA and Adapteva machines to try out absolute gobs of keys.
> If it's a pseudo-random number generator, then its output isn't random!
Imagine that ... I KNOW THAT. And the good thing is ... it doesn't matter.
> We won't need quantum computers to break encryption because we'll have cheap CUDA and Adapteva machines to try out absolute gobs of keys.
The former are needed for RSA breakage. Your classical parallel processing ain't useful for that. And here we are talking about symmetric crypto.
Just in case anyone reading through doesn't know: random.org "offers true random numbers to anyone on the Internet". Probably not useful if you're doing frequent encryption, but worth knowing about, even if only for the good ideas about generation of randomness.
Who in their right mind would trust random numbers from a complete stranger sent over the net? Even if the site is genuine, how do you know you've not been spoofed by a man in the middle attack? Sure, it's fine for picking your lottery ticket numbers, but not for anything you want to be secure.
"There's no predictable algorithm generating the raw numbers, so the output is random."
"Predictability" (by which I would assume you mean deterministic behaviour) and randomness are two completely different qualities. A data source may be inherently nondeterministic but not random at all. To qualify as random any value in the target domain must be as likely an output as any other - if there is any weighting or bias in the output it is not random. A lot of real-world systems have been compromised by this very implied assumption - it's unpredictable, therefore it's random. Carry on spouting basic mantras if you want, but yes, "good" PRNGs - that is statistically tested, qualified PRNGs are frequently better for many applications than half-assed attempts at true randomness implemented by those without a firm grasp of even the basics.
"Predictability" (by which I would assume you mean deterministic behaviour) and randomness are two completely different qualities... To qualify as random any value in the target domain must be as likely an output as any other - if there is any weighting or bias in the output it is not random. A lot of real-world systems have been compromised by this very implied assumption - it's unpredictable, therefore it's random.
Right on. I've long believed that most modern security flaws are down not to lack of thought or effort but to lack of study or knowledge. People spend 10 minutes studying this stuff, imagine themselves to be some kind of expert, and begin to spout fundamentally flawed premises as if they were absolute truths. History is littered with examples of how non-random systems have been broken - it was essentially this very issue of a slight bias (a letter never encrypting to its original value) that allowed even Enigma to be broken.
(Sorry, wordy.)
Actually, to increase entropy on a bit of data one would decompress it.
That is, you have 4 kilobytes of data but send 8 kilobytes after encryption. One way of doing this is to cut down the alphabet used, sending the data as fake genome information (GATC and so forth) or just as hex code; the fewer letters in the alphabet, the longer the words.
Of course this schema is just an add-on to existing encipherment; the problem, as I see it, is that we have whittled the gross methods in actual use down to a handful.
What happens, I wonder, when a machine runs across some new method?
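The alphabet-reduction idea above can be sketched in a few lines of Python (the GATC bit-mapping here is an arbitrary choice of mine, not anything standard):

```python
def to_gatc(data: bytes) -> str:
    """Expand each byte into four letters of a 2-bit alphabet,
    quadrupling the length (hex would merely double it)."""
    letters = "GATC"
    out = []
    for b in data:
        for shift in (6, 4, 2, 0):          # high bit-pair first
            out.append(letters[(b >> shift) & 0b11])
    return "".join(out)

def from_gatc(text: str) -> bytes:
    index = {"G": 0, "A": 1, "T": 2, "C": 3}
    out = bytearray()
    for i in range(0, len(text), 4):        # four letters per byte
        b = 0
        for ch in text[i:i + 4]:
            b = (b << 2) | index[ch]
        out.append(b)
    return bytes(out)
```

Note that the expansion adds no information: the mapping is fixed and invertible, so the entropy per symbol goes down, not up.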
There's a neat trick you can do to securely encrypt your messages permanently. First bring up Word and select New Document. Now copy paste (CTRL C) your message into the new document. Then select all the text you want to make secret. Time to find the box called "font" (this might take a while as microsoft have recently hidden away all the controls in ribbons for security). Now use the dropdown in the font box to pick the security option called "wingdings" (or if you want more security, wingdings3). Now save the document to a .doc file. This .doc file is now unreadable/encrypted. You can encrypt a message of any length using this technique.
The only downside is if the NSA have access to Microsoft Word they will be able to decrypt your .doc file. Given recent revelations we have to assume the NSA might indeed have access to Microsoft Word. The only solution is to take a screen copy of your encrypted message and paste it into mspaint and save it into .bmp encrypted format. The NSA won't be able to decrypt it now because there is no font box in mspaint (well there is, but microsoft have locked it down to prevent it being used in this way).
If you want to use this technique and are using Linux you will first need to upgrade to Windows Vista or above.
Ho, ho, yus. V. funny, and funnier because it's already been done, in public, and by people who should really have known better:
SCO further obfuscated the code on this slide by switching it to a Greek font, but that was easily undone. It's entertaining that the SCO folks had no clue that the font-change could be so easily reversed. I'm glad they don't work on my computer security :-)
Source: Analysis of SCO's Las Vegas Slide Show [freerepublic.com, 2003].
I think you are missing the point. Sure, the encrypted data looks like noise. But the *unencrypted* data never looks like noise. Therefore, you can throw away every partial decrypt that looks too much like noise as soon as you see that it does. An encryption scheme is an invertible function, and much effort has been spent looking at the qualities of the right-hand side. But one can also trim the set of candidate functions based on the qualities of the left-hand side.
As some people have pointed out, compression might actually be a bad thing. An All-or-Nothing Transform (AONT) on the data would mitigate this attack because, as the name suggests, the attacker gains nothing from being able to "probably" figure out correlations for small parts of the message.
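A minimal sketch of one well-known AONT, Rivest's package transform, using SHA-256 as a stand-in keystream generator (a real implementation would use a block cipher in counter mode):

```python
import hashlib
import secrets

def _stream(key: bytes, n: int) -> bytes:
    # Counter-mode keystream from SHA-256, standing in for a block cipher.
    out = b""
    ctr = 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def aont_package(message: bytes) -> bytes:
    """Mask the message with a stream under a throwaway inner key, then
    append that key XORed with a hash of the masked message. Every byte
    of the package is needed to recover any byte of the message."""
    inner = secrets.token_bytes(32)
    masked = bytes(a ^ b for a, b in zip(message, _stream(inner, len(message))))
    tail = bytes(a ^ b for a, b in zip(inner, hashlib.sha256(masked).digest()))
    return masked + tail

def aont_unpackage(package: bytes) -> bytes:
    masked, tail = package[:-32], package[-32:]
    inner = bytes(a ^ b for a, b in zip(tail, hashlib.sha256(masked).digest()))
    return bytes(a ^ b for a, b in zip(masked, _stream(inner, len(masked))))
```

The transform is keyless: anyone holding the whole package can invert it, which is why it is applied *before* encryption, not instead of it.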
This post has been deleted by its author
What the OP is probably talking about is the tendency of people to send things like WORD documents through an encryption stream.
WORD documents are FULL of predictable and repetitive bit strings and characters. They include large amounts of macro data and formatting information, all in standard positions. It's child's play to decrypt something if you have most of the plaintext immediately available.
If you are doing encryption properly, you have a code officer who knows these things, and who formats the messages accordingly before transmission. It helps if you use very short message strings and make extensive use of codewords. The military and FCO don't do this for fun, you know.
Sometimes when I am asked to implement a 'really strong' crypto system which I know will be used to send standard emails and documents, I wonder if I should tell the client this. And then I think, 'Why bother his head over it?' I'm not being paid to do his risk analysis as well...
This isn't about known-plaintext attacks; it's about cryptanalysis targeting poor entropy of the cleartext... and yes, it's ancient news that poor entropy of the cleartext makes such attacks feasible, but that isn't what's being suggested here. What's being suggested appears to be that such attacks are significantly MORE feasible than some cryptographers previously thought, because those cryptographers may have misunderstood or misapplied the theory used to calculate those feasibilities.
Oh and yes again, it's precisely because these attacks *are* so well known and understood that encryption *does* often begin with an initial compression pass to improve the entropy density of the cleartext. Just as suggested above.
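That initial compression pass is a one-liner to try; a sketch with Python's zlib (any lossless compressor would do):

```python
import zlib

# 8 KB of highly repetitive, low-entropy cleartext.
cleartext = b"Attack at dawn. " * 512

# The pre-encryption compression pass: the cipher now sees a short,
# dense, near-random-looking input instead of obvious repetition.
packed = zlib.compress(cleartext, 9)
print(len(cleartext), len(packed))

# Lossless, so the recipient recovers the original after decrypting.
assert zlib.decompress(packed) == cleartext
```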
My research into what I call 'Data Packaging' leads me to believe that the universe of numbers that are likely to be used for these things is much, much smaller than you would expect from the number of nominal bits. As a matter of fact, the problem appears to be near intractably difficult. I found this early on and actually registered a domain name whose only purpose was to supply 'good' numbers to support encryption more than a decade ago.
A key that is hundreds of bits long has no realistic chance of being from a truly random sampling of that space unless the process used to get the number is intensive.
Probably the same effect as "zipping a zipped file". It might, or it might not. Depends on the method.
For example, a simple decoder ring, applied twice with 2 different transforms, would add to the complexity, but not be "impossible" to decode.
But I'm not an expert...
The second application of a decoder ring would have no additional benefit. Decoder-ring shifts compose, so the 1st transform followed by the 2nd is equivalent to some single 3rd transform. I believe that even the 3DES approach would not help: 3DES runs DES three times (encrypting with one key, decrypting with a second, encrypting with a third - hence triple DES). Chaining it that way does strengthen DES, but, I believe, it would do nothing for decoder-ring style encryption.
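The closure argument is easy to demonstrate:

```python
def shift(text: bytes, k: int) -> bytes:
    # A 'decoder ring': rotate every byte k places around a 256-value alphabet.
    return bytes((b + k) % 256 for b in text)

msg = b"secret"

# Two passes with different settings...
double = shift(shift(msg, 7), 13)

# ...are exactly equivalent to one pass with the combined setting.
assert double == shift(msg, 20)
```

Because the two settings collapse to one, a brute-force attacker still only has 256 effective keys to try, no matter how many passes were applied.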
Once again you show us that your chosen handle is a misnomer.
Compressing a compressed file in a lossless process, using the same settings and algorithm - as with a ZIP file - will always yield a larger file, because of the added overhead. There is nothing to be gained from a second pass: the first pass has already achieved the optimum output under the given conditions.
When we're talking about real encryption (i.e. not just a simple substitution), further encrypting a ciphertext using even the same method is very likely to improve its resistance to being cracked. This is because the first cipher will increase the entropy of the input data to the subsequent ciphers, as is explained above. This helps to avoid repetitious or predictable patterns, which are what the cryptanalyst is looking for to give her starting clues.
If a number of different schemes or settings (block size is an important one) are used in sequence, then even better - an option provided for example by the popular TrueCrypt product.
I think this is accounted for in information theory; it is just that people have made some incorrect assumptions. For instance, ASCII text has on average about 5 bits of information per character even though the characters are 8-bit bytes, and taking advantage of this knowledge could potentially help cryptanalysis compared to assuming totally unknown plaintext. If you first compress the file - tar.bz2, tar.gz, zip, etc. - you then have a known-plaintext "magic" header; if the file size is stored in the archive, that is known plaintext too, and for the filename you may be able to make some assumptions. This may reduce the search space for cryptanalysis. I'd assume longer keys are still the way to go, though.
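Those fixed "magic" headers are easy to confirm; a quick Python check of three common formats:

```python
import gzip
import io
import zipfile
import zlib

# Fixed leading bytes = known plaintext at a known offset.
assert zlib.compress(b"anything")[:2] == b"\x78\x9c"   # zlib header, default level

buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as f:
    f.write(b"anything")
assert buf.getvalue()[:3] == b"\x1f\x8b\x08"           # gzip magic + deflate method

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("a.txt", "anything")
assert buf.getvalue()[:4] == b"PK\x03\x04"             # zip local file header
```

Whatever the payload, an attacker knows exactly what the first bytes of the plaintext were - a ready-made crib.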
Well the compression itself doesn't always need the header. Even if you need something like Huffman tables you can easily pre-share them as it's done in so many standards.
Maybe it's even beneficial to use certain parts of the shared knowledge of the compression as a key.
I tend to use a 1k or larger key; sometimes many times the actual information length. This key is, of course, based on that dandy pseudo random function with either the time of day to the millisecond or the actual message as the seed.
Most of these things are rotations (remember ROT13?) around the 256-value ASCII set; the numbers of rotations applied to the key and message are many and randomly chosen. This allows you to play with modular math.
I used a conversion to hex mostly to allow the message (or log in, in some cases) to be sent as post (well, you try to escape a chunk of binary.)
This essentially doubles post size, but a real clever person could encrypt the hex code and throw it back into hex (rinse, repeat, etc.)
Most of this is just because it is fun and fairly easy.
Now if we were dealing with credit card numbers or socials I suppose I would have to get serious.
There are some notions in both the original article and the comments that need to be cleared up here.
1. The idea that cryptosystems are built on the notion that the incoming data is uniformly distributed / random looking is absolute bollocks. All cryptosystems are designed to be secure regardless of input entropy. Otherwise, it's literally not a cryptosystem, it's some kind of text mangler that little Jimmy might write in his highschool computing class.
2. A known plaintext attack is one of the attack models that cryptosystems are built to resist. That is to say, cryptosystems are specifically designed from the ground up (and have been for decades) to take this type of attack vector into account. That being said, attack models are theoretical only and are not actually useful for anything other than proving security of various cryptosystems and protocols. They aren't actual attacks - it's not like someone out there is an expert at known plaintext attacks and is just waiting for you to initiate a HTTPS connection to your dating website of choice.
3. The material referenced in the article merely shows yet another type of cryptanalysis attack. There is a long history of new cryptanalysis methods popping up over the years - Differential cryptanalysis, linear cryptanalysis, integral cryptanalysis, boomerang attacks etc. These might result in a cryptographic break (theoretical or otherwise), but then the next round of cryptosystem development will take the various new attack methods into account. We aren't seeing any fundamental shift in information theory or cryptographic design here, at least from what I can see. Of course, that's not as interesting as your generic sky-is-falling-down story that El Reg uses to lure unwary internet surfers into a frantic click-fest of all their ads (or so I presume).
4. There is definitely such a thing as unbreakable encryption - the one time pad. One time pads are literally completely unbreakable, provided the pad is completely random (ie. not a PRNG) and that the pad is only used once and immediately destroyed after use. Given only the ciphertext of an OTP, all the cryptanalysts in the world still couldn't tell whether the message was "Attack at dawn" or "Retreat to sea" or even "We love Bieber". Of course, one time pads are not at all practical for most things, so discussion of OTPs is mostly just academic.
5. If you use a PRNG with an OTP, you've literally just made a stream cipher. Of course, you need to use PRNGs designed for cryptographic use if you want any hope of some greasy analyst in a three-letter-agency not listening into your voip calls to your mother; most PRNGs are not designed for cryptographic use and are merely designed to provide pseudo-randomness in a statistical sense, rather than an unpredictable sense.
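Point 4 can be demonstrated in a few lines: for any OTP ciphertext and any candidate message of the same length, there exists a pad that "decrypts" one to the other, so the ciphertext alone carries no information about which message was sent.

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

real_msg = b"Attack at dawn"
pad = secrets.token_bytes(len(real_msg))     # truly random, used once
ciphertext = xor(real_msg, pad)

# An adversary proposing a different message can construct a pad
# that makes the same ciphertext decrypt to it perfectly.
fake_msg = b"Retreat to sea"
fake_pad = xor(ciphertext, fake_msg)

assert xor(ciphertext, fake_pad) == fake_msg
assert xor(ciphertext, pad) == real_msg
```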
E.g. if I deliberately misspell words or add random letters in my message, does this make it harder for machines to identify the text? I.e. the opposite of Enigma operators using standard texts in their messages?
"meet me at nine at the station below the clock"
becomes
"meet me 9 at stn below t'clok"
I guess this is part of the reason they used to transmit messages as 5 character alphanumeric groups.
One of the classics for Enigma was (IIRC) to look for "Heil Hitler" at the end of the message. Get a station that used that and you could nail its entire output (if not its whole network).
Likewise DESCrack had a mode where the processing elements would interrupt the control processor if they found something that looked "interesting". You might get a bunch of false positives, but intuitively this is also the idea of non-uniform entropy.
As for wireless door locks: oh look, another proprietary (i.e. secret) protocol that has not been subject to public scrutiny and turns out to be not quite as secure as its makers claim.
So, bottom line: randomness is the enemy of compressibility. So first randomize your file's contents. I don't mean what a character means, I mean its actual location in the file, i.e. transmit the file out of order, and try to ensure no one knows what type of file is being transmitted in the first place. If it's known that the file is a Word or Excel document, you already know its structure. You've already got cribs.
Frankly I'm very surprised this assumption is made.
Garbo: with one of the Nazi encrypted networks, they used to plant information via a double agent, then wait for the report to turn up in the traffic. Then they'd have the plaintext and the ciphertext, and that would give them the keys. Bingo: now you can read the rest of the traffic you didn't already know.
Doesn't this stuff make encryption potentially more secure for those who know what they're doing?
If there are biases in (say) how people choose keys, or in the plaintext, that can be exploited, then an attacker will be using methods that search for the most likely cases first.
So if you are able to choose keys or plaintext that are statistically unlikely (as far as the attacker's knowledge goes), then it's likely to take the attacker longer to crack the encryption than if he used unbiased techniques.
It's a bit like trying to choose lottery numbers that no-one else will have chosen, in order to get a bigger payout.
I would hope the authors (none of whom seem to be crypto mainstream) are not responsible for the puffery in this article. It's perhaps an advance in terms of coding theory, but not cryptography (which is completely different).
Cryptographers already know about low entropy in plaintexts and passwords. They don't often consider uniform sources, and they hardly ever consider asymptotic equipartition to be relevant.
In real life, when considering resistance against a brute force attack, cryptographers typically assume that a plaintext is known, and therefore has entropy of 1. For advanced situations, they assume a chosen plaintext with an entropy of zero.
These are the main forms of theoretical brute force attack on a cipher. Some are pretty unlikely in real life, but a cipher which is not resistant to all of these will be rejected out of hand:
Ciphertext-only attack - the attacker has only the ciphertext and what he knows about the sender - e.g. that he speaks English - to help him.
Known plaintext attack - the attacker can find the plaintext for one message, and wants to find the key so he can decrypt more messages sent with the same key.
Chosen plaintext attack - the attacker can trick the sender into encrypting a message of his choice. In some cases this is more useful when trying to find the key.
Adaptive chosen plaintext attack - the attacker chooses a plaintext and gets the sender to encrypt it, then can choose another based on the results of the first encryption and trick the sender again. And so on.
Chosen ciphertext attack - the attacker can get the recipient to decrypt messages of his choice. Again he wants to find the key.
Adaptive chosen ciphertext attack - as in the adaptive chosen plaintext attack above but with ciphertexts.
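A toy sketch of the known-plaintext model above, using a deliberately weak made-up cipher (16-bit key, SHA-256 keystream) so the brute force finishes instantly - everything here is illustrative, not a real attack:

```python
import hashlib

def toy_encrypt(plaintext: bytes, key: int) -> bytes:
    # Deliberately tiny 16-bit keyspace so the search completes in moments.
    stream = hashlib.sha256(key.to_bytes(2, "big")).digest()
    return bytes(a ^ b for a, b in zip(plaintext, stream))

known_plaintext = b"standard header"      # the crib the attacker has obtained
secret_key = 0x2a3f
intercepted = toy_encrypt(known_plaintext, secret_key)

# Known-plaintext attack: try keys until one maps the known plaintext to
# the intercepted ciphertext; that key then unlocks every other message.
recovered = next(k for k in range(2 ** 16)
                 if toy_encrypt(known_plaintext, k) == intercepted)
assert recovered == secret_key
```

Real ciphers survive this model only because the keyspace is astronomically larger, not because the search strategy is any different.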