c.d.f.
In statistics, cumulative distribution function: a function which gives the probability of obtaining a particular value or lower.
CFB
CFB or Ciphertext FeedBack is an operating mode for a block cipher.
CFB is closely related to OFB, and is intended to provide some of the characteristics of a stream cipher from a block cipher. CFB generally forms an autokey stream cipher. CFB is a way of using a block cipher to form a random number generator. The resulting pseudorandom confusion sequence can be combined with data as in the usual stream cipher.
CFB assumes a shift register of the block cipher block size. An IV or initial value first fills the register, and then is ciphered. Part of the result, often just a single byte, is used to cipher data, and the resulting ciphertext is also shifted into the register. The new register value is ciphered, producing another confusion value for use in stream ciphering.
One disadvantage of this, of course, is the need for a full block-wide ciphering operation, typically for each data byte ciphered. The advantage is the ability to cipher individual characters, instead of requiring accumulation into a block before processing.
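The shift-register mechanism described above can be sketched as follows. A trivial keyed byte-mixing function stands in for the real block cipher here (it is not secure and is purely illustrative); everything else follows the text: the IV fills the register, the register is ciphered, one byte of the result ciphers one data byte, and the ciphertext byte is shifted back in.

```python
BLOCK = 8  # assumed block size in bytes, for illustration

def toy_block_cipher(block: bytes, key: int) -> bytes:
    # Placeholder for a real block cipher: NOT secure.
    return bytes((b ^ ((key + i * 37) & 0xFF)) for i, b in enumerate(block))

def cfb_encrypt(plaintext: bytes, iv: bytes, key: int) -> bytes:
    register = bytearray(iv)                   # IV first fills the register
    out = bytearray()
    for p in plaintext:
        keystream = toy_block_cipher(bytes(register), key)
        c = p ^ keystream[0]                   # one result byte ciphers one data byte
        out.append(c)
        register = register[1:] + bytes([c])   # ciphertext shifts into the register
    return bytes(out)

def cfb_decrypt(ciphertext: bytes, iv: bytes, key: int) -> bytes:
    register = bytearray(iv)
    out = bytearray()
    for c in ciphertext:
        keystream = toy_block_cipher(bytes(register), key)
        out.append(c ^ keystream[0])
        register = register[1:] + bytes([c])   # same feedback as on encryption
    return bytes(out)
```

Note that deciphering feeds the received ciphertext (not the recovered plaintext) back into the register, which is why the two directions stay synchronized.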
Chain
An operation repeated in a sequence, such that each result depends upon the previous result, or an initial value. One example is the CBC operating mode.
Chaos
The unexpected ability to find numerical relationships in physical processes formerly considered random. Typically these take the form of iterative applications of fairly simple computations. In a chaotic system, even tiny changes in state eventually lead to major changes in state; this is called "sensitive dependence on initial conditions." It has been argued that every good computational random number generator is "chaotic" in this sense.
In physics, the "state" of an analog physical system cannot be fully measured, which always leaves some remaining uncertainty to be magnified on subsequent steps. And, in many cases, a physical system may be slightly affected by thermal noise and thus continue to accumulate new information into its "state."
In a computer, the state of the digital system is explicit and complete, and there is no uncertainty. No noise is accumulated. All operations are completely deterministic. This means that, in a computer, even a "chaotic" computation is completely predictable and repeatable.
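Both points can be demonstrated with the logistic map, a standard example of a chaotic iteration (the map itself is not from this entry; it is used here only to illustrate sensitive dependence and determinism):

```python
def iterate(x, r=3.99, steps=60):
    # Logistic map x' = r*x*(1-x), iterated a fixed number of steps.
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

a = iterate(0.4)           # one starting point
b = iterate(0.4 + 1e-10)   # a tiny perturbation of it
# a and b typically differ grossly after 60 steps: sensitive
# dependence on initial conditions.  Yet rerunning iterate(0.4)
# always yields exactly the same a: fully deterministic.
```

The tiny initial difference is magnified at each step until the two trajectories are unrelated, while each individual trajectory remains perfectly repeatable, exactly as the entry states.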
Chi-Square
In statistics, a goodness-of-fit test used for comparing two distributions. Mainly used on nominal and ordinal measurements. Also see: Kolmogorov-Smirnov.
In the usual case, many independent samples are counted by category or separated into value-range "bins." The reference distribution gives us the number of values to expect in each bin. Then we compute a X2 test statistic related to the difference between the distributions:
X2 = SUM( SQR(Observed[i] - Expected[i]) / Expected[i] )
("SQR" is the squaring function, and we require that each expectation not be zero.) Then we use a tabulation of chi-square statistic values to look up the probability that a particular X2 value or lower (in the c.d.f.) would occur by random sampling if both distributions were the same. The statistic also depends upon the "degrees of freedom," which is almost always one less than the final number of bins. See the chi-square section of the Ciphers By Ritter / JavaScript computation pages.
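The formula above translates directly into code. The bin counts below are made-up illustrative numbers, not from any real experiment:

```python
def chi_square(observed, expected):
    # X2 = SUM( SQR(Observed[i] - Expected[i]) / Expected[i] )
    assert all(e > 0 for e in expected)   # each expectation must be nonzero
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [12,  8, 11,  9]   # counts actually found in each bin
expected = [10, 10, 10, 10]   # counts the reference distribution predicts
x2 = chi_square(observed, expected)   # 0.4 + 0.4 + 0.1 + 0.1 = 1.0
# degrees of freedom = 4 bins - 1 = 3; look up X2 = 1.0 with 3 d.f.
# in a chi-square table (or c.d.f. routine) for the probability
```

Note the total (1.0) is near the rule-of-thumb expectation of roughly one count of chi-square per bin for matching distributions, discussed further below.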
The c.d.f. percentage for a particular chi-square value is the area of the statistic distribution to the left of the statistic value; this is the probability of obtaining that statistic value or less by random selection when testing two distributions which are exactly the same. Repeated trials which randomly sample two identical distributions should produce about the same number of X2 values in each quarter of the distribution (0% to 25%, 25% to 50%, 50% to 75%, and 75% to 100%). So if we repeatedly find only very high percentage values, we can assume that we are probing different distributions. And even a single very high percentage value would be a matter of some interest.
Any statistic probability can be expressed either as the proportion of the area to the left of the statistic value (this is the "cumulative distribution function" or c.d.f.), or as the area to the right of the value (this is the "upper tail"). Using the upper tail representation for the X2 distribution can make sense because the usual chi-square test is a "one tail" test where the decision is always made on the upper tail. But the "upper tail" has an opposite "sense" to the c.d.f., where higher statistic values always produce higher percentage values. Personally, I find it helpful to describe all statistics by their c.d.f., thus avoiding the use of a wrong "polarity" when interpreting any particular statistic. While it is easy enough to convert from the c.d.f. to the complement or vice versa (just subtract from 1.0), we can base our arguments on either form, since the statistical implications are the same.
It is often unnecessary to use a statistical test if we just want to know whether a function is producing something like the expected distribution: we can look at the binned values and generally get a good idea about whether the distributions change in similar ways at similar places. A good rule-of-thumb is to expect chi-square totals similar to the number of bins, but distinctly different distributions often produce huge totals far beyond the values in any table, and computing an exact probability for such cases is simply irrelevant. On the other hand, it can be very useful to perform 20 to 40 independent experiments to look for a reasonable statistic distribution, rather than simply making a "yes / no" decision on the basis of what might turn out to be a rather unusual result.
Since we are accumulating discrete bin-counts, any fractional expectation will always differ from any actual count. For example, suppose we expect an even distribution, but have many bins and so only accumulate enough samples to observe about 1 count for every 2 bins. In this situation, the absolute best sample we could hope to see would be something like (0,1,0,1,0,1,...), which would represent an even, balanced distribution over the range. But even in this best possible case we would still be off by half a count in each and every bin, so the chi-square result would not properly characterize this best possible sequence. Accordingly, we need to accumulate enough samples so that the quantization which occurs in binning does not appreciably affect the accuracy of the result. Normally I try to expect at least 10 counts in each bin.
But when we have a reference distribution that trails off toward zero, inevitably there will be some bins with few counts. Taking more samples will just expand the range of bins, some of which will be lightly filled in any case. We can avoid quantization error by summing both the observations and expectations from multiple bins, until we get a reasonable expectation value (again, I like to see 10 counts or more). In this way, the "tails" of the distribution can be more properly (and legitimately) characterized.
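The bin-merging idea can be sketched like this; the threshold of 10 follows the text, and the trailing-off counts are invented for illustration:

```python
def merge_bins(observed, expected, min_expect=10.0):
    # Sum adjacent observations and expectations until each merged
    # bin expects at least min_expect counts (10, per the text).
    obs_out, exp_out = [], []
    o_acc = e_acc = 0.0
    for o, e in zip(observed, expected):
        o_acc += o
        e_acc += e
        if e_acc >= min_expect:
            obs_out.append(o_acc)
            exp_out.append(e_acc)
            o_acc = e_acc = 0.0
    if e_acc > 0:                 # fold any leftover tail into the last bin
        obs_out[-1] += o_acc
        exp_out[-1] += e_acc
    return obs_out, exp_out

obs = [40, 30, 15, 8, 4, 2, 1]   # a distribution trailing off toward zero
exp = [38, 29, 16, 9, 5, 2, 1]
m_obs, m_exp = merge_bins(obs, exp)
# every merged expectation is now >= 10, so the tail bins can be
# legitimately included in the chi-square sum
```

Since the same bins are merged in both the observed and expected lists, the comparison between the two distributions remains valid; only the degrees of freedom (merged bins minus one) change.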
Cipher
In general, a key-selected secret transformation between plaintext and ciphertext. Specifically, a secrecy mechanism or process which operates on individual characters or bits independent of semantic content. As opposed to a secret code, which generally operates on words, phrases or sentences, each of which may carry some amount of complete meaning. Also see: cryptography, block cipher, stream cipher, a cipher taxonomy, and substitution.
A good cipher can transform secret information into a multitude of different intermediate forms, each of which represents the original information. Any of these intermediate forms or ciphertexts can be produced by ciphering the information under a particular key value. The intent is that the original information only be exposed by one of the many possible keyed interpretations of that ciphertext. Yet the correct interpretation is available merely by deciphering under the appropriate key.
A cipher appears to reduce the protection of secret information to enciphering under some key, and then keeping that key secret. This is a great reduction of effort and potential exposure, and is much like keeping your valuables in your house, and then locking the door when you leave. But there are also similar limitations and potential problems.
With a good cipher, the resulting ciphertext can be stored, transmitted, or otherwise exposed without also exposing the secret information hidden inside. This means that ciphertext can be stored in, or transmitted through, systems which have no secrecy protection. For transmitted information, this also means that the cipher itself must be distributed in multiple places, so in general the cipher cannot be assumed to be secret. With a good cipher, only the deciphering key need be kept secret.
A Cipher Taxonomy
For the analysis of cipher operation it is useful to collect ciphers into groups based on their functioning (or intended functioning). The goal is to group ciphers which are essentially similar, so that as we gain an understanding of one cipher, we can apply that understanding to others in the same group. We thus classify not by the components which make up the cipher, but instead on the "black-box" operation of the cipher itself.
We seek to hide distinctions of size, because operation is independent of size, and because size effects are usually straightforward. We thus classify serious block ciphers as keyed simple substitution, just like newspaper amusement ciphers, despite their obvious differences in strength and construction. This allows us to compare the results from an ideal tiny cipher to those from a large cipher construction; the grouping thus can provide benchmark characteristics for measuring large cipher constructions.
We could of course treat each cipher as an entity unto itself, or relate ciphers by their dates of discovery, the tree of developments which produced them, or by known strength. But each of these criteria is more or less limited to telling us "this cipher is what it is." We already know that. What we want to know is what other ciphers function in a similar way, and then whatever is known about those ciphers. In this way, every cipher need not be an island unto itself, but instead can be judged and compared in a related community of similar techniques.
Our primary distinction is between ciphers which handle all the data at once (block ciphers), and those which handle some, then some more, then some more (stream ciphers). We thus see the usual repeated use of a block cipher as a stream meta-cipher which has the block cipher as a component. It is also possible for a stream cipher to be re-keyed or re-originate frequently, and so appear to operate on "blocks." Such a cipher, however, would not have the overall diffusion we normally associate with a block cipher, and so might usefully be regarded as a stream meta-cipher with a stream cipher component.
The goal is not to give each cipher a label, but instead to seek insight. Each cipher in a particular general class carries with it the consequences of that class. And because these groupings ignore size, we are free to generalize from the small to the large and so predict effects which may be unnoticed in full-size ciphers.
A BLOCK CIPHER
A block cipher requires the accumulation of some amount of data or multiple data elements for ciphering to complete. (Sometimes stream ciphers accumulate data for convenience, as in cylinder ciphers, which nevertheless logically cipher each character independently.)