项目作者: terrillmoore

项目描述 :
The code from NIST SP-800-22 for testing random-number generators, along with docs for reference
高级语言: E
项目地址: git://github.com/terrillmoore/NIST-Statistical-Test-Suite.git
创建时间: 2019-01-05T22:42:21Z
项目社区:https://github.com/terrillmoore/NIST-Statistical-Test-Suite

开源协议:Other

下载


The NIST Statistical Test Suite

This repository contains the code for testing random number generators published by the US National Institute of Science and Technology (NIST). The code was created to accompany NIST SP-800-22 R1a, a copy of which can be found in this repository as nistspecialpublication800-22r1a.pdf (ref [1]).

Given that this is rather useful code, and that it was unavailable in January 2019 due to the US government shutdown, I thought it would be useful to publish this for reference. (Call to action for civic hackers: we need a curated site that links to this kind of material.)

Contents:

Import structure

So that we can add the usual GitHub sugar and keep clear “what is new here” vs “what came from NIST”, the NIST code is in the subdirectory sts.

How to build on Ubuntu

Because git doesn’t store empty directories, you need to do some setup after initial checkout. This repository has a script for that, at the top level: setup.sh. The setup procedure is:

  1. $ ./setup.sh
  2. Setting up directories in ./sts/experiments.
  3. Created ./sts/obj.
  4. Created ./sts/experiments/AlgorithmTesting.
  5. Created ./sts/experiments/BBS.
  6. Created ./sts/experiments/CCG.
  7. Created ./sts/experiments/G-SHA1.
  8. Created ./sts/experiments/LCG.
  9. Created ./sts/experiments/MODEXP.
  10. Created ./sts/experiments/MS.
  11. Created ./sts/experiments/QCG1.
  12. Created ./sts/experiments/QCG2.
  13. Created ./sts/experiments/XOR.
  14. Creating the subdirectories via ./sts/experiments/create-dir-script.
  15. create-dir-script succeeded.
  16. Directories are set up. Change directory to ./sts and say 'make'!
  17. $ cd ./sts
  18. $ make

The build procedure creates an executable, sts/assess, which is usually run interactively to do a test. See Section 5.6 of [1].

Sample run for known input

Here’s a transcript of a full run.

  1. $ ./assess 100000
  2. G E N E R A T O R S E L E C T I O N
  3. ______________________________________
  4. [0] Input File [1] Linear Congruential
  5. [2] Quadratic Congruential I [3] Quadratic Congruential II
  6. [4] Cubic Congruential [5] XOR
  7. [6] Modular Exponentiation [7] Blum-Blum-Shub
  8. [8] Micali-Schnorr [9] G Using SHA-1
  9. Enter Choice: 0
  10. User Prescribed Input File: data/data.pi
  11. S T A T I S T I C A L T E S T S
  12. _________________________________
  13. [01] Frequency [02] Block Frequency
  14. [03] Cumulative Sums [04] Runs
  15. [05] Longest Run of Ones [06] Rank
  16. [07] Discrete Fourier Transform [08] Nonperiodic Template Matchings
  17. [09] Overlapping Template Matchings [10] Universal Statistical
  18. [11] Approximate Entropy [12] Random Excursions
  19. [13] Random Excursions Variant [14] Serial
  20. [15] Linear Complexity
  21. INSTRUCTIONS
  22. Enter 0 if you DO NOT want to apply all of the
  23. statistical tests to each sequence and 1 if you DO.
  24. Enter Choice: 1
  25. P a r a m e t e r A d j u s t m e n t s
  26. -----------------------------------------
  27. [1] Block Frequency Test - block length(M): 128
  28. [2] NonOverlapping Template Test - block length(m): 9
  29. [3] Overlapping Template Test - block length(m): 9
  30. [4] Approximate Entropy Test - block length(m): 10
  31. [5] Serial Test - block length(m): 16
  32. [6] Linear Complexity Test - block length(M): 500
  33. Select Test (0 to continue): 0
  34. How many bitstreams? 10
  35. Input File Format:
  36. [0] ASCII - A sequence of ASCII 0's and 1's
  37. [1] Binary - Each byte in data file contains 8 bits of data
  38. Select input mode: 0
  39. Statistical Testing In Progress.........
  40. Statistical Testing Complete!!!!!!!!!!!!
  41. $

The test results for this test show up in sts/experiments/AlgorithmTesting/finalAnalysisReport.txt. This test is checking the bits from a transcendental number (pi), and so all the tests pass.

If you run a different test, things show up in different places! See Result Locations, below.

Sample test results

Here are the results generated by the sample run.

  1. $ cat experiments/AlgorithmTesting/finalAnalysisReport.txt
  2. ------------------------------------------------------------------------------
  3. RESULTS FOR THE UNIFORMITY OF P-VALUES AND THE PROPORTION OF PASSING SEQUENCES
  4. ------------------------------------------------------------------------------
  5. generator is <data/data.pi>
  6. ------------------------------------------------------------------------------
  7. C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 P-VALUE PROPORTION STATISTICAL TEST
  8. ------------------------------------------------------------------------------
  9. 1 1 3 0 0 2 1 0 1 1 0.534146 10/10 Frequency
  10. 1 2 1 0 2 2 1 0 1 0 0.739918 10/10 BlockFrequency
  11. 1 1 1 2 1 0 0 2 1 1 0.911413 10/10 CumulativeSums
  12. 1 2 0 1 1 1 1 2 1 0 0.911413 10/10 CumulativeSums
  13. 0 4 1 1 0 2 0 1 0 1 0.122325 10/10 Runs
  14. 0 1 0 4 1 0 1 1 1 1 0.213309 10/10 LongestRun
  15. 1 1 0 1 1 1 2 1 0 2 0.911413 10/10 Rank
  16. 2 1 0 0 2 1 1 1 2 0 0.739918 10/10 FFT
  17. 1 2 1 0 2 2 1 1 0 0 0.739918 10/10 NonOverlappingTemplate
  18. 1 3 0 0 3 3 0 0 0 0 0.035174 10/10 NonOverlappingTemplate
  19. ... many output lines omitted for clarity ...
  20. 0 1 1 2 1 1 1 2 0 1 0.911413 10/10 NonOverlappingTemplate
  21. 4 0 1 2 0 1 0 0 0 2 0.066882 9/10 OverlappingTemplate
  22. 10 0 0 0 0 0 0 0 0 0 0.000000 * 0/10 * Universal
  23. 0 1 1 2 0 1 3 1 1 0 0.534146 10/10 ApproximateEntropy
  24. 0 2 0 0 0 0 0 0 0 0 ---- 2/2 RandomExcursions
  25. 0 0 0 0 1 0 1 0 0 0 ---- 2/2 RandomExcursions
  26. 0 0 0 0 1 0 0 1 0 0 ---- 2/2 RandomExcursions
  27. 0 0 0 0 0 0 1 1 0 0 ---- 2/2 RandomExcursions
  28. 0 0 1 0 0 0 0 1 0 0 ---- 2/2 RandomExcursions
  29. 0 0 1 0 0 0 0 0 0 1 ---- 2/2 RandomExcursions
  30. 0 0 0 1 0 0 1 0 0 0 ---- 2/2 RandomExcursions
  31. 0 1 0 0 1 0 0 0 0 0 ---- 2/2 RandomExcursions
  32. 0 0 0 1 0 0 0 1 0 0 ---- 2/2 RandomExcursionsVariant
  33. 0 0 0 1 0 0 0 1 0 0 ---- 2/2 RandomExcursionsVariant
  34. 0 0 0 1 0 0 0 0 0 1 ---- 2/2 RandomExcursionsVariant
  35. 0 0 0 1 0 1 0 0 0 0 ---- 2/2 RandomExcursionsVariant
  36. 0 0 0 0 1 1 0 0 0 0 ---- 2/2 RandomExcursionsVariant
  37. 0 0 0 0 2 0 0 0 0 0 ---- 2/2 RandomExcursionsVariant
  38. 0 0 0 1 1 0 0 0 0 0 ---- 2/2 RandomExcursionsVariant
  39. 0 0 0 0 0 1 1 0 0 0 ---- 2/2 RandomExcursionsVariant
  40. 0 0 0 0 0 2 0 0 0 0 ---- 2/2 RandomExcursionsVariant
  41. 0 1 0 0 0 0 0 0 0 1 ---- 2/2 RandomExcursionsVariant
  42. 0 1 0 0 0 0 0 0 1 0 ---- 2/2 RandomExcursionsVariant
  43. 0 1 0 1 0 0 0 0 0 0 ---- 2/2 RandomExcursionsVariant
  44. 0 1 1 0 0 0 0 0 0 0 ---- 2/2 RandomExcursionsVariant
  45. 0 1 0 1 0 0 0 0 0 0 ---- 2/2 RandomExcursionsVariant
  46. 0 0 2 0 0 0 0 0 0 0 ---- 2/2 RandomExcursionsVariant
  47. 0 0 1 1 0 0 0 0 0 0 ---- 2/2 RandomExcursionsVariant
  48. 0 0 1 0 1 0 0 0 0 0 ---- 2/2 RandomExcursionsVariant
  49. 0 1 0 0 0 0 1 0 0 0 ---- 2/2 RandomExcursionsVariant
  50. 3 0 2 1 0 0 1 0 1 2 0.350485 9/10 Serial
  51. 2 2 1 1 0 2 0 0 0 2 0.534146 9/10 Serial
  52. 2 2 1 0 0 1 1 0 2 1 0.739918 10/10 LinearComplexity
  53. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  54. The minimum pass rate for each statistical test with the exception of the
  55. random excursion (variant) test is approximately = 8 for a
  56. sample size = 10 binary sequences.
  57. The minimum pass rate for the random excursion (variant) test
  58. is approximately = 1 for a sample size = 2 binary sequences.
  59. For further guidelines construct a probability table using the MAPLE program
  60. provided in the addendum section of the documentation.
  61. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Notes on using the package

Test input scripts

If you are careful, you can supply answers in a text file.

The top-level directory test is intended for input scripts for test purposes. Currently defined:

  • test/data-pi-all-10-streams.txt contains the inputs used in Sample run for known input, above, and so can be used to quickly reproduce the results above. Use:

    1. ./assess 100000 <../test/data-pi-all-10-streams.txt

Result locations

The results show up in different subdirectories depending on the algorithm chosen in the first menu. The correspondence is:

Choice Menu Description Output Location Output File

0 | Input File | Algorithm-Testing | experiments/AlgorithmTesting/finalAnalysisReport.txt
1 | Linear Congruential | LCG | experiments/LCG/finalAnalysisReport.txt
2 | Quadratic Congruential I | QCG1 | experiments/QCG1/finalAnalysisReport.txt
3 | Quadratic Congruential II | QCG2 | experiments/QCG2/finalAnalysisReport.txt
4 | Cubic Congruential | CCG | experiments/CCG/finalAnalysisReport.txt
5 | XOR | XOR | experiments/XOR/finalAnalysisReport.txt
6 | Modular Exponentiation | MODEXP | experiments/MODEXP/finalAnalysisReport.txt
7 | Blum-Blum-Shub | BBS | experiments/BBS/finalAnalysisReport.txt
8 | Micali-Schnorr | MS | experiments/MS/finalAnalysisReport.txt
9 | G Using SHA-1 | G-SHA1 | experiments/G-SHA1/finalAnalysisReport.txt

Bitstreams, the parameter to assess, and data files

assess tests a random number generator by looking at several “bit streams” from that random number generator. The parameter to the assess command is the length of each tested bitstream.

Each tested bit stream must pass a variety of tests. Generally speaking, assess must test many bit streams before coming to a judgment. The practical minimum is 10 bit streams. The number of bit streams comes from your answer to this question:

  1. How many bitstreams?

Data files are assumed to contain sequences of random bits, i.e., coin flips. “ASCII” files contains series of 0 and 1 characters, one per bit. (Whitespace is ignored.) “Binary” files contains series of binary bytes, with 8 bits per byte. (All bytes are taken as data.)

When providing data in a file, you must provide enough data for the entire test, i.e. your file must contain nBitStreams * lengthBitStream bits. lengthBitStream comes from the command line; nBitStreams comes from interactive input. If you supply extra bits in your data file, that’s OK; extra bits are ignored. But if you don’t supply enough bits in your data file, assess will exit with a failure:

  1. Statistical Testing In Progress.........
  2. ERROR: Insufficient data in file data/data.pi. 4882 bits were read.
  3. free(): double free detected in tcache 2
  4. Aborted (core dumped)

For example, data.pi contains 1,004,882 bits of data (calculated using tr -dc 01 <data/data.pi | wc). This means you can test 10 streams of 100,000 bits (ignoring the last 4,882 bits), or 100 streams of 10,000 bits, and so on.

Random Number Generators and floating-point data

If you are testing a random-number generator that returns floats, you have to figure out how to turn that into a sequence of random bits. Almost always, you’re using the floats to map into some other continuous or discrete distribution, and that maps some number of random bits from the float into some other number of random bits in your app. There are subtle questions about how best to do that.

If you’re familiar with floating-point representation, you can extract the bit sequence from a binary representation of your float, but you have to be careful to extract the right number of bits, and you must take care to avoid round-off error.

If you’re not very familiar with floating-point representation, it’s best to find the API for your generator that will return uniformly distributed integers. Generate integers that are distributed uniformly over a power of two (e.g., in the range 0 to 255), and convert each integer to binary as an ASCII string. Packing into bytes is not necessary, and is tricky if your generator is not generating numbers that are distributed uniformly over the range from 0 to 255.

Anomalies

I’ve noted the following:

  1. The manual says that the data files were generated with a million bits. However, an “assess” test asking for a million bits will run out of data. In fact, there are only enough bits to support an assess value of 100,000.

  2. If you try to run an analysis on more bits than are present, you get a core dump.

    1. $ assess 1000000 <../test/data-pi-all-10-streams.txt
    2. {output omitted}
    3. Statistical Testing In Progress.........
    4. ERROR: Insufficient data in file data/data.pi. 4882 bits were read.
    5. *** Error in `./assess': double free or corruption (!prev): 0x00000000017e1b10 ***
    6. ======= Backtrace: =========
    7. {output omitted}
    8. Aborted (core dumped)
    9. $
  3. Several of the generators seem to generate igamc: UNDERFLOW errors over and over during the test. These are all generators that fail the tests, so I assume it’s due to insufficient randomness. The message comes from function cephes_igamc() in file cephes.c; this is used many places, and I’ve not tracked down which test is causing this.