[ Home ]
CRC RevEng: arbitrary-precision CRC calculator and algorithm finder
Copyright (C) 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018,
2019, 2020, 2021, 2022, 2024 Gregory Cook
This file is part of CRC RevEng.
CRC RevEng is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
CRC RevEng is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with CRC RevEng. If not, see <https://www.gnu.org/licenses/>.
CRC RevEng incorporates a public domain implementation of getopt
written by Keith Bostic and Rich Salz, with extensions added by the
author. This is found in the contrib/
directory as file getopt.c
. A
compatible getopt
module is available as standard in many environments
and will work just as well, but this copy is included for those ANSI C
environments that lack one.
A copy of the RISC OS Shared C Library (CLib), and a patch to make
certain versions of RISC OS compatible with this version of CLib, are
supplied with the RISC OS binary under licence from Castle Technology
Limited for distribution to end users for the purpose of upgrading, if
required. These are found in the bin/riscos/
directory. The ARM Tube
OS binary, located in the bin/armtubeos/
directory, is statically linked
with (that is, it incorporates) the RISC OS Shared C Library.
CLib qualifies as a "System Library" and GPLv3 permits licensees to
"combine GPLed software with GPL-incompatible System Libraries, [...]
and distribute them both together." See Sections 1, 5 and 6 and A Quick
Guide to GPLv3, <https://www.gnu.org/licenses/quick-guide-gplv3.html>
The patch is an independent program and is not bound by GPLv3 by mere
inclusion. See Section 5.
CRC RevEng is an arbitrary-precision, machine word length-independent,
byte order-independent CRC calculator and algorithm finder in ANSI C.
It is a port and expansion of the author's crcbfs.pl
script from 2007,
and runs up to 200 times faster on equivalent problems. It is also a
reference implementation of the author's "Catalogue of parametrised CRC
algorithms", with the 113 currently listed models available as presets.
To the author's knowledge CRC RevEng is the first published compiled
application to address the general case of CRC algorithm reversal and
reverse engineering, its predecessor crcbfs.pl
being the first published
application of any type to do so. Greg Ewing of Canterbury University
in New Zealand solved a CRC algorithm manually on similar principles in
2010, but partly by feeding chosen plaintexts into an implementation at
hand.
CRC RevEng is hosted by SourceForge, from where the latest version can be downloaded and documentation browsed:
<https://reveng.sourceforge.io/>
Compiling CRC RevEng is straightforward: in the i386 GNU/Linux, MinGW and Raspbian environments, simply cd to the directory containing the source files, and enter
make
A special makefile can also be used in MinGW to make a Windows executable with version information and an icon:
make -f Mk.Win32
In non-ASCII compatible environments, the array aliases[]
in file
preset.c
may need to be reordered to suit the local string collation
order. The correct order of models[]
and aliases[]
can be verified at
compile time by the error-free completion of this command:
make clean pretst
Some enterprise users may wish to disable the -F switch to minimise CPU
usage. To do this, define the macro ALWPCK
in config.h
or on the
command line.
* * *
In RISC OS, with the Acorn Desktop Development Environment (DDE)
installed, click Menu on the file RISCOSify
, and set its type to Obey.
Double-click Select first on RISCOSify
, then on Mk/RISCOS
.
Mk/RISCOS
is compatible with both !Make
and !AMU
.
Alternatively, install GCC4 (and optionally Bash) from the cloud using
the package manager !PackMan
. Run RISCOSify
as above, then press F12 or
Ctrl+F12 to obtain a command prompt. Navigate to the directory
containing CRC RevEng by entering the appropriate Dir command. With
Bash installed, enter the following:
Run <Packages$Apps>.Development.!GCC Run <Packages$Apps>.Utilities.!Bash WimpSlot -min 16000K make -f Mk.ROSGCC
Without Bash, you must validate the bitmap size definitions manually. Enter:
Run <Packages$Apps>.Development.!GCC WimpSlot -min 16000K make -f Mk.ROSGCC bmptst Run bmptst Show Sys$ReturnCode
Ensure that bmptst
produces no messages and that the value of
Sys$ReturnCode
is 0. If all is well, continue with:
make -f Mk.ROSGCC
* * *
Otherwise, enter commands similar to the following to compile CRC RevEng on any ANSI C compliant compiler:
gcc -O3 -Wall -ansi -c bmpbit.c cli.c model.c poly.c preset.c \ reveng.c cd contrib gcc -O3 -Wall -ansi -c getopt.c cd .. gcc -o reveng bmpbit.o cli.o model.o poly.o preset.o reveng.o \ contrib/getopt.o
The platform-independent method does not compile the preset models. To
compile them, you will need to edit the configuration options in
config.h
to suit your architecture. Having done so, define the macro
PRESETS
in config.h
and recompile as above.
Usage: reveng -cdDesvhu? [-1bBfFGlLMrStVXyz] [-a BITS] [-A OBITS] [-i INIT] [-k KPOLY] [-m MODEL] [-p POLY] [-p POLY] [-P RPOLY] [-q QPOLY] [-w WIDTH] [-x XOROUT] [STRING...] Options: -a BITS bits per character (1 to n) -A OBITS bits per output character (1 to n) -i INIT initial register value -k KPOLY generator in Koopman notation (implies WIDTH) -m MODEL preset CRC algorithm -p POLY generator or search range start polynomial -P RPOLY reversed generator polynomial (implies WIDTH) -q QPOLY search range end polynomial -w WIDTH register size, in bits -x XOROUT final register XOR value Modifier switches: -1 skip equivalent forms -b big-endian CRC -B big-endian CRC output -f read files named in STRINGs -F skip preset model check pass -G skip brute force search pass -l little-endian CRC -L little-endian CRC output -M non-augmenting algorithm -r right-justified output -S print spaces between chars -t left-justified output -V reverse algorithm only -X print uppercase hexadecimal -y low bytes first in files -z raw binary STRINGs Mode switches: -c calculate CRCs -d dump algorithm parameters -D list preset algorithms -e echo (and reformat) input -s search for algorithm -v calculate reversed CRCs -h | -u | -? show this help
You can use one of the preset models or specify your own.
reveng -m crc-32/iso-hdlc
selects the CRC-32 algorithm used in PKZIP and elsewhere. You can dump any preset model as an extended Williams model record using -d:
reveng -m crc-32/iso-hdlc -d width=32 poly=0x04c11db7 init=0xffffffff refin=true refout=true xorout=0xffffffff check=0xcbf43926 residue=0xdebb20e3 name="CRC-32/ISO-HDLC"
You can specify the parameters of the model instead:
reveng -w 32 -p 04c11db7 -i ffffffff -l -x ffffffff
This is equivalent to
reveng -m crc-32/iso-hdlc
except the model will have no name when dumped. (The -l switch sets both RefIn = True and RefOut = True. To set RefOut separately, use switches -L and -B.)
The options and switches for specifying a model are:
0x18005
is specified
as C002. This automatically provides the Width value.0x18005
is specified as 8005.0x18005
is specified as A001. This automatically
provides the Width value.Other model-related options:
Messages for CRC RevEng to process can be specified as files, as raw binary strings, or as numerical (typically hexadecimal) string arguments on the command line. Output from CRC RevEng is either as extended Williams model records (having their own fixed format) or as numerical string arguments printed one per line on standard output.
When passing numerical arguments on the command line, each argument is conceptually divided into characters, each character consisting of one or more hexadecimal digits. For each character, enough hex digits to specify it are read then a number of bits (specified by -a BITS) are taken from the least significant end, reversed (if RefIn = True) and appended to the binary representation of the argument; any excess bits are discarded. The -a BITS option (q.v.) permits a number of useful representations of a given underlying binary sequence.
When passing messages as files or as raw binary strings, the same division into characters applies; bytes of the message are read until enough bits have been collected, then the number of bits specified by -a BITS are taken from the specified least significant end of the collection (see -y), reflected (if RefIn = True) and added to the binary representation.
When printing CRCs, the binary representation is again divided into characters (of size specified by -A OBITS), each of which is reversed (if RefOut = True) and printed with the minimum sufficient number of hex digits. The alignment of the character divisions is controlled by -r and -t.
Output model records conform to the Williams model set out in "A
Painless Guide to CRC Error Detection Algorithms". To recap, the model
consists of a linear feedback shift register (LFSR) having the number
of bit cells defined in the width
parameter, and shifting from right to
left. The init
parameter defines the settings of the bit cells at the
start of each calculation, before reading the first message bit. The
refin
parameter, if equal to false
, specifies that the characters of the
message (whose size is specified by -a BITS) are read bit-by-bit, most
significant bit (MSB) first; if equal to true
, the characters are read
bit-by-bit, least significant bit (LSB) first. Each sampled message bit
is then XORed with the bit being simultaneously shifted out of the
register at the most significant end, and the result is passed to the
feedback taps. The poly
parameter specifies the feedback taps of the
register – it is the result of initialising the register to all zeroes,
then reading a single one bit.
The refout
parameter, if equal to false
, specifies that the contents of
the register after reading the last message bit are unreflected before
presentation; if equal to true
, it specifies that they are reflected,
character-by-character, before presentation. For the purpose of the
definitions below, the reflection is performed before the division into
characters, by swapping the content of each cell with that of the cell
an equal distance from the opposite end of the register; the characters
(size defined by -A OBITS) of CRCs output by -c and -v are then true
images of parts of the reflected register, the character containing the
original MSB always appearing first. The xorout
parameter defines the
XOR value applied to the contents of the register after the last message
bit has been read and after the optional reflection. The check
parameter defines the contents of the register after initialising,
reading the UTF-8 string "123456789"
(as 8-bit characters), optionally
reflecting, and applying the final XOR. The residue
parameter defines
the contents of the register after initialising, reading an error-free
codeword and optionally reflecting the register (if refout=true
), but
not applying the final XOR. This is mathematically equivalent to
initialising the register with the xorout
parameter, reflecting it as
described (if refout=true
), reading as many zero bits as there are cells
in the register, and reflecting the result (if refin=true
). The residue
of a crossed-endian model is calculated assuming that the characters of
the received CRC are specially reflected before submitting the codeword.
width
is printed as a decimal integer. poly
, init
, xorout
, check
and
residue
are true images of the shift register at various times, printed
as hexadecimal integers. refin
and refout
are Boolean values, displayed
as either true
or false
. The name
parameter is the name assigned to
a preset model, enclosed in double quotes, otherwise (none)
.
There are a few more options for controlling the presentation of input and output:
When a model has been specified, use -c or -v to calculate CRCs for input messages.
If -V and -v are given together, their respective model reversals cancel
out. CRC RevEng then calculates an ordinary CRC for each argument,
processing the characters from right to left and likewise emitting the
CRC characters in reverse order.
Correspondingly, to obtain the same effect as -v using a model reversed
by -V, the user must present the characters of his or her message, and
process those of the returned CRC, in reverse order.
Take care when the CRC width (-w) is not a multiple of the character width (-a). If the result of a calculation is not what you expect, try selecting left-justification (with -t) or right-justification (with -r).
The -c mode is, of course, useful for creating a checksum to append to a message so that the combination will pass a particular CRC check. The -v mode, on the other hand, is useful for editing a message so as to force its checksum to a desired or at least predetermined value. In order to do this there must be some part of the message's data that can be modified freely without observable effect; many network protocols and file formats, including executables, images and word processor documents, have (or can be altered to have) reserved fields or comment sections that cannot be easily viewed, and whose contents are entirely ignored.
Among the simplest ways to control a CRC calculation is to find one such unused space that is both contiguous and large enough to hold a checksum. For example, suppose we have an existing message with an SDLC CRC:
0: 44 6F 67 73 2F 2A 12 34 2A 2F 72 6F 63 6B 4E 47 | Dogs/*.4*/rockNG
Here, 4E 47
is the SDLC checksum, and we wish to alter the message
without either changing the checksum or failing the CRC. We notice that
the 7th and 8th bytes can be replaced at will, and these can contain a
calculated value to force the CRC. Firstly we change the text as we
wish:
0: 43 61 74 73 2F 2A 12 34 2A 2F 72 75 6C 65 4E 47 | Cats/*.4*/ruleNG
Calculate the CRC of the part on the left of the unused space with XorOut = 0:
reveng -m crc-16/ibm-sdlc -x 0 -c 436174732f2a 9dc5
Then reverse-calculate the CRC of the part on the right, including the old CRC, with Init = 0:
reveng -m crc-16/ibm-sdlc -i 0 -v 2a2f72756c654e47 1505
Now exclusive-OR the two returned CRCs together, and insert the result in the unused space. CRC RevEng can be used to do the exclusive-OR if a hex calculator is not at hand:
reveng -w 16 -p 0001 -c 9dc51505 88c0
Our edited message now looks like this:
0: 43 61 74 73 2F 2A 88 C0 2A 2F 72 75 6C 65 4E 47 | Cats/*.A*/ruleNG
To confirm that it still passes the SDLC CRC:
reveng -m crc-16/ibm-sdlc -c 436174732f2a88c02a2f72756c65 4e47
For more flexible editing options, see Mark Adler's source file spoof.c
(2014). "spoof.c
takes an abbreviated description of the CRC, the
exclusive-or of the current CRC of the message and the desired CRC, the
length of the message, and a list of bit locations in a message, and
tells you which of those bits should be inverted in the message to get
the desired CRC. Note that it does not need the message itself, due to
the linearity property of CRCs."
* * *
In Stigge et al. (section 4.1) a polynomial q(x) is calculated as the
multiplicative inverse of xN such that (xN) q(x) = 1 (mod pCRC(x)).
The authors promote the extended Euclidean algorithm as a means of
calculating q(x), however any CRC calculator can also produce it. The
reciprocal of the CRC-32 polynomial is 0xdb710641
, as output by:
reveng -w 32 -p 04c11db7 -V -d
The authors' constant CRCINV
, a reflected representation of q(x), is the
CRC of the reversal of the desired remainder, 0x00000001
:
reveng -w 32 -p db710641 -c 80000000 5b358fd3
Equivalently, the -v function returns q(x) in direct order from the unreflected parameters:
reveng -w 32 -p 04c11db7 -v 00000001 cbf1acda
The most important feature of CRC RevEng is the ability to recover the parameters of a CRC algorithm from a handful of codewords created by that algorithm. In general at least four data points are needed, either as known parameters or as message-CRC pairs. Extra data points help to eliminate false results and to confirm models that are found.
Known parameters are specified using -w WIDTH, -p POLY, -i INIT and -x XOROUT (see SPECIFYING A MODEL above). The width, -w, is a required parameter for all searches and counts as one of the data points. The size of characters (words) in the protocol must also be known and set with -a if this is not 8 bits.
The search function is selected with -s, and message-CRC pairs are given as arguments, each message and CRC combined into one argument. There must not be any non-participating characters between each message and its CRC, or the search will not work. Typically, end-of-message markers do not participate in the CRC.
As non-standard algorithms are comparatively rare, the program first tries all the preset models of the given width, reporting and exiting if one is found. Otherwise it commences a full search. As it proceeds it prints a progress message from time to time, allowing the search to be restarted from that point:
reveng: searching: width=32 poly=0x50000001 refin=false refout=false
If -b or -l are specified, CRC RevEng only searches for algorithms with that bit ordering. Otherwise, it tries RefIn = False, RefOut = False then RefIn = True, RefOut = True. Crossed-endian algorithms are also uncommon and the program will not search for them.
The search for the Poly value proceeds much faster if the most compact difference between any two right-aligned arguments is limited to a few bytes in length. The search space is then determined by the length of the difference, less the length of the generator polynomial. When the CRC algorithm is wide, try for instance to obtain a pair of codewords where the message portion differs only in the last byte or two. Even if the difference at the end of the message portion is as long as the checksum itself, the number of trial divisions needed is still cut in half. CRC RevEng takes the greatest common divisor of all differences between arguments, which is no longer than the shortest difference, and may reduce the search space further.
To find the Poly value when Init is not known, at least two of the given arguments must have the same length.
To find both the Init and XorOut values, at least two of the given arguments must have different lengths; otherwise there is only enough information to determine one value, given the other. If all arguments have the same length then, by default, CRC RevEng fixes XorOut at zero and calculates Init accordingly. (In hardware it is easier to set a non-zero Init than to apply a non-zero XorOut.) To set XorOut to another value, specify -x XOROUT; to fix Init and calculate XorOut instead, use -i INIT.
When bit strings are input with -a 1, there is no information on
endianness. In such cases -s returns the big- and little-endian forms
of each algorithm found. The Check values of these forms will differ,
as they are always calculated on the 8-bit UTF-8 string "123456789"
.
The Residue values will be each other's mirror images in binary.
Each CRC algorithm whose generator polynomial is a multiple of (x + 1) has one or more equivalent forms. Equivalent forms differ in Init, XorOut and Residue values, but they produce identical CRCs at all message lengths. Therefore unless -1, -i INIT or -x XOROUT are specified, any such algorithm and its equivalent forms are always returned together, no matter how many codewords are given. (If (x + 1)n appears in the factorisation of G(x), then there shall be 2n − 1 equivalent forms.)
To restart a stopped search, or to divide a search between several processors, CRC RevEng can be instructed to search within a specified range of generator polynomial values.
The full search space comprises all 'odd' polynomials of the specified WIDTH, that is, polynomials of the form xn + ... + 1. Treating the concatenated coefficients as a binary integer, the range can be up to (but excluding) a specified polynomial, from a specified polynomial upwards, or from one polynomial up to (but excluding) another.
Where a compact difference between arguments has been found, the polynomial that is searched for is not the generator itself, but its shorter cofactor whose value is determined by the difference between the message portions of the arguments. Candidate generator polynomials are then obtained by dividing the difference by the cofactor, and taking the quotient when the remainder is zero. In such cases it is this cofactor whose width and value is displayed in the progress messages; the width is displayed for information purposes, but should not be entered at the command line when restarting a search. Only the width of the CRC algorithm itself should be entered, along with the polynomial value printed in the most recent progress message.
Polynomial range searching is enabled using [
-p POLY]
-q QPOLY, where
POLY and QPOLY are hex strings. -p POLY, if given, must precede
-q QPOLY. To start searching at a polynomial, use -p POLY -q 0. To
stop searching at a polynomial (exclusive), use -q QPOLY. To search
between two polynomial values, use -p POLY -q QPOLY.
Range limiting does not apply to the initial check against the preset models, or to Init or XorOut values, which are computed using Ewing's fast, efficient algorithm.
For example, to split a 32-bit search into four processes:
reveng -w 32 -q 40000000 -F -s \ 123456789aaf946042 edcb8434325a439fbd 2468ace03238f64e reveng -w 32 -p 40000000 -q 80000000 -F -s \ 123456789aaf946042 edcb8434325a439fbd 2468ace03238f64e reveng -w 32 -p 80000000 -q c0000000 -F -s \ 123456789aaf946042 edcb8434325a439fbd 2468ace03238f64e reveng -w 32 -p c0000000 -q 0 -F -s \ 123456789aaf946042 edcb8434325a439fbd 2468ace03238f64e
To continue an interrupted search:
NB: If an either-endian search is stopped while RefIn/RefOut = False
then it takes two further command lines to complete the search: one
big-endian range search, and one little-endian full search.
reveng -w 32 -p 50000001 -q 0 -b -F -s \ 123456789aaf946042 edcb8434325a439fbd 2468ace03238f64e reveng -w 32 -l -F -s \ 123456789aaf946042 edcb8434325a439fbd 2468ace03238f64e
The full list of search options is as follows:
CRC RevEng provides a few additional options for convenience:
reveng -w 16 -l -F -s 31816b 32c16a 31326a0a reveng -w 32 -l -s c98964f6b9 a5fa49f2fd 13370aee7df0 reveng -w 32 -l -F -s \ 123456789aaf946042 edcb8434325a439fbd 2468ace03238f64e
A comprehensive list is being compiled.
In addition to the disclaimers listed at the top of this file and in the GNU General Public License (see file COPYING), remember that CRC RevEng is a search tool. The return of a CRC model as a search result is a statement that all the given codeword samples satisfy that model, not a proof that the model, or any CRC algorithm at all, produced those samples. Any particular search result may be a false positive, especially from a small number of samples. Also any output is only as accurate as the input.
Model reversal (-V, -v) makes little sense on crossed-endian models.
CRC RevEng lacks a facility to generate code to implement specified
algorithms. Pycrc by Thomas Pircher (2014) is one suitable code
generator; Mark Adler's crcany
source package (2014) also produces
source code when compiled and run, and references the author's CRC
catalogue.
Adler, Mark (15 January 2017). "zlib Home Site" (section "CRC (Cyclic
Redundancy Check) Bonus Information"). Contains links to spoof.c.gz
and
crcany.tar.gz
.
Bies, Lammert; et al. "Computer Interfacing Forum" (section "Error detection and correction").
Cook, Greg (7 August 2024). "Catalogue of parametrised CRC algorithms".
"Dram" (25 December 2021). "Magic tricks with CRC".
Ewing, Gregory C. (March 2010). "Reverse-Engineering a CRC Algorithm". Christchurch: University of Canterbury.
Koopman, Philip (July 2002). "32-Bit Cyclic Redundancy Codes for Internet Applications". The International Conference on Dependable Systems and Networks: 459–468. doi:10.1109/DSN.2002.1028931.
Koopman, Philip (23 January 2017). "Best CRC Polynomials". Pittsburgh: Carnegie Mellon University.
Koopman, Philip; Chakravarty, Tridib (June 2004). "Cyclic Redundancy Code (CRC) Polynomial Selection For Embedded Networks". The International Conference on Dependable Systems and Networks: 145–154. doi:10.1109/DSN.2004.1311885.
Pircher, Thomas (12 December 2016). pycrc. Python based parametrised CRC calculator and C code generator.
Stigge, Martin; Plötz, Henryk; Müller, Wolf; Redlich, Jens-Peter (May 2006). "Reversing CRC – Theory and Practice". Berlin: Humboldt University Berlin.
Williams, Ross N. (24 September 1996). "A Painless Guide to CRC Error Detection Algorithms V3.00".
CRC RevEng came about from the coincidence of four events:
crcbfs.pl
:
The author would like to thank Dr. Mark Adler, Lammert Bies, Wolfgang Ehrhardt, Greg Ewing, Prof. Philip Koopman, Thomas Pircher, Dr. Ross Williams, and all contributors to CRC RevEng and the CRC Catalogue.
Greg Cook
<>
<http://regregex.bbcmicro.net/>
–END–
[ Top of page ]