JPEG File Layout and Format
The File Layout
If a 0xff byte occurs in the compressed image data, it is followed by either a zero byte (0x00) or a marker identifier. Normally the only marker found once the image data has started is EOI (plus RSTn markers when a restart interval has been defined). When a 0xff byte is followed by a zero byte (0x00), the zero byte must be discarded and the 0xff kept as data.
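The discard rule above can be sketched as follows. This is a minimal illustration, not a reference decoder: the function name `unstuff` and its return shape are hypothetical, and the sketch simply stops at the first 0xff that is not followed by 0x00, leaving the marker for the caller to handle.

```python
def unstuff(data: bytes) -> tuple[bytes, int]:
    """Strip stuffed zero bytes from compressed image data.

    Returns the cleaned data and the offset of the first marker found
    (or len(data) if the input ends without one).
    """
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte != 0xFF:
            out.append(byte)
            i += 1
        elif i + 1 < len(data) and data[i + 1] == 0x00:
            out.append(0xFF)  # keep the 0xff, discard the zero byte
            i += 2
        else:
            break  # 0xff followed by a marker identifier (or end of input)
    return bytes(out), i
```

For example, `unstuff(b"\x12\xff\x00\x34\xff\xd9")` yields the three data bytes 0x12, 0xff, 0x34 and the offset 4, where the EOI marker starts.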
A JPEG file consists of the following parts:
JPEG File Format
Header:
· It occupies two bytes.
· 0xff, 0xd8 (SOI: Start Of Image); these two bytes identify a JPEG/JFIF file.
Segments or markers:
· Following the SOI marker, there can be any number of “segments” or “markers” such as APP0, DQT, DHT, SOF, SOS and so on.
· An APP0 segment immediately follows the SOI marker.
Trailer:
· It occupies two bytes.
· 0xff, 0xd9 (EOI: End Of Image); these two bytes mark the end of the image.
Format of each segment:
Header (4 bytes):
· 0xff (1 byte): identifies a segment.
· n (1 byte): type of segment.
· sh, sl (2 bytes): size of the segment, including these two bytes but not the 0xff and the type byte. Note: this is not Intel order; the high byte comes first, the low byte last.
Contents of the segment: max. 65533 bytes.
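As a rough sketch of how this header layout can be walked, the generator below yields each segment's type byte and contents up to and including SOS. The name `iter_segments` and the `STANDALONE` set are hypothetical; the set lists the markers that, per the JPEG standard, carry no size field (TEM, RST0..RST7, SOI, EOI).

```python
import struct

# Assumption: markers with no size field (TEM, RST0..RST7, SOI, EOI).
STANDALONE = {0x01, 0xD8, 0xD9, *range(0xD0, 0xD8)}

def iter_segments(data: bytes):
    """Yield (type_byte, contents) for each segment, stopping after SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    pos = 2
    while pos + 2 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected 0xff at offset {pos}")
        marker = data[pos + 1]
        if marker in STANDALONE:
            yield marker, b""
            pos += 2
            continue
        if pos + 4 > len(data):
            raise ValueError("truncated segment header")
        # Size is big-endian and counts itself, but not 0xff and the type byte.
        (size,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + size]
        if marker == 0xDA:  # SOS: compressed image data follows
            return
        pos += 2 + size
```

The contents returned for each segment exclude the two size bytes, so a segment with size 4 yields 2 bytes of contents.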
Notes:
Segment types:
SOI 0xd8 Start Of Image
APP0 0xe0 JFIF APP0 segment marker
APP15 0xef ignore
SOF0 0xc0 Start Of Frame (baseline JPEG), for details see below
SOF1 0xc1 Start Of Frame (extended sequential JPEG), same segment layout as SOF0
SOF2 0xc2 usually unsupported
SOF3 0xc3 usually unsupported
SOF5 0xc5 usually unsupported
SOF6 0xc6 usually unsupported
SOF7 0xc7 usually unsupported
SOF9 0xc9 for arithmetic coding, usually unsupported
SOF10 0xca usually unsupported
SOF11 0xcb usually unsupported
SOF13 0xcd usually unsupported
SOF14 0xce usually unsupported
SOF15 0xcf usually unsupported
DHT 0xc4 Define Huffman Table
DQT 0xdb Define Quantization Table
SOS 0xda Start Of Scan
JPG 0xc8 undefined/reserved (causes decoding error)
JPG0 0xf0 ignore (skip)
JPG13 0xfd ignore (skip)
DAC 0xcc Define Arithmetic Table, usually unsupported
DNL 0xdc usually unsupported, ignore
DRI 0xdd Define Restart Interval, for details see below
DHP 0xde ignore (skip)
EXP 0xdf ignore (skip)
*RST0 0xd0 RSTn are used for resync, may be ignored
*RST1 0xd1
*RST2 0xd2
*RST3 0xd3
*RST4 0xd4
*RST5 0xd5
*RST6 0xd6
*RST7 0xd7
*TEM 0x01 usually causes a decoding error, may be ignored
COM 0xfe Comment, may be ignored
EOI 0xd9 End Of Image
All other segment types are reserved and should be ignored (skipped).
SOF0 (Start Of Frame 0) marker:
Field Size Description
Marker Identifier 2 bytes 0xff, 0xc0 to identify SOF0 marker
Length 2 bytes Equals 8 + 3 * (number of components)
Data precision 1 byte This is in bits/sample, usually 8 (12 and 16 not supported by most software).
Image height 2 bytes This must be > 0
Image Width 2 bytes This must be > 0
Number of components 1 byte Usually 1 = greyscale, 3 = colour (YCbCr or YIQ), 4 = colour (CMYK)
Each component 3 bytes Read 3 bytes for each component:
· component ID (1 byte): 1 = Y, 2 = Cb, 3 = Cr, 4 = I, 5 = Q
· sampling factors (1 byte): bits 0-3 vertical, bits 4-7 horizontal
· quantization table number (1 byte)
Remarks: JFIF uses either 1 component (Y, greyscaled) or 3 components (YCbCr, sometimes called YUV, colour).
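The SOF0 fields can be unpacked along these lines. This is a sketch under stated assumptions: `parse_sof0`, `Frame` and `Component` are hypothetical names, and the input is taken to be the segment contents after the 2-byte length field (so its size is 6 + 3 * components).

```python
import struct
from typing import NamedTuple

class Component(NamedTuple):
    ident: int        # 1 = Y, 2 = Cb, 3 = Cr, 4 = I, 5 = Q
    h_sampling: int   # bits 4-7 of the sampling byte
    v_sampling: int   # bits 0-3 of the sampling byte
    qt_number: int

class Frame(NamedTuple):
    precision: int
    height: int
    width: int
    components: list

def parse_sof0(contents: bytes) -> Frame:
    """Parse SOF0 contents (the bytes after the length field)."""
    if len(contents) < 6:
        raise ValueError("SOF0 segment too short")
    precision, height, width, count = struct.unpack(">BHHB", contents[:6])
    if len(contents) != 6 + 3 * count:
        raise ValueError("SOF0 length does not match component count")
    if height == 0 or width == 0:
        raise ValueError("image height and width must be > 0")
    components = []
    for i in range(count):
        ident, sampling, qt = contents[6 + 3 * i:9 + 3 * i]
        components.append(Component(ident, sampling >> 4, sampling & 0x0F, qt))
    return Frame(precision, height, width, components)
```

A sampling byte of 0x22 on the Y component with 0x11 on Cb and Cr corresponds to the common case where chroma is subsampled by two in both directions.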
APP0 (JFIF segment marker) marker:
Field Size Description
Marker Identifier 2 bytes 0xff, 0xe0 to identify APP0 marker
Length 2 bytes It must be >= 16
File Identifier Mark 5 bytes This identifies JFIF. 'JFIF'#0 (0x4a, 0x46, 0x49, 0x46, 0x00)
Major revision number 1 byte Should be 1, otherwise error
Minor revision number 1 byte Should be 0..2, otherwise try to decode anyway
Units for x/y densities 1 byte
· 0 = no units, x/y-density specify the aspect ratio instead
· 1 = x/y-density are dots/inch
· 2 = x/y-density are dots/cm
X-density 2 bytes It should be <> 0
Y-density 2 bytes It should be <> 0
Thumbnail width 1 byte -------
Thumbnail height 1 byte -------
Bytes to be read n bytes Thumbnail data (RGB, 24 bit): n = width*height*3 bytes, read immediately after the thumbnail height field
Remarks:
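A possible way to read the APP0 fields listed above is sketched below. The function name `parse_jfif_app0` and the dictionary keys are hypothetical, and the input is assumed to be the segment contents after the 2-byte length field (at least 14 bytes, matching a length of 16 or more).

```python
import struct

def parse_jfif_app0(contents: bytes) -> dict:
    """Parse JFIF APP0 contents (the bytes after the length field)."""
    if len(contents) < 14 or contents[:5] != b"JFIF\x00":
        raise ValueError("not a JFIF APP0 segment")
    (major, minor, units, x_density, y_density,
     thumb_w, thumb_h) = struct.unpack(">BBBHHBB", contents[5:14])
    if major != 1:
        raise ValueError("unsupported JFIF major revision")
    thumb_size = thumb_w * thumb_h * 3  # RGB, 24 bits per pixel
    thumbnail = contents[14:14 + thumb_size]
    if len(thumbnail) != thumb_size:
        raise ValueError("truncated thumbnail data")
    return {
        "version": (major, minor),
        "units": units,            # 0 = aspect ratio, 1 = dots/inch, 2 = dots/cm
        "x_density": x_density,
        "y_density": y_density,
        "thumbnail_size": (thumb_w, thumb_h),
        "thumbnail": thumbnail,
    }
```

The minor revision is returned without validation, in line with the advice above to try decoding anyway.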
DHT( Define Huffman Table) marker:
Field Size Description
Marker Identifier 2 bytes 0xff, 0xc4 to identify DHT marker
Length 2 bytes Length of the segment (the Huffman table data plus these two bytes)
HT information 1 byte
· bit 0..3: number of HT (0..3, otherwise error)
· bit 4: type of HT, 0 = DC table, 1 = AC table
· bit 5..7: not used, must be 0
Number of Symbols 16 bytes Number of symbols with codes of length 1..16; the sum (n) of these bytes is the total number of codes, which must be <= 256
Symbols n bytes Table containing the symbols in order of increasing code length (n = total number of codes)
Remarks: A single DHT segment may contain multiple HTs, each with its own information byte.
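Since one DHT segment may hold several tables, a reader has to loop until the contents are used up. The sketch below shows one way to do that; `parse_dht` and its tuple layout are hypothetical, and the input is assumed to be the segment contents after the 2-byte length field.

```python
def parse_dht(contents: bytes) -> list:
    """Return a list of (table_type, number, counts, symbols) tuples."""
    tables = []
    pos = 0
    while pos < len(contents):
        info = contents[pos]
        if info & 0xE0:
            raise ValueError("bits 5..7 of HT information must be 0")
        table_type = "AC" if info & 0x10 else "DC"
        number = info & 0x0F
        if number > 3:
            raise ValueError("HT number must be 0..3")
        counts = list(contents[pos + 1:pos + 17])
        if len(counts) != 16:
            raise ValueError("truncated code-length counts")
        total = sum(counts)  # total number of codes in this table
        if total > 256:
            raise ValueError("more than 256 codes")
        symbols = contents[pos + 17:pos + 17 + total]
        if len(symbols) != total:
            raise ValueError("truncated symbol table")
        tables.append((table_type, number, counts, symbols))
        pos += 17 + total
    return tables
```

Each table therefore occupies 17 + n bytes: one information byte, sixteen count bytes, and n symbol bytes.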
DRI (Define Restart Interval) marker:
Field Size Description
Marker Identifier 2 bytes 0xff, 0xdd identifies DRI marker
Length 2 bytes It must be 4
Restart interval 2 bytes This is in units of MCU blocks: every n MCU blocks an RSTn marker can be found. The first marker is RST0, then RST1 and so on, repeating from RST0 after RST7.
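The RSTn numbering can be illustrated with a small helper. Both function names here are hypothetical, and the sketch assumes that an interval of 0 means restart markers are disabled, as in the JPEG standard.

```python
import struct

def parse_dri(contents: bytes) -> int:
    """Return the restart interval from DRI contents (bytes after the length)."""
    if len(contents) != 2:
        raise ValueError("DRI length must be 4")
    (interval,) = struct.unpack(">H", contents)
    return interval

def expected_rst(mcu_index: int, interval: int):
    """Marker type byte expected just before MCU number mcu_index (0-based).

    Returns None when no restart marker is due at that position.
    """
    if interval == 0 or mcu_index == 0 or mcu_index % interval != 0:
        return None
    # First restart is RST0 (0xd0); after RST7 the sequence wraps to RST0.
    return 0xD0 + (mcu_index // interval - 1) % 8
```

With an interval of 4, RST0 is due before MCU 4, RST1 before MCU 8, and RST0 again before MCU 36.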
DQT (Define Quantization Table) marker:
Field Size Description
Marker Identifier 2 bytes 0xff, 0xdb identifies DQT
Length 2 bytes This gives the length of QT.
QT information 1 byte
· bit 0..3: number of QT (0..3, otherwise error)
· bit 4..7: precision of QT, 0 = 8 bit, otherwise 16 bit
Bytes n bytes This gives QT values, n = 64*(precision+1)
Remarks:
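Reading the quantization tables might look like the sketch below. The name `parse_dqt` is hypothetical; the input is assumed to be the segment contents after the 2-byte length field. Two further assumptions: a DQT segment may carry more than one table (as the JPEG standard allows), and only precision values 0 and 1 are accepted so that n = 64*(precision+1) stays meaningful.

```python
import struct

def parse_dqt(contents: bytes) -> dict:
    """Return {table_number: [64 values]} for every table in the segment."""
    tables = {}
    pos = 0
    while pos < len(contents):
        info = contents[pos]
        number = info & 0x0F
        precision = info >> 4
        if number > 3:
            raise ValueError("QT number must be 0..3")
        if precision > 1:
            raise ValueError("unsupported QT precision")
        size = 64 * (precision + 1)  # 64 bytes (8 bit) or 128 bytes (16 bit)
        raw = contents[pos + 1:pos + 1 + size]
        if len(raw) != size:
            raise ValueError("truncated quantization table")
        # 16-bit values are stored high byte first, like segment lengths.
        fmt = ">64H" if precision else ">64B"
        tables[number] = list(struct.unpack(fmt, raw))
        pos += 1 + size
    return tables
```

The values are returned in the order they are stored in the file; no reordering is attempted here.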
DAC (Define Arithmetic Table) marker:
SOS (Start Of Scan) marker:
Field Size Description
Marker Identifier 2 bytes 0xff, 0xda identify SOS marker
Length 2 bytes This must be equal to 6+2*(number of components in scan).
Number of Components in scan 1 byte This must be >= 1 and <=4 (otherwise error), usually 1 or 3
Each component 2 bytes For each component, read 2 bytes:
· component ID (1 byte): 1 = Y, 2 = Cb, 3 = Cr, 4 = I, 5 = Q
· Huffman table to use (1 byte): bit 0..3 = AC table (0..3), bit 4..7 = DC table (0..3)
Ignorable Bytes 3 bytes Skip these 3 bytes (spectral selection start and end, and successive approximation; 0x00, 0x3f, 0x00 in baseline JPEG).
Remarks: The image data (scans) immediately follows the SOS segment.
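The SOS fields can be read in the same style as the other segments. In this sketch `parse_sos` is a hypothetical name, the input is assumed to be the segment contents after the 2-byte length field, and the three trailing bytes are skipped without inspection.

```python
def parse_sos(contents: bytes) -> list:
    """Return a list of (component_id, dc_table, ac_table) tuples."""
    if not contents:
        raise ValueError("empty SOS segment")
    count = contents[0]
    if not 1 <= count <= 4:
        raise ValueError("number of components in scan must be 1..4")
    # Length is 6 + 2*count including the length field, so contents hold
    # 1 count byte + 2 bytes per component + 3 bytes to skip.
    if len(contents) != 4 + 2 * count:
        raise ValueError("SOS length does not match component count")
    components = []
    for i in range(count):
        ident, tables = contents[1 + 2 * i:3 + 2 * i]
        components.append((ident, tables >> 4, tables & 0x0F))
    return components
```

The compressed image data begins at the first byte after these contents.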