Introduction to BUFR Simon Elliott EUMETSAT simon.elliott@eumetsat.int RA-VI Regional Training on BUFR and Migration to Table Driven Code Forms Langen, Germany, 17 - 20 April, 2007
What is BUFR? • Binary Universal Form for the Representation of Meteorological Data • Used for data that are not on a regular grid, such as observations • Conceptually equivalent to CREX, but the format is binary rather than alphanumeric
What does a BUFR message look like?
01000010010101010100011001010010000000000000000000110100000000110000000000000000
00010010000000000000000000111000000000000000000000000000000000000000100100000001
00000001000001000001110100001100000000000000000000000000000000000000111000000000
00000000000000011000000000000001000000010000000100000010000011000000010000000000
00000000000000000000100000000000100100001111010111011100010000000011011100110111
0011011100110111
(In other words, just an apparently random string of 0’s and 1’s!)
Sections of a BUFR message • 0 Indicator section • 1 Identification section • 2 Optional local use section • 3 Data description section • 4 Data section • 5 End of message
Section 0 – Indicator section This section contains: • The character string “BUFR” indicating the start of the message • The total length of the message • The BUFR edition number
Section 0 - Details • Length always 8 • Octets 1-4 “BUFR” (in CCITT IA5) • Octets 5-7 Total length of message (including Section 0) • Octet 8 Edition number (currently 4, but 3 is still used)
Now, let’s go back and look at that BUFR message again…
‘B’ ‘U’ ‘F’ ‘R’ in octets 1-4; end of section 0 after octet 8
octet number 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
binary string 01000010010101010100011001010010000000000000000000110100000000110000000000000000
00010010000000000000000000111000000000000000000000000000000000000000100100000001
00000001000001000001110100001100000000000000000000000000000000000000111000000000
00000000000000011000000000000001000000010000000100000010000011000000010000000000
00000000000000000000100000000000100100001111010111011100010000000011011100110111
0011011100110111
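The Section 0 layout can be checked against the example message with a few lines of Python. This is a minimal sketch: the message is the slide's binary string transcribed to hex (ten octets per row), and the function name `parse_section0` is chosen here for illustration.

```python
# The example message from the slide, transcribed from binary to hex
# (ten octets per row, matching the rows of the binary string).
MESSAGE = bytes.fromhex(
    "42554652000034030000"
    "12000038000000000901"
    "01041D0C000000000E00"
    "000180010101020C0400"
    "0000080090F5DC403737"
    "3737"
)


def parse_section0(msg: bytes) -> dict:
    """Read the 8-octet indicator section (slide octets are 1-based)."""
    if msg[0:4] != b"BUFR":                                # octets 1-4
        raise ValueError("message does not start with 'BUFR'")
    return {
        "total_length": int.from_bytes(msg[4:7], "big"),   # octets 5-7
        "edition": msg[7],                                 # octet 8
    }


section0 = parse_section0(MESSAGE)
print(section0)
```

For the example message this reports a total length of 52 octets and edition 3, which matches the number of octets actually present.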
Section 1 – Identification section This section contains: • The table versions referred to by this message • An overall description of the message contents, including: • The originating centre and sub-centre • The data category and sub-category • A representative date and time • Whether or not the optional section is included
Section 1 – Details (example based upon BUFR edition 3) • Length at least 18 • Octets 1-3 Length of section • Octet 4 Master table (0 for WMO, 10 for IOC, etc.) • Octets 5-6 Originating sub-centre and centre • Octet 7 Update sequence number • Octet 8 Flag (Optional section?) • Octets 9-10 Data category and sub-category • Octets 11-12 Master and local table version numbers • Octets 13-17 Date and time typical of message contents • Octets 18-?? Reserved for local use
Now, let’s go back and look at that BUFR message again…
‘B’ ‘U’ ‘F’ ‘R’ in octets 1-4; end of section 0 after octet 8, followed by octets 1-2 of section 1
octet number 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 |
binary string 01000010010101010100011001010010000000000000000000110100000000110000000000000000
octet number 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
binary string 00010010000000000000000000111000000000000000000000000000000000000000100100000001
end of section 1 after octet 18
octet number 13 | 14 | 15 | 16 | 17 | 18 |
binary string 00000001000001000001110100001100000000000000000000000000000000000000111000000000
00000000000000011000000000000001000000010000000100000010000011000000010000000000
00000000000000000000100000000000100100001111010111011100010000000011011100110111
0011011100110111
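The edition 3 layout of Section 1 can be read out of the example message in the same way. The sketch below assumes the edition 3 octet positions listed on the previous slide, that the optional-section flag is the most significant bit of octet 8, and that octets 13-17 hold year of century, month, day, hour and minute in that order. The function name and dictionary keys are illustrative.

```python
# Example message from the slides, transcribed to hex (ten octets per row).
MESSAGE = bytes.fromhex(
    "42554652000034030000"
    "12000038000000000901"
    "01041D0C000000000E00"
    "000180010101020C0400"
    "0000080090F5DC403737"
    "3737"
)


def parse_section1_ed3(msg: bytes, start: int = 8) -> dict:
    """Read Section 1 using the edition 3 layout; Section 0 is always 8 octets."""
    s = msg[start:]
    return {
        "length": int.from_bytes(s[0:3], "big"),  # octets 1-3
        "master_table": s[3],                     # octet 4
        "sub_centre": s[4],                       # octet 5
        "centre": s[5],                           # octet 6
        "update_sequence": s[6],                  # octet 7
        "has_section2": bool(s[7] & 0x80),        # octet 8, first bit
        "data_category": s[8],                    # octet 9
        "data_sub_category": s[9],                # octet 10
        "master_table_version": s[10],            # octet 11
        "local_table_version": s[11],             # octet 12
        # octets 13-17: year of century, month, day, hour, minute
        "date_time": tuple(s[12:17]),
    }


section1 = parse_section1_ed3(MESSAGE)
for key, value in section1.items():
    print(f"{key}: {value}")
```

In the example, Section 1 is 18 octets long, the optional Section 2 is absent, the data category is 0 (surface data, land), and the date and time fields read (1, 4, 29, 12, 0).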
Section 2 – Optional section This section is defined by the ADP (Automated Data Processing) centre generating or using the message • It typically contains additional information of use to the ADP centre, such as • Database keys to aid searching for specific data without decoding the message • Anything else a processing centre may find useful
Section 3 – Data description section This section contains: • A count of the number of data subsets (typically individual observations) • Flags indicating whether the data are compressed or uncompressed, and whether they are observed or forecast • A list of the data elements (fields) that are contained in each data subset
Section 3 - Details • Length at least 10 • Octets 1-3 Length of section • Octet 4 Set to zero • Octets 5-6 Number of subsets • Octet 7 Flag (Obs?, Compressed?) • Octets 8-?? List of descriptors • Each descriptor 2 bits F, 6 bits X, 8 bits Y
Now, let’s go back and look at that BUFR message again…
‘B’ ‘U’ ‘F’ ‘R’ in octets 1-4; end of section 0 after octet 8, followed by octets 1-2 of section 1
octet number 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 |
binary string 01000010010101010100011001010010000000000000000000110100000000110000000000000000
octet number 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
binary string 00010010000000000000000000111000000000000000000000000000000000000000100100000001
end of section 1 after octet 18, followed by octets 1-4 of section 3
octet number 13 | 14 | 15 | 16 | 17 | 18 | 1 | 2 | 3 | 4 |
binary string 00000001000001000001110100001100000000000000000000000000000000000000111000000000
end of section 3 after octet 14
octet number 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
binary string 00000000000000011000000000000001000000010000000100000010000011000000010000000000
00000000000000000000100000000000100100001111010111011100010000000011011100110111
0011011100110111
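Section 3 of the example can be unpacked along the same lines. In this sketch the start offset 26 is simply 8 octets of Section 0 plus 18 octets of Section 1 (there is no Section 2 in this message). The flag bit positions (first bit for observed data, second bit for compressed data) are an assumption about the layout of octet 7, and `parse_section3` is an illustrative name.

```python
# Example message from the slides, transcribed to hex (ten octets per row).
MESSAGE = bytes.fromhex(
    "42554652000034030000"
    "12000038000000000901"
    "01041D0C000000000E00"
    "000180010101020C0400"
    "0000080090F5DC403737"
    "3737"
)


def parse_section3(msg: bytes, start: int) -> dict:
    """Read the data description section beginning at byte offset `start`."""
    s = msg[start:]
    length = int.from_bytes(s[0:3], "big")          # octets 1-3
    descriptors = []
    # Descriptors are 2-octet pairs from octet 8 onward; a single
    # left-over octet at the end of the section is padding.
    for i in range(7, length - 1, 2):
        word = int.from_bytes(s[i:i + 2], "big")
        f = word >> 14              # top 2 bits
        x = (word >> 8) & 0x3F      # next 6 bits
        y = word & 0xFF             # last 8 bits
        descriptors.append((f, x, y))
    return {
        "length": length,
        "subsets": int.from_bytes(s[4:6], "big"),   # octets 5-6
        "observed": bool(s[6] & 0x80),              # octet 7, first bit
        "compressed": bool(s[6] & 0x40),            # octet 7, second bit
        "descriptors": descriptors,
    }


section3 = parse_section3(MESSAGE, start=8 + 18)
print(section3)
```

The example message describes one uncompressed subset of observed data with the three element descriptors 0 01 001, 0 01 002 and 0 12 004. Octet 14 is a padding octet.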
Section 4 – Data section This section contains: • The actual data as specified by Section 3 • One of two formats is used • Compressed • Uncompressed (such data are still packed, but not as efficiently as compressed data usually are)
Section 4 - Details • Octets 1-3 Length of section • Octet 4 Set to zero • Octets 5-?? Binary data as specified by Section 3
Now, let’s go back and look at that BUFR message again…
‘B’ ‘U’ ‘F’ ‘R’ in octets 1-4; end of section 0 after octet 8, followed by octets 1-2 of section 1
octet number 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 |
binary string 01000010010101010100011001010010000000000000000000110100000000110000000000000000
octet number 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
binary string 00010010000000000000000000111000000000000000000000000000000000000000100100000001
end of section 1 after octet 18, followed by octets 1-4 of section 3
octet number 13 | 14 | 15 | 16 | 17 | 18 | 1 | 2 | 3 | 4 |
binary string 00000001000001000001110100001100000000000000000000000000000000000000111000000000
end of section 3 after octet 14
octet number 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
binary string 00000000000000011000000000000001000000010000000100000010000011000000010000000000
end of section 4 after octet 8
octet number 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
binary string 00000000000000000000100000000000100100001111010111011100010000000011011100110111
0011011100110111
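With the descriptors from Section 3 in hand, the four data octets of Section 4 can be unpacked bit by bit. The sketch below handles only uncompressed data with plain element descriptors. The three Table B entries (names, scale, reference value, data width) are assumptions that are not given on these slides and should be checked against the current WMO Table B; `TABLE_B` and `decode_uncompressed` are illustrative names.

```python
# Example message from the slides, transcribed to hex (ten octets per row).
MESSAGE = bytes.fromhex(
    "42554652000034030000"
    "12000038000000000901"
    "01041D0C000000000E00"
    "000180010101020C0400"
    "0000080090F5DC403737"
    "3737"
)

# ASSUMED Table B entries: (name, scale, reference value, data width in bits).
TABLE_B = {
    (0, 1, 1): ("WMO block number", 0, 0, 7),
    (0, 1, 2): ("WMO station number", 0, 0, 10),
    (0, 12, 4): ("Dry-bulb temperature at 2 m [K]", 1, 0, 12),
}


def decode_uncompressed(data: bytes, descriptors: list) -> dict:
    """Unpack one uncompressed subset consisting of element descriptors only."""
    bits = int.from_bytes(data, "big")
    remaining = len(data) * 8            # bits not yet consumed
    values = {}
    for descriptor in descriptors:
        name, scale, reference, width = TABLE_B[descriptor]
        remaining -= width
        raw = (bits >> remaining) & ((1 << width) - 1)
        if raw == (1 << width) - 1:      # all ones means "missing"
            values[name] = None
        else:
            values[name] = (raw + reference) / 10 ** scale
    return values


# Section 4 starts after Sections 0, 1 and 3 (8 + 18 + 14 octets).
start = 8 + 18 + 14
length = int.from_bytes(MESSAGE[start:start + 3], "big")
data = MESSAGE[start + 4:start + length]     # octets 5 to end of section
values = decode_uncompressed(data, [(0, 1, 1), (0, 1, 2), (0, 12, 4)])
print(values)
```

Under those assumed table entries, the 29 data bits decode to block number 72, station number 491 and a temperature of 295.2 K; the final three bits of the section are padding.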
Section 5 – End section This section contains: • The character string “7777” indicating the end of the message • Checking for this indicator can be useful to detect some types of data corruption (especially missing bytes in the rest of the message) since the total length of the message is known from Section 0
Now, let’s go back and look at that BUFR message one last time!
‘B’ ‘U’ ‘F’ ‘R’ in octets 1-4; end of section 0 after octet 8, followed by octets 1-2 of section 1
octet number 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 |
binary string 01000010010101010100011001010010000000000000000000110100000000110000000000000000
octet number 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
binary string 00010010000000000000000000111000000000000000000000000000000000000000100100000001
end of section 1 after octet 18, followed by octets 1-4 of section 3
octet number 13 | 14 | 15 | 16 | 17 | 18 | 1 | 2 | 3 | 4 |
binary string 00000001000001000001110100001100000000000000000000000000000000000000111000000000
end of section 3 after octet 14
octet number 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
binary string 00000000000000011000000000000001000000010000000100000010000011000000010000000000
end of section 4 after octet 8, followed by ‘7’ ‘7’ in octets 1-2 of section 5
octet number 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | 2 |
binary string 00000000000000000000100000000000100100001111010111011100010000000011011100110111
‘7’ ‘7’ in octets 3-4, then end of section 5
octet number 3 | 4 |
binary string 0011011100110111
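The whole walk through the message can be repeated programmatically: step from section to section using each length field, then confirm that “7777” sits exactly where the total length from Section 0 says the message ends. This is a minimal sketch that assumes the edition 3 layout (optional-section flag in the first bit of Section 1 octet 8); `walk_sections` is an illustrative name.

```python
# Example message from the slides, transcribed to hex (ten octets per row).
MESSAGE = bytes.fromhex(
    "42554652000034030000"
    "12000038000000000901"
    "01041D0C000000000E00"
    "000180010101020C0400"
    "0000080090F5DC403737"
    "3737"
)


def walk_sections(msg: bytes) -> dict:
    """Return {section number: length in octets}, raising on framing errors."""
    if msg[:4] != b"BUFR":
        raise ValueError("missing 'BUFR' at start of message")
    total = int.from_bytes(msg[4:7], "big")
    lengths = {0: 8}
    pos = 8
    # Edition 3 assumption: flag is Section 1 octet 8, first bit.
    has_section2 = bool(msg[pos + 7] & 0x80)
    for number in (1, 2, 3, 4):
        if number == 2 and not has_section2:
            continue
        lengths[number] = int.from_bytes(msg[pos:pos + 3], "big")
        pos += lengths[number]
    if msg[pos:pos + 4] != b"7777":
        raise ValueError("'7777' not found where Section 5 should start")
    lengths[5] = 4
    if pos + 4 != total:
        raise ValueError("section lengths do not add up to the total length")
    return lengths


lengths = walk_sections(MESSAGE)
print(lengths, "total:", sum(lengths.values()))
```

For the example this yields 8 + 18 + 14 + 8 + 4 = 52 octets. A message with missing or extra octets would usually fail one of the two final checks.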
BUFR Descriptors • Section 3 contains a list of BUFR descriptors • These describe the data elements that are contained in Section 4 • Most descriptors are references to BUFR Tables B, C and D • Using the list of descriptors in Section 3, together with the tables, it is possible to unpack the data in Section 4
Types of BUFR descriptors • Element descriptors (Table B) • Replication descriptors • Operator descriptors (Table C) • Sequence descriptors (Table D) • Specified by 3 numbers in 16 bits (2 octets) • F: 2 bits 0-3 • X: 6 bits 0-63 • Y: 8 bits 0-255
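The 2 + 6 + 8 bit split can be illustrated with a pair of small helper functions. This is a sketch only; the function names and the `KINDS` lookup are chosen here for illustration.

```python
# Descriptor type by F value, as listed on the slide.
KINDS = {
    0: "element (Table B)",
    1: "replication",
    2: "operator (Table C)",
    3: "sequence (Table D)",
}


def unpack_descriptor(word: int) -> tuple:
    """Split a 16-bit descriptor into (F, X, Y)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("a descriptor occupies exactly 16 bits")
    f = word >> 14             # top 2 bits, 0-3
    x = (word >> 8) & 0x3F     # next 6 bits, 0-63
    y = word & 0xFF            # last 8 bits, 0-255
    return f, x, y


def pack_descriptor(f: int, x: int, y: int) -> int:
    """Combine F, X and Y into the 16-bit value stored in Section 3."""
    if not (0 <= f <= 3 and 0 <= x <= 63 and 0 <= y <= 255):
        raise ValueError("F, X or Y out of range")
    return (f << 14) | (x << 8) | y


f, x, y = unpack_descriptor(0x0C04)   # octets 00001100 00000100
print(f, x, y, KINDS[f])
```

The two octets `00001100 00000100` from the example message unpack to F=0, X=12, Y=4, i.e. the element descriptor 0 12 004.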
Element descriptors • Defined by entries in Table B • F is 0 • Each element descriptor describes an encoded value, such as: • The value of a meteorological parameter (e.g. mean sea level pressure, temperature, wind speed) • Instrument details • Location or date and time information • Quality control information
Replication descriptors • Describe the repetition of one or more element, operator, sequence or other replication descriptors • Used for repetitive data such as the individual levels in vertical soundings or temperature profiles • Can be: • Fixed - the number of repetitions is pre-determined and the same for all data subsets • Variable - the number of repetitions can differ from one subset to the next (i.e. delayed replication)
Replication descriptors - continued • Replication descriptors are defined by three numbers F X Y • F is 1 • X is an integer between 1 and 63 • Defines the number of descriptors to be repeated • Y is an integer between 0 and 255 • Defines how many times the X descriptors are to be repeated • A count of zero indicates delayed replication, where the repeat count is stored in the data section and can change from one data subset to another.
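Fixed replication can be pictured as a simple list expansion. The sketch below repeats the X descriptors that follow a 1 X Y descriptor Y times, and handles nesting by recursion. Delayed replication (Y = 0) is deliberately not handled, because the repeat count then has to be read from Section 4. The function name and the descriptor numbers in the example are illustrative.

```python
def expand_fixed_replication(descriptors: list) -> list:
    """Expand fixed replication descriptors in a list of (F, X, Y) tuples."""
    expanded = []
    i = 0
    while i < len(descriptors):
        f, x, y = descriptors[i]
        i += 1
        if f != 1:
            expanded.append((f, x, y))
            continue
        if y == 0:
            raise ValueError(
                "delayed replication: the repeat count is stored in Section 4"
            )
        group = descriptors[i:i + x]        # the next X descriptors
        if len(group) < x:
            raise ValueError("replication runs past the end of the list")
        # Expand the group first so that nested replication also works.
        expanded.extend(expand_fixed_replication(group) * y)
        i += x
    return expanded


# 1 02 003: repeat the next 2 descriptors 3 times.
example = [(0, 1, 1), (1, 2, 3), (0, 7, 4), (0, 12, 1), (0, 11, 1)]
result = expand_fixed_replication(example)
print(result)
```

Here the five descriptors in the list expand to eight: one leading element, three copies of the two-descriptor group, and one trailing element.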
Operator descriptors • Defined by entries in Table C • F is 2 • Describe changes to be made to other descriptors • Operators exist for applications such as: • Changing the scaling and packing of data • Adding quality control or other associated fields • Describing the descriptors to which quality control information applies • Substituting a better value for an element, while retaining the original value
Sequence descriptors • Defined by entries in Table D • F is 3 • Shorthand notations for pre-defined lists of other element, replication, sequence and operator descriptors • Not really necessary, but useful in reducing the overhead involved in transmitting data in BUFR: • Replace a commonly-used sequence of descriptors with a single descriptor, and thereby reduce the overall length of Section 3
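Sequence expansion amounts to a recursive table lookup. In the sketch below, `TABLE_D` is a tiny illustrative dictionary and not the real Table D: the first two entries are assumptions to be checked against the current WMO tables, and 3 48 192 is a purely hypothetical local sequence included only to show nesting.

```python
# Illustrative stand-in for Table D: sequence descriptor -> list of descriptors.
TABLE_D = {
    (3, 1, 1): [(0, 1, 1), (0, 1, 2)],              # ASSUMED: block + station number
    (3, 1, 11): [(0, 4, 1), (0, 4, 2), (0, 4, 3)],  # ASSUMED: year, month, day
    (3, 48, 192): [(3, 1, 1), (3, 1, 11)],          # HYPOTHETICAL local sequence
}


def expand_sequences(descriptors: list) -> list:
    """Replace every sequence descriptor (F = 3) by the list it stands for."""
    expanded = []
    for descriptor in descriptors:
        if descriptor[0] == 3:
            # A sequence may itself contain sequences, so recurse.
            expanded.extend(expand_sequences(TABLE_D[descriptor]))
        else:
            expanded.append(descriptor)
    return expanded


section3_list = [(3, 48, 192), (0, 12, 4)]
flat = expand_sequences(section3_list)
print(len(section3_list), "descriptors in Section 3 ->", len(flat), "after expansion")
```

Two descriptors (4 octets) in Section 3 stand for six after expansion, which is the saving in overhead that the slide refers to.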
BUFR tables There are many different tables involved in BUFR: • Table A • Data Categories, used in Section 1 • Table B • Element descriptors, used in Section 3 • Table C • Operator descriptors, used in Section 3 • Table D • Sequence descriptors, used in Section 3 • Code and Flag tables • Numerical values to be encoded where the element values are qualitative, used in Section 4
Table A • Defines the general category of the data contained in the BUFR message • Encoded in Section 1 • Examples of typical entries:
Code figure   Meaning
0             Surface data – land
1             Surface data – sea
2             Vertical soundings (non-satellite)
3             Vertical soundings (satellite)
…             …
6             Radar data
…             …
10            Radiological data
12            Surface data (satellite)
…             …
31            Oceanographic data
Table B • Describes the individual values that are encoded • Element descriptors are grouped according to classes (i.e. X value)
Class number   Class name
01             Identification
02             Instrumentation
…              …
04             Location (time)
05             Location (horizontal-1)
06             Location (horizontal-2)
07             Location (vertical)
…              …
11             Wind and turbulence
12             Temperature
13             Hydrological
14             Radiation and radiance
…              …
19             Synoptic features
20             Observed phenomena
21             Radar data
…              …
33             Quality information
Class 01 – Identification (excerpt)
Class 11 – Wind and turbulence (excerpt)
Table B • Columns are: • Table reference • Element name • Unit • Scale • Reference value • Data width (in bits) • Scale, reference value, and bit width are chosen so that the desired range of possible data values can be stored in BUFR as positive integers • Preserves the machine-independence of BUFR
Table B reference • Expressed as 3 small numbers F X Y • Used to refer to this descriptor • F is always 0 for an element descriptor • X is in the range 0 to 63 and refers to a broad class of elements • Classes 48 to 63 are reserved for local use • Y is in the range 0 to 255 and refers to the individual descriptor in the class • Within all classes, descriptors 192 to 255 are reserved for local use
Table B element name • Natural language description of the meaning of the value • Usually English but could be translated to other languages • For example: • Brightness temperature • Total precipitation past 3 hours • Wind speed at 10m
Table B unit • The units used for the value • Normally SI units are used • “CCITT IA5” (the international version of ASCII) is used for character data such as identifiers • “Code Table” is used for qualitative data where only one of a set of possible values can be applicable in a given data subset • “Flag Table” is used for qualitative data where more than one of a set of possible values may be applicable in a given data subset • For qualitative data, the coded values are references to the Code and Flag tables
Table B scale • Power of 10 by which to multiply the data value before packing • Determines the precision with which the data are encoded • A scale of 2 means 2 decimal places of precision (e.g. 273.16) • A scale of –1 means that the data values are rounded to the nearest multiple of 10
Table B reference value • Used to subtract an offset where negative data have to be encoded • Table B contains the value of the offset to be subtracted, already multiplied by the power of 10 given by the scale • For example, scale=2, reference value -9000 means that -90.00 is to be subtracted before scaling (i.e. -9000 after scaling), allowing values as negative as -90.00 to be represented
Table B data width • The number of bits to be used to encode the value • If all bits are set to ones when encoding (i.e. a value of $2^n - 1$, where $n$ is the data width), then this denotes a “missing” value • If the scale is $s$, the reference value is $r$, and the data width is $n$, then the representable range of values is: • Minimum: $10^{-s} \times r$ • Maximum: $10^{-s} \times (2^n - 2 + r)$ • $10^{-s} \times (2^n - 1 + r)$ denotes the “missing” value
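Scale, reference value and data width together define a simple packing rule, which can be sketched as a pair of functions. The scale and reference value in the example are those from the reference value slide (scale=2, reference value -9000); the data width of 15 bits is an assumption made here for illustration, as are the function names and the use of `None` for a missing value.

```python
from typing import Optional


def encode_value(value: Optional[float], scale: int, reference: int, width: int) -> int:
    """Return the non-negative integer that is packed into `width` bits."""
    missing = (1 << width) - 1              # all bits set to one
    if value is None:
        return missing
    raw = round(value * 10 ** scale) - reference
    if not 0 <= raw < missing:
        raise ValueError(f"{value} is outside the representable range")
    return raw


def decode_value(raw: int, scale: int, reference: int, width: int) -> Optional[float]:
    """Invert encode_value; all ones decodes to None (missing)."""
    if raw == (1 << width) - 1:
        return None
    return (raw + reference) / 10 ** scale


SCALE, REFERENCE, WIDTH = 2, -9000, 15      # WIDTH is an assumed value

print(encode_value(-90.00, SCALE, REFERENCE, WIDTH))   # most negative value
print(encode_value(45.5, SCALE, REFERENCE, WIDTH))
print(decode_value(13550, SCALE, REFERENCE, WIDTH))
print(encode_value(None, SCALE, REFERENCE, WIDTH))     # missing
```

The most negative representable value, -90.00, packs to zero, and anything below it is rejected, which is how the reference value keeps the stored integers non-negative.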
Table B examples
Table B examples - continued • 0 11 012 - Wind speed at 10m • Scale=1, Reference value=0, Data width=12 • Precision is one decimal place (i.e. 0.1 m s$^{-1}$) • Minimum representable value is: $10^{-1} \times 0$ = 0.0 m s$^{-1}$ • Maximum representable value is: $10^{-1} \times (2^{12} - 2 + 0)$ = 409.4 m s$^{-1}$ • “Missing” value is: $10^{-1} \times (2^{12} - 1 + 0)$ = 409.5 m s$^{-1}$
Table B examples - continued • 0 13 020 - Total precipitation past 3 hours • Scale=1, Reference value=-1, Data width=14 • Precision is one decimal place (i.e. 0.1 kg m$^{-2}$) • For this descriptor, -0.1 kg m$^{-2}$ is a special value for trace, according to a specific note in Table B • Minimum representable value is: $10^{-1} \times (-1)$ = -0.1 kg m$^{-2}$ (= trace) • Maximum representable value is: $10^{-1} \times (2^{14} - 2 - 1)$ = 1638.1 kg m$^{-2}$ • “Missing” value is: $10^{-1} \times (2^{14} - 1 - 1)$ = 1638.2 kg m$^{-2}$
Table B examples - continued • 0 20 003 - Present weather • Scale=0, Reference value=0, Data width=9 • Coded values are integers since Scale=0 • Minimum representable value is: $10^{0} \times 0$ = 0 • Maximum representable value is: $10^{0} \times (2^{9} - 2 + 0)$ = 510 • “Missing” value is: $10^{0} \times (2^{9} - 1 + 0)$ = 511 • One must refer to Code Table 0 20 003 in order to discover the actual meaning of each coded value
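The three worked examples can be reproduced directly from the range formulas. This is a small sketch; `representable_range` is an illustrative name, and the scale, reference value and data width for each descriptor are the ones quoted on the example slides.

```python
def representable_range(scale: int, reference: int, width: int) -> tuple:
    """Return (minimum, maximum, value that the 'missing' bit pattern maps to)."""
    minimum = reference / 10 ** scale
    maximum = (2 ** width - 2 + reference) / 10 ** scale
    missing = (2 ** width - 1 + reference) / 10 ** scale
    return minimum, maximum, missing


EXAMPLES = {
    "0 11 012 Wind speed at 10m": (1, 0, 12),
    "0 13 020 Total precipitation past 3 hours": (1, -1, 14),
    "0 20 003 Present weather": (0, 0, 9),
}

ranges = {name: representable_range(*entry) for name, entry in EXAMPLES.items()}
for name, (minimum, maximum, missing) in ranges.items():
    print(f"{name}: min={minimum}, max={maximum}, missing={missing}")
```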
0 20 003 – Present Weather Code Table (excerpted)
Code figure   Meaning
0             Cloud development not observed or not observable
1             Clouds generally dissolving or becoming less developed
…
10            Mist
11            Patches of shallow fog or ice fog
13            Lightning visible, but no thunder heard
…
171           Snow, slight
172           Snow, moderate
173           Snow, heavy
…
511           Missing
Code tables vs. Flag tables (choice of one vs. choice of more than one)
0-01-003 WMO region number (Code table)
Code figure   Meaning
0             Antarctica
1             Region I
2             Region II
3             Region III
4             Region IV
5             Region V
6             Region VI
7             Missing value
0-08-001 Vertical sounding significance (Flag table)
Bit number    Meaning
1             Surface
2             Standard level
3             Tropopause level
4             Maximum wind level
5             Significant temperature level
6             Significant wind level
All 7         Missing value
For a Code table, the value stored in Section 4 is the code figure corresponding to the applicable meaning. For a Flag table of N bits, the value stored in Section 4 is $2^{(N - \mathrm{bit\#})} + 2^{(N - \mathrm{bit\#})} + \dots$ for the bit(s) corresponding to each applicable meaning. An extra bit is always present in order to be able to distinguish “all meanings applicable” from “missing”.
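The flag table rule can be tried out on 0-08-001, which is N = 7 bits wide (six meanings plus the extra bit). The sketch below follows the formula on the slide, with bit 1 as the most significant bit; the function names and the use of `None` for “missing” are illustrative.

```python
def flag_value(bits_set: list, n_bits: int) -> int:
    """Value stored in Section 4 for the given 1-based flag bit numbers."""
    value = 0
    for bit in bits_set:
        # Bit N is the extra bit; it is only set in the all-ones "missing" value.
        if not 1 <= bit < n_bits:
            raise ValueError(f"bit {bit} is not a meaning bit of a {n_bits}-bit table")
        value += 2 ** (n_bits - bit)
    return value


def flag_bits(value: int, n_bits: int):
    """Inverse of flag_value; returns None when every bit is set (missing)."""
    if value == 2 ** n_bits - 1:
        return None
    return [bit for bit in range(1, n_bits + 1) if value & (1 << (n_bits - bit))]


N = 7  # width of 0-08-001 Vertical sounding significance
print(flag_value([1], N))                   # surface only
print(flag_value([1, 3], N))                # surface and tropopause level
print(flag_value([1, 2, 3, 4, 5, 6], N))    # all meanings applicable
print(flag_bits(127, N))                    # missing
```

With all six meanings set the stored value is 126, one less than the all-ones value 127 that denotes “missing”, which is why the extra bit is needed.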
Some other important regulations pertaining to Table B • Elements in classes 01 – 09 are “coordinate” descriptors which remain in effect until redefined or until the end of the subset • Exception: when two identical descriptors from classes 04 – 07 are listed consecutively, they define the boundaries of a range • Similar descriptors exist in “coordinate” vs. “non-coordinate” classes • Example: 0 07 004 and 0 10 004 are both “Pressure” with identical scale, reference value and bit width; however, the former is a “coordinate” for use when pressure is the main defining coordinate measured in the vertical direction (e.g. in radiosondes) vs. the latter which is a “non-coordinate” for use when pressure is a derived value (e.g. an aircraft calculating pressure as a function of an observed or measured height) • Class 08 contains significance qualifiers which can be used to report qualitative information and which can be explicitly “cancelled” • Example: 0 08 011 with value 12 can indicate that we are talking about a “cloud”
Table C • Describes the various operators • Columns are: • Table reference F X • F is 2 • X is an integer between 0 and 63 • There is no sub-range of X values reserved for local use • Operand • A number between 0 and 255 • Operator name • A short name describing the operation • Operator definition • A detailed description of the operation and its effects
Table C • This is just an excerpt – there are many other (even more complicated!) operators in Table C. • There are also many important notes to Table C describing, e.g. how to cancel an operator.
Table C example • Table reference F=2 X=01 • Operand, in this case represented as Y • Operator name “Change data width” • Operator description: “Add Y-128 bits to the data width given for each data element in Table B, other than CCITT IA5 (character) data, code or flag tables” • According to a note under Table C, this operator is cancelled (i.e. its effect is turned off) by repeating the operator with Y=0, or at the end of each data subset
Table C example - continued • The “Change data width” operator causes the data width to be changed for subsequent elements, in effect giving them a larger (or smaller) range than is otherwise prescribed within Table B. Thus, it can be used to: • encode values that exceed the usual representable range for a descriptor, instead of having to introduce a new Table B descriptor (note: in such cases, Y > 128) • reduce the size of the data (and thus the overall encoded message as well!) if the required data range can be encoded using a smaller data width than provided within Table B (note: in such cases, Y < 128)
Table C example - continued As an example, one of the standard descriptors for the height coordinate of an observation is 0 07 007 with unit=m, scale=0, reference=-1000, data width=17, giving a representable range of –1000 m to 130070 m. If one needed to encode a value larger than this, then the 2 01 operator could be used to increase the data width. For example, use of the operator 2 01 130 before the 0 07 007 descriptor would increase its data width to 19 bits and therefore allow values up to 523286 m.
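The arithmetic behind this example can be checked in a few lines. This is a sketch under the rules quoted above: the operator adds Y-128 bits, and Y=0 cancels it. The function names are illustrative.

```python
def changed_width(table_b_width: int, y: int) -> int:
    """Data width in effect after a 2 01 Y operator (Y = 0 cancels it)."""
    if y == 0:
        return table_b_width
    return table_b_width + (y - 128)


def maximum_value(scale: int, reference: int, width: int) -> float:
    """Largest representable value; the all-ones pattern is kept for 'missing'."""
    return (2 ** width - 2 + reference) / 10 ** scale


# 0 07 007 height: scale=0, reference=-1000, data width=17
SCALE, REFERENCE, WIDTH = 0, -1000, 17

default_max = maximum_value(SCALE, REFERENCE, WIDTH)
wider = changed_width(WIDTH, 130)                 # operator 2 01 130
wider_max = maximum_value(SCALE, REFERENCE, wider)
print(f"Table B width {WIDTH}: up to {default_max:.0f} m")
print(f"after 2 01 130, width {wider}: up to {wider_max:.0f} m")
```

The same helper shows the opposite use: `changed_width(17, 126)` gives 15 bits, which saves two bits per value when the full Table B range is not needed.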