Doc MET 400 (2015) & Data validation Training course. Training course on European Union Trade in Goods Statistics.
Table of contents
• Introduction
  • Doc MET 400: what is it?
  • Link with legislation
  • Versions
  • Structure of Doc MET 400 document
  • Contact persons
• Transmission of detailed trade data
  • General structure
  • Logical format
  • Physical format
  • Non confidential data
  • Confidential data
• Reception of detailed trade data
  • Validation
Doc MET 400: What is it? (1)
First page:
“TRANSMISSION OF THE RESULTS OF INTRA AND EXTRA-COMMUNITY TRADE
The purpose of this paper is to determine the rules for the transmission of the results, which Member States are to submit to Eurostat in accordance with the regulations”
Doc MET 400: What is it? (2)
• Rules for transmission of data
  • from Member States (MSs) to Eurostat
  • Intrastat and Extrastat data
    • Detailed
    • Aggregated
  • “In accordance with the regulation”
• The Extrastat detailed part also applies to EFTA (with some exceptions)
Link with legislation
• Legal basis:
  • in sections 1.1 (Extrastat) and 2.1 (Intrastat):
    • Extrastat Basic Regulation (EC) No 471/2009 and implementing provisions (Regulations (EU) No 92/2010 and 113/2010)
    • Intrastat Basic Regulation (EC) No 638/2004 and implementing provisions (Regulation (EC) No 1982/2004)
    • EFTA: Decisions of the EEA Joint Committee No 105/2011 and 106/2011; Decision No 2/2010 of the European Union/Switzerland Statistical Committee
• Legal basis and Doc MET 400 are complementary
  • Legislation ≈ methodological principles
  • Doc MET 400 ≈ syntactical rules <= this training
Versions (1)
• Several versions; the most recent ones are:
  • Year 2015: reference periods in 2015
  • Year 2014: reference periods in 2014
  • Year 2013: reference periods in 2013
  • Year 2012: reference periods in 2012
  • Year 2011: reference periods in 2011
  • Year 2010: reference periods in 2010
  • Rev. 17: reference periods in 2009 or 2008
  • Rev. 16: reference periods in 2007
  • Rev. 15: reference periods in 2006
• Since 2010: yearly
Versions (2)
• Version applicable:
  • version corresponding to reference period
  • => data must be split at least on a yearly basis
• Revisions:
  • “Revisions of data referring to previous years must be sent according to the relevant revision of this document”
• This training focuses on the 2015 / 2014 versions
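The rule "version corresponding to reference period" can be sketched as a small lookup. This is only an illustration: the function names `doc_met400_version` and `split_by_version` are hypothetical, reference periods are assumed to be `YYYYMM` strings, and the mapping covers only the versions listed on the "Versions (1)" slide.

```python
from collections import defaultdict


def doc_met400_version(reference_period: str) -> str:
    """Return the Doc MET 400 version for a YYYYMM reference period.

    Only the versions named on the slide are covered; anything else raises.
    """
    year = int(reference_period[:4])
    if 2010 <= year <= 2015:
        return f"Year {year}"      # since 2010: one version per year
    if year in (2008, 2009):
        return "Rev. 17"
    if year == 2007:
        return "Rev. 16"
    if year == 2006:
        return "Rev. 15"
    raise ValueError(f"no version listed for reference year {year}")


def split_by_version(reference_periods):
    """Group reference periods by applicable version (one file per group)."""
    groups = defaultdict(list)
    for period in reference_periods:
        groups[doc_met400_version(period)].append(period)
    return dict(groups)


groups = split_by_version(["201412", "201501", "201502", "200903"])
```

Grouping before transmission reflects the point that data covering several years cannot share one file format version.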
Structure of Doc MET 400 document
• Part 1: transmission format for Extrastat/EFTA detailed data
• Part 2: transmission format for Intrastat detailed data
• Part 3: transmission format for aggregated data
• Annexes
Annexes
• 1 & 2: physical format
• 6 & 7: examples of data files
• 5: summary of fields used in detailed data
• 3: special product codes (chapter 99)
• 4: link to a correlation table between customs procedure codes and statistical procedure (detailed data)
• 8: syntax of Checklist file for detailed data
• 9: syntax to request Embargo for detailed data
• 10: syntax to mark detailed data as test data
• 11: product codes for which Net Mass should be transmitted as zero (detailed data)
• 12: nomenclature for supplementary unit codes (EFTA detailed data only)
Contact persons
• For parts 1 and 2: Mr. XAVIER RUTTEN, +352-4301-34240
• For part 3: Mr. GILBERTO GAMBINI, +352-4301-35806
• For technical matters concerning Gesmes format: estat-support-edamis@ec.europa.eu, +352-4301-33213
General structure
• Logical format = fields (e.g. syntax, nomenclatures)
  • Parts 1 & 2 (detailed data)
  • Part 3 (aggregated data)
• Physical format = how fields are transmitted (e.g. positions, separators)
  • Annex 1 (detailed data)
  • Annex 2 (aggregated data)
Logical format - detailed data – 2015 (1)
(*) I=Intrastat, E=Extrastat, F=eFta
Logical format - detailed data – 2015 (2)
(*) I=Intrastat, E=Extrastat, F=eFta
Logical format - detailed data – 2015 (3)
• All 28 fields must always be present (Intrastat / Extrastat / EFTA)
• Main nomenclatures:
  • Products:
    • Combined Nomenclature / Taric nomenclature
    • Some special codes: chapter 99, Annex 3
  • Countries:
    • Geonomenclature
    • Some special codes:
• Nomenclatures can (and do) change !!!
Physical format
• 3 formats:
  • fixed column width
  • comma delimited
  • Gesmes: mandatory !!!
• “Exceptionally, in case of problems, and in agreement with Eurostat, Member States may temporarily use the "fixed column width" format or the "comma delimited" format”
Gesmes (1)
• 1 row per data record, containing all fields
• order of fields defined by Doc MET 400
• values must be left justified, i.e. must not contain leading blanks or spaces
• trailing blanks at the end of values must be removed
Example:
ARR++201101:1:FR:1:7005108000:US:US:FR:FR:1:100:4:0:US:2::0:::::::5648:0:0::'
ARR++201101:1:FR:1:7005108000:US:US:FR:FR:1:100:4:0:US:3::0:::::::47045:0:0::'
ARR++201101:1:FR:1:7005108000:IL:IL:FR:FR:1:100:4:0:SE:2::0:::::::2632:0:0::'
…
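The record layout above can be checked with a few lines of code. The following is a minimal sketch, not an official tool: `parse_arr` and `format_arr` are hypothetical helper names, the sketch assumes that no field contains the Gesmes release character `?`, and it relies only on what the slides state (the `ARR++` prefix, `:` between fields, a closing `'`, 28 fields, no leading or trailing blanks).

```python
FIELD_COUNT = 28  # "All 28 fields must always be present"
PREFIX = "ARR++"
TERMINATOR = "'"


def parse_arr(segment: str) -> list:
    """Split one ARR record into its 28 field values (as strings)."""
    segment = segment.strip()
    if not (segment.startswith(PREFIX) and segment.endswith(TERMINATOR)):
        raise ValueError("not an ARR record")
    fields = segment[len(PREFIX):-len(TERMINATOR)].split(":")
    if len(fields) != FIELD_COUNT:
        raise ValueError(f"expected {FIELD_COUNT} fields, got {len(fields)}")
    return fields


def format_arr(fields) -> str:
    """Build an ARR record, removing leading and trailing blanks from values."""
    if len(fields) != FIELD_COUNT:
        raise ValueError(f"expected {FIELD_COUNT} fields, got {len(fields)}")
    cleaned = [str(value).strip() for value in fields]
    return PREFIX + ":".join(cleaned) + TERMINATOR


line = "ARR++201101:1:FR:1:7005108000:US:US:FR:FR:1:100:4:0:US:2::0:::::::5648:0:0::'"
fields = parse_arr(line)
product_code = fields[4]        # section 5: real product code
statistical_value = fields[23]  # section 24: statistical value
```

Sections are numbered from 1 in Doc MET 400, so section n sits at list index n - 1.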
Gesmes (2)
Header:
UNA:+.? '
UNB+UNOC:3+FR1+4D0+100221:1448+IREF000001++GESMES'
UNH+MREF000001+GESMES:2:1:E6'
BGM+74'
NAD+Z02+EUROSTAT'
NAD+MR+4D0'
NAD+MS+FR1'
DSI+COMEXT_EXTRA_M'
STS+3+7'
DTM+242:201502211448:203'
DTM+Z02:201501:610'
IDE+5+EUROSTAT_FT'
GIS+AR1'
Gesmes (3)
Footer:
UNT+51+MREF000001'
UNZ+1+IREF000001'
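To illustrate how the header is built up, the sketch below splits it into segments and data elements. It uses the standard EDIFACT reading of the `UNA` service string (component separator, data element separator, decimal mark, release character, reserved blank, segment terminator). The helper names are hypothetical, and for brevity the release character is read but not honoured, so the sketch only suits headers without escaped separators, such as the one above.

```python
HEADER = (
    "UNA:+.? '"
    "UNB+UNOC:3+FR1+4D0+100221:1448+IREF000001++GESMES'"
    "UNH+MREF000001+GESMES:2:1:E6'"
    "BGM+74'"
    "NAD+Z02+EUROSTAT'"
    "NAD+MR+4D0'"
    "NAD+MS+FR1'"
    "DSI+COMEXT_EXTRA_M'"
    "STS+3+7'"
    "DTM+242:201502211448:203'"
    "DTM+Z02:201501:610'"
    "IDE+5+EUROSTAT_FT'"
    "GIS+AR1'"
)


def read_service_string(text: str):
    """Return (component sep, element sep, segment terminator, rest of text)."""
    if text.startswith("UNA"):
        component, element, _decimal, _release, _reserved, terminator = text[3:9]
        return component, element, terminator, text[9:]
    return ":", "+", "'", text  # EDIFACT defaults when UNA is absent


def split_segments(text: str):
    """Split into segments; each segment is a list of elements,
    each element a list of components. Release character is ignored."""
    component, element, terminator, body = read_service_string(text.strip())
    segments = []
    for raw in body.split(terminator):
        raw = raw.strip()
        if raw:
            segments.append([e.split(component) for e in raw.split(element)])
    return segments


def find_segments(segments, tag: str):
    """Return all segments whose tag (first component) equals `tag`."""
    return [s for s in segments if s[0][0] == tag]


segments = split_segments(HEADER)
dataset = find_segments(segments, "DSI")[0][1][0]
date_elements = [s[1] for s in find_segments(segments, "DTM")]
```

A production parser would also have to handle the release character `?`, which escapes separators inside values.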
Physical format - Transmission (sections 1.1 & 2.1)
• Transmission method (sections 1.1 & 2.1):
  • “via the Single Entry Point (SEP) of eDAMIS (Stadium)
  • dataset COMEXT_EXTRA_M (including EFTA) / COMEXT_INTRA_M
  • for both the first transmission of the latest month and revisions”
• Checksum:
  • “The detailed statistics must be accompanied in the same Stadium envelop by a checklist corresponding to the total by flow of the detailed statistics for section 24 (statistical value) and section 25 (quantity expressed in net mass): see Annex 8”
• Deadline:
  • “The EXTRA (including EFTA) / INTRA file should be transmitted no later than 40 / 70 calendar days following the reference month”
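The deadline rule is simple date arithmetic. The sketch below is one possible reading, assuming that the 40 or 70 calendar days are counted from the last day of the reference month; the function name `transmission_deadline` is hypothetical, and the dataset names are those quoted above.

```python
from datetime import date, timedelta

# Calendar days allowed after the reference month, per dataset.
DEADLINE_DAYS = {"COMEXT_EXTRA_M": 40, "COMEXT_INTRA_M": 70}


def transmission_deadline(reference_period: str, dataset: str) -> date:
    """Latest transmission date for a YYYYMM reference period.

    Assumption: counting starts on the last day of the reference month.
    """
    year, month = int(reference_period[:4]), int(reference_period[4:6])
    # First day of the following month, minus one day = last day of the month.
    first_of_next = date(year + month // 12, month % 12 + 1, 1)
    last_day = first_of_next - timedelta(days=1)
    return last_day + timedelta(days=DEADLINE_DAYS[dataset])


extra_deadline = transmission_deadline("201501", "COMEXT_EXTRA_M")
intra_deadline = transmission_deadline("201501", "COMEXT_INTRA_M")
```

Under this reading, January 2015 data would be due by 12 March 2015 (Extrastat, including EFTA) and 11 April 2015 (Intrastat).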
Non confidential data
Threshold - Extrastat
Threshold - Intrastat
Confidential data
Confidential data - concepts
• Levels of confidentiality
  • Eurostat-restricted
  • Commission-restricted
• Type of camouflage
  • camouflage by product
  • camouflage by partner
  • camouflage by product & partner
• Indicators concerned
  • all indicators (value, quantity expressed in net mass, quantity expressed in supplementary units) can be declared confidential independently of one another, by using a distinct flag ("confidentiality flag") for each indicator
  • values for indicators declared confidential will be associated with camouflaged data
  • other values will be associated with real (non-camouflaged) data
Levels of confidentiality: confidentiality flag
• Values:
  • 0: confidentiality is not applied
  • 1: access to confidential data restricted to Eurostat
  • 2: access to confidential data restricted to Eurostat and other services within the Commission
• Adjustments cannot be made confidential
Type of camouflage – camouflage by product
• Section 5: real product code (CN8 or Taric)
• Section 18: “public” product code
  • Same as section 5 if product is not confidential
  • Otherwise HS6, HS4, HS2 or 99900000
• Section 19: “public” SITC product code
  • to be transmitted if the information in section 18 is “poor”, i.e. if HS4, HS2 or 99900000
  • SITC3, SITC2 or SITC1
  • In Extrastat: 999 also allowed if 99900000 in section 18
• Disseminated (=> camouflaged) data:
  • <HS6>SS
  • <HS4>S<SITC3> or <HS4>S<SITC2>S or <HS4>S<SITC1>SS
  • <HS2>SSS<SITC3> or <HS2>SSS<SITC2>S or <HS2>SSS<SITC1>SS
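The disseminated patterns listed above all pad the public codes with the letter S to a length of 8 characters. The sketch below builds that code from the section 18 and section 19 values. The function name `disseminated_product_code` is hypothetical, and the sketch covers only the HS6, HS4 and HS2 patterns shown on the slide; other cases (for example 99900000) are deliberately rejected because the slide does not give their disseminated form.

```python
def disseminated_product_code(public_code: str, public_sitc: str = "") -> str:
    """Build the 8-character camouflaged product code.

    public_code: section 18 value, given as HS6, HS4 or HS2 digits.
    public_sitc: section 19 value (SITC3, SITC2 or SITC1), needed for HS4/HS2.
    """
    if not public_code.isdigit():
        raise ValueError("public product code must be numeric")
    hs_len, sitc_len = len(public_code), len(public_sitc)
    if hs_len == 6:
        return public_code + "SS"                      # <HS6>SS
    if hs_len in (4, 2):
        if not (public_sitc.isdigit() and 1 <= sitc_len <= 3):
            raise ValueError("HS4/HS2 codes need a 1 to 3 digit SITC code")
        hs_padding = "S" if hs_len == 4 else "SSS"     # <HS4>S... or <HS2>SSS...
        return public_code + hs_padding + public_sitc + "S" * (3 - sitc_len)
    raise ValueError("pattern not covered by this sketch")


code = disseminated_product_code("01", "3")
```

With the values used in the camouflage examples of this deck (public CN code 01, public SITC code 3), the pattern <HS2>SSS<SITC1>SS applies.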
Type of camouflage – camouflage by partner
• Possible values for section 20:
• Disseminated (=> camouflaged) data:
• If no confidentiality: real country codes
• If confidentiality:
  • QY (Countries and territories not specified for commercial or military reasons in the framework of intra-Community trade)
  • QZ (Countries and territories not specified for commercial or military reasons in the framework of trade with third countries)
Indicators concerned – confidentiality flags
• Values:
  • 0 = value of indicator is not confidential
  • 1 = value of indicator is confidential
• Example => Quantity is confidential but Value and QSU are not
• Camouflage:
  • values for indicators declared confidential will be associated with camouflaged data
  • values for indicators not declared confidential will be associated with real (non-camouflaged) data
  • => each record is split into 2 records:
    • 1 record with camouflaged data
    • 1 record with real (non-camouflaged) data
Camouflage – example - product confidentiality only (1)
• PRODUCT:
  • real CN code=01031000
  • public CN code=01
  • public SITC code=3
• PARTNER:
  • real partner code=NO
  • public partner code=NO
  • real other partner code=DE
  • public other partner code=DE
• INDICATORS
  • VALUE=140
  • NETMASS=500
  • QSU=111
• INDICATORS CONFIDENTIALITY FLAGS
  • FLAG_VALUE=0
  • FLAG_NETMASS=1
  • FLAG_QSU=0
Camouflage – example - product confidentiality only (2)
• After camouflage: 2 records:
• => it is not possible to link the values of the confidential indicator with the confidential data (real product code):
  • the value of the confidential indicator (net mass) is located in the first record (the one containing the camouflaged product code)
  • the values of the non-confidential indicators (value, qsu) are located in the second record (the one containing the confidential information)
• The sum of the 2 records equals the original record
Camouflage – example - partner confidentiality only (1)
• PRODUCT:
  • real CN code=01031000
  • public CN code=01031000
• PARTNER:
  • real partner code=NO
  • public partner code=QZ
  • real other partner code=DE
  • public other partner code=QY
• INDICATORS
  • VALUE=140
  • NETMASS=500
  • QSU=111
• INDICATORS CONFIDENTIALITY FLAGS
  • FLAG_VALUE=0
  • FLAG_NETMASS=1
  • FLAG_QSU=0
Camouflage – example - partner confidentiality only (2)
• After camouflage: 2 records:
• => it is not possible to link the values of the confidential indicator with the confidential data (real partner and other partner codes):
  • the value of the confidential indicator (net mass) is located in the first record (the one containing the camouflaged partner codes)
  • the values of the non-confidential indicators (value, qsu) are located in the second record (the one containing the confidential information)
• The sum of the 2 records equals the original record
Camouflage – example – all confidentiality flags set (1)
• PRODUCT:
  • real CN code=01031000
  • public CN code=01
  • public SITC code=3
• PARTNER:
  • real partner code=NO
  • public partner code=QZ
  • real other partner code=DE
  • public other partner code=QY
• INDICATORS
  • VALUE=140
  • NETMASS=500
  • QSU=111
• INDICATORS CONFIDENTIALITY FLAGS
  • FLAG_VALUE=1
  • FLAG_NETMASS=1
  • FLAG_QSU=1
Camouflage – example – all confidentiality flags set (2)
• After camouflage: 2 records:
• => it is not possible to link the values of the confidential indicators with the confidential data (real product, partner and other partner codes):
  • the values of the confidential indicators (value, net mass, qsu) are located in the first record (the one containing the camouflaged product and partner codes)
  • no non-confidential indicator values remain for the second record (the one containing the confidential information)
• The sum of the 2 records equals the original record
• !!! The second record need not be generated !!!
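The three camouflage examples follow the same splitting rule, which can be sketched as below. This is an illustration under stated assumptions, not the Doc MET 400 procedure itself: the `Record` type and `split_record` function are hypothetical, records are reduced to the fields used in the examples, and the behaviour when no flag is set (the record is returned unsplit) is an assumption not spelled out on the slides.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Record:
    product: str
    partner: str
    other_partner: str
    value: int
    net_mass: int
    qsu: int


def split_record(real: Record, public_product: str, public_partner: str,
                 public_other_partner: str, flags: Tuple[int, int, int]
                 ) -> Tuple[Optional[Record], Optional[Record]]:
    """Return (camouflaged record, real record) for flags (value, net mass, qsu).

    Confidential indicator values go with the public codes; the other
    values stay with the real codes. The two records sum to the original.
    """
    flag_value, flag_net_mass, flag_qsu = flags
    if not any(flags):
        return None, real  # assumption: nothing confidential, nothing to split
    camouflaged = Record(
        public_product, public_partner, public_other_partner,
        value=real.value if flag_value else 0,
        net_mass=real.net_mass if flag_net_mass else 0,
        qsu=real.qsu if flag_qsu else 0,
    )
    if all(flags):
        return camouflaged, None  # second record need not be generated
    remainder = Record(
        real.product, real.partner, real.other_partner,
        value=0 if flag_value else real.value,
        net_mass=0 if flag_net_mass else real.net_mass,
        qsu=0 if flag_qsu else real.qsu,
    )
    return camouflaged, remainder


original = Record("01031000", "NO", "DE", value=140, net_mass=500, qsu=111)
camouflaged, remainder = split_record(original, "01", "NO", "DE", (0, 1, 0))
```

In the product-confidentiality example, `camouflaged` carries the public code 01 with the net mass of 500, and `remainder` carries the real code 01031000 with value 140 and QSU 111.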
Validation: generalities
• Validation (decide whether or not to load data: boolean) ≠ Quality measurement (assess data quality: real number)
• Validation handbook
  • 4 steps
  • 9 rule types
  • 4 severities
  • each rule is identified by a "Rule ID"
Validation: 4 steps
• Step 1 - Preliminary global data file checks: The purpose of the Step 1 tests is to ensure that no major inconsistency at file level prevents the loading of the file from continuing.
• Step 2 - Records validation checks: This step is performed at record level and checks the validity and consistency of the record data as well as their compliance with Doc MET 400.
• Step 3 - Post record check global validation: Tests in this step are performed at file level after the records have been corrected, to evaluate whether the number of corrections needed to accept the data file is within acceptable limits and therefore whether the file can be loaded into the Comext databases.
• Step 4 - Advanced post validation checks: These checks are performed after the data are loaded into Comext. They are based on more advanced statistical methods, are more time-consuming, and the analysis of their results cannot be automated. Example: detecting outliers in the data.
Validation: 9 Rule types
Validation: 4 Severities
Validation: Rule ID
Rule ID = VV_XX_YZ where
• VV is the variable or step concerned:
  • For rules in Step 1: VV="PRE" (indicating “Preliminary global file checks”)
  • For rules in Step 2: VV=the field concerned (“01” to “28”)
  • For rules in Step 3: VV="POST" (indicating “Post record check global validation”)
  • For rules in Step 4: VV="ADV" (indicating “Advanced post validation checks”)
• XX is the rule number within the section
• Y is the rule type (B, F, X, I…)
• Z is the severity (A, E, W, I)
Validation: Example of rule
Rule ID = 05_02_XE
• “05” indicates the rule is a rule in Step 2 concerning field number 5
• “02” indicates the rule is the second rule in the list of rules in Step 2 concerning field number 5
• “X” indicates the rule type is “X” (“incorrect format”)
• “E” indicates the severity is “E” (“error”)
• rule description: Field Commodity is only 1 digit, which is never allowed. It will be converted to 99CCC000 (or 99CCC00000 for Extrastat import) when the record is related to standard threshold, else it will be converted to 99YYY000 (00)
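The Rule ID convention lends itself to a small parser, sketched below. The names `RuleId` and `parse_rule_id` are hypothetical. The severity letters are checked against the four listed on the Rule ID slide; the rule type letter is left unchecked because the slide gives only a partial list (B, F, X, I…).

```python
from typing import NamedTuple, Optional

SEVERITIES = {"A", "E", "W", "I"}
STEP_PREFIXES = {"PRE": 1, "POST": 3, "ADV": 4}


class RuleId(NamedTuple):
    step: int              # validation step, 1 to 4
    field: Optional[int]   # field number for Step 2 rules, else None
    number: int            # rule number within the section
    rule_type: str         # single letter, not validated here
    severity: str          # one of A, E, W, I


def parse_rule_id(rule_id: str) -> RuleId:
    """Parse a Rule ID of the form VV_XX_YZ."""
    parts = rule_id.split("_")
    if len(parts) != 3 or len(parts[2]) != 2:
        raise ValueError(f"not of the form VV_XX_YZ: {rule_id!r}")
    vv, xx, (rule_type, severity) = parts
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity {severity!r}")
    if vv in STEP_PREFIXES:
        step, field = STEP_PREFIXES[vv], None
    elif vv.isdigit() and 1 <= int(vv) <= 28:
        step, field = 2, int(vv)   # Step 2 rules are keyed by field number
    else:
        raise ValueError(f"unknown variable or step {vv!r}")
    return RuleId(step, field, int(xx), rule_type, severity)


rule = parse_rule_id("05_02_XE")
```

For the example rule, this yields Step 2, field 5, rule number 2, type X and severity E, matching the breakdown on the slide.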
Thank you for your attention !
• Xavier RUTTEN
• Eurostat G5
• Xavier.Rutten@ec.europa.eu
• +352-4301-34240