
XBRL Programming 4. XML DOM

XBRL Programming 4. XML DOM. 20120119, 魏長風, cfwei.tw@gmail.com. XML Data = Tree. Welcome to the TREE world. XML Schema = Tree Data format. W3C XML API. W3C XML API version. XPath 1.0 is not 100% compatible with XPath 2.0; XPath 1.0 and XPath 2.0 differ substantially; XSLT 1.0 and XSLT 2.0 also differ substantially.



Presentation Transcript


  1. XBRL Programming 4. XML DOM 20120119 魏長風 cfwei.tw@gmail.com

  2. XML Data = Tree • Welcome to the TREE world

  3. XML Schema = Tree Data format

  4. W3C XML API

  5. W3C XML API version • XPath 1.0 is not 100% compatible with XPath 2.0 • XPath 1.0 and XPath 2.0 differ substantially • XSLT 1.0 and XSLT 2.0 also differ substantially • Timeline: 1999: XSLT 1.0, XPath 1.0; 2007: XSLT 2.0, XPath 2.0, XQuery 1.0

  6. W3C DOM standard • API for all languages • DOM Level 1 in 1998: for HTML and XML • DOM Level 2 in 2000: getElementById, XML namespaces, CSS • DOM Level 3 in 2004: XPath

  7. W3C DOM – node-tree • Text is stored in Text nodes
      <?xml version="1.0" encoding="utf-8"?>
      <html>
        <body id="001" color="002">
          Hello
          <b>world</b>
          byebye
        </body>
      </html>
     Node tree (nodeType number, node kind, name or value):
      9 Document
        7 PI ?xml
        1 Element html
          1 Element body
            2 Attribute id = 001
            2 Attribute color = 002
            3 Text Hello
            1 Element b
              3 Text world
            3 Text byebye

  8. W3C DOM – NodeTypes
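The node-type table for this slide did not survive extraction. As a small sketch of what the numbers in the tree diagrams refer to, the Java binding exposes them as constants on `org.w3c.dom.Node`. The class and method names below (`NodeTypeNames`, `nameOf`) are made up for illustration and cover only the types that appear in these slides.

```java
import org.w3c.dom.Node;

public class NodeTypeNames {
    // Maps a DOM nodeType number to a readable label (illustrative subset only).
    static String nameOf(short nodeType) {
        switch (nodeType) {
            case Node.ELEMENT_NODE:                return "Element";                // 1
            case Node.ATTRIBUTE_NODE:              return "Attribute";              // 2
            case Node.TEXT_NODE:                   return "Text";                   // 3
            case Node.PROCESSING_INSTRUCTION_NODE: return "ProcessingInstruction";  // 7
            case Node.COMMENT_NODE:                return "Comment";                // 8
            case Node.DOCUMENT_NODE:               return "Document";               // 9
            default:                               return "Other(" + nodeType + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(nameOf((short) 1)); // prints "Element"
        System.out.println(nameOf((short) 9)); // prints "Document"
    }
}
```

Comparing against the named constants rather than literal numbers such as `1` tends to make traversal code easier to read.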

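  9. W3C DOM - traverse tree • Diagram of Element body with its attributes (id 001, color 002) and its children (Text Hello, Element b containing Text World, Text byebye), labelled with the navigation links firstChild, lastChild, nextSibling, previousSibling, and parent; attributes are reached separately from child nodes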
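The navigation links named on this slide can be tried out with a few lines of Java. This is only a sketch: the XML is the body element from slide 7 written without any whitespace between tags, and the names `TraverseDemo`, `parse`, and `childNames` are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class TraverseDemo {
    // Parses an XML string; checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Walks the children left to right with firstChild/nextSibling and joins their names.
    static String childNames(Node parent) {
        StringBuilder sb = new StringBuilder();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (sb.length() > 0) sb.append(',');
            sb.append(c.getNodeName());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<html><body id=\"001\" color=\"002\">Hello<b>world</b>byebye</body></html>");
        Element body = (Element) doc.getDocumentElement().getFirstChild();
        System.out.println(childNames(body));                                        // #text,b,#text
        System.out.println(body.getLastChild().getPreviousSibling().getNodeName());  // b
        System.out.println(body.getParentNode().getNodeName());                      // html
        System.out.println(body.getAttributes().getLength());                        // 2
    }
}
```

Note that the attributes are not part of the child list; they are reached through `getAttributes()`.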

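  10. Space/Enter text node problem • Spaces and line breaks between elements become Text nodes
      <?xml version="1.0" encoding="utf-8"?>
      <html>
        <body>
          Hello
          <b>world</b>
          byebye
        </body>
      </html>
     Nodes shown in the slide diagram (nodeType number, node kind, name or value), in the order they appear on the slide:
      9 Document
      7 PI ?xml
      3 Text "\n "
      1 Element html
      1 Element body
      2 id 001
      2 color 002
      3 Text "\n Hello"
      1 Element b
      3 Text world
      3 Text "\n "
      3 Text "byebye\n "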
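A quick way to see the problem is to compare the type of the root element's first child for a compact document and an indented one. The sketch below assumes a default, non-validating `DocumentBuilderFactory`; `WhitespaceNodes` and `firstChildType` are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class WhitespaceNodes {
    // Returns the nodeType of the first child of the root element.
    static short firstChildType(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getFirstChild().getNodeType();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String compact  = "<html><body>Hello</body></html>";
        String indented = "<html>\n  <body>Hello</body>\n</html>";
        // Without whitespace the first child is the <body> element.
        System.out.println(firstChildType(compact) == Node.ELEMENT_NODE);  // true
        // With indentation the first child is the "\n  " Text node.
        System.out.println(firstChildType(indented) == Node.TEXT_NODE);    // true
    }
}
```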

  11. Java DOM - Load XML
      import javax.xml.parsers.*;
      import org.w3c.dom.*;

      // Parse the file xbrl.xml into a DOM Document
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      DocumentBuilder builder = factory.newDocumentBuilder();
      Document doc = builder.parse("xbrl.xml");

  12. C# .Net DOM - Load XML
      using System;
      using System.Xml;

      // Load the file xbrl.xml into an XmlDocument
      XmlDocument doc = new XmlDocument();
      doc.Load("xbrl.xml");

  13. example 01 xbrl.xml • Only one context, and it appears right at the start • Callout on the slide: the data we want to retrieve

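  14. Java DOM – example 01a
      Node xbrl = doc.getDocumentElement();
      Node context = xbrl.getFirstChild();
      Node entity = context.getFirstChild();
      Element identifier = (Element) entity.getFirstChild();
      System.out.println( identifier.getAttribute("scheme") );
      // This produces no result!! When xbrl.xml is indented, each getFirstChild()
      // returns a whitespace Text node instead of the expected element.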
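To make the failure concrete, the sketch below runs the same four-step walk on a compact and an indented document. The sample XML is a made-up, namespace-free stand-in for xbrl.xml (which is not reproduced in the transcript), and `NaiveWalk` and `tryNaiveWalk` are illustrative names. With the indented input, `context` is a Text node, so `context.getFirstChild()` is null and the next call fails with a NullPointerException.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NaiveWalk {
    // Hypothetical sample data, not the real xbrl.xml from the slides.
    static final String COMPACT =
        "<xbrl><context id=\"c1\"><entity>"
      + "<identifier scheme=\"http://example.com/scheme\">1234</identifier>"
      + "</entity></context></xbrl>";
    static final String INDENTED =
        "<xbrl>\n  <context id=\"c1\">\n    <entity>\n"
      + "      <identifier scheme=\"http://example.com/scheme\">1234</identifier>\n"
      + "    </entity>\n  </context>\n</xbrl>";

    // Repeats the walk from example 01a and reports how it ends.
    static String tryNaiveWalk(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        try {
            Node xbrl = doc.getDocumentElement();
            Node context = xbrl.getFirstChild();   // may be a whitespace Text node
            Node entity = context.getFirstChild(); // null if context is a Text node
            Element identifier = (Element) entity.getFirstChild();
            return identifier.getAttribute("scheme");
        } catch (RuntimeException e) {
            return "failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryNaiveWalk(COMPACT));  // http://example.com/scheme
        System.out.println(tryNaiveWalk(INDENTED)); // failed: NullPointerException
    }
}
```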

  15. Java DOM – example 01b
      Node xbrl = doc.getDocumentElement();
      Node context = xbrl.getFirstChild();
      Node entity = context.getFirstChild();
      Element identifier = (Element) entity.getFirstChild();
      System.out.println( identifier.getAttribute("scheme") );

  16. Java DOM – example 01c
      Element identifier = (Element) doc.getDocumentElement()
                                        .getFirstChild()
                                        .getFirstChild()
                                        .getFirstChild();
      System.out.println( identifier.getAttribute("scheme") );

  17. Java DOM – example 01d
      // Like firstChild, but automatically skips Space/Enter text nodes
      // (and any other non-element nodes) and returns the first child element.
      static Element getFirstChildElement( Node n ) {
          Node child = n.getFirstChild();
          while ( child != null ) {
              if ( child.getNodeType() == Node.ELEMENT_NODE )
                  return (Element) child;
              child = child.getNextSibling();
          }
          return null; // no child element found
      }

  18. Java DOM – example 01d
      Element xbrl = getFirstChildElement(doc);
      Element context = getFirstChildElement(xbrl);
      Element entity = getFirstChildElement(context);
      Element identifier = getFirstChildElement(entity);
      System.out.println( identifier.getAttribute("scheme") );
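Putting the helper and the walk together gives a self-contained program that can be run without the original file. The inline XML is an assumed, simplified stand-in for xbrl.xml (no namespaces, example.com scheme), and `FirstChildElementDemo` and `readScheme` are names chosen for this sketch.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class FirstChildElementDemo {
    // Hypothetical indented sample, standing in for xbrl.xml.
    static final String SAMPLE =
        "<xbrl>\n"
      + "  <context id=\"c1\">\n"
      + "    <entity>\n"
      + "      <identifier scheme=\"http://example.com/scheme\">1234</identifier>\n"
      + "    </entity>\n"
      + "  </context>\n"
      + "</xbrl>";

    // Returns the first child that is an element, skipping whitespace Text nodes.
    static Element getFirstChildElement(Node n) {
        for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) return (Element) c;
        }
        return null;
    }

    // Walks xbrl -> context -> entity -> identifier and reads the scheme attribute.
    static String readScheme(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element xbrl = getFirstChildElement(doc);
            Element context = getFirstChildElement(xbrl);
            Element entity = getFirstChildElement(context);
            Element identifier = getFirstChildElement(entity);
            return identifier.getAttribute("scheme");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readScheme(SAMPLE)); // http://example.com/scheme
    }
}
```

The walk still assumes a fixed document shape; if any level has no child element, the helper returns null and the next step fails.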

  19. C# .Net DOM – example 01c
      XmlElement identifier = (XmlElement) doc.DocumentElement.FirstChild.FirstChild.FirstChild;
      Console.WriteLine( identifier.GetAttribute("scheme") );
      // .Net does not need to handle the newline text node problem
      // (XmlDocument.PreserveWhitespace is false by default)

  20. XPath – What is path? • /xbrl/context/entity/identifier/@scheme • Callout on the slide: the data we want to retrieve

  21. W3C DOM standard • API for all languages • DOM Level 1 in 1998: for HTML and XML • DOM Level 2 in 2000: getElementById, XML namespaces, CSS • DOM Level 3 in 2004: XPath

  22. Java XPath – example 03
      import javax.xml.xpath.*;

      // Evaluate the path against the loaded Document and return a string
      XPath xpath = XPathFactory.newInstance().newXPath();
      String expr = "/xbrl/context/entity/identifier/@scheme";
      String scheme = (String) xpath.evaluate(expr, doc, XPathConstants.STRING);
      System.out.println("scheme= " + scheme );
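A runnable version of this example, using an in-memory document instead of xbrl.xml, might look like the sketch below. The sample XML deliberately has no namespaces so that the unprefixed path matches; a real XBRL instance puts these elements in the `http://www.xbrl.org/2003/instance` namespace, in which case the Java code would also need a namespace-aware parser and a `NamespaceContext`, similar to what the C# example does with `XmlNamespaceManager`. `XPathScheme` and `evalString` are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathScheme {
    // Hypothetical, namespace-free sample standing in for xbrl.xml.
    static final String SAMPLE =
        "<xbrl>\n"
      + "  <context id=\"c1\">\n"
      + "    <entity>\n"
      + "      <identifier scheme=\"http://example.com/scheme\">1234</identifier>\n"
      + "    </entity>\n"
      + "  </context>\n"
      + "</xbrl>";

    // Parses the XML and evaluates an XPath expression to a string.
    static String evalString(String xml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, doc); // string result; "" if nothing matches
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String scheme = evalString(SAMPLE, "/xbrl/context/entity/identifier/@scheme");
        System.out.println("scheme= " + scheme); // scheme= http://example.com/scheme
    }
}
```

Unlike the firstChild walk, the indentation in the sample needs no special handling here, because the path steps select elements by name.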

  23. C# .Net XPath – example 03
      // Register the xbrli prefix so the path can match namespaced elements
      XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
      nsmgr.AddNamespace("xbrli", "http://www.xbrl.org/2003/instance");
      XmlNodeList schemeList = doc.SelectNodes(
          "/xbrli:xbrl/xbrli:context/xbrli:entity/xbrli:identifier/@scheme", nsmgr);
      string scheme = schemeList[0].Value;
      Console.WriteLine("scheme= " + scheme );

  24. example 04 xbrl3.xml • Multiple contexts with different period types • Callout on the slide: the data we want to retrieve

  25. Java XPath – example 04
      XPath xpath = XPathFactory.newInstance().newXPath();
      NodeList contextSet = (NodeList) xpath.evaluate("/xbrl/context", doc, XPathConstants.NODESET);
      for ( int i = 0; i < contextSet.getLength(); i++ ) {
          Node context = contextSet.item(i);
          // Relative paths are evaluated against the current context element
          String contextid = (String) xpath.evaluate("@id", context, XPathConstants.STRING);
          String datevalue1 = (String) xpath.evaluate("period/endDate", context, XPathConstants.STRING);
          String datevalue2 = (String) xpath.evaluate("period/instant", context, XPathConstants.STRING);
          // A missing node evaluates to the empty string
          if ( !datevalue1.isEmpty() )
              System.out.println(contextid + "= " + datevalue1 );
          if ( !datevalue2.isEmpty() )
              System.out.println(contextid + "= " + datevalue2 );
      }
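The loop above can be exercised on a small in-memory document. The sketch below is a variation, not the slide's exact code: it collects the results into a map and uses the XPath 1.0 union `period/endDate | period/instant` to pick whichever date is present. The sample XML is an invented, namespace-free stand-in for xbrl3.xml, and `XPathContexts` and `contextDates` are hypothetical names.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathContexts {
    // Hypothetical sample: one duration context and one instant context.
    static final String SAMPLE =
        "<xbrl>\n"
      + "  <context id=\"d1\"><period>\n"
      + "    <startDate>2011-01-01</startDate><endDate>2011-12-31</endDate>\n"
      + "  </period></context>\n"
      + "  <context id=\"i1\"><period>\n"
      + "    <instant>2011-12-31</instant>\n"
      + "  </period></context>\n"
      + "</xbrl>";

    // Maps each context id to its endDate or instant, in document order.
    static Map<String, String> contextDates(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList contexts =
                (NodeList) xpath.evaluate("/xbrl/context", doc, XPathConstants.NODESET);
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < contexts.getLength(); i++) {
                Node context = contexts.item(i);
                String id = xpath.evaluate("@id", context);
                // The union selects whichever of the two elements exists.
                String date = xpath.evaluate("period/endDate | period/instant", context);
                if (!date.isEmpty()) result.put(id, date);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        contextDates(SAMPLE).forEach((id, date) -> System.out.println(id + "= " + date));
    }
}
```

The union keeps the loop body to a single lookup, at the cost of no longer distinguishing which period type each context had; the two separate queries in the slide preserve that distinction.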

  26. C# .Net XPath – example 04
      // nsmgr is the XmlNamespaceManager from example 03;
      // MessageBox requires System.Windows.Forms
      XmlNodeList contextSet = doc.SelectNodes("/xbrli:xbrl/xbrli:context", nsmgr);
      foreach (XmlNode context in contextSet)
      {
          string contextid = context.SelectNodes("@id")[0].Value;
          XmlNodeList datevalue1list = context.SelectNodes("xbrli:period/xbrli:endDate", nsmgr);
          XmlNodeList datevalue2list = context.SelectNodes("xbrli:period/xbrli:instant", nsmgr);
          if (datevalue1list.Count > 0)
          {
              string datevalue1 = datevalue1list[0].FirstChild.Value;
              if (datevalue1.CompareTo("") != 0)
                  MessageBox.Show(contextid + "= " + datevalue1);
          }
          if (datevalue2list.Count > 0)
          {
              string datevalue2 = datevalue2list[0].FirstChild.Value;
              if (datevalue2.CompareTo("") != 0)
                  MessageBox.Show(contextid + "= " + datevalue2);
          }
      }
