Learn how to automate Internet Explorer using VB6 through COM, navigate web pages, enumerate open IE windows, capture events, handle HTML DOM and more.
Agenda
• Automating IE
• HTML DOM
• Automating Web Pages
• Questions and Answers
COM
• COM stands for Component Object Model.
• An application built on COM exposes interfaces that third-party applications can use to drive its features.
• Nearly all Microsoft applications are built using COM, e.g. Internet Explorer and the Office applications such as Outlook, Excel and Word.
• COM makes it possible to create an automation solution for the AUT (application under test).
Using a COM object in VB
• There are two ways to create a COM object in VB:
  • By referencing the ActiveX library in the project (early binding)
  • By creating the object from the ProgID of the COM application (late binding)
• By referencing the ActiveX library in the project:
    Dim IE As InternetExplorer
    Set IE = New InternetExplorer
    IE.Visible = True
    IE.Quit
    Set IE = Nothing
• By creating the object from the ProgID of the COM application:
    Dim IE As Object
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.Quit
    Set IE = Nothing
Adding a reference to IE in VB
• Go to Project->References…, check the box for "Microsoft Internet Controls" and click OK. The IE automation types (InternetExplorer, ShellWindows and so on) are then available in the project.
Browsing to a Website and Waiting for the Page to Load
    Dim IE
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.Navigate2 "http://www.yahoo.com"
    ' Poll until IE is idle and the document is fully loaded (ReadyState 4 = complete)
    Do
        DoEvents
    Loop While IE.Busy = True Or IE.ReadyState <> 4
    MsgBox "Page Loaded. Press OK to close the opened IE"
    IE.Quit
    Set IE = Nothing
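The wait loop above never gives up if a page hangs. A minimal sketch of a reusable wait with a timeout follows. The name WaitForPage, its parameters and the constant IE_READY_COMPLETE are hypothetical and not part of the original slides; the sketch assumes a late-bound IE object and ignores the case where the wait spans midnight.

```vb
' Hypothetical helper (not from the original slides).
' Waits until IE is idle and the document is complete, or until
' TimeoutSeconds have elapsed. Returns True if the page loaded in time.
Public Function WaitForPage(ByVal IE As Object, ByVal TimeoutSeconds As Long) As Boolean
    Const IE_READY_COMPLETE = 4     ' ReadyState value for a fully loaded document
    Dim StartTime As Single
    StartTime = Timer               ' seconds elapsed since midnight

    Do While IE.Busy Or IE.ReadyState <> IE_READY_COMPLETE
        DoEvents                    ' keep the application responsive while polling
        ' Assumption: the wait does not cross midnight, where Timer resets to 0
        If Timer - StartTime > TimeoutSeconds Then
            WaitForPage = False
            Exit Function
        End If
    Loop
    WaitForPage = True
End Function

' Possible usage:
'   IE.Navigate2 "http://www.yahoo.com"
'   If Not WaitForPage(IE, 30) Then MsgBox "Page did not load within 30 seconds"
```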
Enumerating all open IE – Method 1
    Dim IEWindows As SHDocVw.ShellWindows
    Set IEWindows = New SHDocVw.ShellWindows
    Dim IEWindow
    For Each IEWindow In IEWindows
        Debug.Print "Name: " & IEWindow.Name
        Debug.Print "Visible: " & IEWindow.Visible
        Debug.Print "FullName: " & IEWindow.FullName
        Debug.Print "LocationURL: " & IEWindow.LocationURL
        Debug.Print "LocationName: " & IEWindow.LocationName
    Next
    ' The same collection can also be walked by zero-based index
    Dim i
    For i = 0 To IEWindows.Count - 1
        Set IEWindow = IEWindows.Item(i)
    Next
Enumerating all open IE – Method 2
    Dim objShell
    Dim objShellWindows
    Set objShell = CreateObject("Shell.Application")
    Set objShellWindows = objShell.Windows
    If (Not objShellWindows Is Nothing) Then
        Dim objEnumItems
        For Each objEnumItems In objShellWindows
            MsgBox objEnumItems.LocationURL
        Next
    End If
    Set objShellWindows = Nothing
    Set objShell = Nothing
Closing all open IE
    Dim IEWindows As SHDocVw.ShellWindows
    Set IEWindows = New SHDocVw.ShellWindows
    Dim IEWindow()
    ReDim IEWindow(1 To IEWindows.Count)
    Dim i
    ' Copy the references first, because the collection changes as windows close
    For i = 1 To IEWindows.Count
        Set IEWindow(i) = IEWindows.Item(i - 1)
    Next
    For i = LBound(IEWindow) To UBound(IEWindow)
        ' Quit only windows hosted by iexplore.exe
        If InStr(1, IEWindow(i).FullName, "iexplore.exe", vbTextCompare) > 0 Then
            IEWindow(i).Quit
        End If
        Set IEWindow(i) = Nothing
    Next
Getting a reference to an already open IE
    Public Function GetOpenIE(ByVal strUrl As String) As InternetExplorer
        Dim shie As InternetExplorer
        Dim sh As New ShellWindows
        For Each shie In sh
            ' strUrl is a Like pattern, e.g. "*yahoo.com*"
            If shie.LocationURL Like strUrl Then
                Set GetOpenIE = shie
                Exit Function
            End If
        Next
        Set GetOpenIE = Nothing
    End Function
Events in IE
• An event is a notification that the state of the system has changed, e.g. a navigation has completed.
• We can capture events in IE using the below code:
    Private WithEvents IE As InternetExplorer
• After adding the above line to a form or class module, the IE variable appears in the Object combo box of the code window and all its events appear in the Procedure (events) combo box.
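As a rough sketch of what an event handler can look like, the code below handles DocumentComplete in a form module. It assumes the project references "Microsoft Internet Controls"; the Form_Load wiring and the URL are illustrative choices, not taken from the original slides.

```vb
' Place in a Form module; requires a reference to "Microsoft Internet Controls".
Private WithEvents IE As InternetExplorer

Private Sub Form_Load()
    ' Illustrative setup: create the browser and start a navigation
    Set IE = New InternetExplorer
    IE.Visible = True
    IE.Navigate2 "http://www.yahoo.com"
End Sub

' DocumentComplete fires once per frame and once for the top-level document.
Private Sub IE_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    ' pDisp refers to the browser object itself only for the top-level document
    If pDisp Is IE Then
        Debug.Print "Top-level document loaded: " & URL
    End If
End Sub
```

Reacting to DocumentComplete avoids polling IE.Busy in a loop, at the cost of requiring early binding and a form or class module to host the WithEvents variable.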
DOM
• The HTML DOM is the Document Object Model for HTML.
• The HTML DOM defines a standard set of objects for HTML, and a standard way to access and manipulate HTML documents.
• The HTML DOM is platform and language independent.
• The HTML DOM views an HTML document as a tree of elements. All elements, along with their text and attributes, can be accessed and manipulated through the DOM tree.
DOM Tree
• Every TAG in the HTML source represents a node in the DOM tree.
• Once a TAG is opened, all the tags that follow it until it is closed become child nodes of that node.
• Each TAG can have various attributes. Some are predefined and some are user-defined, e.g.
    <INPUT type="text" value="Name" name="txt_Name" myval="Test">
• Here the predefined attributes are type, value and name.
• myval is a user-defined attribute for the INPUT tag.
DOM Tree Contd…
    <html>
      <head>
        <script type="text/javascript">
          function ChangeColor() {
            document.body.bgColor="yellow"
          }
        </script>
      </head>
      <body onclick="ChangeColor()">Click on this document!</body>
    </html>
Document
• The Document object represents the whole document.
• It is the top node in the DOM tree.
• The Document node doesn't have any sibling nodes.
• It provides collections for the links, anchors, scripts, images etc. in the document.
• It also provides functions for looking up elements by id, name or tag name.
Element & Element Collection
• An Element is an object referring to a particular node in the DOM.
• Depending on the type of node the element refers to, it gives access to the methods and properties of that type of element.
• Every element has the properties outerText, outerHTML, innerText, innerHTML, tagName etc.
• An Element Collection is a collection of one or more elements. For example:
    <INPUT name="txt_Name" type="text">
    <INPUT name="txt_Name" type="text">
  Now see the following VBScript code:
    Set txt_Boxes = document.getElementsByName("txt_Name")
    For i = 0 To txt_Boxes.Length - 1
        txt_Boxes.item(i).value = "Tarun"   ' explicit item() call
        txt_Boxes(i).value = "Tarun"        ' equivalent shorthand
    Next
How to get an element from the web page
    <INPUT name="txt_Name" id="firstname" type="text" value="Tarun">
Various ways to get this element are:
    ' Mainly used at the time of IE4. Compatible with higher versions but not recommended
    Set txt_Elem = Document.All("firstname")
    Set txt_Elem = Document.All("txt_Name")
    ' Used with IE 5.0 and later
    Set txt_Elem = Document.getElementById("firstname")
    ' How to check whether the element is present or not
    If txt_Elem Is Nothing Then MsgBox "Element is not present"
How to get an element from the web page contd…
    Set txt_Elem = Document.getElementsByTagName("INPUT").item(0)
    Set txt_Elem = Document.getElementsByTagName("INPUT").item("txt_Name")
    Set txt_Elem = Document.getElementsByTagName("INPUT").item("firstname")
    Set txt_Elem = Document.getElementsByName("txt_Name").item(0)
The item(0) lines above would throw an error if the page has no INPUT tag or no element named "txt_Name". To avoid this we can first check the length of the collection:
    If Document.getElementsByName("txt_Name").length <> 0 Then
        Set txt_Elem = Document.getElementsByName("txt_Name").item(0)
    End If
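The length check can be wrapped once so callers only have to test for Nothing. The following is a minimal sketch; the function name GetFirstByName and its parameters are hypothetical, and Doc is assumed to be a late-bound HTML document such as IE.document.

```vb
' Hypothetical helper (not from the original slides).
' Returns the first element with the given name, or Nothing if none exists.
Public Function GetFirstByName(ByVal Doc As Object, ByVal ElemName As String) As Object
    Dim Elems As Object
    Set Elems = Doc.getElementsByName(ElemName)
    If Elems.Length > 0 Then
        Set GetFirstByName = Elems.Item(0)
    Else
        Set GetFirstByName = Nothing
    End If
End Function

' Possible usage:
'   Set txt_Elem = GetFirstByName(IE.document, "txt_Name")
'   If txt_Elem Is Nothing Then MsgBox "Element is not present"
```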
LINK or BUTTON
    <A href=http://www.microsoft.com id=mslinkid name=mssoft>Microsoft</A>
    <INPUT type="button" id=mslinkid name=mssoft value="Click">
Various ways to click on this link or button:
    ' The Links collection holds only A and AREA elements, so these two work for the link only
    document.Links("mslinkid").click
    document.Links("mssoft").click
    ' These work for both the link and the button
    document.getElementById("mslinkid").click
    document.getElementsByName("mssoft")(0).click
    document.all("mslinkid").click
    document.all("mssoft").click
Text Box
    <INPUT myprop=test name="name" id="firstname" type="text" value="initial">
Various ways of changing the value of the text box:
    document.getElementById("firstname").value = "Tarun"
    document.getElementsByName("name")(0).value = "Tarun"
If neither name nor id is available, loop over the INPUT elements and match on another attribute:
    Set allElems = document.getElementsByTagName("INPUT")
    For Each elem In allElems
        ' getAttribute returns Null when the attribute is absent, so elements without myprop are skipped
        If elem.getAttribute("myprop") = "test" Then
            elem.value = "Tarun"
            Exit For
        End If
    Next
Combo box or List Box
A combo box or list box has an array of options from which the user can select.
    <SELECT size="1" name="demo_ComboBox">
      <option value="Actual Value 1">Displayed Value 1</option>
      <option value="Value 2">Value 2</option>
      <option value="Value 3">Value 4</option>
    </SELECT>
    Set objCombo = document.getElementsByName("demo_ComboBox").item(0)
    numOptions = objCombo.Options.length             ' Would give 3 in our case
    firstOptionValue = objCombo.Options(0).value     ' "Actual Value 1" in our case
    firstOptionText = objCombo.Options(0).text       ' "Displayed Value 1" in our case
To select one of the options, use either of the lines below:
    objCombo.Options(0).Selected = True
    objCombo.value = "Actual Value 1"
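Test data is often known by the displayed text rather than by the underlying value. A minimal sketch of selecting by displayed text follows; the name SelectOptionByText is hypothetical, and objCombo is assumed to be a late-bound SELECT element obtained as shown above.

```vb
' Hypothetical helper (not from the original slides).
' Selects the first option whose displayed text equals DisplayText.
' Returns True if such an option was found.
Public Function SelectOptionByText(ByVal objCombo As Object, ByVal DisplayText As String) As Boolean
    Dim i As Long
    For i = 0 To objCombo.Options.Length - 1
        If objCombo.Options(i).Text = DisplayText Then
            objCombo.selectedIndex = i      ' zero-based index of the chosen option
            SelectOptionByText = True
            Exit Function
        End If
    Next
    SelectOptionByText = False
End Function

' Possible usage:
'   If Not SelectOptionByText(objCombo, "Displayed Value 1") Then MsgBox "Option not found"
```

Note that changing the selection from script does not run the page's onchange handler, so pages that react to a change may need that handler triggered separately.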
Checkbox
A checkbox can be either checked or unchecked.
    <input type="checkbox" name="demo_CheckBox">
    Set objChkBox = document.getElementsByName("demo_CheckBox").item(0)
    objChkBox.Checked = True
Radio button
A radio button group is an array of options of which only one can be selected. To group radio buttons, the elements are given the same name.
    <input type="radio" name="Sex" value="male" checked="checked" />
    <input type="radio" name="Sex" value="female" />
To select a radio button, set the checked property of the element that represents it:
    Set objRadio = document.getElementsByName("Sex").item(0)
    objRadio.checked = True    ' Will select male
    ' Assigning objRadio.value only changes the value attribute of that button; it does not move the selection.
    ' To select female, check the second button in the group:
    document.getElementsByName("Sex").item(1).checked = True
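When the position of a button within the group is not known in advance, the group can be searched by value. The sketch below shows one way to do it; the name SelectRadioByValue and its parameters are hypothetical, and Doc is assumed to be a late-bound HTML document such as IE.document.

```vb
' Hypothetical helper (not from the original slides).
' Checks the radio button in group GroupName whose value attribute equals
' Value. Returns True if a matching button was found.
Public Function SelectRadioByValue(ByVal Doc As Object, ByVal GroupName As String, ByVal Value As String) As Boolean
    Dim Radios As Object
    Dim i As Long
    Set Radios = Doc.getElementsByName(GroupName)
    For i = 0 To Radios.Length - 1
        If Radios.Item(i).Value = Value Then
            Radios.Item(i).Checked = True   ' the browser unchecks the other buttons in the group
            SelectRadioByValue = True
            Exit Function
        End If
    Next
    SelectRadioByValue = False
End Function

' Possible usage:
'   SelectRadioByValue IE.document, "Sex", "female"
```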
HTML Table
    <table id="myTable" border="1">
      <tr>
        <td>Row1 cell1</td>
        <td>Row1 cell2</td>
      </tr>
      <tr>
        <td>Row2 cell1</td>
        <td>Row2 cell2</td>
      </tr>
    </table>
The Table object provides two collections:
• cells – gives access to all the cells present in the table
• rows – gives access to all the rows present in the table. Each row also provides a cells collection to access the cells present in that row.
HTML Table Contd.
To access the 1st row and 1st column, use the below code:
    Set objTable = document.getElementById("myTable")
    objTable.rows(0).cells(0).outerText
    objTable.cells(0).outerText
To access the 2nd row and 1st column, use the below code:
    objTable.rows(1).cells(0).outerText
    objTable.cells(2).outerText    ' Indexes 0 and 1 belong to the 1st row, so index 2 is the first cell of the 2nd row
• objTable.rows.length will give the total # of rows in the table
• objTable.cells.length will give the total # of cells in the table
• objTable.rows(0).cells.length will give the total # of cells in the 1st row of the table
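Combining the rows and cells collections gives a simple way to walk a whole table. The following is a minimal sketch; the name DumpTable is hypothetical, and objTable is assumed to be a late-bound TABLE element obtained as shown above.

```vb
' Hypothetical helper (not from the original slides).
' Prints the text of every cell in the table to the Immediate window.
Public Sub DumpTable(ByVal objTable As Object)
    Dim r As Long
    Dim c As Long
    For r = 0 To objTable.rows.Length - 1
        ' Each row has its own cells collection, so rows of different widths are handled
        For c = 0 To objTable.rows(r).cells.Length - 1
            Debug.Print "Row " & r & ", cell " & c & ": " & objTable.rows(r).cells(c).innerText
        Next
    Next
End Sub

' Possible usage:
'   DumpTable IE.document.getElementById("myTable")
```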
Automating web pages
• To automate web pages we only need to combine the techniques discussed in the previous 2 sections.
• Not all pages can be automated, due to limitations of the DOM discussed in later slides.
• The automation is applicable to normal HTML/DHTML web content. The technique does not work on Java based web sites.
• The important task in automating a web page is identifying the element that needs to be worked upon. This can be one of the most challenging tasks, as developers usually do not build pages with automation needs in mind.
Determining the DOM path of an Element
• The DOM path is the way we can reach a particular element on the web page.
• There are two ways to determine it:
  • Do a View Source on the web page and try to locate the code corresponding to that web element. This technique is sometimes tedious and needs experience to determine the possible access method.
  • Use a DOM viewer tool to view the entire DOM structure and locate the DOM node for the element. "IE DOM Inspector" is a good shareware tool for viewing the DOM of a web page in IE. It is available at www.inspectorsoft.com.
Demo • www.yahoomail.com • www.google.com • www.gmail.com • www.orkut.com (DOM limitation) • www.rediffmail.com • On the spot web pages.
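Searching on google.com
    Dim IE
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.Navigate2 "http://www.google.com"
    ' Poll until IE is idle and the document is fully loaded (ReadyState 4 = complete)
    Do
        DoEvents
    Loop While IE.Busy = True Or IE.ReadyState <> 4
    IE.document.getElementsByName("q")(0).value = "Automation"
    IE.document.getElementsByName("btnG")(0).click
    Set IE = Nothing    ' Releases the reference; the browser window stays open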
Click on a link for a given href
    Function ClickOnLink(ByVal objIE As InternetExplorer, ByVal strHref As String) As String
        Dim Link
        For Each Link In objIE.Document.links
            ' strHref is the Like pattern, so it goes on the right-hand side
            If Link.href Like strHref Then
                ClickOnLink = Link.href
                Link.Click
                Exit Function
            End If
        Next
        ClickOnLink = ""
    End Function
Limitation of DOM
• When a web page has a frame from a different domain than the web page itself, access to the DOM of the frame is denied.
• For example, if a web page on www.gmail.com loads a frame that is hosted on www.google.com, we can access the DOM of www.gmail.com but not the DOM of the frame.
• Microsoft calls this cross-domain security. It prevents the author of one web site from modifying the content of another's web site through frames.
• The limitation can be overcome only if both web pages declare the same domain, i.e. both must contain the below line:
    document.domain="somedomain.com"
• For more details on cross-domain security refer to the below URL: http://support.microsoft.com/kb/167796/EN-US/