
Building Intelligent Web Agents with CFML Michael Dinowitz November, 2000


Presentation Transcript


  1. Building Intelligent Web Agents with CFML, Michael Dinowitz, November 2000

  2. Intelligent Agents in ColdFusion • What are Agents? • Code that does automatic work for you • Involves retrieving information • Processing or storing that information • Usually a single page or has no interface • What are Intelligent Agents (IA)? • Term used for a specific class of agents • Retrieves remote information • Processes the retrieved information • Decision-making code built in • Usually involves parsing operations • Interfaces with remote processes

  3. Intelligent Agents in ColdFusion • What aren’t Intelligent Agents? • Push of any sort (CFMAIL) • Calls to structured locations • DBs • LDAP • Browsers • Grey Areas - Structured data • Syndicated data (Spectra) • HTTP query returns • Comma delimited information • Most local information calls

  4. Intelligent Agents in ColdFusion • Broad examples • CF_StockGrabber - grabs and processes stock information • CF_UPS - interface to UPS shipping data • CF_MetaSearch - searches multiple search engines and collates results • CF_GetTags

  5. Intelligent Agents in ColdFusion • Technologies used for retrieval • CFHTTP - retrieve websites • CFFTP - retrieves ftp information • CFX_Socket - socket calls for information • CFX_NNTP - retrieves usenet news • Technologies used for parsing • Find() / FindNoCase () • Replace() / ReplaceNoCase () • Mid() • REFind() / REFindNoCase () • REReplace() / REReplaceNoCase()

  6. IA technique I - CF_EbayItem • 1. Define what you want • A page from ebay with the results of a search • 2. Define how it will be displayed • Whole page returned in a variable. No parsing • 3. Define the steps to get it • CFHTTP to retrieve a page • Place information in file or on browser

  7. CFHTTP Basics • <CFHTTP • Url - Url to retrieve. Does not need http:// prefix • Method - Get or Post. • ResolveUrl - Turns all relative links into ‘full’ ones. Needed for graphics and links from the page. • Notes: • The URL does not need to be prefixed by http://, but it’s good practice to do so. • Get is standard and uses the tag ‘as is’. Post requires a CFHTTPPARAM as well as a closing CFHTTP tag. • ResolveUrl should only be used when you expect to follow links from the called page or want to see the media content.
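
A minimal sketch of the two call styles described in the notes above. The URLs and the form field name "query" are placeholders chosen for illustration; they are not from the original presentation.

```cfml
<!--- GET: the tag is used 'as is', with no closing tag. --->
<CFHTTP url="http://www.example.com/" method="GET" resolveurl="true">
<CFSET HomePage = CFHTTP.FileContent>

<!--- POST: needs at least one CFHTTPPARAM and a closing CFHTTP tag.
      The URL and the field name are hypothetical. --->
<CFHTTP url="http://www.example.com/search.cfm" method="POST">
    <CFHTTPPARAM type="FormField" name="query" value="coldfusion">
</CFHTTP>
<CFSET SearchResults = CFHTTP.FileContent>
```

Each call overwrites the CFHTTP variables, so the sketch copies FileContent into its own variable before making the next request.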

  8. IA technique I - CF_EbayItem (Code)
     <!--- CF_EbayItem - Module to get a page of search results from ebay and return it --->
     <!--- Required attributes --->
     <CFPARAM name="attributes.searchitem">
     <CFPARAM name="attributes.ReturnVar" default="ReturnVar">
     <cfhttp url="http://search-desc.ebay.com/search/search.dll?MfcISAPICommand=GetResult&ebaytag1=ebayreg&ht=1&query=#attributes.searchitem#&ebaytag1code=0&srchdesc=y&SortProperty=MetaNewSort"
             method="GET"
             resolveurl="true">
     <!--- Return the whole page to the calling template --->
     <CFSET "Caller.#Attributes.ReturnVar#"=CFHttp.FileContent>
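
A hedged usage sketch for the module above. It assumes the code is saved as EbayItem.cfm where ColdFusion can find it as a custom tag; the search term and the variable name EbayPage are made up for the example.

```cfml
<!--- Encode the search term, since the module places it in the URL unchanged. --->
<CFSET Term = URLEncodedFormat("hebrew amulets")>

<!--- Call the module; the page comes back in the variable named by ReturnVar. --->
<CF_EbayItem SearchItem="#Term#" ReturnVar="EbayPage">

<!--- No parsing: send the retrieved page straight to the browser. --->
<CFOUTPUT>#EbayPage#</CFOUTPUT>
```

Because the module was called with resolveurl="true", the links and images in the returned page should still point at the original site.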

  9. IA technique II - CF_EbayItem • 1. Define what you want • All items from an ebay search • 2. Define how it will be displayed • in a return array • 3. Define the string to search for in the page • <a href="http://cgi.ebay.com/aw-cgi/eBayISAPI.dll?ViewItem&item=449570667">HEBREW AMULETS: By T Schrire</a> • 4. Define the steps to get it • CFHTTP to retrieve a page • CFLOOP over the page for elements • FindNoCase() to get start of specific element • FindNoCase() to get end of specific element • Mid() to get whole element • Place information in array for return

  10. Find()/FindNoCase() Basics • FindNoCase(substring, string [, start ]) • SubString - The exact string you're looking for • String - The string that you're searching • Start - Optional start position. • Notes: • FindNoCase is slightly slower, but better when you don’t know the exact case of what you're looking for. • It is always a good idea to set a start. It speeds up the search. • Remember that the return value is the START position of the SubString. Add the SubString length to get the end position.
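
A small sketch of the points in the notes above, using a made-up string rather than a real page. The variable names are illustrative only.

```cfml
<CFSET Page = 'One <B>a</B> two <b>b</b>'>
<CFSET Marker = '<b>'>

<!--- Case-insensitive search finds the uppercase tag first: position 5. --->
<CFSET First = FindNoCase(Marker, Page, 1)>

<!--- The return value is the START of the match; add the marker length
      to get the position just after it: 5 + 3 = 8. --->
<CFSET AfterFirst = First + Len(Marker)>

<!--- Searching again from that position finds the next match: 18. --->
<CFSET Second = FindNoCase(Marker, Page, AfterFirst)>

<!--- The case-sensitive version skips the uppercase tag: 18. --->
<CFSET Exact = Find(Marker, Page)>

<CFOUTPUT>#First# #AfterFirst# #Second# #Exact#</CFOUTPUT>
```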

  11. Mid() Basics • Mid(string, start, count) • String - The string that contains the SubString you want. • Start - The start position of the SubString you want. • Count - The number of characters in the SubString that you want. • Notes: • When used with FindNoCase, it is usual to have a start variable and an end variable. The count is then End-Start.
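
A minimal sketch of the start/end pattern from the note above, again on a made-up string. StartPos and EndPos are names chosen for the example.

```cfml
<CFSET Page = 'Intro <B>Hello</B> outro'>

<!--- Position just after the opening tag: 7 + 3 = 10. --->
<CFSET StartPos = FindNoCase('<b>', Page, 1) + Len('<b>')>

<!--- Position where the closing tag begins: 15. --->
<CFSET EndPos = FindNoCase('</b>', Page, StartPos)>

<!--- Count is End - Start, so this extracts the 5 characters "Hello". --->
<CFSET Inner = Mid(Page, StartPos, EndPos - StartPos)>

<CFOUTPUT>#Inner#</CFOUTPUT>
```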

  12. IA technique II - CF_EbayItem
      <!--- CF_EbayItem - Module to get all items from ebay and return it --->
      <!--- Required attributes --->
      <CFPARAM name="Attributes.SearchItem">
      <CFPARAM name="Attributes.ReturnArray" default="ReturnArray">
      <cfhttp url="http://search-desc.ebay.com/search/search.dll?MfcISAPICommand=GetResult&ebaytag1=ebayreg&ht=1&query=#Attributes.SearchItem#&ebaytag1code=0&srchdesc=y&SortProperty=MetaNewSort"
              method="GET"
              resolveurl="true">
      <!--- Position to start each search from --->
      <CFSET End=1>
      <!--- Set local array for storage. We set all values to a local array rather than to the calling template to reduce the number of 'calls' between templates and improve performance. --->
      <CFSET LocalArray=ArrayNew(1)>

  13. IA technique II - CF_EbayItem
      <CFLOOP condition="1">
          <CFSET Start = FindNoCase('<a href="http://cgi.ebay.com/aw-cgi/eBayISAPI.dll?ViewItem&item=', cfhttp.filecontent, end)>
          <CFIF Start>
              <!--- Add the search item's length to its position to get its true end position. This will help in getting its full value in the Mid() function. --->
              <CFSET End=FindNoCase('</a>', cfhttp.filecontent, start)+4>
              <!--- Add item to a local array --->
              <CFSET ArrayAppend(LocalArray,Mid(cfhttp.filecontent, start, end-start))>
          <cfelse>
              <cfbreak>
          </cfif>
      </cfloop>
      <!--- Set local array to calling template --->
      <CFSET "caller.#Attributes.ReturnArray#"=LocalArray>
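
A hedged sketch of how a calling template might use the array version of the module. It assumes the code from the two slides above is saved together as EbayItem.cfm; the search term and the name Items are illustrative.

```cfml
<!--- Call the module; the links come back in the array named by ReturnArray. --->
<CF_EbayItem SearchItem="amulets" ReturnArray="Items">

<CFOUTPUT>#ArrayLen(Items)# items found<BR></CFOUTPUT>

<!--- Each element is a complete <a href=...>...</a> link, so it can be output as is. --->
<CFLOOP index="i" from="1" to="#ArrayLen(Items)#">
    <CFOUTPUT>#Items[i]#<BR></CFOUTPUT>
</CFLOOP>
```

If no links match, the array is simply empty and the loop body never runs.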

  14. IA technique III - CF_EbayItem • 1. Define what you want • All items from an ebay search • 2. Define how it will be displayed • in a return array • 3. Define the string to search for in the page • <a href="http://cgi.ebay.com/aw-cgi/eBayISAPI.dll?ViewItem&item=449570667">HEBREW AMULETS: By T Schrire</a> • 4. Define the steps to get it • CFHTTP to retrieve a page • CFLOOP over the page for elements • REFindNoCase() to get specific element • Mid() to get whole element • Place information in array for return

  15. REFind()/REFindNoCase() Basics • REFindNoCase(RegEx, String [,start] [,returnsub] ) • RegEx - Regular Expression to use as search criteria • String - String to search in • Start - Position in String to start search at • ReturnSub - When True, returns the position and length of the match and of each subexpression defined in the RegEx • Notes: • Start should always be used as it speeds up the search. When ReturnSub is used, Start is required and can be set to 1. • This function returns the numeric position of the searched-for text unless ReturnSub is specified. Then it returns a structure.

  16. REFind()/REFindNoCase() Basics • The structure returned by this function will have two keys (Pos, Len), with each key being an array. The first array element (Variable.Pos[1], Variable.Len[1]) will always contain the position and length of the ENTIRE match. Each additional array element will contain the position and length of a subexpression. • Variable • Pos • [1] • [2] • Len • [1] • [2]
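
A minimal sketch of reading the returned structure, using a fragment of the sample link from the earlier slides as the input string. The variable names are chosen for the example.

```cfml
<CFSET Text = 'item=449570667">HEBREW AMULETS</a>'>

<!--- One subexpression: the digits after item=. ReturnSub is True. --->
<CFSET Match = REFindNoCase('item=([0-9]+)', Text, 1, True)>

<CFIF Match.Len[1]>
    <!--- Element 1 is the entire match: item=449570667 --->
    <CFSET Whole = Mid(Text, Match.Pos[1], Match.Len[1])>
    <!--- Element 2 is the first subexpression: 449570667 --->
    <CFSET ItemNumber = Mid(Text, Match.Pos[2], Match.Len[2])>
    <CFOUTPUT>#Whole# / #ItemNumber#</CFOUTPUT>
<CFELSE>
    <CFOUTPUT>No match</CFOUTPUT>
</CFIF>
```

Testing Match.Len[1] before calling Mid() avoids passing a zero position when nothing matched.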

  17. RegEx Basics • The following is a fast rundown of important characters in Regular Expressions • In most cases, a character is equal to itself • A backslash (\) will escape any special character • A period (.) represents any one character • .at can mean bat, cat, rat, or anything that has a single character and ends with at. • A pair of brackets denotes a set of characters (i.e. one of them can be used) • [01256] means any one of those digits • A dash (-) within a set means “a range of” • [0-9] means any single digit from 0 through 9 • A caret (^) at the start of a set means “not in this set” • [^aeiou] means any character but a lowercase vowel

  18. RegEx Basics • Parentheses are used to denote a compound expression OR a subexpression • (this) will return the position and length of the word “this” • When used within a compound, a pipe (|) means either/or • (this|that) will return the position and length of the first occurrence of “this” or “that” • A question mark (?) means that the previous character, set or compound may or may not exist, but if it does, it will exist 1 time • A plus (+) means that the previous character, set or compound must exist 1 or more times • An asterisk (*) means that the previous character, set or compound may exist 0 or more times
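
The rules on the two RegEx slides above can be tried out with short REFind() calls. This is a sketch with made-up test strings; each call returns the position of the first match, or 0 if there is none.

```cfml
<!--- Period: any one character. Matches "cat" at position 5. --->
<CFSET P1 = REFind('.at', 'the cat sat')>

<!--- Backslash: escapes the period so it matches a literal dot. Position 2. --->
<CFSET P2 = REFind('\.', 'a.b')>

<!--- Range plus "+": one or more digits. Matches "42" at position 5. --->
<CFSET P3 = REFind('[0-9]+', 'lot 42 sold')>

<!--- Caret at the start of a set: first non-vowel is "r" at position 2. --->
<CFSET P4 = REFind('[^aeiou]', 'area')>

<!--- Pipe inside parentheses: either word. Matches "that" at position 5. --->
<CFSET P5 = REFind('(this|that)', 'not that')>

<!--- Question mark: the "u" may or may not exist. Position 1. --->
<CFSET P6 = REFind('colou?r', 'color')>

<!--- Asterisk: zero or more "b". Matches "ac" at position 1. --->
<CFSET P7 = REFind('ab*c', 'ac')>

<CFOUTPUT>#P1# #P2# #P3# #P4# #P5# #P6# #P7#</CFOUTPUT>
```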

  19. IA technique III - CF_EbayItem
      <!--- CF_EbayItem - Module to get all items from ebay and return it --->
      <!--- Required attributes --->
      <CFPARAM name="Attributes.SearchItem">
      <CFPARAM name="Attributes.ReturnArray" default="ReturnArray">
      <cfhttp url="http://search-desc.ebay.com/search/search.dll?MfcISAPICommand=GetResult&ebaytag1=ebayreg&ht=1&query=#Attributes.SearchItem#&ebaytag1code=0&srchdesc=y&SortProperty=MetaNewSort"
              method="GET"
              resolveurl="true">
      <!--- Position to start each search from --->
      <CFSET end=1>
      <!--- Set local array for storage. We set all values to a local array rather than to the calling template to reduce the number of 'calls' between templates and improve performance. --->
      <CFSET LocalArray=ArrayNew(1)>

  20. IA technique III - CF_EbayItem
      <CFLOOP condition="1">
          <!--- Search the CFHTTP.FileContent for any link (<A HREF=...>...</A>) where two parts of the link will change. The Url variable item= will always contain a number; [0-9]+ will match 1 or more digits. The text in the body of the A tag can contain any characters, but never HTML, so using [^<]+ to match anything other than an opening angle bracket will get us all the text. Note that a backslash is used before each period and question mark in the URL to 'escape' these characters and have them treated as normal characters rather than special RegEx characters. --->
          <CFSET Item=REFindNoCase('<a href="http://cgi\.ebay\.com/aw-cgi/eBayISAPI\.dll\?ViewItem&item=[0-9]+">[^<]+</a>', cfhttp.filecontent, end, 1)>

  21. IA technique III - CF_EbayItem
          <!--- If the value of Item.Len[1] is TRUE (i.e. not 0) then add the element to the array. Else break out of the loop. --->
          <CFIF Item.len[1]>
              <!--- Add the search item length to its position. This will be used as the new position to start the search from in the next loop iteration. A simple +1 would work as well. --->
              <CFSET End=Item.pos[1]+Item.len[1]>
              <!--- Add item to a local array. Note that the return from a REFind()/REFindNoCase() function fits perfectly into a Mid() function. --->
              <CFSET ArrayAppend(LocalArray,Mid(cfhttp.filecontent, Item.pos[1], item.len[1]))>
          <cfelse>
              <cfbreak>
          </cfif>
      </cfloop>
      <!--- Set local array to calling template --->
      <CFSET "caller.#Attributes.ReturnArray#"=LocalArray>

  22. Extra Information • CFHTTP Headers - extra information returned by a CFHTTP (or any HTTP) call • FILECONTENT - Text grabbed • HEADER - Header info (including cookies) • MIMETYPE - Return mime type • RESPONSEHEADER - structure with all information except content • STATUSCODE - HTTP return code
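
A hedged sketch of using these return variables before parsing a page. The URL is a placeholder, and the check assumes that CFHTTP.StatusCode begins with the three-digit HTTP code (for example "200 OK").

```cfml
<CFHTTP url="http://www.example.com/" method="GET">

<CFIF Left(CFHTTP.StatusCode, 3) IS "200">
    <CFOUTPUT>
        Mime type: #CFHTTP.MimeType#<BR>
        Headers returned: #StructKeyList(CFHTTP.ResponseHeader)#<BR>
        Content length: #Len(CFHTTP.FileContent)# characters
    </CFOUTPUT>
<CFELSE>
    <!--- Do not try to parse an error page as if it were the real content. --->
    <CFOUTPUT>Request failed: #CFHTTP.StatusCode#</CFOUTPUT>
</CFIF>
```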

  23. Syndication (WDDX & Queries) • Can return structured information as a query • Better to use WDDX to send query encoded in a packet • Basis of Spectra syndication • Can pass binary files encoded with ToBase64() function
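
A minimal sketch of sending a query as a WDDX packet, as suggested above. The query is built by hand here so the example stands alone; in practice it would come from CFQUERY, and the column names and values are made up.

```cfml
<!--- Build a one-row query to stand in for real data. --->
<CFSET Products = QueryNew("Product,Price")>
<CFSET QueryAddRow(Products)>
<CFSET QuerySetCell(Products, "Product", "Widget")>
<CFSET QuerySetCell(Products, "Price", "9.95")>

<!--- Provider side: serialize the query into a WDDX packet (an XML string). --->
<CFWDDX action="CFML2WDDX" input="#Products#" output="Packet">

<!--- Consumer side: after retrieving the packet (for example with CFHTTP),
      turn it back into a query. --->
<CFWDDX action="WDDX2CFML" input="#Packet#" output="Restored">

<CFOUTPUT query="Restored">#Product# - #Price#<BR></CFOUTPUT>
```

The consumer gets back a real query object, so no parsing with Find() or REFind() is needed for data delivered this way.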

  24. Conference Closing Slide
