SIUG Annual Meeting 2010, UNC Charlotte, January 28, 2010. Web Logs: Finally! Now What Do We Do With Them? What are web logs? What information do they gather? Where are the logs stored? How are the logs accessed and analyzed? What do the reports mean? What limitations exist?
Web Logs: Finally! Now What Do We Do With Them? Dan Pfohl, UNC Wilmington. SIUG Annual Meeting 2010, UNC Charlotte, January 28, 2010.
• What are web logs?
• What information do they gather?
• Where are the logs stored?
• How are the logs accessed and analyzed?
• What do the reports mean?
• What limitations exist?
Useful Web Sites
• CSDirect: http://csdirect.iii.com/documentation/weblogs.shtml
• WebPAC Wiki: http://csdirect.iii.com/lswiki/WebPAC/WebHome
• Web Access Logs topic: http://csdirect.iii.com/lswiki/WebPAC/AccessLogs
• IUG Listserv: http://innovativeusers.org (search term: web server logs)
What is a weblog file?
What’s in a log file line?
80 152.20.226.60 - - [20/Jan/2010:19:15:01 -0500] "GET /search~b001o001c001i001 HTTP/1.1" 200 4763 "http://library.uncw.edu/web/systems/start_page/start_page2.htm" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)" 5970
• Port number: 80
• User’s IP address: 152.20.226.60
• Date/time: [20/Jan/2010:19:15:01 -0500]
• Page requested: "GET /search~b001o001c001i001 HTTP/1.1"
• HTTP status code: 200
• Bytes transferred: 4763
• Referring page: "http://library.uncw.edu/web/systems/start_page/start_page2.htm"
• Browser info: "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
• Load time: 5970
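The field breakdown above can be sketched in Python. This is a minimal illustration, not part of the original talk: the pattern name `LOG_LINE`, the function `parse_line`, and the group names are all assumptions chosen for this example. The last field is treated as a plain integer; Apache documents the %D directive as the time taken to serve the request, in microseconds.

```python
import re

# Pattern for one line in the layout shown above: port, client IP, two
# identity fields (logged as "-"), timestamp, request, status, bytes,
# referring page, browser string, and the time taken to serve the request.
# All group names are illustrative choices, not names used by Millennium.
LOG_LINE = re.compile(
    r'(?P<port>\d+) (?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)" (?P<duration>\d+)'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


# The sample entry from the slide, split across string literals for width.
sample = (
    '80 152.20.226.60 - - [20/Jan/2010:19:15:01 -0500] '
    '"GET /search~b001o001c001i001 HTTP/1.1" 200 4763 '
    '"http://library.uncw.edu/web/systems/start_page/start_page2.htm" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; '
    '.NET CLR 1.1.4322; .NET CLR 2.0.50727)" 5970'
)

fields = parse_line(sample)
print(fields["ip"], fields["status"], fields["duration"])
```

Every value comes back as a string; convert `status`, `bytes`, and `duration` with `int()` before doing arithmetic on them.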
Here’s another.
443 152.21.32.14 - - [20/Jan/2010:19:01:37 -0500] "GET /patroninfo~b01o01c01i01/ HTTP/1.1" 200 2043 "http://www.uncp.edu/library/" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.2; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)" 22934
Web Logs are configurable (sort of):
• Specify where logs are stored.
• Specify storage schedule.
• Specify what information is collected.
Only Innovative can modify the Apache config.
• Rolling 35-day set of log files.
• Each log = 24 hours (not midnight to midnight).
• 2-day delay in log availability.
• More on Innovative’s implementation later.
So where are the Millennium server web logs and how do we get at them?
Web Server Logs – live/logs (livelogs)
Analysis Tools
• Analog: http://www.analog.cx/
• Webalizer: ftp://ftp.mrunix.net/pub/webalizer/old/ (then select webalizer-2.01-10-win32-bin.zip)
• WebLog Expert Lite: http://csdirect.iii.com/lswiki/WebPAC/WeblogExpertLite
Web Log Format
Innovative server produces Apache Combined format, with the port number (%p) added at the start of each entry and the load time (%D) at the end.
From the WebPAC Wiki:
%p %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i" %D
From Alan Dyck (April 2009):
‘%p %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %D’
From my Analog configuration file, where I substituted parentheses for the single quotes:
(%p %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %D)
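To show how the format string maps onto an actual entry, here is a rough Python sketch that turns the WebPAC Wiki version of the string into a regular expression. It is only an illustration: the `DIRECTIVES` table, the group names, and the function `format_to_regex` are invented for this example, and only the eleven directives quoted above are handled.

```python
import re

# One regex fragment per directive in the format string quoted above.
# Group names are illustrative; any other directive raises a KeyError.
DIRECTIVES = {
    "%p": r'(?P<port>\d+)',
    "%h": r'(?P<host>\S+)',
    "%l": r'(?P<ident>\S+)',
    "%u": r'(?P<user>\S+)',
    "%t": r'\[(?P<time>[^\]]+)\]',
    "%r": r'(?P<request>[^"]*)',
    "%>s": r'(?P<status>\d{3})',
    "%b": r'(?P<bytes>\d+|-)',
    "%{Referer}i": r'(?P<referer>[^"]*)',
    "%{User-Agent}i": r'(?P<agent>[^"]*)',
    "%D": r'(?P<duration>\d+)',
}


def format_to_regex(log_format):
    """Translate a space-separated format string into a compiled regex."""
    parts = []
    for token in log_format.split(" "):
        # Directives wrapped in double quotes are logged with the quotes.
        quoted = token.startswith('"') and token.endswith('"')
        fragment = DIRECTIVES[token.strip('"')]
        parts.append(f'"{fragment}"' if quoted else fragment)
    return re.compile(" ".join(parts))


FORMAT = '%p %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i" %D'
pattern = format_to_regex(FORMAT)

# The second sample entry from the slides.
line = (
    '443 152.21.32.14 - - [20/Jan/2010:19:01:37 -0500] '
    '"GET /patroninfo~b01o01c01i01/ HTTP/1.1" 200 2043 '
    '"http://www.uncp.edu/library/" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; '
    '.NET CLR 2.0.50727; InfoPath.2; .NET CLR 3.0.4506.2152; '
    '.NET CLR 3.5.30729)" 22934'
)

match = pattern.match(line)
print(match.group("port"), match.group("request"), match.group("duration"))
```

Building the pattern from the format string keeps the two in step: if the format were ever changed, only the `FORMAT` constant would need editing, provided the new directives are in the table.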
Setting up analyzers
Analog:
• Download/install software
• Readme file and manual
• Edit configuration file
• Run from a command line
Analog config file:
Analog Report
Report written to specified output directory
• General Summary
• Daily Summary
• Hourly Summary
• Domain Report
• Organization Report
• Search Word Report
• Browser Summary
• Operating System Report
• Status Code Report
• File Size Report
• File Type Report
• Directory Report
• Request Report
Webalizer Setup
Webalizer Report
• Report written to same directory as logs
• Monthly Statistics
• Daily Statistics
• Hourly Statistics
• Top URLs, entry pages, exit pages, referrers, search strings, user agents, country codes
• More difficult than Analog to configure
• Processes a single log file per run, so use a cron job or combine the logs first
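One way to combine the logs before handing them to an analyzer that reads a single file is sketched below in Python. The file names, the `access_*.log` pattern, and the function `combine_logs` are hypothetical; the sketch also assumes the logs are already uncompressed and that sorting by file name gives chronological order, which depends on how the downloaded files are named.

```python
from pathlib import Path
import tempfile


def combine_logs(log_dir, pattern, combined_path):
    """Concatenate every file in log_dir matching pattern, in name order.

    Returns the number of source files that were combined.
    """
    sources = sorted(Path(log_dir).glob(pattern))
    with open(combined_path, "w", encoding="utf-8") as out:
        for source in sources:
            with open(source, encoding="utf-8", errors="replace") as log:
                for line in log:
                    # Make sure a file whose last line lacks a newline does
                    # not run into the first entry of the next file.
                    out.write(line if line.endswith("\n") else line + "\n")
    return len(sources)


# Demonstration with two tiny made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "access_1.log").write_text("first entry\n", encoding="utf-8")
    Path(tmp, "access_2.log").write_text("second entry", encoding="utf-8")
    combined = Path(tmp, "combined.txt")
    count = combine_logs(tmp, "access_*.log", combined)
    result = combined.read_text(encoding="utf-8")

print(count, result.splitlines())
```

The combined file is given a name that the pattern does not match, so a second run does not fold the previous output back into itself.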
WebLog Expert Setup
• Download/install software
• Edit config files?
• Multiple config files
• Requires standard or professional edition
• Edit log files to remove port number.
Process web logs for WebLog Expert
• Uncompress log files into a new directory.
• Edit logs to remove the port number from entries.
• Can be done using WordPad’s Find & Replace? Maybe.
• Use EditPlus’s SEARCH with Regular Expression ^[0-9]*
• Remove any .bak files from the directory.
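The port-removal step can also be scripted instead of done in an editor. The Python sketch below is an illustration under one assumption: the expression on the slide, ^[0-9]*, matches only the digits, so this version also removes the single space after them so that each entry begins with the IP address. The names `PORT_PREFIX` and `strip_port` are invented for the example, and the sample entry is shortened from the slide.

```python
import re

# Leading digits plus the one space that separates them from the IP address.
PORT_PREFIX = re.compile(r"^[0-9]+ ")


def strip_port(line):
    """Remove the leading port number from one log entry, if present."""
    return PORT_PREFIX.sub("", line, count=1)


# Shortened version of the sample entry shown earlier in the slides.
entry = (
    '80 152.20.226.60 - - [20/Jan/2010:19:15:01 -0500] '
    '"GET /search~b001o001c001i001 HTTP/1.1" 200 4763'
)

cleaned = strip_port(entry)
print(cleaned)
```

Because the pattern requires digits followed directly by a space, an entry that already starts with an IP address is left unchanged, so running the script twice over the same file does no harm.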
WebLog Expert Lite Report
Report written to browser; must be saved to location of choice
• Summary: hits, page views, visitors, bandwidth
• Activity
• Access
• Visitors
• Referrers
• Browsers
• Errors
Standard and Professional versions are much more robust and configurable. Both are reasonably priced.
What can we do with these numbers?
Possible uses:
• Counts: justify existence, services, programs.
• Activity times: help determine when to schedule Global/Rapid Updates, MARC record loads, upgrade days/times, compiling lists, processing inventories, etc.
• Track effects of system/program changes.
• Know popular entry pages. Where to place alerts.
• Troubleshoot error messages.
• Web development, testing pages, ensuring functionality.
• Search patterns.
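As a small illustration of the "activity times" use, the Python sketch below counts requests by hour of day straight from the timestamps, which is one way to spot quiet periods for maintenance work. The function name `requests_per_hour` is made up for this example and the two sample entries are shortened from the slides; the analyzers listed earlier produce hourly summaries of their own. The month abbreviation is parsed with the system locale, so an English locale is assumed.

```python
from collections import Counter
from datetime import datetime


def requests_per_hour(lines):
    """Count log entries by hour of day (0-23) as recorded in each entry."""
    counts = Counter()
    for line in lines:
        start = line.find("[")
        end = line.find("]", start)
        if start == -1 or end == -1:
            continue  # no bracketed timestamp on this line
        try:
            stamp = datetime.strptime(
                line[start + 1:end], "%d/%b/%Y:%H:%M:%S %z"
            )
        except ValueError:
            continue  # timestamp not in the expected layout
        counts[stamp.hour] += 1
    return counts


# Two entries shortened from the samples shown earlier.
sample = [
    '80 152.20.226.60 - - [20/Jan/2010:19:15:01 -0500] '
    '"GET /search~b001o001c001i001 HTTP/1.1" 200 4763',
    '443 152.21.32.14 - - [20/Jan/2010:19:01:37 -0500] '
    '"GET /patroninfo~b01o01c01i01/ HTTP/1.1" 200 2043',
]

hourly = requests_per_hour(sample)
for hour in sorted(hourly):
    print(f"{hour:02d}:00  {hourly[hour]} requests")
```

Hours that never appear in the counter had no logged requests in the files examined, which makes them candidates for the update and load jobs listed above.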
Limitations
• Can only analyze the data collected.
• Not all browsers provide information.
• Time/ability to configure analyzers.
• Data and report storage.
• Knowing what you want/need to analyze.
Summary
• Millennium web access logs are now available for download and manipulation.
• Log files contain information such as browser type, operating system, entry page, user IP address, etc.
• Programs such as Analog, Webalizer, and WebLog Expert Lite are available to help analyze masses of data.
• Analysis can help in decision-making and reporting.
• Web log analysis has limitations.
Thank You!