Topic 11 第十一讲: Web Site Analysis 网站分析
Marshall Breeding
Director for Innovative Technologies and Research, Vanderbilt University
http://staffweb.library.vanderbilt.edu/breeding
Redefining Libraries: Web 2.0 and other Challenges
May 2007, Xiamen, China
Theme 主题 • For many libraries, the number of visitors to their Web site and electronic resources exceeds the number who visit their physical premises. It's vital for libraries to understand how these remote visitors approach the Web site, not only to measure use but to improve the resources themselves. Marshall Breeding will present a number of practical techniques that libraries can use to better understand the use of their Web-based resources. 许多图书馆的网站和电子资源的访客远多于到访馆舍的访客。了解这些远程访客如何使用网站对图书馆非常重要,这不但是为了计算用量,而且是为了改善资源本身。Marshall Breeding将介绍一些实用技术,帮助图书馆进一步了解其网上资源的使用情况。
Theme主题 • Topics will include the basics of analyzing the server logs of the library's Web site, transaction logs from the OPAC, the complexities of measuring use of subscription-based electronic resources, and techniques for enhancing applications to better record how they are used. 主题包括图书馆网站服务器日志和在线公众查询目录事务日志的分析基础,量度订购电子资源的运用的复杂性,及建立更完善上网记录的技术。
Understanding remote users 了解远程用户 • Vital to providing relevant library services 对提供相关的图书馆服务至关重要 • More users may use library resources remotely through the Web than from physical library facilities 透过网络远程使用图书馆资源的用户,可能多于使用实体图书馆设施的用户
Understanding remote users了解远程用户 • Must work harder to ensure that Web-based services meet patron needs 必需更努力地工作以确保网上服务能满足顾客需要 • Move beyond hit counters and raw statistics to more sophisticated analysis and assessment 超越浏览人数计算器和原始统计,迈向更高层次的分析与评估
Analysis goals分析目标 • Improve usability增加可用性 • Web site diagnostics网站诊断 • Understand user needs了解用户需要 • Content selection decisions选择内容的决定 • Improve quality of service提升服务质素 • Marketing推广 • Budget justification预算的理由 • Strategy to increase interest and activity增加兴趣和活动的策略
Data sources for tracking remote use 追踪远程使用的数据来源 • Web server logs 网站服务器日志 • Application logs 应用日志 • Remote tracking data (Google Analytics) 远程追踪数据 (Google网站分析系统) • Vendor-provided use statistics (e-resources) 供应商提供的用量统计 (电子资源)
Enterprise approach to analytics 用企业方法作分析 • Multiplicity of resources to track 多种资源需要跟踪 • Web servers 网站服务器 • OPACs 在线公众查询目录 • E-resources 电子资源 • Databases 数据库 • Repositories 典藏 • Important to track the flow of use among all the library's Web-based resources 跟踪用户在图书馆所有网络资源之间的使用流程是重要的
Enterprise approach to analytics 用企业方法作分析 • Beyond the library: study flow to and from higher-level Web sites and portals (University -> Courseware -> Library) 图书馆以外:研究高水平网站和网络出入口的来去流程 (大学 -> 课程套件 -> 图书馆)
Web server logs 网站服务器日志 • Web servers are routinely configured to record detailed information about each request. Common elements include: 网站服务器通常配置为记录每一请求的详细资料,常见项目包括: • File requested 请求的档案 • Date/time stamp 日期/时间戳 • Status code 状态代码 • Request method (GET, POST, HEAD) 请求方法 • Referrer (where the user came from) 来源 (用户来自何处) • User agent (browser and platform data) 用户代理 (浏览器与平台数据)
Example Web log 网站服务器日志例子 • Raw data for analysis process 分析过程的原始数据
2006-06-20 05:01:43 129.59.150.105 GET /index.pl - 80 - c-69-250-131-199.hsd1.md.comcast.net Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322) http://www.google.com/search?hl=en&lr=&safe=off&q=september+11+television+archive 200 0 0 11752
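The raw line above is space-delimited, so a first analysis step is to split it into named fields. The sketch below is only illustrative: the field names in FIELDS are assumptions inferred from the sample line, not something the slides state, and should be matched to the server's actual log configuration.

```python
# Field names are assumptions inferred from the sample line; they are not
# given in the slides. Adjust them to the server's real log format.
FIELDS = [
    "date", "time", "server_ip", "method", "uri_stem", "uri_query",
    "port", "username", "client_host", "user_agent", "referrer",
    "status", "substatus", "win32_status", "last_field",
]

SAMPLE = (
    "2006-06-20 05:01:43 129.59.150.105 GET /index.pl - 80 - "
    "c-69-250-131-199.hsd1.md.comcast.net "
    "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322) "
    "http://www.google.com/search?hl=en&lr=&safe=off&q=september+11+television+archive "
    "200 0 0 11752"
)


def parse_log_line(line):
    """Split one space-delimited log line into a dict keyed by FIELDS."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = dict(zip(FIELDS, parts))
    # In this sample, spaces inside the user agent are written as '+'.
    record["user_agent"] = record["user_agent"].replace("+", " ")
    record["status"] = int(record["status"])
    return record
```

Calling parse_log_line(SAMPLE) gives a record whose "referrer" and "user_agent" entries can then feed the referral and filtering steps discussed on later slides.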
Exploiting referral data发掘来源资料 • The query string component of the referrer can be parsed to reveal search terms and other interesting information 可以分析来源查询字串成分以揭示搜寻术语和其它有趣的资料 • http://www.google.com/search?hl=en&lr=&safe=off&q=september+11+television+archive • User typed “september 11 television archive” in Google to find our site 用户在 Google 输入“september 11 television archive” 找寻网址
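As a minimal sketch of the parsing described on this slide, Python's standard urllib.parse module can pull the search terms out of a referrer URL. The function name search_terms and the default parameter name "q" are illustrative choices; other search engines may use a different query parameter.

```python
from urllib.parse import urlsplit, parse_qs


def search_terms(referrer, param="q"):
    """Return the search phrase carried in a referrer URL, or None.

    `param` is the query-string key holding the search phrase; "q" matches
    the Google example on the slide and is an assumption for other sites.
    """
    query = urlsplit(referrer).query
    values = parse_qs(query).get(param, [])
    return values[0] if values else None


referrer = (
    "http://www.google.com/search"
    "?hl=en&lr=&safe=off&q=september+11+television+archive"
)
print(search_terms(referrer))  # september 11 television archive
```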
Exploiting referral data 发掘来源资料 • Important to study how users get to your site 研究用户怎样到达你的网站是重要的 • (Example: TV News Public Web queries vs OpenWeb) (例子:电视新闻公开网查询相对于开放网络)
Analysis methodology 分析方法 • Go beyond simply counting pages 不要只限于数页数 • Identify sessions 识别会话 • Categorize users 用户分类 • Determine use patterns 推定使用模式 • Measure interest 量度兴趣 • Time spent on Web site 用于网站的时间 • Bounce rate 跳出率 • Page overlay analysis 页面叠加分析
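Two of the measures listed above, sessions and bounce rate, can be sketched in a few lines. This is one common convention rather than the presenter's own method: it assumes a visitor is identified by some key such as client host, that a gap of more than 30 minutes starts a new session, and that a bounce is a session with a single request.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed convention, not from the slides


def build_sessions(requests, timeout=SESSION_TIMEOUT):
    """Group (visitor_key, datetime, path) tuples into sessions.

    Returns a list of sessions; each session is a list of (datetime, path).
    """
    sessions = []
    open_session = {}  # visitor_key -> that visitor's most recent session
    for visitor, when, path in sorted(requests, key=lambda r: r[1]):
        current = open_session.get(visitor)
        if current is None or when - current[-1][0] > timeout:
            current = []
            sessions.append(current)
            open_session[visitor] = current
        current.append((when, path))
    return sessions


def bounce_rate(sessions):
    """Fraction of sessions that contain exactly one request."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if len(s) == 1) / len(sessions)


def session_duration(session):
    """Time between the first and last request of a session."""
    return session[-1][0] - session[0][0]
```

Time spent on the site is only a lower bound under this approach, because the log does not show how long the visitor stayed on the last page.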
Move from measurement to impact 从量度移到影响 • Establish site goals 建立网站目标 • Benchmark current use 评核现有的使用 • Implement goal oriented improvements 实施以目标为主的改进 • Measure impact 量度影响 • Repeat as needed 需要时重复步骤 • (Example: enhancement of TV News OpenWeb) (例子:改进电视新闻开放网络)
Appropriate data filtering 适当的数据过滤 • Requests from indexing bots (crawlers) can skew statistics 搜索器的请求会扭曲统计 • Count user requests and bot requests separately 分开计算用户请求和搜索器请求 • Performance monitors 性能监测器 • Link checkers 链接检查器 • Monitoring crawler activity is an important component of SEO and Web site discoverability strategies. 监视搜索器的活动是搜索引擎最佳化和网站可发现性策略的一个重要部分
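Counting user and bot requests separately can be approximated by inspecting the user agent string. The sketch below is a rough heuristic under stated assumptions: the substrings in BOT_MARKERS are illustrative and far from exhaustive, and a crawler that sends a browser-like user agent will be miscounted as a user.

```python
# Illustrative substrings only; a real deployment would maintain a longer,
# regularly updated list.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_bot(user_agent):
    """Guess whether a request came from an automated agent."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def split_counts(user_agents):
    """Tally requests into separate 'user' and 'bot' counts."""
    counts = {"user": 0, "bot": 0}
    for ua in user_agents:
        counts["bot" if is_bot(ua) else "user"] += 1
    return counts
```

Keeping the bot tally instead of discarding it preserves the crawler activity data that the slide identifies as useful for SEO.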
Resource Discovery发现资源 • How do users get to your site? 用户如何上你的网站? • Track performance of the Web site relative to major search engines 追踪与主要搜索引擎有关的网站的表现 • SEO – Search engine optimization 搜索引擎最佳化 • Few users begin with library Web sites 很少用户一开始便搜查图书馆网站
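One way to see how users reach the site is to tally the hosts that appear in the referrer field. The following is a minimal sketch; the function name, the bucket labels, and the host library.example.edu are hypothetical, and "-" is assumed to mark a missing referrer as in the sample log line.

```python
from collections import Counter
from urllib.parse import urlsplit


def referrer_sources(referrers, own_host):
    """Count requests by referring host, separating internal navigation."""
    counts = Counter()
    for ref in referrers:
        if not ref or ref == "-":
            counts["(direct or unknown)"] += 1
            continue
        host = urlsplit(ref).hostname or "(unparseable)"
        if host == own_host:
            counts["(internal)"] += 1
        else:
            counts[host] += 1
    return counts


sample = [
    "http://www.google.com/search?q=television+news",
    "-",
    "http://library.example.edu/index.html",
]
print(referrer_sources(sample, "library.example.edu").most_common())
```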
TV News OpenWeb project 电视新闻开放网络项目 • Dramatic increase in Web site activity and loan requests through systematic and controlled exposure of metadata to Google and other search engines 透过有系统和有控制地将元数据展示给Google和其它搜索引擎,网站活动和借用请求大幅增加 • SEO (Search Engine Optimization) strategy 搜索引擎最佳化策略 • Helped the Archive become financially self-sufficient. 令档案馆在财政上自给自足
Examples of Web reporting and analysis tools 网络报告和分析工具的例子
Selected utilities 精选工具 • Analog – free, open source 免费,开放源码 • NetTracker – enterprise level Web analysis application 企业级的网络分析应用 • Google utilities Google 工具 • Sitemap – process for submitting Web pages for optimized indexing by Google with some assessment capabilities 网站地图 – 提交网页供Google优化索引的步骤,并带有一些评估功能 • Analytics – Sophisticated approach for measuring Web site performance 分析 – 量度网站表现的成熟方法
Analog • Free Open Source application 免费开放源码应用 • Basic Web statistics application 基本网络统计应用 • Includes fairly full set of static metrics 包括相当完整的静态指标 • Command line utility – generates Web report 命令行工具 – 建立网络报告 • Windows, Unix, Linux, etc.
NetTracker • Unica Corporation • Enterprise level Web analytics 企业水平的网络分析 • http://www.sane.com/
Google SiteMaps 网站地图 • XML specification for systematically submitting URLs that represent a Web site 有系统地提交代表一个网站的URLs的XML规格 • Makes indexing more efficient but does not affect PageRank 令索引更有效率但不影响网页排名 • SiteMap interface provides utilities for monitoring how the site has been indexed with some analytical information on terms used to find your Web site. 网站地图接口提供工具以监察网站如何根据一些用作搜寻你的网站的术语的分析数据而被编入索引
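To make the XML specification concrete, here is a minimal sketch that writes a list of URLs as a sitemap document. It assumes the sitemaps.org 0.9 namespace, which may differ from the schema Google used at the time of the talk, and it emits only the required loc element; the example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org protocol (an assumption for this sketch).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as '&' when serializing.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap([
    "http://library.example.edu/",
    "http://library.example.edu/record?id=1&view=full",
]))
```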
Google Analytics Google 分析法 • Available at no cost from Google 无需成本 • Must receive invitation code 必须接收邀请码 • Slanted toward e-commerce 倾向电子商业 • “Conversion University” – training on how to optimize Web site for high conversion rates. “顾客转化率大学” – 培训如何优化网站以提高转化率 • Allows Webmasters to establish site goals and measure performance 容许网站管理员建立网站目标和量度表现
Application-level reporting and analysis 应用层报告和分析 • Content management systems and other dynamically driven Web environments can provide additional usage information. 内容管理系统和其它动态驱动的网络环境可提供额外用途信息 • Can offer additional information beyond raw Web logs 可提供原始网络日志以外的附加信息 • More capabilities for identifying use based on user categories 更多以用户种类为基础识别用途的性能 • Reporting can be built into the business logic of the application 可在应用服务器的业务逻辑内设立报告
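Building reporting into the application's business logic can be as simple as recording a structured event whenever a user acts, then aggregating by user category. The sketch below is hypothetical: UsageEvent, UsageLog, and the category and action names are illustrative and do not describe the TV News application; a real system would persist events to a database instead of a list.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class UsageEvent:
    when: datetime
    user_category: str  # e.g. "faculty" or "student" (hypothetical labels)
    action: str         # e.g. "search" or "view" (hypothetical labels)
    detail: str = ""


class UsageLog:
    """In-memory store of usage events, kept simple for illustration."""

    def __init__(self):
        self.events = []

    def record(self, user_category, action, detail=""):
        """Call from the application wherever a user action is handled."""
        now = datetime.now(timezone.utc)
        self.events.append(UsageEvent(now, user_category, action, detail))

    def counts_by_category(self, action=None):
        """Count events per user category, optionally for one action."""
        return Counter(
            e.user_category
            for e in self.events
            if action is None or e.action == action
        )
```

Because the application knows who is logged in, it can attach the user category directly, which raw Web server logs cannot do.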
Examples from the TV News Web Site 电视新闻网站的例子 • Reports of use by user category and institution 以用户种类和机构编排的使用报告 • Statistics on resource use 资源使用的统计 • Data on search types, query terms, etc. 搜寻形式、查询术语等的数据 • Ability to track all aspects of business activity 全方位追踪业务活动的能力
Other sources of Use data其它使用数据的来源 • ILS OPAC Logs ILS在线公用目录日志 • Proxy Server logs and reports 代理服务器日志和报告 • Link resolver logs and reports 链接解析器日志和报告
Limitations 限制 • Can't know the intent of the user 无法知道使用者的意图 • User success can only be estimated 使用者是否成功只能估计 • Difficult to obtain trends by user type 难以得知各用户种类的趋势 • More aggressive reporting might intrude on privacy 更进取的报告可能侵犯私隐 • Few libraries require the level of user authentication needed to determine use by type of patron 很少图书馆要求足以按读者种类判断使用情况的用户认证
Additional Information 附加资料
• Breeding, Marshall. Strategies for Measuring and Implementing E-use. ALA TechSource. May-June 2002. 79 pages.
• Breeding, Marshall. "Analyzing Web server logs to improve a site's usage." Computers in Libraries. Information Today. Medford, CT. October 2005.
Group Exercise小组研习 • Devise a strategy in which you can follow a more user-centered approach to the ongoing development of your library’s Web site through monitoring and analysis of use data. 你可透过监察和分析使用数据,遵循一个更加以用户为中心的方法,对贵馆正在实行的网站发展制定一项策略。