WCMS and Campus Web Search Updates
Campus Search • Previous campus search used free search providers • It switched between providers as daily usage limits were exceeded • It started with Google, then moved to Yahoo, then Bing
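The switching behaviour described on this slide can be sketched as a simple fallback chain. This is only an illustration: the `Provider` class, the `search` function, and the quota numbers are hypothetical, and the slides do not say how the previous campus search was actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    """A hypothetical free search provider with a daily query quota."""
    name: str
    daily_limit: int
    used_today: int = 0

    def has_quota(self) -> bool:
        return self.used_today < self.daily_limit


def pick_provider(providers):
    """Return the first provider that still has quota, or None."""
    for provider in providers:
        if provider.has_quota():
            return provider
    return None


def search(query, providers):
    """Count one query against the first provider with quota left."""
    provider = pick_provider(providers)
    if provider is None:
        raise RuntimeError("all providers have reached their daily limit")
    provider.used_today += 1
    # A real implementation would call the provider's search API here.
    return provider.name


# The order follows the slide; the limits are made-up numbers.
providers = [
    Provider("Google", daily_limit=2),
    Provider("Yahoo", daily_limit=1),
    Provider("Bing", daily_limit=1),
]
used = [search("co-op", providers) for _ in range(4)]
print(used)
```

Once every provider in the chain is exhausted, the sketch raises an error, which is one way to picture the limitation of relying on free quotas.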
Campus Search • Google Search Appliance went live in October • Indexes web pages in the uwaterloo.ca domain as well as documents and LDAP records
Campus Search • 1,000,000 document license • 680,889 documents being served • 59,372 of those “documents” are people from UWLDAP • The initial crawl quickly used up the license • Duplicate and “retired” servers were omitted • Revision history of wiki pages was ignored • Some sites had pages that presented the same content in different orders
Campus Search • Content breakdown • HTML: 514,648 • PDF: 82,303 • Text files: 37,979 • PostScript: 4,938 • MS Word: 4,308 • MS PowerPoint: 2,496 • Flash: 2,140 • MS Excel: 399
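To put the breakdown in proportion, the counts above can be summed and turned into percentages. The figures are copied from the slide; the variable names are illustrative. Note that the listed types sum to 649,211, which is less than the 680,889 documents reported earlier, and the slides do not explain the difference.

```python
# Document counts by type, copied from the content breakdown slide.
counts = {
    "HTML": 514_648,
    "PDF": 82_303,
    "Text files": 37_979,
    "PostScript": 4_938,
    "MS Word": 4_308,
    "MS PowerPoint": 2_496,
    "Flash": 2_140,
    "MS Excel": 399,
}

total = sum(counts.values())

# Share of each type as a percentage of the listed total, to one decimal.
shares = {kind: round(100 * n / total, 1) for kind, n in counts.items()}

for kind, share in shares.items():
    print(f"{kind}: {share}%")
print(f"Listed total: {total:,}")
```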
Campus Search • We can add suggestions for acronyms
Campus Search • Or promote pages that should be prominent
Campus Search • If you want to have something removed from the campus search, adjust your robots.txt file accordingly or hide it behind a login and it will disappear over time • If you need something removed immediately, submit an RT ticket
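As an illustration of the robots.txt route, the sketch below uses Python's standard `urllib.robotparser` to check what a crawler that honours robots.txt would be allowed to fetch. The rule, the `/private/` path, the `example.org` host, and the `example-crawler` user agent are all assumptions made for the example, not details from the slides.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that hides everything under /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "example-crawler" is a placeholder user agent name.
private_ok = parser.can_fetch(
    "example-crawler", "https://example.org/private/notes.html"
)
public_ok = parser.can_fetch(
    "example-crawler", "https://example.org/about/index.html"
)

print("private page crawlable:", private_ok)
print("public page crawlable:", public_ok)
```

A rule like this only stops future crawling, which matches the slide's point that content disappears from the index over time rather than immediately.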
Campus Search • Future plans • Upgrade to Google Search Appliance 7.0 • Add different content type options on search page • Faceted search
Google Search Appliance 7.0 • Universal Search • Search content across silos • New SharePoint connector • Relevance and performance improvements • Enhancements to Google’s algorithms • Assisted navigation to refine search results
Google Search Appliance 7.0 • Document preview • Not just HTML, also includes MS Word, PowerPoint and PDFs • Good for mobile use as you do not have to download an entire PDF to know it is the one you want
Google Search Appliance 7.0 • Document translation and language capabilities • Translate titles and snippets from inside the search appliance with support for 60 languages • Better support for crawling languages such as Arabic and Japanese
Google Search Appliance 7.0 • Index compressed files • Now crawls and indexes compressed files in .zip, .tar, .tar.gz and .tgz • Expert search • Find subject matter experts on campus by searching on keywords • Ex. Search for “network security” and a list of network security experts will appear on the sidebar
Other search enhancements • Auto suggest on search fields • Not just on the main search page, but on individual Drupal sites as well
Other search enhancements • Adding content types to the Google search
Other search enhancements • Faceted search
WCMS Updates • Lily Yan - New content types • Liam Morland - Opening up some sites so all CAS users can authenticate • Chris Shantz - Prototypes for the IT Strategic Plan site and access control with Organic Groups • Kris Olafson - Feature requests site
Six content types • Project • Service • Graduate award • Teaching tip • Grebel publication • Exchange board
Project content type • Proposed, ongoing and completed projects • It includes project title, description, members (name, role), status, audience and topic • Projects can be searched by status, topic and audience • Initially the IST pilot site will use this content type
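The fields and search filters listed for the project content type can be pictured as a small data model. This is a plain-Python sketch with hypothetical class names and made-up sample data; the actual content type is built in the WCMS (Drupal), and the slides do not show its implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    role: str


@dataclass
class Project:
    title: str
    description: str
    status: str      # e.g. "proposed", "ongoing" or "completed"
    audience: str
    topic: str
    members: list = field(default_factory=list)


def search_projects(projects, status=None, topic=None, audience=None):
    """Return projects matching every filter that was supplied."""
    results = []
    for project in projects:
        if status is not None and project.status != status:
            continue
        if topic is not None and project.topic != topic:
            continue
        if audience is not None and project.audience != audience:
            continue
        results.append(project)
    return results


# Made-up sample data for illustration only.
projects = [
    Project("Search upgrade", "Upgrade campus search", "ongoing",
            "staff", "search", [Member("A. Example", "lead")]),
    Project("Site migration", "Move sites to the WCMS", "completed",
            "staff", "web"),
    Project("Feature requests", "Collect feature requests", "proposed",
            "site managers", "web"),
]

ongoing = search_projects(projects, status="ongoing")
staff_web = search_projects(projects, topic="web", audience="staff")
print([p.title for p in ongoing])
print([p.title for p in staff_web])
```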
Service content type • Services offered at the University of Waterloo • Service content type includes service name, description, cost, support information, link to service, service audience and category • Can be searched by category and audience • IST website will use this content type
Graduate award content type • This is a custom content type for the graduate-studies site • This content type includes all information about graduate awards • Graduate awards can be searched by name, description, value, program, deadline, type, category and citizenship
Graduate award content type • Site managers can view the graduate award report and download a .csv file • Site managers can mass publish content
Teaching tip content type • A custom content type for the Centre for Teaching Excellence - this is a tip to help inform users • This content type includes title, listing image, body, categories, tags and audience • Teaching tips can be searched by keyword, tag and category
Grebel publication • Grebel publication includes two content types (Grebel journal and Grebel journal publication) • For past issues, the full article can be viewed in HTML and PDF, and the PDF can be downloaded • For the current issue, only the title and author's name can be viewed • Issues are archived by year
Exchange board content type • This content type is for the recreation-committee site only • It includes personal information and exchange information • It can be searched by exchange type (for sale, for rent, free or wanted)
Authentication and Performance • Authenticated traffic bypasses our caching servers (Varnish) • Lack of caching contributed to our last outage • The main site is fairly content-heavy compared to most, but we should still proceed with caution
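Why authenticated traffic misses the cache can be sketched with a simplified rule: a request that carries a session cookie or an Authorization header goes straight to the backend. Varnish's built-in default logic behaves roughly this way, but the function below is only an assumption for illustration and is not the site's actual Varnish configuration.

```python
def bypasses_cache(headers):
    """Return True if a request would skip the cache under this simplified rule.

    `headers` is a dict of request header names to values. Header names
    are compared case-insensitively.
    """
    names = {name.lower() for name in headers}
    return "cookie" in names or "authorization" in names


# An anonymous page view can be answered from the cache.
anonymous = {"Accept": "text/html"}

# A logged-in user sends a session cookie (the cookie name is made up).
logged_in = {"Accept": "text/html", "Cookie": "SESSexample=abc123"}

print("anonymous bypasses cache:", bypasses_cache(anonymous))
print("logged-in bypasses cache:", bypasses_cache(logged_in))
```

Under a rule like this, every additional authenticated user adds requests that the backend must serve itself, which is the reason for caution when opening sites to all CAS users.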
Prototypes for New Functionality • On the IT Strategic Plan site, the following features are being evaluated: • Organic Groups • Commenting • Inline Diffs
Organic Groups (OG) • Enable users to create and manage their own 'groups'. • Each group can have subscribers, and maintains a group home page where subscribers communicate amongst themselves. (from Drupal.org/project/og)
Organic Groups (OG) • There are over 35,000 sites that use OG including whitehouse.gov and groups.drupal.org • OG allows us to have two sets of permissions for content on a site • Can have private or public content • Can have private or public groups
Comments • With authenticated access we can explore having comments available for certain content types • Currently being used on the IT Strategic Plan site with OG content
Inline Diff • Inline Diff lets you view changes between revisions on the node page itself, instead of having to go to the revisions page to see the difference • The diff widget is in a block on the right, where you choose the revision to review from a select box • Changed text is yellow, added text is green, deleted text is grey
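The idea of labelling text as changed, added or deleted between two revisions can be sketched with Python's standard `difflib`. This is not how the Drupal module works internally; the `word_diff` function and the sample sentences are hypothetical, and the labels simply mirror the three colours mentioned on the slide.

```python
import difflib


def word_diff(old, new):
    """Label each word as 'same', 'changed', 'added' or 'deleted'."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result += [("same", w) for w in old_words[i1:i2]]
        elif tag == "replace":
            # Show the new wording where old words were replaced.
            result += [("changed", w) for w in new_words[j1:j2]]
        elif tag == "insert":
            result += [("added", w) for w in new_words[j1:j2]]
        else:  # tag == "delete"
            result += [("deleted", w) for w in old_words[i1:i2]]
    return result


revision_1 = "the quick brown fox"
revision_2 = "the slow brown fox jumps"

for label, word in word_diff(revision_1, revision_2):
    print(f"{label:8} {word}")
```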
What’s next? You decide! • In the new year, with our next major release, you will see a link appear on your site dashboards to request a new feature • We will open up our project management site and you can watch our progress on your requests