
Interoperability of Retrieval

HARVEST as Search Engine. Kerstin Zimmermann, Oldenburg University. Hamburg, August 2000.


Presentation Transcript


  1. Interoperability of Retrieval: HARVEST as Search Engine. Kerstin Zimmermann, Oldenburg University. Hamburg, August 2000.

  2. Retrieval. Diagram labels: public, WWW, Workstation, Server / Archive, PC, private.

  3. Server Structure in Science: University page, Department / Faculty, working groups, members, Publications.

  4. Kinds of Archives / Sources for Documents
     a) lists (name, title, date, link)
     b) additionally with abstracts
     c) fulltext only
     d) metadata and fulltext

  5. Formats indexed
     • sgml x
     • xml x
     • html X
     • ps X (text; attention: do not use graphic mode, ASCII required)
     • pdf X (text; Distiller options: asciipdf=on, compressed text=off; exchange; do not use optimize)
     • doc X
     • rtf X
     • tex X
     • dvi X

  6. Harvest. Diagram labels: User, WWW Browser, Request, HARVEST, WWW-SERVER N, BROKER, GATHERER, Result, Internal Area, Dissertation, http://www.physik...

  7. Dublin Core Metadata: 15 elements (http://purl.org/DC/)
     Title, Creator, Subject, Description, Date, Publisher, Contributor, Type, Format, Identifier, Relation, Source, Language, Coverage, Rights

  8. Source Code
     <META NAME="DC.Subject" CONTENT="(SCHEME=MathNet) People.Faculty/Staff">
     <META NAME="DC.Format" CONTENT="text/html">
     <META NAME="DC.Creator.Person.Name" CONTENT="Zimmermann, Kerstin">
     <META NAME="DC.Creator.Person.Address" CONTENT="(Email) kerstin@merlin.physik.uni-oldenburg.de">
     <META NAME="DC.Creator.Person.Address" CONTENT="(Phone) + 49 (0) 441 798 3465">
     <META NAME="DC.Creator.Person.Address" CONTENT="(Fax) + 49 (0) 441 798 5649">
     <META NAME="DC.Creator.Person.Address" CONTENT="(Postal) Fachbereich Physik Carl von Ossietzky Universitaet D-26111 Oldenburg">
     <META NAME="DC.Creator.Person.IsMemberOf" CONTENT="DPG">
     <META NAME="DC.Creator.Person" CONTENT="(Keywords) Dipl.Phys., Wiss.Mit.">
     <META NAME="DC.Subject" CONTENT="">
     <META NAME="DC.Relation" CONTENT="(SCHEME=url) ">
     <META NAME="DC.Relation.References" CONTENT="(SCHEME=url) ">
     <META NAME="DC.Date" CONTENT="1999-03-8">
     <META NAME="DC.Type" CONTENT="Text.Homepage">
     <META NAME="DC.Rights" CONTENT="These personal data may not be used for any commercial purpose or incorporated in mailing lists without written permission by the author. They are free for information and communication services of learned societies.">
     <LINK REL="SCHEMA.dc" HREF="http://purl.org/DC/">
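To illustrate how a harvesting tool might read such a page header, here is a minimal sketch that collects the DC.* META tags from an HTML string using Python's standard html.parser module. The class and function names (DCMetaExtractor, extract_dc) are hypothetical and not part of Harvest; the result is a list of pairs because names such as DC.Creator.Person.Address occur several times.

```python
from html.parser import HTMLParser


class DCMetaExtractor(HTMLParser):
    """Collects (name, content) pairs from <META NAME="DC...."> tags.

    Hypothetical helper for illustration; not part of Harvest.
    """

    def __init__(self):
        super().__init__()
        self.records = []  # a list, because DC names may repeat

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.lower().startswith("dc."):
            self.records.append((name, attributes.get("content") or ""))


def extract_dc(html_text):
    """Return all Dublin Core META entries found in html_text, in order."""
    parser = DCMetaExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.records


if __name__ == "__main__":
    page = (
        '<META NAME="DC.Format" CONTENT="text/html">'
        '<META NAME="DC.Type" CONTENT="Text.Homepage">'
    )
    for name, content in extract_dc(page):
        print(name, "=", content)
```

Non-DC META tags (for example a generator tag) are skipped, so the output contains only the metadata a broker would index.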

  9. Existing Gatherers and Brokers
     • www.iuk-initiative.org/iwi/TheO/
     • www.physik.uni-oldenburg.de/EPS/PhysNet/
     • www.mathnet.de

  10. Search results. What they look like:
     - list of results
     - link to the index file
     - link to the fulltext
     (- link to the word in the text)

  11. (No text on this slide.)

  12. inhomogeneous

  13. (No text on this slide.)

  14. with Metadata

  15. How to create metadata? Using an online tool.
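The slide does not say which online tool is meant or what it emits. As a rough sketch of what any such tool has to do, the following hypothetical function (dc_meta_tags, an illustrative name) turns a list of element/value pairs into META lines in the style of the source code on slide 8, escaping the values so that quotes cannot break the attribute.

```python
from html import escape


def dc_meta_tags(fields):
    """Render (element, value) pairs as Dublin Core META tags.

    Hypothetical generator for illustration; the tag layout follows the
    example on slide 8, not the output of any specific tool.
    """
    lines = []
    for element, value in fields:
        lines.append(
            '<META NAME="DC.%s" CONTENT="%s">'
            % (escape(element, quote=True), escape(value, quote=True))
        )
    # Schema link as used in the slide 8 example.
    lines.append('<LINK REL="SCHEMA.dc" HREF="http://purl.org/DC/">')
    return "\n".join(lines)


if __name__ == "__main__":
    print(dc_meta_tags([
        ("Title", "Shell Structure and Stability of Very Neutron-Rich Isotopes"),
        ("Format", "application/pdf"),
    ]))
```

A list of pairs is used instead of a dictionary so that an element such as Creator can appear more than once.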

  16. Global Harvest Server Structure. Diagram labels: global, national, learned field, Europe-wide.

  17. Harvest Configuration. Diagram labels: Provider (several), Gatherer (two), gmb, Broker (two), SOIF, HTTP.

  18. SOIF: Example
     @FILE { http://www.physik.uni-oldenburg.de/Docs/THEO3/publications/metadocs/ebs.shell.structure.html
     update-time{9}: 938935362
     url-references{208}: http://www.physik.uni-oldenburg.de/Docs/THEO3/publications/ebs.shell.structure.pdf
     mailto:hilf@merlin.physik.uni-oldenburg.de
     http://www.physik.uni-oldenburg.de/Docs/THEO3/publications/ebs.shell.structure.pdf
     title{59}: Shell Structure and Stability of Very Neutron-Rich Isotopes
     keywords{97}: and author date eberhard ebs files hilf isotopes neutron pdf rich shell stability structure very
     head{16}: -Version 1.0 -->
     dc.type{59}: InProceedings (SCHEME=Freetext)publication-status=published
     dc.title{59}: Shell Structure and Stability of Very Neutron-Rich Isotopes
     dc.publisher{18}: IKDA, TH Darmstadt
     dc.language{18}: (SCHEME=Z39.53)ENG
     dc.format{15}: application/pdf
     dc.date{75}: (SCHEME=ANSI.X3.30-1985)1975 (SCHEME=ANSI.X3.30-1985)(TYPE=current)19990408
     dc.creator{126}: Eberhard R. Hilf (TYPE=email)hilf@merlin.physik.uni-oldenburg.de (TYPE=phone)+49-(0)441-798-2543 (TYPE=fax)+49-(0)441-798-3201
     body{190}: =+4>Shell Structure and Stability of Very Neutron-Rich Isotopes Author: Eberhard R. Hilf Phone: +49-(0)441-798-2543 Fax: +49-(0)441-798-3201 Files: ebs.shell.structure.pdf Date: 1975
     md5{32}: bc1f2750a042a8175cce710030c60d76
     file-size{4}: 2440
     type{4}: HTML
     gatherer-version{6}: 1.5.19
     gatherer-host{31}: egoiste.physik.uni-oldenburg.de
     gatherer-name{17}: Physics Oldenburg
     refresh-rate{5}: 86400
     time-to-live{7}: 3888000
     last-modification-time{9}: 928224570
     description{186}: =+4>Shell Structure and Stability of Very Neutron-Rich Isotopes Author: Eberhard R. Hilf Phone: +49-(0)441-798-2543 Fax: +49-(0)441-798-3201 Files: ebs.shell.structure.pdf Date: 1975
     }
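The record above follows a simple pattern: a template header with the document URL, then attributes written as name{size}: value, where the number in braces is the length of the value in bytes. A minimal reader can be sketched as follows, assuming exactly that layout with a single space or tab after the colon. The function name parse_soif is hypothetical; this is not the parser shipped with Harvest.

```python
import re

# "@FILE { <url>" header and "name{size}:<space or tab>" attribute prefix.
HEADER = re.compile(rb"@(\w+)\s*\{\s*(\S+)\s*\n")
ATTRIBUTE = re.compile(rb"([A-Za-z0-9_.\-]+)\{(\d+)\}:[ \t]")


def parse_soif(data):
    """Parse one SOIF record given as bytes.

    Returns (template, url, attributes). Illustrative sketch only,
    assuming the layout shown in the example record.
    """
    header = HEADER.match(data)
    if header is None:
        raise ValueError("missing SOIF header")
    template = header.group(1).decode("ascii")
    url = header.group(2).decode("utf-8")
    attributes = {}
    pos = header.end()
    while True:
        # Skip the line breaks between attributes.
        while pos < len(data) and data[pos:pos + 1].isspace():
            pos += 1
        if pos >= len(data) or data[pos:pos + 1] == b"}":
            break
        prefix = ATTRIBUTE.match(data, pos)
        if prefix is None:
            raise ValueError("malformed attribute at byte %d" % pos)
        size = int(prefix.group(2))
        start = prefix.end()
        # The byte count, not the line end, delimits the value,
        # so values may span several lines.
        value = data[start:start + size]
        attributes[prefix.group(1).decode("ascii")] = value.decode("utf-8", "replace")
        pos = start + size
    return template, url, attributes
```

Reading by byte count rather than by line is what lets a multi-line value such as url-references{208} survive intact: its three entries plus two line breaks add up to 208 bytes.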

  19. <tags> and Metadata
     HTML Element      SOIF-Element
     <A HREF>          url-reference{}
     <ADDRESS>         address{}
     <H1 ... H6>       headings{}
     <TITLE>           title{}
     ...
     Metadata          SOIF-Element
     DC.title          dc.title{}
     DC.author         dc.author{}
     ...
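The mapping in this table can be sketched in a few lines: walk the HTML, file the content of each listed element under its SOIF attribute name, and write the result with byte counts. The names below (SoifSummarizer, summarize, to_soif) are hypothetical, the attribute spelling url-references follows the SOIF example on slide 18, and the code only illustrates the idea; it is not how the Harvest gatherer is implemented.

```python
from html.parser import HTMLParser

# Element-to-attribute mapping taken from the table on this slide.
TAG_TO_SOIF = {"title": "title", "address": "address"}
TAG_TO_SOIF.update({"h%d" % level: "headings" for level in range(1, 7)})


class SoifSummarizer(HTMLParser):
    """Collects SOIF attribute values from an HTML page (illustrative)."""

    def __init__(self):
        super().__init__()
        self.attrs = {}       # SOIF attribute name -> list of values
        self._current = None  # attribute that receives the text being read

    def _add(self, name, value):
        value = " ".join(value.split())  # normalize whitespace
        if value:
            self.attrs.setdefault(name, []).append(value)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self._add("url-references", attributes["href"])
        elif tag == "meta" and (attributes.get("name") or "").lower().startswith("dc."):
            # DC.title becomes dc.title, and so on.
            self._add(attributes["name"].lower(), attributes.get("content") or "")
        elif tag in TAG_TO_SOIF:
            self._current = TAG_TO_SOIF[tag]

    def handle_endtag(self, tag):
        if tag in TAG_TO_SOIF:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self._add(self._current, data)


def summarize(html_text):
    """Return the SOIF attribute dictionary for one HTML page."""
    parser = SoifSummarizer()
    parser.feed(html_text)
    parser.close()
    return parser.attrs


def to_soif(url, attrs, template="FILE"):
    """Serialize attributes as name{byte-count}:<tab>value lines."""
    lines = ["@%s { %s" % (template, url)]
    for name in sorted(attrs):
        value = "\n".join(attrs[name])
        lines.append("%s{%d}:\t%s" % (name, len(value.encode("utf-8")), value))
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Multiple values for one attribute are joined with line breaks, which matches how the three entries of url-references appear in the example record.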

  20. Harvest links
     Harvest sources: ftp://ftp.tardis.ed.ac.uk/pub/harvest/develop/snapshots/
     More info: http://www.dissonline.org/harvest.html

  21. Port Numbers
     • Harvest 8500
     • Webserver (http) 80
     • ftp 21 (tcp)
     • telnet 23
     • smtp (email) 25
     • pop3 110
     • time-server 123

  22. Why Harvest?
     • set up portals for specific needs
     • heterogeneous archives
     • runs on different platforms
     • software is public domain (lower costs)
     • open source code
     • worldwide community

  23. Runtime and amount of data
     • DFN-Net: 3 docs per minute
     • connecting time: see browser
     • index [ms]
     • memory: 9 MB
     • PhysDis (Jan. '00): 306 'real' links, 1475 documents, 112 servers
     • Gatherer: 2 h 4 min

  24. Legal Aspects §
     • searching a database
       - look for robots.txt
     • discussion in DC.Rights
       - rights of the resource: (un-)restricted access / use
       - rights of metadata
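Checking a server's robots.txt before gathering can be sketched with Python's standard urllib.robotparser module. The helper name may_gather, the user-agent string and the example rules are assumptions for illustration; a real gatherer would fetch the file from the server it is about to index.

```python
from urllib.robotparser import RobotFileParser


def may_gather(robots_lines, user_agent, url):
    """Return True if the given robots.txt rules allow user_agent to fetch url.

    Hypothetical helper; robots_lines is the robots.txt content as a
    list of lines, so the check works without network access.
    """
    rules = RobotFileParser()
    rules.parse(robots_lines)
    return rules.can_fetch(user_agent, url)


if __name__ == "__main__":
    example_rules = [
        "User-agent: *",
        "Disallow: /private/",
    ]
    for url in ("http://example.org/publications/a.pdf",
                "http://example.org/private/draft.pdf"):
        print(url, may_gather(example_rules, "example-gatherer", url))
```

To read the rules directly from a server, RobotFileParser also offers set_url() followed by read().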

  25. Topics of Discussion
     • search depth
     • fulltext vs. metadata + abstract
     • integration of old archives
     • access
     • Questions, comments -> kerstin@physik.org
