
Development of computers 1970-2009

Corpora: From magnetic tape to web access
Knut Hofland, hofland@uib.no, icame.uib.no/history/poster.ppt
UNIFOB Aksis, Bergen, Norway

The Brown Corpus was made in 1961-64.
12.02.1977: ICAME founded in Oslo
29-30.03.1979: First ICAME conference in Bergen
1977-79: ICAME News started March 1978




The Brown Corpus was converted from its original punched card format to a more readable format, and errors found during the tagging of the corpus (1971-78) were corrected.

Original punched card format:

    **R**T *THE *FULTON *COUNTY *GRAND *JURY SAID *FRIDAY AN INVESTIGATION 0010E1A01
    OF *ATLANTA**AS RECENT PRIMARY ELECTION PRODUCED **QNO EVIDENCE**U TH 0020E1A01
    AT ANY IRREGULARITIES TOOK PLACE. **R**T *THE JURY FURTHER SAID IN TER 0030E1A01
    M-END PRESENTMENTS THAT THE *CITY *EXECUTIVE *COMMITTEE, WHICH HAD OVE 0040E1A01
    R-ALL CHARGE OF THE ELECTION, **QDESERVES THE PRAISE AND THANKS OF THE 0050E1A01
    *CITY OF *ATLANTA**U FOR THE MANNER IN WHICH THE ELECTION WAS CONDUCT 0060E1A01
    ED. **R**T *THE *SEPTEMBER-*OCTOBER TERM JURY HAD BEEN CHARGED BY *FUL 0070E1A01
    TON *SUPERIOR *COURT *JUDGE *DURWOOD *PYE TO INVESTIGATE REPORTS OF PO 0080E1A01
    SSIBLE **QIRREGULARITIES**U IN THE HARD-FOUGHT PRIMARY WHICH WAS WON B 0090E1A01
    Y *MAYOR-NOMINATE *IVAN *ALLEN *JR**.. **R**T **Q*ONLY A RELATIVE HAND 0100E1A01

Converted format:

    A01 0010 1 The Fulton County Grand Jury said Friday an investigation
    A01 0020 1 of Atlanta's recent primary election produced "no evidence"
    A01 0020 9 that any irregularities took place.
    A01 0030 5 The jury further said in term-end presentments that
    A01 0040 3 the City Executive Committee, which had over-all charge
    A01 0050 2 of the election, "deserves the praise and thanks of
    A01 0050 11 the City of Atlanta" for the manner in which the election
    A01 0060 11 was conducted.
    A01 0070 1 The September-October term jury had been charged
    A01 0070 9 by Fulton Superior Court Judge Durwood Pye to investigate
    A01 0080 8 reports of possible "irregularities" in the hard-fought
    A01 0090 6 primary which was won by Mayor-nominate Ivan Allen
    A01 0100 5 Jr&.

The LOB Corpus was finished in Oslo/Bergen in 1979. Concordances were made for both the Brown and the LOB Corpus. The texts and concordances were distributed on magnetic tape and microfiche. One fiche held 207 pages, each with 72 lines of 132 columns. The LOB concordance contained frequency counts from the Brown Corpus. The LOB KWIC concordance used 100 fiches. The London-Lund Corpus was distributed on tape.

1981: London-Lund KWIC concordance available on tape.

Development of computers 1970-2009

1970s  Mainframe computers: Univac, IBM, ICL
1971   Floppy disk (diskette)
1975   Altair 8800 personal computer
1976   Apple I
1977   Apple II
1978   VisiCalc, spreadsheet
1979   WordStar, word processing software
1980   Seagate 5.25” 5 MB hard disk
1981   IBM PC (4.77 MHz, 16/64 kB RAM, 160 kB 5.25” diskette, MS-DOS, CGA)
1982   Commodore 64
1983   IBM PC XT (128 kB RAM, 10 MB HD, 360 kB diskette)
1983   Apple Lisa, first GUI
1984   Apple Macintosh (128 kB, 400 kB 3.5” diskette)
1984   First HP laser printer (Apple LaserWriter PS 1985)
1984   IBM PC AT (286 6-10 MHz, 20 MB HD, 256 kB RAM, 1.2 MB diskette, EGA)
1984   MS-DOS 3.1
1985   Windows 1
1985   Philips CM-100 CD-ROM (Apple 1988)
1987   PS/2 (386 8-20 MHz, 640 kB RAM, 1.44 MB 3.5”, 20-70 MB HD, VGA)
1990   World Wide Web, text version
1990   Typical PC: 486 25 MHz, 4 MB RAM, 150 MB HD
1992   Windows 3.1
1993   Mosaic graphic web client
1994   MS-DOS 6.0
1995   Windows 95
1997   Typical PC: Pentium II 233 MHz, 64 MB RAM, 4 GB disk
2001   Windows XP
2007   Windows Vista
2009   Portable PC: Dual Core 2.2 GHz, 4 GB RAM, 400 GB HD
2009   Desktop PC: Quad Core 2.6 GHz, 16 GB RAM, 1000 GB HD

Moore's law: transistor count doubles every two years.

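The fiche dimensions and the disk sizes quoted above can be checked with a few lines of arithmetic. The following is only a sketch in Python: the constant and function names are made up for this illustration, 1000 GB is taken as 1 000 000 MB, and the comparison with Moore's law is loose, since that law concerns transistor counts rather than disk capacity.

```python
import math

# Microfiche figures as quoted on the poster.
PAGES_PER_FICHE = 207
LINES_PER_PAGE = 72
COLUMNS_PER_LINE = 132
LOB_KWIC_FICHES = 100

lines_per_fiche = PAGES_PER_FICHE * LINES_PER_PAGE
chars_per_fiche = lines_per_fiche * COLUMNS_PER_LINE
kwic_line_capacity = lines_per_fiche * LOB_KWIC_FICHES


def doubling_time_years(start_value, end_value, years):
    """Average doubling time, assuming steady exponential growth."""
    doublings = math.log2(end_value / start_value)
    return years / doublings


# Hard disk sizes from the timeline: 5 MB (1980) and 1000 GB (2009).
# Assumption: 1000 GB is counted as 1 000 000 MB.
disk_doubling = doubling_time_years(5, 1_000_000, 2009 - 1980)

print(f"Lines per fiche:          {lines_per_fiche}")
print(f"Character positions:      {chars_per_fiche}")
print(f"Line capacity, 100 fiche: {kwic_line_capacity}")
print(f"Disk doubling time:       {disk_doubling:.2f} years")
```

Under these assumptions one fiche holds 14 904 lines (about 1.97 million character positions), the 100 fiches of the LOB KWIC concordance offer room for about 1.49 million lines, and disk capacity doubled roughly every 1.65 years between 1980 and 2009.
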
1982-1985: POS tagging of LOB in Lancaster and Bergen (CLAWS1, Constituent Likelihood Automatic Word-tagging System). The word list and suffix list for look-up were based on the tagged Brown Corpus. Text and concordance available on tape.

1987: Melbourne-Surrey Corpus available (100K words of newspaper text). ICAME News -> Journal. A version of the Brown Corpus indexed by the MS-DOS program WordCruncher was made by Randall L. Jones of Brigham Young University (11 MB including index files). The index was so efficient that the program could be used on a standard IBM PC XT/AT. Distribution on diskettes started. The Kolhapur Corpus (Indian English) and the Lancaster Spoken English Corpus were added to the collection. A mail-based infoserver was started (FAFSRV at NOBERGEN, EARN/BITNET).

1990: Polytechnic of Wales Corpus.

1992: Lancaster Parsed Corpus. Corpora list started. FTP info-server. Gopher server in 1993. ICAME CD-ROM collection, version 1, containing the Brown, LOB, Kolhapur, London-Lund and Helsinki corpora, all indexed by WordCruncher. Macintosh/Unix version of the texts. Texts also indexed by the MS-DOS program TACT.

Hard disk size: 1980 = 5 MB, 2009 = 1 000 000 MB.

1995: Newdigate newsletters, ICAME web site, 900 members on the Corpora list.

2000: ICAME CD-ROM, version 2. COLT CD-ROM with sound files. Internet search of the main corpora for holders of the CD-ROM.

2009: More than 3000 members on the Corpora list.

[Poster panel: Some statistics]
[Poster panel: Content of ICAME CD, version 2]

Future: More material, new CD/DVD. More corpora searchable on the Internet. Part of CLARIN (www.clarin.eu).
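
Comparing the two Brown Corpus samples in the transcript suggests how the punched card coding maps onto ordinary text. The sketch below, in Python, is not the conversion program used in Bergen, and every rule in it is a guess read off the sample alone: a 70-column text field that runs on from card to card (so TH + AT rejoin as THAT), `*` before a capital letter, `**Q` and `**U` for opening and closing quotes, `**A` for an apostrophe, and `**R**T` treated as a break. The leading space on the second card is likewise an assumption, since the transcript has collapsed the original column spacing. Other codes, such as `**.` in `JR**..`, are not handled.

```python
import re

# Hypothetical marker table, inferred only from the sample on the poster.
MARKERS = {
    "**R**T": "\n",  # guess: marks a break; the readable version starts a new sentence here
    "**Q": '"',      # opening quotation mark
    "**U": '"',      # closing quotation mark
    "**A": "'",      # apostrophe, as in ATLANTA**AS -> Atlanta's
}
MARKER_RE = re.compile(r"\*\*R\*\*T|\*\*[QUA]")

# First three cards as (text field, card id). The text fields are assumed to
# be the full 70-column field, including the guessed leading space on card 2.
CARDS = [
    ("**R**T *THE *FULTON *COUNTY *GRAND *JURY SAID *FRIDAY AN INVESTIGATION", "0010E1A01"),
    (" OF *ATLANTA**AS RECENT PRIMARY ELECTION PRODUCED **QNO EVIDENCE**U TH", "0020E1A01"),
    ("AT ANY IRREGULARITIES TOOK PLACE. **R**T *THE JURY FURTHER SAID IN TER", "0030E1A01"),
]


def join_cards(cards):
    """Concatenate the text fields into one continuous character stream."""
    return "".join(text for text, _card_id in cards)


def decode(stream):
    """Turn the upper-case card coding into mixed-case text (sketch only)."""
    # Step 1: replace the double-asterisk markers.
    text = MARKER_RE.sub(lambda m: MARKERS[m.group(0)], stream)
    # Step 2: "*X" keeps X as a capital; every other letter becomes lower case.
    text = re.sub(
        r"\*([A-Z])|([A-Z])",
        lambda m: m.group(1) if m.group(1) else m.group(2).lower(),
        text,
    )
    # Step 3: normalise spacing within each piece and drop empty pieces.
    pieces = [" ".join(piece.split()) for piece in text.split("\n")]
    return "\n".join(piece for piece in pieces if piece)


if __name__ == "__main__":
    print(decode(join_cards(CARDS)))
```

Run on the three cards above, the first line of output matches the opening sentence of the converted sample; the second line stops at "in ter" because the fourth card, which continues the word, is not part of the input.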
