Projects 2004-05 1st Period

Project Listings
• Title
• Research Areas
• Implementation Language
• Abstract
• Example Output/Analysis

Archiving of Articles in a Searchable Database
Research: Data Mining, Information Gathering on the Internet
Implementation: Perl

Abstract: The purpose of this project is to archive articles referenced in RSS feeds in a searchable and linkable database. Data mining techniques are also planned for use in assembling this database.

Example Output/Analysis: Data pending

Construction and Application of a Cluster of x86 Pentium II Computers
Research: Parallel Computing, System Administration, Grid Computing
Implementation: Linux shell scripting, OpenMosix

Abstract: This project is to construct a supercomputing cluster of 15 to 20 or more "MCW" computers running the OpenMosix kernel patch. Once constructed, the cluster could be configured to transparently aid workstations in processing demanding jobs run in the lab. The project can increase the computing power of the lab and also serves as an experiment in building a low-level, low-cost cluster with Linux. This idea can be useful to facilities with old computers that might otherwise be deemed outdated.

Output/Analysis:
- OpenMosix/kernel: understand completely the workings of the OpenMosix patch, and try to understand other cluster systems such as Beowulf clusters and Condor.
- Grid Computing: try to locate large-scale clusters and understand how they operate.
- System Administration: general system administration for the cluster; broad in scope, but just enough to keep it running well.
- Parallel Computing: the fundamental idea and its benefits, the problems of file sharing, etc.

Investigation Into Factoring of Polynomials in Zp[x]
Research: Algorithms
Implementation: C

Abstract: Data pending

Analysis: Data pending

The Study of Microevolution Using Agent-Based Modeling
Research: Agent-Based Modeling, Artificial Life, Object-Oriented Programming
Implementation: C++

Abstract: The goal of the project is to create a program that uses an agent-environment structure that imitates a very simple natural ecosystem: one that includes a predator species and a prey species, each with DNA and unique physical abilities. When this structure is built, the quantitative attributes of each species will be graphed for analysis.

Example Output/Analysis: This matrix is currently printed. Each number/character pair is one space in the matrix. The number is the ID number of the organism in the space, while the character is the gender of the organism. The "0n" is zero-n, and it signifies a null organism, or empty space.

0n 1f 0n 2f 0n 0n 0n 3m 0n 0n 0n 4m 0n 5f 6f 7f 0n 0n 8f 0n 9m 0n 10f 0n 0n 0n 0n 0n 0n 11f 0n 12f 0n 13f 0n 14f 0n 15m 0n 0n 0n 0n 16m 0n 17m 18m 19m 20m 0n 0n 0n 0n 21f 0n 22m 23m 0n 24m 25m 26m 27m 28m 29m 30f 0n 31m 32f 0n 33m 0n 0n 34f 0n 35m 0n 36m 37f 38m 39m 40f . . .
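The printed matrix described above could be produced along the following lines. This is only a sketch, not the project's code: `Organism`, `cellLabel`, and `renderGrid` are hypothetical names, and the only details taken from the slide are the ID-plus-gender cell format and the "0n" marker for an empty space.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical cell type. An empty space is represented as id 0 with
// gender 'n', which prints as "0n".
struct Organism {
    int id;
    char gender;  // 'm', 'f', or 'n' for a null organism
};

// Formats one cell as the ID followed by the gender character, e.g. "3m".
std::string cellLabel(const Organism& o) {
    return std::to_string(o.id) + o.gender;
}

// Renders the grid row by row, with cells separated by single spaces.
std::string renderGrid(const std::vector<std::vector<Organism>>& grid) {
    std::string out;
    for (const auto& row : grid) {
        for (std::size_t c = 0; c < row.size(); ++c) {
            if (c > 0) out += ' ';
            out += cellLabel(row[c]);
        }
        out += '\n';
    }
    return out;
}
```

Keeping the formatting in `cellLabel` means the same label could later be reused when graphing or logging individual organisms.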

Using Machine Translation in Creation of a German-English Translator
Research: Natural Language Processing
Implementation: Java

Abstract: People speak many different natural languages, and they are often unable to communicate with each other when they speak different ones. It is beneficial to have a trustworthy translator that can accept sentences in one language and translate them into another. Languages spoken by humans, such as German or English, are difficult for computers to parse for understanding. These languages rely on context and semantics: what a particular word or phrase means in different situations. Natural language processing is concerned with allowing the computer to understand natural languages so that it can translate from one to the other.

Example Output/Analysis

A Study of Balanced Search Trees
Research: Red-Black trees and other tree balancing algorithms
Implementation: C++

Abstract: This project researches binary tree variants and compares their efficiency on large data sets. Binary trees are a classical data structure that is ubiquitous in computer science applications. The primary research area for this project is advanced algorithms and data structures, with an emphasis on data structures. The result will be a summary of the pros and cons of the binary tree variants.

Analysis: Balanced search trees are important data structures. A normal binary search tree has some disadvantages: its shape depends on the order of the incoming data, which significantly affects its structure and hence its performance. To improve efficiency, strategies have been developed that let a tree self-balance into a near-optimal structure, allowing quicker access to the stored data. For example, a red-black tree is a balanced binary tree that rebalances according to a color pattern of nodes (red or black) by means of rotation functions. Other balanced trees have also been developed, including the AVL tree, the weight-balanced tree, the B-tree, and more.
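The dependence on incoming data, and the rotation that balancing schemes build on, can be illustrated with a small sketch. This is not the project's implementation: `Node`, `insert`, `height`, and `rotateLeft` are illustrative names, and the code shows only a plain binary search tree plus a single left rotation, not a full red-black or AVL tree.

```cpp
#include <algorithm>
#include <memory>
#include <utility>

struct Node {
    int key;
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k) {}
};

// Plain (unbalanced) binary search tree insertion.
void insert(std::unique_ptr<Node>& root, int key) {
    if (!root) {
        root.reset(new Node(key));
        return;
    }
    if (key < root->key) insert(root->left, key);
    else insert(root->right, key);
}

// Number of nodes on the longest root-to-leaf path (0 for an empty tree).
int height(const std::unique_ptr<Node>& root) {
    if (!root) return 0;
    return 1 + std::max(height(root->left), height(root->right));
}

// Left rotation: the right child becomes the new root of this subtree,
// and the old root becomes its left child. Key order is preserved.
void rotateLeft(std::unique_ptr<Node>& root) {
    if (!root || !root->right) return;
    std::unique_ptr<Node> pivot = std::move(root->right);
    root->right = std::move(pivot->left);
    pivot->left = std::move(root);
    root = std::move(pivot);
}
```

Inserting the keys 1, 2, 3 in sorted order produces a chain of height 3; one call to `rotateLeft` on the root makes 2 the root and reduces the height to 2. Self-balancing trees apply rotations like this automatically after insertions and deletions.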

Kernel Debugging Userspace API Library (KDUAL)
Research: Linux Kernel Development
Implementation: Linux scripting language, C

Abstract: The purpose of this project is to create an implementation of much of the kernel API that functions in user space, the normal environment that processes run in. The issue with testing kernel code is that the live kernel runs in kernel space, a separate area that deals with hardware interaction and management of all the other processes. Kernel-space debuggers are unreliable and have difficulty dumping useful error information, because when the kernel fails there is no operating system left to write that information to disk.
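The idea of a user-space stand-in for a kernel call can be sketched as follows. This is an illustration only and does not come from KDUAL itself: the namespace `kdual_sketch`, the `gfp_flags` type, `GFP_KERNEL_STUB`, and the `live_allocations` counter are all hypothetical. The function names mirror the kernel's `kmalloc` and `kfree`, but here they simply forward to the C allocator so that code written against them can run and be debugged as an ordinary process.

```cpp
#include <cstddef>
#include <cstdlib>

namespace kdual_sketch {

// Hypothetical stand-in for kernel allocation flags; ignored in user space.
using gfp_flags = unsigned int;
const gfp_flags GFP_KERNEL_STUB = 0;

// Counts live allocations so a test harness can detect leaks.
static std::size_t live_allocations = 0;

// User-space replacement for a kmalloc-style call, backed by malloc.
void* kmalloc(std::size_t size, gfp_flags /*flags*/) {
    void* p = std::malloc(size);
    if (p) ++live_allocations;
    return p;
}

// User-space replacement for a kfree-style call; a null pointer is a no-op.
void kfree(void* p) {
    if (!p) return;
    --live_allocations;
    std::free(p);
}

}  // namespace kdual_sketch
```

Because a failure here is just a crash of a normal process, standard user-space debuggers and core dumps remain available.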

Analysis: There are two important sources of literature on the Linux kernel. One is the kernel itself. Like many other large-scale projects (such as the Perl sources), the kernel code is heavily commented, and reading through the actual source files for a particular subsystem can be extremely educational. After the kernel, the primary source of information on kernel development is the Linux Kernel Mailing List (LKML). This is the official kernel development mailing list and the home of discussion on all current work in progress.

Programming A Baseball Simulation and Using It For Evaluating Statistical Hypotheses
Research: Evaluation of statistical methods
Implementation: C++

Abstract: Investigate statistical characteristics of winning teams. For example, compare the percentage of games in which the winning team had a higher on-base percentage plus slugging percentage (OPS) than the losing team. Which statistics are most important for winning teams?
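The comparison of on-base percentage plus slugging percentage could be computed roughly as below. This is a sketch under assumptions: the `BattingLine` and `Game` types and all function names are hypothetical and are not part of the project's simulation. The formulas are the standard ones, OBP = (H + BB + HBP) / (AB + BB + HBP + SF) and SLG = total bases / AB.

```cpp
#include <vector>

// Hypothetical per-team batting totals for one game.
struct BattingLine {
    int atBats, hits, walks, hitByPitch, sacrificeFlies, totalBases;
};

// On-base percentage: times on base divided by plate appearances counted for OBP.
double onBasePercentage(const BattingLine& b) {
    int denom = b.atBats + b.walks + b.hitByPitch + b.sacrificeFlies;
    if (denom == 0) return 0.0;
    return static_cast<double>(b.hits + b.walks + b.hitByPitch) / denom;
}

// Slugging percentage: total bases per at-bat.
double sluggingPercentage(const BattingLine& b) {
    if (b.atBats == 0) return 0.0;
    return static_cast<double>(b.totalBases) / b.atBats;
}

double ops(const BattingLine& b) {
    return onBasePercentage(b) + sluggingPercentage(b);
}

struct Game {
    BattingLine winner, loser;
};

// Fraction of games in which the winner had the strictly higher OPS.
double winnerHigherOpsRate(const std::vector<Game>& games) {
    if (games.empty()) return 0.0;
    int higher = 0;
    for (const Game& g : games) {
        if (ops(g.winner) > ops(g.loser)) ++higher;
    }
    return static_cast<double>(higher) / games.size();
}
```

The same pattern, with a different statistic swapped in for `ops`, would let several candidate statistics be ranked by how often the winning team leads in each.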

Example Output/Analysis:
//ORIOLES----------ORIOLES-----------ORIOLES-------
Orioles.Pitchers[0].Name = "Ponson"; //p1
Orioles.Pitchers[0].PitchAcc = 50;
Orioles.Pitchers[0].PitchSpeed = 50;
Orioles.Pitchers[1].Name = "Lopez";
Orioles.Pitchers[1].PitchAcc = 50;
Orioles.Pitchers[1].PitchSpeed = 50;
...
Orioles.Hitters[0].Name = "JLopez";
Orioles.Hitters[0].Power = 50;
Orioles.Hitters[0].Contact = 50;
Orioles.Hitters[0].Speed = 50;
Orioles.Hitters[0].Discipline = 50;
Orioles.Hitters[1].Name = "Palmeiro"; //1b
. . .

Implementations of DNA Sequence Pattern Matching Algorithms
Research: Computational Biology, Bioinformatics
Implementation: MPI, Perl

Abstract: The main objective of this project is to develop or obtain the software and infrastructure necessary to run non-intrusive background programs on the systems lab computers. These programs will increase the overall utilization of those computers and produce some real results for the Bioinformatics community. A secondary objective is to compare the efficiency, accuracy, and parallelization of several existing DNA sequence pattern matching algorithms on different machines.
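As a point of reference for what a DNA sequence pattern matching algorithm does, here is a minimal exact-match sketch. It is a naive baseline chosen for illustration, not one of the algorithms the project compares (the slide does not name them), and `findAll` is a hypothetical name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns every 0-based offset at which `pattern` occurs in `sequence`.
// Overlapping occurrences are reported. Runs in O(n * m) time for a
// sequence of length n and a pattern of length m.
std::vector<std::size_t> findAll(const std::string& sequence,
                                 const std::string& pattern) {
    std::vector<std::size_t> hits;
    if (pattern.empty() || pattern.size() > sequence.size()) return hits;
    for (std::size_t i = 0; i + pattern.size() <= sequence.size(); ++i) {
        if (sequence.compare(i, pattern.size(), pattern) == 0) {
            hits.push_back(i);
        }
    }
    return hits;
}
```

For example, searching "GATATATGCATATACTT" for "ATAT" reports offsets 1, 3, and 9. Because each starting offset is checked independently, the sequence can be split into overlapping chunks and searched in parallel, which is one reason this kind of workload suits idle lab machines.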

Analysis/Output: Data pending

Part-of-Speech Tagging with Limited Training Corpora
Research: Computational Linguistics, “Corpus” Linguistics
Implementation: C++

Abstract: This project is in the area of computational linguistics with a focus on part-of-speech taggers, especially in the case of limited available training materials. Computational methods in linguistics contribute data that furthers statistical studies of language in theoretical linguistics and its practical offshoots. Computational linguistics also forms the core of the field known as natural language processing (NLP). NLP is the attempt to use a computer to interpret and/or produce human language, whether written or spoken. A particularly well-known field within NLP is machine translation. Part-of-speech tagging belongs to an area of computational linguistics known as corpus linguistics. Corpus linguistics analyzes sample bodies of written or spoken text ("corpora," singular "corpus") in order to extract data pertaining to specific languages or to language in general, for practical or theoretical linguistics.

POS tagging has applications in theoretical linguistics, where it improves the description of a language's syntax and supports analysis of how frequently words and meanings are used, as well as in parsing, NLP, and machine translation. It underlies many forms of computer processing of language, and thus it is important that we understand it well.
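To make the tagging task concrete, the sketch below shows a most-frequent-tag baseline, a common starting point when training data is limited. It is an assumption for illustration and not the tagger built in this project; `BaselineTagger`, `TaggedWord`, and the tag names used in the usage note are hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

using TaggedWord = std::pair<std::string, std::string>;  // (word, tag)

class BaselineTagger {
public:
    // Counts how often each tag appears for each word in the training corpus.
    void train(const std::vector<TaggedWord>& corpus) {
        for (const auto& wt : corpus) {
            ++counts_[wt.first][wt.second];
            ++tagTotals_[wt.second];
        }
    }

    // Returns the most frequent tag seen for `word`. Unseen words fall back
    // to the most frequent tag in the whole corpus. Returns an empty string
    // if the tagger has not been trained.
    std::string tag(const std::string& word) const {
        auto it = counts_.find(word);
        const std::map<std::string, int>& table =
            (it != counts_.end()) ? it->second : tagTotals_;
        return argmax(table);
    }

private:
    // Picks the key with the largest count; ties go to the first key in
    // the map's sorted order.
    static std::string argmax(const std::map<std::string, int>& table) {
        std::string best;
        int bestCount = 0;
        for (const auto& kv : table) {
            if (kv.second > bestCount) {
                best = kv.first;
                bestCount = kv.second;
            }
        }
        return best;
    }

    std::map<std::string, std::map<std::string, int>> counts_;
    std::map<std::string, int> tagTotals_;
};
```

Trained on a tiny corpus in which "run" appears twice as VERB and once as NOUN, `tag("run")` returns "VERB". The fallback for unseen words matters most with small corpora, where many test words never occur in training.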

Benchmarking using Cryptographic Algorithms
Research: High performance computing, cryptography
Implementation: C

Abstract: The purpose of this project is to develop a set of benchmark tools that will attempt to analyze cryptographic algorithms. A byproduct of this analysis, which will be very computationally intensive, will be a benchmark of the Systems Lab's computational power.
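A benchmark of this kind pairs a computationally heavy kernel with a timer. The sketch below uses modular exponentiation, a building block of public-key cryptography, as the workload. It is an assumed example and not the project's tool set: `modPow`, `BenchResult`, `benchmarkModPow`, and the chosen modulus and exponent are all illustrative.

```cpp
#include <chrono>
#include <cstdint>

// Square-and-multiply modular exponentiation. Assumes mod < 2^32 so that
// every intermediate product fits in 64 bits. A modulus of 0 or 1 is
// treated as degenerate and yields 0.
std::uint64_t modPow(std::uint64_t base, std::uint64_t exp, std::uint64_t mod) {
    if (mod <= 1) return 0;
    std::uint64_t result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % mod;
        base = (base * base) % mod;
        exp >>= 1;
    }
    return result;
}

struct BenchResult {
    double seconds;
    std::uint64_t checksum;
};

// Times `iterations` calls to modPow. The checksum folds every result
// together so the compiler cannot discard the work being measured.
BenchResult benchmarkModPow(std::uint64_t iterations) {
    const std::uint64_t mod = 4294967291ULL;  // illustrative modulus below 2^32
    std::uint64_t checksum = 0;
    auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        checksum ^= modPow(i + 2, 65537, mod);
    }
    auto stop = std::chrono::steady_clock::now();
    return {std::chrono::duration<double>(stop - start).count(), checksum};
}
```

Running the same iteration count on different machines and comparing `seconds` gives a rough relative measure of their integer throughput; `steady_clock` is used because it is not affected by system clock adjustments.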

Output/Analysis: Data pending
