Object Oriented Database Design - A Case Study Cliff Frazier CS457/657 December 6, 2002
Motivations • Permanent access to internet-published Linux kernel programming information • Explore object oriented database design • Learn Enhanced Entity Relationship (EER) modeling
Internet Sources of Programming Information • FAQs • How To's • Tutorials • Mailing Lists • Newsgroups • On-line Books
DB Design Approach • Data requirements • Functional requirements • Develop data model that represents our “miniworld” - EER modeling • Convert data model to physical model
Data Requirements • Organize information on Linux kernel development • Classify data by subject • Query / View / Report capability • Display mailing list and newsgroup data as threads • Provide annotation capability
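LKPDB Functional Diagram [Figure: data sources (Mailing List, Newsgroup, FAQs, How To, Tutorial, ...) enter LKPDB through Data Parse/Import; the functions shown around LKPDB are Classify, View, Query, Report, and Annotate. The legend distinguishes manual functions from automated functions.]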
Data Parse/Import • Use source-specific data parsing rules where possible (mailing lists & newsgroups): a specific rule set for each mailing list & each newsgroup; automated data import • Use generic data parsing rules otherwise: one rule set for each data source type; manual assistance required for import
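The rule-selection policy on this slide can be sketched as a lookup with a fallback. This is a minimal sketch, not the LKPDB design itself: the `RuleSet` type, the registry names, and the keys are hypothetical and only illustrate "source-specific first, generic otherwise".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleSet:
    name: str
    automated: bool  # False means a person must assist the import


# Hypothetical registries; names and keys are illustrative assumptions.
SOURCE_SPECIFIC_RULES = {
    ("mailing_list", "linux-kernel"): RuleSet("lkml-rules", automated=True),
}
GENERIC_RULES = {
    "faq": RuleSet("generic-faq", automated=False),
    "howto": RuleSet("generic-howto", automated=False),
    "tutorial": RuleSet("generic-tutorial", automated=False),
}


def select_rules(source_type: str, source_name: str) -> RuleSet:
    """Prefer a rule set written for this exact source; otherwise fall
    back to the single generic rule set for the source type."""
    specific = SOURCE_SPECIFIC_RULES.get((source_type, source_name))
    if specific is not None:
        return specific
    return GENERIC_RULES[source_type]  # KeyError for an unknown type
```

A caller could check `select_rules(...).automated` to decide whether the import can run unattended or needs manual assistance.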
Mailing List Parsing - Header
Received: from vmg.prodigy.net by vmg with SMTP; Thu, 5 Dec 2002 12:17:39 -0500
X-Originating-IP: [209.116.70.75]
. . .
Date: Thu, 5 Dec 2002 09:03:03 -0800 (PST)
From: Linus Torvalds <torvalds@transmeta.com>
To: george anzinger <george@mvista.com>
cc: Jim Houston <jim.houston@ccur.com>,
    Stephen Rothwell <sfr@canb.auug.org.au>,
    LKML <linux-kernel@vger.kernel.org>, <anton@samba.org>,
    "David S. Miller" <davem@redhat.com>, <ak@muc.de>,
    <davidm@hpl.hp.com>, <schwidefsky@de.ibm.com>, <ralf@gnu.org>,
    <willy@debian.org>
Subject: Re: [PATCH] compatibility syscall layer (lets try again)
In-Reply-To: <3DEF20E2.5AEE3E78@mvista.com>
Message-ID: <Pine.LNX.4.44.0212050846100.27298-100000@home.transmeta.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
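A header like the one on this slide can be broken into fields with Python's standard `email` package. The sketch below is one possible approach, not the parser used in the project; the dictionary keys are hypothetical names chosen for illustration, and `RAW_HEADER` is a subset of the header shown on the slide.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# Subset of the slide's header; a blank line ends the header block.
RAW_HEADER = """\
Date: Thu, 5 Dec 2002 09:03:03 -0800 (PST)
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: [PATCH] compatibility syscall layer (lets try again)
In-Reply-To: <3DEF20E2.5AEE3E78@mvista.com>
Message-ID: <Pine.LNX.4.44.0212050846100.27298-100000@home.transmeta.com>

"""


def parse_post_header(raw: str) -> dict:
    """Extract the fields a post record would need (key names are assumptions)."""
    msg = message_from_string(raw)
    author_name, author_address = parseaddr(msg["From"])
    return {
        "author": author_name,
        "author_address": author_address,
        "subject": msg["Subject"],
        "date": parsedate_to_datetime(msg["Date"]),
        "message_id": msg["Message-ID"],
        "in_reply_to": msg["In-Reply-To"],  # None for a thread's first post
    }


info = parse_post_header(RAW_HEADER)
```

The `In-Reply-To` and `Message-ID` fields are the ones that make threaded display possible, since a reply names the message it answers.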
Mailing List Parsing - Body
On Thu, 5 Dec 2002, george anzinger wrote:
>
> I think this covers all the bases. It builds boots and
> runs. I haven't tested nano_sleep to see if it does the
> right thing yet...

Well, it definitely doesn't, since at least this test is the wrong way around (as well as being against the coding style whitespace rules ;-p):

+ if ( ! current_thread_info()->restart_block.fun){
+ return current_thread_info()->restart_block.fun(&parm);

Also, I would suggest against having a NULL pointer, and instead just initializing it with a function that sets it to an error return (don't use ENOSYS, since the system call _does_ exist, and ENOSYS is what old kernels would return if you do it by hand by mistake. I'd suggest -EINTR, since that will "DoTheRightThing(tm)" if we somehow get confused).

Linus
Mailing List Parsing - Postscript
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
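A source-specific rule for this list might separate the quoted text, the new text, and the list postscript. The sketch below assumes the postscript starts at the "To unsubscribe from this list:" line, optionally preceded by a lone "-" separator, and that quoted lines start with ">". Both are assumptions drawn from the sample post, and `split_post` is a hypothetical helper name.

```python
FOOTER_MARKER = "To unsubscribe from this list:"


def split_post(text: str) -> dict:
    """Split a post into quoted lines, new lines, and the list postscript."""
    lines = text.splitlines()

    # Find where the list-added postscript begins (end of text if absent).
    footer_start = len(lines)
    for i, line in enumerate(lines):
        if FOOTER_MARKER in line:
            footer_start = i
            break

    # A lone "-" directly above the marker belongs to the postscript.
    if 0 < footer_start < len(lines) and lines[footer_start - 1].strip() == "-":
        footer_start -= 1

    body = lines[:footer_start]
    return {
        "quoted": [l for l in body if l.startswith(">")],
        "new": [l for l in body if l.strip() and not l.startswith(">")],
        "postscript": lines[footer_start:],
    }


sample = (
    "On Thu, 5 Dec 2002, george anzinger wrote:\n"
    "> I think this covers all the bases.\n"
    "Well, it definitely doesn't.\n"
    "-\n"
    "To unsubscribe from this list: send the line ...\n"
)
parts = split_post(sample)
```

Note that the attribution line ("On ... wrote:") lands in `new` here; a fuller rule set would need its own pattern for it.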
Parsing for Linux Threads FAQ
What kinds of things should be threaded or multitasked?
If you are a programmer and would like to take advantage of multithreading, the natural question is what parts of the program should/should not be threaded. Here are a few rules of thumb (if you say "yes" to these, have fun!):
• Are there groups of lengthy operations that don't necessarily depend on other processing (like painting a window, printing a document, responding to a mouse-click, calculating a spreadsheet column, signal handling, etc.)?
• Will there be few locks on data (the amount of shared data is identifiable and "small")?
• Are you prepared to worry about locking (mutually excluding data regions from other threads), deadlocks (a condition where two COEs have locked data that other is trying to get) and race conditions (a nasty, intractable problem where data is not locked properly and gets corrupted through threaded reads & writes)?
• Could the task be broken into various "responsibilities"? E.g. Could one thread handle the signals, another handle GUI stuff, etc.?
Classification • Both automatic & manual modes • Each entry classified based on keywords • Multiple categories allowed • Categories: • Scheduler • Virtual memory management • File system
Classification Categories (cont) • Interprocess communication • Modules • Networking • Architecture related • Symmetric multiprocessing • Device drivers • Compiling • Debugging
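Automatic keyword classification with multiple categories per entry could look like the sketch below. The category names come from the slides, but the keyword sets are invented for illustration and are not the project's actual keyword lists.

```python
import re

# Category names are from the slides; the keywords are illustrative assumptions.
CATEGORY_KEYWORDS = {
    "Scheduler": {"scheduler", "timeslice", "runqueue"},
    "Virtual memory management": {"vm", "swap", "paging"},
    "Device drivers": {"driver", "irq"},
    "Symmetric multiprocessing": {"smp", "spinlock"},
}


def classify(text: str) -> list[str]:
    """Return every category whose keyword set overlaps the entry's words."""
    words = set(re.findall(r"[a-z0-9_]+", text.lower()))
    return sorted(
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if words & keywords
    )
```

An entry that matches no keyword returns an empty list, which is the natural point to hand it over to manual classification.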
Query Operations • SQL based • Queries used for Views, Reports, Annotations, and Classification • Primary use is to perform SELECTs that search for data to view or print • Also includes keyword search capability
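A keyword search expressed as a SELECT might be sketched as follows, using Python's built-in `sqlite3` module with an in-memory database. The `list_entry` table, its columns, and the rows are hypothetical; the slides do not give a physical schema, and the rows are illustrative data loosely based on the sample post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE list_entry ("
    " post_sn INTEGER PRIMARY KEY, author TEXT, subject TEXT, body TEXT)"
)
# Illustrative rows only.
conn.executemany(
    "INSERT INTO list_entry VALUES (?, ?, ?, ?)",
    [
        (1, "example author", "[PATCH] example patch", "It builds and boots."),
        (2, "Linus Torvalds",
         "Re: [PATCH] compatibility syscall layer (lets try again)",
         "I'd suggest -EINTR"),
    ],
)


def keyword_search(conn: sqlite3.Connection, keyword: str) -> list:
    """SELECT posts whose subject or body contains the keyword."""
    pattern = f"%{keyword}%"
    return conn.execute(
        "SELECT post_sn, author, subject FROM list_entry"
        " WHERE subject LIKE ? OR body LIKE ?"
        " ORDER BY post_sn",
        (pattern, pattern),
    ).fetchall()
```

Passing the keyword as a bound parameter rather than formatting it into the SQL string keeps user-supplied search terms from altering the query.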
Annotation Example from the Kernel HowTo
. . .
7. Now, give the make command -
The gcc compiler distributed with RedHat 7.0 will not compile the kernel correctly. They do supply a kernel compatible compiler as well, which is invoked with kgcc. On RH 7.0 distributions of Linux, change all occurrences of gcc to kgcc in the root level Makefile before giving the make command.
___________________________________________________________
bash# cd /usr/src/linux
bash# man nohup
bash# nohup make bzImage &
bash# man tail
bash# tail -f nohup.out (.... to monitor the progress)
This will put the kernel in /usr/src/linux/arch/i386/boot/bzImage
___________________________________________________________
. . .
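One way to support annotation without modifying the imported document is to store each note separately with a character offset and merge it in only at display time. The slides do not say how LKPDB stores annotations, so the `Annotation` type and the `render` function below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    offset: int  # character position in the document text (assumed scheme)
    text: str


def render(doc_text: str, annotations: list[Annotation]) -> str:
    """Return the document with each note inserted at its offset.
    The stored document text itself is never changed."""
    pieces = []
    position = 0
    for note in sorted(annotations, key=lambda a: a.offset):
        pieces.append(doc_text[position:note.offset])
        pieces.append(f"[NOTE: {note.text}]")
        position = note.offset
    pieces.append(doc_text[position:])
    return "".join(pieces)


doc = "step one. step two."
notes = [Annotation(offset=doc.index(" step two"), text="use kgcc")]
annotated = render(doc, notes)
```

Keeping notes out of the source text means a document can be re-imported or re-parsed without losing or duplicating annotations, provided the offsets are updated.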
Data Model • Enhanced Entity Relationship (EER) modeling • Enhanced = object oriented concepts • Initial design: list entity types and their attributes • Refinement: some attributes converted to relationships
Mailing List Entity Attributes • Entity type: MAILING_LIST_POST • Name, e.g. Linux-Kernel M.L. * • Serial number • Author • Subject • Date/time stamp • Header • Body of post * • (* = converted to relationships)
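The refinement step, where two starred attributes become relationships, can be pictured with plain Python classes. This is a sketch under assumptions: the class and field names are hypothetical, and the slides do not specify what the refined relationships are called.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class MailingList:
    """Target of the relationship that replaces the "Name" attribute."""
    name: str


@dataclass
class PostBody:
    """Target of the relationship that replaces "Body of post"."""
    text: str


@dataclass
class MailingListPost:
    # Attributes that stay attributes after refinement.
    serial_number: int
    author: str
    subject: str
    timestamp: datetime
    header: str
    # Former attributes, now references to other entities.
    mailing_list: MailingList
    body: PostBody


lkml = MailingList("Linux-Kernel M.L.")
post = MailingListPost(
    serial_number=1,
    author="Linus Torvalds",
    subject="Re: [PATCH] compatibility syscall layer (lets try again)",
    timestamp=datetime(2002, 12, 5, 9, 3, 3),
    header="(raw header text)",
    mailing_list=lkml,
    body=PostBody("(post text)"),
)
```

Turning the list name into a reference lets many posts share one `MailingList` object instead of repeating the name in every post.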
EER Diagrams • Rectangle - entity • Oval - attribute • Diamond - relationship • Structural constraints • Participation • Cardinality ratio • Added types besides mailing lists
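[EER diagram; only the labels are recoverable. Entity type labels: INFO_SOURCE, DOCUMENT, LIST, LIST_ENTRY, FAQ *, UNSTR. *, BOOK *, DOC_TEXT, A_TEXT. Relationship labels: INCLUDES, CONTAINS, Annotated, and Parent/Child, with cardinality labels 1 and N. Attribute and other labels: Name, URL, Type, Date_Stamp, Date Pub., Author, Subject, Keyword, Category, Thread_SN, Post_SN, Quest., Ans., Chap., Subs., Offset, Size, TEXT. The letter d marks disjoint specializations.] * Same relationships to DOC_TEXT and A_TEXT as LIST_ENTRY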
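The Parent and Child labels on LIST_ENTRY, together with Post_SN, suggest that each post can point at the post it replies to, which is what threaded display needs. The sketch below assumes each entry is a `(post_sn, parent_sn, subject)` tuple with `parent_sn` set to `None` for the first post of a thread; that tuple layout and the `format_thread` name are hypothetical.

```python
from collections import defaultdict
from typing import Optional

Entry = tuple[int, Optional[int], str]  # (post_sn, parent_sn, subject)


def format_thread(entries: list[Entry]) -> list[str]:
    """Return display lines, indenting each reply under its parent."""
    children = defaultdict(list)
    for post_sn, parent_sn, subject in entries:
        children[parent_sn].append((post_sn, subject))

    lines: list[str] = []

    def walk(parent_sn: Optional[int], depth: int) -> None:
        # Visit replies in serial-number order, depth first.
        for post_sn, subject in sorted(children.get(parent_sn, [])):
            lines.append("  " * depth + f"{post_sn}: {subject}")
            walk(post_sn, depth + 1)

    walk(None, 0)
    return lines


thread = format_thread([
    (1, None, "A"),
    (2, 1, "Re: A"),
    (3, 1, "Re: A"),
    (4, 2, "Re: A"),
])
```

For mail data, the parent link could be derived by matching a post's In-Reply-To value against the Message-ID of an earlier post.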
Conclusion • An OODB for Linux kernel programming information was designed using EER • Attributes vs. relationship roles change during design process • The design methodology influences the content of the DB • Next project - Implement DB