340 likes | 489 Views
Creating the Physical Model. Designing the Physical Model. Phase IV: Defining the physical model. Database Object Naming Conventions. Keep the logical and physical names similar and descriptive. Capitalize table and attribute names.
E N D
Designing the Physical Model • Phase IV: Defining the physical model
Database Object Naming Conventions • Keep the logical and physical names similar and descriptive. • Capitalize table and attribute names. • Use underscores instead of spaces to delineate separate words in an object’s name. • Use a suffix of _PK to indicate primary keys. • Use a suffix of _ID to indicate production keys. • Find a good balance between using very specific and very vague names.
Database Object Naming Conventions • Develop a reasonable list of abbreviations. • List all the objects’ names, and work with the user community to define them. • Resolve name disputes. • Document your naming standards in the metadata document. • Plan for the naming standards to be a living document.
Translating the DimensionalModel into a Physical Model • Apply the naming standards to the tables and attributes of the dimensional model. • List table columns with primary keys listed first. • Label primary keys consistently. • Identify the format and length of columns. • Label unique keys with a (#). • Label column optionality with NULL (o) or NOT NULL (*) constraints. • Label foreign keys with _FK. • Use synonyms for user tables.
# *Product _PK n# *Channel_PK n# *Promotion_PK n * PRODUCT_ID v(11)* PRODUCT_DESC v(125)* PRODUCT_NAME v(35)* CATEGORY_ID v(20)* CATEGORY_DESC v(25)* SUPPLIER_ID v(20)* PRODUCT_STATUS v(10)* LIST_PRICE n* CATALOG_ID v(20)* PRODCUT_TYPE v(20)* PRODUCT_CODE v(10)* PROMOTION_CODE v(10)* WHSE_LOCATION v(10)* VALID_FROM_DATE d* VALID_TO_DATE d Physical ModelProduct
Defining the Hardware • Transforming the base dimensional data model into the physical model includes some of the following: • Defining naming and database standards • Performing an initial sizing • Designing tablespaces • Defining an initial indexing strategy • Using partitioning to split table and index data into smaller, more manageable chunks • Determining where to place database objects on disk (RAID, striping, disk mapping) • Using parallel processing
Architectural Requirements Scalability Manageability Business Technology User Budget Availability Extensibility Flexibility Integrated Accessibility Reliability
VLM (very large memory) 64-bit Connective Open Architecture Characteristics • Robust • Available • Reliable • Extensible • Scalable • Supportable • Recoverable • Parallel
Hardware Requirements • SMP (Symmetric multiprocessing) • Cluster and MPP (massively parallel processing) • Hybrids using SMP and MPP
Evaluation Criteria • Determine the platform for your needs: SMPClustersMPP High Scalability Low High Low Maturity
Parallel Processing • Parallel daily operations • Shared resources • Memory • Disk • Nothing • Loosely or tightly coupled Operatingsystem Hardware Application Database
Making the Right Choice • Requirements differ from operational systems • Benchmark • Available from vendors • Develop your own • Use realistic queries • Scalability important
Symmetric Multiprocessing (SMP) • Communication by shared memory • Disk controllers accessible to all CPUs • Proven technology CPU CPU CPU CPU Commonbus Shared memory Shareddisks
SMP CPU CPU CPU CPU • Benefits: • High concurrency • Workload balancing • Moderate scalability • Easy administration • Limitations: • Memory (cluster for improvements) • Bandwidth Shared memory
Clusters Node1 Node2 Node3 CPU CPU CPU CPU CPU CPU CPU CPU CPU Shared memory Shared memory Shared memory Commonhigh-speedbus Commonhigh-speedbus Shareddisks
Clusters • Shared disk, loosely coupled • Dedicated memory • High-speed bus • Shared resources • SMP node
Massively Parallel Processing (MPP) CPU CPU CPU CPU Memory Memory Memory Memory Disk Disk Disk Disk
MPP nCube Arrangements • A shared nothing architecture • Many nodes • Fast access • Exclusive memory on a node • Low cost per node • Scalable • nCUBE configuration
MPP Benefits • Unlimited incremental growth • Very scalable • Fast access • Low cost per node • Good for DSS
MPP Limitations • Rigid partitioning • Cache consistency • Restricted disk access • High memory cost per node • High management burden • Careful data placement
Distributed structures: Two-tier Three-tier Four-tier (and more) Architectural Tiers • Tiered structures: • Modular • Logical separation DB server Apps server Workstations Web server Internet
Middleware • Technologies for integration Gateway
Database Server Requirements • Robust • Available • Reliable • Extensible • Scalable • Supportable • Recoverable • Parallel
Parallelism • Database • Query • Load • Index • Sort • Backup • Recovery
Further Considerations • Optimization strategy • Partitioning strategy • Summarization strategy • Indexing techniques • Hardware and software scalability • Availability • Administration
Parallel Processing • A large task broken into smaller tasks: • Concurrent execution • One or more processors Elapsedtime Notparallel Processor1 Parallel Processor1 Processor2 Processor3 Processor4
Parallel Database • Increased speed • Improved scalability • Performance gains • Availability • Flexibility • More users Parallel Processor1 Processor2 Processor3 Processor4
Parallel Query • SQL code split among server processes Subquery Subquery Query Subquery
Parallel Load • Bypass SQL processing to speed throughput Jan98 Feb98 Mar98 Ordertable
Parallel Processing • Index: reduces the time to create • Sort: allocates memory in cache efficiently
Parallel Processing • Backup: runs simultaneously from any node (online and offline) • Recovery: runs simultaneously from redo logs • Summaries: uses the CREATE TABLE AS SELECTstatement