2013 Esri International User Conference July 8–12, 2013 | San Diego, California. Technical Workshop. Planning: Enterprise Geodatabase Solutions. John Alsup Jeff DeWeese. Agenda. Overview Database Design Data Maintenance Infrastructure Design
2013 Esri International User Conference July 8–12, 2013 | San Diego, California Technical Workshop Planning: Enterprise Geodatabase Solutions John Alsup Jeff DeWeese
Agenda • Overview • Database Design • Data Maintenance • Infrastructure Design • Data Distribution and Infrastructure Security
What is a Geodatabase? • A database or file structure used to store, query, and manipulate spatial data • Data and functionality • Three types: • File Geodatabase • Personal Geodatabase • ArcSDE Geodatabase (DB2, Informix, Oracle, PostgreSQL, SQL Server) [Diagram labels: networks, surveys, addresses, vectors, annotation, 3D objects, attributes, topology, dimensions, terrain, CAD drawings, images]
Enterprise GIS • GIS technology regarded by users and IT as key to business operations • May be considered mission critical • Mainstream IT – deployed and managed like any other IT system • Architecture, Interfaces, Development tools, Deployment strategies, Standards • Integrated with other enterprise systems • Requires a higher level of planning, integration, testing and support
What is an Enterprise Geodatabase? • Data • Serves data promptly and efficiently • Supports multiple users and departments concurrently • Provides seamless data • Centralized data management • Data integrity • Functionality • SQL support • Collaborative editing and long transactions • Quality control and quality assurance • Infrastructure for distributing and replicating data • Integrates spatial and business data with other systems • Leverages existing GIS and IT skills and resources
Enterprise Geodatabase [Diagram labels: clients: ArcMap, ArcCatalog, ArcGIS Server, ArcGIS Engine, ArcGIS Runtime; access paths: ArcObjects, native SQL; user schemas: business tables, feature tables, spatial index tables, A and D tables, raster tables, topology tables, geometric network tables, miscellaneous tables, non-spatial business table; geodatabase system schema: GDB_ tables, ArcSDE tables; other labels: log files, searching, spatial processing, temp]
Organizational GIS Configurations • Departmental GIS: department file servers, distributed client/server, departmental GIS operations • Centralized Data Warehouse: ArcSDE centralized database, centralized data sharing • Enterprise GIS Operations: ArcGIS Server/Terminal Servers (server consolidation), centralized database, centralized data administration; clients are ArcGIS Desktops, terminals and browsers [Diagram labels: Utilities, Parks, IT, and Assessor departments connected over a WAN]
Why Plan an Enterprise Geodatabase? • Some key reasons: • Foundation for enterprise-wide use of GIS • Geodatabase projects are complex • Enterprise geodatabase and GIS application design require diligent alignment • Large geodatabase projects span organizational groups and disciplines • Impacts almost every part of an enterprise GIS solution • Spatial data is a key component of an enterprise GIS architecture; delivery of spatial data must be fast, and this requires planning.
Geodatabase Project Scales • Larger Multi-phased Approach • Elaborate, large databases • Custom applications • Large user base • Potentially outsourced, dedicated project management • Lighter Workgroup Approach • Evolve the geodatabase, gradually upgrade data and applications • COTS application functionality where possible • Built in-house, part-time project management All enterprise geodatabase projects require planning …
Agenda • Overview • Database Design • Data Maintenance • Infrastructure Design • Data Distribution and Infrastructure Security
Challenges and Risks • Application development has a critical dependency • Normalization in the data model • Updating the model “downstream” is expensive • Mass updates are expensive in an Enterprise Environment • Thorough review of model among teams • Optimizing for publication and maintenance
Geodatabase Design • Elements of good Geodatabase design • Data model reflects requirements • Scalable • Avoids redundant storage of data items • Efficient access to data • Maintains data integrity over time • Clearly documented • Provides for analysis and behavior
Data Modeling Methodology: Three Stages • Conceptual Model, Conceptual Design Tasks: • Identify business needs • Identify thematic layers • Identify required applications • Leverage data model template • Document • Logical Model, Logical Design Tasks: • Define tabular database structure • Define relationships • Determine spatial properties • Document • Physical Model, Physical Design Tasks: • Create and implement model design • Generate physical schema in the DBMS • Testing and validation • Document
Conceptual Model • Identify and Document: • Business needs - requirements • Thematic layers • Required applications and system interfaces • Leverage existing model templates • Pre-designed schema of data objects • Best practices
ArcGIS Data Models • Over 25 industry-specific data models • Conceptual and logical diagrams, sample Geodatabase schemas • Case studies • Tips and Tricks documents • Developed and maintained by user and industry communities • Web site: http://support.esri.com/downloads/datamodel
Logical Model Design • Refine conceptual model based on documented requirements • Define and clarify all feature classes, tables, attributes and relationship classes • Use subtypes to control object behavior • Ex. Geometric Network can enforce behavior • Attribute domains and complex coding • Define network and topological properties and rules • Define spatial reference properties • Map placement considerations
Logical Model Design • Projection • Projection on the fly can be expensive • All feature classes in the same Geometric Network must use the same spatial reference • Density of Features • High vertex count can be expensive • Can adversely affect functionality and usability • Spatial placement vs. Logical placement • Data update cycle • Replacement vs. editing
Physical Model Design • Implementing the physical Geodatabase - prototype, test, review, and refine • Documenting the design for distribution and efficient updating • Test, refine and tune data model design for deployment
RDBMS Geometry Storage Format
Important Considerations • Field Names • Geometry Storage Types • RDBMSs used • External systems and interfaces – key for enterprise GIS • CRM, WMS, SAP, other Financials, Reporting • Number of interfaces depends upon the organization • Consider data sharing - field data types, naming and length
External System Interface • ETL • Database Level, duplicating data • Triggers • Update tables • Database Views • Joins data from same or different databases
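The database-view option above can be sketched as follows. This is a minimal illustration only: SQLite stands in for the enterprise RDBMS, and the table and view names (`parcels`, `billing`, `parcel_billing`) are hypothetical, not taken from the workshop. A real deployment would join a geodatabase feature table to a table owned by the external system.

```python
import sqlite3

def build_demo(conn):
    """Create two hypothetical tables and a view that joins them."""
    conn.executescript("""
        CREATE TABLE parcels (parcel_id INTEGER PRIMARY KEY, address TEXT);
        CREATE TABLE billing (parcel_id INTEGER, balance REAL);
        INSERT INTO parcels VALUES (1, '27 Main St.'), (2, '31 Main St.');
        INSERT INTO billing VALUES (1, 120.50), (2, 0.0);
        -- The view exposes joined data without duplicating it (unlike ETL).
        CREATE VIEW parcel_billing AS
            SELECT p.parcel_id, p.address, b.balance
            FROM parcels p JOIN billing b ON b.parcel_id = p.parcel_id;
    """)

def parcels_with_balance(conn):
    """Query the view as if it were a single table."""
    cur = conn.execute(
        "SELECT parcel_id, address, balance FROM parcel_billing "
        "WHERE balance > 0 ORDER BY parcel_id"
    )
    return cur.fetchall()

conn = sqlite3.connect(":memory:")
build_demo(conn)
print(parcels_with_balance(conn))
```

Compared with ETL or triggers, a view keeps a single copy of the business data, at the cost of running the join at query time.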
Creating Geodatabase Structure • Look to existing tools • CASE and UML tools – Visio, Rational Rose, etc. • Other tools (some free) and samples may work depending on approach • Inheritance, re-use of objects through abstract and concrete classes [Diagram labels: XMI (XML Design), Physical Model]
Data Modeling Tools • Visio • Rational Rose • Enterprise Architect • Free Esri Tools on ArcScripts: • ArcGIS Diagrammer • GDB Xray • Geodatabase Diagrammer • Geodatabase Designer • Free tools are not supported
Mixed RDBMS Environments • For consideration: • Field Names, length and keywords • Field Data Types and Lengths • Database behaviors [Diagram labels: Utilities, Parks, IT, and Assessor departments connected over a WAN; Oracle, SQL Enterprise, SQL Express, DB2]
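A field-name check for a mixed RDBMS environment might look like the sketch below. The length limits and the reserved-word sample are assumptions for illustration and are deliberately incomplete; verify both against the documentation for the DBMS versions actually deployed. The function name `check_field_name` is hypothetical.

```python
import re

# Assumed identifier length limits, for illustration only.
MAX_NAME_LENGTH = {"oracle": 30, "sqlserver": 128, "postgresql": 63}

# A small, incomplete sample of words that are reserved in some DBMSs.
RESERVED = {"SELECT", "TABLE", "ORDER", "GROUP", "USER", "DATE", "LEVEL", "SIZE"}

def check_field_name(name, targets):
    """Return a list of portability problems for a proposed field name."""
    problems = []
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        problems.append("use letters, digits and underscores, starting with a letter")
    if name.upper() in RESERVED:
        problems.append("reserved word in at least one DBMS")
    # The most restrictive target decides the usable length.
    limit = min(MAX_NAME_LENGTH[t] for t in targets)
    if len(name) > limit:
        problems.append(f"longer than {limit} characters")
    return problems

for candidate in ["parcel_id", "ORDER", "a" * 31]:
    print(candidate, check_field_name(candidate, ["oracle", "sqlserver"]))
```

Running a check like this against the logical model before the physical schema is generated is cheaper than renaming fields after applications depend on them.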
Mixed RDBMS GDB License Levels • For consideration: • Domain authentication • Field Data Types and Lengths • Database behaviors [Diagram labels: Utilities, Parks, IT, and Assessor departments connected over a WAN; GDB Enterprise, GDB Workgroup]
Testing and Refining • Small pilot data migration with sample data • Application testing – Test workflows • Functionality • Performance • Flexibility and consistency • Team review and demonstration • Show how tasks are performed using GIS • Show maps, reports, online demos
Data Planning • Migration and Conversion • Migration moves existing geospatial data between different GIS environments or platforms • Conversion creates new digital geospatial data • Conversion is typically more significant and costly than migration • Data procurement • Landbase • Imagery • Data loading • Tools – In-house or outsourced • Procedures
Agenda • Overview • Database Design • Data Maintenance • Infrastructure Design • Data Distribution and Infrastructure Security
Overview of Data Maintenance • Plan and manage the maintenance workflow in the geodatabase • Key Tasks • Analyze and build on business process requirements • QA/QC • Design your maintenance strategy • Plan for versioning • Define maintenance workflows
Consider QA / QC • Ensure data is captured, loaded and maintained accurately • Quality Assurance • Review data to discover errors and perform data cleaning activities to improve quality. • Quality Control • Ensure data products are designed to meet or exceed data requirements. • QA/QC Plan • Versioning • Manual and automated procedures • Validations
Versioning and Multiuser Geodatabase • Defining versioning specifications and workflows: • Versioning structure • Reconcile, post, compress regimes • Edit volumes, version durations • All impact performance [Diagram labels: DEFAULT, Versioned Editing, Non-Versioned Editing]
Considerations for Versions • Decide how versions will be handled: • Lifespan • Reconciling • Conflict management • Naming conventions • Structure • Staging or QC version between user versions and DEFAULT • Security • Versions for groups or departments • Workflow Management Systems for Handling Versions • Can provide workflows and efficiencies; some examples: • Job Tracking for ArcGIS (JTX) • ArcFM and Network Engineer – in the utility area
Advanced GDB Functionality • Relationship Classes • Persisted vs. temporal • Geometric Network • Performance implications • Topology
User Workflows • Document with Use Cases • A description of the task you need to perform: • “Add new parcel”, “Update new asset” • Evaluate business needs: • What data needs to be edited and in what order • Tracking of data changes • Conflict detection and resolution • Security – user roles, etc. • QA/QC steps – enforced through application or database [Diagram labels: use case “Add new service”, version update, geodatabase]
Data Performance and Scalability • Essential Tasks • Review anticipated data loads • Volume (data file growth management) • Volatility (storage partitioning) • Identify key business transactions • Maintenance operations • Publication operations • Identify performance requirements for key business transactions • Response time • Initial and scheduled user loads • Throughput • Testing
Performance • Geodatabase designs • Potential performance issues related to database design • Relationships • Both number and type • Schema Cache can help reduce performance cost • Size of data stored in records • Projection on the fly • Number of records returned in a query • Density of data, both number of features and number of vertices • Application design • Can have a significant effect on performance; e.g., • Frequently opening a table • Retrieving features one at a time vs. bulk
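The "one at a time vs. bulk" point can be illustrated with a small sketch. SQLite stands in for the geodatabase here, and the `assets` table and both function names are hypothetical. The example counts queries issued rather than timing them, because against a remote database server each query also pays a network round trip.

```python
import sqlite3

def make_db(n):
    """Build an in-memory table of n hypothetical asset records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE assets (objectid INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO assets VALUES (?, ?)",
        [(i, f"asset-{i}") for i in range(1, n + 1)],
    )
    return conn

def fetch_one_at_a_time(conn, ids):
    """Issue one query per record: len(ids) queries in total."""
    rows, queries = [], 0
    for oid in ids:
        cur = conn.execute(
            "SELECT objectid, name FROM assets WHERE objectid = ?", (oid,)
        )
        rows.append(cur.fetchone())
        queries += 1
    return rows, queries

def fetch_bulk(conn, ids):
    """Fetch the same records with a single query."""
    placeholders = ",".join("?" for _ in ids)
    cur = conn.execute(
        f"SELECT objectid, name FROM assets "
        f"WHERE objectid IN ({placeholders}) ORDER BY objectid",
        list(ids),
    )
    return cur.fetchall(), 1

conn = make_db(500)
ids = list(range(1, 101))
single_rows, single_queries = fetch_one_at_a_time(conn, ids)
bulk_rows, bulk_queries = fetch_bulk(conn, ids)
print(single_queries, bulk_queries, single_rows == bulk_rows)
```

Both functions return the same records; the difference is the number of queries the application sends, which is the cost the slide warns about.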
Agenda • Overview • Database Design • Data Maintenance • Infrastructure Design • Data Distribution and Infrastructure Security • Database Maintenance & Performance
Infrastructure Design Key Questions • Is it available enough? • Is it big enough (i.e., capacity)? • Is it continuous enough? • Is it performant enough? • Have constraints been removed?
System Availability • Define availability requirements • The business defines its needs and states system availability as a non-functional requirement. • Example: "System should be available and online from 5 am - 10 pm PST 7 days a week with peak time 6 am - 6 pm." • IT responds by providing a standards-based technical solution. • Balance between benefits and costs • More servers or more complex servers • More servers means more software • More servers means more administration • Consider maintenance windows • Compress / DB statistics • Reconcile / Post services • Database schema changes / software patching • Database integrity checks post-restore
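The availability example above can be turned into simple arithmetic. The sketch below uses the slide's 5 am to 10 pm, 7 days a week requirement; the 99.9% availability target is an assumed figure added for illustration and does not come from the workshop.

```python
def service_hours_per_week(start_hour, end_hour, days_per_week):
    """Hours per week the system must be online (24-hour clock)."""
    return (end_hour - start_hour) * days_per_week

def allowed_downtime_minutes(service_hours, availability):
    """Unplanned downtime budget, in minutes, within the service hours."""
    return service_hours * 60 * (1 - availability)

# Requirement from the slide: online 5 am - 10 pm, 7 days a week.
weekly_service = service_hours_per_week(5, 22, 7)

# Whatever is left in the week is available as a maintenance window.
weekly_maintenance = 24 * 7 - weekly_service

# Assumed target of 99.9% availability during service hours.
budget = allowed_downtime_minutes(weekly_service, 0.999)

print(weekly_service, weekly_maintenance, round(budget, 2))
```

With these inputs the requirement leaves 49 hours per week (7 hours per night) for compress, statistics, reconcile/post services, and patching, and the assumed target allows roughly 7 minutes of unplanned downtime per week.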
High-Availability DB Solutions • Virtual Server Clusters* • Provides “basic HA” • Recovery time can be tens of minutes • Active/Passive Fail-Over Clusters • Services fail-over to stand-by node • Down time measured in minutes • Semi-complex • Active/Active Clusters (e.g., Oracle RAC) • Services fail-over to remaining active nodes • Down time measured in under a minute • Costly / Complex • Fault Tolerant Clusters • Provides seamless failover • Zero down time • Costly / Complex *Caution: Virtualizing “large” database servers is not recommended.
DB Server Processing Capacity • Processing capacity is a function of: • CPU service time • Throughput • Max allowed CPU% • Relative performance of the hardware Proper capacity is required to support expected peak user loads while maintaining reasonable performance.
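One way to combine the four factors above is the utilization law from queueing theory (busy cores = throughput × service time). The sketch below is a generic capacity estimate under that assumption, not a formula taken from the workshop; the function name and all input values are hypothetical.

```python
import math

def required_cores(peak_throughput_per_sec, service_time_sec,
                   max_cpu_fraction, relative_performance=1.0):
    """Estimate CPU cores needed to carry a peak load.

    relative_performance is the speed of the target hardware relative to
    the hardware on which service_time_sec was measured (2.0 = twice as fast).
    """
    # Faster hardware shortens the CPU service time per transaction.
    adjusted_service_time = service_time_sec / relative_performance
    # Utilization law: average number of busy cores at peak load.
    busy_cores = peak_throughput_per_sec * adjusted_service_time
    # Keep utilization at or below the allowed ceiling.
    return math.ceil(busy_cores / max_cpu_fraction)

# Hypothetical load: 50 transactions/s, 0.05 s CPU each, 80% CPU ceiling.
print(required_cores(50, 0.05, 0.8))
# The same load on hardware assumed to be twice as fast.
print(required_cores(50, 0.05, 0.8, relative_performance=2.0))
```

The estimate covers CPU only; memory, storage, and network constraints need separate checks.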
DB Server Memory Capacity • Memory capacity is a function of: • Number of DB instances • Memory per connection • Number of MXD layers • Number of connections • Database size • Index size Providing adequate memory for the database server is critical for scalability and performance.
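A rough memory estimate can be sketched by adding up the factors listed above. The additive model, the function name, and every number below are assumptions for illustration; actual per-connection and cache requirements depend on the DBMS, the map documents, and the data, and should be measured.

```python
def estimate_memory_gb(os_overhead_gb, instances, base_per_instance_gb,
                       connections, mb_per_connection, cache_gb):
    """Sum the main memory consumers on a database server (in GB)."""
    # Each connection holds its own working memory; more MXD layers
    # generally means more memory per connection.
    connection_gb = connections * mb_per_connection / 1024
    # cache_gb stands for buffer cache sized from database and index size.
    return (os_overhead_gb
            + instances * base_per_instance_gb
            + connection_gb
            + cache_gb)

# Hypothetical server: 1 instance, 200 connections at 20 MB each,
# 16 GB of cache, 4 GB reserved for the operating system.
print(estimate_memory_gb(4, 1, 2, 200, 20, 16))
```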
Disaster Recovery & Business Continuity • High-availability addresses minor outages in a short time frame with largely automated means. • Server component failure • Storage failure • Disaster recovery addresses major outages that are expected to last for a significant time period. • Flood / Fire / Earthquake • Core network failure • Major power outage • DR is addressed by additional GIS computing infrastructure with replicated data in a secondary data center. Business continuity requirements dictate if GIS should participate in DR plans.
Technology Selection and Performance • Technology selection is key to optimal performance. • Important to keep up with server advancements • Client / Server processing distribution is typically 90% / 10% (depends upon spatial data type) • Client technology is typically a larger factor but don’t ignore the DB server
Removing Constraints • The infrastructure can only be as good as constraints allow • Is the server hardware adequate? • Is the network adequate? • Has the DB been tuned? • Are the workflows reasonable? • Is the storage architecture bottleneck free?
Agenda • Overview • Database Design • Data Maintenance • Infrastructure Design • Data Distribution and Infrastructure Security
Data Distribution and Infrastructure Security • Geodatabase connection architectures • Data distribution • Infrastructure security
Geodatabase Connection Architectures [Diagram labels: SQL Queries / Spatial Data types: RDBMS Client; Direct Connect: ArcSDE Libraries, RDBMS Client; ArcSDE Connect (“Application Server”): ArcSDE Libraries, RDBMS Client on the ArcSDE (Application Server); all connect to the Geodatabase (Database Server)]
Why Direct Connect Architecture? • It aligns with Esri’s long-term development strategy • Recent database technology has only been supported via direct connections • e.g., Oracle RAC, ArcGIS Server Workgroup, IBM’s DB2 on z/OS, and 10.1 support of Oracle Exadata Database Machine • It can perform faster • Assuming client or application server processors are faster • Network traffic can be reduced • It increases the scalability of the database server • Off-loads GSRVR processing to the client side • Reduces server memory needs • It enables use of database client-to-server security features • e.g., Oracle Security • It reduces deployment cost of ArcGIS for Server Enterprise • e.g., do not need to license cores on the DB server if 100% Direct Connect
Data Distribution Options • Copy/Paste • Export to FGDB / Import • Can be very time consuming • Does not synchronize GUIDs and Object IDs • Database export/import • Target DB has to be stopped for the update • Can be very time consuming (entire DB export) • DBMS level replication • Snapshot / Multi-master / Merge / Transactional / Hybrid • Limited since NOT geodatabase or version aware! • Does not know how to properly replicate advanced geodatabase objects • Cannot edit DBMS replica using ArcGIS; only the parent can be edited • Geodatabase replication • See next page [Diagram labels: Source, Target]