220 likes | 397 Views
.Stat Suite Architecture. A look at the key non-functional topics. Prepared and presented by Jens Dossé (OECD). .Stat Suite Architecture. Code quality. .Stat Suite DevOps. managed Gitflow process automated unit tests with coverage min 70 % coding standards applied
E N D
.Stat Suite Architecture A look at the key non-functional topics Prepared and presented by Jens Dossé (OECD)
.Stat Suite DevOps • managed Gitflowprocess • automated unit tests with coverage min 70% • coding standards applied • automated builds and deployments QA/Staging environments in the cloud Containers Code base: Dev/Master
Kanban development methodology • Issue review • Peer-review • Quality assurance • Documentation standards
Revised V8 Data model In V8 2.0 the Filter table (series identifiers) is back! [Data].[FILT_{dsd id}] - PK: SID (int) = Unique Identity surrogate key (auto-incremented int) - ROW_ID (binary) = Unique, derived ID of the data row (assuring series unicity) - DIM_{component id#x}… = dimension ID with Non-Clustered Column Stored index Fact table [Data].[FACT_{dsd id}_{A|B}] - PK: SID (int) = Unique, surrogate key generated on the FILT table - PK: DIM_TIME (int)= References the time dimension member in CL_TIME table with Non-Clustered Column Stored index* - COMP_{component id#x}… = for coded/uncoded attributes, with Non-Clustered Column Stored index* - VALUE - LAST_UPDATED Deleted Data Fact table[Data].[Fact_{dsd id}_{A|B}_DELETED] - SID, DIM_TIME, LAST_UPDATED https://gitlab.com/sis-cc/dotstatsuite-documentation/blob/9abdbdbf7bb9c23dfecf422b9da58521f49b4ef0/content/framework/architecture/db%20architecture.md *To speed up actual/available content constraints
Revised V8 Data model • Dataflow attributes[Data].[ATTR_{dsd id}_{A|B}_DF] • - DF_ID (int) = ID of the referenced dataflow item in ARTEFACT table. • - COMP_{component id#x}… = for coded/uncodedattributes • Dimension group attributes[Data].[ATTR_{dsd id}_{A|B}_DIMGROUP] • - PK: ROWID = Unique ID of the related data row in the Fact_{dsd id}_{A|B} table. • - DIM_TIME (int) = References the time dimension member in CL_TIME table. • - DIM_{component id#x}... = dimension ID • - COMP_{component id#x}… = for coded/uncodedattributes Binary unique ROW_ID generation for FACT table: Dimension #1 ID: 1 600 000Dimension #2 ID: 32100Dimension #3 ID: 16 000 000 Binary version of Dimension #1 ID: 0x186A00Binary version of Dimension #2 ID: 0x007D64Binary version of Dimension #3 ID: 0xF42400 Unique row ID: 0x186A00007D64F42400
.Stat Suite SDMX Data upload and download speed Upload, Update, and Delete: slices of 500K rows Data download: 1 million rows Test (virtual) machine setup: 16 GB RAM, vCPUs: 8 Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.4 GHz, 250 GB hard disk, Windows Server 2016 standard (64-bit), Microsoft SQL Server 2017 Enterprise Edition Dataset: 9 dimensions, 1 observation level attribute, 4 series-level attributes, 70 million observations For 1m Performance objectives: Up to 1b per 1m First 100m per 1m ISTAT: 1.37; 0.42; 1.29 1.48; 0.54; 2.29
.Stat Suite SDMX Data upload and download speed Filtered data download speed (in milliseconds) per 1000 data points for up to 3000 data points: Performance objectives: Per 1000 Per 1000
Remaining potentials for SDMX performance improvements To be discussed with Eurostat: - Speed of SDMX structure retrievals in NSI web service - SDMXSource speed
Focus on DE Performance Tuning • SOLR-based Search engine migrated from .Net to nodejs with goals: • Query response time <= a few hundred of milliseconds, but in general < 100 milliseconds • 99,99% availability • Shardingand load balancing are usable to guarantee availability and performance when scaling • Indexing activities should not significantly impact the search API performance and availability, even if indexing activity happens very frequently (every one or few minutes) as Dataflows are updated almost constantly • Data Explorer front end optimisation: • Scoped list replaces classical tree views • New table design and specific implementation optimise performance
Other security measures secure configuration/credentials/password management
Thank you! Contact: jens.dosse@oecd.org