Mars Atmosphere and Volatile EvolutioN (MAVEN) Mission. PF Flight Software Critical Design Review May 23 -25, 2011 Peter R Harvey. Agenda. Software Management SwCDR Summary Software Development Plan Software Assurance Plan Software Risk Management Software Configuration Management
Mars Atmosphere and Volatile EvolutioN (MAVEN) Mission PF Flight Software Critical Design Review May 23 -25, 2011 Peter R Harvey
Agenda • Software Management • SwCDR Summary • Software Development Plan • Software Assurance Plan • Software Risk Management • Software Configuration Management • Software Test Plan • Requirements Verification • Test Environment, Test Beds • Software Maintenance Plan • Software Change Requests/Problem Report System • Software Deliverable Schedule/Status
Agenda • Software Design • Flight Software Overview • Hardware Context • Module Design • Detailed Design • Boot Initialization/Safing/Modes/RTS • S/C Interface: Command, Time, Telemetry • Data Collection, Data Processing • Memory Requirements • Data Compression • CPU Usage • Telemetry APIDs • Fault Protection (New Since PDR)
SwCDR Peer Review • Review Team • Stewart Harris; UCB (Chair) • Ellen Taylor, Robert Abiad; UCB • Jim Francis • Stasia Habenicht, Doug Leland, David Hirsch; LMCO • Steve Scott; NASA IIRT • Bob Bartlett, Sara Haugh, Tom Jackson; NASA Maven Project • Date: 5/09/2011 • Time: 1-5pm • Location: UCB/SSL • Report Issued: 5/13/11 • Actions: 1 (RFA12) • Recommendations: 0 • Status: 1 of 1 submitted for approval • Review Agenda: • Introduction • Project Overview • Management Overview • Development Process and Plans • Software Overview • Software Testing • Delivery, Installation, and Maintenance • Software status • Risks • Issues, TBDs, and action items • Abbreviations and Acronyms
Software Development Plan • Software Products: • Boot FSW (PROM) • Build1 : Support ETU DCB Test, S/C Interface tests • Build2 : Updated, Built for Flight Unit PROM • Operational FSW (EEPROM) • Build1 : Support S/C Interface, ETU Instrument I&T • Build2 : Support Storage Capabilities, Onboard Data Processing, FLT I&T • Build3 : Updated, Built for Flight EEPROM • Command and Telemetry Database • FSW Test Scripts • Table Load Scripts • Software Personnel: • One Software Engineer (34 months)
Software Assurance Plan • Assurance Products • Quality Assessments to Coding Standard • Boot FSW CPT Review/Approval • Operational FSW CPT Review/Approval • Inspection Report Review/Approval • Problem Report/Change Request Closure Review/Approval • PR/CR Document Maintenance • Test Bed ESD Verification • FSW Review Support • Support to IV&V • Assurance Personnel • PF Quality Assurance Src: MAVEN_PF_SYS_008A_PFDPUFlightSoftwareDevelopmentPlan.doc
Software Risk Mgmt Plan • Software Risk Management • MAVEN_PF_SYS_012_RMP • MAVEN_PF_SYS_008_PFDPUFlightSoftwareDevelopmentPlan • All PF Personnel are Responsible for Defining Risks • PF FSW Personnel are Responsible for Additional Tracking Efforts • All FSW Risks must be submitted to PF SysEng for possible inclusion • FSW Risk Process • Continuous Identification of Risks • Assess Risk / Reevaluate Monthly • Develop Contingency Plans • Provide Risk Status Monthly • Mitigation if Authorized by SysEng Src: MAVEN_PF_FSW_009_SRD.xls
Software Config Mgmt Plan • Change Authority • FSW-SysEng-CogE CCB for Level 4 • FSW only on Level 5 changes • Software Configuration • L5 Requirements • Command&Telemetry Spec • Analysis Files • Test Scripts • Displays • Test Reports • All Source & Object modules • All Include files • Version Descriptors • Method Used • Software: Tortoise SVN • Hardware: maven.ssl.berkeley.edu • Provides Change Tracking/Reversion MAVEN_PF_SYS_010_FSWRequirements.xls (Level 4) MAVEN_PF_FSW_002_SRS.doc (Level 5a: Boot, Level 5b: Operational) Boot Operational Src: MAVEN_PF_FSW_002_SRS_Tables.xls
Software Test Plan • Test Environment • Test Platform: ETU DCB or ETU PFDPU • Test Equipment: • GSE PC • GSEOS Software • Spacecraft Sim. • Instrument Sim. • Logic Analyser* • Digital Scope* • Data Storage: • All instrument data and housekeeping • Command/Event logs • Network access (to Science and Remote GSE computers) *: Not shown Src: MAVEN_PF_FSW_006_STP Revision -, 4/22/2011
Software Verification • FSW Verification (Source: MAVEN_PF_FSW_006_STP) • MAVEN_PF_FSW_002_Tables.xls tracks requirement flows Development status, test overview and procedure name
Software Tests & Environments • Other Tests/Environments • PF ETU Integration and Test of all Instruments • LPW • SEP • MAG • SWEA • SWIA • STATIC • PFDPU – S/C ETU Test at LM • Long Duration Testing on PFDPU ETU (Boot & Op) • Comprehensive Performance Testing on PFDPU ETU (Boot & Op) • PF FLT Integration and Test of all Instruments • PF High Fidelity Simulator to LM Test Bed • These environments provide considerable independent testing
Software Maintenance Plan • FSW Maintenance • PF DPU ETU & GSE Maintained in Flight Configuration • No External PROM simulation. • Verify command uploads prior to uplink • Anomaly resolution Inject test cases Display / verify result
Software Problem Reports • Problem Reports/Change Requests • Kept for Delivered Products • CogEs added as needed • Analyses of Cause • Before/After Code sections • Test procs included • Tracked in Reports/Reviews • Closed before Delivery
Software Deliverable Schedule Deliverables (src:MAVEN-SYS-PLAN-0020 rev B)
FSW Design Overview • Development Plan: MAVEN_PF_SYS_008A • Processor: 16 MHz Coldfire IP • Memory: 32K PROM, 256K EEPROM, 3 MB SRAM, 8 GB Flash • Language: C • Deliveries: Boot, Operational • Requirements: 66 (Boot), 188 (Operational) • SLOC: ~3900 in 9 modules (Boot), ~15000 in 19 modules (Operational) • Test Platform: DVF (Development and Verification Facility) • Major Functional Requirements: • Command Reception & Distribution • Engineering Housekeeping Telemetry • On-Board Limit Monitoring, Fault Response • Relative Time Command Sequences • Real-Time Data Collection and Playback • Archive Data Collection and Playback • Data Compression • High Voltage Controls • Attenuator Controls • Controllers for Six Science Instruments • Fault Protection
Hardware Context CPU Block Diagram (DCB)
Module Design FSW Major Modules
Boot/Initialization • Hardware Reset • Power-On • WDRST: Watchdog Reset (8 seconds) • SCRST: Spacecraft Reset (Commandable) • Reset Sequence • FPGA Copies PROM into RAM • FSW Initializes Local Data RAM to zero • Initializes Each Module • If Power-On Reset, Starts in Safe Mode • Begins Engineering Telemetry (1-sec) • Checksums EEPROM programs (2-4 of them) • Selects first program with Good Checksum • Waits 4 seconds elapsed time • Runs Selected Operational Program • Continues to run Safing Sequence
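The "selects first program with good checksum" step can be sketched as below. This is only an illustration: the `eeprom_program_t` descriptor, the function names, and the additive 16-bit checksum are assumptions, since the slide does not state the image layout or the checksum algorithm.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one EEPROM-resident program image. */
typedef struct {
    const uint8_t *image;      /* start of the program image */
    size_t length;             /* image length in bytes */
    uint16_t stored_checksum;  /* checksum recorded when the image was loaded */
} eeprom_program_t;

/* Assumed algorithm: 16-bit additive checksum over the image bytes. */
static uint16_t checksum16(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}

/* Returns the index of the first program whose computed checksum matches
 * the stored one, or -1 if none is good (boot would then keep running). */
int select_program(const eeprom_program_t *progs, int count)
{
    for (int i = 0; i < count; i++) {
        if (checksum16(progs[i].image, progs[i].length) == progs[i].stored_checksum) {
            return i;
        }
    }
    return -1;
}
```

Scanning in a fixed order means a corrupted first image falls through to the next copy without any ground intervention.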
Boot/Safing Seq • FSW Safing Sequence • Delays 1 second to allow Operational program to stop this sequence • Turns Off all HVs (STATIC, SWIA, SWEA) • Delays 60 seconds to allow HV to dissipate • Turns Off all Instruments • Delays 200 seconds to allow Actuator Guardband lockout to timeout • Closes EUV, SEP1 and SEP2 doors • Issues “Safe Me Request” to Spacecraft invoking HDW sequence • HDW Safing Sequence • Spacecraft Will Power Off PF • Automatic Power-Off Door Closures Will Actuate
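One way to picture the FSW safing sequence is as a table of (delay, action) steps advanced by a 1 Hz tick. The sketch below takes the delays from the slide; the type names, the `stopped` flag, and the one-tick-per-second driver are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    ACT_NONE,
    ACT_HV_OFF,           /* STATIC, SWIA, SWEA high voltages off */
    ACT_INSTRUMENTS_OFF,
    ACT_CLOSE_DOORS,      /* EUV, SEP1, SEP2 */
    ACT_SAFE_ME_REQUEST
} safing_action_t;

typedef struct {
    uint16_t delay_s;        /* seconds to wait before the action */
    safing_action_t action;
} safing_step_t;

static const safing_step_t SAFING_SEQ[] = {
    {   1, ACT_HV_OFF },          /* window for Operational code to stop us */
    {  60, ACT_INSTRUMENTS_OFF }, /* let HV dissipate */
    { 200, ACT_CLOSE_DOORS },     /* actuator guardband lockout */
    {   0, ACT_SAFE_ME_REQUEST },
};
#define SAFING_STEP_COUNT (sizeof SAFING_SEQ / sizeof SAFING_SEQ[0])

typedef struct {
    size_t step;        /* index of the next step to perform */
    uint16_t waited_s;  /* seconds already waited in this step */
    int stopped;        /* set by the Operational program to cancel */
} safing_state_t;

/* Call once per second. Returns the action due on this tick, or ACT_NONE. */
safing_action_t safing_tick(safing_state_t *st)
{
    if (st->stopped || st->step >= SAFING_STEP_COUNT) {
        return ACT_NONE;
    }
    if (st->waited_s < SAFING_SEQ[st->step].delay_s) {
        st->waited_s++;
        if (st->waited_s < SAFING_SEQ[st->step].delay_s) {
            return ACT_NONE;
        }
    }
    safing_action_t due = SAFING_SEQ[st->step].action;
    st->step++;
    st->waited_s = 0;
    return due;
}
```

With these delays the Safe Me request is issued a little over 260 seconds after the sequence starts, unless the Operational program sets `stopped` during the first second.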
Modes & Enables FSW Modes Safe – Minimal Activities Allowed Normal - FLASH Memory Allowed, HV, Attenuators Engineering – EEPROM Writing Implementation All Enables are Masked by ModeMask for Safe/Norm/Eng Mode Transitions Have Associated Mode Initialization Script
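The ModeMask idea can be illustrated with a per-mode bitmask ANDed against the requested enables. The bit assignments and the assumption that Engineering mode also permits everything Normal mode does are hypothetical; the slide only says that enables are masked per mode.

```c
#include <stdint.h>

typedef enum { MODE_SAFE = 0, MODE_NORMAL = 1, MODE_ENGINEERING = 2 } fsw_mode_t;

/* Hypothetical enable bits; the real bit layout is not given in the slides. */
enum {
    EN_FLASH        = 1u << 0,
    EN_HV           = 1u << 1,
    EN_ATTENUATOR   = 1u << 2,
    EN_EEPROM_WRITE = 1u << 3
};

/* One mask per mode: Safe allows none of these, Normal adds FLASH, HV and
 * attenuators, Engineering (assumed) adds EEPROM writing on top of Normal. */
static const uint32_t MODE_MASK[3] = {
    0u,
    EN_FLASH | EN_HV | EN_ATTENUATOR,
    EN_FLASH | EN_HV | EN_ATTENUATOR | EN_EEPROM_WRITE
};

/* An enable takes effect only if the current mode's mask permits it. */
uint32_t effective_enables(fsw_mode_t mode, uint32_t requested)
{
    return requested & MODE_MASK[mode];
}
```

Because the mask is applied at the point of use, a transition to Safe mode disables HV immediately without having to clear each stored enable.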
Relative Time Sequences • RTS Database has up to 64 sequences. • All RTS can run simultaneously. • Commands executed at 4 Hz. • RTS are variable length. • Each can be enabled/disabled. • RTS can start other RTS sequences and/or loop.
S/C CMD Interface • Command/Timing Information • Commands Use 56 Kbaud Async messaging • S/C Inter-command gap of 30 msec • DMA channel input to 2 x 1024 byte buffers • DMA Automatically switches buffers at 2 ms gap • FSW verifies FPGA transfer status, verifies format prior to use • Time Update Messages at 1Hz • PF FSW must tolerate time update gaps • Zone Alerts Messages at 1Hz • PF FSW must initiate Safe Mode if 3 ZA’s missed in a row • PF FSW must safe the instrument and report “Alive” • If FSW cannot implement safing, it reports “SafeMe” • Format same as TLM (see below)
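The "3 Zone Alerts missed in a row" rule amounts to a small counter driven at the 1 Hz message rate. The following is a minimal sketch; the structure and function names are invented for illustration and the limit is hard-coded here, whereas requirement R5 later in the deck describes it as parameterized.

```c
/* Hypothetical monitor state for Zone Alert (ZA) messages. */
typedef struct {
    unsigned missed;          /* consecutive 1 Hz slots with no ZA message */
    int safe_mode_requested;  /* latched once the limit is reached */
} za_monitor_t;

#define ZA_MISSED_LIMIT 3u

/* Call once per second with za_received != 0 if a ZA message arrived. */
void za_tick(za_monitor_t *m, int za_received)
{
    if (za_received) {
        m->missed = 0;  /* any good message restarts the count */
        return;
    }
    if (m->missed < ZA_MISSED_LIMIT) {
        m->missed++;
    }
    if (m->missed >= ZA_MISSED_LIMIT) {
        m->safe_mode_requested = 1;
    }
}
```

The request is latched rather than cleared on the next good message, on the assumption that leaving Safe mode is a deliberate, commanded transition.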
S/C CMD Time Update Timing Dual Time Bases Required (Spacecraft & Instrument) S/C & Instrument Both Provide 1Hz Ticks DCB Actel Latches Time Difference Between Ticks S/C Sends Time Update 500msec before the 1pps pulse At DCB 1pps, FSW latches S/C time+offset, increments afterward. Accuracy of 0.01 sec required Src: MAVEN_PF_FSW_020_Analyses.xls
S/C TLM Interface • Telemetry Information • Telemetry Uses 56 Kbaud Async messaging • Telemetry will use CCSDS packet headers inside Transaction • Telemetry will use 2x5120 byte DMA buffers • Telemetry must send Aliveness message every second • Commandable Rate: 4.77 kbps average to 37.10 kbps (4636 Bytes/sec) • Compressing Data Allows Archive Playback Allotment • FSW monitors/adjusts RealTime & Archive mix • Transaction Format • IP = Internet Protocol • UDP = User Datagram Protocol • CIP = Common Inst Protocol • IDP = Inst Dependent Protocol
Data Collection Data Collection DMA Channels are Assigned to Each Data Instrument FSW Writes Destination Addresses into Each DMA Controller DMA Registers are Double-Buffered to Eliminate Gaps DMA Buffers Automatically Swap at 1, 2 or 4 seconds FSW Modules Process Messages Using Inst. Message Headers Expect to Process using 1-second instead of 2 or 4 secs Data Collection Rates Max Raw Input Rate of ~153 KB/sec (1.2 Mbps) Vast Majority Summing Counts
Data Processing • BKG Interrupts • 256 Hz Interrupt Process • Distributes CPU Time per Table • Basic ¼ second table repeats at 4 Hz • CMD, PWR, HSK get 32 Hz • Instruments get 8-16 Hz, etc. • Implements module requirements • Easily reconfigurable (spares) • FSW measures time in each ISR • FSW measures total CPU% • Design for < 50% usage • EXEC Loop • Up to 4 User Programs • Calcs that take > 2 ms
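A 256 Hz interrupt with a ¼-second table implies 64 slots per table pass, so a 32 Hz task occupies every 8th slot and an 8 Hz task every 32nd. The sketch below shows one way such a table could be filled and consulted; the task identifiers, slot offsets, and function names are assumptions, not the flight design.

```c
#include <stdint.h>

#define TICK_HZ     256u
#define TABLE_SLOTS 64u   /* 1/4 second of 256 Hz ticks */

/* Hypothetical task identifiers; TASK_SPARE marks an unused slot. */
typedef enum {
    TASK_SPARE = 0, TASK_CMD, TASK_PWR, TASK_HSK, TASK_INST_A, TASK_INST_B
} task_id_t;

static uint8_t sched_table[TABLE_SLOTS];  /* zero-initialized: all spare */

/* Assign a task to every (TICK_HZ / rate_hz)-th slot starting at offset.
 * Returns the number of slots filled, or -1 if the rate does not fit the
 * table or any required slot is already taken. */
int sched_assign(task_id_t task, unsigned rate_hz, unsigned offset)
{
    if (rate_hz == 0 || rate_hz > TICK_HZ || TICK_HZ % rate_hz != 0) {
        return -1;
    }
    unsigned period = TICK_HZ / rate_hz;
    if (period > TABLE_SLOTS || offset >= period) {
        return -1;
    }
    for (unsigned s = offset; s < TABLE_SLOTS; s += period) {
        if (sched_table[s] != TASK_SPARE) {
            return -1;
        }
    }
    int filled = 0;
    for (unsigned s = offset; s < TABLE_SLOTS; s += period) {
        sched_table[s] = (uint8_t)task;
        filled++;
    }
    return filled;
}

/* Which task the 256 Hz interrupt should run on a given tick count. */
task_id_t sched_task_for_tick(uint32_t tick)
{
    return (task_id_t)sched_table[tick % TABLE_SLOTS];
}
```

Slots left as `TASK_SPARE` are the reconfiguration margin the slide refers to: a new task can be added by filling spare slots without moving existing ones.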
INST Manager Generic Instrument Manager Design (type1)
INST Manager Generic Instrument Manager Design (type 2)
STATIC Memory Requirements Primary Buffers Processing Buffers Data Processing Non-Volatile Storage Svy = Survey Telemetry Arc = Archive Storage Src: FSW_020_Analyses.xls
Real-Time Data Mgmt RT Data Management
Archive Data Mgmt Archive Data Management
Archive Data Storage • FLASH Hardware • 8 GB Capacity • Each 4 GB powered separately • EDAC Enabled • Write/Read DMA Channel to/from SRAM • Block Addressable • 2^16 128 KB Blocks • Each Block has 2K extra bytes (EDAC, Bad-Block Indicator, Erase Count, Write Time) • FSW Functions • Stores/Retrieves Archive Science Blocks • Circular Memory with Separate Read & Write Ptrs • Playback Commanded by Block Number and Length • Both Read/Write Block pointers Telemetered • Ground S/W keeps Time-to-Block Number relationship • FMAP of 256 provides 32 MB control • FMAP: FLASH Virtual-to-Physical Memory Map Src: FSW_020_ANALYSES.XLS
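The FMAP arithmetic follows from the numbers on the slide: 2^16 blocks of 128 KB divided among 256 map entries gives 256 blocks (32 MB) per entry. A minimal sketch of a virtual-to-physical lookup and circular pointer advance is shown below; the `fmap_t` layout and the policy of compacting good segments into a shorter virtual space are assumptions.

```c
#include <stdint.h>

#define BLOCKS_TOTAL       65536u  /* 2^16 blocks of 128 KB = 8 GB */
#define FMAP_ENTRIES       256u
#define BLOCKS_PER_SEGMENT (BLOCKS_TOTAL / FMAP_ENTRIES)  /* 256 blocks = 32 MB */

/* Hypothetical map: virtual segment index -> physical segment index. */
typedef struct {
    uint8_t segment[FMAP_ENTRIES];
    uint16_t usable;  /* number of virtual segments currently mapped */
} fmap_t;

/* Build the map from a table of bad physical segments (nonzero = bad),
 * packing the good segments into consecutive virtual segments. */
void fmap_build(fmap_t *m, const uint8_t bad[FMAP_ENTRIES])
{
    m->usable = 0;
    for (unsigned p = 0; p < FMAP_ENTRIES; p++) {
        if (!bad[p]) {
            m->segment[m->usable++] = (uint8_t)p;
        }
    }
}

/* Translate a virtual block number into a physical block number.
 * The caller must keep vblock below usable * BLOCKS_PER_SEGMENT. */
uint32_t fmap_to_physical(const fmap_t *m, uint32_t vblock)
{
    uint32_t vseg = vblock / BLOCKS_PER_SEGMENT;
    return (uint32_t)m->segment[vseg] * BLOCKS_PER_SEGMENT
           + vblock % BLOCKS_PER_SEGMENT;
}

/* Advance a circular read or write pointer within the usable space. */
uint32_t fmap_next(const fmap_t *m, uint32_t vblock)
{
    uint32_t total = (uint32_t)m->usable * BLOCKS_PER_SEGMENT;
    return total ? (vblock + 1u) % total : 0u;
}
```

Mapping out a whole 4 GB module, as discussed in the RFA response, would correspond to flagging 128 consecutive physical segments as bad before calling `fmap_build`.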
PFP-RFA-12 • Title : SSR Control • Reviewer: S. Harris(Chair) • Action: • Provide rationale for the additional layer of bad block mapping, that is provided by the physical-to-logical mapping of 32MB chunks of FLASH memory. This seems like large blocks of memory that are getting flagged as bad. • Rationale: • Bad blocks in the FLASH memory are already flagged, at the 128kB block level. Why does the FLASH need additional segregation into 32MB chunks? • Response: • The presentation didn’t address some of the concerns of the Flash memory, and so it is unclear why there is a need for recovery at larger levels than the native 128KB block. There are basically three effects which the map solves. • [1] If there are long sections of memory in which there are bad 128KB blocks, the software writing speed will decrease as the software searches for a good block to write on. This may lead to the point where we lose data. For MAVEN, given the basic logic shown at SwCDR, it would require about 8 seconds of reading status to skip an entire 32 MB (32MB/128KB=256 blocks, 256 Blocks/4 = 64 interrupts, 2 interrupts per read of 4 blocks -> 8 seconds). So, we could maintain about 60 kbps without data loss if there was just 1 bad 32MB section.
PFP-RFA-12 • Title : SSR Control (cont’d) • [2] Using the Flash memory itself for the status data has some risk that the 128KB memory status doesn’t actually write. Thus, even if we set the status to say “BAD,” the 128KB block may still say “GOOD”. • [3] There is some possibility that there is a problem powering one of the 4GB memory modules, either because the switch fails or the Flash module takes too much power, or perhaps doesn’t work at all. These cases are handled by mapping out the complete module. • The map element (1/256) is only 0.4% of the memory, so the loss from mapping out one element is small in percentage terms.
Data Compression • Count Compression • Four scales required {19-to-8, tbd, tbd, tbd} • Standard lossy encoding • Offscale limit checking • STEREO (80196) rate: 22 µs/byte output • Compression rate (Coldfire): ~364 kbps • CPU Use*: ~7.5% • Waveform Delta-Modulator • Lossless algorithm from THEMIS & RBSP • Inputs 32-sample blocks • Result is 1 raw 16-bit sample, 31 deltas • Width of the deltas is determined • Deltas are tightly packed • Compression rate (Coldfire): 0.9-6.2 Mbps • CPU Use*: ~3% *: Worst case assumes all 27 kbps compressed with this algorithm
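The waveform delta-modulator described above (one raw 16-bit sample, 31 deltas of a common width, tightly packed) can be sketched as follows. This is not the THEMIS/RBSP code: the 5-bit width field, the MSB-first bit order, and all function names are assumptions chosen so the example round-trips.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SAMPLES 32

/* Smallest two's-complement width (1..17 bits) that holds every delta. */
static unsigned delta_width(const int32_t *d, size_t n)
{
    unsigned w = 1;
    for (size_t i = 0; i < n; i++) {
        while (d[i] < -((int32_t)1 << (w - 1)) || d[i] > ((int32_t)1 << (w - 1)) - 1) {
            w++;
        }
    }
    return w;
}

/* Append the low nbits of value to buf, MSB first. buf must be pre-zeroed. */
static void put_bits(uint8_t *buf, size_t *bitpos, uint32_t value, unsigned nbits)
{
    for (unsigned i = nbits; i-- > 0; ) {
        if ((value >> i) & 1u) {
            buf[*bitpos >> 3] |= (uint8_t)(0x80u >> (*bitpos & 7u));
        }
        (*bitpos)++;
    }
}

static uint32_t get_bits(const uint8_t *buf, size_t *bitpos, unsigned nbits)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < nbits; i++) {
        v = (v << 1) | ((uint32_t)(buf[*bitpos >> 3] >> (7u - (*bitpos & 7u))) & 1u);
        (*bitpos)++;
    }
    return v;
}

/* Assumed layout: 16-bit first sample, 5-bit (width - 1), then 31 deltas.
 * Returns bytes written (at most 69), or 0 if out_size is too small. */
size_t delta_encode(const int16_t in[BLOCK_SAMPLES], uint8_t *out, size_t out_size)
{
    int32_t d[BLOCK_SAMPLES - 1];
    for (int i = 1; i < BLOCK_SAMPLES; i++) {
        d[i - 1] = (int32_t)in[i] - (int32_t)in[i - 1];
    }
    unsigned w = delta_width(d, BLOCK_SAMPLES - 1);
    size_t total_bytes = (16u + 5u + (size_t)(BLOCK_SAMPLES - 1) * w + 7u) / 8u;
    if (out_size < total_bytes) {
        return 0;
    }
    memset(out, 0, total_bytes);
    size_t pos = 0;
    put_bits(out, &pos, (uint16_t)in[0], 16);
    put_bits(out, &pos, w - 1u, 5);
    for (int i = 0; i < BLOCK_SAMPLES - 1; i++) {
        put_bits(out, &pos, (uint32_t)d[i], w);  /* low w bits, two's complement */
    }
    return total_bytes;
}

void delta_decode(const uint8_t *in, int16_t out[BLOCK_SAMPLES])
{
    size_t pos = 0;
    out[0] = (int16_t)(uint16_t)get_bits(in, &pos, 16);
    unsigned w = (unsigned)get_bits(in, &pos, 5) + 1u;
    for (int i = 1; i < BLOCK_SAMPLES; i++) {
        uint32_t raw = get_bits(in, &pos, w);
        int32_t d = (int32_t)raw;
        if (raw & ((uint32_t)1 << (w - 1))) {
            d -= (int32_t)1 << w;  /* sign-extend a negative delta */
        }
        out[i] = (int16_t)(out[i - 1] + d);
    }
}
```

In this sketch a slowly varying block shrinks from 64 raw bytes to roughly a dozen, while a block with full-scale swings needs 17-bit deltas and grows slightly to 69 bytes, so the gain depends entirely on how smooth the waveform is.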
CPU Usage • CPU Comparison to Prior Instruments • Coldfire IP @ 16 MHz ~ 8x faster than heritage CPUs • Even 2:1 or 4:1 inefficiency for C quite tolerable • Will Implement Assembly if needed & where effective • Test Sample: “VectorSum()” routine • Coldfire C vs. Z80 Assembly • Coldfire:Z80 over 5:1 ratio
Telemetry APIDs CDR Set of Science APIDs
Fault Protection (FP) • R1. The spacecraft shall provide the instrument zone alert states to the PFDPU, RSDPU, and NGIMS at a rate of once a second for the following: • EUV boresight in Ram below parameterized altitude • SEP 1 parameterized rectangular FOV 1 in Sun • SEP 1 parameterized rectangular FOV 2 in Sun • SEP 2 parameterized rectangular FOV 1 in Sun • SEP 2 parameterized rectangular FOV 2 in Sun • Ambient Density STATIC > parameterized density limit • Ambient Density SWIA/SWEA > parameterized density limit • Ambient Density EUV > parameterized density limit • IUVS parameterized rectangular FOV in Sun • Ambient Density IUVS > parameterized density limit • Ambient Density NGIMS > parameterized density limit • R2. The payloads shall respond to a transition into a zone alert region by putting the affected instruments into a known safe state until a transition out of the zone alert region occurs. • R3. The payload shall respond to a transition into a zone alert region by blocking all internally sequenced and spacecraft initiated commands that would put an instrument into an unsafe instrument state until a transition out of the zone alert region occurs. • R4. The payloads shall issue a safe me request if the affected instrument is unable to properly configure upon transition into a zone alert region. Instruments shall stop generating HeartBeat messages. • R5. The payloads shall configure all affected instruments into a safe state if a zone alert message has not been received for a parameterized amount of time. • R6. The payloads shall configure all affected instruments into a safe state upon power up until a zone alert status is established. • R7. HeartBeat should indicate PFDPU is fully functional, meaning TBD (should be sent every 1 second).
FP Notes Associated Requirements • P1. Users must be able to disable HV or Door operations, regardless of S/C Zone Alerts. • (e.g. PF GSE cannot release Zone Alert and allow HV to ramp up when we don’t want it to. ) • P2. Door Actuations have a timeout following actuation of TBD seconds for the SMA to cool down. Implementation Details • Telemetry Task does not itself need to be monitored. If it fails, the S/C will know and take action. • When Tasks succeed, they clear their respective “time-since-task” register. • PF will principally rely upon Relative Time Sequences for Open/Close or HVON/HVOFF sequences. • PF FSW will have two independent software activities: one for safing actions, one for safety verification. • Each safing action expected duration will be independent of others and will be commandable • If you remove power from PF, hardware circuits will safe the HV and will close the EUV, SEP1, SEP2 doors within 5 minutes. On power up, PF is guaranteed to be safe (meet R6). • If you remove power from STATIC, SWEA or SWIA, their High Voltages are zero. So, before asking for the Spacecraft to safe the PF, the PF FSW would prefer to turn off these non-complying instruments.
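The "time-since-task" registers lend themselves to a simple split between the tasks that clear them and an independent verifier that ages and checks them. The sketch below is illustrative only: the task count, the 1 Hz verification rate, and the per-task limits are assumptions.

```c
#include <stdint.h>

#define NUM_TASKS 4u

/* Hypothetical monitor: seconds since each task last succeeded, plus a
 * commandable limit per task. */
typedef struct {
    uint16_t since_s[NUM_TASKS];
    uint16_t limit_s[NUM_TASKS];
} task_monitor_t;

/* Called by a task when it completes successfully. */
void task_succeeded(task_monitor_t *m, unsigned task)
{
    if (task < NUM_TASKS) {
        m->since_s[task] = 0;
    }
}

/* Called at an assumed 1 Hz by the independent verification activity.
 * Returns a bitmask with bit t set for every task that is overdue. */
uint32_t monitor_tick(task_monitor_t *m)
{
    uint32_t overdue = 0;
    for (unsigned t = 0; t < NUM_TASKS; t++) {
        if (m->since_s[t] < UINT16_MAX) {
            m->since_s[t]++;  /* saturate instead of wrapping to zero */
        }
        if (m->since_s[t] > m->limit_s[t]) {
            overdue |= 1u << t;
        }
    }
    return overdue;
}
```

Keeping the aging and checking in a separate routine mirrors the note that safing actions and safety verification are two independent software activities.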
FP Definitions • Safing Status • Quick digest on all requests and measures • “Allowed” register is combination of Zones and User Enables registers. • “Instrument Status” is independent measures of what’s actually happening. Safing Status can be returned in “Safe Me” message
Summary • FSW Development Plans Are Understood • FSW Requirements Are Understood • FSW Interfaces Are Understood • FSW Platform Will Meet Performance Expectations • FSW Margins are Good • FSW Test Plan is Understood • Keep Going!
Backup Backup Slides