Tizen Architectural Specification: Crash Reporting

Tizen Architectural Specification: Crash Reporting. TIZEN ADS 0000, Ver. 0.4, 2013-11-18, Leonid Moiseichuk. Contents: Introduction, Legacy solutions, Architecture, Detailed Architecture, Appendix.

Presentation Transcript


  1. Tizen Architectural Specification: Crash Reporting TIZEN ADS 0000 Ver. 0.4 2013-11-18 Leonid Moiseichuk

  2. Introduction • Legacy solutions • Architecture • Detailed Architecture • Appendix

  3. Introduction
     • Crash reporting for embedded systems differs from crash reporting on desktops and servers in two respects: device resource limitations and the number of devices.
     • For example, debugging symbols cannot be installed on the device because they consume several hundred megabytes of space, so a backtrace cannot be obtained on the device.
     • On the other hand, centralized, secure storage of crash information opens new opportunities for cross-component analysis that identifies the most important issues to fix first, often across several products if the server side supports it.
     • This presentation shows how the existing solution can be improved by extending it towards a server-based approach.

  4. Introduction | Feature Overview
     • Easy crash collection on release images: just install the Crash Reporter packages. They could also be installed in all images by default, but in a disabled state.
     • Kernel and application crashes/oopses are collected, as well as any kind of device runtime information, so coverage of crash causes approaches 100%.
     • No symbols are required, but they may be installed if a developer needs to analyze traces on the device, e.g. for security reasons.
     • All available information is collected at the moment of the crash, which simplifies later analysis by the developer fixing the issue.
     • Centralized processing makes it possible to identify the most critical or most frequent issues and to verify an integrated fix using statistics from the device population, i.e. the absence or reduction of new crashes with the same backtrace.
     • Integration with test cases (auto-upload), JIRA and probably source indexing services (we have used Mozilla MXR) will greatly reduce the effort spent on reporting, identifying, prioritizing, fixing and verifying issues.
     • Secure delivery of dumps/crashes from the device to collection servers, depending on dump type and application.
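The statistics-based prioritization mentioned above (counting crashes that share the same backtrace) could be sketched roughly as follows. This is only an illustration, not the specification's method: it assumes the server has already symbolized each report into a list of frame names, and the report layout, `backtrace_signature`, `rank_crashes` and the choice of hashing the top five frames are all hypothetical.

```python
import hashlib
from collections import Counter


def backtrace_signature(frames, depth=5):
    """Hash the top `depth` frames into a short, stable signature."""
    top = "\n".join(frames[:depth])
    return hashlib.sha1(top.encode("utf-8")).hexdigest()[:12]


def rank_crashes(reports, depth=5):
    """Count reports per backtrace signature, most frequent first."""
    counts = Counter(backtrace_signature(r["frames"], depth) for r in reports)
    return counts.most_common()


# Hypothetical reports from three devices; two share the same backtrace.
reports = [
    {"device": "A", "frames": ["memcpy", "parse_header", "main"]},
    {"device": "B", "frames": ["memcpy", "parse_header", "main"]},
    {"device": "C", "frames": ["abort", "check_state", "main"]},
]
ranking = rank_crashes(reports)
```

Running the same ranking on reports collected after a fix is integrated would show whether the signature in question has disappeared or become less frequent.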

  5. Legacy architecture: Tizen

  6. Legacy architecture: Tizen
     The legacy implementation has the following areas that would benefit from improvement:
     • Kernel oopses (crashes) are not supported.
     • Using the preload library libsys-assert.so leads to unwanted code execution during every application startup and requires symbols to be installed on the device.
     • Crash Worker starts working a short but uncontrollable time after the crash, so the reported data is already outdated relative to the crash.
     • Core dump files are large (in theory up to 3 GB), so the core file cannot always be copied as the workflow expects.
     • Crash processing jeopardizes consumer-facing qualities of the device such as performance and reliability (many copy operations on large files, use of gzip, and use of the same storage space for the data).
     • The server side, ccr.samsung.com, was turned off for security reasons, which makes cross-analysis very difficult or practically impossible.

  7. Android native – Google Breakpad

  8. Android native – Google Breakpad
     Google Breakpad is the best multiplatform solution, but:
     • It must be linked into every process. According to the documentation this requires code changes, although it can be done as a shared library.
     • Not all application crashes can be handled: only those that occur after Breakpad is initialized and whose signal is not handled by the application.
     • It produces minidumps based on:
        • ptracing the crashed process, which requires CAP_SYS_PTRACE
        • processing the core file, which may be up to 3 GB in size
     • Dump generation is done from a server, so dumps may be missing when the server has not started, has crashed or has already shut down.
     • Incidentally, debuggerd in 4.2.1 leaks about 25 MB of dirty memory per 1000 crashes.
     • The file format is very strict and not compressed.
     • The processor is already implemented and works for clients on Linux, Android, Windows, iOS, MacOS, Solaris, arm, x86, x86_64, ppc, ppc64, mips and sparc.
     • Kernel panics are not supported at all, even though the Android Panic facility is part of the kernel (apanic.c and apanic_mmc.c).
     • VM crashes are not supported.

  9. Android VM – e.g. ACRA/Acralyzer

  10. Android VM – e.g. ACRA/Acralyzer
      There is a huge number of VM-based crash reporters. Their common problems:
      • They cover only VM cases and provide only basic information about the system.
      • They expect the server to be available online, use insecure and unthrottled connections, or have flawed logic (e.g. if a report cannot be sent, the file is deleted).
      • The analyzer (server) side is primitive or proprietary.

  11. Architecture

  12. Architecture: on device part

  13. Architecture: oops processing

  14. Architecture: crash processing

  15. Architecture: VM crash processing

  16. Architecture: server farm

  17. Architecture: dump file format

  18. Architecture: configurability
      • The uploader and crash reporting are controlled from Settings; they could ship as part of the product but in disabled mode.
      • The uploader partition could be left unused on production devices (few crashes are expected).
      • Configuration for type=crash may look like the following:
          /etc/dumper        -- main configuration folder
            config           -- general configuration file
            crash/           -- configuration for crash reporting
              config         -- file to be used for all crashes
              app1           -- crash settings for app1 if non-standard
                config       -- e.g. own upload server or files
              app2           -- crash settings for app2 if non-standard
              etc.
            statistics/      -- configuration for statistics
              config         -- file to be used for all statistics uploads
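One way to read the layout above is as a lookup from most specific to most general file. The sketch below illustrates that reading only; the fallback order (per-application file, then the crash-wide file, then the general file) is an assumption that the slide does not state, and `resolve_crash_config` is a hypothetical helper name.

```python
import os
import tempfile


def resolve_crash_config(root, app):
    """Return the most specific existing config file for `app`, or None."""
    candidates = [
        os.path.join(root, "crash", app, "config"),  # per-application override
        os.path.join(root, "crash", "config"),       # default for all crashes
        os.path.join(root, "config"),                # general configuration
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def _demo():
    """Build a throwaway copy of the layout and resolve two applications."""
    with tempfile.TemporaryDirectory() as root:
        for rel in ("config", "crash/config", "crash/app1/config"):
            path = os.path.join(root, *rel.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "w") as f:
                f.write("# placeholder\n")
        with_override = os.path.relpath(resolve_crash_config(root, "app1"), root)
        without_override = os.path.relpath(resolve_crash_config(root, "app3"), root)
        return with_override, without_override


demo_result = _demo()
```

In the demo, app1 has its own file and resolves to it, while app3 (which has no directory of its own) falls back to the crash-wide file.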

  19. Architecture: remarks
      • The proposed architecture is not final; it is mostly a process, because a crash reporting service requires constant work to cover new builds and requests from customers.
      • Running through /proc/../core_pattern avoids any impact on userspace until a crash happens.
      • The absence of a daemon and the use of separate non-reflashable partitions guarantee that crashes will be delivered from a bricked device after re-flashing, if a reboot happens during uploading, and in other such cases.
      • Adapting the proposals to other components (MobileCare etc.) is the next step in this process; most likely some pieces should be re-used or replaced because they are implemented better than I could imagine based on my experience.
      • Lifelogging (e.g. memory, power, system logs), kernel OOPS support, basic and extended crash dumping, and support for reporting VM problems from Java, Python, etc. could be done in further steps and in parallel.
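The core_pattern remark relies on the Linux pipe syntax: when the pattern starts with `|`, the kernel runs the named program only when a process dumps core and feeds the core image to its standard input, with specifiers such as `%p` (PID), `%s` (signal number) and `%e` (executable name) passed as arguments. A minimal sketch of such a handler is shown below. It is not the dumper from this specification: the argument order, the spool directory `/var/spool/dumper` and the output file naming are all assumptions made for illustration.

```python
import os
import sys


def store_core(stream, spool_dir, pid, signal, exe, chunk_size=1 << 20):
    """Copy the core image from `stream` to a file in `spool_dir`.

    The data is copied in chunks so that a multi-gigabyte core is never
    held in memory. Returns the output path and the number of bytes written.
    """
    os.makedirs(spool_dir, exist_ok=True)
    path = os.path.join(spool_dir, "core.%s.%s.sig%s" % (exe, pid, signal))
    written = 0
    with open(path, "wb") as out:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return path, written


if __name__ == "__main__" and len(sys.argv) >= 4:
    # Assumed invocation by the kernel: <handler> %p %s %e, core on stdin.
    # The spool directory is a hypothetical location.
    pid_arg, signal_arg, exe_arg = sys.argv[1:4]
    store_core(sys.stdin.buffer, "/var/spool/dumper", pid_arg, signal_arg, exe_arg)
```

Because the handler is started by the kernel only at crash time, nothing runs in userspace beforehand, which is the property the remark above depends on.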

  20. Appendix

  21. Revision History
