Behind the Scenes of the Windows Boot Process Yu Yong, PMP/MCSE, Microsoft Guest Lecturer yy@yuyong.net Handout download: www.yuyong.net
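Course Overview • Understand the Windows boot process in depth • Learn techniques for solving common boot problems • System crashes or hangs during boot • Error messages during boot • Troubleshooting logon failures • Common causes of boot failures • Third-party drivers or applications • System file corruption caused by hardware faults • Say goodbye to reinstalling the system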
Part 1 • Windows system architecture • The boot process in detail • Important Boot.ini parameters
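Windows System Architecture [Architecture diagram] User mode: system processes (Service Control Manager, WinLogon, Session Manager); services (Replicator, Alerter, RPC, Event Log); applications (user applications); environment subsystems (POSIX, OS/2, Win32); subsystem DLLs; NTDLL.DLL. Kernel mode: system threads; Executive API; Power Manager, Plug and Play Manager, WMI Manager, I/O Manager, Cache Manager, Process and Thread Manager, Security subsystem, Virtual Memory Manager, graphics subsystem, file systems, Object Manager / Executive RTL; device drivers; Windows kernel; Hardware Abstraction Layer (HAL). Hardware interfaces (buses, I/O, interrupts, timers, clocks, DMA, cache control, etc.)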
Key Boot Terminology • The main boot files and boot code are written during system installation • System volume: • Master Boot Record (MBR) • Boot sector • NTLDR – NT Boot Loader • NTDETECT.COM • BOOT.INI • SCSI driver – Ntbootdd.sys • Boot volume: • System files – %SystemRoot%: Ntoskrnl.exe, Hal.dll, etc.
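The Boot Process • After power-on, the system reads the Master Boot Record (MBR) • The MBR contains the code that reads the partition table • On x86 systems the partition table has four entries • The first partition marked active is the system volume • The MBR loads the boot sector of the system volume • Boot sector (NT-specific) • Reads the root directory of the system volume and loads NTLDR • Notes: • Both the MBR and the boot sector are written during system installation • They are read from fixed locations on disk, with no file system involved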
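The partition-table lookup described above can be sketched in Python. This is a minimal illustration, not the actual MBR code: it assumes the classic 512-byte MBR layout (four 16-byte entries starting at offset 446, a status byte of 0x80 for the active partition, and the 0x55AA signature in the last two bytes). The sample sector built by `make_sample_mbr` is hypothetical.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446            # partition table follows the boot code
ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a valid MBR


def parse_mbr(sector: bytes):
    """Return the four partition entries of a 512-byte MBR sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("missing 0x55AA boot signature")
    entries = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # status, CHS start (3 bytes), type, CHS end (3 bytes), LBA start, sector count
        status, _chs_start, ptype, _chs_end, lba_start, sectors = struct.unpack(
            "<B3sB3sII", raw)
        entries.append({"index": i, "active": status == 0x80, "type": ptype,
                        "lba_start": lba_start, "sectors": sectors})
    return entries


def find_system_volume(entries):
    """Return the first entry flagged active, or None if there is none."""
    return next((e for e in entries if e["active"]), None)


def make_sample_mbr():
    """Build a hypothetical MBR with one active partition of type 0x07."""
    sector = bytearray(SECTOR_SIZE)
    entry = struct.pack("<B3sB3sII", 0x80, b"\x00" * 3, 0x07, b"\x00" * 3, 63, 204800)
    sector[TABLE_OFFSET:TABLE_OFFSET + ENTRY_SIZE] = entry
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)
```

Calling `find_system_volume(parse_mbr(make_sample_mbr()))` returns the entry with index 0. A sector without the signature raises `ValueError`, which roughly corresponds to the "Invalid Partition Table" class of failures discussed in Part 2.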
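x86 and x64 Boot Process • NTLDR (black screen) • Switches the CPU from 16-bit mode to 32-bit mode and enables paging • If the boot volume is a SCSI disk, uses Ntbootdd.sys for I/O • Ntbootdd.sys is stored on the system volume • Reads and parses the Boot.ini file • Boot.ini selections point to the boot drive • Specifies OS boot selections and optional switches (most for debugging/troubleshooting) that are passed to the kernel during boot • If Boot.ini has more than one entry, NTLDR displays a selection menu • If the user chooses to boot 64-bit Windows, NTLDR switches the CPU to 64-bit mode • Notes: • When NTLDR starts, if it finds a valid Hiberfil.sys file in the system root directory, it reads that file and resumes the system to its pre-hibernation state; Boot.ini is not processed in this case • In a dual-boot setup, if the user selects DOS, NTLDR loads BOOTSECT.DOS, a copy of the boot sector used by DOS • http://soft.yesky.com/os/win/457/2285457.shtml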
The All-Important Boot.ini • The purpose of the Boot.ini file in Windows • http://support.microsoft.com/kb/314081/zh-cn • Available switch options for the Boot.ini file • http://support.microsoft.com/kb/833721/zh-cn • http://www.sysinternals.com/information/bootini.html • The Bootcfg command and how to use it • http://support.microsoft.com/kb/291980/zh-cn • How to use the /userva and /3GB switches to tune the user-mode address space to a value between 2 GB and 3 GB • http://support.microsoft.com/kb/316739/
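To make the Boot.ini structure concrete, here is a rough Python sketch that splits a Boot.ini file into its loader settings and OS entries. The sample file content, the entry titles and the function name `parse_boot_ini` are illustrative assumptions, not taken from the slides. NTLDR's own parsing is not documented here and may differ.

```python
import re

# Hypothetical Boot.ini with two entries; the second one carries optional switches.
SAMPLE_BOOT_INI = r'''[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft Windows XP Professional" /fastdetect
multi(0)disk(0)rdisk(0)partition(2)\WINNT="Windows 2000 (debug)" /debug /3GB /userva=2900
'''


def parse_boot_ini(text):
    """Return (loader_settings, os_entries) from Boot.ini text."""
    loader, entries, section = {}, [], None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].lower()
            continue
        # Split on the first "=" only: switches such as /userva=2900 contain "=" too.
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if section == "boot loader":
            loader[key.lower()] = value
        elif section == "operating systems":
            match = re.match(r'"([^"]*)"\s*(.*)', value)
            if match:
                title, switches = match.group(1), match.group(2).split()
            else:
                title, switches = value, []
            entries.append({"path": key, "title": title, "switches": switches})
    return loader, entries
```

With more than one item in `os_entries`, NTLDR would show the selection menu; the `switches` list is what gets passed on to the kernel for the chosen entry.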
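The Boot Process (continued) • NTLDR (cont) • After the Boot.ini boot selection is made, the user can press F8 to open the advanced boot options • Last Known Good, Safe modes, hardware profile, Debugging mode • NTLDR runs Ntdetect.com to detect hardware and BIOS information • The results are saved to HKLM\Hardware\Description later in the boot • NTLDR loads the registry SYSTEM hive (HKLM\System), the boot drivers, Ntoskrnl.exe and Hal.dll, then transfers control to the entry point of Ntoskrnl.exe • Boot driver: critical to boot process (e.g. boot file system driver) • Notes: • This early in the boot the Windows kernel is not yet fully initialized, so only the most essential drivers are loaded • NTLDR loads the file system driver for the boot volume and uses read-only access to reach the other files and subdirectories needed to boot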
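The Boot Process (continued) • Ntoskrnl (the Windows boot logo is displayed) • Initializes the kernel subsystems in two phases • The first phase defines objects (process, thread, driver, etc.) and initializes core data structures • The second phase completes object initialization and starts the subsystems • Both phases are carried out by the kernel system thread that later becomes the “System Idle Process” • The I/O Manager loads the “boot-start” drivers and then the “system-start” drivers, in order • Finally, Ntoskrnl creates the Session Manager process (\Windows\System32\Smss.exe), the first user-mode process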
Driver Load Order • Information about every driver is stored in the registry • HKLM\System\CurrentControlSet\Services • Type: 1 for driver, 2 for file system driver, others are Win32 services • Start: 0 = boot, 1 = system, 2 = auto, 3 = manual, 4 = disabled • Viewing driver start types: • Run LoadOrd from Sysinternals • Run Msinfo32 and go to Software Environment\System Drivers • Run Driverquery (/v for verbose)
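The Type and Start values listed above can be decoded with a few lines of Python. This sketch only orders entries by Start value. It does not model the ordering of drivers within a start phase, and the service names in `SAMPLE_SERVICES` are hypothetical.

```python
START_TYPES = {0: "boot", 1: "system", 2: "auto", 3: "manual", 4: "disabled"}


def describe_type(type_value):
    """Map a Services\\<name>\\Type value to the categories used on the slide."""
    if type_value == 1:
        return "driver"
    if type_value == 2:
        return "file system driver"
    return "Win32 service"


def summarize(services):
    """Return (name, kind, start) tuples sorted by Start value, then by name."""
    rows = [(values["Start"], name, describe_type(values["Type"]))
            for name, values in services.items()]
    rows.sort()
    return [(name, kind, START_TYPES[start]) for start, name, kind in rows]


# Hypothetical registry contents, as they might appear under
# HKLM\System\CurrentControlSet\Services
SAMPLE_SERVICES = {
    "ExampleService": {"Type": 32, "Start": 2},
    "OldDriver": {"Type": 1, "Start": 4},
    "ExampleFilter": {"Type": 1, "Start": 1},
    "ExampleFs": {"Type": 2, "Start": 0},
}
```

`summarize(SAMPLE_SERVICES)` lists `ExampleFs` first (boot start) and `OldDriver` last (disabled), which mirrors the order in which the start phases are processed.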
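The Boot Process (continued) • Smss.exe: • Runs the programs listed in BootExecute, such as autochk and chkdsk • Processes “Delayed move/rename” commands • Initializes the paging files and the rest of the registry • Loads and initializes the kernel-mode part of the Win32 subsystem (Win32k.sys) • Starts Csrss.exe (the user-mode part of the Win32 subsystem) • Starts Winlogon.exe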
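The Boot Process (continued) • Winlogon.exe: • Starts LSASS (Local Security Authority) • Loads the GINA (Graphical Identification and Authentication) and waits for the user to log on • The default GINA is Msgina.dll; you can develop a custom GINA to support biometric logon • Starts Services.exe (the service control manager) • Services.exe starts all Win32 services marked as auto-start • Windows boot complete!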
Part 2 • MBR corruption • Boot sector corruption • Boot.ini misconfiguration • System file corruption • Crashes or hangs • Driver or service startup failure • Logon problems
MBR Corruption • Symptoms: • Hang at a black screen after BIOS executes • “Invalid Partition Table”, “Error loading operating system” or “Missing operating system” message on black screen • Cause: • MBR is corrupt • Resolution: • Boot into Recovery Console • Execute the RC’s “fixmbr” command • If the partition table is corrupt you have to rely on restoring a backup MBR or use 3rd-party disk repair tools
The Recovery Console • Description: • Simple repair-oriented command-line environment • Built on a minimal NT kernel • Bootable from Win2K/XP/Server 2003 Setup CD • Type “r” to repair and then select the installation • Installable onto hard disk (winnt32.exe /cmdcons)
The Recovery Console • Capabilities: • File commands: rename, move, delete, copy • Service/Driver commands: listsvc, enable, disable • MBR/Boot sector commands: fixmbr, fixboot • Limitations: • Must “log into” the system with the Administrator password • Limits on what you can access: • Only access system directory and root of non-removable media • Can only copy files onto system, not off • You can override these in the Local Security Policy editor (secpol.msc) on the installation when it’s running • No networking, file editing, or registry editing
Boot Sector Corruption • Symptoms: • Black screen hang • “A disk read error occurred”, “NTLDR is missing” or “NTLDR is compressed” error message on black screen • Cause: • Boot sector corruption • Troubleshooting: • Boot into RC • Execute “fixboot” command
Boot.ini Problems • Symptom: • NTOSKRNL complains that boot device is inaccessible • Cause: • Boot.ini is missing or corrupt • Boot.ini is out-of-date because a partition has been added
Boot.ini Problems • Troubleshooting: • Boot into RC • Run Bootcfg /rebuild
System File Corruption • Symptom: • Error message indicating that NTLDR, NTOSKRNL.EXE, HAL.DLL or other system file is missing or corrupt • Blue screen with corruption message
System File Corruption • Causes: • Disk is corrupt • File is missing or corrupt • Troubleshooting: • Boot into RC • Run Chkdsk • If no chkdsk errors obtain clean copy of file and replace file • Check in \Windows\System32\DLLCache for backup • Replacement must be identical match i.e. from same hotfix or service pack • If can’t find replacement use Automated System Recovery (ASR)
Automated System Recovery (ASR) • Description: • Backup of all system state and user data on system volume • Includes registry, system files, boot sector, MBR • Made by Windows Backup • Boot into ASR from Windows setup (press F2 when prompted) and insert the ASR floppy • Capabilities: • Will restore entire system state, including boot sector, MBR, system files, and registry • Limitations: • You have to keep the backup up-to-date • No control over granularity of restore (all-or-nothing) • LAST Choice: Repair Install
SYSTEM Hive Corruption • Symptom: • NTLDR reports that System hive is corrupt • Causes: • Disk is corrupt • System hive is corrupted or deleted
System Hive Corruption • Troubleshooting: • Boot into RC • Run Chkdsk • Copy backup copy of System hive from \Windows\Repair to \Windows\System32\Config • Windows Setup makes backup after it completes • **Backing up “System State” with Windows Backup updates the Repair directory** • Note: on XP you can get more recent hives from System Restore points (covered later)
Post-Splash Screen Crash or Hang • Symptoms: • System blue screens on boot • Hang before logon prompt appears • NOTE: If system auto-reboots on crash you won’t see the blue screen! • Causes: • Buggy driver • Registry corruption of non-System hive • Troubleshooting: • Last Known Good, or • Safe Mode, or • RC
Accessing Last Known Good • Enable it by pressing F8 and selecting it in the Advanced Options boot menu
LKG Description • Last Known Good (LKG) uses a backup of the registry control set last used to boot successfully • A Control Set is core startup configuration • HKLM\System\ControlSet00n • Control set only includes core OS and driver configuration • Control set does not include Software, SAM, Security, or Users • HKLM\System\Select\Current points at active Control Set
LKG Description • Boot control makes a copy of the control set that booted the system • Copy is ControlSet00n, where 00n is the next available number • After a successful boot: • 1. LastKnownGood is set to the copy • 2. The previous LastKnownGood is deleted • By default, “Successful boot” is determined when • All the auto-start services have started successfully • A successful interactive logon has occurred • Can be overridden programmatically
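The relationship between the Select key and the ControlSet00n subkeys can be illustrated with a short Python sketch. The value names `Current`, `Default`, `Failed` and `LastKnownGood` are the standard values of the Select key. The sample numbers, and the assumption that 0 means "no such control set", are illustrative.

```python
def control_set_name(number):
    """Turn a Select value such as 1 into the key name ControlSet001."""
    return f"ControlSet{number:03d}"


def resolve_select(select_values):
    """Map each role in HKLM\\System\\Select to a control set key name.

    Roles whose value is 0 are skipped (assumed to mean "not set").
    """
    return {role: control_set_name(number)
            for role, number in select_values.items()
            if number != 0}


# Hypothetical Select key contents on a system that has never needed LKG
SAMPLE_SELECT = {"Current": 1, "Default": 1, "Failed": 0, "LastKnownGood": 2}
```

`resolve_select(SAMPLE_SELECT)` maps `Current` to `ControlSet001` and `LastKnownGood` to `ControlSet002`. Choosing LKG from the boot menu would boot from the latter.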
LKG Capabilities • Restores bootable configuration when: • A new driver was installed since the last successful boot • A driver’s settings were modified since the last successful boot • System settings were modified since the last successful boot
LKG Limitations • Doesn’t work if: • An existing driver was updated • A latent driver bug for some reason becomes active • Files or registry hives are missing or corrupt
Leveraging the Failed Control Set • When you use LKG the control set you avoid is saved as the Failed control set • Look at the Failed value in the Select key – this is the control set that you aborted • Export the current control set and failed control set to .reg files • Massage the text so that there are no differences in the control set name • Windiff or Fc to see what’s different
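The "massage the text, then diff" step can be sketched in Python as an alternative to Windiff or Fc. This assumes the two control sets were exported to .reg text and already decoded to strings. The key and value contents in the sample are hypothetical.

```python
import difflib
import re


def normalize(reg_text):
    """Replace ControlSet001, ControlSet002, ... with one common placeholder."""
    return re.sub(r"(?i)\\ControlSet\d{3}(?=[\\\]])", r"\\ControlSetXXX", reg_text)


def diff_control_sets(good_text, failed_text):
    """Return only the changed lines between two exported control sets."""
    good = normalize(good_text).splitlines()
    failed = normalize(failed_text).splitlines()
    diff = difflib.unified_diff(good, failed, "good", "failed", lineterm="", n=0)
    return [line for line in diff
            if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))]


# Hypothetical exports: the same driver key with a different Start value
SAMPLE_GOOD = (
    "[HKEY_LOCAL_MACHINE\\SYSTEM\\ControlSet001\\Services\\ExampleDrv]\n"
    '"Start"=dword:00000003\n'
)
SAMPLE_FAILED = (
    "[HKEY_LOCAL_MACHINE\\SYSTEM\\ControlSet002\\Services\\ExampleDrv]\n"
    '"Start"=dword:00000000\n'
)
```

After normalization the key headers compare equal, so `diff_control_sets(SAMPLE_GOOD, SAMPLE_FAILED)` reports only the `"Start"` value that changed.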
Safe Mode Description • Try Safe Mode if LKG doesn’t work • Accessible from same boot menu as LKG • Idea is to only include core set of drivers/services • Modeled after Safe Mode in Windows 95 • Avoids third-party and unnecessary drivers, which hopefully are what’s causing the boot problem
Safe Mode Description • HKLM\System\CurrentControlSet\Control\SafeBoot guides safe mode by specifying names and groups of drivers • Normal, Network, Command-Prompt • No networking in Normal • Networking includes networking services • Command-Prompt is same as Normal except launches Command Prompt instead of Explorer as shell for when Explorer shell extensions cause logon problems • Directory Services Restore Mode: not for boot troubleshooting (for repairing or restoring Active Directory database from backup)
Safe Mode Internals • Registry keys guide what’s in safe modes: • HKLM\System\CurrentControlSet\Control\SafeBoot\Minimal is for Normal and Command-Prompt • HKLM\System\CurrentControlSet\Control\SafeBoot\AlternateShell specifies shell for Command-Prompt boot • HKLM\System\CurrentControlSet\Control\SafeBoot\Network is for Network • Drivers and services must be listed by name or by group to be loaded • Exception: all boot-start drivers load regardless! • System assumes they are necessary to boot
Using Safe Mode • If Safe Mode works determine what’s wrong: • Compare boot logs • Analyze a crash dump • Boot logging: • Select it from same menu as LKG and Safe Mode and boot to the failure • Saves log in \Windows\Ntbtlog.txt • Reboot in Safe Mode • Safe Mode appends to the boot log • Extract failed boot and Safe Mode entries to separate files, strip “Did not load driver” lines and compare e.g. Windiff, fc
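The boot-log comparison can also be done with a short Python sketch instead of Windiff or fc. It assumes the two boot attempts have already been extracted into separate strings and that entries use the "Loaded driver ..." / "Did not load driver ..." line prefixes. The driver paths in the samples are hypothetical.

```python
LOADED_PREFIX = "Loaded driver "


def loaded_drivers(log_text):
    """Return the set of driver paths a boot attempt loaded.

    Lines that do not start with the "Loaded driver" prefix, including
    "Did not load driver" lines, are ignored.
    """
    return {line[len(LOADED_PREFIX):].strip().lower()
            for line in log_text.splitlines()
            if line.startswith(LOADED_PREFIX)}


def suspects(failed_boot_log, safe_mode_log):
    """Drivers loaded during the failed boot but absent from the Safe Mode boot."""
    return sorted(loaded_drivers(failed_boot_log) - loaded_drivers(safe_mode_log))


# Hypothetical log excerpts
SAMPLE_FAILED_BOOT = (
    "Loaded driver \\WINDOWS\\system32\\ntoskrnl.exe\n"
    "Loaded driver \\SystemRoot\\System32\\Drivers\\example3rdparty.sys\n"
    "Did not load driver \\SystemRoot\\System32\\Drivers\\other.sys\n"
)
SAMPLE_SAFE_MODE = (
    "Loaded driver \\WINDOWS\\system32\\ntoskrnl.exe\n"
    "Did not load driver \\SystemRoot\\System32\\Drivers\\example3rdparty.sys\n"
)
```

The result of `suspects(SAMPLE_FAILED_BOOT, SAMPLE_SAFE_MODE)` is the list of drivers worth examining first, here only the hypothetical `example3rdparty.sys`.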
Analyzing a Crash Dump • Boot into Safe Mode • Download and install the Microsoft Debugging Tools for Windows • Run Windbg and select File|Open Crash Dump • Open \Windows\Memory.dmp if available, otherwise most recent file in \Windows\Minidump • Type !analyze -v to see if debugger identifies faulty driver
Resolving the Faulty Driver Issue • If you can determine what driver is causing the problem: • Roll back to a previous version if one is available and known to be stable, or • Disable it with Device Manager • Note: can’t do this for non-PnP drivers: use the registry editor
Using Driver Rollback • Access the rollback option on the Driver tab of a device’s properties • Backup drivers are stored in \Windows\System32\Reinstallbackups
Disabling Drivers • Open the Device Manager on the Hardware page of the System applet • Change usage to Disabled • Or use the SC command to change the start type of a specific driver
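As an illustration of the SC approach, the following Python sketch builds the argument list for `sc config <name> start= <type>`. It only constructs the command and does not run it. The driver name is hypothetical, and the mapping from the registry Start values to SC keywords (with manual written as `demand`) is stated here as an assumption to check against your system's `sc config` help.

```python
# SC keywords for the registry Start values 0..4 (manual is spelled "demand")
SC_START_KEYWORDS = {0: "boot", 1: "system", 2: "auto", 3: "demand", 4: "disabled"}


def sc_config_command(driver_name, start_value):
    """Build the argument list for changing a driver's start type with SC.

    SC expects "start=" and its value as separate arguments
    (on the command line: start= disabled, with a space after "=").
    """
    return ["sc", "config", driver_name, "start=", SC_START_KEYWORDS[start_value]]


# Hypothetical driver name, for illustration only
SAMPLE_COMMAND = sc_config_command("ExampleDrv", 4)
```

On a Windows system with administrative rights the list could be passed to `subprocess.run(SAMPLE_COMMAND, check=True)`. Printing the command first is a safer habit when the driver in question may be needed to boot.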
Finding the Faulty Driver • There are three approaches when you can’t determine what driver is causing the boot to fail: • Use the Driver Verifier to catch the faulty driver • Disable drivers that don’t load in Safe Mode one by one until the system boots normally • Use System Restore (Windows XP only) as a last resort
The Driver Verifier • The Driver Verifier catches drivers performing illegal operations: • Buffer overflow • Invalid memory access • Invalid I/O commands • Launch it with Start->Run->Verifier • Enable the Driver Verifier on all drivers from within Safe Mode • Choose “custom settings” and then “select individual settings” • Check all settings except “low resource simulation” • Boot normally and you’ll hopefully get a crash that is easy to analyze • Note: the Driver Verifier is disabled in Safe Mode
System Restore Description • Rollback system to previous state (registry, COM+ registration database, user profiles, other files not protected by WFP) • New to XP (not included with Server 2003) • Enabled by default • Replacement of certain file types causes original version to be stored in a restore point folder • 569 file types monitored—see Platform SDK for list • Restore operation replaces these files • Implemented as a service and a filter driver • Access the System Restore Wizard from Start->Help and Support->System Restore • Safe Mode asks when you log in if you want to run the wizard
System Restore Creation • Restore Points are created: • Every 24 hours when no one is logged on • Every 12 hours when someone is logged on • When installing an unsigned driver • When explicitly requested by user or an install program (via an API or script) • Start->Help and Support -> System Restore
System Restore Internals [Diagram] User mode: applications. Kernel mode: a file system request passes through the System Restore filter (Change.log1) to the file system driver (NTFS/FAT). Saved files (A0009653.exe, A0009654.ini) are shown under \System Volume Information\_restore{XX-XXX-XXX}\RP5
Using System Restore • Note that you can also use restore points to obtain backup registry hives
When Safe Mode Fails • Symptom: • Safe mode crashes the same as a normal boot • Causes: • The driver causing the crash also loads in safe mode • Troubleshooting: • Determine the problematic driver: • Boot into RC and look at the last line in the boot log • Boot into debugging mode • Disable it with the RC’s “disable” command
The Logon Process • Winlogon sends username/password to Lsass • Either on local system for local logon, or to Netlogon service on a domain • Creates processes for executables listed in HKLM\Software\Microsoft\Windows NT\CurrentVersion\WinLogon\Userinit • By default: Userinit.exe • Runs logon script, restores drive-letter mappings, starts shell • Userinit creates a process to run HKLM\Software\Microsoft\Windows NT\CurrentVersion\WinLogon\Shell • By default: Explorer.exe • There are other places in the Registry that control programs that start at logon
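When troubleshooting logon problems it can help to see exactly which executables the Userinit value lists. The sketch below assumes the value is a comma-separated list of paths, which is how it commonly appears. The sample value and the second path in it are hypothetical.

```python
import ntpath


def userinit_programs(value):
    """Split a Userinit registry value into its individual executable paths."""
    return [part.strip() for part in value.split(",") if part.strip()]


def unexpected_programs(value, expected=("userinit.exe",)):
    """Return listed programs whose file name is not in the expected set."""
    return [path for path in userinit_programs(value)
            if ntpath.basename(path).lower() not in expected]


# Hypothetical Userinit value with an extra, non-default entry
SAMPLE_USERINIT = r"C:\WINDOWS\system32\userinit.exe,C:\Temp\example.exe,"
```

`unexpected_programs(SAMPLE_USERINIT)` returns only `C:\Temp\example.exe`, the entry that differs from the default and is worth a closer look.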
Logon Errors • Run MsConfig (XP and higher) • Doesn’t show you lots of things • Run Sysinternals Autoruns to see what applications automatically start • Select “show only non-microsoft” to isolate third-party applications
http://support.microsoft.com/kb/326841 • http://support.microsoft.com/kb/324465