GOPAS TechEd 2012 Ing. Ondřej Ševeček | GOPAS a.s. | MCM: Directory Services | MVP: Enterprise Security | ondrej@sevecek.com | www.sevecek.com | Active Directory Replication Issues and Troubleshooting
Active Directory Replication Issues and Troubleshooting Network Services
Central Database • LDAP – Lightweight Directory Access Protocol • database query language, similar to SQL • TCP/UDP 389, SSL TCP 636 • Global Catalog (GC) – TCP/UDP 3268, SSL TCP 3269 • D/COM Dynamic TCP – Replication • D/COM Dynamic TCP – NSPI • Kerberos • UDP/TCP 88 • Windows NT 4.0 SAM • SMB/CIFS TCP 445 (or NetBIOS) • password resets, SAM queries • SMB/DCOM Dynamic TCP • NTLM pass-through • Kerberos PAC validation
Design Considerations • Distributed system • DCs disconnected for very long times • several months • Multimaster replication • with some FSMO roles
Design Considerations • Example: Caribbean cruises, DC/IS/Exchange on board with tens of workstations and users, some staff hired during the journey. No satellite connectivity, or only a poor one. DCs are synced after the ship is berthed at the main office. • Challenge: Must work independently for long time periods. Different independent cruise liners/DCs can accommodate changes to user accounts, email addresses, Exchange settings. Cannot afford to lose any of them.
Database • Microsoft JET engine • JET Blue • common with Microsoft Exchange • used by DHCP, WINS, COM+, WMI, CA, CS, RDS Broker • %WINDIR%\NTDS\NTDS.DIT • ESENTUTL • Opened by LSASS.EXE
Installed services (diagram) • LSASS • TCP 445 SMB + Named Pipes • Security Accounts Manager • D/COM Dynamic TCP • UDP, TCP 88 Kerberos • Kerberos Key Distribution Center • UDP, TCP 389 LDAP • Active Directory Domain Services • NTDS.DIT
Installed services (diagram) • LSASS • NT4.0 • TCP 445 SMB + Named Pipes • NTLM Pass-through • PAC validation • SAM • D/COM Dynamic TCP • Connect to domain • UDP, TCP 88 Kerberos • KDC • Windows 2000+ • UDP, TCP 389, ... LDAP • NTDS • LDAP/ADSI Client • NTDS Replication • FIM/DRS API Client
Uninstallation • DCPROMO • requires working replication connectivity with other DCs • DCPROMO /forceremoval • does not access network at all • can run in DS Restore Mode
NTDSUTIL Metadata Cleanup • Connections • Connect to server srv2.idtt.local • Quit • Select operation target • List sites • Select site 0 • List domains in site • Select domain 0 • List servers in site • Select server 0 • Quit • Remove selected server
Active Directory Replication Issues and Troubleshooting Topology
Knowledge Consistency Checker (KCC) • runs 5 minutes after boot • Repl topology update delay (secs) • runs every 15 minutes periodically • Repl topology update period (secs)
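The KCC timing above can be sketched as simple arithmetic. This is a minimal illustration only: the function name `kcc_run_times` and the example boot time are hypothetical, and the two constants are the defaults quoted on the slide (the registry values named there would override them).

```python
from datetime import datetime, timedelta

# Defaults quoted on the slide; both are configurable in the registry.
KCC_UPDATE_DELAY_SECS = 5 * 60    # "Repl topology update delay (secs)"
KCC_UPDATE_PERIOD_SECS = 15 * 60  # "Repl topology update period (secs)"


def kcc_run_times(boot, count,
                  delay=KCC_UPDATE_DELAY_SECS,
                  period=KCC_UPDATE_PERIOD_SECS):
    """Return the first `count` expected KCC runs after `boot` (hypothetical helper)."""
    first = boot + timedelta(seconds=delay)
    return [first + timedelta(seconds=period * i) for i in range(count)]


# Example boot time, chosen only for illustration.
boot = datetime(2012, 1, 1, 8, 0, 0)
for run in kcc_run_times(boot, 3):
    print(run.strftime("%H:%M"))
```

With the defaults, a DC booted at 08:00 would be expected to run the KCC at 08:05, 08:20 and 08:35.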
Intrasite Replication Topology (diagram: DC1, DC2, DC3, DC4)
Originating Updates and Notifications (diagram: DC1, DC2, DC3, DC4; labels: 15 sec, 3 sec, 3 sec)
Notification and Replication (diagram: DC1, DC2) • "I have got some changes": random TCP, DCOM, Kerberos authenticated • "Give me your replica": random TCP, DCOM, Kerberos authenticated
Intrasite Replication – 3 Hops max. (diagram: DC1 to DC7)
Intersite Replication (no Bridgeheads) (diagram: DC1 to DC7)
Intersite Replication (no Bridgeheads) (diagram: DC1 to DC7; labels: 15 sec, 3 sec, schedule)
Intersite Replication with a Bridgehead (diagram: DC1 to DC7; labels: 15 sec, 3 sec, schedule)
Intrasite Replication • Uses notifications by default (originating/received) • 300/30 sec on Windows 2000 • 15/3 sec on Windows 2003 • Occurs every hour as scheduled • nTDSSiteSettings • At this frequency KCC detects unavailable partners • HKLM\System\CCS\Services\NTDS\Parameters • Replicator notify pause after modify (secs) • Replicator notify pause between DSAs (secs)
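To make the two notification delays concrete, here is a rough sketch of when one DC would notify each of its intrasite partners after a change. It assumes the reading suggested by the registry value names: the first value is the pause after a modification before the first partner is notified, and the second is the pause between successive partners. The function name `notification_offsets` is hypothetical, and transfer time is ignored.

```python
def notification_offsets(partner_count,
                         pause_after_modify=15,
                         pause_between_dsas=3):
    """Seconds after a change at which each partner is notified.

    Assumption: partner 0 is notified after `pause_after_modify`,
    and every further partner `pause_between_dsas` seconds later.
    """
    return [pause_after_modify + i * pause_between_dsas
            for i in range(partner_count)]


# Windows 2003 defaults from the slide (15/3 sec), three partners.
print(notification_offsets(3))
# Windows 2000 defaults from the slide (300/30 sec), three partners.
print(notification_offsets(3, pause_after_modify=300, pause_between_dsas=30))
```

Under this assumption the delays add up hop by hop, which is why the hop count within a site matters for end-to-end latency.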
Intrasite Replication (diagram: DC1, DC2) • notification, 15 sec, random TCP • download changes, random TCP • download changes, schedule, random TCP
Intersite Replication (diagram: DC1, DC2) • download changes, schedule, random TCP
Intersite Replication • Does not use notifications by default • siteLink: options = USE_NOTIFY (1) • Compression used • siteLink: options = DISABLE_COMPRESSION (4) • Bridge all site links
Site Link Design (Better?) Olomouc Paris London Roma Berlin Cyprus
Site Link Design (Worse?) Olomouc Paris London Roma Berlin Cyprus
Static TCP for Replication • HKLM\System\CurrentControlSet\Services • NTDS\Parameters • TCP/IP Port = DWORD • Replication + NSPI • Netlogon\Parameters • DCTcpipPort = DWORD • LSASS (Pass-through) • NTFRS\Parameters • RPC TCP/IP Port Assignment = DWORD • DFSRDIAG StaticRPC /port:xxx /Member:dc1
Urgent Replication (Notification) • Intrasite only • intersite also if notification enabled • Do not wait for delay (15/3 sec) • In the case of • account lockout • password and lockout policy • RID FSMO owner change • DC password or trust account password change
Immediate Replication (Notification) • Password changes • from DCs to PDC • Regardless of site boundaries • PDC downloads only the single user object • all changed attributes but only single object • From DC/PDC further with normal replication
Example Replication Traffic • Atomic replication of a single object with a one byte attribute change • Notification + replication • intersite compressed • Overall 7536 B • 30 packets ~10 round trips • 50 ms round trip means 500 ms transfer time • consumption at 120 kbps • Useful data ~80 B
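The figures on this slide follow from simple arithmetic, reproduced below as a sketch. All inputs are the numbers quoted above; the variable names are only illustrative.

```python
# Inputs quoted on the slide.
total_bytes = 7536     # bytes on the wire for one replicated change
round_trips = 10       # approximate round trips for the 30 packets
round_trip_ms = 50     # assumed WAN round-trip time in milliseconds
useful_bytes = 80      # approximate payload actually needed

# 10 round trips at 50 ms each.
transfer_ms = round_trips * round_trip_ms

# Bits per millisecond is numerically the same as kilobits per second.
throughput_kbps = total_bytes * 8 / transfer_ms

# Share of the traffic that is useful data.
useful_share = useful_bytes / total_bytes

print(transfer_ms, "ms")
print(round(throughput_kbps, 1), "kbps")
print(round(useful_share * 100, 1), "% useful")
```

The transfer takes 500 ms, which works out to roughly 120 kbps while it lasts, and only about 1% of the bytes sent are the changed data itself.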
Bridge All Site Links On • site links are transitive • can be disabled on IP transport • (diagram: sites Olomouc, Prague, London, Paris, Roma, Cyprus, with DCs labelled A or B)
Bridge All Site Links Off • site links are not transitive • Cyprus partition is cut off • (diagram: sites Olomouc, Prague, London, Paris, Roma, Cyprus, with DCs labelled A or B)
GC Replication • one-way: from the source NC into the nearest GC • two-way: GCs between themselves • (diagram: sites Olomouc, Prague, London, Paris, Roma, Cyprus, with DCs labelled A or B; GCs in Olomouc, Prague and London)
GC Replication • one-way: from the source NC into the nearest GC • two-way: GCs between themselves • (diagram: sites Olomouc, Prague, London, Paris, Roma, Cyprus, with DCs labelled A or B; a single GC)
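The effect of turning bridging off can be sketched as a reachability check. The topology below is hypothetical and much smaller than the one in the diagrams, the function name `replication_partners` is made up for this example, and the model covers a single domain partition only: with bridging on, a path may pass through any site; with bridging off, a site without a DC for the domain cannot relay its changes.

```python
from collections import deque

# Hypothetical topology, not the one drawn on the slides.
SITE_LINKS = [("Olomouc", "Prague"), ("Prague", "Roma"), ("Roma", "Cyprus")]
DOMAINS_IN_SITE = {
    "Olomouc": {"A"},
    "Prague": {"A"},
    "Roma": {"B"},      # no DC for domain A here
    "Cyprus": {"A"},
}


def replication_partners(start, domain, site_links, domains_in_site, bridged):
    """Sites whose DCs for `domain` can exchange changes with `start`."""
    hosts = {s for s, doms in domains_in_site.items() if domain in doms}
    adjacency = {}
    for a, b in site_links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = {start}
    queue = deque([start])
    while queue:
        site = queue.popleft()
        for nxt in adjacency.get(site, ()):
            if nxt in seen:
                continue
            # Without bridging, a site lacking the domain cannot act as transit.
            if not bridged and nxt not in hosts:
                continue
            seen.add(nxt)
            queue.append(nxt)
    return (seen & hosts) - {start}


print(replication_partners("Cyprus", "A", SITE_LINKS, DOMAINS_IN_SITE, True))
print(replication_partners("Cyprus", "A", SITE_LINKS, DOMAINS_IN_SITE, False))
```

In this toy layout Cyprus reaches Olomouc and Prague while bridging is on, and reaches nothing once it is off, because its only neighbour hosts a different domain.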
Subnetting in AD (Apps) 10.10.x.x / 16 10.10.0.248 / 29 DC1 Exchange DC5 Exchange DC2 Exchange DC3 DC4
Subnetting in AD (Recovery) 10.10.x.x / 16 Recovery Site 10.10.0.7 / 32 DC1 DC5 DC2 DC3 DC4
Rebuilding After Failure • Inter-site • IntersiteFailuresAllowed • MaxFailureTimeForIntersiteLink (secs) • Intra-site (immediate neighbors) • CriticalLinkFailuresAllowed • MaxFailureTimeForCriticalLink • Intra-site (optimization for non-critical) • NonCriticalLinkFailuresAllowed • MaxFailureTimeForNonCriticalLink
Active Directory Replication Issues and Troubleshooting Modifications
Modification operations • Create new object • Modify attributes • change/delete value • change distinguishedName = rename • Rename container • all subobjects renamed as well
Replication Metadata • REPADMIN /ShowObjMeta • all attributes • when • originating DC
Replication conflicts • The later action wins • if neither is later, an arbitrary tie-breaker decides (USN) • Attribute modified on two DCs “simultaneously” • only one change wins • Linked multivalue attribute modified • merged (on 2003+ forest level) • Object/container deleted and object modified • deleted • Object moved into a deleted container • CN=LostAndFound • Two objects with the same sAMAccountName, cn or userPrincipalName created • object renamed, duplicate logon names
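The "later action wins" rule for a single attribute can be sketched as a comparison of write stamps. This is a simplified model of the rule as stated on the slide, not the actual metadata stamp used by the directory, which holds more fields than shown here. The class `AttributeWrite`, the function `resolve` and the use of the originating DC name as the tie-breaker are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeWrite:
    value: str
    timestamp: int   # seconds; when the originating write happened
    origin_dc: str   # hypothetical tie-breaker when timestamps are equal


def resolve(a, b):
    """Pick the winning write: later timestamp first, then the tie-breaker."""
    return max(a, b, key=lambda w: (w.timestamp, w.origin_dc))


# Same attribute changed on two DCs one second apart: the later one wins.
w1 = AttributeWrite("Sales", timestamp=100, origin_dc="DC1")
w2 = AttributeWrite("Marketing", timestamp=101, origin_dc="DC2")
print(resolve(w1, w2).value)

# Equal timestamps: the tie-breaker decides, and every DC picks the same winner.
w3 = AttributeWrite("Finance", timestamp=100, origin_dc="DC2")
print(resolve(w1, w3).value, resolve(w3, w1).value)
```

The point of a deterministic tie-breaker is that every DC, comparing the same two writes in any order, arrives at the same winner without talking to the others.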
Replication 11:05 (diagram) • DC1: Kamil 10:00, Helen 11:00 • DC2: marker DC1 9:00
Replication Basics 11:30 (diagram) • DC1: Kamil 10:00, Helen 11:00 • DC2: Kamil 10:00, Helen 11:00; marker DC1 11:30
Replication Basics 12:05 (diagram) • DC1: Kamil 10:00, Helen 11:00, Judith 12:00 • DC2: Kamil 10:00, Helen 11:00; marker DC1 11:30
Replication Basics 12:30 (diagram) • DC1: Kamil 10:00, Helen 11:00, Judith 12:00 • DC2: Kamil 10:00, Helen 11:00, Judith 12:00; marker DC1 12:30
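The sequence of diagrams above can be read as a pull with a per-source marker: DC2 remembers how far it has synced with DC1 and asks only for newer changes. The sketch below replays the 12:05 to 12:30 step under that reading. It uses clock times as the slides do (an actual DC tracks update sequence numbers rather than clock times), and the helper names `hm` and `pull_changes` are hypothetical.

```python
def hm(text):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)


def pull_changes(source_objects, marker):
    """Return the source's objects changed after the destination's marker."""
    return {name: changed for name, changed in source_objects.items()
            if changed > marker}


# State at 12:05 as shown in the diagram.
dc1 = {"Kamil": hm("10:00"), "Helen": hm("11:00"), "Judith": hm("12:00")}
dc2 = {"Kamil": hm("10:00"), "Helen": hm("11:00")}
dc2_marker_for_dc1 = hm("11:30")

# Replication at 12:30: only Judith is newer than the marker.
delta = pull_changes(dc1, dc2_marker_for_dc1)
dc2.update(delta)
dc2_marker_for_dc1 = hm("12:30")

print(sorted(delta))
print(dc2 == dc1)
```

Only the one new object crosses the wire, and afterwards both replicas hold the same three objects.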
Replication Basics DC1 12:30 12:30 Kamil 10:00 DC1 Helen 11:00 DC1 Judith 12:00 DC1 DC2 Kamil 10:00 Helen 11:00 Marie 11:00 Me Judith 12:00 DC1 DC3
Replication Basics DC1 12:30 12:30 Kamil 10:00 DC1 Helen 11:00 DC1 DC2 Judith 12:00 DC1 Kamil 10:00 Helen 11:00 Marie 11:00 Me Judith 12:00 DC1 Kamil 10:00 DC1 DC3 DC1 10:30 DC2 7:00