tapeSAN and NDMP Backup for Data ONTAP Systems
Piyush Agarwal, SDT Americas Team Lead, D2D, Backup, VTL
tapeSAN Module: Objectives
At the end of this module, you will be able to:
• Describe the components of tapeSAN solutions for Data ONTAP systems and the certification matrix
• Describe the components of NDMP solutions for Data ONTAP systems and their integration with backup applications
• Understand NDMP: situation analysis, challenges, and road map
• Describe the role of VTL in simplifying NDMP backups of filers
• Apply best practices and recommendations for NDMP backup issues
• Find tools, information, and contacts
NDMP Learning Courseware
• ILT and self-learning courses on NDMP are offered by NetAppU
• Due to limited time and facilities, hands-on exercises had to be omitted from this training
• Students interested in expanding their expertise on NDMP backups are encouraged to take the NetAppU course, which also includes hands-on exercises
Self-Learning Outline: Data Protection Solution for Filers Using NDMP
Students are encouraged to take the self-learning DP course, which teaches NDMP configuration for NetBackup. The course outline is listed below:
• 4.1 Introduction
• 4.2 NetBackup for NDMP Architecture & Components
• 4.3 Supported Topologies
• 4.4 Installing NetBackup for NDMP
• 4.5 Configuring NDMP on the NetApp Storage System
• 4.6 NDMP Backup and Restore
• 4.7 Shared Storage Option (SSO)
• 4.8 Best Practices & Troubleshooting
Self-Learning Outline: Installation and Configuration of NDMP
Hands-on: NetBackup for NDMP
• 5.1 Configuring Tape Devices on the NetApp Storage System
• 5.2 Configuring NDMP Backup to NDMP-Attached Devices
• 5.2.1 Authorizing Access to the NDMP Host
• 5.2.2 Media Manager Device Configuration
• 5.3 Verify NDMP Password and/or Robot Connection
• 5.4 Adding NDMP Storage Units
• 5.5 Creating an NDMP Policy
• 5.6 Performing an NDMP Backup
• 5.7 Performing a Restore from the Server
Summary of High-Level Steps for Configuring NDMP Backup
• On the filer: cable and zone the tape device to the filer HBA
  # storage disable -f adapter xx   } bouncing the adapter makes it reconfigure
  # storage enable adapter xx       } the devices visible to it on the SAN
  > sysconfig -t    displays tape devices visible to the filer
  > sysconfig -m    displays changers visible to the filer
• Test connectivity, filer to tape devices: load a tape in the tape drive
  > mt -f nrstxx status
  > dump 0uf nrstxx /vol/vol0/etc
• On the backup server:
  Configure NDMP authentication (see the sketch below)
  Define robot and tape drives
  Define NDMP storage unit
  Define NDMP policies
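The filer-side half of "Configure NDMP authentication" is not shown above; a minimal sketch, assuming Data ONTAP 7-mode and an already-existing non-root user (the name backupuser is hypothetical):

  > ndmpd on                    start the NDMP daemon on the filer
  > ndmpd status                verify the daemon is running and list active sessions
  > ndmpd version               show the NDMP protocol version the filer will negotiate
  > ndmpd password backupuser   generate the NDMP-specific password for backupuser

The user name and the generated password are what you supply when authorizing the NDMP host in the backup application.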
Module Objectives • Description of dump • Common Backup Performance Issues • Troubleshooting Tips • Solving Common Performance Issues & Tools • Best Practices • Further Information
Data ONTAP dump/restore: Phases of dump
• Phase I & II: Create inode maps per the ufs dump format
• Phase III: Record directory structure
• Phase IIIa: Early ACLs phase
• Phase IIIb: Offsetmap phase (only for NDMP backups with file history)
• Phase IV: File dumping phase
• Phase V: Late ACLs phase (same as IIIa; exists for backward compatibility)
Basic Performance Issues
• Fragmentation
• Very large number of small files
• Too few disks in the volume
• File history processing
Basic Performance Issues: Fragmentation
• Exists in almost every file system
• Reduces read chain lengths and increases the number of disk seeks required
• Can be measured with "wafl scan measure_layout"
• Defragmentation can be done with "wafl scan reallocate"; however, it needs enough free space on the volume and is effective only for files larger than 128K (see the console sketch below)
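A console sketch of the measurement and reallocation pass, assuming a hypothetical volume vol1 (wafl scan lives under advanced privilege, and the exact argument syntax can vary slightly by Data ONTAP release):

  > priv set advanced
  *> wafl scan measure_layout vol1   reports average chain length; values near 1.0 mean good layout
  *> wafl scan reallocate vol1       rewrites blocks sequentially (needs free space; helps files > 128K)
  *> priv set admin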
Basic Performance Issues: Too Many Small Files
• Affects the logical characteristics of dump
• Dump needs to spend time gathering metadata for each file that it records
• Small files drastically reduce chain lengths and increase the number of disk seeks
• For NDMP backups with file history, backup applications tend to take a lot of time and resources to process the file history; this flow-controls the backup, and in some cases the backup application may crash
Basic Performance Issues: Limited Number of Disks in the Volume
• The more disks in a volume, the more parallelism can be used when reading data from WAFL
• The symptom is near-100% disk utilization in sysstat
• More disks can be added fairly easily; however, just adding disks does NOT help by itself. The existing data must be rewritten to take full advantage of the added disks; otherwise the volume still behaves as fragmented, with no gain in read speed (see the sketch below)
• Data can be rewritten using NDMPcopy or a backup and restore of the volume in question
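A hedged sketch of growing the spindle count (7-mode commands; vol1, aggr1, and the disk counts are illustrative):

  > vol add vol1 2      add two spare disks to traditional volume vol1
  > aggr add aggr1 2    for flexible volumes, grow the containing aggregate instead

Per the caveat above, after adding disks rewrite the data (for example with NDMPcopy, sketched later in this module) so that reads actually spread across the new spindles.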
Troubleshooting Tools
• Most dump performance issues can be diagnosed with the following tools:
  • Data ONTAP dump to null (baseline)
  • perfstat
  • sysstat (prefer "sysstat -x <interval>")
  • statit
  • wafl_susp
  • wafl scan measure_layout
  • Logs: backup application, Data ONTAP
Troubleshooting Tools: Data ONTAP dump to null
• Use the Data ONTAP dump command to establish a baseline
• Eliminates NDMP and file-history overhead
• Use the null device to find the maximum theoretical backup performance:
  > dump 0f null <dir>
  where <dir> can be a volume, qtree, or subdirectory. This tests a backup to the null device (no tape drive is needed) and estimates what the file system can deliver during a backup.
• Use df and df -i to display used space and used inodes (see the sample session below)
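For example, a baseline session might look like this (vol1 and qtree1 are hypothetical names):

  > dump 0f null /vol/vol1/qtree1   throughput of this run is the file-system ceiling for backup speed
  > df /vol/vol1                    shows used and available space
  > df -i /vol/vol1                 shows used and free inodes; a very high inode count flags the small-files problem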
Troubleshooting Tools: perfstat
• perfstat (downloadable from http://now.netapp.com/)
• Use with assistance from NetApp Global Services
• Needs to be run while the problem is actually occurring
• Can definitively diagnose most performance issues
Troubleshooting Tools: sysstat
• High CPU utilization indicates heavy load on the filer. Part of it may be due to too many small files on the system and some due to other load on the filer; not much can be done in this case.
• A cache hit ratio below 99% may indicate high fragmentation or many small files
• Near-100% disk utilization typically indicates that performance can be improved by adding spindles to the volume; however, data migration may be necessary before the full advantage of the added disks is realized
Troubleshooting Tips
• Data ONTAP sysstat command
  • Shows current tape and disk read/write performance
  • Use sysstat -u or sysstat -x
  • High disk utilization indicates too few data disks
  • Check for other activity polluting the diagnosis

  CPU  Total  Net kB/s   Disk kB/s    Tape kB/s   Cache  Cache  CP   CP  Disk
       ops/s  in   out   read  write  read  write  age    hit   time ty  util
  10%    0    2    0    10420     4     0   11677  >60    100%   0%  -   100%
   8%    0    0    0     7840     0     0    5484  >60    100%   0%  -    99%
  12%    0    0    0    10020     0     0    7439  >60    100%   0%  -   100%
   8%    0    0    0     7285     0     0   10505  >60    100%   0%  -   100%
De-Fragmentation Tools: NDMPcopy
• When wafl scan runs into too many limitations, NDMPcopy (primarily a data migration tool) can be used to migrate data from the existing volume to a new one (sketched below)
• The result is a newly written volume with a clean data layout; the new volume can then be used as the primary copy of the data
• This technique effectively removes fragmentation from the data set and therefore improves overall WAFL read performance, whether for dump, CIFS, or NFS
• NDMPcopy cannot be used to migrate from non-NetApp storage to Data ONTAP (the data formats are different)
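A hedged sketch of both forms of the migration, run from the filer console (volume and filer names are hypothetical; the -sa/-da options carry source and destination NDMP authentication as user:password):

  > ndmpcopy /vol/vol1 /vol/vol1_new
  > ndmpcopy -sa root:password -da root:password filer1:/vol/vol1 filer2:/vol/vol1

After the copy completes, point clients at the new volume and retire or re-initialize the fragmented original.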
Troubleshooting Tips
• Log files
  • bpbrm, bptm, and ndmp logs from the media server
  • /etc/log/ndmpdlog.<date> on the NetApp storage system
• To enable/disable NDMP debugging on the NetApp system (see the workflow sketch below):
  • ndmpd debug 50 - enable debug output
  • ndmpd debug 0 - disable debug output
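A sketch of the capture workflow referenced above (rdfile is the standard 7-mode way to read a file at the console; substitute your system's actual date suffix for <date>):

  > ndmpd debug 50
  ...reproduce the failing backup job...
  > ndmpd debug 0
  > rdfile /etc/log/ndmpdlog.<date>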
Solving Common Performance Issues
• "Small files" problem
  • Disable file history if possible
    • Selective recovery is still possible without file history
    • You lose the ability to browse the image for recovery
    • SET HIST=n in the file list of the NDMP policy (see the sketch after this list)
  • Split the volume into smaller qtrees (about 500 GB per qtree)
  • Use a large aggregate/flexible volume; defragment the volume
  • Select backups by qtree rather than by subdirectory
  • Split the backup into multiple jobs/streams
  • Use local NDMP (VTL) instead of 3-way or remote NDMP
• File history performance
  • Split NDMP backups across multiple media servers
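As referenced in the list above, a hedged sketch of a NetBackup NDMP policy file list (backup selections) with file history disabled; the qtree paths are hypothetical:

  SET HIST=n
  /vol/vol1/qtree1
  /vol/vol1/qtree2

With HIST=n the image can no longer be browsed file-by-file in the NetBackup catalog, but selective restores by explicitly typed path remain possible.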
Best Practices
• Involve NGS Consulting Services early
• Use large aggregates (Data ONTAP 7.x)
• Use flexible volumes (Data ONTAP 7.x)
• Size volumes according to backup-window and workload requirements
• Use qtrees and limit qtree size (about 500 GB)
• Back up at the volume/qtree level, not at the subdirectory level
• Leverage Snapshot technology for daily recovery (see the schedule sketch below)
• Switch to "local" backup to VTL
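For the Snapshot recommendation, a hedged example of a 7-mode schedule (vol1 and the retention counts are illustrative):

  > snap sched vol1 0 6 8@8,12,16,20   keep 0 weekly, 6 nightly, and 8 hourly snapshots (taken at 8:00, 12:00, 16:00, and 20:00)
  > snap list vol1                     verify what is retained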
Contacts
• Product Management: Bill Webster
• PPE* certification team:
  • tapeSAN (library vendors): Robert Jacobs
  • NDMP (backup applications): Pradeep Sheshadri
• Engineering:
  • NGS Consulting & Field Services
  • SDT: Piyush Agarwal
* PPE = Product and Partner Engineer
Further Information
• NetApp on the Web (NOW) customer service website: http://now.netapp.com/
• NetBackup for NDMP System Administrator's Guide
• Data ONTAP Tape Backup and Recovery Guide
• Data Protection Strategies for NetApp Storage: http://www.netapp.com/tech_library/3066.html
Q & A: Thank You

NDMP for Data ONTAP: Configuration, Troubleshooting, and Best Practices
Piyush Agarwal, SDT Americas Team Lead, D2D, Backup, VTL