Running CF in a Shared and Dedicated Hosting Environment “You can’t say that I didn’t tell ya!” What I wish that I could tell every customer before stuff happens. Tim Nettleton timothy.nettleton@hostcentric.com
Hosting Environment: • Single Site Shared • Single Site Dedicated • Multi-Server Dedicated • All CF Applications Goal: “To provide a stable and flexible application platform for customers to experience success and grow through profitability toward ownership.” Hosting Obstacles: • Performance • Scalability • Tools and Solutions! • Security • Stability
Performance: CF Configurations • Limit simultaneous requests: 15 • Timeout requests at 75 seconds • Restart on 3 unresponsive requests • Restart CFAS on abnormal termination • Suppress whitespace • Tip: Stay away from ?RequestTimeout=1000000 • Enforce Strict Attribute Validation • Missing Template Handler and Default Error Handler are both empty?
Performance: Caching Settings Approx 2x total .cfm template pool. Trusted enabled for Production/Non-development environments. Client Variable Storage • Default storage to NT Registry and purge at 5 days. • Only RDBMS systems allowed for External Client Storage Variables • DO NOT increase your Application, Session variables beyond ‘acceptable’ limits
Performance: Tip: If you don’t use Client variables, don’t make CF track them. Example code that creates unnecessary overhead: <CFAPPLICATION NAME="CF2001" SESSIONMANAGEMENT="YES" CLIENTMANAGEMENT="YES"> Corrected code without Registry interaction: <CFAPPLICATION NAME="CF2001" SESSIONMANAGEMENT="YES" CLIENTMANAGEMENT="NO">
Performance: Logging Settings Log Long Running Templates. They provide an easy way to identify bottlenecks in code and database design. Any template that typically runs more than 10-15 seconds will most likely lose a user’s attention and result in F5 or Alt+F4.
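When a template shows up in the long-running log, one way to narrow down the slow section is to time it by hand. This is only a minimal sketch: the DSN, table, and column names are hypothetical, and the 10-second threshold simply mirrors the lower bound mentioned above.

```cfm
<!--- Hypothetical DSN, table, and columns; replace with the suspect code block. --->
<CFSET StartTick = GetTickCount()>

<CFQUERY NAME="GetOrders" DATASOURCE="MyDSN">
    SELECT OrderID, OrderDate
    FROM Orders
</CFQUERY>

<!--- GetTickCount() returns milliseconds, so the difference is elapsed time in ms. --->
<CFSET ElapsedMs = GetTickCount() - StartTick>

<CFIF ElapsedMs GT 10000>
    <!--- Leave a marker in the page source while debugging; remove it afterwards. --->
    <CFOUTPUT><!-- GetOrders took #ElapsedMs# ms --></CFOUTPUT>
</CFIF>
```

Moving the two tick-count lines around different blocks of the template should show fairly quickly whether the time goes to the database or to the CFML itself.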
Performance: Databases and DSNs File-based databases: • All file-based databases get a limit of ½ the total available threads • “Maintain Database Connections” is Unchecked RDBMS databases: • Use a server IP address, not a HOSTNAME, in the Server field • Provide a Database name with each DSN unless intended otherwise • “Maintain Database Connections” is Checked
Performance: Databases and Code • Use CACHEDWITHIN for common shared queries • Use BLOCKFACTOR for all SELECT queries • Convert CFQUERYs to Stored Procedures • Use CFTRANSACTION(s) around all INSERT, UPDATE and DELETE CFQUERY tags. • Use CFLOCKs with a TIMEOUT value nested inside CFTRY blocks • Use manual caching in the Application or Session scope for pinning commonly requested or non-dynamic SQL. • Run an Index Analyzer or similar tool for the most common queries. • Cache generated content with custom tag sets. • Disable RDS service. (Security)
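A few of the points above can be sketched together. The DSN, table, and column names here are hypothetical, the one-hour cache window and block factor of 50 are arbitrary illustrations, and the example assumes Application.cfm already contains a named CFAPPLICATION tag so that the Application scope exists.

```cfm
<!--- Shared lookup query, cached for one hour so repeat requests skip the database.
      MyDSN, Categories, and the column names are hypothetical. --->
<CFQUERY NAME="GetCategories" DATASOURCE="MyDSN"
         CACHEDWITHIN="#CreateTimeSpan(0, 1, 0, 0)#" BLOCKFACTOR="50">
    SELECT CategoryID, CategoryName
    FROM Categories
    ORDER BY CategoryName
</CFQUERY>

<!--- Pin the result in the Application scope. The CFLOCK carries a TIMEOUT
      and sits inside CFTRY so a lock failure is handled, not shown to the user. --->
<CFTRY>
    <CFLOCK SCOPE="APPLICATION" TYPE="EXCLUSIVE" TIMEOUT="10">
        <CFSET Application.Categories = GetCategories>
    </CFLOCK>
    <CFCATCH TYPE="Lock">
        <!--- Could not get the lock in 10 seconds: keep using the local query result. --->
    </CFCATCH>
</CFTRY>

<!--- Group related writes so they succeed or fail together. --->
<CFTRANSACTION>
    <CFQUERY NAME="AddCategory" DATASOURCE="MyDSN">
        INSERT INTO Categories (CategoryName) VALUES ('New Category')
    </CFQUERY>
    <CFQUERY NAME="LogChange" DATASOURCE="MyDSN">
        INSERT INTO ChangeLog (Note) VALUES ('Category added')
    </CFQUERY>
</CFTRANSACTION>
```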
Performance: Databases and Code • NEVER use CFINSERT and CFUPDATE. • Use CFQUERY DBTYPE="QUERY" sparingly. • Avoid “SELECT * FROM TABLE”. • Use “SELECT INT1,CHAR2,VARCHAR3,NVARCHAR4,BLOB FROM TABLE”, with columns ordered by increasing metadata size. • URL Parameters? “SELECT COLUMN1 FROM TABLE WHERE ID=#ID#” • You expect “DOMAIN.COM/Report.cfm?ID=2001” • You get “DOMAIN.COM/Report.cfm?ID=2001 DELETE FROM TABLE”
Performance: Databases and Code Guard URL parameters with CFQUERYPARAM, CFPARAM, VAL(), explicit validation or CGI.HTTP_REFERER: 1.) “SELECT COLUMN1 FROM TABLE WHERE ID=#VAL(ID)#” 2.) “SELECT …. WHERE ID= <CFQUERYPARAM VALUE="#URL.ID#" CFSQLTYPE="CF_SQL_INTEGER">” 3.) <CFPARAM NAME="URL.ID" TYPE="NUMERIC"> Choosing the right database and reworking malfunctioning code can offer the most immediate Performance and Stability gain.
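Put together in one template, the validation options above might look like the following. This is a sketch rather than a definitive pattern: Report.cfm, MyDSN, and the Reports table are hypothetical names.

```cfm
<!--- Report.cfm (hypothetical). Stop early if URL.ID is missing or not numeric. --->
<CFTRY>
    <CFPARAM NAME="URL.ID" TYPE="NUMERIC">
    <CFCATCH TYPE="Any">
        <CFOUTPUT>Invalid report ID.</CFOUTPUT>
        <CFABORT>
    </CFCATCH>
</CFTRY>

<!--- CFQUERYPARAM binds the value as an integer, so trailing text such as
      "DELETE FROM TABLE" is never executed as SQL. --->
<CFQUERY NAME="GetReport" DATASOURCE="MyDSN">
    SELECT Column1
    FROM Reports
    WHERE ID = <CFQUERYPARAM VALUE="#URL.ID#" CFSQLTYPE="CF_SQL_INTEGER">
</CFQUERY>
```

With this in place, a request for Report.cfm?ID=2001 DELETE FROM TABLE should fail the CFPARAM type check and never reach the query.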
Performance: Code Bottlenecks • Avoid excessive iterations in CFLOOP and CFOUTPUTs. • Avoid CFEXIT as there is no guarantee that it will ever resolve. • CFLOCK all CFHTTP, CFFTP and CFPOP instances as they have a high probability of external failure. Use timeout values and explicit error handling on all. • Be careful not to CFINCLUDE the base template. • Look for a CFERROR page that is prone to errors. • Enable and read debugging info in Administrator • Use PERFMON and cfstat.exe (in CFUSION\BIN\) for periodic analysis
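The advice on external calls could be applied roughly as follows. The URL and lock name are hypothetical, and the timeout values are illustrative only.

```cfm
<!--- Hypothetical remote URL and lock name. --->
<CFTRY>
    <!--- The named lock lets only one request at a time wait on the remote
          server; others give up after 15 seconds instead of piling up. --->
    <CFLOCK NAME="RemoteFeed" TYPE="EXCLUSIVE" TIMEOUT="15">
        <CFHTTP URL="http://www.example.com/feed.cfm" METHOD="GET"
                TIMEOUT="10" THROWONERROR="YES">
        <CFSET FeedContent = CFHTTP.FileContent>
    </CFLOCK>
    <CFCATCH TYPE="Any">
        <!--- Remote failure, timeout, or lock timeout: continue without the feed. --->
        <CFSET FeedContent = "">
    </CFCATCH>
</CFTRY>

<CFIF Len(FeedContent)>
    <CFOUTPUT>#FeedContent#</CFOUTPUT>
<CFELSE>
    <CFOUTPUT>The feed is temporarily unavailable.</CFOUTPUT>
</CFIF>
```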
Scalability: First, choose the right Database. Load Balancing: Hardware or Software? Sticky or Not? Why is sticky bad? It binds a particular user to an application server until the session is terminated, thereby defeating the primary goal of load balancing. How can you avoid sticky? Avoid all server-specific memory-resident variables. Convert to Client variables, cookies or a breed of URL identifiers, similar to the CFID and CFTOKEN sent in a CF URL. Note: Client variables will only take simple data. No structures or queries unless serialized for text storage.
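The note about serializing complex data for Client variables can be illustrated with CFWDDX. This is a sketch only: the cart structure is hypothetical, and it assumes CLIENTMANAGEMENT="YES" with client storage that every server in the cluster can reach.

```cfm
<!--- Hypothetical structure that would otherwise live in the Session scope. --->
<CFSET Cart = StructNew()>
<CFSET Cart.ItemID = 2001>
<CFSET Cart.Quantity = 3>

<!--- Serialize the structure to a WDDX string, then store the string,
      which is simple data, in a Client variable. --->
<CFWDDX ACTION="CFML2WDDX" INPUT="#Cart#" OUTPUT="CartPacket">
<CFSET Client.CartPacket = CartPacket>

<!--- On a later request, possibly on a different server, rebuild the structure. --->
<CFWDDX ACTION="WDDX2CFML" INPUT="#Client.CartPacket#" OUTPUT="RestoredCart">
<CFOUTPUT>Quantity: #RestoredCart.Quantity#</CFOUTPUT>
```

Because the packet travels with the client storage rather than with one server's memory, any server behind the load balancer can restore it.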
Security: • Use CFERROR and CFTRY/CFCATCH to avoid showing an end user any private information • NTFS password protect the Administrator and CFDOCS, or make them only accessible via non-public IP. • Patch your OS and App server like someone is watching! http://www.microsoft.com/technet/ http://www.allaire.com/developer/securityzone/ • Get a firewall with an IDS • Port restrictions and local traffic routing • Have your server professionally scanned. You can bet that someone is scanning it right now! • NEVER put a file-based database in an HTTP accessible directory. That includes Verity collections: “http://www.Domain.com/collection/file/parts/00000001.did” • Protect yourself from URL MDAC hacking by validating input before building dynamic queries. My favorites! • +.HTR, ::$DATA, :$DATA • http://www.yourserver.com/scripts/..%c0%af../winnt/system32/cmd.exe?/c+dir+c:\
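As a rough illustration of the first point, a CFTRY/CFCATCH block can keep driver and SQL details away from the browser. The DSN, table, and wording of the message are hypothetical.

```cfm
<CFTRY>
    <!--- Hypothetical DSN and table. --->
    <CFQUERY NAME="GetOrders" DATASOURCE="MyDSN">
        SELECT OrderID, OrderDate
        FROM Orders
    </CFQUERY>
    <CFCATCH TYPE="Database">
        <!--- Show a generic message only. The DSN name, SQL text, and driver
              error in CFCATCH.Detail stay on the server. --->
        <CFOUTPUT>We could not load your orders. Please try again later.</CFOUTPUT>
        <CFABORT>
    </CFCATCH>
</CFTRY>
```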
Security: [Slide images: Before and After. .CFM, .DBM, .ASP, .ASA, etc. Unicode Hack]
Stability: TIP: Run CYCLE.BAT (in CFUSION\BIN\) to release an ODBC memory leak. Logs, logs and more logs? A thorough examination of the logs, with a complete understanding of what goes in (code), provides insight into “What Happened?!?” • Hung Threads • Long Running Templates • Numeric Errors • Catastrophic Errors • Application Server restarts with Proximity
COSMOS If you have ever looked in the /cfusion/log/ directory you have probably seen one or more of the many Cold Fusion generated error/information logs. These text files can easily grow to hundreds of MB and contain the best indicators of 'what happened'. As with any other service or application, regular review of system logs should be part of normal administration. Unfortunately, because of their large size and the fact that the data is segmented into so many logs, it is difficult to get a complete picture of performance, problems, and failure. Developers who work on a dedicated server can use the Cold Fusion Administrator to view these logs. This can be accomplished by clicking on "Log Files" and then downloading the entire log via a browser. Unfortunately, this is usually not possible given the size of most logs and remote connection speed. For shared developers, the critical information is unavailable due to the nature of the shared environment and security. In most cases, a developer only knows what a site user tells them or what they trap using CFTRY/CFCATCH and CFERROR. Even with these mechanisms in place, the larger picture is unavailable and the majority of performance issues go unnoticed and unattended.
COSMOS • Written mainly with Cold Fusion, COSMOS is an integration of ASP, DOS, Perl, ADSI and Call/VoiceXML. It is a remote management platform that leverages the file system, registry, Metabase, service controls, and performance counters. • Currently, COSMOS contains over 18 million server events. Captured within a maximum of 40 seconds, these events include all of the following: • Application errors • Cold Fusion Application Server Stop/Starts • Hung Threads • Long Running Templates • Missing Templates • Scheduled task results • Undeliverable Emails • Mail sent • There are over 20 reports available to a dedicated client, many of which are also available for shared customers. Below is a listing of them with a brief description of how they impact the development and maintenance cycle.
COSMOS General Application Error Listing - Application errors are the best view into the progress and developmental completeness of a site. A well-coded site generates no application errors. This listing provides a top-down view of the most recent Application errors for all IIS Roots. By clicking on the error message on the right, a popup window displays the error message as displayed to a site visitor.
COSMOS General Missing Template - This applies to all .cfm templates requested by the web server but not found. In most cases, the developer doesn't even know that people are getting "404 File Not Found" messages. If a search engine indexes your site or a user bookmarks a page, a change in the site causes missed business. The solution is to use the Default Missing Template Handler in Cold Fusion Administrator or to add a CFERROR TYPE="REQUEST" in your site's Application.cfm.
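The second option mentioned above might be wired up as in this sketch. The application name, template path, and e-mail address are hypothetical, and the details of what each handler catches should be checked against the documentation for your CF version.

```cfm
<!--- Application.cfm (sketch). Names, path, and address are hypothetical. --->
<CFAPPLICATION NAME="CF2001" SESSIONMANAGEMENT="YES" CLIENTMANAGEMENT="NO">

<!--- Route request errors to a friendly page instead of the default error output. --->
<CFERROR TYPE="REQUEST"
         TEMPLATE="/errors/request_error.cfm"
         MAILTO="webmaster@example.com">
```

A TYPE="REQUEST" error template is limited: it cannot run CFML tags and can only display error variables such as #Error.Diagnostics# and #Error.MailTo#, so keep that page to plain HTML and avoid echoing diagnostics to the visitor.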
COSMOS Long Running Template Listing - This applies to the processing time for pages that take longer than expected. The determination of how long is too long is configured in the Logging/Settings section of Cold Fusion Administrator. A typical setting is 45 seconds, though anything taking that long would most likely be canceled or ignored by the calling client. In addition, a script running for 45 seconds could help identify a performance bottleneck for the Application Server.
COSMOS Undeliverable CFMAIL Listing - When Cold Fusion is unable to deliver a message, the original template is renamed and filed in the /cfusion/mail/undelivr/ directory. An error message is also written to the Mail.log or Error.log that describes the problem preventing proper delivery. This listing brings those two pieces of information together. Clicking on the message at right opens a popup that allows a user to correct and resend the message from their server. This function is indispensable for any business that relies on CFMAIL to reliably carry email and cannot accept undeliverable messages.
COSMOS Hung Thread Listing - Probably the greatest indicator of a performance problem. Hung Threads are Cold Fusion's method of alerting us that it was unable to completely process the requested template. This is usually the result of code or database issues. CF4.x and above has an option in the Administrator to have CF "Restart at n unresponsive requests". Hung Threads directly relate to the operation of the Application server. When the Hung Thread count matches the defined threshold, Cold Fusion reaches a critical point, and will stop/restart itself to avoid excessive down time. Constant examination of Hung Threads is necessary to avoid Application Server failure.
COSMOS Scheduled Task Listing - Most scheduled tasks run completely unnoticed until someone realizes that a critical function has not processed in days. This listing is not much to look at but, under the hood, a huge modification and improvement has been created for the Executive Service. COSMOS can determine if your task started, succeeded, or failed. It will also allow you to define a target string in the page HTML and record the generated content from the target URL to the database. If a scheduled task does not return the defined string, an email containing the content and diagnostics can be generated at the time of failure OR a VoiceXML application can call you with the news.
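The target-string idea can be approximated in plain CFML. This is not the COSMOS implementation, only a minimal sketch of the same check; the task URL, target string, and e-mail addresses are hypothetical.

```cfm
<!--- Hypothetical task URL, success marker, and addresses. --->
<CFSET TaskURL = "http://www.example.com/tasks/nightly.cfm">
<CFSET TargetString = "TASK COMPLETE">

<!--- Request the task page and capture whatever it returns. --->
<CFHTTP URL="#TaskURL#" METHOD="GET" TIMEOUT="60">

<!--- FindNoCase returns 0 when the marker is absent from the generated content. --->
<CFIF FindNoCase(TargetString, CFHTTP.FileContent) IS 0>
    <CFMAIL TO="admin@example.com" FROM="monitor@example.com"
            SUBJECT="Scheduled task failed: #TaskURL#">
The target string "#TargetString#" was not found in the task output.

Returned content:
#CFHTTP.FileContent#
    </CFMAIL>
</CFIF>
```

For this to work, the scheduled template itself has to print the marker string as its last step, so that a partial run never looks like a success.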
COSMOS Aggregation and Stratification More commonly called a GROUPING, the next series of graphs were created to help identify the greatest problems quickly. By examining the data based on Time, Date, and IIS Root, we can gather a greater understanding of where faults exist.
COSMOS Application Errors Stratified by Date
COSMOS Time/Error graph - Especially useful in determining if your day is getting better or worse, this graph breaks down the server's errors by 10-minute increments over a selectable date span. This is often used to diagnose a recurring failure point over a multiple-day or multiple-week period.
COSMOS Long Running Template Aggregation by IIS Root - Similar to the previous Root Aggregations, this has several prominent exceptions. Because a Long Running Page has a value associated with the processing time, I have included a column for the Sum and Average values. Using this display, it is possible to extract the templates that most often run beyond acceptable limits and demand the greatest processing time. This affects performance, though it is not necessarily a failure, and is a fantastic indicator of templates that need to be addressed before they become a stability issue.
COSMOS Hung Thread Aggregation by IIS Root - This graph will often tell which application is responsible for killing the server. Over a selectable date span, one can easily see which sites are causing CF to lose resources. Puppies=Good Hung Threads=Bad
COSMOS One Final Look So when did your Application Server last crash and why? Event Chronology - The first view that brings together data from multiple sources. This report provides a chronological view of all Application Errors, Hung Threads, Long Running Templates, and Application server failures. This information threads events based on time in order to provide a trace leading up to a failure.
COSMOS Spectral Analysis - This graph is unique because it rapidly identifies problems that would otherwise slip under the wire. The three colors representing CF stops (red), starts (green) and Hung threads (purple) are graphed relative to a 24-hour time line.
What now? • Get on all related security mailings. • Read your errors and understand them. • Always look for a better solution: code and database. • Find people that can help when you get stuck. • Never give up.
Running CF in a Shared and Dedicated Hosting Environment “You can’t say that I didn’t tell ya!” “And then the aliens came…………………” Tim Nettleton timothy.nettleton@hostcentric.com • Performance debugging: http://allaire.com/Handlers/index.cfm?ID=8627&Method=Full • Allaire on MS Access: http://allaire.com/Handlers/index.cfm?ID=1540&Method=Full • MSFT on MS Access: http://support.microsoft.com/support/kb/articles/q174/4/96.asp