Learn how to effectively manage and troubleshoot email issues in Unix systems, including vulnerability fixes, DNS management, and network problem diagnosis. This guide, written by Chen Changsheng from the National Chiao Tung University, covers topics such as spam protection mechanisms, troubleshooting and debugging, common filtering mechanisms, and emerging issues like phishing scams.
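Unix E-mail Administration (Vulnerability Remediation, Network Problem Diagnosis, DNS Management, etc.) Chen Changsheng (陳昌盛), Head of the Technology Development Division, Computer and Network Center, National Chiao Tung University (2008.11.26)
Outline • E-mail basics: how it works and administration issues • Spam protection mechanisms • Troubleshooting & Debugging • Common filtering mechanisms (Mail, DNS, IPS/Firewall, etc.) • Questions and discussion • Case studies • Summary and future developments • Appendix • Emerging issue: phishing scam e-mail (Phishing) • References
Part 1 How E-mail Works and Administration Issues (How Spam Arises)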
E-mail Basics • Basic message structure, from RFC 821/2821 • Message = an envelope + content • Content = mail headers + body • MIME extension • For the transmission of images, audio, or other sorts of structured data in electronic mail messages. • MIME document series [RFC2045, RFC2046, RFC2049]
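The envelope/content split can be illustrated with Python's standard `email` package. This is a minimal sketch: the addresses and message text are made up, and the envelope is modelled as a plain dict because it is exchanged in the SMTP dialogue (MAIL FROM / RCPT TO) rather than stored in the message itself.

```python
from email import message_from_string

# Envelope: exchanged in the SMTP dialogue (MAIL FROM / RCPT TO).
# Hypothetical addresses, for illustration only.
envelope = {"mail_from": "alice@example.org", "rcpt_to": ["bob@example.net"]}

# Content = mail headers + blank line + body.
raw_content = (
    "From: Alice <alice@example.org>\n"
    "To: Bob <bob@example.net>\n"
    "Subject: Hello\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "\n"
    "Just a test message.\n"
)

msg = message_from_string(raw_content)
headers = dict(msg.items())      # mail headers
body = msg.get_payload()         # body (a string for a non-multipart message)

print(headers["Subject"])
print(msg.get_content_type())
print(body.strip())
```

Note that the envelope sender and the `From:` header are independent values, which is why both appear as separate filtering features later in the deck.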
Typical E-mail System Diagram • Incoming SMTP Gateway Farm • Firewall Internet • Mail Filtering • BL/GL/WL • Auto-learn • Bouncing server • Mail Spool • server • SMTPauth • Outgoing SMTP Gateway Farm
Overview of Spam Prevention • Q1: What is SPAM? • Why is there so much SPAM mail? • Where did the spammer get my e-mail address? • Q2: How can spam mail be countered? • Or, how does the e-mail system know whether an incoming mail is a spam message? • Q3: The big issues with filtering to date • "How effective is spam blocking?" • "How much legitimate e-mail is blocked as well?"
What Is Spam Mail? • Best description: "Unsolicited Bulk E-mail" • In human terms: bulk e-mail you didn't want, and didn't ask for • Mailing lists, newsletters, "latest offers": not spam, if you asked for them in the first place • UCE/UBE • UCE = Unsolicited Commercial E-mail • UBE = Unsolicited Bulk E-mail
Overview of Spam Protection (Individual Perceptions Differ) -- Truth Depends on Interpretation • MTA0 Filtering with H1(msg) Mail Spool Accept • MTA1 (or MUA1) Filtering With H2(msg) Discard • MTA = Mail Transfer Agent • MUA = Mail User Agent • MTA2(or MUA1)
Characteristics of Spam Mail • Uses blind carbon copy (header BCC for bulk mailing) to evade limits on bulk sending • Avoiding the header length limits • Uses frequently changing, forged sender addresses to evade mail filtering • Uses forged e-mail "postmarks" to evade tracing (faked Received lines for preventing/confusing the spam source tracing) • Syntax OK, semantics wrong • Spam mail sometimes carries viruses
Where did the spammer get my e-mail address? • How UCE/UBE spreads • Address harvesting • WWW home pages, USENET news articles • account password files on individual servers • Other improper channels (program bugs, membership recruitment campaigns, ...) • Looking for loosely managed mail relays • Domain zone scanning (DNS) • URL scanning (web pages) • Dictionary attack
Why Is There So Much Spam Mail? • Sources of SPAM • The most common form of spam is commercial spam, where the sender hopes to make a profit. • For example, a spammer abuses the un-metered nature of e-mail to send out millions of messages. • Because the incremental cost to the spammer of sending more e-mail is almost zero, he can still make a profit even with a success rate of 0.0001%.
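The profit argument can be checked with a little arithmetic. The sketch below is illustrative only: the campaign size, revenue per sale and cost per message are hypothetical numbers, not figures from the talk; only the 0.0001% success rate comes from the slide.

```python
def spam_profit(messages_sent, success_rate, revenue_per_sale, cost_per_message):
    """Return (expected_sales, profit) for one spam campaign."""
    expected_sales = messages_sent * success_rate
    profit = expected_sales * revenue_per_sale - messages_sent * cost_per_message
    return expected_sales, profit

# 0.0001 % expressed as a fraction is one in a million.
rate = 0.0001 / 100

# Hypothetical campaign: 10 million messages, $50 per sale, $0.00001 per message.
sales, profit = spam_profit(10_000_000, rate, 50.0, 0.00001)
print(round(sales, 6), round(profit, 2))
```

Under these assumed numbers the campaign yields roughly 10 sales and roughly $400 of profit, which is why raising the spammer's per-message cost is one of the levers discussed later.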
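Using a Botnet to send spam mails - From Wikipedia • computer hacking and botnet attacks by criminals, terrorists and foreign powers.
Introduction to Botnets • A botnet, commonly called a zombie network and also known as a robot network, is a set of infected hosts that, like zombies, are at the controller's command • Attackers remotely control the infected hosts through channels such as IRC (Internet Relay Chat) and can launch network attacks • These include crimes such as stealing private data, phishing, distributing spam, launching distributed denial-of-service (DDoS) attacks, and extorting compromised websites • Botnet malware typically connects over the IRC ports (6660-6669 or 7000, mainly 6667) every 5 to 10 seconds • Botnets replicate themselves and spread actively while staying hidden, so infected hosts are hard to notice • When variants of botnet malware are created, antivirus software has even more difficulty detecting them, and the damage is often severe
Part 2 Countering Spam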
Why Bother Filtering Spam? -- Economic view • On a global scale, bandwidth, CPU resources and people's time are wasted. • Spam seems to be about 30% to 60% of mail traffic, and increasing • The spam recipient's time is wasted as well, and receiving the spam may directly cost the user bandwidth fees from their ISP. • ISPs also dislike spam because it costs them time and money to deal with complaints and to recover overloaded mail servers that sometimes crash under the load of intensive spamming.
Why Bother Filtering Spam? - Social view • Nearly impossible to unsubscribe • "unsubscribe" addresses work only 37% of the time, according to the US FTC • Opt-in vs. opt-out • Legal retaliation (claims for damages) not possible in most parts of the world yet • Only possible in some regions (e.g., Korea, some states in US, etc.) • Taiwan (preparing to adopt opt-out) • A draft bill has been prepared and submitted for legislation (according to newspaper reports in recent years)
A Hybrid Model for Anti-spam and Related Issues: Economy + Education + Law + Technology • Authentication • Confirmation • Identification • Verification Economy Anti-SPAM Hash-money Education Retaliation (claims for damages) • Privacy protection Law Technology • BL, GL, WL • Machine learning • Opt-in, opt-out • Cyber forensic
How Do The Spammers Feel? • Spam relies on low overheads and extremely cheap delivery • Disrupt the equation and they will give up! • Already hurting, according to CBS: • “[I’ve gone through] unbelievable hardships [to keep spamming] ... My operating costs have gone up 1,000% this year, just so I can figure out how to get around all these filters”
Anti-Spam Paradigm • View 1: Prevent your sites from becoming SPAM sources • Sender authentication • Vulnerability scanning • Access Control (e.g., Firewall, etc.) • View 2: Prevent SPAM from entering the mailboxes of your site • Tasks: identify/detect incoming SPAM messages • Sender authentication • Whitelist, Blacklist, Greylist, • Auto-learning (personalization) • Sub-views: server-side vs. client-side filtering
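View 1: Prevent your site from becoming a spam accomplice (Common SPAM Sources) • Patterns commonly seen on TANet today • Compromised Hosts: • A host is broken into and a backdoor is planted, letting an attacker use it to send advertising mail (e.g., the host is hit by a virus). • Open Proxy: • The proxy server does not restrict access to a range of trusted user IPs or domain names, letting an attacker exploit weaknesses in the HTTP protocol to send advertising mail to the Internet. • Open Relay: • The mail server does not restrict which user IPs or domain names may send mail, letting an attacker use it to relay advertising mail to the Internet. • Other • User misuse: using the academic network to send advertising mail.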
Measures to Avoid Becoming SPAM Victims - Other Suggested Administration Policies • Basic Measures • Prevent or disable open relay (SMTP) and/or open proxy • Open proxy: tcp port 80, tcp 3128, etc. • Enable anti-virus capability • Sender Authentication • Enable SMTPauth (or something of the like) • Sender Policy Framework (SPF) • Enable advertised white-list from Sender (through DNS)
Subtype of Problem -Forged Sender • Good news: • SPF, SenderID, DomainKeys, etc. • Bad news • This might cost more money for authentication services in the future (e.g., anti-virus checking services) • It might not be effective to authenticate something like abc.123@yahoo.com or a13yotdfgfg@hotmail.com, etc. • i.e., there might still be too much overhead/loading for authenticating these possible faked sources (Cf. DNSSEC)
View 2: Prevent SPAM from entering your site's mailboxes (Anti-spam) • Client-side filtering • Thunderbird • Outlook Express • Server-side filtering • SpamAssassin (Machine Learning) • Dspam (statistical)
Candidate Features for Filtering • Envelope address (EnvFrom, EnvRcpt) • Malformed sender (or recipient) address • Relay (Helo, rDNS, IP address range, etc.) • SMTP Peak Connection Ratio (sendmail 8.13) • Header Address (HdrFrom, HdrRcpt, etc.) • Basic Features: From, To, Subject • Extended Features: X-Mailer, Message-Id, etc. • Using SPAMware (e.g., Dynamailer, Dmailer, etc.) • Body Content • Unwanted contents (Subject, Body lines, URL, etc.)
Fundamentals of Spam Prevention • Filtering Approaches • Server-side vs. user-side • Static vs. adaptive • Auto-learning • Filtering Mechanisms • Blacklist (totally denied) • Whitelist (totally accepted) • Greylist (partly denied) • Automatic learning (Auto-learning vs. Statistical Approach)
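Mail Filtering Mechanism Diagram -- A typical Anti-spam Model (figure) • Sender → Generic Mail Filtering → Mail Spool • (1) Whitelist • (2) Blacklist • (3) Greylist (fail temporarily) • (4) Auto-learning filter (update) • Outcomes shown: Pass (accept), Fail (reject), temporarily accept into the Quarantine area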
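One way to read the four numbered stages of this model (whitelist, blacklist, greylist, auto-learning) is as an ordered chain of checks. The sketch below is an interpretation, not the author's implementation: the `Verdict` names, the `spam_score` input and the 0.5 threshold are all assumptions introduced for illustration.

```python
from enum import Enum

class Verdict(Enum):
    ACCEPT = "accept"          # deliver to the mail spool
    REJECT = "reject"          # refuse the message
    TEMPFAIL = "tempfail"      # fail temporarily (greylist)
    QUARANTINE = "quarantine"  # hold for later review

def filter_mail(sender, whitelist, blacklist, greylist_known, spam_score, threshold=0.5):
    """Apply the four stages in order and return a Verdict.

    greylist_known: set of senders that have already been seen and retried.
    spam_score:     value in [0, 1] from a hypothetical learned classifier.
    """
    if sender in whitelist:            # (1) whitelist: pass immediately
        return Verdict.ACCEPT
    if sender in blacklist:            # (2) blacklist: reject outright
        return Verdict.REJECT
    if sender not in greylist_known:   # (3) greylist: unknown sender must retry
        return Verdict.TEMPFAIL
    if spam_score >= threshold:        # (4) auto-learning filter
        return Verdict.QUARANTINE
    return Verdict.ACCEPT
```

Putting the cheap list lookups first means the more expensive learned filter only runs on mail that the lists could not decide.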
Anti-SPAM Methodologies (using Artificial Intelligence Terms) • Rule-based Methodology • Milter-regex (server-side), Procmail (client-side) • Database-based Methodology • Blacklist (totally denied) • SpamCop, DNSBL, etc. • Whitelist (totally accepted) • Greylist (partly denied) • Razor-like or DCC-like approach (distributed) • Data Mining/Machine Learning Methodology • Thunderbird approach (personal) • SpamAssassin (server, personal)
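For the DNSBL entry in the list above, the usual convention is to reverse the octets of the connecting IPv4 address, append the blacklist zone, and look that name up in DNS; an answer means the address is listed. The sketch below only builds the query name and performs no network lookup. The zone `dnsbl.example.org` is a placeholder, not a real blacklist.

```python
import ipaddress

def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)          # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

# Placeholder zone name; substitute the blacklist actually in use.
print(dnsbl_query_name("202.53.72.110", "dnsbl.example.org"))
# prints: 110.72.53.202.dnsbl.example.org
```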
Rule-based Models for E-mail filtering (e.g., Sendmail, Postfix, etc.) • Pattern Matching (Regular Expression) Tool • E.g., Sendmail + Milter-regex • Whitelist vs. Blacklist • Reject vs. Discard • Behavior Heuristics • SMTP Peak Connection Ratio/Threshold (sendmail 8.13) • Greylist (two-phase adaptive blacklist) • E.g., Sendmail + milter-greylist • Machine Learning Tool • Sendmail + SpamAssassin
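The pattern-matching idea can be sketched in a few lines of Python. This is not milter-regex configuration syntax; the rule table and the patterns are hypothetical. "reject" stands for refusing the message with an SMTP error, "discard" for silently dropping it.

```python
import re

# Hypothetical rules: (header name, compiled pattern, action).
RULES = [
    ("Subject", re.compile(r"\bviagra\b", re.IGNORECASE), "reject"),
    ("X-Mailer", re.compile(r"dynamailer|dmailer", re.IGNORECASE), "discard"),
]

def apply_rules(headers, rules=RULES):
    """Return the action of the first matching rule, or 'accept'."""
    for name, pattern, action in rules:
        value = headers.get(name, "")
        if pattern.search(value):
            return action
    return "accept"

print(apply_rules({"Subject": "Meeting at 3pm", "X-Mailer": "Dynamailer 2.0"}))
```

First-match-wins ordering keeps the behavior predictable: a whitelist rule placed ahead of the blacklist rules would short-circuit them.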
Introduction to Greylist • In 2003, Evan Harris proposed a novel idea, which he called Greylisting. • It is based on the observation that many unsolicited bulk mails are sent via open proxies and other mechanisms that do not involve proper mail transfer agents. • The main ideas of greylisting are as follows: • Mail from unfamiliar senders is temporarily rejected and should be retransmitted by their ISPs' SMTP clients. • Mail from familiar senders is passed immediately. • You can whitelist friendly SMTP servers, and you should whitelist your own network; otherwise your SMTP clients will have real trouble sending e-mail. • Whitelisting localhost is also a must.
Rationale behind using Greylist (cont) • Greylisting works by assuming that, unlike a legitimate MTA, spam engines will not retry sending their junk mail after a temporary error. • RFC 2821 says that the sending MTA should retransmit 30 minutes or later after a failure, but spam sent through an open proxy, as well as some viruses and worms, is not retransmitted. • A proper sending MTA will repeat a transmission after a temporary 4yz rejection. • If spammers ever try to resend rejected messages, we can assume they will not stay idle between the two sends. • Odds are good that the spammer will send mail to a honeypot address and get blacklisted in a distributed blacklist before the second attempt.
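The greylisting rule can be sketched as a small lookup table keyed on the (sender IP, envelope sender, envelope recipient) triplet. This is a simplified illustration, not milter-greylist itself: the 1800-second delay is an assumed default chosen to match the 30-minute retry interval mentioned above, and 451 is used as one example of a temporary 4yz reply.

```python
class Greylist:
    """Minimal greylist keyed on (sender IP, envelope sender, envelope recipient)."""

    def __init__(self, delay=1800, whitelist=("127.0.0.1",)):
        self.delay = delay                # seconds a new triplet must wait (assumed)
        self.whitelist = set(whitelist)   # IPs that always pass (e.g. localhost)
        self.first_seen = {}              # triplet -> time of first attempt

    def check(self, ip, mail_from, rcpt_to, now):
        """Return an SMTP-style reply code: 250 (accept) or 451 (try again later)."""
        if ip in self.whitelist:
            return 250
        triplet = (ip, mail_from, rcpt_to)
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.delay:
            return 250                    # sender retried after the delay
        return 451                        # temporary rejection; a real MTA will retry

gl = Greylist()
print(gl.check("202.53.72.110", "a@example.com", "b@example.edu", now=0))
print(gl.check("202.53.72.110", "a@example.com", "b@example.edu", now=1800))
```

A real implementation would also expire old triplets and auto-whitelist senders that pass, as the sample tuples on the following slides suggest.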
Sample Knowledge Representation - Greylist for E-mail filtering
#
# greylisted tuples
#----------------------------------------------------------------------------------------
# Sender IP     Sender e-mail          Recipient e-mail               Time accepted
202.53.72.110   <vjxgtcfn@yahoo.com>   <lh15@mail.nctu.edu.tw>        1099474741 # 2004-11-03 17:39:01
202.53.72.110   <vjxgtcfn@yahoo.com>   <limintsai@mail.nctu.edu.tw>   1099474742 # 2004-11-03 17:39:02
202.53.72.110   <vjxgtcfn@yahoo.com>   <lindajmm@mail.nctu.edu.tw>    1099474743 # 2004-11-03 17:39:03
Sample Knowledge Representation - Greylist for E-mail filtering (cont.)
#
# Auto-whitelisted tuples
#========================================
# Sender IP     Sender e-mail            Recipient e-mail         Expire
210.58.229.92   <epaper@eslite.com.tw>   <ysm@cc.nctu.edu.tw>     1100042183 AUTO # 2004-11-10 07:16:23
210.58.229.92   <epaper@eslite.com.tw>   <yytzou@cc.nctu.edu.tw>  1100042354 AUTO # 2004-11-10 07:19:14
210.58.229.92   <epaper@eslite.com.tw>   <wsfu@cc.nctu.edu.tw>    1100041397 AUTO # 2004-11-10 07:03:17
• … more
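The tuple lines shown on these two slides are easy to parse when auditing a greylist database. The sketch below assumes the whitespace-separated layout shown (IP, sender, recipient, epoch seconds, optional AUTO flag, then a `#` comment) and assumes the human-readable comments are in local time at UTC+8; `parse_tuple` is a hypothetical helper, not part of any greylisting tool.

```python
from datetime import datetime, timedelta, timezone

TAIPEI = timezone(timedelta(hours=8))   # assumption: comments use local time, UTC+8

def parse_tuple(line):
    """Parse one 'IP <sender> <recipient> epoch [AUTO] # comment' line."""
    data = line.split("#", 1)[0].split()
    ip, sender, recipient, epoch = data[:4]
    return {
        "ip": ip,
        "sender": sender.strip("<>"),
        "recipient": recipient.strip("<>"),
        "time": datetime.fromtimestamp(int(epoch), tz=TAIPEI),
        "auto": "AUTO" in data[4:],      # True for auto-whitelisted entries
    }

entry = parse_tuple(
    "202.53.72.110 <vjxgtcfn@yahoo.com> <lh15@mail.nctu.edu.tw> 1099474741 # 2004-11-03 17:39:01"
)
print(entry["time"].strftime("%Y-%m-%d %H:%M:%S"))
# prints: 2004-11-03 17:39:01
```

Grouping parsed entries by sender IP makes patterns like the one on the first slide (one host, one forged sender, many recipients within seconds) easy to spot.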
Considerations, Caveats, and Differences • False positives (legitimate mail that is not let through) are possible if a legitimate SMTP client does not retransmit an embargoed message after the embargo expires. • Failing to retransmit clearly violates the SMTP standard, because it will result in lost mail should something other than Greylisting cause a temporary failure response. • False negatives (spam that is not blocked) are common. • Greylisting can only detect bogus SMTP clients. Bulk mail advertisers that use standards-compliant software pass automatically.
Machine Learning Approach for Anti-Spam (cf. SpamAssassin Concepts) • Combines many systems for a "broad-spectrum" approach • Lots of rules to determine if a mail is spam or not • Detect forged headers • Spam-tool signatures in headers • Text keyword scanner in the message body • DNS blacklists • Razor, DCC (Distributed Checksum Clearinghouse), Pyzor • These are combined to produce an overall score for each message • If over a user-defined threshold, the mail is judged as spam • Spammers cannot aim to defeat 1 system; the others will catch them out
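The score-and-threshold idea can be sketched as follows. The rule names, scores and the 5.0 threshold are invented for illustration and are not actual SpamAssassin rules or values; this sketch also treats a score that reaches the threshold as spam, which is an assumption about the boundary case.

```python
# Hypothetical rule names and scores, for illustration only.
RULE_SCORES = {
    "FORGED_HEADER": 2.5,
    "SPAMTOOL_SIGNATURE": 3.0,
    "BODY_KEYWORD": 1.5,
    "DNSBL_LISTED": 2.0,
}

def score_message(hits, rule_scores=RULE_SCORES):
    """Sum the scores of every rule that fired on a message."""
    return sum(rule_scores[name] for name in hits)

def is_spam(hits, threshold=5.0, rule_scores=RULE_SCORES):
    """Judge a message as spam when its total score reaches the threshold."""
    return score_message(hits, rule_scores) >= threshold

print(is_spam(["FORGED_HEADER", "SPAMTOOL_SIGNATURE"]))
print(is_spam(["BODY_KEYWORD"]))
```

Because no single rule exceeds the threshold on its own here, evading one check is not enough to get a message through, which mirrors the "broad-spectrum" point on the slide.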
SpamAssassin GA (flowchart): Start → Good Enough? → Yes: Final Scores; No: Evolve Scores, then check again
Machine Learning Approach for Anti-Spam (cf. SpamAssassin Example)
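Client-side spam prevention tool -- Example: Thunderbird (settings, 2/2) Junk mail control settings
Spam Protection (Anti-Spam): Server-side filtering (the Dspam system) • Dspam basic architecture • Dspam UI usage examples • Dspam performance • Tuning Dspam personalized settings • Viewing the spam quarantine • Displayed by spam probability (in three colors) • Charts (Spam vs. Good mail) • Misclassified mail can be released (i.e., legitimate e-mail) • Retraining (false positives, missed spam)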
Dspam As a Delivery Agent Proxy
In the diagram below, MTA refers to Mail Transfer Agent, or your mail server software: Postfix, Sendmail, Exim, etc. LDA refers to the Local Delivery Agent: Procmail, Maildrop, etc.
BEFORE: [MTA] ---> [LDA] ---> (User's Mailbox)
AFTER:  [MTA] ---> [DSPAM] ---> [LDA] ---> (User's Mailbox)
                      \
                       \--> [Quarantine]
        [End User] ------> [Web UI]
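Spam quarantine: displayed in three colors (low, medium, high) for different probability ranges
Part 3 Troubleshooting and Debugging
Troubleshooting & Debugging • Filtering mechanisms (devices) • Router, Firewall, IPS, Mail, etc. • Common verification/diagnosis mechanisms • System log inspection • Bounce analysis • Header verification • Domain name lookup • DNS + Whois • Special case -- a false block at a regional network center (router blocking a worm attack)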
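For header verification, a common first step is to list the Received lines of a suspicious message from the oldest hop to the newest, so each relay can then be checked with DNS and Whois. The sketch below uses Python's standard `email` package on a made-up message; the host names and the 192.0.2.x / 198.51.100.x addresses are documentation placeholders. Keep in mind that Received lines below the first trusted relay can be forged.

```python
from email import message_from_string

# Made-up message; addresses come from documentation ranges.
RAW = (
    "Received: from mx.example.net (mx.example.net [192.0.2.10])\n"
    "\tby mail.example.edu; Wed, 3 Nov 2004 17:39:05 +0800\n"
    "Received: from unknown (HELO yahoo.com) (198.51.100.7)\n"
    "\tby mx.example.net; Wed, 3 Nov 2004 17:39:01 +0800\n"
    "From: <someone@yahoo.com>\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

def received_chain(raw_message):
    """Return Received header values, oldest hop first, whitespace collapsed."""
    msg = message_from_string(raw_message)
    hops = msg.get_all("Received", [])   # newest hop is listed first in the message
    return [" ".join(hop.split()) for hop in reversed(hops)]

for hop in received_chain(RAW):
    print(hop)
```

In this made-up example the oldest hop claims `HELO yahoo.com` from an address with no matching reverse name, the kind of mismatch that header verification is meant to surface.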
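Part 4 Questions and Discussion, Case Study, Future Development of Spam Protection Mechanisms
Questions and Discussion • Spam detection stats (accuracy) • Setting a spam prevention strategy • Policy 1: define different administration policies (including personalization support) • Policy 2: separate incoming and outgoing relays • Policy 3: enable sender authentication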
Measures to avoid becoming SPAM victims -NCTU Administration Policies • Considering Users’ Views • Server-side filtering • mail.nctu.edu.tw • Client-side Filtering (i.e., no filtering at the server side) • faculty.nctu.edu.tw • Other Advanced Features • Personalization support (e.g., Dspam, etc.) • Client Tools • Server-side Support
Measures to avoid becoming SPAM victims -NCTU Administration Policies (cont.) • Separation of Incoming & Outgoing Relays • For easy separation (or identification) of spam source • Incoming Relay: Enable advertised blacklist (or semi-blacklist) from Sender (through DNS) • Reject direct SMTP relaying from 240.9.80.219.dynamic.tfn.net.tw • Outgoing Relay: Might accept relay from 240.9.80.219.dynamic.tfn.net.tw with SMTPauth ( or similar)
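The incoming/outgoing split can be sketched as a small policy function. This is an illustration under stated assumptions, not NCTU's actual configuration: the regular expression for "dynamic-looking" reverse-DNS names is a heuristic invented here, and the outgoing rule is simplified to "authenticated senders only".

```python
import re

# Heuristic (an assumption): reverse-DNS names that contain "dynamic", "dhcp",
# "dsl" or "ppp" as a label, such as 240.9.80.219.dynamic.tfn.net.tw.
DYNAMIC_RDNS = re.compile(r"(^|[.-])(dynamic|dhcp|dsl|ppp)([.-]|$)", re.IGNORECASE)

def looks_dynamic(rdns_name):
    """Return True when the reverse-DNS name looks like a dynamic address pool."""
    return bool(DYNAMIC_RDNS.search(rdns_name))

def relay_decision(role, rdns_name, authenticated=False):
    """role is 'incoming' or 'outgoing'; returns 'accept' or 'reject'."""
    if role == "incoming":
        # Incoming relay: refuse direct SMTP from dynamic address pools.
        return "reject" if looks_dynamic(rdns_name) else "accept"
    if role == "outgoing":
        # Outgoing relay (simplified): relay only for authenticated senders.
        return "accept" if authenticated else "reject"
    raise ValueError(f"unknown relay role: {role}")

host = "240.9.80.219.dynamic.tfn.net.tw"
print(relay_decision("incoming", host))
print(relay_decision("outgoing", host, authenticated=True))
```

Keeping the two roles on separate relays means any spam leaving the site must have passed through the authenticated path, which makes the responsible account easier to identify.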
Case: spam distribution by exploiting webmail users whose passwords are too simple