Status of dCache N2N and monitoring: Report I
Current situation
• xrootd4j is a part of dCache, implemented in such a way that each change requires a new dCache version.
• xrootd4j for dCache versions 1.9.12.15+ and for 2.2+ has different interfaces for authentication and authorization plugins.
• N2N
  • Two versions available. Tested and working.
  • dCache version 1.9.12.21 is needed if simultaneous N2N and authentication are required.
  • When given a PFN it is transparent. To use gLFNs there are two prerequisites:
    • The variable $LFC_HOST must be defined and point to the appropriate LFC.
    • An ATLAS certificate proxy must be available.
  • Documented here: https://twiki.cern.ch/twiki/bin/view/Atlas/AtlasXrootdSystems#Configurating_dCache_Xrootd_door
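The two prerequisites for gLFN lookups can be checked before starting a door. The following is only a minimal sketch: the function name is hypothetical, and it assumes the proxy sits at $X509_USER_PROXY or, failing that, at the usual /tmp/x509up_u<uid> default. It does not verify that the proxy is a valid ATLAS proxy, only that a file exists.

```python
import os


def n2n_prerequisites(env, proxy_exists=os.path.isfile, uid=None):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []

    # Prerequisite 1: LFC_HOST must be defined (and non-empty).
    if not env.get("LFC_HOST"):
        problems.append("LFC_HOST is not set")

    # Prerequisite 2: a proxy file must exist.
    # Assumption: $X509_USER_PROXY, else the conventional /tmp/x509up_u<uid>.
    uid = os.getuid() if uid is None else uid
    proxy = env.get("X509_USER_PROXY") or f"/tmp/x509up_u{uid}"
    if not proxy_exists(proxy):
        problems.append(f"no proxy file at {proxy}")

    return problems
```

Calling `n2n_prerequisites(os.environ)` in the environment the door runs in returns the list of missing items.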
Current situation
• Monitoring
  • A program queries the dCache billing DB once a minute, builds UDP packets in the xrootd format, and sends them to the collector.
  • Because the information collected from dCache and from xrootd overlaps very little, only two variables are sent: incoming and outgoing traffic.
  • Installed and working at MWT2 and AGLT2.
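The sending half of such a program might look roughly like this. It is a sketch under stated assumptions: the payload here is a placeholder JSON object, not the xrootd monitoring packet format that the real program emits, and the function names are hypothetical.

```python
import json
import socket


def build_packet(bytes_in, bytes_out, period_s=60):
    # Placeholder payload (assumption): the real program builds
    # xrootd-format packets, which are not reproduced here.
    payload = {"in": bytes_in, "out": bytes_out, "period_s": period_s}
    return json.dumps(payload).encode("utf-8")


def send_traffic(collector, bytes_in, bytes_out):
    """Send one UDP datagram with the two traffic counters.

    `collector` is a (host, port) tuple. Returns the number of bytes sent.
    """
    packet = build_packet(bytes_in, bytes_out)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(packet, collector)
```

UDP is fire-and-forget, so a lost datagram simply means one missing minute of traffic at the collector.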
Future plans I
• xrootd4j itself will become a plugin for dCache, enabling easier updates and independence from the underlying dCache version.
• This will require new versions of dCache (1.9.12.22, 2.2.xxx, 2.4).
Future plans II
• N2N
  • There will be just one version of N2N.
  • Already written and unofficially tested.
• Monitoring
  • We get access to most of the messages between the xrootd client and the xrootd4j dCache server.
  • Consequences of this approach:
    • No need for a billing DB and an extra process mining it.
    • Gives us a lot of freedom. And we'll need it.
    • Makes us responsible for its performance.
    • Need to completely understand the xrootd protocol.
    • Need to understand what part of it dCache supports.
    • Leaves us open to protocol changes.
    • Will take me some serious time to write it: a week or two.
Future plans III
• We should get, hopefully this week, a CERN-based copy of LFC that needs no authentication, to be used by European sites.
  • Expectation is that no changes in N2N will be needed.
• We should test WebDAV read-only access to LFC.
• Will need new releases of both N2N versions (C++ and Java).
Events based
• Package version: 1.0
• Connect event
  • conn id (generated by server)
  • host ip
  • [client version]
  • [protocol]
• File open event
  • conn id
  • filename
  • file id (generated by server)
  • size
  • mode (r/w)
• File close event
  • conn id
  • file id
  • bytes read/written
  • [read ops, vector read ops, write ops]
• Disconnect event
  • conn id
  • connection duration
Fields in [brackets] are optional.
Each event sends its info immediately.
The collector does:
  • DNS lookups (host, client)
  • joins info using connID, fileID
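The collector-side join on connID and fileID could be sketched as follows. The event layout (plain dicts with a "type" key) and all field names are assumptions made for illustration, and the DNS lookups are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Collector:
    connections: dict = field(default_factory=dict)  # conn id -> host ip
    open_files: dict = field(default_factory=dict)   # (conn id, file id) -> open event
    finished: list = field(default_factory=list)     # joined transfer records

    def handle(self, event):
        kind = event["type"]
        if kind == "connect":
            self.connections[event["conn_id"]] = event["host_ip"]
        elif kind == "open":
            self.open_files[(event["conn_id"], event["file_id"])] = event
        elif kind == "close":
            opened = self.open_files.pop((event["conn_id"], event["file_id"]), None)
            if opened is None:
                return  # close without a matching open, e.g. a lost open event
            # Join connect, open and close info into one record.
            self.finished.append({
                "host_ip": self.connections.get(event["conn_id"]),
                "filename": opened["filename"],
                "mode": opened["mode"],
                "bytes": event["bytes"],
            })
        elif kind == "disconnect":
            conn_id = event["conn_id"]
            self.connections.pop(conn_id, None)
            # Forget files that were never closed on this connection.
            for key in [k for k in self.open_files if k[0] == conn_id]:
                del self.open_files[key]
```

Keying open files by the (conn id, file id) pair keeps the join correct even if the server reuses file ids across connections.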
Timer-based events (~ once a minute)
• Package version: 1.0
• Server version: 1.9.2.xxx
• Server ip
• Total bytes read and written
• Current number of connections
• Number of connection attempts in last period
• Number of successful connections in last period
• List of fileIDs of all currently active files and bytes transferred to/from them
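One way to keep the counters behind such a periodic summary is sketched below. The class and field names are hypothetical; the only behaviour taken from the list above is which quantities are reported and that the attempt and success counts cover the last period only.

```python
class ServerStats:
    """Counters behind a once-a-minute summary (names are illustrative)."""

    def __init__(self, server_version, server_ip, packet_version="1.0"):
        self.packet_version = packet_version
        self.server_version = server_version
        self.server_ip = server_ip
        self.bytes_read = 0
        self.bytes_written = 0
        self.connections = 0
        self.attempts = 0       # reset after every summary
        self.successes = 0      # reset after every summary
        self.active_files = {}  # file id -> bytes transferred so far

    def connection_attempt(self, succeeded):
        self.attempts += 1
        if succeeded:
            self.successes += 1
            self.connections += 1

    def disconnect(self):
        self.connections -= 1

    def file_opened(self, file_id):
        self.active_files[file_id] = 0

    def transfer(self, file_id, nbytes, write=False):
        if write:
            self.bytes_written += nbytes
        else:
            self.bytes_read += nbytes
        self.active_files[file_id] = self.active_files.get(file_id, 0) + nbytes

    def file_closed(self, file_id):
        self.active_files.pop(file_id, None)

    def summary(self):
        """Build the periodic report and reset the per-period counters."""
        report = {
            "packet_version": self.packet_version,
            "server_version": self.server_version,
            "server_ip": self.server_ip,
            "bytes_read": self.bytes_read,
            "bytes_written": self.bytes_written,
            "connections": self.connections,
            "attempts": self.attempts,
            "successes": self.successes,
            "active_files": dict(self.active_files),
        }
        self.attempts = 0
        self.successes = 0
        return report
```

A timer would call `summary()` roughly once a minute and hand the result to whatever serializes and sends the packet.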
Overlap
select protocol, isnew, transfersize, connectiontime from billinginfo;

 protocol | isnew | transfersize | connectiontime
----------+-------+--------------+----------------
 GFtp-1.0 | t     |         2698 |              9
 GFtp-1.0 | f     |       104947 |             46
 DCap-3.0 | f     |   1770433189 |         228510
 DCap-3.0 | f     |      2000000 |            646
 DCap-3.0 | f     |   1234854377 |         141918
 DCap-3.0 | f     |   4327366566 |         500198
 DCap-3.0 | f     |   1781785249 |         171788
 GFtp-1.0 | t     |     82160365 |           6769
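Reducing rows like these to the two traffic totals can be sketched with an in-memory SQLite table standing in for the real billing DB. Two assumptions are made here: that isnew = t marks a file written into dCache (incoming traffic) and isnew = f a read (outgoing traffic), and that the time-window filter on datestamp, which a once-a-minute query would need, can be left out of the illustration.

```python
import sqlite3

# The eight sample rows from the query output above:
# (protocol, isnew, transfersize, connectiontime)
ROWS = [
    ("GFtp-1.0", True, 2698, 9),
    ("GFtp-1.0", False, 104947, 46),
    ("DCap-3.0", False, 1770433189, 228510),
    ("DCap-3.0", False, 2000000, 646),
    ("DCap-3.0", False, 1234854377, 141918),
    ("DCap-3.0", False, 4327366566, 500198),
    ("DCap-3.0", False, 1781785249, 171788),
    ("GFtp-1.0", True, 82160365, 6769),
]


def make_demo_db(rows=ROWS):
    """Create an in-memory stand-in for the billinginfo table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE billinginfo "
        "(protocol TEXT, isnew BOOLEAN, transfersize NUMERIC, connectiontime NUMERIC)"
    )
    conn.executemany("INSERT INTO billinginfo VALUES (?, ?, ?, ?)", rows)
    return conn


def traffic_totals(conn):
    """Sum transfersize separately for writes (in) and reads (out)."""
    row = conn.execute(
        "SELECT "
        "COALESCE(SUM(CASE WHEN isnew THEN transfersize ELSE 0 END), 0), "
        "COALESCE(SUM(CASE WHEN isnew THEN 0 ELSE transfersize END), 0) "
        "FROM billinginfo"
    ).fetchone()
    return {"in": row[0], "out": row[1]}
```

The COALESCE wrappers make an empty period report zero traffic rather than NULL.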
Details dCache 1
\d billinginfo

 Column         | Type                        | Example value | Note
----------------+-----------------------------+---------------+------
 datestamp      | timestamp without time zone | 2012-07-03 17:52:01.578 | useless
 cellname       | character varying           | msufs10_4@msufs10Domain | useless
 action         | character varying           | transfer | useless, always "transfer"
 transaction    | character varying           | pool:msufs10_4@msufs10Domain:1341352321578-1901596 | useless
 pnfsid         | character varying           | 00005D39999EFE644ABBB4E1DBBE1D4AA61F | useless
 fullsize       | numeric                     | 1208435588 |
 transfersize   | numeric                     | 5435588 |
 storageclass   | character varying           | usatlas:aglt2@osm | useless
 isnew          | boolean                     | f | read/write
 client         | character varying           | 192.41.236.158 | useless
 connectiontime | numeric                     | 17100 |
 errorcode      | numeric                     | 0 |
 errormessage   | character varying           | '' |
 protocol       | character varying           | DCap-3.0 |
 initiator      | character varying           | door:DCap-umfs03-<unknown>-2814749@dcap-umfs03Domain:1341352304449-4694621 | useless
Details dCache 2
\d doorinfo
Table "public.doorinfo"

 Column         | Type                        | Example value
----------------+-----------------------------+---------------
 datestamp      | timestamp without time zone | 2012-03-13 05:05:03.682
 cellname       | character varying           | DCap-msufs11-<unknown>-38548@dcap-msufs11Domain
 action         | character varying           | request
 owner          | character varying           | /DC=org/DC=doegrids/OU=People/CN=Edward Diehl 382381 or unknown
 mappeduid      | numeric                     | 834083 or -1
 mappedgid      | numeric                     | 834083 or -1
 client         | character varying           | unknown
 transaction    | character varying           | door:DCap-umfs01-<unknown>-65599@dcap-umfs01Domain:1331629506050-188335
 pnfsid         | character varying           | 000093FC61A0929488492871F651FC7E90E
 connectiontime | numeric                     | 2256
 queuedtime     | numeric                     | 0
 errorcode      | numeric                     | 0
 errormessage   | character varying           | ''
 path           | character varying           | /pnfs/aglt2.org/atlashotdisk/cond10_data/000021/lar/cond10_data.000021.lar.COND/cond10_data.000021.lar.COND._0008.pool.root