
IP geolocation Demystified

Understand the workings and limitations of IP geolocation technology. The presentation covers how IP geolocation providers use various publicly available data to identify the location of an IP address. It also covers the limitations of each data source and compares them.

Deep8

Presentation Transcript


  1. IP Geolocation Demystified: Understanding IP Geolocation Technology

  2. Let's start with: What is an IP Address? An Internet Protocol address (IP address) is a numerical label assigned to each device connected to a computer network. This numerical label is used to identify these devices, allowing for direct communication. The public internet operates on the same principles. When a device connects to the internet, it utilises a globally unique IP address to ensure both inbound and outbound communication is delivered correctly. In this context, the IP address acts in a similar way to a postal address used to deliver conventional mail. However, unlike a postal address, an IP address does not have an intrinsic location and does not expose any geographical properties. This is why you cannot determine the location of a device from its IP address alone.

  3. So, what is IP Geolocation? IP Geolocation is an essential technology that overcomes this limitation to help organisations identify the location of their customers based on their IP addresses. With IP geolocation technology, organisations such as online service operators, financial institutions, search engines, ad agencies and any business offering an online shopping/e-commerce experience are able to provide their customers with the best products and services available in their region. IP Geolocation is also crucial for preventing online fraud, managing digital rights, and serving targeted marketing material and pricing.

  4. But how accurate is IP Geolocation? If you wonder where your online customers are coming from, or wish to customise your clients' online experience based on their location, you are likely familiar with various commercial IP Geolocation services, ranging from free to highly priced to enterprise-only. Most of these providers declare superior accuracy, yet show little transparency about their methodology and present scarce evidence to support the accuracy they claim. In general, validating the accuracy of an IP Geolocation service is challenging and requires a large pool of ground-truth data (i.e. vast numbers of IP addresses from known locations). Such data would need to be collected from all active ISPs/ASs, sampled at random and spread over various geographical regions. In reality, such data is generally not available, in which case any claimed IP Geolocation accuracy without full transparency is questionable. For an in-depth understanding, check our blog post: How accurate can IP Geolocation get?

  5. WHAT IS THE ULTIMATE DATA SOURCE? For IP geolocation technology

  6. Let's understand how IP addresses are distributed. The IPv4 protocol uses 32-bit addresses, which limits the maximum theoretical address space to 4,294,967,296 (2^32) IP addresses. IPv6, the next-generation protocol, utilises 128-bit addresses, which makes the pool considerably larger, but still limited. Due to the global uniqueness requirement of IP addresses across both protocols, the global IP address space allocation is heavily regulated. Link: https://www.iana.org/ IANA: 'The Internet Assigned Numbers Authority is a function of ICANN, a nonprofit private American corporation that oversees global IP address allocation, autonomous system number allocation, root zone management in the Domain Name System, media types, and other Internet Protocol-related symbols and internet numbers.' (source: Wikipedia).
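The address-space figures above follow directly from the address widths. A minimal sketch, using only Python's standard `ipaddress` module (the variable names are illustrative, not from the presentation):

```python
import ipaddress

# Theoretical address-space sizes follow directly from the address width.
IPV4_BITS = 32
IPV6_BITS = 128

ipv4_space = 2 ** IPV4_BITS
ipv6_space = 2 ** IPV6_BITS

# Cross-check: the all-encompassing networks contain every possible address.
assert ipaddress.ip_network("0.0.0.0/0").num_addresses == ipv4_space
assert ipaddress.ip_network("::/0").num_addresses == ipv6_space

print(f"IPv4: {ipv4_space:,} addresses")
print(f"IPv6: {ipv6_space:.3e} addresses")
```

These are theoretical maxima only; reserved and private ranges reduce the publicly usable space.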

  7. More about IANA (Internet Assigned Numbers Authority). IANA is responsible for the allocation of large IP address space blocks to the Regional Internet Registries (RIRs): AFRINIC for the Africa region; APNIC for the Asia/Pacific region; ARIN for Canada, the USA, and some Caribbean islands; LACNIC for Latin America and some Caribbean islands; RIPE NCC for Europe, the Middle East, and Central Asia. RIRs, in turn, delegate a portion of their allocated address space to national and local internet registries (NIRs/LIRs), e.g. APNIC delegates to the Japan Network Information Center (JPNIC). All registries, both regional and local, allocate their remaining available address space to organisations seeking to utilise it on the public internet.

  8. Illustration of IP address distribution

  9. Let's talk about Autonomous Systems (AS). Business entities (or autonomous networks) that are assigned IP address space for their own use are called Autonomous Systems (AS). They must first register as an AS, receiving a globally unique Autonomous System Number (ASN) which can then be used to identify them. The Internet Service Provider (ISP) is the most typical example of an AS operator, but it is not the only one. Virtually any organisation seeking to use its own IP addresses on the internet qualifies as an AS. It is common for AS entities to use their allocated IP space in any manner they wish and, more importantly, in any geographical location they like. They can allocate it to any AS entity/network within the same enterprise regardless of global location, or even sublease it to a completely unrelated, geographically remote entity. Despite existing regulations, there is no way to restrict allocated IP address space geographically.

  10. The Ultimate Data Source? Therefore, the only ultimately accurate IP Geolocation data is that which is made available by AS operators, who are the only ones who confidently know how and where their IP addresses are utilised. AS operators, however, are not obliged to share their internal data with any other entity, except for law enforcement agencies within the relevant jurisdiction. Existing commercial IP Geolocation service providers do not have access to AS internal data. Some of these service providers claim they have integrated services with ISPs or receive data directly from ISPs. Considering there are more than 80,000 registered ASs, of which more than 60,000 are active at any one time (active ASNs ranked list), it is largely impractical to form commercial relationships with all of them. Receiving data from a small number of local ISPs may improve regional geolocation accuracy to a minor extent but is not sufficient on a global scale.

  11. WHERE DO IP GEOLOCATION SERVICE PROVIDERS GET THEIR DATA? Assuming that the existing IP Geolocation services do not have access to the Autonomous Systems' internal data, they cannot be confident regarding the actual geographical location of the routable IP addresses. So, where are they getting their geolocation data from?

  12. IP Geolocation Data Sources. (1) WhoIs data: the WhoIs database is populated by Regional and Local Internet Registry organisations (RIR/LIR) that are obligated to keep their registration records public. (2) BGP data: the Border Gateway Protocol (BGP) is a global internet address routing directory. (3) Field evidence: there are many additional data sources that can be utilised for IP geolocation which qualify as field evidence data, e.g. data received from a user using a GPS-enabled device. (4) Scientific data: data derived scientifically from calculations such as time-delay to distance conversions and others. (5) Reverse DNS: the method is based on DNS records (textual names of the public internet addresses).

  13. WhoIs Data: What is WhoIs Data? WhoIs is by far the most common source of geolocation data. The WhoIs database is populated by Regional and Local Internet Registry organisations (RIR/LIR) that are obligated to keep their registration records public. This information discloses all IP addresses registered to each entity, including independent organisations and ISPs. IP Geolocation service vendors can obtain this registry data using RIR websites and APIs, or can request bulk access to the data. Example site:

  14. WhoIs Data: What data is available? This data is usually updated on a daily basis and includes a set of registration data. This registration data contains the IP address block records and the organisations they are registered under. It may additionally contain a street address or the network location coordinates, although none of the geographical properties is mandatory. Furthermore, these records are maintained by the registered party and are not validated by any external body. This means the accuracy of the data is questionable even when it is made available. (Screenshot of example WhoIs data from ARIN's website.)

  15. WhoIs Data: How accurate is WhoIs Data? There are around 10 million records in the global WhoIs database for IPv4 alone, some of which can serve as a very accurate IP Geolocation source. Take, for example, a small internet cafe with a static IP address (or a small range of addresses) used on-premises and recorded in the RIR database together with its physical address. This scenario exposes accurate geolocation information with a precision down to a street address. In most cases, however, when an organisation reports incorrect or outdated information, or outsources the registered address blocks to another party, the records will not reveal where the IP addresses are actually used. Therefore, IP Geolocation based on the WhoIs database only is largely inaccurate as a whole.

  16. BGP Data: What is BGP Data? The Border Gateway Protocol (BGP) is a standardised exterior gateway protocol used to exchange routing information amongst active Autonomous Systems (AS) on the internet; in effect, it serves as a global internet address routing directory. BGP involves the announcement of preferred pathways and directions for internet address blocks (prefixes). For instance, suppose I need to send a packet to destination 'A', but I only know host 'C' and can forward traffic to it. The packet will still reach the desired destination 'A' if 'C' knows 'A' either directly or via other intermediate peers. In a nutshell, this is how global internet connectivity works. When an AS entity wishes to use an IP address range on the public internet, it has to 'announce' it to its closest peers. In simple words, it sends an announcement that means: "I'm responsible for that range (prefix), so whoever wishes to communicate with a device in that range, direct the communication through me". This announcement eventually propagates across all other peers worldwide to inform them how to send traffic to that IP address range if required.

  17. BGP Data: How is BGP Data used for IP geolocation? Now, how can this be helpful for IP Geolocation? Firstly, unlike the WhoIs data, which shows the organisation registered against a particular IP address block, BGP data can reveal who is actually using it. This is not always the same enterprise entity, as we discussed above. If, for example, we witness a block registered with ARIN for an American company with a US street address, but it is being announced by an AS registered with RIPE in Turkey, this suggests that the IP block is likely being used in Turkey, which improves geolocation. Secondly, BGP data can also reveal which addresses are not used at all (unannounced space), for which a geolocation process should not even be attempted.
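Both uses of BGP data described above, finding who originates a prefix and detecting unannounced space, can be sketched as a longest-prefix lookup. This is a minimal illustration, not any provider's method: the `ANNOUNCED` table, the `origin_of` helper, and the prefixes and ASNs (taken from ranges reserved for documentation) are all hypothetical stand-ins for a real routing-table snapshot.

```python
import ipaddress
from typing import Optional, Tuple

# Hypothetical snapshot of announced prefixes mapped to their origin ASN.
# Prefixes and ASNs come from ranges reserved for documentation.
ANNOUNCED = {
    ipaddress.ip_network("198.51.100.0/24"): 64496,
    ipaddress.ip_network("203.0.113.0/24"): 64497,
    ipaddress.ip_network("203.0.113.128/25"): 64498,
}


def origin_of(ip: str) -> Optional[Tuple[ipaddress.IPv4Network, int]]:
    """Return the most specific announced prefix covering `ip` and its
    origin ASN, or None if the address lies in unannounced space."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in ANNOUNCED if addr in net]
    if not matches:
        return None  # unannounced: geolocation should not be attempted
    best = max(matches, key=lambda net: net.prefixlen)
    return best, ANNOUNCED[best]


print(origin_of("203.0.113.200"))  # covered by both the /24 and the /25
print(origin_of("192.0.2.1"))      # not covered by any announced prefix
```

The origin ASN can then be compared with the organisation named in the WhoIs record; a mismatch is the kind of signal the Turkey example describes.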

  18. BGP Data: Coverage of an IP geolocation service? An IP address is not a physical object in a physical location. It is simply a numerical label that can be allocated to and withdrawn from individual devices or networks. There is no way we can geolocate a label that is not in use (allocated). Therefore, when your IP Geolocation service provider states it can geolocate 100% of the address space, please interpret this with caution, as it can only geolocate the announced (routable) space at most. The routable space for IPv4 can be monitored in the IPv4 Address Space Report. Some other usages of BGP data rely on the assumption that IP addresses belonging to the same prefix share geographical proximity. This, however, does not always hold true. Prefixes tend to aggregate along the way and may include a cluster of several smaller prefixes that originate from different regions.

  19. Field evidence data: What is Field Evidence Data? There are many additional data sources that can be utilised for IP geolocation which qualify as field evidence data. The best example is data received directly from users or submitted using GPS-enabled devices, such as mobile phones or tablets. This data can reveal the alleged geographical coordinates of a device using a public IP address and can serve as empirical evidence or ground-truth data for that particular IP address at that particular moment in time. Other sources include: eCommerce-originated data sources/feeds, such as the billing/shipping address of the customer when combined with the IP address used for the transaction; IoT devices with known locations and IP addresses, and device pools, either publicly available or proprietary, for example the RIPE Atlas project; and voluntarily or commercially obtained geolocation data feeds, such as self-published IP Geolocation data.

  20. Field evidence data: Limitations of Field Evidence Data. There are two important principles associated with field evidence IP Geolocation data. First, the data is always limited, as it is impractical for one entity to access all internet-connected devices around the world. Second, this method identifies an IP location at a specific point in time only, and is prone to errors. Not everything can be trusted as pure and reliable evidence. Device misconfiguration or faults, and network redirections such as VPNs or proxies along the way, are some of the many scenarios in which inaccurate data can enter the collection process.

  21. Scientific data: What is Scientific Data? Over the years, many attempts have been made to introduce an additional active measurement approach to IP Geolocation solutions. Most of these approaches come from research on time-delay to distance conversions, such as triangulation, down to the closest point of presence (POP) of network interfaces (routers). However, global network traffic interfaces (public routers) are complex, and the assumption that the time-delay between two consecutive interfaces is proportional to the physical distance between them is incorrect.
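To make the time-delay to distance idea concrete, here is a minimal sketch of the conversion. It assumes signals travel through optical fibre at roughly two thirds of the speed of light in vacuum (about 200 km per millisecond); the constant and the function name are illustrative assumptions. The result is only an upper bound: queuing, indirect routes and deliberate delays inflate the measured time, which is why delay is not proportional to distance in practice.

```python
# Assumption: signals in optical fibre travel at roughly two thirds of the
# speed of light in vacuum, i.e. about 200 km per millisecond.
FIBRE_KM_PER_MS = 200.0


def max_distance_km(rtt_ms: float) -> float:
    """Upper bound on the one-way distance implied by a round-trip time.

    The true distance can be far smaller, because routing detours and
    processing delays add time without adding geographic distance.
    """
    if rtt_ms < 0:
        raise ValueError("round-trip time cannot be negative")
    one_way_ms = rtt_ms / 2  # the packet travels there and back
    return one_way_ms * FIBRE_KM_PER_MS


# A 10 ms round trip means the target is at most about 1,000 km away.
print(max_distance_km(10.0))
```

Several such bounds measured from probes at known locations would define overlapping circles, which is the basis of the triangulation approaches mentioned above.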

  22. Scientific data: Limitations of Scientific Data. Some large ISPs hide their internal subnets. Therefore, many intermediate nodes are not publicly visible and cannot be accounted for. Practical network routing is based on the 'least cost' path, which differs from the common academic assumption of the shortest one. Due to Quality of Service (QoS) considerations, some network interfaces can also be programmed to artificially delay non-productive traffic. Therefore, the relation between time-delay and distance is inconsistent and cannot lay the foundation for overarching principles. To date, none of the methods based on time-delay triangulation theory has been introduced into service, and such a method is unlikely to emerge for global commercial implementation.

  23. Reverse DNS data: What is Reverse DNS data? The Domain Name System (DNS) is the phonebook of the internet. Usually, DNS is used to translate a domain name to an IP address, so that browsers can load internet resources. However, it can also work in reverse: you can query DNS for the domain name record attached to an IP address. This textual record associated with an IP address is not mandatory. It is hardly of any utility when the address is not involved in publishing internet services or consumable material. However, some ISPs may use this textual tagging opportunity to mark their IP addresses for some internal purposes. Some of the DNS entries can potentially be used to reveal geographical properties. For example, if the target address or the last router along the way is listed in DNS as the entry p1-0-0.sanjose1.br2.bbnplanet.net, it suggests that the IP address is likely located in San Jose, California. This method shows an add-on benefit for locating areas with interpretable DNS names. The only known commercially utilised scientific approach was introduced by Digital Envoy, Inc, protected by US patent (6,757,740) granted in 2004. The method is based on DNS records (textual names of the public internet addresses) and crawling (tracert) to the closest router in an attempt to identify the city and country of the host.

  24. Reverse DNS data: Limitations of Reverse DNS data? Unfortunately, the reverse DNS-based approach suffers from several limitations: 1. Many interfaces do not have an assigned DNS name; 2. The misnaming of an interface results in an incorrect location; 3. City names are often repeated across different countries or territories, e.g. a city named San Jose can be found in both Costa Rica and California, US; 4. The lack of universally accepted rules and naming regulations means records require manual processing, which is time-consuming and prone to errors.
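The reverse DNS approach, and the city-name ambiguity listed above, can be sketched as a token lookup over the labels of a hostname. This is a simplified illustration: the `CITY_TOKENS` table and the `location_hints` helper are hypothetical, and a real system would need a far larger, manually curated dictionary. The hostname itself could be obtained with a reverse lookup such as Python's `socket.gethostbyaddr`, which is omitted here so the example runs offline.

```python
import re
from typing import List

# Hypothetical lookup table from hostname tokens to candidate locations.
# One token may map to several places, which is the ambiguity problem.
CITY_TOKENS = {
    "sanjose": ["San Jose, California, US", "San Jose, Costa Rica"],
    "austin": ["Austin, Texas, US"],
}


def location_hints(hostname: str) -> List[str]:
    """Return candidate locations suggested by the labels of a reverse DNS name."""
    hints: List[str] = []
    for label in hostname.lower().split("."):
        token = re.sub(r"[^a-z]", "", label)  # "sanjose1" -> "sanjose"
        for candidate in CITY_TOKENS.get(token, []):
            if candidate not in hints:
                hints.append(candidate)
    return hints


# The example entry from the presentation yields two candidates, not one.
print(location_hints("p1-0-0.sanjose1.br2.bbnplanet.net"))
```

Choosing between the candidates requires another source, such as the country of the announcing AS, which is why reverse DNS alone rarely settles a location.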

  25. THE ART OF GUESSING. IP Geolocation service providers can obtain their data from multiple sources, although none can serve as an ultimate and indisputable source of truth. When the data is mutually supportive, i.e. multiple sources indicate the very same location for an IP address, it is a no-brainer. Often, however, the data received is contradictory, and this is where the tricky part lies. We frequently hear people say that IP Geolocation is part science, part art. Well, here is the art part. The art of guessing! Let's try to see what your average IP Geolocation service provider is dealing with.

  26. Challenges with IP Geolocation. 1. Imagine we've got field evidence, such as a user-submitted data sample, suggesting that the IP address X.X.X.5 was used today somewhere in Manhattan, in the centre of New York City, NY, US. 2. The WhoIs data for that address reveals that the block X.X.X.0 - X.X.X.255 (to which the above-mentioned address belongs) is registered to a business 'Y' located in Ontario, California, US. 3. The BGP data suggests that the address has been announced by an AS entity 'D', registered as operating from Austin, Texas, US, with a prefix size of /22 (1,024 addresses). So, where is the actual location? Can one say that the X.X.X.0 - X.X.X.255 block is located in NY? Or maybe even the entire /22 prefix is in NY too? Maybe X.X.X.5 is the only one in NY and the others are not even close? Or maybe the sample data we've got is wrong and the actual location for all of them is in Ontario, California, or even Texas? The final conclusion depends on which data source can be trusted the most. Considering there are limited tools to prioritise data sources, the existing IP Geolocation service providers often end up guessing. Their motto: Any guess is a good guess!
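The scenario above can be written down as data to show why no simple rule resolves it. In this sketch the concrete addresses are hypothetical stand-ins for the X.X.X.0 - X.X.X.255 block (a documentation range is used), and `consensus` is an illustrative helper, not a provider's algorithm.

```python
import ipaddress

# Hypothetical stand-ins for the presentation's X.X.X.0 - X.X.X.255 block,
# the covering /22 prefix, and the address X.X.X.5.
block = ipaddress.ip_network("198.51.100.0/24")
prefix = ipaddress.ip_network("198.51.100.0/22")
address = ipaddress.ip_address("198.51.100.5")

# One location claim per data source, as in the scenario.
claims = {
    "field evidence": "New York, NY, US",
    "WhoIs": "Ontario, California, US",
    "BGP": "Austin, Texas, US",
}


def consensus(source_claims):
    """Return the location if every source agrees, otherwise None."""
    locations = set(source_claims.values())
    return locations.pop() if len(locations) == 1 else None


print(address in block, block.subnet_of(prefix))   # the address nests in both
print(block.num_addresses, prefix.num_addresses)   # 256 and 1024 addresses
print(consensus(claims))                           # None: the sources disagree
```

When `consensus` returns None, something has to rank the sources against each other, and that ranking is where the guessing begins.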

  27. Further analysis: Challenges with data sources. If we happen to obtain more evidence data points from nearby address entries, it would likely improve our confidence, but only if the data supports one of the leading guess options. However, if the data is contradictory, it can make geolocation estimation extremely challenging. What if we have further evidence from address X.X.X.128 placing it in Toronto, Canada, dated just a couple of days before? Has this address moved from Canada to the US recently, or has just a part of the block moved, or are we facing an error somewhere? This is another complex issue: data granularity. IP addresses are usually deployed in blocks. Larger blocks are better for global routing. If blocks are too small, the world's routing table expands substantially and routers can eventually face memory overflow errors. Therefore, IP Geolocation services can logically assume that some consecutive sequences of IP addresses are likely to share reasonable geographical proximity.
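The routing-table argument can be illustrated with the standard library: adjacent small blocks collapse into one larger prefix, so a router needs a single entry instead of four. The private 10.0.0.0 addresses here are purely illustrative.

```python
import ipaddress

# Four adjacent /24 blocks (illustrative private addresses).
small_blocks = [ipaddress.ip_network(f"10.0.{i}.0/24") for i in range(4)]

# collapse_addresses merges adjacent networks into the fewest covering prefixes.
aggregated = list(ipaddress.collapse_addresses(small_blocks))

print(len(small_blocks), "entries before aggregation")
print(aggregated, "after aggregation")
```

The aggregate says nothing about where each /24 is used, so the four blocks could serve different regions while appearing as one prefix.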

  28. Further analysis: Complexity of reducing errors. However, defining the actual block IP address boundaries can be tricky and often involves a series of educated guesses, which may require intervention from human operators. For example, one can find similarities in the reverse DNS entries of the block's member addresses that possibly suggest the same network. Also, IP addresses can be tracerouted while looking for correlations between the host IP addresses that participate in the packet delivery. Whichever way is chosen, it is commonly prone to errors. DNS entries are not always available or can be wrong. Traceroute does not always reveal all the hosts in the delivery path, as some simply do not respond to ICMP requests. Perfect host correlation is not always possible, as network routers often use several IP addresses (one per interface) for the same router device. They may appear different on a traceroute but in reality are the same device, which may also lead to an error.

  29. SO HOW DO IP GEOLOCATION SERVICE PROVIDERS OPERATE? Let's understand how various IP geolocation service providers work.

  30. Comparison. Entry Level: The entry-level IP geolocation providers are likely to use fewer data sources, largely WhoIs data only, which limits their decision scope to far fewer options. This makes the process easier and maybe faster, but as a trade-off, it is much less accurate. Advanced Level: The more advanced IP Geolocation providers presumably work hard to organise and improve their results by delegating many of the final decisions to human personnel, in addition to some low-level automated processes. Unfortunately, manual work does not guarantee better results, as humans are also prone to errors, and it definitely makes the process slower. As a result, we often see commercial IP geolocation databases updated on a monthly basis only, or weekly at best.

  31. Conclusion. The IP address space is a very dynamic area. Millions of IP addresses change hands or are reallocated continuously, every hour. Therefore, monthly or weekly updates are certainly not suitable for most IP geolocation applications. In summary, none of the currently existing methods is sufficiently accurate. Even though a combination of methods allows for a more precise estimation of IP location, this does not solve the problem of accuracy on a global scale. Moreover, the lack of a fully automated and deterministic methodology prevents existing IP geolocation databases from being updated frequently enough to cope with the highly dynamic nature of the internet IP address space. To find out how BigDataCloud's IP Geolocation service differs from existing providers, check out our detailed blog post: The Next Generation IP Geolocation Service.

  32. Contact us. Reach out if you have any questions or clarifications. Email address: support@bigdatacloud.com Website: www.bigdatacloud.com

  33. For more content related to IP geolocation, visit our website. BigDataCloud Pty Ltd is a highly innovative start-up company founded in 2018 and operated internationally from our headquarters in Adelaide, South Australia. After years of previous experience in e-commerce, fraud protection and targeted international marketing, the BigDataCloud founders identified an immense lack of high-quality, fast and affordable APIs within this and other technical industries. For more info, visit: www.bigdatacloud.com
