
GIS Applications in Transportation & Logistics: Improving Geocoding Accuracy

Explore GIS applications in transportation & logistics, focusing on online geocoding methods and data matching strategies for improved accuracy. Learn about two-stage matching procedures and experimental results.



Presentation Transcript


  1. Some Recent GIS Applications in Transportation and Logistics • New York Metropolitan Transportation Council, November 19, 2008 • Fan Yang, City College of New York • fyang@ce.ccny.cuny.edu

  2. Outline • Introduction to GIS • Online Geocoding Methods • Web-based GIS solutions • GIS in Logistics

  3. Introduction to GIS • An information system for maintaining and using spatial information • A generic platform for working with geographic information • Diagram labels: Views, Products, Updates, Analysis, Mobile / LBS, Mission Critical Applications

  4. Online Geocoding Processing • Data cleansing from multiple data sources • Various errors might exist in the input data • Diagram labels: data sources (State DOT, Local Agencies, State Patrol, Radio); Traffic Information Service Provider (ISP); Clean Reference DB; stages (data collection, data cleansing, data disseminating); ‘dirty’ input data, ‘clean’ output data

  5. An Example: Online Incident Locator • Measure the similarity between an input record and the reference records to find the best match • Input: / Reference: (example records not preserved in the transcript)

  6. Data Matching • Various input data errors including spelling mistakes, truncations, inconsistent conventions and missing fields

  7. Two-Stage Matching • 1. Use an inexpensive metric to quickly find a relatively small candidate set. • 2. Identify the best matches for the input within the candidate set by similarity score. • Use an offline, pre-built similarity index to speed up online operations. • Three ways to build the similarity index in the first stage: • Build on the whole words in every important column • Build on every token in every important column (token based) • Build on every q-gram in every important column (q-gram based)

  8. Token and Q-gram Based Matching Methods • Token based: • Build a tree- or hash-based similarity index on all tokens in the important columns of all reference records. • Candidates must share at least one token with the input record in each of the columns “main_base” and “cross_base” (e.g., “Mountain Viw” / “Mountain View”). • Q-gram based: • Divide each token (word) into overlapping character groups (grams) of equal length q. E.g., for “Redlands” with q=3, the six q-grams are “Red”, “edl”, “dla”, “lan”, “and”, and “nds”. • Build a tree- or hash-based similarity index on all q-grams. • Candidates must share at least one q-gram with the input record in each of the columns “main_base” and “cross_base” (e.g., “Moutain Viw” / “Mountain View”).
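The first-stage index described on this slide can be sketched as follows. This is a minimal illustration, not the presenter's implementation: it uses a plain hash-based inverted index, the helper names (`tokens`, `qgrams`, `build_index`, `candidates`) and the sample records are hypothetical, and treating a word shorter than q as a single gram is an assumption of this sketch.

```python
from collections import defaultdict

COLUMNS = ("main_base", "cross_base")  # the two columns named on the slide

def tokens(text):
    """Token-based keys: the lower-cased words of a field."""
    return set(text.lower().split())

def qgrams(text, q=3):
    """Q-gram keys: every run of q consecutive characters in each word."""
    grams = set()
    for word in text.lower().split():
        if len(word) < q:
            grams.add(word)  # assumption: a short word counts as one gram
        for i in range(len(word) - q + 1):
            grams.add(word[i:i + q])
    return grams

def build_index(records, split):
    """Offline step: map (column, key) to the ids of records containing it."""
    index = defaultdict(set)
    for rec_id, rec in enumerate(records):
        for col in COLUMNS:
            for key in split(rec[col]):
                index[(col, key)].add(rec_id)
    return index

def candidates(index, query, split):
    """Online step: records sharing at least one key with the query in every column."""
    result = None
    for col in COLUMNS:
        hits = set()
        for key in split(query[col]):
            hits |= index.get((col, key), set())
        result = hits if result is None else result & hits
    return result

# Hypothetical reference table and 'dirty' inputs.
reference = [
    {"main_base": "Mountain View", "cross_base": "First"},
    {"main_base": "Redlands", "cross_base": "Main"},
]
token_index = build_index(reference, tokens)
qgram_index = build_index(reference, qgrams)

one_typo = {"main_base": "Mountain Viw", "cross_base": "First"}
two_typos = {"main_base": "Moutain Viw", "cross_base": "Frst"}
```

In this toy data the token index finds record 0 for `one_typo` but nothing for `two_typos`, because no whole word survives the misspellings, while the q-gram index still retrieves record 0.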

  9. Second Stage: Measuring Similarity Score • Edit distance ed(s1, s2): the minimum number of character edit operations (delete, insert, replace) required to transform s1 into s2, divided by the maximum length of s1 and s2. • IDF weight: the more frequent a token, the lower its weight. • The record similarity function (formula not preserved in the transcript) is based on the cost to transform record u into v, which is proportional to ed(u, v). • Example: aligning “Waterman” with “Wat__man” gives per-character costs 0 0 0 1 1 0 0 0, so ed(s1, s2) = 2/8.
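The normalized edit distance defined on this slide can be computed with the standard dynamic-programming (Levenshtein) recurrence. The sketch below covers only ed(s1, s2); the IDF weighting and the record-level similarity function are not reproduced because their formulas do not appear in the transcript. The function names are illustrative.

```python
def edit_distance(s1, s2):
    """Minimum number of single-character deletes, inserts and replaces."""
    prev = list(range(len(s2) + 1))  # distances from the empty prefix of s1
    for i, c1 in enumerate(s1, start=1):
        curr = [i]
        for j, c2 in enumerate(s2, start=1):
            curr.append(min(
                prev[j] + 1,               # delete c1
                curr[j - 1] + 1,           # insert c2
                prev[j - 1] + (c1 != c2),  # replace, or keep if equal
            ))
        prev = curr
    return prev[-1]

def normalized_ed(s1, s2):
    """Edit distance divided by the length of the longer string."""
    longest = max(len(s1), len(s2))
    return edit_distance(s1, s2) / longest if longest else 0.0

score = normalized_ed("Waterman", "Watman")  # the slide's example: 2/8
```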

  10. Experimental Results • The road network (reference table) is the processed TIGER database for the Los Angeles area. • Starting from 500 correctly geocoded incident records in downtown LA, we randomly generated “dirty” input data.

  11. Matching Accuracy

  12. Online Performance • With the pre-built index, only a small portion of the reference data is retrieved to match an input record, which significantly improves online performance.

  13. The Size of the Candidate Set • A smaller q value means finer granularity and may catch candidates that larger q values would miss. • The size of the candidate set increases as the q value becomes smaller.
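The effect of q on the candidate set can be seen on a toy example. The reference names and the input below are made up for illustration and are not the experimental data from the talk; as a simplification, each record is a single string rather than separate columns, and a word shorter than q is treated as one gram.

```python
def qgrams(text, q):
    """All q-character substrings of each word; short words are kept whole (assumption)."""
    grams = set()
    for word in text.lower().split():
        if len(word) < q:
            grams.add(word)
        for i in range(len(word) - q + 1):
            grams.add(word[i:i + q])
    return grams

def candidate_count(query, reference, q):
    """Number of reference names sharing at least one q-gram with the query."""
    wanted = qgrams(query, q)
    return sum(1 for name in reference if wanted & qgrams(name, q))

# Hypothetical reference names and a misspelled input.
reference = ["mountain view", "fountain valley", "main street",
             "inglewood", "redlands"]
sizes = {q: candidate_count("moutain viw", reference, q) for q in (2, 3, 4)}
# sizes == {2: 4, 3: 3, 4: 2}: the candidate set grows as q shrinks
```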

  14. Remarks • We proposed two efficient approximate matching methods for online incident data cleansing. • A two-stage matching procedure significantly improves online performance. • The q-gram based method outperforms the token based one in match accuracy; we suggest q=3. • This study can be applied to ITS online data management, such as loop detector data and construction data. More geographical information can be accommodated.

  15. Outline • Introduction to GIS • Online Geocoding Methods • Web-based GIS solutions • GIS in Logistics

  16. Why Web-based GIS Solutions • Consume fewer licenses and require a thinner client. • Provide rich spatial analysis and editing functionalities. • Satisfy Service-Oriented Architecture (SOA). • Provide SOAP (Simple Object Access Protocol), WMS (Web Map Service), and KML (Keyhole Markup Language) based services.
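As an illustration of what a WMS-based service looks like from a thin client, the sketch below builds a WMS 1.1.1 GetMap request URL. The endpoint `https://example.org/wms` and the layer name `roads` are hypothetical placeholders; a real server advertises its layers and supported formats in its GetCapabilities response.

```python
from urllib.parse import urlencode

def getmap_url(endpoint, layers, bbox, width=800, height=600):
    """Build a WMS 1.1.1 GetMap URL; bbox is (min_lon, min_lat, max_lon, max_lat)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",                # empty value requests the default style
        "SRS": "EPSG:4326",          # longitude/latitude in WGS 84
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    # Keep commas, slashes and colons readable instead of percent-encoding them.
    return endpoint + "?" + urlencode(params, safe=",/:")

# Hypothetical endpoint and layer; the box roughly covers New York City.
url = getmap_url("https://example.org/wms", ["roads"], (-74.3, 40.5, -73.7, 40.9))
```

The client needs nothing beyond an HTTP request and an image viewer, which is the "thinner client" point made on the slide.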

  17. NYSDMV Application: Accident Location Information System (ALIS) • Location Editing, Query and Reporting • Integration with NYSDMV and NYSDOT legacy systems • Multi-agency effort • Web application host: NYSOFT • Data management: NYSCSCIC • Application users: NYSDMV, NYSDOT, NYSCSCIC, GIS Data Co-op (local government agencies)

  18. ALIS: Web-based GIS Application • Automatically verifies the location information against the GIS basemap • Allows users to edit or update accident locations based on the availability of improved map data in a region or the availability of more information pertaining to the accident case. • Allows users to monitor and record changes made to the geospatial database. • Provides users the ability to select street segments for editing using either spatial queries, attribute queries, or network tracing.

  19. Outline • Introduction to GIS • Online Geocoding Methods • Web-based GIS solutions • GIS in Logistics

  20. What is “GIS Logistics”? • “Using advanced Geographic Information Systems (GIS) tools and methods in conjunction with existing infrastructure and procedures in order to solve logistics problems” • Main Applications: • Site Selection Analysis • Asset and Property Management • Territory Optimization • Real-time Dynamic Routing and Scheduling • Supply Chain Management

  21. Why use GIS Logistics? • Overwhelming planning task • Efficient routes are not guaranteed • Dependent on local knowledge

  22. What is Territory Optimization? • A periodic vehicle routing solution for a large territory • Distributes periodic orders among available trucks/drivers • Input: service requests, truck schedules, and business rules • Output: daily truck schedules • Large-scale problem size and complicated business rules • Goals: • Balance workloads among employees • Minimize total travel time (by all trucks over the entire planning period) • Minimize time window violations • Minimize overtime
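One way to compare candidate schedules against the four goals above is to fold them into a single weighted score. The sketch below is an assumption-laden illustration, not the tool's actual objective: the `Route` fields, the 480-minute shift limit, the equal weights, and the choice of "max minus min workload" as the balance measure are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Route:
    truck: str          # truck/driver identifier
    travel_min: float   # driving time on this day's route
    service_min: float  # on-site service time on this day's route
    late_min: float     # minutes of arrival after time windows closed

def schedule_terms(routes, shift_min=480.0, weights=(1.0, 1.0, 1.0, 1.0)):
    """Return the four goal terms and their weighted sum (lower is better)."""
    workload = {}
    for r in routes:  # workload per truck over the whole planning period
        workload[r.truck] = workload.get(r.truck, 0.0) + r.travel_min + r.service_min
    terms = {
        "imbalance": max(workload.values()) - min(workload.values()),
        "travel": sum(r.travel_min for r in routes),
        "tw_violation": sum(r.late_min for r in routes),
        "overtime": sum(max(0.0, r.travel_min + r.service_min - shift_min)
                        for r in routes),
    }
    terms["score"] = sum(w * terms[k] for w, k in
                         zip(weights, ("imbalance", "travel", "tw_violation", "overtime")))
    return terms

# Hypothetical two-day plan: truck A works both days, truck B one day.
plan = [Route("A", 120, 300, 0), Route("A", 200, 300, 10), Route("B", 100, 300, 0)]
terms = schedule_terms(plan)
```

An optimizer would search over assignments of requests to trucks and days to lower this score; the relative weights encode which business rule matters most.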

  23. Territory Optimization • Screenshot labels: Tool Bar, Map View, List View, Explorer Tree View, Gantt Chart View

  24. System Architecture • Diagram labels: Desktop App Client, Enterprise Systems, Web Service Bus, GIS Server, GIS Enterprise Database, Routing & Scheduling

  25. What is Real-time Dynamic Routing and Scheduling? • Customers call for periodic service requests • Used to determine optimal truck schedule candidates • Need to be served in a real-time fashion • Might change the existing daily schedule

  26. Real-time Dynamic Routing and Scheduling • Efficiently handles recurring service requests • Dynamically constructs the service tree structure • Considers combinations of feasible employees and date ranges • Re-sequences a daily route by solving a Vehicle Routing Problem with Time Windows (VRPTW)
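The re-sequencing step can be illustrated on a single small route. The sketch below tries every visiting order and keeps the feasible one with the least travel time; this exhaustive search is only a stand-in for a real VRPTW solver and is practical for a handful of stops at most. The travel matrix, time windows and service times are invented, and waiting is allowed when a truck arrives before a window opens.

```python
from itertools import permutations

def route_travel(order, travel, windows, service, depot=0):
    """Total travel time for visiting `order` from the depot, or None if a window is missed."""
    clock, here, total = 0.0, depot, 0.0
    for stop in order:
        total += travel[here][stop]
        clock += travel[here][stop]
        earliest, latest = windows[stop]
        if clock > latest:
            return None                            # arrived after the window closed
        clock = max(clock, earliest) + service[stop]  # wait if early, then serve
        here = stop
    return total + travel[here][depot]             # return to the depot

def resequence(stops, travel, windows, service):
    """Best feasible visiting order by exhaustive search (toy-sized routes only)."""
    best_order, best_cost = None, None
    for order in permutations(stops):
        cost = route_travel(order, travel, windows, service)
        if cost is not None and (best_cost is None or cost < best_cost):
            best_order, best_cost = order, cost
    return best_order, best_cost

# Hypothetical data: node 0 is the depot; stop 3 is a new request due by minute 20.
travel = [[0, 10, 20, 15],
          [10, 0, 12, 25],
          [20, 12, 0, 8],
          [15, 25, 8, 0]]
windows = {1: (0, 100), 2: (0, 100), 3: (0, 20)}
service = {1: 5, 2: 5, 3: 5}
order, cost = resequence([1, 2, 3], travel, windows, service)
```

Here the tight window on the new request forces it to be served first, so the existing route is reordered to 3, 2, 1 with 45 minutes of travel.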

  27. Thank you! Questions?
