1 / 70

Making the Sky Searchable: Automatically Organizing the World’s Astronomical Data

Explore the groundbreaking technology of Starfield Search to create a comprehensive astronomical database. Learn about robust matching techniques, inverted index usage, and geometric hashing for efficient data organization. Contact roweis@cs.toronto.edu for more information.

rbernal
Download Presentation

Making the Sky Searchable: Automatically Organizing the World’s Astronomical Data

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Making the Sky Searchable:Automatically Organizing the World’s Astronomical Data Sam Roweis, Dustin Lang & Keir Mierle University of Toronto David Hogg & Michael Blanton New York University roweis@cs.toronto.edu

  2. Organize All Astronomical Data • Vision: Take every astronomical image ever taken in the history of the world and unify them into a single accurately annotated and easily searchable database. • We want to include all modern professional telescope surveys plus all amateur photos, satellite images, historical plate archives… • How could this ever be possible? roweis@cs.toronto.edu

  3. Core Technology: Starfield Search • You show me a picture of the night sky. • I tell you where on the sky it came from. roweis@cs.toronto.edu

  4. ? Rules of the game • We start with a catalogue of stars in the sky, and from it build an index which is used to assist us in locating (‘solving’) new test images. roweis@cs.toronto.edu

  5. Rules of the game • We start with a catalogue of stars in the sky, and from it build an index which is used to assist us in locating (‘solving’) new test images. • We can spend as much time as we want building the index but solving should be fast. • Challenges:1) The sky is big.2) Both catalogues and pictures are noisy. roweis@cs.toronto.edu

  6. Distractors and Dropouts • Bad news:Query images may contain some extra stars that are not in your index catalogue, and some catalogue stars may be missing from the image. • These “distractors” & “dropouts” mean that naïve matching techniques will not work. roweis@cs.toronto.edu

  7. Find this “field” on this “sky”. Example (a million times easier) roweis@cs.toronto.edu

  8. Find this “field” on this “sky”. Example (a million times easier) roweis@cs.toronto.edu

  9. Robust Matching Required • We need to do some sortof robust matching of thetest image to any proposed location on the sky. • Intuitively, we need to ask:“Is there an alignment of the test image and the catalogue so that surprisingly many catalogue stars lie (almost*) exactly on top of an observed star?” [*Depends on the noise levels in the catalogue & image. ] roweis@cs.toronto.edu

  10. Search, then Robustly Verify • For each potential match our search algorithm finds, we do a robust verification test to see if we really have the correct alignment on the sky. • This test looks at the number of catalog-object matches and computes the log-odds of getting that many “hits” if the query image were dropped on a random patch of sky. roweis@cs.toronto.edu

  11. Solving the search problem • Now we know how to robustly check if a match is correct. But we still have to solve a huge search problem. • Which match locationsshould we try to verify? • Exhaustive search? ? too expensive! The Sky is Big TM roweis@cs.toronto.edu

  12. (Inverted) Index of Features • To solve this problem, we employ theclassic idea of an “inverted index”. • We define a set of “features” for any particular view of the sky (image). • Then we make an (inverted) index, tellingus which views on the sky exhibit certain (combinations of) feature values. • When we see a new test image,we compute which features arepresent, and use our inverted indexto look up which possible viewsfrom the catalogue also have those feature values. roweis@cs.toronto.edu

  13. Matching a test image • When we see a new test image, we compute which features are present, and use our inverted index to look up which possible views from the catalogue also have those feature values. • Each feature generates a candidate list in this way,and by intersecting the listswe can zero in on the truematching view. The features in our inverted index actas “hash codes” for locations on the sky. roweis@cs.toronto.edu

  14. Robust Features for Geometric Hashing • In simple search domains like text, the inverted index idea can be applied directly. • However, in our star matching task, the features we chose must be invariant to scale, rotation and translation. • They must also be robust to small positional noise. • Finally, there is the additional problem of distractor & dropout stars. The features we use are the relative positions of nearby quadruples of stars. roweis@cs.toronto.edu

  15. B C D A Continuous, vector-valued hash codes! Quads as Robust Features • We encode the relative positions of nearby quadruples of stars (ABCD) using a coordinate system defined by the most widely separated pair (AB). • Within this coordinate system, the positions of the remaining two stars form a 4-dimensional code for the shape of the quad. • Swapping AB or CD does not change the shape but it does “reflect” the code, so there is some degeneracy. roweis@cs.toronto.edu

  16. B C D A Continuous, vector-valued hash codes! Quads as Robust Features • This geometric hash code is invariant to scale, translationand rotation. It is continuous! • It also has the property that if stars are uniformly distributedin space, codes are uniformly distributed in 4D. • We compute codes for most nearby quadruples of stars, but not all; we require C&D to lie in the unit circle with diameter AB. roweis@cs.toronto.edu

  17. “Solving” a new test image • Identify objects (stars+galaxies) in the image bitmap and create a list of their 2D positions. • Cycle through all possible valid*quads (brightest first) and compute their corresponding codes. • Look up the codes in the code KD-tree to find matches within some tolerance; this stage incurs some false positive and false negative matches. • Each code match returns a candidate position & rotation on the sky. As soon as 2 quads agree on a candidate, we proceed to verify that candidate against all objects in the image. roweis@cs.toronto.edu

  18. “Solving” is Easy as 1-2-3 1) Get your image. 2) Upload it to us. 3) Your exact location + lots of other data! roweis@cs.toronto.edu

  19. 1.5 degrees

  20. Astronomy Picture of the Day ? roweis@cs.toronto.edu

  21. 0.5 degrees

  22. 7 arcminutes

  23. 7.5 degrees

  24. 68 degrees

  25. 1.4 degrees

  26. 6.1 degrees

  27. Preliminary Scaleup Results: SDSS • The Sloan Digital Sky Survey (SDSS) is an all-sky, multi-band survey which includes targeted spectroscopy of interesting objects. • The telescope is located at Apache Point Observatory. • Fields are 14x9arcmin corresponding to 2048x1361 pixels. roweis@cs.toronto.edu

  28. Preliminary Scaleup Results: SDSS • 336,554 fieldsscience grade+ • 0 false positives • 99.84% solved 530 unsolved • 99.27% solve w/ 60 brightest objs Assume known pixel scale(for speedup of solving only.) Magnitudes used only to decide search order. roweis@cs.toronto.edu

  29. Speed/Memory/Disk Indexing takes ~12 hours, uses ~ 2 GB of memory and ~100 GB of disk. Solving a test image almost always takes <<1sec (not includingobject detection). Results on all of SDSS All the work is in the hardest few% of fields roweis@cs.toronto.edu

  30. astrometry.net is open source! • We have released all our code.Download it from astrometry.net if you want to try the system out yourself. • We are putting the engine on the web.email alpha@astrometry.net if you want to be an alpha tester for the web service. • Our internal trac pages are public.Check out trac.astrometry.net if you want to see all the gory details. roweis@cs.toronto.edu

More Related