Moving Beyond “Open Sky” Accuracy Metrics

Introduction

When it comes to defining the accuracy of a GNSS technology, “open sky” measurements are the coin of the realm. This makes sense: it is the only scenario that is consistent enough across tests and technologies to compare the accuracy of different approaches to GNSS positioning. The problem is that, for the majority of consumer and mass-market use cases, open sky scenarios pose no major problems. A couple of meters of accuracy is good enough. The real challenge is urban and some suburban scenarios, where errors can reach tens or even hundreds of meters.

From a metrics perspective, there is no benchmark for comparing results under multipath conditions. Each geographic location has its own unique topology, which makes it tricky to capture the accuracy of GNSS positioning under multipath effects in a single statistic. Our solution at Zephr is to create a large variety of testing scenarios and make them part of our continuous integration process. In short, every time we merge code into the Zephr solver we run a benchmark test across ten different scenarios to determine the impact on positioning accuracy and resilience. This perpetual focus on performance has really helped us dial in the solver to be robust and accurate. To that end, we thought it would be interesting to share the results of this work.

The Benchmark Tests

First, it is useful to illustrate the ten tests we’ve set up. The goal is to have a variety of scenarios that reflect the use of mobile phones in real life. That means having static and kinematic tests across a variety of geographic location types: open sky, suburban and urban. In many of these tests we actively sought out multipath scenarios. We walked close to tall buildings, went through tunnels, found tight alleys and sought out dense tree cover. When we talked to companies that see massive amounts of positioning data, we asked them where their trouble spots were. We are somewhat limited in the variety of scenarios we can effectively collect in Colorado, but both Denver and Boulder gave us a reasonable variety of built environments to work with. This table provides an overview of the taxonomy for those tests:

It is also helpful to see what the built environment looked like for each of these tests by overlaying the traces on high resolution aerial imagery:

University of Colorado, Boulder, CO (Four Samsungs)
Davidson Mesa, Louisville, CO (Static Test)
Monarch High School, Louisville, CO (Five Pixels)
Monarch High School, Louisville, CO (Seven Samsungs)
Union Station, Denver, CO
University of Colorado, Boulder, CO (Five Pixels)
University of Colorado, Boulder, CO (Three Pixels)
Community Park, Louisville, CO
Downtown, Louisville, CO
LoHi, Denver, CO

The taxonomy and imagery illustrate a nice variety of test scenarios that help us ensure we are achieving robust performance across real-world conditions. This initial set of tests is focused on pedestrian use cases, where multipath can be most impactful since you don’t have the buffer of a road to help clear the line of sight to satellites.

The Metrics

Now that we have the benchmark tests laid out, we can dive into the metrics we’ve created to test our solver’s accuracy. At a high level we measure two different forms of error: 1) geodetic error and 2) total error. Geodetic error is the horizontal error, or (x,y), that corresponds to a position’s latitude and longitude. Total error also includes the vertical error, making it (x,y,z), or latitude, longitude and altitude. In some use cases horizontal accuracy is all you need to care about, and at other times having an accurate altitude is critical. So we work to break out all of our results against total error and geodetic error explicitly.
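To make the distinction concrete, here is a minimal sketch of how the two errors could be computed for a single position estimate against a ground-truth point. It is not Zephr's implementation: the function names and the sample coordinates are illustrative assumptions. It converts both points to Earth-centered coordinates using the standard WGS84 ellipsoid, then rotates the difference into a local east/north/up frame at the truth point.

```python
import math

# WGS84 ellipsoid constants.
WGS84_A = 6378137.0                    # semi-major axis, meters
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, alt_m):
    """Convert latitude/longitude (degrees) and altitude (m) to ECEF meters."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + alt_m) * math.cos(lat) * math.cos(lon)
    y = (n + alt_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + alt_m) * math.sin(lat)
    return x, y, z


def position_errors(estimate, truth):
    """Return (geodetic_error_m, total_error_m) for (lat, lon, alt) tuples.

    Hypothetical helper: the names are not taken from the Zephr code base.
    """
    ex, ey, ez = geodetic_to_ecef(*estimate)
    tx, ty, tz = geodetic_to_ecef(*truth)
    dx, dy, dz = ex - tx, ey - ty, ez - tz

    # Rotate the ECEF difference into east/north/up at the truth point.
    lat, lon = math.radians(truth[0]), math.radians(truth[1])
    east = -math.sin(lon) * dx + math.cos(lon) * dy
    north = (-math.sin(lat) * math.cos(lon) * dx
             - math.sin(lat) * math.sin(lon) * dy
             + math.cos(lat) * dz)
    up = (math.cos(lat) * math.cos(lon) * dx
          + math.cos(lat) * math.sin(lon) * dy
          + math.sin(lat) * dz)

    geodetic_error = math.hypot(east, north)                   # horizontal only
    total_error = math.sqrt(east ** 2 + north ** 2 + up ** 2)  # adds vertical
    return geodetic_error, total_error


# Illustrative coordinates only: the estimate is 10 m too high, nothing else.
truth = (39.9778, -105.1319, 1650.0)
estimate = (39.9778, -105.1319, 1660.0)
geodetic_error, total_error = position_errors(estimate, truth)
```

In this example the estimate differs from the truth only in altitude, so the geodetic error is effectively zero while the total error is about ten meters. That is exactly the kind of case a horizontal-only metric would hide.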

At the core of Zephr’s approach is a multi-receiver setup where we take measurements from multiple devices to create individualized error corrections for all devices on the network. The key metric that powers the accuracy of these corrections is the “relative distance” between the sampled devices in the network. The “relative distance” is just the measurement of distance between each pair of devices, irrespective of their absolute location in latitude, longitude and altitude. This principle works in a similar way to RTK-based positioning, and we cover it in more detail in this previous blog post. In short, the more accurate the relative distance metric in our multi-receiver solver, the more accurate the relative position of the receivers will be.
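As a rough illustration of why relative distance is independent of absolute location, the sketch below compares the pairwise distances between estimated device positions with the pairwise distances between their true positions. The function names, device labels and the use of local (east, north, up) coordinates in meters are assumptions made for this example, not details of the Zephr solver.

```python
import math
from itertools import combinations


def pairwise_distances(positions):
    """Map each device pair to the 3D distance between them, in meters.

    `positions` maps a device id to an (east, north, up) tuple in meters.
    """
    return {
        (a, b): math.dist(positions[a], positions[b])
        for a, b in combinations(sorted(positions), 2)
    }


def mean_relative_distance_error(estimated, truth):
    """Mean absolute difference between estimated and true pairwise distances.

    Both arguments must contain the same device ids.
    """
    est = pairwise_distances(estimated)
    ref = pairwise_distances(truth)
    errors = [abs(est[pair] - ref[pair]) for pair in ref]
    return sum(errors) / len(errors)


truth = {"a": (0.0, 0.0, 0.0), "b": (3.0, 4.0, 0.0), "c": (0.0, 10.0, 0.0)}

# Every device shares the same 5 m / -2 m / 1 m offset: absolute positions
# are wrong, but the geometry between the devices is preserved.
shifted = {k: (e + 5.0, n - 2.0, u + 1.0) for k, (e, n, u) in truth.items()}
shifted_error = mean_relative_distance_error(shifted, truth)
```

Here `shifted_error` comes out at essentially zero even though every device is several meters away from its true position. A common bias leaves the relative geometry untouched, which is why this metric is tracked separately from geodetic and total error.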

Last but not least, we also test the availability of the Zephr solver, meaning how often it successfully positioned the device. Also, since we have access to both the GPS solver and the Zephr solver, we can look at the initialized position to see whether we should fall back to the device’s native solver, which draws on a larger number of observations. This allows us to give the user the best possible position from the device, as well as see where we can improve our own solver as the model that powers it evolves.
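One way to picture the availability metric and the fallback is the sketch below. It assumes each epoch carries an uncertainty estimate from both solvers and that the solver position is used only when it exists and is more certain than the native one. The `Epoch` record, its field names and the comparison rule are hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Epoch:
    """One second of data. Both fields are assumed 1-sigma uncertainties (m)."""
    solver_sigma_m: Optional[float]  # None when the solver produced no fix
    native_sigma_m: float            # uncertainty reported by the phone chipset


def choose_source(epoch):
    """Use the solver only when it has a fix that beats the native position."""
    if (epoch.solver_sigma_m is not None
            and epoch.solver_sigma_m < epoch.native_sigma_m):
        return "solver"
    return "native"


def availability(epochs):
    """Fraction of epochs in which the solver position was used."""
    used = sum(1 for epoch in epochs if choose_source(epoch) == "solver")
    return used / len(epochs)


epochs = [
    Epoch(solver_sigma_m=1.0, native_sigma_m=3.0),   # solver wins
    Epoch(solver_sigma_m=None, native_sigma_m=3.0),  # no fix, fall back
    Epoch(solver_sigma_m=5.0, native_sigma_m=3.0),   # less certain, fall back
    Epoch(solver_sigma_m=2.0, native_sigma_m=4.0),   # solver wins
]
solver_availability = availability(epochs)
```

With these four made-up epochs the solver is used half of the time, so `solver_availability` is 0.5.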

The best way to see these is to take a look at some actual results.

Zephr Continuous Integration Testing Results

Since these testing results are generated programmatically, we have several new results reported in GitHub every day. First we’ll take a look at the accuracy metrics, which are all reported as mean (average) error over every epoch (second) in which we determine the position of a device in the test. For each test we get a set of summary statistics that report the mean error in meters for total error, geodetic error and RTK relative position error.

Aggregate Accuracy Results from Zephr’s CI Testing

Next we get a breakdown of these same statistics for each dataset, including a comparison to GPS, broken down by the mean error (average), the p90 error (the value that 90% of the results fall below) and the max error (worst error in the test). First we’ll show “geodetic error”, which is just the horizontal error in latitude and longitude positioning:

2D Geodetic Error for Zephr

Next we’ll share the same statistics for “total error” which includes error for altitude in addition to latitude and longitude.

3D Total Error for Zephr
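For readers who want to reproduce the mean, p90 and max figures on their own data, a minimal sketch follows. It assumes one error value in meters per epoch and uses the nearest-rank definition of the 90th percentile. The article does not say which percentile method the CI reports use, so that choice and the function name are assumptions.

```python
def summarize(errors_m):
    """Return mean, p90 and max for a non-empty list of per-epoch errors (m)."""
    ordered = sorted(errors_m)
    n = len(ordered)
    # Nearest-rank percentile: the smallest value with at least 90% of the
    # samples at or below it. (90 * n + 99) // 100 is ceil(0.9 * n) computed
    # in integer arithmetic to avoid floating-point rounding surprises.
    rank = (90 * n + 99) // 100
    return {
        "mean": sum(ordered) / n,
        "p90": ordered[rank - 1],
        "max": ordered[-1],
    }


# Ten made-up per-epoch errors, in meters.
stats = summarize([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
```

For this toy input the summary is a mean of 5.5 m, a p90 of 9.0 m and a max of 10.0 m. Reporting p90 and max alongside the mean matters under multipath, where a handful of very bad epochs can sit behind a reasonable-looking average.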

Lastly, we can look at the availability of Zephr’s multi-receiver solver, meaning how often it solves for position with a certainty greater than the native position provided by the chipset on the mobile phone.

Zephr Solution Availability

Conclusion

Overall, the robustness of Zephr’s approach is continuing to improve. Our goal is to get our CI benchmarks to below one meter for geodetic error and below two meters for total error. To hit this threshold we need to further improve our results in urban areas. In our testing we’ve found that satellite availability is a major driver of solid performance in urban areas. This can be seen in the last table, where availability degrades in the more urban test scenarios. Since we’ve only been using the GPS and Galileo constellations, we often find that a lack of satellite availability in urban canyons hurts performance. Adding the BeiDou, GLONASS, QZSS and NavIC constellations will put Zephr on par with Android’s native satellite availability and significantly improve urban performance. Another area of investment is improving our navigation filter through better sensor fusion. As one can see in the results, the current navigation filter degrades accuracy as often as it improves it. This will be an area of focus for the team in our next round of work.

Overall, we believe this improved level of accuracy will open up some exciting new use cases. In the near future we’ll be sharing demonstrations of what these new levels of accuracy can enable. Stay tuned!
