Machines and Mappers - How AI and tooling can assist with OSM Data Integrity.
Saturday, 1:45 PM – 20 min
Last year we shared how Facebook has been using machine learning algorithms to help human mappers edit and validate geometry faster. This year our focus is on our data integrity efforts. OSM is a vast dataset of uneven quality, and we have been working on ways to improve that quality at every stage of the process. With billions of tiles, it is impossible to manually check the map each time we update, so we have had to find automated ways to help.
The first step was building more helpful validation checks at the mapping level to raise the quality of incoming data and improve the user experience. Next, we used tools such as OSMCha, Osmose, Keep Right, and Overpass queries to validate the data already on the map and clean it up where necessary. Facebook users provide the next layer of data integrity by submitting reports to us when there is an issue with the map. With millions of people viewing the map, we get feedback from all corners of the world. We have found numerous issues through these reports and have contributed our fixes back to live OSM when relevant. Mobius then allows us to intelligently evaluate clusters of data changes so that we can be sure that our snapshot of OSM retains a high degree of data fidelity. Lastly, when we render the data, we draw upon blacklists/whitelists to automatically find vandalized names that our earlier checks may have missed.
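As an illustration of that last step, a name check against a blacklist and a whitelist might look roughly like the Python sketch below. It is not the pipeline described above: the term lists, the function names (`normalize`, `is_suspicious_name`, `flag_vandalized_names`), and the whole-token matching rule are all assumptions made for the example.

```python
import re
import unicodedata

# Hypothetical lists for illustration only; real lists would be far larger
# and maintained per language.
BLOCKED_TERMS = {"stupid", "hell"}
ALLOWED_NAMES = {"hell"}  # e.g. the village of Hell, Norway is a real place name


def normalize(name):
    """Fold case and Unicode compatibility forms, and collapse whitespace."""
    folded = unicodedata.normalize("NFKC", name).casefold()
    return " ".join(folded.split())


def is_suspicious_name(name, blocked=BLOCKED_TERMS, allowed=ALLOWED_NAMES):
    """Return True if the name contains a blocked token and is not whitelisted."""
    norm = normalize(name)
    if norm in allowed:
        return False
    # Match whole tokens only, so a blocked term inside a longer word
    # does not trigger a false positive.
    tokens = re.findall(r"\w+", norm)
    return any(token in blocked for token in tokens)


def flag_vandalized_names(features):
    """Return the features whose 'name' tag looks suspicious.

    Each feature is assumed to be a dict with a 'tags' dict, loosely
    mirroring how OSM elements carry key/value tags.
    """
    return [
        feature for feature in features
        if is_suspicious_name(feature.get("tags", {}).get("name", ""))
    ]
```

Checking the whitelist before the blacklist lets legitimate names that happen to contain a blocked term pass through, while everything else that matches is held back for human review rather than silently rewritten.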