Pokemon Go Turned 143 Million Players Into Unpaid AI Trainers

There is a Coco Robotics delivery bot somewhere in Los Angeles right now, navigating a city sidewalk with centimeter-level precision.

It knows exactly where it is, not because of GPS, but because of a dataset built over years by millions of people who thought they were playing a game.

Niantic just confirmed it. The Visual Positioning System powering those sidewalk robots was trained on over 30 billion images that Pokemon Go players submitted through the game.

Players got Poke Balls and XP. Niantic got one of the largest real-world visual datasets in AI history. The robots got to navigate Chicago without getting lost.

I’ve been watching how AI companies source training data for a while, and this story crystallizes something I think most people have not fully processed yet: the game was never the product. The scan data was.

This matters beyond Pokemon Go. The same model is playing out across dozens of apps right now, and knowing how to spot it is increasingly important if you care about what happens to your data.


What Niantic Was Building Inside Pokemon Go the Whole Time

Pokemon Go launched in 2016 and peaked at 230 million monthly active users. Most people remember it as an augmented reality game where you walked around catching digital creatures.

What was also happening, quietly, was that every time you pointed your phone at a statue, a building, or a landmark to complete a field research task, you were scanning the physical world and uploading it to Niantic’s servers.

The 2020 Field Research update made this particularly efficient. Players were asked to scan landmarks from multiple angles, different heights, different lighting conditions, different weather.

The in-game reward was trivial. The resulting 3D environmental models were not.

The Visual Positioning System Explained

The technology Niantic built from that data is called a Visual Positioning System, or VPS. It works by comparing what a camera is currently seeing against a detailed visual map of the environment.

Where GPS can be off by several meters, or disappear entirely in canyons of tall buildings, VPS can place you within a few centimeters.
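The core matching step can be sketched as a toy nearest-neighbor lookup: compare the features in the current camera frame against a prebuilt map of feature descriptors with known positions. Everything in this sketch is invented for illustration — the descriptors, coordinates, and three-landmark map are hypothetical, and a real VPS matches millions of learned features and solves for full camera pose rather than returning a stored 2D point.

```python
import math

# Hypothetical "visual map": feature descriptors (e.g. derived from scanned
# landmark images) paired with known world coordinates, in meters.
visual_map = [
    ([0.9, 0.1, 0.3], (12.40, 7.15)),   # statue, scanned from many angles
    ([0.2, 0.8, 0.5], (30.02, 1.88)),   # storefront
    ([0.4, 0.4, 0.9], (55.60, 22.31)),  # park fountain
]

def localize(query_descriptor):
    """Return the mapped position whose stored descriptor is closest
    to what the camera currently sees (nearest-neighbor matching)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    _, position = min(visual_map, key=lambda entry: dist(entry[0], query_descriptor))
    return position

# A camera frame whose features closely resemble the statue's descriptor
# resolves to the statue's known coordinates:
estimate = localize([0.88, 0.12, 0.28])
```

The richness of the map is what makes this work: the denser and more varied the scans of each landmark, the more reliably a new camera frame finds its match — which is exactly why multi-angle, multi-condition player scans were so valuable.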

According to a March 2026 report from MIT Technology Review, Niantic Spatial, which spun off as an independent company in May 2025, is now licensing this technology commercially.

Coco Robotics, a startup running roughly 1,000 suitcase-sized delivery bots across Los Angeles, Chicago, Miami, and Helsinki, is among its first partners.

Niantic’s CEO put the core insight plainly: “getting Pikachu to realistically run around and getting Coco’s robot to safely and accurately move through the world is the same problem.”

How 30 Billion Images Became a Robot Navigation System

The scale here is worth sitting with. 30 billion images. 143 million people contributing.

Years of ground-level visual data from cities across the world, captured at human height, from human perspectives, in every lighting condition imaginable.

No robotics company could have built that dataset on purpose. It would have taken thousands of paid workers walking every street in every city, many times over.

Niantic got it for the cost of digital badges and rare Pokemon encounters.

Why Ground-Level Data Is So Hard to Get Any Other Way

Satellite imagery is easy to acquire. What is genuinely hard is granular, street-level visual data that matches exactly what a delivery robot’s camera sees when it moves at sidewalk height through a real neighborhood.

This is what makes the Pokemon Go dataset rare.

Google Street View, one of the most comprehensive street-level visual datasets ever assembled, is still limited to roads where a car can drive.

Pokemon Go players went into pedestrian paths, parks, plazas, and alleys. They scanned things from every angle a person might approach from.

The result is coverage that no vehicle-based system could replicate.

| Data Source | Coverage | Perspective | Scale |
| --- | --- | --- | --- |
| Satellite imagery | Global, high resolution | Top-down only | Massive, but no ground context |
| Google Street View | Major roads only | Vehicle height, fixed route | Large, limited to drivable roads |
| Pokemon Go VPS dataset | Parks, alleys, plazas, pedestrian paths | Human height, multiple angles | 30 billion images, 143 million contributors |
| Paid field collection | Targeted locations only | Controlled, consistent | Limited entirely by budget |

The Play-to-Train Model and Why It Works So Well


What Niantic did with Pokemon Go is not unique. It is a data collection strategy that the AI industry has been running quietly for years, and it works precisely because it does not feel like data collection.

The pattern has a few consistent features:

  1. Give users an entertaining reason to interact with the physical or digital world in a specific way.
  2. Make the interaction feel rewarding through progress, in-game currency, or social features.
  3. Collect the resulting data as a commercial asset, usually through terms of service most users never fully read.
  4. Use or license that data for purposes well beyond the original app.

Niantic is not the first and will not be the last. Google’s reCAPTCHA system used millions of users to transcribe books and identify street signs while filtering bots.

Waze users report accidents and road conditions, building a real-time traffic intelligence layer. The pattern scales because users enjoy the interaction, and the data arrives at no additional cost to the company.

Understanding how AI agents process and act on this kind of real-world data is useful background for everything that follows.

The Economics of Unpaid Data Labor

Here is the part that tends to produce strong reactions. Players did not just give Niantic their time.

Many spent money inside the game, on in-app purchases, special event passes, and anniversary research tasks tied directly to the scanning features that built the dataset.

The value of that transaction was entirely asymmetric. Niantic got a proprietary spatial AI asset that is now being commercially licensed to robotics firms. Players got digital stickers.

Concrete comparison:

If a company hired contractors to walk every major city street and scan every landmark from multiple angles across multiple lighting conditions and seasons, the budget would run into tens of millions of dollars. Pokemon Go players did this voluntarily over several years, paid for the privilege through in-app purchases, and are receiving no share of the commercial value now being generated from that work.

This is also why I’d argue the shift toward AI doing work humans used to do is not only about automation displacing jobs.

It starts earlier, when the training data itself is extracted from human activity without fair compensation.

This Is Not Just a Pokemon Go Problem

Once you see the play-to-train model, you start seeing it everywhere. These are some of the clearer examples from the past few years:

  1. reCAPTCHA (Google): Clicking “select all traffic lights” trained computer vision models for autonomous vehicle research.
  2. Duolingo: Language learners help translate real-world documents while completing exercises, improving NLP training datasets.
  3. Waze and Google Maps: Real-time user-submitted reports feed traffic prediction models central to Alphabet’s advertising business.
  4. Snapchat filters: Facial mapping data from hundreds of millions of users contributed to the company’s spatial computing research.
  5. Voice assistants (broadly): Flagged recordings submitted for “review” trained speech recognition systems across Google, Amazon, and Apple.

The AI companies that will dominate the next decade are not necessarily the ones with the best models. Many of those models are already commoditizing.

The companies with the most comprehensive, hardest-to-replicate training datasets are the ones with a genuine long-term edge.

What the Terms of Service Say vs. What Players Understood

Niantic maintains that landmark scanning was entirely optional and that all submitted scans are anonymized and not connected to individual player accounts.

Both of those things may well be true. They do not address whether players understood that their optional scans would eventually be used to power commercial robot navigation systems operated by a separate company.

The legal question of whether “optional” and “anonymized” satisfies GDPR or CCPA requirements when data is repurposed for commercial AI is not settled.

Researchers studying re-identification risk in spatial datasets have flagged that visual data, even when anonymized, can sometimes be traced back to specific locations tied to specific individuals.

| What Niantic Claims | What Privacy Researchers Flag |
| --- | --- |
| Scanning was optional | Players had no meaningful notice of commercial licensing |
| Data is anonymized | Spatial data carries re-identification risks |
| Scans not linked to accounts | VPS maps reference fixed physical addresses |
| Data improves the game | Primary commercial beneficiary is now a separate B2B entity |

What This Means If You Use AI Apps Right Now

The Pokemon Go story has a relatively mild ending for users: the game was fun, no personal information was directly exposed, and Niantic’s intent was not malicious by any reasonable reading.

There are harder cases ahead.

AI companion apps, productivity tools, and voice assistants all collect behavioral data at scale. The gap between what the privacy policy technically permits and what users understand is often enormous.

These are the questions worth asking before you hand any AI app access to your camera, location, contacts, or conversation history:

  1. Does the privacy policy state explicitly whether your data is used for model training?
  2. Is there an opt-out for AI training use, separate from general data collection consent?
  3. If the company is acquired or spins off a subsidiary, does your data consent transfer to the new entity automatically?
  4. If the product is free or offers strong free-tier features, ask what the actual value exchange is.

The last question is the bluntest version of a rule I keep coming back to: if you cannot identify what makes a product economically viable to run, your behavior is probably what makes it viable.

From what I’ve seen, a handful of AI platforms stand out for being more explicit about this. Nomi AI documents in its terms that conversations are used to personalize your specific Nomi, not to train shared model weights.

Candy AI operates on a consent-forward model for anything beyond core personalization. Whether those policies hold as these companies grow is something to watch, but they start from a more transparent position than most.

For a direct comparison of how two leading AI companion apps handle memory and data, see Nomi vs Replika.

The broader picture is this: AI training data is not a side issue. It is the central strategic resource of the next decade. The most efficient way to acquire it is to make millions of people want to generate it for you.

Pokemon Go was one of the first examples of this at global scale. It will not be the last, and the next version will be harder to spot.

Takeaways

  • Niantic used 30 billion images from Pokemon Go players to train a Visual Positioning System now used by Coco delivery robots in five cities.
  • 143 million people contributed, most without understanding the data would be commercially licensed to a third party.
  • VPS achieves centimeter-level accuracy in urban environments where GPS drifts by meters or drops out entirely.
  • The play-to-train model is used by Google, Snapchat, Duolingo, and Waze, among others.
  • Check your AI apps’ privacy policies specifically for training opt-outs and consent transfer clauses before a company spinoff voids your original expectations.
