Where our data comes from — and how we verify it

10 Jun 2026 · 7 min read

Newsmap is only as good as what it ingests. We pull from three quite different kinds of source — public Telegram channels, RSS feeds, and news APIs — and each comes with its own strengths, blind spots, and failure modes. Knowing which is which helps you read the map critically.

Three kinds of source

Telegram channels. Public channels are often the fastest signal for events unfolding on the ground, particularly in conflict zones where traditional outlets are absent or delayed. The trade-off is reliability: channels vary enormously in accuracy, some carry an obvious agenda, and posts are frequently unverified first impressions rather than confirmed facts.
RSS feeds. Established news outlets and wire services publish structured feeds that tend to be more carefully edited and sourced. They are usually slower than social channels and skew toward events that meet a newsroom’s threshold for coverage, which means smaller incidents can be underrepresented.
News APIs. Aggregated reporting from many publishers gives breadth and helps cross-reference a story across outlets. The downside is duplication and homogenisation (the same wire story echoed by dozens of sites), which we try to collapse into a single event rather than count repeatedly.
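
To make the idea of collapsing echoed wire stories concrete, here is a minimal sketch. It assumes near-duplicates are detected by Jaccard similarity over three-word shingles. The function names, the shingle size, and the 0.8 threshold are illustrative assumptions, not a description of Newsmap's actual pipeline.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles of a text, ignoring case and punctuation."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def collapse_duplicates(texts: list, threshold: float = 0.8) -> list:
    """Group text indices; each text joins the first group whose
    representative (the group's first text) is similar enough."""
    groups = []           # lists of indices into `texts`
    representatives = []  # shingle set of each group's first text
    for i, text in enumerate(texts):
        current = shingles(text)
        for group, rep in zip(groups, representatives):
            if jaccard(current, rep) >= threshold:
                group.append(i)
                break
        else:
            # No existing group matched, so this text starts a new one.
            groups.append([i])
            representatives.append(current)
    return groups
```

Each resulting group would then be shown as one story with several outlets attached, rather than as several stories.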

From a post to a point on the map

Whatever the source, a report goes through the same steps. Non-English text is translated. The report is summarised into a short, neutral description. It is geolocated, either to a specific place where the text supports it or to a country-level approximation where it does not, and then classified against our event taxonomy. Finally, it is assigned veracity and severity signals.

Two of these steps are worth dwelling on because they are the most error-prone. Translation can flatten nuance, especially sarcasm, idiom, or deliberately coded language. Geolocation can be fooled by reports that mention several places, or that name a place only as the location of a press conference rather than the event itself. Where we are unsure of a location, we say so by lowering the confidence on that event, and we keep the least confident placements out of the default view.
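
The fallback from a specific place to a country-level approximation, and the way low-confidence placements stay out of the default view, can be sketched as follows. Everything here is hypothetical: the tiny gazetteer, the `source_country` hint, the confidence values, and the 0.5 cut-off are assumptions chosen for illustration, not Newsmap's real rules.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-gazetteer mapping place names to countries.
GAZETTEER = {"springfield": "Exampleland", "riverton": "Exampleland"}


@dataclass
class Placement:
    name: str
    precision: str     # "place" or "country"
    confidence: float  # assumed scale from 0.0 to 1.0


def mentioned_places(text: str) -> list:
    """Return known place names in the order they first appear in the text."""
    lowered = text.lower()
    hits = []
    for name in GAZETTEER:
        match = re.search(rf"\b{re.escape(name)}\b", lowered)
        if match:
            hits.append((match.start(), name))
    return [name for _, name in sorted(hits)]


def geolocate(text: str, source_country: str) -> Placement:
    """Place a report, lowering confidence when the text is ambiguous."""
    places = mentioned_places(text)
    if len(places) == 1:
        return Placement(places[0].title(), "place", 0.9)
    if len(places) > 1:
        # Several places named: keep the first mention but flag it as unsure.
        return Placement(places[0].title(), "place", 0.4)
    # No usable place name: fall back to a country-level approximation.
    return Placement(source_country, "country", 0.6)


def in_default_view(placement: Placement, min_confidence: float = 0.5) -> bool:
    """Only sufficiently confident placements appear in the default view."""
    return placement.confidence >= min_confidence
```

For a text such as "Minister in Riverton comments on Springfield blast", this sketch picks Riverton, which is exactly the press-conference mistake described above. Because two places were named, the placement carries low confidence and is kept out of the default view.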

Deduplication and clustering

A single real-world event often produces many reports across sources. Showing each as a separate dot would badly overstate how much is happening. We group closely related reports so that a story reads as one event with corroborating sources rather than ten identical incidents. This is also why the map can show a “surge” for a region: not because we counted the same story repeatedly, but because genuinely distinct reports clustered in time and place.
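
As a rough illustration of grouping reports by time and place, the sketch below uses a simple greedy rule: a report joins the first cluster whose earliest report is within 10 km and 6 hours of it. The `Report` fields, the thresholds, and the greedy strategy are all assumptions made for this example. A production system would likely also compare the text of the reports.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Report:
    source: str
    time: datetime
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def cluster_reports(reports, max_km=10.0, max_gap=timedelta(hours=6)):
    """Greedily group reports that are close in both time and place."""
    clusters = []
    for report in sorted(reports, key=lambda r: r.time):
        for cluster in clusters:
            anchor = cluster[0]  # earliest report in the cluster
            close_in_time = report.time - anchor.time <= max_gap
            close_in_space = haversine_km(anchor.lat, anchor.lon, report.lat, report.lon) <= max_km
            if close_in_time and close_in_space:
                cluster.append(report)
                break
        else:
            # Nothing nearby in time and space: treat it as a new event.
            clusters.append([report])
    return clusters
```

Each cluster would be drawn as one event, and the number of distinct `source` values inside it gives a rough count of corroborating sources.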

What we deliberately do not do

We do not present aggregated counts as ground truth, and we do not treat the absence of events as evidence that nothing happened. An empty region may simply mean that nothing was reported, or that reporting was suppressed. Coverage is uneven because the world's reporting is uneven, not only because of our pipeline: some regions are saturated with channels and outlets, while others are near-dark. Read density as a measure of reporting activity first, and of real-world activity only with that caveat in mind.

Verify before you cite

The single most important habit when using Newsmap is to follow the link. Every event keeps a path back to its original report so you can judge the source yourself, check the wording we summarised, and decide whether it meets your bar for citation. The map is a way to find and navigate events quickly; the original source is what you should rely on. For how we encode our own confidence in a report, continue to veracity and severity.