The Daily Berlin

Berlin news, every day

Berlin's Duplicate Image Problem: The Numbers Exposing a City Archive in Crisis

Tens of thousands of redundant digital files are clogging Berlin's public databases — and the bill for cleaning them up is climbing fast.

By Berlin News Desk · Published 4 July 2026, 8:43 pm

Berlin's Senate Department for Urban Development, Building and Housing confirmed this spring that more than 340,000 duplicate image files have been identified across the city's central planning and property documentation archive, a sprawling digital repository that feeds everything from building permit decisions to the public-facing Geoportal Berlin map platform. The figure, drawn from an internal audit completed in March 2026, puts a hard number on what IT administrators at the department (the Senatsverwaltung für Stadtentwicklung, Bauen und Wohnen) have described internally as a years-long accumulation problem.

The timing matters. Berlin is mid-way through a €2.1 billion digitalisation push under its Smart City Strategie Berlin, a programme running to 2030 that aims to consolidate dozens of disconnected civic data systems into a coherent citywide infrastructure. Redundant image data — scanned floor plans, aerial survey photographs, facade records — inflates storage costs, slows retrieval speeds, and, critically, creates version-control failures where planners access outdated files without realising it. With the housing shortage pressing the SPD-led Senate to accelerate building approvals, those failures carry real administrative weight.

Where the Duplication Is Concentrated

The worst accumulation sits in records tied to inner-city districts. Mitte and Friedrichshain-Kreuzberg together account for roughly 38 percent of flagged duplicate files, according to figures circulating among data management contractors working on the project. Both districts saw intensive building survey activity between 2018 and 2023, years that coincided with overlapping digitisation efforts by separate Senate departments that were never fully coordinated. The Liegenschaftsfonds Berlin, the state-owned property fund headquartered on Köthener Straße, maintains its own image repository that was mapped against the central archive for the first time during the March audit — producing a significant share of the newly counted duplicates.

The Berliner Stadtbibliothek's digital collections team ran a comparable deduplication exercise on its own photographic holdings in 2024, removing approximately 12,000 redundant image records from a collection of around 900,000 files. That project cost the library roughly €85,000 in contractor hours and internal staff time — a reference point that planning officials are now using to model the larger Senate operation. Scaled to 340,000 files, with the additional complexity of legal metadata and georeferencing data attached to each planning document, early estimates put the cleanup cost somewhere between €600,000 and €1.1 million, depending on the degree of human review required for ambiguous cases.
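The figures above allow a rough per-file comparison between the library project and the Senate estimate. The sketch below is illustrative only: the variable names are ours, and the choice of denominator is an assumption, since the library's cost can be divided either by records removed or by files examined.

```python
# Figures quoted in this article.
LIBRARY_COST_EUR = 85_000
LIBRARY_RECORDS_REMOVED = 12_000
LIBRARY_COLLECTION_SIZE = 900_000

SENATE_DUPLICATES = 340_000
SENATE_ESTIMATE_LOW_EUR = 600_000
SENATE_ESTIMATE_HIGH_EUR = 1_100_000

# Library unit cost under two possible denominators (assumption: the article
# does not say which one planning officials use in their model).
library_per_removed = LIBRARY_COST_EUR / LIBRARY_RECORDS_REMOVED
library_per_examined = LIBRARY_COST_EUR / LIBRARY_COLLECTION_SIZE

# Unit cost implied by the Senate's early estimate range.
senate_per_file_low = SENATE_ESTIMATE_LOW_EUR / SENATE_DUPLICATES
senate_per_file_high = SENATE_ESTIMATE_HIGH_EUR / SENATE_DUPLICATES

print(f"Library, per record removed:  EUR {library_per_removed:.2f}")
print(f"Library, per file examined:   EUR {library_per_examined:.2f}")
print(f"Senate estimate, per file:    EUR {senate_per_file_low:.2f} to {senate_per_file_high:.2f}")
```

On these inputs the Senate range works out to roughly €1.76 to €3.24 per duplicate file, which sits between the two library unit costs. That suggests the estimate is not a straight linear scaling of either one.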

What Deduplication Actually Costs — and What It Prevents

Storage alone is not the main expense. Raw server capacity at the city's data centre facility in Tempelhof runs at roughly €0.04 per gigabyte per month under the current public-sector framework contract. At that rate, 340,000 medium-resolution image files averaging perhaps 8 megabytes each come to about 2.7 terabytes, or a raw storage cost of around €109 per month. The real expense lies elsewhere: in the backup cycles, the bandwidth consumed during routine system replication, and the licensing fees for the document management software that indexes each file individually.
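The storage arithmetic is easy to check from the three inputs quoted above. The sketch below is illustrative only. It assumes decimal units (1 GB = 1,000 MB) and an exact 8 MB average, neither of which the audit specifies, and it covers raw capacity only, not backups, replication, or licensing.

```python
FILE_COUNT = 340_000        # duplicate files reported by the audit
AVG_FILE_MB = 8             # approximate average size quoted in the article
EUR_PER_GB_MONTH = 0.04     # framework contract rate quoted in the article
MB_PER_GB = 1000            # assumption: decimal units


def monthly_storage_cost_eur(file_count, avg_file_mb, eur_per_gb_month):
    """Raw monthly storage cost in euros for a set of equally sized files."""
    total_gb = file_count * avg_file_mb / MB_PER_GB
    return total_gb * eur_per_gb_month


total_gb = FILE_COUNT * AVG_FILE_MB / MB_PER_GB
cost = monthly_storage_cost_eur(FILE_COUNT, AVG_FILE_MB, EUR_PER_GB_MONTH)
print(f"{total_gb:,.0f} GB -> EUR {cost:,.2f} per month")
# prints: 2,720 GB -> EUR 108.80 per month
```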

The deeper risk is decisional. Berlin's building permit system processed 14,200 applications in 2025, according to the Senatsverwaltung's annual statistical report. Each application can touch dozens of archived property images. When duplicate files exist with conflicting metadata, such as different scan dates logged against the same building record, permit examiners can pull the wrong version. The March audit identified at least 4,700 property records where two or more image files with divergent timestamps existed for the same cadastral parcel, which means some of last year's permit caseload may have been processed against ambiguous visual documentation.
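The kind of check described here, finding parcels whose image files carry divergent scan dates, can be sketched in a few lines. This is a minimal illustration and not the audit's actual method. The `ImageRecord` type, its field names, and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ImageRecord:
    file_id: str
    parcel_id: str    # hypothetical cadastral parcel key
    scan_date: date   # scan date logged in the file's metadata


def parcels_with_conflicting_dates(records):
    """Map each parcel that has more than one distinct scan date to its sorted dates."""
    dates_by_parcel = defaultdict(set)
    for rec in records:
        dates_by_parcel[rec.parcel_id].add(rec.scan_date)
    return {
        parcel: sorted(dates)
        for parcel, dates in dates_by_parcel.items()
        if len(dates) > 1
    }


# Made-up sample: P-001 has two different scan dates, P-002 has two files
# that agree on the date.
sample = [
    ImageRecord("a.tif", "P-001", date(2019, 5, 2)),
    ImageRecord("b.tif", "P-001", date(2021, 11, 17)),
    ImageRecord("c.tif", "P-002", date(2020, 1, 8)),
    ImageRecord("d.tif", "P-002", date(2020, 1, 8)),
]
conflicts = parcels_with_conflicting_dates(sample)
print(conflicts)
```

Only `P-001` is flagged in this sample. Files that agree on the scan date, as with `P-002`, may still be duplicates, but they do not create the version ambiguity that worries permit examiners.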

The Senate's IT coordination unit is expected to publish a remediation tender through the Berlin procurement portal, DTVP, before the end of August 2026. Firms bidding on the contract will be required to demonstrate experience with georeferenced planning data, a specification designed to exclude general-purpose data cleaning vendors. For residents and developers waiting on planning decisions in Prenzlauer Berg, Neukölln, or anywhere else where the archive backlog bites hardest, the practical upshot is straightforward: a cleaner database means faster, more reliable decisions — and fewer cases where an examiner has to manually reconcile conflicting file records before a permit can move forward.

About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
