Berlin's Duplicate Image Problem: The Numbers Driving a City-Wide Digital Clean-Up
Thousands of redundant files are clogging Berlin's public sector databases — and the cost of ignoring them is rising fast.

Berlin's public administration is sitting on a growing mountain of duplicate digital images, and the numbers are finally forcing action. An internal audit completed in June 2026 by the Senatsverwaltung für Inneres und Digitales identified more than 340,000 redundant image files spread across departmental servers, costing the city an estimated €2.3 million annually in unnecessary storage infrastructure and maintenance contracts.
The timing matters. Berlin is mid-way through its Digital City Strategy 2030, a framework that commits the city to consolidating legacy IT systems across all twelve Bezirke by the end of next year. Duplicate image data — scanned identity documents, urban planning photographs, press archive material duplicated across BerlinOnline and the Senatskanzlei's own content systems — has emerged as one of the most stubborn obstacles to that consolidation. Every redundant file has to be verified before deletion, a process that is both labour-intensive and, when done poorly, legally risky under federal data protection rules.
The problem is concentrated in a handful of institutions. The Stadtentwicklungsamt Mitte, which handles planning documents for the area stretching from Alexanderplatz to the Regierungsviertel, holds roughly 47,000 image files flagged as probable duplicates. The Landesarchiv Berlin on Eichborndamm in Reinickendorf, which digitised large portions of its photographic holdings between 2018 and 2023, found during a spot-check last autumn that approximately one in five image files existed in at least two separate locations on its network. Bezirksamt Friedrichshain-Kreuzberg reported a similar ratio when it conducted its own review ahead of migrating to the city's shared cloud infrastructure, known as the Berlin Government Cloud, in March 2026.
Startup-sector observers have noted the irony. Berlin has positioned itself as Germany's leading tech hub, home to more than 4,200 registered technology companies according to the Berlin Partner für Wirtschaft und Technologie's 2025 annual report — yet core municipal IT infrastructure lags behind. The duplicate image issue is, in part, a legacy of years when individual departments purchased their own document management software without central co-ordination, resulting in incompatible systems that copied files rather than sharing them.
The Senatsverwaltung für Inneres und Digitales has awarded a framework contract, valued at up to €890,000 over 18 months, to a consortium that includes Berlin-based IT consultancy datafusion GmbH, for automated deduplication work beginning in September 2026. The process uses hash-matching algorithms to identify identical binary files regardless of filename or folder location, then flags near-duplicates for human review.
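The hash-matching step can be illustrated with a minimal sketch. The article does not say which algorithm or tooling the consortium uses, so everything here is an assumption for illustration: SHA-256 as the hash, a file-size pre-filter to avoid hashing files that cannot match, and the hypothetical function names `file_digest` and `find_exact_duplicates`. The sketch covers only byte-identical files; near-duplicates, which the article says go to human review, would need a different technique and are not handled.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group byte-identical files under root, ignoring names and folders."""
    # Files of different sizes cannot be identical, so bucket by size first.
    by_size: dict[int, list[Path]] = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Hash only the files that share a size with at least one other file.
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        for path in same_size:
            by_digest[file_digest(path)].append(path)

    # Keep only digests that occur more than once.
    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Calling `find_exact_duplicates(Path("/some/share"))` on a directory tree returns one list per set of identical files. Because the comparison is on content, a scan saved as `plan.jpg` in one folder and `copy_of_plan.jpg` in another lands in the same group.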
Early projections put potential storage savings at around 28 terabytes across the affected departments — modest by commercial standards, but significant when mapped onto the city's existing contracted storage costs of roughly €67 per terabyte per month under its data centre agreements. Over five years, eliminating those redundant files and preventing future duplication through stricter upload protocols is projected to save the city between €4.1 million and €5.6 million, according to figures presented to the Abgeordnetenhaus's digital affairs committee in May 2026.
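For a sense of scale, the direct storage charge implied by those two figures can be worked out as below. This is a back-of-envelope sketch only: it assumes the full 28 terabytes is billed at the quoted rate for the whole period, and the function name `direct_storage_cost_eur` is hypothetical. The committee's multi-million-euro projection is not broken down in the figures cited here, and it presumably also reflects items such as maintenance contracts and avoided future duplication, which this calculation leaves out.

```python
STORAGE_SAVED_TB = 28          # projected storage savings, from the article
COST_PER_TB_MONTH_EUR = 67     # contracted rate, from the article


def direct_storage_cost_eur(terabytes: int, rate_per_tb_month: int, months: int) -> int:
    """Storage charge alone: capacity x monthly rate x number of months."""
    return terabytes * rate_per_tb_month * months


monthly = direct_storage_cost_eur(STORAGE_SAVED_TB, COST_PER_TB_MONTH_EUR, 1)
annual = direct_storage_cost_eur(STORAGE_SAVED_TB, COST_PER_TB_MONTH_EUR, 12)
five_year = direct_storage_cost_eur(STORAGE_SAVED_TB, COST_PER_TB_MONTH_EUR, 60)

print(f"per month:  EUR {monthly:,}")     # EUR 1,876
print(f"per year:   EUR {annual:,}")      # EUR 22,512
print(f"five years: EUR {five_year:,}")   # EUR 112,560
```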
There is also a compliance dimension that carries its own financial weight. Under the EU's General Data Protection Regulation, duplicate personal-data images — particularly scanned documents containing names, addresses, or photographs of individuals — multiply the city's legal exposure in the event of a data breach. Each additional copy of a sensitive file is, in regulatory terms, a separate breach risk. Berlin's data protection commissioner, the Berliner Beauftragte für Datenschutz und Informationsfreiheit, has been pressing departments since 2024 to reduce unnecessary data proliferation.
For residents and businesses dealing with the city's planning and registration offices — many of whom already navigate queues at Bürgerämter across Mitte and Tempelhof-Schöneberg — the practical payoff may come gradually: faster file retrieval, fewer errors caused by staff working from outdated document versions, and eventually, a leaner digital infrastructure that supports rather than slows the city's ambitions. Departments are expected to complete their initial deduplication passes by the first quarter of 2027, with a city-wide compliance report due at the Abgeordnetenhaus before the summer recess that year.
About this article
Published by The Daily Berlin