Berlin's public institutions are sitting on millions of duplicate digital images — redundant files clogging servers, inflating storage costs and slowing down the digital services that residents use daily. That much, at least, is not in dispute. What to do about it is another matter entirely.
The problem has sharpened in 2026 as the CDU-led Senate pushes forward with its broader Verwaltungsdigitalisierung agenda, a multi-year program to modernise city administration. Infrastructure managers say the backlog of duplicated image files, accumulated over more than a decade of ad hoc digitisation drives, is now actively obstructing database consolidation across departments. No single city-wide figure for the total volume of duplicate data has been published, but IT procurement documents reviewed by The Daily Berlin show that at least three Senatsverwaltung departments issued separate storage-expansion tenders in the first half of 2026, a possible sign that redundancy, rather than genuine growth, is driving capacity demand.
Who Is Saying What
The debate has drawn in a wide range of voices. Archivists at the Zentral- und Landesbibliothek Berlin, which holds one of the largest publicly accessible digital image collections in the city, have been among the most vocal. Staff there have argued internally — and in a position paper circulated to the Senate Department for Science and Research in March 2026 — that automated deduplication tools must be paired with human review protocols, particularly for culturally significant photograph collections where near-identical images may carry distinct archival value. Two images that look identical to an algorithm may differ meaningfully in provenance or caption metadata, the position paper noted.
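The archivists' point about pairing automation with human review can be illustrated with a minimal sketch. It is not taken from the position paper: the `ImageRecord` fields, the `triage` function and the three outcome labels are hypothetical, and a byte-for-byte SHA-256 comparison stands in for whatever matching tool an institution might actually use.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical catalogue entry: raw file bytes plus archival metadata."""
    data: bytes
    provenance: str
    caption: str


def triage(a: ImageRecord, b: ImageRecord) -> str:
    """Decide what to do with a pair of images.

    Returns one of three illustrative labels:
    - "keep_both":    the files differ, so they are not exact duplicates
    - "auto_merge":   same bytes and same metadata, safe to deduplicate
    - "human_review": same bytes but different provenance or caption
    """
    same_bytes = hashlib.sha256(a.data).digest() == hashlib.sha256(b.data).digest()
    if not same_bytes:
        return "keep_both"
    if (a.provenance, a.caption) == (b.provenance, b.caption):
        return "auto_merge"
    # Identical to the algorithm, but the metadata disagrees:
    # leave the decision to an archivist.
    return "human_review"
```

Under these assumptions, only pairs that match in both content and metadata are merged automatically; anything the algorithm considers identical but the catalogue does not is queued for a person to decide.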
Technologists at the Fraunhofer Institute for Open Communication Systems, whose Berlin campus sits in the Charlottenburg district on Kaiserin-Augusta-Allee, take a more aggressive line. Researchers there have publicly advocated for perceptual hashing, a technique that identifies visually similar images even when file names or formats differ, as the cornerstone of any city-wide deduplication strategy. Their published work from early 2026 argues that institutions delaying automation are spending roughly 30 to 40 percent more on cloud storage than necessary, though they caution that figure varies significantly by institution type.
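For readers unfamiliar with perceptual hashing, the idea can be sketched in a few lines. The example below uses average hashing, one simple member of the family, and is not the Fraunhofer researchers' method. To stay dependency-free it assumes each image has already been decoded into a grid of grayscale values (0 to 255) at least 8 pixels on each side; the function names are illustrative.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixel values, 0-255, indexed [row][column]


def downsample(pixels: Grid, size: int = 8) -> Grid:
    """Shrink a grid to size x size by averaging rectangular blocks.

    Assumes the input is at least `size` pixels in each dimension.
    """
    height, width = len(pixels), len(pixels[0])
    small = []
    for row in range(size):
        top, bottom = row * height // size, (row + 1) * height // size
        out_row = []
        for col in range(size):
            left, right = col * width // size, (col + 1) * width // size
            block = [pixels[y][x]
                     for y in range(top, bottom)
                     for x in range(left, right)]
            out_row.append(sum(block) // len(block))
        small.append(out_row)
    return small


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: 1 where a cell is brighter than the mean."""
    flat = [value for row in downsample(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Two images are treated as likely duplicates when the Hamming distance between their fingerprints falls below a chosen threshold. A uniformly brightened copy of an image typically yields a distance of 0, while an unrelated image differs in many bits. Where to set the threshold is exactly the kind of policy question the article describes as unresolved.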
The Berlin startup scene has also weighed in. Several companies based in the Prenzlauer Berg and Mitte districts, including firms that have received Investitionsbank Berlin funding under the ProFIT innovation program, have pitched AI-assisted image management tools to city departments. Some of those pitches, according to procurement filings, have stalled because the Senate lacks a unified technical standard for what counts as a duplicate, a term that departments define differently depending on their workflows.
The Political and Practical Sticking Points
Inside the Rotes Rathaus, the political dimension has not gone unnoticed. City councillors on the Committee for Digitalisation have raised the issue twice in the current legislative session, most recently in June 2026, pressing the Senate for a consolidated policy. The Senate's response, delivered in writing, committed to a working group report by the fourth quarter of 2026 but stopped short of mandating specific tools or timelines for individual departments.
Data protection adds another layer. The Berlin Commissioner for Data Protection and Freedom of Information has previously flagged concerns about deploying third-party AI image-scanning tools on public-sector servers, particularly where images may contain identifiable individuals — a common feature of historical municipal photograph archives. Those concerns have made some department heads reluctant to move quickly, even where budgets allow.
Privacy advocates and archivists are not necessarily at odds. Both camps broadly agree that a clear city-wide framework — defining what qualifies as a duplicate, which tools may be used, and what human oversight is required — would unlock progress faster than any individual institution acting alone. The working group report due later this year will be the first real test of whether the Senate can bridge those competing demands. Until it lands, Berlin's servers will keep carrying the weight of the same image, stored twice, sometimes three times, and occasionally more.