The Daily Berlin

Berlin news, every day

Berlin's Digital Archives Are Full of Duplicate Images — and Officials, Experts and Administrators Can't Agree on How to Fix It

From the Stadtbibliothek to Senatsverwaltung servers, the city's image duplication crisis is drawing competing prescriptions from bureaucrats, technologists and archivists.

By Berlin News Desk · Published 4 July 2026, 8:47 pm

Photo: Max Kladitin on Pexels

Berlin's public institutions are sitting on millions of duplicate digital images — redundant files clogging servers, inflating storage costs and slowing down the digital services that residents use daily. That much, at least, is not in dispute. What to do about it is another matter entirely.

The problem has sharpened in 2026 as the SPD-led Senate pushes forward with its broader Verwaltungsdigitalisierung agenda, a multi-year programme to modernise city administration. Infrastructure managers say the backlog of duplicated image files, accumulated over more than a decade of ad hoc digitisation drives, is now actively obstructing database consolidation across departments. No single city-wide figure for the total volume of duplicate data has been published. But IT procurement documents reviewed by The Daily Berlin show that at least three Senatsverwaltung departments issued separate storage-expansion tenders in the first half of 2026, which suggests that redundancy, rather than genuine growth, may be driving capacity demand.

Who Is Saying What

The debate has drawn in a wide range of voices. Archivists at the Zentral- und Landesbibliothek Berlin, which holds one of the largest publicly accessible digital image collections in the city, have been among the most vocal. Staff there have argued internally — and in a position paper circulated to the Senate Department for Science and Research in March 2026 — that automated deduplication tools must be paired with human review protocols, particularly for culturally significant photograph collections where near-identical images may carry distinct archival value. Two images that look identical to an algorithm may differ meaningfully in provenance or caption metadata, the position paper noted.
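
The pairing of automation with human review that the archivists describe can be sketched as a simple triage rule. This is an illustrative sketch only, not the library's actual protocol: the `ImageRecord` fields, the `triage` function and the three outcome labels are hypothetical names chosen for the example, and it assumes each image already carries some visual fingerprint where equal values mean "looks the same to the algorithm".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical catalogue entry; field names are assumptions for this sketch."""
    file_id: str
    visual_hash: int   # any fingerprint where equal values mean "visually the same"
    provenance: str    # e.g. the donating collection or digitisation drive
    caption: str


def triage(a: ImageRecord, b: ImageRecord) -> str:
    """Decide what to do with a pair of records.

    Returns one of three hypothetical labels:
      "keep_both"    - the images do not look alike, so nothing to deduplicate
      "auto_merge"   - same appearance and same metadata, safe to merge automatically
      "human_review" - same appearance but different provenance or caption
    """
    if a.visual_hash != b.visual_hash:
        return "keep_both"
    if a.provenance == b.provenance and a.caption == b.caption:
        return "auto_merge"
    return "human_review"


# Two scans that look identical but come from different collections.
scan_a = ImageRecord("A-001", 0x0F0F, "Collection X", "Street scene")
scan_b = ImageRecord("B-417", 0x0F0F, "Collection Y", "Street scene")
exact_copy = ImageRecord("A-001-copy", 0x0F0F, "Collection X", "Street scene")
unrelated = ImageRecord("C-100", 0xF0F0, "Collection X", "Portrait")

print(triage(scan_a, scan_b))      # human_review
print(triage(scan_a, exact_copy))  # auto_merge
print(triage(scan_a, unrelated))   # keep_both
```

Under these assumptions, only pairs that match visually and in metadata are merged without a person looking at them; anything that differs in provenance or caption is routed to an archivist.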

Technologists at the Fraunhofer Institute for Open Communication Systems, whose Berlin campus sits on Kaiserin-Augusta-Allee in Charlottenburg, take a more aggressive line. Researchers there have publicly advocated for perceptual hashing (a technique that identifies visually similar images even when file names or formats differ) as the cornerstone of any city-wide deduplication strategy. Their published work from early 2026 argues that institutions delaying automation are spending roughly 30 to 40 percent more on cloud storage than necessary, though they caution that the figure varies significantly by institution type.
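
For readers unfamiliar with perceptual hashing, the idea can be illustrated with one of its simplest variants, the average hash. The sketch below is not the researchers' method, which the article's sources do not specify. It assumes images are already decoded into grayscale pixel grids whose dimensions divide evenly by 8, and the function names and the distance threshold of 5 bits are illustrative choices.

```python
from typing import List

Grid = List[List[int]]  # rows of grayscale pixel values, 0-255


def shrink(pixels: Grid, size: int = 8) -> Grid:
    """Downsample to size x size by averaging equal blocks.

    Assumes height and width are both multiples of `size`.
    """
    block_h = len(pixels) // size
    block_w = len(pixels[0]) // size
    out = []
    for by in range(size):
        row = []
        for bx in range(size):
            block = [pixels[y][x]
                     for y in range(by * block_h, (by + 1) * block_h)
                     for x in range(bx * block_w, (bx + 1) * block_w)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: one bit per cell, set if the cell is brighter than the mean."""
    cells = [v for row in shrink(pixels, size) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their fingerprints differ in few bits (threshold is an assumption)."""
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Synthetic test images: a 16x16 left-to-right gradient and three variants of it.
original = [[x * 16 for x in range(16)] for _ in range(16)]
upscaled = [[(x // 2) * 16 for x in range(32)] for _ in range(32)]  # same picture at 32x32
brighter = [[v + 10 for v in row] for row in original]              # uniformly brightened
inverted = [[255 - v for v in row] for row in original]             # a visually different image

print(is_near_duplicate(original, upscaled))  # True
print(is_near_duplicate(original, brighter))  # True
print(is_near_duplicate(original, inverted))  # False
```

Unlike a cryptographic checksum, which changes completely when a file is resized or re-saved, this fingerprint stays the same for the rescaled and brightened copies. Production systems typically use more robust variants and tune the threshold per collection.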

The Berlin startup scene has also weighed in. Several companies based in the Prenzlauer Berg and Mitte districts, including firms that have received Investitionsbank Berlin funding under the ProFIT innovation programme, have pitched AI-assisted image management tools to city departments. Some of those pitches, according to procurement filings, have stalled because the Senate lacks a unified technical standard for what counts as a duplicate, a term that departments define differently depending on their workflows.

The Political and Practical Sticking Points

Inside the Rotes Rathaus, the political dimension has not gone unnoticed. City councillors on the Committee for Digitalisation have raised the issue twice in the current legislative session, most recently in June 2026, pressing the Senate for a consolidated policy. The Senate's response, delivered in writing, committed to a working group report by the fourth quarter of 2026 but stopped short of mandating specific tools or timelines for individual departments.

Data protection adds another layer. The Berlin Commissioner for Data Protection and Freedom of Information has previously flagged concerns about deploying third-party AI image-scanning tools on public-sector servers, particularly where images may contain identifiable individuals — a common feature of historical municipal photograph archives. Those concerns have made some department heads reluctant to move quickly, even where budgets allow.

Privacy advocates and archivists are not necessarily at odds. Both camps broadly agree that a clear city-wide framework — defining what qualifies as a duplicate, which tools may be used, and what human oversight is required — would unlock progress faster than any individual institution acting alone. The working group report due later this year will be the first real test of whether the Senate can bridge those competing demands. Until it lands, Berlin's servers will keep carrying the weight of the same image, stored twice, sometimes three times, and occasionally more.

About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
