The Daily Berlin

Berlin news, every day

Berlin's Digital Archives Are Full of Duplicate Images. Officials and Experts Say the Fix Is Overdue.

From Mitte to Spandau, city agencies and cultural institutions are wrestling with bloated image databases — and the pressure to clean them up is mounting.

By Berlin News Desk · Published 4 July 2026, 9:06 pm

Photo: Wendelin Jacober on Pexels

Berlin's public sector is sitting on a sprawling, duplicated mess of digital imagery. City administrators, archivists, and technology specialists have spent the past several months pushing for a coordinated strategy to purge redundant image files from government databases — a problem that has grown quietly expensive and, some argue, legally risky under EU data protection rules.

The issue surfaced prominently this spring when the Senatsverwaltung für Stadtentwicklung, Bauen und Wohnen conducted an internal audit of its digital asset holdings. The review found thousands of duplicate image files — some photographs appearing dozens of times under different file names — clogging servers maintained across multiple departments. The audit, completed in April 2026, has since circulated among city technology offices and prompted urgent conversations about how Berlin's institutions manage visual data at scale.

This matters now for three reasons: storage costs money, duplicated records create legal exposure, and Berlin committed under its 2024 Digital Strategy to consolidating public-sector IT infrastructure by the end of 2027. Image deduplication, the technical process of identifying and removing exact or near-identical copies, is suddenly not just a housekeeping task but a contractual obligation for agencies receiving federal digitisation funding.

What Experts Are Saying

Specialists at the Zuse Institute Berlin, the research centre on Takustraße in Dahlem that handles complex data problems for scientific and public-sector clients, have pointed to the scale of the challenge across large municipal archives. Researchers there have noted that cultural institutions face a particular bind: without robust metadata standards, automated deduplication tools risk deleting images that are visually identical but legally distinct — different licensing agreements attached to the same photograph, for instance.
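The licensing bind can be illustrated with a minimal Python sketch. It assumes each image record already carries a content hash and a licence identifier; the `ImageRecord` class, its field names, and the sample values are hypothetical and not drawn from any Berlin institution's system. Groups of byte-identical files are merged automatically only when every copy shares one licence, and anything else is routed to a human.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    path: str
    content_hash: str  # e.g. a SHA-256 of the file bytes
    license_id: str    # hypothetical metadata field


def split_duplicate_groups(records):
    """Return (safe_to_merge, needs_review) lists of duplicate groups."""
    by_hash = defaultdict(list)
    for record in records:
        by_hash[record.content_hash].append(record)

    safe, review = [], []
    for group in by_hash.values():
        if len(group) < 2:
            continue  # a single copy is not a duplicate
        licenses = {r.license_id for r in group}
        # Same bytes but different licences: leave the decision to a curator.
        (safe if len(licenses) == 1 else review).append(group)
    return safe, review


# Made-up sample data for illustration only.
records = [
    ImageRecord("mitte/001.jpg", "abc", "CC-BY-4.0"),
    ImageRecord("press/001_copy.jpg", "abc", "CC-BY-4.0"),
    ImageRecord("archive/tower.jpg", "def", "CC0-1.0"),
    ImageRecord("agency/tower.jpg", "def", "rights-managed"),
    ImageRecord("unique.jpg", "ghi", "CC0-1.0"),
]
safe, review = split_duplicate_groups(records)
print(len(safe), "group(s) safe to merge;", len(review), "group(s) need review")
```

The sketch only works where licence metadata exists and is trustworthy, which is exactly the gap the researchers describe.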

The Stadtmuseum Berlin, which manages collections across sites including the Märkisches Museum near Köllnischer Park and the Ephraim-Palais in the Nikolaiviertel, has been running a pilot programme since January 2026 to test image-recognition software on roughly 80,000 digitised objects in its holdings. Staff there have described the process as painstaking: automated tools flag potential duplicates, but human curators must review each case before deletion. The pilot has a completion target of December 2026.

At Berliner Stadtreinigung, the municipal waste and infrastructure services company, technology staff reportedly reviewed image libraries used in public communications and fleet documentation and found file redundancy rates that complicated routine database queries — though the agency has not published specific figures from that review.

The Practical and Political Pressure

The BVG, Berlin's public transport operator, is also in the frame. The operator has been expanding its digital communications output substantially as part of a broader infrastructure investment push, and managing the image assets that accompany that output (route maps, construction updates, press photographs) has become a genuine operational concern. BVG's IT department has not announced a formal deduplication programme, but the topic has appeared on the agenda of inter-agency digital working groups convened under the Berlin Senate Chancellery.

The cost dimension is not trivial. Enterprise cloud storage prices have risen sharply across Europe since 2024, with standard object storage rates at major providers now running between €0.018 and €0.025 per gigabyte per month. For an institution holding tens of terabytes of image data — not unusual for a large archive — unnecessary duplication can translate into thousands of euros in avoidable annual expenditure.
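A rough worked example shows how those rates add up. The archive size (50 terabytes) and the share of redundant data (30 percent) are assumptions chosen for illustration, not figures from any audit; only the per-gigabyte price range comes from the rates quoted above.

```python
def annual_duplicate_cost_eur(total_tb: float, duplicate_fraction: float,
                              eur_per_gb_month: float) -> float:
    """Estimate yearly spend on redundant copies (decimal units: 1 TB = 1000 GB)."""
    duplicate_gb = total_tb * 1000 * duplicate_fraction
    return duplicate_gb * eur_per_gb_month * 12


# Hypothetical archive: 50 TB of images, 30% of it redundant.
low = annual_duplicate_cost_eur(50, 0.30, 0.018)
high = annual_duplicate_cost_eur(50, 0.30, 0.025)
print(f"Avoidable cost: EUR {low:,.0f} to EUR {high:,.0f} per year")
```

Under these assumptions the avoidable bill comes to roughly €3,240 to €4,500 a year. A smaller archive or a lower duplication rate would shrink the figure proportionally.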

Legal exposure adds another layer. Under the GDPR, images depicting identifiable individuals carry specific retention and deletion obligations. Duplicate files scattered across unsynchronised systems make it harder to honour deletion requests from data subjects, a point that Berlin's data protection commissioner, the Berliner Beauftragte für Datenschutz und Informationsfreiheit, has raised in general guidance published earlier this year.

For institutions looking to act, specialists recommend starting with a file-hash audit before deploying any image-recognition tool — a step that catches exact byte-for-byte duplicates quickly and cheaply. The harder problem, near-duplicate images taken seconds apart or processed differently, requires more sophisticated software and, crucially, human judgment. Berlin's cultural and public-sector bodies are discovering that the technology exists. The bottleneck, as usual, is time, staff, and the political will to prioritise something that is invisible until it becomes a crisis.
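The file-hash audit described above might look something like the following Python sketch, which uses only the standard library. The function names and the sample files are made up for illustration; the script reports groups of byte-for-byte identical files and deletes nothing, so the review step stays with a person.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images are never loaded whole into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> dict[str, list[Path]]:
    """Map each content hash to the files sharing it, keeping groups of two or more."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of_file(path)].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}


# Demonstration on a throwaway directory with made-up files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.jpg").write_bytes(b"same bytes")
    (root / "copy_of_a.jpg").write_bytes(b"same bytes")
    (root / "b.jpg").write_bytes(b"different bytes")
    duplicates = find_exact_duplicates(root)
    for file_hash, paths in duplicates.items():
        print(file_hash[:12], [p.name for p in paths])
```

A hash comparison of this kind catches renamed copies but not resized, recompressed, or re-shot versions of the same picture, which is why the near-duplicate problem needs separate tooling.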

About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
