Berlin's public digital repositories are clogged with duplicate images — and the institutions responsible for managing them are running out of patience, and storage space. Archivists at the Landesarchiv Berlin on Eichborndamm in Reinickendorf have flagged the problem internally for months, and specialists in the city's growing civic-tech community are now joining the call for a systematic solution before the backlog gets worse.
The issue sounds bureaucratic. It isn't. Berlin's Senate departments, cultural institutions, and public housing bodies collectively manage tens of millions of digitised records, some of them photographs documenting urban development, social housing construction, and community life going back decades. When the same image exists under five different filenames across three separate servers, it doesn't just waste storage capacity; it undermines the integrity of the public record. Archivists cannot reliably verify which version is authorised, when duplicates were created, or which metadata is correct.
The timing is pointed. Berlin's SPD-led coalition has been pushing a broader digital administration agenda under the Senate Department for Urban Development and Housing, and duplicate image accumulation is now surfacing as an obstacle to the city's plans to make its planning records fully interoperable by early 2027. Housing data tied to contested rent-cap debates, for instance, depends on clean photographic documentation of building stock — documentation that becomes legally murky when duplicated files carry conflicting timestamps or geotags.
What the Experts Are Saying
At the Technologiestiftung Berlin, which operates out of offices near Gendarmenmarkt in Mitte, researchers working on smart-city data governance have been vocal about the risks of unmanaged duplication in public image databases. The foundation has previously published guidance on metadata standards for Berlin's open-data infrastructure, and staff there have described duplicate images as a downstream symptom of siloed departmental IT systems that were never designed to communicate with each other.
CityLAB Berlin, the public innovation lab housed in the former Tempelhof Airport terminal building, has run workshops on exactly this kind of data hygiene problem. Participants from the Bezirksamt Neukölln and from Berliner Wohnen, the city's largest public housing company, have attended sessions focused on deduplication workflows and automated image-hash verification — a technical process that identifies visually identical files regardless of their filename or storage location.
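For readers who want a feel for what hash-based verification involves, the following is a minimal sketch rather than any institution's actual workflow. It uses a cryptographic digest (SHA-256), which only catches byte-for-byte identical files; copies that have been re-encoded or resized need a different technique. The paths and file contents are hypothetical.

```python
import hashlib
from collections import defaultdict


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def group_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group paths whose contents are byte-for-byte identical.

    `files` maps a path (hypothetical, for illustration) to file contents.
    Only groups containing more than one path are returned.
    """
    by_digest: dict[str, list[str]] = defaultdict(list)
    for path, data in files.items():
        by_digest[content_digest(data)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


# Hypothetical holdings: the same bytes stored under different names
# on different servers, plus one unrelated file.
holdings = {
    "server-a/scan_0001.tif": b"image-bytes-1",
    "server-b/housing_1974.tif": b"image-bytes-1",
    "server-c/copy_of_scan.tif": b"image-bytes-1",
    "server-a/scan_0002.tif": b"image-bytes-2",
}
duplicate_groups = group_duplicates(holdings)
```

Because the digest depends only on the bytes, the filename and storage location play no part in the comparison.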
The numbers give a sense of scale. A 2025 audit commissioned by the Senate Department for Culture reported that Berlin's public cultural institutions alone held more than 4.2 million digitised image files across fragmented systems, with an estimated duplication rate of between 18 and 23 percent. Applied to 4.2 million files, that works out to roughly 756,000 to 966,000 redundant copies, each consuming server resources, complicating retrieval, and generating licensing ambiguity when images are shared with press or research partners.
What Comes Next
The Senate Department for Urban Development and Housing is expected to present a cross-departmental image-management framework to the Abgeordnetenhaus before the end of the third quarter of 2026. The proposal, which has been in preparation since January, is understood to include mandatory deduplication checks before any new image batch enters a public archive, along with a central metadata registry that would allow institutions from the Stadtbibliothek to the Stadtentwicklungsamt to cross-reference holdings without duplicating uploads.
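The framework's details have not been published, so what follows is only an illustration of how a pre-ingest check against a shared registry could work in principle. The class name, the record identifiers, and the choice of SHA-256 as the matching key are all assumptions made for this sketch, not features of the proposal.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class MetadataRegistry:
    """Hypothetical in-memory stand-in for a central registry.

    Maps a content digest to the id of the record that first registered it.
    """
    records: dict[str, str] = field(default_factory=dict)

    def ingest_batch(
        self, batch: dict[str, bytes]
    ) -> tuple[list[str], dict[str, str]]:
        """Register new images and report duplicates instead of storing them.

        Returns (accepted_ids, rejected), where `rejected` maps each
        duplicate's id to the id of the record already holding that content.
        """
        accepted: list[str] = []
        rejected: dict[str, str] = {}
        for record_id, data in batch.items():
            digest = hashlib.sha256(data).hexdigest()
            existing = self.records.get(digest)
            if existing is None:
                self.records[digest] = record_id
                accepted.append(record_id)
            else:
                rejected[record_id] = existing
        return accepted, rejected


# Two hypothetical institutions upload batches; one image overlaps.
registry = MetadataRegistry()
first = registry.ingest_batch(
    {"library/0001": b"facade-photo", "library/0002": b"courtyard-photo"}
)
second = registry.ingest_batch(
    {"planning/7781": b"facade-photo", "planning/7782": b"street-photo"}
)
```

In this sketch the second institution's copy of the facade photograph is turned away with a pointer to the existing record, so the holdings can be cross-referenced rather than uploaded twice.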
For institutions that haven't yet audited their own holdings, specialists at Technologiestiftung Berlin have pointed to open-source tools — including perceptual hashing libraries already used by several German federal agencies — as a practical first step that requires no new budget. CityLAB has indicated it plans to publish a step-by-step guide in German tailored to Berlin's Bezirk-level offices before the autumn.
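Perceptual hashing differs from ordinary checksums in that near-identical images yield near-identical hashes. The sketch below implements one simple variant, an average hash, in pure Python on tiny grayscale grids. It is a teaching example under stated assumptions, not the API of any particular library: real tools decode and downscale the image first, and the pixel values here are made up.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Compute a simple average hash from a small grayscale grid.

    Each pixel brighter than the grid's mean contributes a 1 bit.
    The grid is assumed to be already downscaled to a fixed size.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two hashes differ."""
    return bin(a ^ b).count("1")


# Hypothetical 2x2 grids: an original, a slightly altered copy
# (as after recompression), and an unrelated image.
original = [[10, 200], [220, 30]]
recompressed = [[12, 198], [221, 28]]
different = [[200, 10], [30, 220]]

near_copy_distance = hamming_distance(
    average_hash(original), average_hash(recompressed)
)
unrelated_distance = hamming_distance(
    average_hash(original), average_hash(different)
)
```

A small distance suggests two files show the same picture even when their bytes differ. The cut-off that counts as "small" is a tuning decision each institution would need to test against its own holdings.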
The practical stakes are immediate. Berlin's housing courts have increasingly relied on timestamped photographic evidence in rent-dispute proceedings under the city's Mietendeckel-adjacent policies. If the evidentiary photographs on file exist in multiple conflicting versions, the legal reliability of that documentation comes into question. Fixing the archive, archivists argue, is not a technical nicety — it is a precondition for the city's governance to function as promised.