Berlin's Digital Archives Are Full of Duplicate Images — Officials and Experts Want a Fix
From the Landesarchiv to Tempelhof Field, administrators and technologists are pressing the city to act on a sprawling problem hiding in plain sight.

Berlin's public digital infrastructure is carrying tens of thousands of redundant image files — duplicated photographs, scanned documents, and architectural renders stored simultaneously across multiple city databases — and the people responsible for managing that data say the situation has become expensive enough to demand serious attention. The issue, broadly called duplicate image replacement, has moved from a niche archival concern to a live budget and efficiency debate inside the SPD-led Senate's digitalisation office.
Why now? The city is midway through its 2025–2028 Smart City Strategie, a programme that commits Berlin to consolidating public-sector data storage and cutting unnecessary infrastructure costs. Storage redundancy has emerged as one of the clearest targets. Administrators at the Senatsverwaltung für Stadtentwicklung estimate that uncoordinated file management across departments has inflated cloud and on-premise storage costs, though the Senate has not published a precise figure for the duplicate image problem specifically. Pressure is also coming from the Rechnungshof Berlin, the city's independent audit court, which has flagged digital asset management as an area warranting review in its most recent annual report cycle.
Thomas Mühlberg, head of digital collections at the Landesarchiv Berlin on Eichborndamm in Reinickendorf, has spoken publicly at several archival conferences about the challenge of perceptual hashing — a technology that identifies near-identical images even when file names or metadata differ. His institution holds more than four million digitised items. The core problem, he has explained in panel discussions, is not that duplicates exist but that no single Berlin agency currently has the authority or technical mandate to run deduplication across departmental boundaries.
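For readers curious how perceptual hashing works in principle, the sketch below shows one of the simplest variants, often called an average hash. It is an illustration only, not the method used by the Landesarchiv or any Berlin agency: the function names, the 4x4 sample grids, and the distance threshold of 2 are all assumptions made for this example. Each image is reduced to a string of bits recording which pixels are brighter than the image's mean, and two images count as near-duplicates when only a few bits differ.

```python
from typing import List

# A grayscale image as rows of brightness values (0-255).
# Real tools would first shrink each image to a small fixed grid, e.g. 8x8.
Grid = List[List[int]]


def average_hash(pixels: Grid) -> int:
    """Build a hash with one bit per pixel: 1 if brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 2) -> bool:
    """Hypothetical rule: treat images as duplicates if few bits differ."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical sample data: a scan, a slightly brighter re-export of the
# same scan, and an unrelated image with a different layout.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]
brighter = [[min(255, value + 10) for value in row] for row in original]
different = [
    [200, 210, 205, 215],
    [200, 210, 205, 215],
    [10, 20, 15, 25],
    [10, 20, 15, 25],
]

print(is_near_duplicate(original, brighter))   # same picture, re-exported
print(is_near_duplicate(original, different))  # unrelated picture
```

Because the hash depends on the pattern of light and dark rather than on file names, metadata, or exact pixel values, the brightened copy matches the original while the unrelated image does not. Production systems typically use more robust variants and tune the threshold against known duplicates.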
That jurisdictional gap is precisely what technologists at CityLAB Berlin, the civic innovation lab based in the former Tempelhof Airport terminal building on Tempelhofer Damm, have been working to close. CityLAB researchers published a technical brief in March 2026 recommending that Berlin adopt a centralised digital asset management system capable of automated duplicate detection, with a pilot phase running across three Senate departments before any city-wide rollout. The brief identified the Senatsverwaltung für Kultur und Gesellschaftlichen Zusammenhalt, the Senatsverwaltung für Stadtentwicklung, and the BVG communications department as the three bodies holding the largest uncoordinated image repositories.
BVG, the public transport operator, is relevant here partly because of its ongoing infrastructure investment cycle. The authority has been producing substantial volumes of construction-phase photography documenting U-Bahn and tram expansion work — images that are routinely uploaded to multiple project management platforms without a single master record. A BVG spokesperson confirmed in a May 2026 statement to trade publication Behörden Spiegel that the operator was examining its internal file management workflows, though no specific deduplication timeline was given.
Data from the Fraunhofer FOKUS institute, which works with German public-sector clients on digital infrastructure, suggests that duplicate files can account for between 20 and 40 percent of total storage volume in organisations without active deduplication policies. Applied even conservatively to Berlin's public sector, that range implies meaningful annual savings if the problem is addressed systematically. The city spent roughly €47 million on digital infrastructure across all Senate departments in the 2024 fiscal year, according to budget documents published by the Abgeordnetenhaus.
The practical obstacle is coordination. Berlin's administration is famously siloed, a structural reality that has slowed previous city-wide digital initiatives including the rollout of the unified citizen services portal service.berlin.de. Advocates of duplicate image replacement argue the task is more tractable than past efforts because it is technical rather than political — no neighbourhood objects to cleaner databases the way Friedrichshain residents object to new housing towers.
The CityLAB brief recommends a decision by the end of the third quarter of 2026, ideally in time to include procurement for a unified digital asset management platform in the 2027 budget round. Whether the Senate acts on that timeline will depend partly on appetite inside the coalition, and partly on how forcefully the Rechnungshof presses the issue when its next full report lands in autumn.
Published by The Daily Berlin