The Daily Berlin

Berlin news, every day

Berlin's Digital Archives Are Full of Duplicate Images — and Officials Want Them Gone

From Senate data centres to startup studios in Mitte, the push to clean up Berlin's bloated image databases is drawing sharp opinions from technologists, archivists, and city planners.

By Berlin News Desk · Published 4 July 2026, 8:51 pm

Photo: Adis Resic on Pexels

Berlin's public administration is sitting on millions of duplicate digital images — redundant files clogging servers across Senate departments, housing agencies, and the city's own urban-planning portals — and a growing chorus of officials and technical experts is demanding a coordinated fix before the problem gets worse. The issue has moved from an IT footnote to a procurement and budget conversation at the Rotes Rathaus, where the SPD-led coalition is under pressure to modernise municipal data infrastructure without blowing its already stretched technology budget.

The timing matters. The city is in the middle of a multi-year digitisation drive under the Berlin Digital Strategy, a programme that has funnelled resources into e-government services and open-data platforms since 2022. As more departments upload planning documents, housing survey photographs, and infrastructure imagery to shared cloud environments, the volume of duplicate files has expanded alongside everything else. Storage costs, which are not trivial at enterprise scale, compound the problem every month.

What the Experts Are Actually Saying

Technologists at the Technologiestiftung Berlin — the city-linked foundation on Grunewaldstraße that advises the Senate on digital policy — have flagged duplicate asset management as a structural gap in Berlin's data governance framework. The foundation has previously published work on data quality in public-sector systems, and people familiar with those discussions say the image duplication problem is routinely underestimated because storage costs in government IT are often bundled into opaque departmental line items rather than broken out transparently.

On the startup side, the conversation is sharper and more commercial. At co-working spaces along Torstraße in Mitte and inside the Factory Berlin campus in Prenzlauer Berg, product teams building tools for real-estate platforms, municipal SaaS, and media-tech applications treat duplicate image detection as a solved engineering problem — one that public institutions simply haven't prioritised. Developers point to perceptual hashing algorithms and machine-learning-based deduplication pipelines as off-the-shelf options that a procurement team could specify today. The sticking point, they argue, is procurement lag and a lack of internal champions inside the Senate departments with the authority to push change through.
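To make the developers' point concrete, here is a minimal sketch of one common perceptual-hash variant, a difference hash. It is an illustration only, not any vendor's or department's pipeline. It assumes each image has already been decoded and downscaled to a small grayscale grid by an image library; the function names and the `max_distance` threshold are hypothetical.

```python
from typing import List

Grid = List[List[int]]  # grayscale values, rows of equal length


def dhash(pixels: Grid) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair.

    A bit is 1 when the left pixel is brighter than its right neighbour,
    so uniform brightness shifts leave the hash unchanged.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two grids as near-duplicates when few hash bits differ.

    max_distance is an assumed threshold and would need tuning on real data.
    """
    return hamming(dhash(a), dhash(b)) <= max_distance


# Made-up 2x3 grids; a real pipeline would typically use something like 8x9.
original = [[10, 20, 30], [30, 20, 10]]
brightened = [[value + 5 for value in row] for row in original]
flipped = [list(reversed(row)) for row in original]

print(hamming(dhash(original), dhash(brightened)))  # 0
print(hamming(dhash(original), dhash(flipped)))     # 4
```

Unlike a cryptographic checksum, this kind of hash tolerates small changes such as a brightness adjustment, which is why it suits the same photograph saved or exported more than once.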

The Senate department responsible for Stadtentwicklung und Wohnen (urban development and housing) manages an especially large image corpus. Property survey photographs, planning application attachments, and neighbourhood documentation for areas like Neukölln and Lichtenberg have accumulated over years, often uploaded multiple times by different staff without any automated check. Sources familiar with the department's IT environment say an internal review conducted in early 2025 estimated that deduplicating legacy image archives could reduce storage load in certain datasets by more than 30 percent, though those figures have not been published officially.
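How an estimate of that kind could be derived is simple arithmetic: total bytes stored, minus the bytes needed to keep one copy of each distinct file. The sketch below shows the calculation under stated assumptions. The function name and the sample inventory are invented for illustration and are not the department's data or method.

```python
from typing import Dict, Iterable, Tuple


def reclaimable_share(files: Iterable[Tuple[str, int]]) -> float:
    """Fraction of stored bytes occupied by redundant copies.

    Each item is (content_digest, size_in_bytes). Files sharing a digest
    are treated as identical, and one copy of each is kept.
    """
    total = 0
    kept: Dict[str, int] = {}
    for digest, size in files:
        total += size
        kept.setdefault(digest, size)  # the first copy is the one retained
    if total == 0:
        return 0.0
    return (total - sum(kept.values())) / total


# Invented inventory: one 300-byte image stored twice, plus one unique file.
inventory = [("digest-a", 300), ("digest-a", 300), ("digest-b", 400)]
print(f"{reclaimable_share(inventory):.0%}")  # 30%
```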

Costs, Contracts, and the Path Forward

Cloud storage is not free, even at government rates. Berlin's Senate Department for Finance signed a framework agreement for public cloud services in 2023 that covers multiple departments under a shared cost model. Every gigabyte of unnecessary duplicate data sitting in those environments translates directly into expenditure that could be redirected — a point that budget hawks in the coalition have started to make explicitly as the 2027 budget cycle approaches.

The BVG, the public transport operator that has been expanding its own digital operations as part of a broader infrastructure modernisation, has separately dealt with image deduplication in the context of its asset-management and maintenance documentation systems. The operator declined to provide figures, but its technical teams have discussed the issue at industry forums in Berlin, including events hosted at the STATION Berlin conference venue in Kreuzberg.

For departments that have not yet acted, experts at the Technologiestiftung and in the city's startup community broadly agree on the practical steps: run a full audit of existing image repositories using automated hashing tools, establish a single-source-of-truth storage standard across departments, and build deduplication checks into any new upload workflow before files hit the archive. The Senate's CIO office is expected to address image and asset data governance in updated digital-strategy guidance due before the end of 2026. Whether that guidance will carry teeth — or remain advisory — is the question officials and observers are watching most closely.
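Two of those steps, the repository audit and the check at upload time, can be sketched with nothing more than a standard checksum. The example below is a minimal illustration that assumes files sit on a local file system; `audit` and `UploadGate` are hypothetical names, and the in-memory index stands in for whatever shared catalogue a real archive would use. It catches only byte-identical copies, not resized or re-encoded ones.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Optional


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large images are not loaded whole."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def audit(root: Path) -> Dict[str, List[Path]]:
    """Group files under root by digest; return only groups with duplicates."""
    groups: Dict[str, List[Path]] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups.setdefault(file_digest(path), []).append(path)
    return {d: paths for d, paths in groups.items() if len(paths) > 1}


class UploadGate:
    """Rejects uploads whose exact bytes are already archived."""

    def __init__(self) -> None:
        self._index: Dict[str, str] = {}  # digest -> name of the archived copy

    def submit(self, name: str, data: bytes) -> Optional[str]:
        """Return the existing copy's name on a duplicate, else archive and return None."""
        digest = hashlib.sha256(data).hexdigest()
        existing = self._index.get(digest)
        if existing is not None:
            return existing
        self._index[digest] = name
        return None
```

In this sketch, a call such as `audit(Path("archive"))` would list every group of identical files in a folder, while `UploadGate.submit` returns the name of the earlier copy instead of storing the same bytes a second time.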

About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
