The Daily Berlin


Berlin's Digital Archives Are Full of Duplicate Images — and Officials, Experts and Archivists Are Finally Talking About the Cost

From the Stadtbibliothek to Senatsverwaltung servers, redundant image files are draining storage budgets and slowing down public-access platforms across the capital.

By Berlin News Desk · Published 4 July 2026, 8:48 pm


Photo: Max Kladitin on Pexels

Berlin's public digital infrastructure is carrying tens of thousands of duplicate image files across municipal databases, and the people responsible for managing those systems say the problem is no longer a minor housekeeping issue. Officials at the Senatsverwaltung für Stadtentwicklung, archivists at the Zentral- und Landesbibliothek Berlin on Breite Straße, and technology advisers working with the city's open-data initiative have all flagged the removal or replacement of duplicate images as a priority for the second half of 2026.

The timing matters. Berlin's CDU-led coalition with the SPD committed in its governing agreement to accelerating the digitisation of public records and expanding open-data access by the end of this legislative term. That push has generated enormous volumes of scanned photographs, planning documents and heritage images, and with volume comes redundancy. Departments that scan the same archival photographs independently, or upload city marketing assets without centralised coordination, have left municipal servers holding the same file dozens of times over.

What the Experts Are Saying

Specialists in digital asset management who work with Berlin's public institutions describe the situation in practical terms: storage is not free, and at enterprise scale, redundant files translate directly into wasted budget. Cloud storage costs for Berlin's public sector have risen alongside broader market rates. Enterprise object storage in European data centres now runs at roughly €0.02 to €0.025 per gigabyte per month, and industry estimates suggest that large municipal archives can accumulate hundreds of terabytes of duplicated visual content over a decade of uncoordinated digitisation.
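To put those rates in perspective, the short sketch below turns a volume of duplicated content into a monthly and yearly storage bill. Only the per-gigabyte price range comes from the figures quoted above; the 200-terabyte volume, the decimal conversion of 1,000 GB per TB and the function names are illustrative assumptions, not numbers from any Berlin department.

```python
from typing import Tuple

# Price range quoted in the article, in euros per gigabyte per month.
PRICE_LOW_EUR = 0.02
PRICE_HIGH_EUR = 0.025

# Assumption: decimal terabytes (1 TB = 1,000 GB), as cloud providers usually bill.
GB_PER_TB = 1000


def monthly_cost_range(duplicate_tb: float) -> Tuple[float, float]:
    """Return the (low, high) monthly cost in euros of storing duplicate_tb terabytes."""
    gigabytes = duplicate_tb * GB_PER_TB
    return gigabytes * PRICE_LOW_EUR, gigabytes * PRICE_HIGH_EUR


def yearly_cost_range(duplicate_tb: float) -> Tuple[float, float]:
    """Return the (low, high) yearly cost in euros, assuming a flat monthly price."""
    low, high = monthly_cost_range(duplicate_tb)
    return low * 12, high * 12


if __name__ == "__main__":
    # Hypothetical volume chosen only to illustrate "hundreds of terabytes".
    volume_tb = 200
    month_low, month_high = monthly_cost_range(volume_tb)
    year_low, year_high = yearly_cost_range(volume_tb)
    print(f"{volume_tb} TB of duplicates: EUR {month_low:,.0f} to {month_high:,.0f} per month")
    print(f"{volume_tb} TB of duplicates: EUR {year_low:,.0f} to {year_high:,.0f} per year")
```

Under those assumptions, 200 terabytes of redundant images would cost roughly €4,000 to €5,000 a month, or €48,000 to €60,000 a year, before any transfer or backup charges.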

The Technologiestiftung Berlin, based in Schöneberg, has been advising city departments on exactly this kind of structural inefficiency as part of its broader digital infrastructure work. The foundation has consistently argued that deduplication, the automated identification and replacement or removal of redundant files, should be built into procurement requirements whenever the city acquires new content management software. Without that requirement, each new platform can inherit the same disorder from its predecessor.

At the Stadtmuseum Berlin, which manages collections across sites including the Märkisches Museum at Köllnischer Park, curators have been working since early 2025 to rationalise image catalogues before a planned upgrade to their collection management system. The process exposed a pattern familiar to archivists across Europe: well-intentioned digitisation programmes, run on tight timelines and tighter budgets, rarely include a deduplication step, because it adds time upfront even though it saves significantly more time downstream.

What Comes Next for City Systems

The Senatsverwaltung für Inneres und Digitales is expected to publish updated data governance guidelines before the end of the third quarter of 2026. Those guidelines, which have been in preparation since January, are understood to address image asset management directly by requiring departments to run deduplication checks before migrating content to shared platforms. The Berlin Open Data portal, accessible at daten.berlin.de, is one of the systems in scope.

For Berlin's growing tech sector, centred on hubs like Factory Berlin in Mitte and the Kreuzberg startup corridor along Oranienstraße, the municipal conversation carries some commercial relevance. Several local firms offer AI-assisted duplicate detection tools that have found customers among media companies and e-commerce platforms across Germany. Whether city procurement rules will open the door to those vendors, or whether the Senatsverwaltung will rely on open-source deduplication tools already embedded in existing software, remains an open question ahead of the Q3 guidelines.

Archivists and digital managers working with the city have a consistent practical message for departments that have not yet audited their image libraries: start with the largest collections first, prioritise files uploaded between 2019 and 2023 when pandemic-era digitisation programmes were running at speed, and do not wait for a top-down mandate before running basic hash-comparison checks. The tools exist. The costs of delay are real. And Berlin's servers are not getting any emptier.
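A basic hash-comparison check of the kind the archivists describe can be sketched in a few lines. The example below is a minimal illustration rather than any department's actual tooling: it groups image files by size, hashes only the files whose sizes collide, and lists the duplicate groups with the most wasted space first. The function names, the list of file extensions and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumption: which file extensions count as images in this archive.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Return a list of (file_size, [paths]) groups of byte-identical images.

    Groups are sorted so that the ones wasting the most storage come first.
    """
    # Step 1: group by size, which is cheap and rules out most files.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            by_size[path.stat().st_size].append(path)

    # Step 2: hash only files that share a size with at least one other file.
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[(size, sha256_of(path))].append(path)

    # Step 3: keep real duplicate groups and rank them by wasted bytes.
    groups = [
        (size, sorted(paths))
        for (size, _digest), paths in by_hash.items()
        if len(paths) > 1
    ]
    groups.sort(key=lambda group: group[0] * (len(group[1]) - 1), reverse=True)
    return groups


if __name__ == "__main__":
    # "." is a placeholder; point this at the image library to audit.
    for size, paths in find_exact_duplicates("."):
        print(f"{len(paths)} copies of a {size}-byte file:")
        for path in paths:
            print(f"  {path}")
```

A check like this only catches byte-identical copies. Two separate scans of the same photograph, or a resized version of a marketing image, produce different hashes and would need the kind of similarity-based detection that commercial and open-source tools offer.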


About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
