The Daily Berlin

Berlin news, every day

Berlin's Digital Archives Are Full of Duplicate Images — and Officials Say the Clean-Up Can't Wait

From the Stadtbibliothek to the Senatsverwaltung, administrators, archivists and tech specialists are pressing for action on a backlog of redundant digital files that is costing storage budgets and slowing public access.

By Berlin News Desk · Published 4 July 2026, 8:28 pm

Photo: Department of Defense. European Command. Office of Military Government for Germany (U.S.). Secretariat for U.S. Military Tribunals. 3/15/1947-11/15/1949 / Public domain (Wikimedia Commons)

Berlin's public institutions are sitting on millions of duplicate digital images, and the people responsible for managing them say the problem has quietly become expensive. Archivists at the Zentral- und Landesbibliothek Berlin on Breite Straße in Mitte estimate the institution's digital holdings have grown by roughly 40 percent since 2022, with a significant share of that expansion driven not by new acquisitions but by repeated uploads of the same files across departments. The numbers, shared internally at a working-group session in May 2026, have pushed the deduplication question onto the agenda of the Senatsverwaltung für Kultur und Gesellschaftlichen Zusammenhalt.

The timing matters. Berlin's coalition government, led by the CDU, committed in its 2025 budget framework to digitising 1.2 million archival items by the end of 2027. That target is already strained by infrastructure costs, and duplicate image storage is one of the line items drawing scrutiny. Cloud storage contracts negotiated by the Berliner Senat run to several hundred thousand euros annually, and administrators say redundant files are inflating those bills without adding public value.

What the Specialists Are Saying

Deduplication, the automated process of identifying and then removing or consolidating identical or near-identical image files, is not new technology. What is new, according to specialists at the Fraunhofer-Institut für Offene Kommunikationssysteme in Charlottenburg, is the scale at which Berlin's public sector now needs to apply it. The institute has been involved in digital-infrastructure consultancy for German federal and state bodies, and its researchers have pointed in published work to the growing gap between digitisation ambitions and the data-hygiene practices needed to support them.

At the Stadtmuseum Berlin, which manages collections across several sites including the Ephraim-Palais in the Nikolaiviertel, curators describe a workflow problem: image files are frequently generated at multiple resolutions during scanning, then saved in full across shared drives without a standardised naming or tagging convention. As a result, a single historical photograph of, say, Potsdamer Platz can exist in six or seven versions in the same system, and a basic search cannot tell them apart. Staff time spent manually resolving those duplicates is time not spent on cataloguing new material.

Advocates for faster reform point to the experience of the Stiftung Preußischer Kulturbesitz, which administers collections including those at the Kulturforum near the Tiergarten. The foundation began a structured deduplication programme for its image databases in late 2024, using hash-matching software to flag identical files before human review. According to the foundation's annual digitisation report, published in early 2025, the first phase of the programme identified redundant copies accounting for approximately 18 percent of total image storage in the tested collection segments.
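For readers unfamiliar with hash-matching, the idea can be sketched in a few lines of Python. This is an illustration only, not the foundation's software: the function names `file_digest` and `find_exact_duplicates` are made up for this example, and the choice of SHA-256 is an assumption. The sketch computes a digest for every file under a folder and returns groups of files whose bytes are identical, leaving any decision about removal to a human reviewer.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group byte-identical files under root. Nothing is deleted or moved."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Only groups with more than one member are candidates for human review.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A matching digest only catches exact copies. The same photograph saved at a different resolution produces a different digest, so variants of that kind would need a separate approach.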

Policy Pressure and Practical Next Steps

The Senatsverwaltung has not yet published a formal directive on duplicate-image management, but officials have signalled that guidelines are being drafted for circulation to publicly funded cultural institutions by the fourth quarter of 2026. The draft framework, according to background briefings from the culture administration, is expected to mandate minimum metadata standards and require institutions receiving digitisation grants under the Berlin Digital Culture Fund to demonstrate deduplication protocols before disbursement.

For smaller institutions — neighbourhood archives in Neukölln or Prenzlauer Berg, community libraries that lack dedicated IT staff — the challenge is less about will than capacity. Advocates within the Berliner Bibliotheksverbund, the network linking the city's public library branches, have called for a centralised deduplication service that smaller bodies could plug into rather than building their own. That proposal is under review but has not been formally funded.

The practical advice from archivists currently managing the problem: start with naming conventions before reaching for software. Institutions that first standardised their file-naming structures (date, subject, resolution, source department) found that automated deduplication tools were significantly more accurate when they were eventually deployed. The Zentral- und Landesbibliothek's internal documentation from its 2025 workflow audit supports that sequencing. The technology is available. The governance framework is catching up.
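To show why naming comes first, here is a minimal Python sketch of what a convention built from those four fields might look like in practice. The exact pattern (`date_subject_WIDTHxHEIGHT_department.extension`), the sample file names, and the helper names `parse_name` and `group_resolution_variants` are all assumptions made for illustration; none of them comes from the institutions mentioned. Once names are predictable, versions of the same image at different resolutions can be grouped without opening a single file.

```python
import re
from collections import defaultdict
from typing import NamedTuple, Optional

# Hypothetical convention: date_subject_WIDTHxHEIGHT_department.extension
NAME_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})_"
    r"(?P<subject>[a-z0-9-]+)_"
    r"(?P<width>\d+)x(?P<height>\d+)_"
    r"(?P<department>[a-z0-9-]+)"
    r"\.(?P<ext>[a-z0-9]+)$"
)


class ImageName(NamedTuple):
    date: str
    subject: str
    width: int
    height: int
    department: str
    ext: str


def parse_name(filename: str) -> Optional[ImageName]:
    """Parse a file name; return None if it does not follow the convention."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    return ImageName(
        date=match["date"],
        subject=match["subject"],
        width=int(match["width"]),
        height=int(match["height"]),
        department=match["department"],
        ext=match["ext"],
    )


def group_resolution_variants(filenames):
    """Map (date, subject) to the parsed names that share it."""
    groups = defaultdict(list)
    for filename in filenames:
        parsed = parse_name(filename)
        if parsed is not None:
            groups[(parsed.date, parsed.subject)].append(parsed)
    # Keep only subjects that exist in more than one version.
    return {key: items for key, items in groups.items() if len(items) > 1}
```

A short usage example with made-up file names:

```python
names = [
    "1930-05-01_potsdamer-platz_4000x3000_stadtmuseum.tif",
    "1930-05-01_potsdamer-platz_800x600_bibliothek.jpg",
    "IMG_0001.JPG",  # does not follow the convention, so it is skipped
]
for key, versions in group_resolution_variants(names).items():
    print(key, [(v.width, v.height, v.department) for v in versions])
```

Files that fail to parse, such as `IMG_0001.JPG` here, are exactly the ones a tool cannot place automatically, which is one way to read the archivists' point about sequencing.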

About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
