The Daily Berlin

Berlin news, every day

Berlin's City Archive Overhauls Duplicate Image Policy After Cataloguing Crisis This Week

A backlog of mislabelled and repeated photographs in the Landesarchiv Berlin has forced administrators to adopt new deduplication software, reshaping how the city stores its visual history.

By Berlin News Desk · Published 4 July 2026, 8:47 pm

Photo: Nadine Ginzel on Pexels

The Landesarchiv Berlin confirmed this week that it has formally adopted automated duplicate-image-replacement protocols across its digital holdings, ending a years-long dispute between archivists and IT staff over how to handle tens of thousands of redundant photograph files. The decision, finalised on 2 July, affects an estimated 340,000 digitised images spanning records from the postwar reconstruction era through to the early 2000s.

The timing is not arbitrary. Berlin's Senate Department for Culture and Social Cohesion set a 31 December 2026 deadline for all publicly funded archives and libraries to meet new federal standards for digital asset management. With that deadline now six months out, the Landesarchiv — housed on Eichborndamm in Reinickendorf — had little room left to delay. Duplicate image files were consuming roughly 18 percent of active server capacity, according to internal documentation reviewed by administrators this spring, and the problem had grown worse as scanning operations accelerated.

What the New System Does, and Why It Matters for Berlin's History

The replacement protocol works in two stages. First, a perceptual hashing algorithm flags images that are visually identical or near-identical, a common issue when the same historical photograph entered the archive through multiple donor collections. Second, a human reviewer at the archive confirms each replacement before any original file is deleted, so that minor variations between prints are not lost. The Zentral- und Landesbibliothek Berlin, which manages a parallel photo collection in Mitte, is piloting a compatible version of the same workflow with the goal of full interoperability by October.
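The two stages can be illustrated with a minimal Python sketch. It is not the archive's software: the article does not name the hashing algorithm, so a simple difference hash stands in for it here, and the function names, the distance threshold and the catalogue numbers are all hypothetical. Images are assumed to be already reduced to small grayscale grids.

```python
from itertools import combinations


def dhash(pixels):
    """Difference hash: one bit per horizontally adjacent pixel pair.

    `pixels` is a small grayscale grid (rows of ints, 0-255), assumed to be
    downscaled already, e.g. 8 rows x 9 columns for a 64-bit hash.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def flag_candidates(hashes, max_distance=4):
    """Stage 1: pairs of catalogue numbers whose hashes are close.

    `max_distance` is an assumed threshold, not a documented setting.
    """
    return [
        (id_a, id_b)
        for (id_a, h_a), (id_b, h_b) in combinations(sorted(hashes.items()), 2)
        if hamming(h_a, h_b) <= max_distance
    ]


def apply_reviews(candidates, decisions):
    """Stage 2: keep only the pairs a human reviewer explicitly confirmed."""
    return [pair for pair in candidates if decisions.get(pair) is True]


# Hypothetical catalogue numbers and toy 2 x 3 grids.
scan_a = [[10, 20, 30], [30, 20, 10]]
scan_b = [[12, 22, 32], [32, 22, 12]]  # same picture, slightly brighter print
scan_c = [[30, 20, 10], [10, 20, 30]]  # a different picture

hashes = {"A 1": dhash(scan_a), "B 2": dhash(scan_b), "C 3": dhash(scan_c)}
candidates = flag_candidates(hashes, max_distance=1)
confirmed = apply_reviews(candidates, {("A 1", "B 2"): True})
```

In this sketch nothing is replaced unless a reviewer's decision is recorded for the pair, which mirrors the safeguard described above: a flagged pair with no decision simply stays in the catalogue.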

The stakes are real for anyone who uses these archives. Researchers at the Freie Universität Berlin, journalists, documentary filmmakers, and school groups in districts from Neukölln to Pankow regularly draw on Landesarchiv material. When duplicate entries sit unresolved in a database, a search can return the same image four or five times under different catalogue numbers, pushing rarer photographs further down the results. Librarians at the Staatsbibliothek zu Berlin on Potsdamer Straße have flagged the same problem in their own photographic collections, and staff there are watching the Landesarchiv rollout closely.

Berlin's cultural sector has also been under broader financial pressure. The Senate's 2026 culture budget allocated roughly €720 million across institutions, but a significant share is earmarked for building maintenance and energy costs following Energiewende-related retrofits. That leaves digitisation projects competing for a smaller discretionary pool, which makes getting the deduplication right on the first attempt — rather than reprocessing collections a second time — financially important as well as archivally sound.

What Happens Next for Researchers and the Public

The Landesarchiv expects the first tranche of cleaned records — covering the period 1945 to 1970 — to be publicly searchable through its online portal by September. That phase alone covers approximately 90,000 images, many documenting rubble clearance in Mitte, the construction of Plattenbau housing estates in Marzahn, and street life along the old Kurfürstendamm. A second tranche covering the 1970s through the 1990s is scheduled for completion before the Senate's December deadline.

For independent researchers and journalists working with historical Berlin photographs, the practical advice from the archive's reading room staff is to hold off on large-scale image downloads from the current portal until the September relaunch. Catalogue numbers for duplicated items will change during the cleanup, which means any citations made now could become invalid references within two months. The archive has promised a redirect system for old catalogue identifiers, but staff have cautioned that the redirect database will itself take several weeks to populate after go-live.
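For readers who maintain their own citation lists, the idea of a redirect table can be sketched as follows. The archive has not published a format or an interface for its redirects, so the dictionary layout, the `resolve` function and the identifiers below are purely illustrative assumptions.

```python
def resolve(catalogue_id, redirects, max_hops=10):
    """Follow old -> new identifier redirects and return the final identifier.

    `redirects` maps a retired identifier to its replacement (hypothetical
    layout). An identifier with no entry is returned unchanged.
    """
    seen = set()
    current = catalogue_id
    while current in redirects:
        if current in seen or len(seen) >= max_hops:
            raise ValueError(f"redirect loop or chain too long at {current!r}")
        seen.add(current)
        current = redirects[current]
    return current


# Hypothetical identifiers: one duplicate was merged twice.
redirects = {"OLD-1": "OLD-2", "OLD-2": "NEW-9"}
final_id = resolve("OLD-1", redirects)
```

Because the redirect database is expected to take weeks to populate, an identifier that comes back unchanged from a lookup like this is not proof that it is still valid. It may only mean that its redirect has not been entered yet.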

The Zentral- und Landesbibliothek Berlin's pilot, meanwhile, is accepting feedback from registered users through its online contact form until 31 July. That is a rare window for researchers to flag images they believe are incorrectly marked as duplicates before the automated system makes permanent replacements. Anyone with a library card issued in Berlin can participate.

About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
