The Daily Berlin

Berlin news, every day

Berlin's Duplicate Image Crisis: The Key Decisions That Will Shape the City's Visual Archives

From Mitte to Neukölln, administrators and archivists face a reckoning over how Berlin manages, deduplicates and preserves its sprawling digital image collections.

By Berlin News Desk · Published 4 July 2026, 8:51 pm

Photo: Congressional Research Service / Public domain (Wikimedia Commons)

Berlin's public institutions are sitting on a problem years in the making. Across municipal databases — from the Stadtbibliothek Berlin's digital holdings to the Landesarchiv on Eichborndamm in Reinickendorf — tens of thousands of duplicate images have accumulated, the result of fragmented digitisation drives, multiple overlapping scanning contracts, and departments that rarely talked to each other. Now, with the SPD-led Senate having flagged digital infrastructure reform as a budget priority for the 2026–2027 fiscal period, the question is no longer whether Berlin will act. It is how, how fast, and who pays.

The urgency is real. Berlin's open-data portal, daten.berlin.de, hosts image sets from at least a dozen separate city departments, many uploaded without cross-referencing. Duplicates inflate storage costs, slow down search functions, and — critically — create legal exposure when licensing metadata on the original file differs from the copy. For a city increasingly positioning itself as a European tech hub, with clusters of AI and data companies concentrated around Kreuzberg's Oranienstraße corridor and the Prenzlauer Berg startup belt, the inability to maintain clean, deduplicated public archives is an embarrassment with practical consequences.

What the Deduplication Decision Actually Involves

At its core, the city faces three distinct choices. First: which deduplication standard to adopt. Exact hash matching, which compares a cryptographic fingerprint of each file's bytes, catches byte-for-byte copies but misses near-duplicates created when the same photo is re-saved at different resolutions or with different colour profiles. Perceptual hashing, used by platforms including Getty Images and several European national archives, fingerprints what an image looks like rather than how it is stored, so it catches those variants, but it requires more computational power and a larger procurement contract.
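
For readers who want to see the difference between the two approaches, here is a minimal sketch in Python. It is illustrative only and not drawn from any system Berlin is evaluating: the function names are hypothetical, images are represented as plain grids of grayscale values rather than real files, and the perceptual hash shown is the simple "average hash" variant, one of several in use.

```python
import hashlib


def exact_fingerprint(data: bytes) -> str:
    """Cryptographic hash of the raw bytes: equal only for byte-identical files."""
    return hashlib.sha256(data).hexdigest()


def downscale(pixels: list[list[int]], size: int = 8) -> list[list[int]]:
    """Shrink a grayscale grid to size x size by averaging blocks.

    Assumes (for simplicity) that width and height are multiples of `size`.
    """
    block_h = len(pixels) // size
    block_w = len(pixels[0]) // size
    grid = []
    for gy in range(size):
        row = []
        for gx in range(size):
            block = [
                pixels[y][x]
                for y in range(gy * block_h, (gy + 1) * block_h)
                for x in range(gx * block_w, (gx + 1) * block_w)
            ]
            row.append(sum(block) // len(block))
        grid.append(row)
    return grid


def average_hash(grid: list[list[int]]) -> int:
    """Average hash: one bit per cell, set when the cell is brighter than the mean."""
    flat = [p for row in grid for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; small values suggest near-duplicates."""
    return bin(a ^ b).count("1")


# A 16x16 gradient "photo" and the same picture re-saved at 32x32.
small = [[x * 16 for x in range(16)] for _ in range(16)]
large = [[small[y // 2][x // 2] for x in range(32)] for y in range(32)]

same_bytes = exact_fingerprint(bytes(p for r in small for p in r)) == \
    exact_fingerprint(bytes(p for r in large for p in r))
distance = hamming_distance(average_hash(downscale(small)),
                            average_hash(downscale(large)))
print("exact match:", same_bytes)
print("perceptual distance:", distance)
```

In this toy case the byte-level fingerprints differ because the two versions have different sizes, while the perceptual hashes agree. A production system would also need a distance threshold for deciding when two images count as "the same", and choosing it is a policy decision as much as a technical one.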

Second: who governs the process. The Senatsverwaltung für Inneres und Digitales, which oversees Berlin's IT infrastructure, has been in discussions with the Kompetenzzentrum Geodateninfrastruktur Berlin-Brandenburg — the regional body responsible for geographic and visual data standards — about where responsibility should sit. Splitting governance between a city body and a regional one has complicated previous projects, including the delayed rollout of Berlin's unified geodata platform, which missed its original 2024 deadline.

Third: what happens to the duplicates once identified. Deletion sounds straightforward. It is not. Archivists at the Landesarchiv have long argued that what looks like a duplicate may carry different provenance metadata — a different photographer credit, a different acquisition date — that makes it independently valuable. A blanket delete policy risks destroying historical context. A case-by-case review, however, could take years and require staff the archive does not currently have.
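
The middle ground between blanket deletion and full manual review can be sketched in a few lines of Python. This is a hypothetical illustration, not a description of any Landesarchiv workflow: the record fields (`photographer`, `acquisition_date`) and the function name are assumptions chosen to mirror the provenance examples above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    # Hypothetical catalogue fields, for illustration only.
    record_id: str
    content_hash: str
    photographer: str
    acquisition_date: str


def triage_duplicates(records: list[ImageRecord]) -> tuple[list[list[str]], list[list[str]]]:
    """Split duplicate groups into 'identical provenance' and 'needs review'."""
    groups: dict[str, list[ImageRecord]] = defaultdict(list)
    for record in records:
        groups[record.content_hash].append(record)

    same_provenance, needs_review = [], []
    for members in groups.values():
        if len(members) < 2:
            continue  # not a duplicate at all
        provenance = {(m.photographer, m.acquisition_date) for m in members}
        ids = [m.record_id for m in members]
        if len(provenance) == 1:
            same_provenance.append(ids)  # copies agree on provenance
        else:
            needs_review.append(ids)  # copies disagree: keep for an archivist
    return same_provenance, needs_review
```

Only the groups whose copies disagree on provenance would go to an archivist, which shrinks the review queue without discarding records that carry distinct historical context.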

The Budget Question and the Timeline Ahead

Storage is not cheap. Berlin's Senate approved a digitisation budget of roughly €47 million for 2025–2026, spread across multiple departments, but no dedicated line item has been confirmed for a citywide image deduplication programme. Comparable exercises in Hamburg and Leipzig have cost between €800,000 and €2 million depending on collection size, according to publicly available procurement records from those cities.

The practical calendar matters here. The Senate's IT steering committee is scheduled to meet in September 2026, when digital infrastructure priorities for the following budget cycle will be set. That meeting is effectively the last realistic decision point before any programme could receive funding early enough to begin work in 2027. Miss that window, and the problem carries into a third fiscal year with no resolution.

For the city's institutions, the next eight weeks are the ones that count. The Stadtbibliothek, whose main branch sits on Breite Straße in Mitte, has already begun an internal audit of its image holdings — a process expected to conclude by late August. The Landesarchiv, meanwhile, is preparing a position paper on provenance standards that will feed directly into the September discussions. How those two documents land, and whether the Senatsverwaltung für Inneres und Digitales treats them as the foundation for a unified policy or as competing briefs from rival institutions, will define what Berlin's digital image infrastructure looks like for the next decade.

About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.