The Daily Berlin

Berlin news, every day

News

Berlin's Digital Archives Are Drowning in Duplicate Images — and the Numbers Reveal a Crisis Years in the Making

From Mitte to Neukölln, the city's public institutions are sitting on tens of thousands of redundant digital files, costing storage budgets and slowing the systems that residents actually depend on.

By Berlin News Desk · Published 4 July 2026, 9:06 pm

Berlin's municipal digital infrastructure is carrying a quiet but measurable burden: duplicate images stored across government servers, cultural archives, and public-facing platforms have inflated storage costs and degraded database performance at institutions across all twelve of the city's boroughs. The scale of the problem is only now coming into focus as agencies begin systematic audits ahead of a 2027 deadline set under the city's Digitalisierungsstrategie, the multi-year digital reform plan adopted by the SPD-led Senate in 2023.

The timing matters. Berlin is midway through a €1.2 billion public digitisation investment cycle, a figure confirmed in Senate budget documentation, that is supposed to modernise everything from BVG transit ticketing systems to housing permit registries. Redundant image data undermines the efficiency gains that investment is meant to deliver. When the same photograph, scan, or graphic exists in four or five slightly altered versions inside a single system, automated tools treat each version as distinct content, and search results, display interfaces, and AI-assisted tagging all degrade accordingly.

What the Data Actually Shows

A picture of the problem emerges borough by borough. The Stadtbibliothek Mitte, which manages digitised holdings for central Berlin, flagged in its 2025 annual operational review that roughly 18 percent of images ingested into its digital catalogue system over the previous three years were duplicates or near-duplicates — resized, recoloured, or re-exported versions of source files already in the database. That figure aligns with industry benchmarks: research published by the European Commission's digital public services unit found that across EU municipal databases, duplicate or near-duplicate digital assets typically account for between 15 and 22 percent of total stored image content.

Storage costs compound the problem. Cloud and on-premises storage rates for Berlin's Senatsverwaltung für Inneres und Digitales currently run at approximately €0.023 per gigabyte per month for mid-tier archival storage, according to publicly tendered contract summaries from 2024. Across a large institution managing hundreds of terabytes of image assets, even a 15 percent redundancy rate can translate to tens of thousands of euros in avoidable annual expenditure. The IT-Dienstleistungszentrum Berlin, the city's central IT service provider known as ITDZ Berlin, identified image-layer duplication as one of the top five efficiency drags in its most recent performance audit, shared with the Senate digital affairs committee in February 2026.
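
The arithmetic behind that estimate can be sketched in a few lines of Python. The 500-terabyte archive size is a hypothetical input chosen for illustration, not a figure reported by any Berlin institution. The function name and the decimal conversion (1 TB = 1,000 GB) are likewise assumptions. Only the storage rate and the redundancy share come from the figures quoted above.

```python
GB_PER_TB = 1_000  # assumption: decimal terabytes


def annual_duplicate_cost(total_tb, redundancy_share, eur_per_gb_month):
    """Yearly cost in euros of storing the redundant share of an archive."""
    duplicate_gb = total_tb * GB_PER_TB * redundancy_share
    return duplicate_gb * eur_per_gb_month * 12


# Hypothetical 500 TB archive, 15 percent redundancy, EUR 0.023 per GB per month
cost = annual_duplicate_cost(500, 0.15, 0.023)
print(f"Avoidable storage cost: about €{cost:,.0f} per year")
```

Under these assumptions the result is roughly €20,700 a year. At 100 terabytes the same rate yields closer to €4,100, so the tens-of-thousands range applies to the larger archives.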

The problem is not confined to government. Kulturprojekte Berlin, the public body that coordinates major cultural events including the Lichtfestival and the initiative around the Humboldt Forum on Unter den Linden, manages a sprawling digital press and event image archive used by dozens of partner institutions. Internal workflows that allow multiple staff members to upload assets without a centralised deduplication step have produced redundancy rates that insiders have described — in general terms to trade publications — as significant enough to require dedicated remediation projects.

What Comes Next for Berlin's Institutions

The practical fix is well understood, if not yet widely implemented. Perceptual hashing — a technique that assigns a fingerprint to each image based on visual content rather than file metadata — can identify near-duplicates even when files have been resized or re-exported. Tools built on this method have been integrated into content management systems by institutions including the Deutsches Historisches Museum on Unter den Linden, which completed a pilot deduplication pass on approximately 40,000 digitised historical photographs in late 2025.
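
A minimal sketch of the idea uses a difference hash, one common perceptual hashing variant. It is not a description of the tooling used by the Deutsches Historisches Museum or any other institution named here. The sketch assumes each image has already been decoded into rows of grayscale values. The function names and the distance threshold of 5 bits are illustrative assumptions, and a production system would normally rely on an established imaging library for decoding and resizing.

```python
def shrink(img, width, height):
    """Downsample a grayscale grid by averaging the pixels in each target cell."""
    src_h, src_w = len(img), len(img[0])
    out = []
    for y in range(height):
        y0 = y * src_h // height
        y1 = max((y + 1) * src_h // height, y0 + 1)
        row = []
        for x in range(width):
            x0 = x * src_w // width
            x1 = max((x + 1) * src_w // width, x0 + 1)
            block = [img[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(img, size=8):
    """Difference hash: one bit per pair of horizontally adjacent cells."""
    small = shrink(img, size + 1, size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(img_a, img_b, threshold=5):
    """Treat two images as near-duplicates if their hashes differ in few bits.

    The threshold is an illustrative assumption, not a recommended setting.
    """
    return hamming(dhash(img_a), dhash(img_b)) <= threshold
```

Because the fingerprint records only whether each region is brighter than its neighbour, a uniformly brightened copy or a cleanly upscaled copy produces the same or a very similar hash, while a visually different image produces a distant one. The threshold controls how aggressive the matching is and would need tuning against a real collection.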

For smaller organisations in neighbourhoods like Neukölln or Friedrichshain-Kreuzberg, where community archives and neighbourhood documentation projects have grown rapidly since 2020 with BezirksKulturFonds grant support, the barrier is less technical than organisational. Establishing upload protocols, mandatory tagging standards, and automated pre-ingest deduplication checks costs staff time that small teams rarely have to spare.
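
For a small team, the simplest automated pre-ingest check is an exact-match gate that compares file contents before anything is stored. The sketch below is an assumption about how such a gate might look, not a described Berlin workflow. It uses SHA-256 content hashes, so it catches only byte-identical uploads, not resized or re-exported copies. The class name, method name, and asset IDs are hypothetical.

```python
import hashlib
from typing import Dict, Optional


class IngestGate:
    """Rejects uploads whose bytes match an asset that was already ingested."""

    def __init__(self) -> None:
        self._seen: Dict[str, str] = {}  # SHA-256 hex digest -> first asset ID

    def check(self, asset_id: str, data: bytes) -> Optional[str]:
        """Return the ID of an existing identical asset, or None if new."""
        digest = hashlib.sha256(data).hexdigest()
        existing = self._seen.get(digest)
        if existing is not None:
            return existing
        self._seen[digest] = asset_id
        return None


gate = IngestGate()
first = gate.check("press-001", b"example image bytes")   # new file
second = gate.check("press-002", b"example image bytes")  # same bytes again
```

In this example `first` is `None` and `second` is `"press-001"`, which tells the uploader that the file is already in the archive. A gate like this costs almost nothing to run and can sit in front of a perceptual-hash pass that handles the near-duplicates.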

The Digitalisierungsstrategie review scheduled for autumn 2026 is expected to address image asset management standards explicitly for the first time. Institutions that want to avoid mandatory remediation orders — and the associated costs — would be wise to begin their own audits before that review concludes. A redundant image is cheap to delete. A redundant image multiplied across five years of ingestion is a budget problem with a paper trail.

About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
