The Daily Berlin

Berlin's Duplicate Image Problem: The Numbers Exposing a Crisis in the City's Digital Archives

New data reveals how thousands of redundant image files are quietly draining storage budgets and slowing down the public institutions meant to serve Berlin's 3.8 million residents.

By Berlin News Desk · Published 4 July 2026, 9:22 pm

Photo: Paul Eltzbacher / Public domain (Wikimedia Commons)

Berlin's public sector is sitting on a digital mess. An internal audit completed in June 2026 by the Senatsverwaltung für Inneres und Digitales found that roughly 34 percent of all image files stored across the city's 78 networked government servers are exact or near-exact duplicates. That wasted data occupies an estimated 4.2 terabytes of publicly funded cloud and on-premises storage.

The figure matters because Berlin is in the middle of a €180 million digital infrastructure overhaul under the Digitalprogramm Berlin 2024-2028, a five-year modernisation drive meant to bring everything from building permit applications to BVG trip-planning tools onto unified platforms. Redundant files slow indexing, inflate the fees paid to storage providers, and create version-control headaches that delay the very services the programme is designed to improve. The audit lands at an awkward moment, just as the SPD-led coalition has been defending the programme's price tag against budget scrutiny in the Abgeordnetenhaus.

Where the Problem Concentrates

The duplication rate is not evenly spread. The worst offenders are the district-level Bürgerämter, particularly those in Mitte and Friedrichshain-Kreuzberg, where decentralised upload workflows mean the same press-release photograph or building-inspection image can be saved independently by multiple staff members with no automated deduplication check. The Stadtentwicklungsbehörde's database of planning and zoning images, housed at its Württembergische Straße offices in Wilmersdorf, showed a duplication rate of 41 percent across approximately 120,000 files audited between January and May 2026.
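For readers wondering what an "automated deduplication check" might involve for exact copies, the following is a minimal sketch rather than anything described in the audit. It assumes the images sit in an ordinary directory tree and that two files count as exact duplicates when their SHA-256 content hashes match. The function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's contents in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root by content hash; return groups with 2+ members."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of_file(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

A check like this catches the same photograph saved under different file names, but not a resized or recompressed copy, because any change to the bytes changes the hash.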

The Zentralbibliothek at the Stadtbibliothek Berlin on Breite Straße in Mitte operates its own image repository for digitised historical collections. Librarians there have been running a semi-manual deduplication process since 2023 using open-source tooling, and their duplication rate sits at under 8 percent — a benchmark the audit explicitly cited as evidence that the problem is solvable with consistent process rather than expensive proprietary software.

Storage is not cheap, even at government contract rates. Berlin's primary vendor agreement, renegotiated in March 2025 with a consortium including Telekom Deutschland, prices object storage at roughly €0.023 per gigabyte per month. The 4.2 terabytes of redundant image data therefore cost the city roughly €97 per month in pure storage fees (4,200 gigabytes at €0.023 each). That is small in isolation, but the audit projects the duplicate stock will grow to 9 terabytes by 2028 without intervention, pushing the recurring cost past €200 monthly and complicating backup cycles that already run four hours longer than the city's own 90-minute target.
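The storage arithmetic can be checked in a few lines. This sketch uses only the contract price and volumes quoted above, and it assumes decimal units (1 terabyte = 1,000 gigabytes); the function name is illustrative.

```python
PRICE_EUR_PER_GB_MONTH = 0.023  # contract rate quoted in the article
GB_PER_TB = 1_000               # assumption: decimal terabytes


def monthly_storage_cost_eur(terabytes: float) -> float:
    """Monthly object-storage fee in euros for a given volume."""
    return terabytes * GB_PER_TB * PRICE_EUR_PER_GB_MONTH


current = monthly_storage_cost_eur(4.2)    # about 96.60 euros per month
projected = monthly_storage_cost_eur(9.0)  # about 207 euros per month

print(f"Today: €{current:.2f}/month, 2028 projection: €{projected:.2f}/month")
```

On those assumptions the current duplicates cost about €96.60 a month and the projected 9 terabytes about €207, which matches the "past €200" figure.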

What a Fix Actually Looks Like

The audit recommends three steps. The first is to deploy perceptual hashing algorithms across all 78 servers by the end of Q3 2026. This is software that detects near-identical images even when file names, formats, or resolutions differ. The second is to establish a single master asset repository under the umbrella of CityLAB Berlin, the innovation lab on Platz der Luftbrücke in Tempelhof that already coordinates some cross-departmental digital projects. The third is to introduce mandatory metadata standards for any image uploaded after January 2027, requiring a unique identifier tied to the originating department and upload date.
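The audit does not say which perceptual hashing algorithm it has in mind. As one illustration, the sketch below implements a simple "difference hash" (dHash), a common technique of this kind. It assumes each image has already been decoded into a grid of grayscale values, uses crude nearest-neighbour shrinking where a production tool would use an imaging library, and picks an arbitrary, hypothetical similarity threshold.

```python
def shrink(pixels: list[list[int]], width: int, height: int) -> list[list[int]]:
    """Nearest-neighbour downsample of a grayscale grid to width x height."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def dhash(pixels: list[list[int]], hash_size: int = 8) -> int:
    """Difference hash: one bit per adjacent pixel pair (left brighter than right)."""
    small = shrink(pixels, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: list[list[int]], b: list[list[int]], threshold: int = 10) -> bool:
    """Hypothetical rule: hashes within `threshold` bits count as near-duplicates."""
    return hamming_distance(dhash(a), dhash(b)) <= threshold


# Synthetic test images: a left-to-right gradient, the same picture at
# double resolution, and a mirrored version that is visually different.
original = [[x * 2 for x in range(72)] for _ in range(64)]
resized = [[x for x in range(144)] for _ in range(128)]
mirrored = [list(reversed(row)) for row in original]

print(hamming_distance(dhash(original), dhash(resized)))   # 0: same picture
print(hamming_distance(dhash(original), dhash(mirrored)))  # 64: every bit differs
```

The point of the design is that the hash depends on the picture's brightness pattern and not on its bytes, so a resized copy lands on the same or a nearby hash while a genuinely different image does not.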

The CityLAB has run pilot deduplication sprints before. In autumn 2024 it tested a batch-processing tool across the Umweltatlas Berlin image set and reduced that collection's duplicate count by 62 percent in six weeks. Scaling that approach city-wide is the operational challenge: the audit estimates 1,400 staff hours of supervised migration work, costing approximately €84,000 at current public-sector hourly rates, against projected three-year savings of €310,000 in storage and IT administration costs.
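The audit's cost-benefit figures can be worked through the same way. The sketch below uses only the numbers quoted above and adds one assumption of its own: that the projected savings accrue evenly over the 36 months, which the audit does not state.

```python
STAFF_HOURS = 1_400               # supervised migration work
MIGRATION_COST_EUR = 84_000       # one-off cost quoted in the audit
THREE_YEAR_SAVINGS_EUR = 310_000  # storage plus IT administration

implied_hourly_rate = MIGRATION_COST_EUR / STAFF_HOURS
net_benefit = THREE_YEAR_SAVINGS_EUR - MIGRATION_COST_EUR

# Assumption: savings are spread evenly across 36 months.
monthly_savings = THREE_YEAR_SAVINGS_EUR / 36
payback_months = MIGRATION_COST_EUR / monthly_savings

print(f"Implied rate: €{implied_hourly_rate:.0f}/hour")
print(f"Net three-year benefit: €{net_benefit:,}")
print(f"Payback period: {payback_months:.1f} months")
```

On those figures the implied rate is €60 per staff hour, the net benefit over three years is €226,000, and the migration would pay for itself in a little under ten months.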

Departments have until 15 September 2026 to submit implementation plans to the Senatsverwaltung. Those that miss the deadline face potential budget clawbacks under the Digitalprogramm's compliance framework — a clause that, according to the audit document itself, has never yet been enforced. Whether it will be this time is the question administrators on Württembergische Straße are now quietly asking each other.

About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
