The Daily Berlin

Berlin's Duplicate Image Problem: The Numbers Buried Inside the City's Digital Archives

Across Berlin's public institutions, tens of thousands of redundant image files are clogging storage systems, costing taxpayers money and slowing the digitisation projects the city has staked its smart-city reputation on.

By Berlin News Desk · Published 4 July 2026, 9:06 pm

Photo: Markus Spiske on Pexels

Berlin's Senate Department for Digital Transformation is sitting on a problem measured not in megabytes but in millions of them. An internal audit completed in May 2026 found that the city's central media asset management system — shared by departments including urban development, transport, and press — contained an estimated 340,000 duplicate image files, accounting for roughly 28 percent of total digital storage consumption across the shared infrastructure. That figure comes from a procurement document circulated ahead of a contract renewal with Berlin-based IT services firm DataBerlin GmbH in June.

The timing matters. The city is in the second year of its Berliner Digitalisierungsstrategie 2025–2030, a programme committing €1.2 billion over five years to modernise public administration. Storage inefficiency of this scale directly undermines the bandwidth targets the strategy depends on, and it feeds into a broader debate about how much digital waste the public sector is willing to tolerate before someone is held accountable for the bill.

What the Numbers Actually Show

The audit found three distinct categories driving the duplication. First, departments were uploading the same press-release photographs independently to the shared server, a practice common at the Rotes Rathaus, where communications teams across separate senatorial departments operate with limited coordination. Second, automated backup loops were generating visually identical copies at different compression rates, which the system logged as separate assets. Third, a legacy migration from an older system decommissioned in 2023 pushed roughly 80,000 images into the new environment without deduplication checks.
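
For the first and third categories, where the same file is stored more than once, a deduplication check can be as simple as grouping files by a hash of their contents. The following is a minimal sketch of that idea, not the audit's or ITDZ Berlin's method; the function name and the example file paths are hypothetical. It would not catch the second category, because recompressed copies differ at the byte level.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def group_exact_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Group file names whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for name, content in files.items():
        # Identical bytes always produce the same SHA-256 digest,
        # regardless of file name or upload location.
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    # Keep only digests shared by more than one file.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical uploads: two departments store the same photo under different names.
uploads = {
    "press/mayor_visit.jpg": b"\xff\xd8 same image bytes",
    "transport/IMG_0042.jpg": b"\xff\xd8 same image bytes",
    "urban/site_plan.jpg": b"\xff\xd8 different image bytes",
}
duplicate_groups = group_exact_duplicates(uploads)
print(duplicate_groups)
```

Running a check like this before a migration would flag the repeated uploads while leaving unique files untouched.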

Storage costs are not trivial. Berlin's municipal cloud contract, managed through ITDZ Berlin — the city's own IT service provider based in Mitte — runs at approximately €4.80 per gigabyte per month for active-tier storage. Independent analysis of the procurement document suggests the duplicate image load alone may be generating unnecessary expenditure in the low six figures annually. That is not a catastrophic sum against a billion-euro digitisation budget, but it is the kind of recurring, preventable waste that tends to compound across a government estate of Berlin's size.
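
A rough sketch of the arithmetic behind an estimate of that kind follows. The per-gigabyte price and the duplicate count come from the figures above; the average file sizes of 5, 10 and 15 MB are illustrative assumptions, since the audit's actual size distribution is not given, and the calculation assumes decimal gigabytes.

```python
PRICE_EUR_PER_GB_MONTH = 4.80  # active-tier price quoted in the article
DUPLICATE_FILES = 340_000      # duplicate count from the audit


def annual_cost_eur(avg_file_mb: float) -> float:
    """Yearly storage cost of the duplicates for an ASSUMED average file size."""
    total_gb = DUPLICATE_FILES * avg_file_mb / 1000  # decimal GB (assumption)
    return total_gb * PRICE_EUR_PER_GB_MONTH * 12


# The average sizes below are assumptions for illustration only.
for avg_mb in (5, 10, 15):
    print(f"{avg_mb} MB average: about EUR {annual_cost_eur(avg_mb):,.0f} per year")
```

Under these assumptions, average sizes of 5 to 15 MB per file put the annual cost between roughly €98,000 and €294,000, which is consistent with a low-six-figure estimate.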

The pattern is not unique to the Senate's media archive. Bezirksamt Friedrichshain-Kreuzberg flagged a comparable issue in its own planning portal in late 2025, where construction project images uploaded via the BaustellenInfo platform were being stored in triplicate. The district's IT team estimated that deduplication in that portal alone would recover around 1.2 terabytes of space — modest in isolation, but across twelve Berlin districts running similar platforms, the aggregate is material.

Fixing It: Tools, Timelines and What Comes Next

ITDZ Berlin has been piloting a deduplication engine since March 2026, integrated into the city's document management system d.3ecm, which is already deployed at several Senate departments. The tool uses perceptual hashing, a technique that identifies visually identical images even when file names, metadata, or compression settings differ, and flags duplicates for human review rather than deleting them automatically. That last detail matters politically: after a 2024 incident in which a routine clean-up script deleted archival photographs of the 2002 Elbe floods from a Senatsverwaltung server, any automated deletion now requires sign-off from a named administrator.
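
To illustrate the general idea of perceptual hashing, here is a minimal sketch of one simple variant, the average hash. It is not the algorithm used in the ITDZ pilot, which the article does not specify. Images are represented as plain grids of grayscale values, and the function names, the 8x8 hash size and the distance threshold of 5 are all assumptions chosen for the example.

```python
from typing import List

Gray = List[List[int]]  # grayscale image: rows of 0-255 pixel values


def shrink(img: Gray, size: int = 8) -> List[float]:
    """Downscale to size x size cells by averaging rectangular pixel blocks.

    Assumes the image is at least size x size pixels.
    """
    h, w = len(img), len(img[0])
    cells = []
    for r in range(size):
        r0, r1 = r * h // size, (r + 1) * h // size
        for c in range(size):
            c0, c1 = c * w // size, (c + 1) * w // size
            block = [img[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    return cells


def average_hash(img: Gray, size: int = 8) -> int:
    """One bit per cell: 1 if the cell is brighter than the mean of all cells."""
    cells = shrink(img, size)
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def likely_duplicates(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Flag a pair for human review; the threshold is an assumed value."""
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Synthetic 16x16 test images (hypothetical data, not from the audit).
original = [[x * 8 + y * 8 for x in range(16)] for y in range(16)]
# Simulate recompression: small per-pixel changes, same visual content.
recompressed = [[min(255, v + (x + y) % 3) for x, v in enumerate(row)]
                for y, row in enumerate(original)]
# A visually different image: the inverted gradient.
inverted = [[255 - v for v in row] for row in original]

print(likely_duplicates(original, recompressed))
print(likely_duplicates(original, inverted))
```

Because the hash depends on coarse brightness patterns rather than exact bytes, the slightly altered copy is flagged as a likely duplicate while the inverted image is not. A pair that is flagged would still go to a reviewer, in line with the sign-off rule described above.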

The current pilot is scheduled to run through September 2026, with a full rollout decision expected at the ITDZ supervisory board meeting in October. If approved, the city estimates it could clear 60 to 70 percent of identified duplicates within 18 months.

For the city's growing tech sector — particularly the startups clustered around Factory Berlin on Rheinsberger Straße and the co-working spaces along Torstraße in Mitte — the bureaucratic drag created by bloated public databases is a practical concern. Companies that integrate with city APIs for planning data, event permits, or transport information report slower response times when those systems are burdened with redundant assets. Cleaner archives mean faster queries, which means faster products built on public data.

The October board meeting is the date to watch. If ITDZ Berlin moves forward, the city will need to tender a deduplication service contract by early 2027 to meet the Digitalisierungsstrategie milestone targets. Departments are being advised to freeze new bulk uploads to the shared media system in the interim — a small discipline that, the audit suggests, could prevent the problem from growing by another 15 percent before the fix arrives.

About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
