The Daily Berlin

Berlin's Duplicate Image Problem: The Numbers Revealing a Hidden Crisis in the City's Digital Archives

From Senate databases to BVG passenger portals, redundant and duplicated image files are costing Berlin institutions millions in storage and staff hours — and the data is only now coming into focus.

By Berlin News Desk · Published 4 July 2026, 9:06 pm

Photo: Dario Rawert on Pexels

Berlin's public-sector digital archives are bloated with duplicate images, and the scale of the problem is larger than most administrators will admit. Across municipal databases, from the Senatsverwaltung für Stadtentwicklung's planning portals to the BVG's internal asset management systems, redundant image files account for an estimated 30 to 40 percent of total stored visual content. The estimate is derived from a methodology outlined in a 2025 audit framework published by the Fraunhofer FOKUS institute in Berlin-Charlottenburg. Applied to the sheer volume of data held by city agencies, that share translates into hundreds of terabytes of unnecessary storage and recurring annual costs that no single department has yet been asked to publicly justify.

The timing matters. Berlin's governing coalition of CDU and SPD is under pressure to trim administrative overhead while simultaneously funding housing programmes, Energiewende infrastructure upgrades, and the BVG's ongoing network investment. Digital waste is unglamorous, invisible and politically easy to ignore, which makes it exactly the kind of cost that survives budget cycles unchallenged. Duplicate image replacement is the systematic process of identifying redundant files, consolidating them, and substituting a single canonical version across every reference point. It has become standard practice in the private sector but remains patchy across Berlin's public institutions.

What the Data Actually Shows

The numbers are instructive. A 2024 benchmark study by Bitkom, the German digital industry association headquartered in Berlin-Mitte, found that organisations with more than 500 employees and no automated deduplication policy carry an average image duplication rate of 34 percent across their content management systems. For a city the size of Berlin — which employs roughly 130,000 civil servants and maintains dozens of separate content platforms — the aggregate storage liability is substantial. Cloud storage priced at current Berlin municipal contract rates runs to approximately €0.023 per gigabyte per month for tier-one archival access. At that rate, eliminating a conservative 200 terabytes of redundant image data across city systems would save roughly €55,000 annually — before factoring in staff time spent retrieving, re-uploading, and misidentifying duplicate assets.
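The savings estimate can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 1,000 GB) and a flat rate with no volume discounts or retrieval fees. Neither assumption is stated alongside the quoted figures, and the function name is purely illustrative.

```python
# Figures quoted in the article; the decimal TB-to-GB conversion is an assumption.
RATE_EUR_PER_GB_MONTH = 0.023   # quoted contract rate per gigabyte per month
REDUNDANT_TB = 200              # quoted conservative volume of redundant image data
GB_PER_TB = 1_000               # assumption: decimal units, not 1,024


def annual_storage_cost(terabytes, rate_per_gb_month=RATE_EUR_PER_GB_MONTH):
    """Return the yearly cost in euros of keeping `terabytes` in storage."""
    gigabytes = terabytes * GB_PER_TB
    return gigabytes * rate_per_gb_month * 12


saving = annual_storage_cost(REDUNDANT_TB)
print(f"Estimated annual saving: EUR {saving:,.0f}")
```

Under these assumptions the result is about €55,200 a year, in line with the rounded figure quoted above. Using binary units (1 TB = 1,024 GB) would raise it slightly.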

The Technologiestiftung Berlin, based on Gürtelstraße in Friedrichshain, has been pushing municipal departments to adopt open-source deduplication tooling since at least 2023. Their Digital Public Infrastructure programme specifically flagged image asset management as a low-cost, high-return intervention. The Berlin Open Data portal, which publishes datasets from across Senatsverwaltungen, currently lists more than 3,200 datasets — many of which include associated image galleries that have never been audited for redundancy.

The Practical Mechanics — and What Comes Next

Duplicate image replacement is not a glamorous process. It works by running perceptual hashing algorithms across image libraries. These tools detect near-identical files even when filenames or metadata differ, then flag pairs or clusters for review. A human editor, or increasingly an automated workflow, confirms the match, selects the canonical version, replaces all downstream references, and deletes the surplus files. The Berlin-based startup Metagrid, which operates partly out of the Factory Berlin campus on Rheinsberger Straße in Mitte, has developed one such pipeline tailored to German-language public-sector content management systems. It integrates directly with TYPO3, the CMS used by a majority of Berlin's district and Senate websites.
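To make the mechanics concrete, here is a minimal sketch of the hash, flag and replace steps described above. It is not Metagrid's pipeline or any vendor's implementation. It uses a difference hash (dHash), one common perceptual hash, on plain grayscale pixel grids so that it runs without an imaging library. The file names, the distance threshold of 5 bits, and the rule that the largest image becomes the canonical version are all assumptions chosen for illustration.

```python
from itertools import combinations


def shrink(pixels, width, height):
    """Downscale a grayscale grid (list of rows, values 0-255) by box averaging."""
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for y in range(height):
        y0 = y * src_h // height
        y1 = max((y + 1) * src_h // height, y0 + 1)
        row = []
        for x in range(width):
            x0 = x * src_w // width
            x1 = max((x + 1) * src_w // width, x0 + 1)
            block = [pixels[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(pixels, size=8):
    """Difference hash: one bit per horizontally adjacent pair in a small thumbnail."""
    bits = 0
    for row in shrink(pixels, size + 1, size):
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (left > right)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def find_near_duplicates(images, max_distance=5):
    """Flag pairs whose hashes differ by at most max_distance bits (assumed threshold)."""
    hashes = {name: dhash(px) for name, px in images.items()}
    return [(a, b) for a, b in combinations(sorted(hashes), 2)
            if hamming(hashes[a], hashes[b]) <= max_distance]


def canonical_map(pairs, images):
    """Map each surplus file to a canonical one; the largest grid wins (an assumption)."""
    parent = {}

    def find(name):
        while parent.get(name, name) != name:
            name = parent[name]
        return name

    def area(name):
        return len(images[name]) * len(images[name][0])

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            keep, drop = (ra, rb) if area(ra) >= area(rb) else (rb, ra)
            parent[drop] = keep
    return {name: find(name) for name in parent}


# Synthetic test images (hypothetical file names): a gradient, a brighter copy,
# a half-size copy, and a mirrored gradient that should not match the others.
original = [[x * 3 + y for x in range(64)] for y in range(64)]
brighter = [[v + 3 for v in row] for row in original]
thumb = [[x * 6 + y * 2 for x in range(32)] for y in range(32)]
mirrored = [[(63 - x) * 3 + y for x in range(64)] for y in range(64)]

images = {
    "rathaus.png": original,
    "rathaus_copy_brighter.png": brighter,
    "rathaus_thumb.png": thumb,
    "other.png": mirrored,
}

pairs = find_near_duplicates(images)
mapping = canonical_map(pairs, images)

# Replace downstream references (hypothetical page paths) with the canonical file.
page_refs = {"/stadtplanung": "rathaus_thumb.png", "/presse": "other.png"}
updated_refs = {page: mapping.get(name, name) for page, name in page_refs.items()}
print(pairs)
print(updated_refs)
```

In practice the flagged pairs would go to a reviewer before anything is deleted, and a production system would read real image files and compare hashes with an index rather than checking every pair.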

The BVG, whose digital overhaul has accelerated since the launch of its 2024-2028 investment plan, is understood to be piloting deduplication workflows across its media asset library, which stores everything from promotional photography to real-time service alert graphics. No completion date has been publicly announced.

For Berlin's tech and startup community — which has long positioned itself as a laboratory for civic innovation — the duplicate image question is both a mundane administrative problem and a proof-of-concept opportunity. Institutions that implement systematic deduplication before the end of 2026 will have quantifiable data ready for the next budget cycle. Those that do not will keep paying for the same photograph, stored in seventeen different folders, indefinitely.

About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
