The Daily Berlin

Berlin news, every day


Berlin's Digital Archives Strike Back: City Launches Crackdown on Duplicate Image Chaos This Week

A push to clean up Berlin's sprawling public digital infrastructure is exposing just how badly repeated and redundant imagery has cluttered government databases, news platforms, and the city's growing tech sector.

By Berlin News Desk · Published 4 July 2026, 8:58 pm

3 min read

Photo: Viviana Ceballos on Pexels

Berlin's Senate Department for Digital Development and Work confirmed this week that a formal review of duplicate image data across city-administered databases is now underway, targeting everything from housing portal listings to public transport route maps published by the BVG. The cleanup effort, which began in earnest on June 30, follows months of complaints from city contractors and civic tech developers about redundant files slowing down public-facing digital services.

The timing is not arbitrary. Berlin's governing coalition of CDU and SPD has staked a significant portion of its digital governance agenda on making the city's data infrastructure fit for the next phase of the Smart City Berlin strategy. Duplicate images, meaning identical or near-identical files stored multiple times across disconnected systems, have emerged as a concrete, measurable drag on that ambition. For a city positioning itself as a European tech hub, with startup clusters in Mitte and Prenzlauer Berg, the problem carries real reputational weight.

Where the Problem Is Concentrated

The worst bottlenecks, according to the Senate's internal review documents circulated this week, sit inside the Berlin Open Data portal at daten.berlin.de and the digital inventory systems used by Stadtentwicklung, the urban development arm responsible for housing permit records. Both platforms grew rapidly during the pandemic-era digitisation push of 2020–2023 and were fed image assets from dozens of separate agencies without any centralised deduplication protocol. As a result, some building permit files contain the same facade photograph stored up to eleven times under different filenames.
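For readers wondering how the same photograph can be detected under different filenames, one common approach is to compare file contents rather than names. The Python sketch below is illustrative only and is not the method used by the Senate or its consultants, which the article does not describe. The function names and the directory layout are assumptions. It groups files by a SHA-256 hash of their bytes, so it only catches byte-identical copies, not near-identical images that have been resized or re-encoded.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's bytes in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under root by content hash; keep groups with 2+ files."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    # Filenames are ignored entirely: only identical bytes count as a match.
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

Calling `find_duplicates(Path("some/image/folder"))` on a hypothetical folder returns one entry per set of identical files. Catching near-identical images would need a different technique, such as perceptual hashing.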

The BVG's digital communications team has also been flagged. The public transport operator, which handles roughly 1.1 billion passenger journeys per year across its U-Bahn, tram, and bus network, maintains a media library for route visualisations and station imagery that auditors found to contain substantial redundancy, particularly around major interchange hubs like Alexanderplatz and Ostbahnhof.

Berlin-based startup Metatagger GmbH, headquartered in a co-working facility on Oranienstraße in Kreuzberg, has been brought in as a technical consultant on the deduplication process. The firm specialises in automated metadata cleaning for public sector clients and has previously worked on similar projects in Hamburg and Leipzig.

What the Data Shows — and What Comes Next

The scale is significant. Preliminary figures from the Senate review, shared with the Digital Advisory Council on July 2, indicate that removable duplicate image files account for an estimated 340 terabytes of redundant storage across the city's top-tier government platforms. At current cloud storage contract rates (Berlin's primary public cloud deal runs at roughly €0.023 per gigabyte per month), retaining that redundant data costs roughly €94,000 a year, just short of six figures.
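As a rough check on the storage bill, the quoted figures can be multiplied out directly. The short Python sketch below uses only the two numbers given above (340 terabytes and €0.023 per gigabyte per month). The conversion factor between terabytes and gigabytes is an assumption, since the review figures do not say whether decimal or binary units are meant, so both are shown.

```python
REDUNDANT_TB = 340          # estimated redundant storage, from the Senate review
EUR_PER_GB_MONTH = 0.023    # quoted contract rate per gigabyte per month


def annual_storage_cost_eur(terabytes: float, eur_per_gb_month: float,
                            gb_per_tb: int = 1000) -> float:
    """Yearly cost of keeping `terabytes` of data at a flat per-GB monthly rate."""
    return terabytes * gb_per_tb * eur_per_gb_month * 12


# Assumption: decimal units (1 TB = 1000 GB), as cloud providers usually bill.
decimal_cost = annual_storage_cost_eur(REDUNDANT_TB, EUR_PER_GB_MONTH)

# Assumption: binary units (1 TB = 1024 GB), for comparison.
binary_cost = annual_storage_cost_eur(REDUNDANT_TB, EUR_PER_GB_MONTH, gb_per_tb=1024)

print(f"decimal: about EUR {decimal_cost:,.0f} per year")
print(f"binary:  about EUR {binary_cost:,.0f} per year")
```

Under either assumption the result lands between roughly €94,000 and €96,000 a year, before any discounts, retrieval fees, or backup copies, none of which are covered by the quoted rate.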

Beyond pure cost, the performance argument is gaining traction among developers. Civic tech groups such as Code for Berlin, which holds regular open-data meetups at venues including the co-working space Supermarkt on Eberswalder Straße in Prenzlauer Berg, have long warned that bloated image libraries slow API response times on public datasets, making third-party app development for Berlin services more difficult than it needs to be.

The Senate has set a completion target of October 31 for the first phase of deduplication, covering the Open Data portal and the Stadtentwicklung permit system. A second phase, encompassing BVG's media library and the city's tourism and events databases managed through visitBerlin, is scheduled to follow in early 2027.

For residents and developers who rely on these platforms, the practical benefits will arrive gradually. Faster load times on housing portal listings, a particularly sore point for Berliners navigating the city's chronic rental shortage, may be noticeable by autumn. Developers building on the Open Data API are being advised to re-index their data pulls after October 31, when file identifiers will be reassigned as part of the cleanup. The Senate Department for Digital Development and Work says updated technical documentation will be published to daten.berlin.de no later than two weeks before the switchover.


About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
