The Daily Berlin

Berlin news, every day


Berlin's Digital Archives Race to Fix a Hidden Data Crisis: Duplicate Images Clogging City Systems

A wave of redundant photo files has slowed public databases and internal platforms across Berlin's municipal network, prompting an emergency cleanup effort this week.

By Berlin News Desk · Published 4 July 2026, 8:35 pm



Berlin's Senate Department for Digital Transformation confirmed this week that a system-wide audit of the city's public-facing image databases had uncovered tens of thousands of duplicate files spread across municipal platforms, including the Stadtportal berlin.de and the internal document management system used by Bezirksamt offices across all twelve boroughs. The problem, which administrators say built up over at least three years of uncoordinated uploads, is now the subject of an accelerated remediation project with a completion target set for the end of September 2026.

The timing matters. Berlin has been pushing hard to digitise planning documents, housing permit applications, and transport infrastructure records as part of the Digitalisierungsstrategie 2025-2030, the city's overarching technology roadmap. Clogged image libraries slow search indexing, inflate cloud storage costs, and create version-control confusion for staff — particularly in departments handling sensitive materials like building inspections in Mitte and zoning records for the growing Nordhafen development corridor in Wedding.

How the Problem Grew

The duplication issue traces back partly to the rushed expansion of remote-work infrastructure during 2022 and 2023, when multiple departments began uploading scanned documents and photographs to shared drives without a unified naming or tagging convention. Staff at the Stadtentwicklungsamt — the urban development office — were uploading site photographs of construction projects in Lichtenberg and Tempelhof-Schöneberg simultaneously through at least two separate portals, neither of which had deduplication software enabled. By early 2026, some folders contained four or five copies of the same image file under slightly different filenames, a pattern the audit found repeated across at least 34 departmental accounts.

The Berlin-based civic tech organisation CityLab Berlin, which has collaborated with the Senate on open-data initiatives since 2018, has been brought in to advise on metadata standards that would prevent recurrence. The organisation operates out of the Futurium building on Alexanderufer and has previously worked on the Berliner Open Data Handbuch, a guide used by public bodies to structure datasets for citizen access. Their involvement signals that the Senate wants a structural fix, not just a one-time file purge.

What the Cleanup Involves

The remediation project, which formally began on 30 June, combines perceptual hashing software (tools that identify visually identical or near-identical images even when file names differ) with manual review of flagged edge cases. According to the project's published scope document, the first phase covers roughly 1.2 million image files stored on servers managed by ITDZ Berlin, the state-owned IT service provider and digital infrastructure company based on Müllerstraße in Wedding. A second phase, starting in August, will extend the audit to images embedded in PDF planning documents submitted through the online portal for Bauanträge (building permit applications).
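For readers curious how perceptual hashing spots copies that differ in filename or brightness, here is a minimal sketch of one simple variant, the "average hash". The project has not said which tool or algorithm it uses, so this is only an illustration: the function names, the 8x8 grid size, and the threshold of 5 differing bits are assumptions, and images are represented as plain lists of grayscale values rather than real files.

```python
from typing import List

Grid = List[List[int]]  # rows of grayscale pixel values, 0-255 (illustrative format)


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Shrink the image to size x size by block averaging, then emit one bit
    per cell: 1 if the cell is brighter than the mean of all cells, else 0.
    Assumes the image is at least size pixels wide and tall."""
    height, width = len(pixels), len(pixels[0])
    cells = []
    for row in range(size):
        y0, y1 = row * height // size, (row + 1) * height // size
        for col in range(size):
            x0, x1 = col * width // size, (col + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, threshold: int = 5) -> bool:
    """Treat two images as near-duplicates if their hashes differ in at most
    `threshold` bits. The threshold is an assumed value, not a project setting."""
    return hamming_distance(average_hash(a), average_hash(b)) <= threshold


# Synthetic 16x16 test images: a gradient, a slightly brightened copy,
# and an inverted version that should not match.
original = [[(x * 16 + y) % 256 for x in range(16)] for y in range(16)]
brighter = [[min(255, v + 10) for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

print(is_near_duplicate(original, brighter))
print(is_near_duplicate(original, inverted))
```

Because the hash depends only on which regions are brighter than average, a re-saved or lightly adjusted copy lands on the same or a nearby hash, while a genuinely different picture does not. Borderline scores are the kind of case a manual review step would catch.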

Storage is not a trivial concern for a city government running on constrained budgets. ITDZ Berlin's publicly available annual report for 2024 listed cloud and server storage costs as one of the fastest-growing line items in municipal IT expenditure, with demand doubling between 2021 and 2024. Eliminating redundant files is expected to reduce active storage load by an estimated 15 to 20 percent on the affected databases, though that figure comes from the project scope document and has not yet been independently verified.

For residents and journalists who rely on berlin.de for access to public planning documents — particularly those tracking contentious developments in neighbourhoods like Neukölln and Friedrichshain — the cleanup should eventually improve search reliability. Documents that currently return multiple near-identical image results in portal searches will instead surface single, properly catalogued files with consistent metadata tags. The Senate's digital team has said the portal will display a maintenance notice on affected search functions during the August phase, which is expected to cause intermittent slowdowns on weekday mornings. Anyone filing a time-sensitive Bauantrag is advised to use the in-person service points at the relevant Bezirksamt rather than the online portal until at least mid-September.


About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
