Berlin's Duplicate Image Problem: The Numbers Driving a City-Wide Digital Cleanup
Redundant photos are clogging Berlin's public databases and costing taxpayers real money — and a new audit is finally putting figures to the mess.

Berlin's public digital archives contain hundreds of thousands of duplicate images — identical or near-identical photographs stored multiple times across municipal servers — and a coordinated effort to strip them out is now underway across at least six Senate departments. The scale of the problem, documented in an internal IT efficiency review completed in June 2026, points to years of uncoordinated data management inside a city bureaucracy that added storage capacity faster than it added discipline.
The timing matters. Berlin's Senate Department for Finance approved a revised digitalisation budget of roughly €340 million for the 2026–2027 fiscal period, with cloud migration forming the largest single line item. Every redundant gigabyte migrated to cloud infrastructure carries a recurring cost. Duplicate image files — which in large municipal archives can account for between 15 and 30 percent of total image storage, according to industry benchmarks cited by the German federal IT standards body KBSt — translate directly into avoidable expenditure.
The worst backlogs, according to the June review, sit inside two agencies. The Berlin Senate Department for Urban Development and Housing, based on Württembergische Straße in Wilmersdorf, manages planning application image libraries that stretch back to 2009. Scanning of paper files was outsourced across three separate contracts over that period, producing overlapping uploads with no deduplication step built into the workflow. The Berliner Immobilienmanagement GmbH, the city-owned property management company headquartered near Alexanderplatz, faces a comparable situation in its building documentation records.
The BVG, Berlin's public transport operator, has its own version of the problem. The authority's infrastructure documentation unit, which photographs U-Bahn stations and surface track as part of routine maintenance logging, had accumulated an estimated 2.1 terabytes of duplicate image data by the end of 2025, according to figures presented at a BVG digital infrastructure working group meeting in March 2026. Under the Azure storage pricing in the BVG's Microsoft contract, that volume carries a recurring annual cost that a one-time deduplication pass could eliminate.
The city's Kulturprojekte Berlin, which manages digital assets for events including the Berliner Festspiele and the Lange Nacht der Museen, ran its own smaller audit in late 2025 and found that roughly one in five images in its public-facing press archive existed in at least two copies, often uploaded separately by different staff members following the same event.
Deduplication is not free. The standard technical approach uses perceptual hashing, which generates a compact fingerprint for each image so that near-matches can be flagged by comparing fingerprints, and it requires both computational time and human review for borderline cases. Vendors pitching Berlin's IT procurement unit, the ITDZ Berlin on Berliner Straße in Charlottenburg, have quoted project costs ranging from €80,000 to €220,000 depending on archive size and the required accuracy threshold. The Senate's IT steering committee is expected to approve a procurement framework for the work before the summer recess ends in mid-August.
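For readers curious what perceptual hashing looks like in practice, the following is a minimal sketch of one simple variant, the average hash. It is not the method any Berlin vendor has proposed; the function names, the 8x8 grid and the distance threshold of 5 are illustrative assumptions, and images are represented as plain lists of grayscale values rather than real files.

```python
from typing import List

# Assumption: an image is a grid of grayscale values (0-255),
# at least 8x8 pixels, with rows of equal length.
Image = List[List[int]]


def average_hash(pixels: Image, size: int = 8) -> int:
    """Return a (size*size)-bit fingerprint using a simple average hash."""
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Shrink the image to a size x size grid by averaging each block.
    for gy in range(size):
        y0, y1 = gy * height // size, (gy + 1) * height // size
        for gx in range(size):
            x0, x1 = gx * width // size, (gx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Image, b: Image, max_distance: int = 5) -> bool:
    """Flag a pair as a likely duplicate (hypothetical threshold)."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance
```

Two scans of the same photograph that differ only slightly in brightness should produce fingerprints a few bits apart at most, while unrelated images typically differ in many bits. Pairs that land close to the threshold are the borderline cases that would go to human review.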
The financial argument for pressing ahead is straightforward. If Berlin's municipal image stores mirror the 20 percent duplication rate seen at Kulturprojekte Berlin, and total municipal image storage is currently estimated at around 180 terabytes across all departments, the city is paying to store approximately 36 terabytes of redundant data. At enterprise cloud rates, that is a recurring annual overhead that grows as the archives grow.
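The arithmetic behind that estimate can be sketched in a few lines. The storage total and duplication rates come from the figures above; the price per terabyte is a hypothetical placeholder, not a quoted Azure or city contract rate, and the function names are illustrative.

```python
# Hypothetical placeholder, NOT a real contract or list price.
EUR_PER_TB_MONTH = 20.0


def redundant_terabytes(total_tb: float, duplication_rate: float) -> float:
    """Storage taken up by duplicates, given a rate between 0 and 1."""
    return total_tb * duplication_rate


def annual_cost_eur(redundant_tb: float, eur_per_tb_month: float) -> float:
    """Yearly cost of keeping the redundant data in cloud storage."""
    return redundant_tb * eur_per_tb_month * 12


total_tb = 180.0  # estimated municipal image storage, from the article

# 20 percent is the Kulturprojekte Berlin rate; 15 and 30 percent are
# the ends of the benchmark range cited earlier in the article.
for rate in (0.15, 0.20, 0.30):
    wasted = redundant_terabytes(total_tb, rate)
    cost = annual_cost_eur(wasted, EUR_PER_TB_MONTH)
    print(f"{rate:.0%} duplication: {wasted:.0f} TB redundant, "
          f"about EUR {cost:,.0f} per year at the placeholder rate")
```

Swapping in the actual contract price would turn the placeholder into a real budget figure; the 36-terabyte estimate corresponds to the 20 percent case.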
For residents and businesses dealing with the city's planning portals — particularly anyone who has submitted documents through the online Baugenehmigung system linked to planning offices in Mitte and Pankow — the practical upside of a cleaner archive is faster document retrieval and a reduced chance of version-confusion errors in which the wrong image is pulled from an application file.
The Senate's digitalisation commissioner is due to present a consolidated progress report to the Abgeordnetenhaus in September 2026. Departments have been asked to complete internal image audits by 31 July, a deadline that several agencies are already flagging as ambitious given summer staffing levels. The audit data will then feed into a city-wide deduplication tender, with contract award targeted for the fourth quarter of 2026.
About this article
Published by The Daily Berlin