Berlin's digital infrastructure is drowning in copies of itself. Across municipal databases, property listing platforms, and the city's sprawling network of public transport information systems, duplicate images now account for a significant share of redundant data — and the bill is climbing. Estimates from IT auditors who have reviewed Berlin Senate department archives put duplicated visual assets at somewhere between 30 and 45 percent of total image storage in several agencies, though the figure varies widely depending on how systematically each department has managed uploads over the past decade.
The timing matters. Berlin's Senate, a CDU-SPD coalition since 2023, has been pushing a broader digital modernisation agenda under the Masterplan Digitalisierung, which earmarks funds for infrastructure overhaul across city services. That ambition runs directly into a mundane but expensive obstacle: years of uncoordinated image uploads by dozens of departments, contractors, and third-party platform operators have left servers bloated with visual content that is, in many cases, identical or near-identical to files already stored elsewhere in the same system.
Where the Problem Shows Up Most
The issue is particularly visible — literally — on platforms Berliners use every day. ImmobilienScout24 listings for Mitte and Prenzlauer Berg apartments routinely feature the same floor-plan images uploaded multiple times by landlords and letting agents, a pattern that distorts search indexes and inflates perceived listing counts. Property technology researchers at the Zentrum für Stadt- und Wohnforschung in Berlin have flagged the issue in internal working papers as a factor that complicates accurate rent market analysis — particularly relevant given the ongoing political fight over the city's rent cap mechanisms.
At Berliner Verkehrsbetriebe (BVG), the problem surfaces in the asset management systems used to maintain visual documentation of infrastructure across the U-Bahn network. Engineering and maintenance photography taken at stations from Alexanderplatz to Hermannplatz can be uploaded by multiple contractors working on the same site, creating parallel records of the same physical defect or completed repair. Without deduplication protocols, version control becomes guesswork.
The city's cultural institutions are not immune either. The Stadtmuseum Berlin, which manages digitised collections across several sites including the Ephraim-Palais in the Nikolaiviertel, launched a deduplication audit in early 2025 after discovering that digitisation runs from different years had produced overlapping image sets for the same objects. Staff time spent manually reconciling those records was estimated internally at several hundred person-hours annually — before automated tools were introduced.
What the Data Actually Shows
Hard numbers on duplicate image prevalence are difficult to obtain across the public sector because few agencies report storage metrics in a standardised way. However, the broader digital waste picture is instructive. A 2024 analysis by the Fraunhofer-Institut für Offene Kommunikationssysteme (FOKUS), based in Berlin's Charlottenburg district on Kaiserin-Augusta-Allee, found that unstructured data duplication across German public sector IT environments averages around 40 percent of total stored data volume — and image files are disproportionately represented because they are large, easily re-uploaded, and rarely audited.
Storage costs in enterprise cloud environments of the kind used by Berlin's larger digital service providers run roughly between €0.02 and €0.05 per gigabyte per month, depending on redundancy requirements. For a city department storing tens of terabytes of visual content, the storage fees attributable to unnecessary duplication can run from a few thousand euros to the low tens of thousands annually, before factoring in bandwidth, indexing overhead, or the personnel costs of managing bloated archives.
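The storage arithmetic can be checked with a rough sketch. The per-gigabyte prices and the 30 to 45 percent duplicate share are the figures quoted in this article; the archive sizes of 20 TB and 80 TB, and the function name, are illustrative assumptions rather than figures reported by any department.

```python
def annual_duplicate_cost(total_tb: float, duplicate_share: float,
                          eur_per_gb_month: float) -> float:
    """Yearly storage fees, in euros, attributable to duplicated images."""
    total_gb = total_tb * 1000  # decimal terabytes, as cloud storage is usually billed
    return total_gb * duplicate_share * eur_per_gb_month * 12


# Hypothetical smaller archive: 20 TB, 30% duplicates, cheapest tier
low = annual_duplicate_cost(20, 0.30, 0.02)

# Hypothetical larger archive: 80 TB, 45% duplicates, most redundant tier
high = annual_duplicate_cost(80, 0.45, 0.05)

print(f"Estimated duplicate storage cost: EUR {low:,.0f} to EUR {high:,.0f} per year")
```

Under these assumptions the estimate spans roughly €1,400 to €21,600 a year in storage fees alone, which shows how strongly the result depends on archive size and redundancy tier.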
Automated deduplication tools, typically built on perceptual hashing algorithms that detect visually similar images even when file names or metadata differ, are now standard in commercial digital asset management platforms. Several are already in use at Berlin-based startups in the proptech and media sectors, particularly around Mitte's Factory Berlin campus and along the Torstraße corridor. For public institutions, the obstacles are procurement cycles and data governance rules, both of which slow adoption.
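To illustrate the idea behind perceptual hashing, here is a minimal sketch of one simple variant, the average hash. It is not the method of any particular commercial platform. It assumes images have already been decoded into rows of grayscale values from 0 to 255, and the function names and the distance threshold of 5 bits are illustrative choices.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels (0-255), one inner list per row


def average_hash(pixels: Image, size: int = 8) -> int:
    """Shrink the image to size x size cells by block averaging, then
    emit one bit per cell: 1 if the cell is brighter than the overall mean.
    Assumes the image is at least size pixels wide and high."""
    height, width = len(pixels), len(pixels[0])
    cells = []
    for gy in range(size):
        y0, y1 = gy * height // size, (gy + 1) * height // size
        for gx in range(size):
            x0, x1 = gx * width // size, (gx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Image, b: Image, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their hashes differ in few bits.
    The threshold is an assumption and would need tuning on real data."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance
```

Because the hash depends on relative brightness rather than exact bytes, a re-saved or slightly brightened copy of a photo tends to produce the same or a very close hash, while a different image usually lands many bits away.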
For city departments and private platforms alike, the practical path forward runs through policy as much as technology. Data stewardship roles, mandatory deduplication checks on upload, and cross-departmental image registries are all options being discussed within the Senate's digitalisation working groups. The Masterplan Digitalisierung review scheduled for late 2026 is expected to address storage governance explicitly — which means the window to define what counts as a duplicate, and who is responsible for cleaning it up, is open right now.
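A mandatory check on upload backed by a shared registry could, in its simplest form, look like the sketch below. The `ImageRegistry` class and the department names are hypothetical and do not describe any existing Senate system. The sketch catches only byte-identical files by comparing SHA-256 content hashes; near-identical images would need a perceptual comparison on top.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ImageRegistry:
    """Hypothetical cross-departmental registry: content hash -> first uploader."""
    _owners: Dict[str, str] = field(default_factory=dict)

    def register(self, data: bytes, department: str) -> Optional[str]:
        """Return the department that already holds these exact bytes,
        or record the upload and return None if the content is new."""
        digest = hashlib.sha256(data).hexdigest()
        existing = self._owners.get(digest)
        if existing is not None:
            return existing
        self._owners[digest] = department
        return None


registry = ImageRegistry()
first = registry.register(b"station-photo-bytes", "Department A")   # new content
second = registry.register(b"station-photo-bytes", "Department B")  # same bytes again
print(first, second)
```

Returning the original uploader rather than silently rejecting the file is one way to answer the governance question of who owns a given image; a real deployment would also have to define who may see that information.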