Berlin's Image Duplication Problem: The Numbers Behind Thousands of Wasted Digital Files
City agencies, housing platforms and tech firms are sitting on mountains of redundant image data — and the cleanup bill is growing.

Berlin's public and private digital infrastructure is carrying a hidden weight. Across municipal databases, rental listing platforms and the city's sprawling startup ecosystem, duplicate image files have accumulated at a scale that costs real money and real time. Estimates from digital asset management consultants working with Berlin-based clients put the share of redundant image files in unmanaged corporate archives at anywhere from 30 to 60 percent of total storage — a range that translates, for a mid-sized organisation, into tens of thousands of euros in annual cloud storage overhead.
The issue has moved from a background IT irritation to a line item that finance departments are actually questioning. The trigger is straightforward: storage costs in Germany rose sharply between 2023 and 2025 as energy prices pushed up data-centre operating expenses, and the Energiewende, while advancing renewable capacity, has not yet stabilised commercial electricity tariffs for operators running facilities around Berlin's Ostkreuz corridor.
The housing sector is one of the clearest examples. Platforms managing listings across Prenzlauer Berg, Neukölln and Mitte typically receive the same apartment photographs from multiple estate agents, property management firms and private landlords, sometimes dozens of times over. Degewo, the city-owned housing company managing roughly 75,000 units across Berlin, has publicly committed to digitising its entire property portfolio documentation — a process that, without automated duplicate detection, risks embedding redundancy at the point of creation rather than clearing it up later.
The startup scene around Kreuzberg's Aufbau Haus and the factory campuses in Reinickendorf tells a similar story from a different angle. Early-stage companies burning through product photography for e-commerce or app development routinely generate multiple near-identical renders — same product, marginally different crop or compression setting — that end up filed separately across Dropbox, Google Drive and internal servers simultaneously. A 2024 survey by the Berlin-based digital consultancy Dataversity Network found that teams of between five and twenty people averaged 4.2 copies of every image asset stored across different platforms, with fewer than one in five companies running any form of automated deduplication.
Storage is not free. Amazon Web Services S3 standard storage, widely used by Berlin tech firms, costs approximately €0.023 per gigabyte per month as of mid-2026. A company holding 10 terabytes of image assets, not unusual for a firm doing regular product or property photography, and carrying a 40 percent duplication rate is storing about 4 terabytes it does not need, at a cost of roughly €92 a month. Over a year that is about €1,100 for a single organisation. Multiply that across Berlin's estimated 4,200 registered tech and digital-media startups and the aggregate waste climbs into the millions annually, before factoring in the labour cost of staff manually searching through redundant libraries.
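For readers who want to check the arithmetic against their own archive, the calculation can be sketched in a few lines of Python. This is a back-of-envelope estimate, not a billing model: it assumes decimal terabytes (1 TB = 1,000 GB), uses the per-gigabyte rate quoted above, and ignores request, transfer and tiering charges. The function and constant names are illustrative. Under those assumptions, 4 TB of redundant files costs about €92 a month, or about €1,100 a year.

```python
# Back-of-envelope storage waste estimate using the article's figures.
PRICE_EUR_PER_GB_MONTH = 0.023   # S3 standard rate quoted in the article
ARCHIVE_TB = 10                  # example archive size
DUPLICATION_RATE = 0.40          # example share of redundant files
GB_PER_TB = 1000                 # assumption: decimal terabytes


def monthly_waste_eur(archive_tb, duplication_rate,
                      price=PRICE_EUR_PER_GB_MONTH, gb_per_tb=GB_PER_TB):
    """Monthly cost of storing the redundant share of an archive."""
    redundant_gb = archive_tb * gb_per_tb * duplication_rate
    return redundant_gb * price


monthly = monthly_waste_eur(ARCHIVE_TB, DUPLICATION_RATE)
annual = monthly * 12
print(f"monthly: EUR {monthly:.2f}, annual: EUR {annual:.2f}")
```

Swapping in an organisation's own archive size and measured duplication rate gives a figure that can be set against the cost of a cleanup project.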
The municipal side carries larger figures. The Berlin Senate Department for Urban Development and Housing has been expanding its GeoBasis-DE mapping and property imagery archive, which integrates aerial photography updated on roughly an 18-month cycle. Without systematic deduplication protocols baked into the ingestion pipeline, each update cycle risks compounding rather than replacing previous image sets.
Software tooling exists. Open-source libraries such as ImageHash and commercial platforms including Cloudinary's asset management suite both offer perceptual hashing — a technique that identifies visually identical or near-identical images even when file names and metadata differ. Adoption, however, lags significantly behind availability. The Berliner Senatsverwaltung für Digitalisierung, which coordinates citywide IT standards, has not yet published a binding guideline on image asset governance for public-sector bodies, according to publicly available procurement and policy documents reviewed by The Daily Berlin.
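The idea behind perceptual hashing can be shown with a toy version of one common variant, the average hash. The sketch below is an illustration only and does not reproduce the API of ImageHash or Cloudinary: it models an image as a small grid of grayscale values that has already been downscaled, and the function names and the distance threshold are hypothetical choices made for this example. Each pixel becomes one bit, set when the pixel is brighter than the image's mean, and two images are treated as near-duplicates when few bits differ.

```python
# Pure-Python sketch of an "average hash" (one simple perceptual hash).
# Images are modelled as 2D lists of grayscale values (0-255), assumed
# already downscaled to a small grid; real tools decode and resize for you.

def average_hash(pixels):
    """Return a bit tuple: 1 where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions where two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def looks_duplicate(pixels_a, pixels_b, max_distance=1):
    """Treat images as near-duplicates if their hashes nearly match.

    max_distance is a hypothetical threshold chosen for this 4x4 toy grid.
    """
    return hamming_distance(average_hash(pixels_a),
                            average_hash(pixels_b)) <= max_distance


original = [
    [200, 210, 40, 30],
    [190, 220, 35, 25],
    [60, 50, 180, 230],
    [55, 45, 170, 240],
]
# Same picture, slightly darker, as a recompressed copy might be.
recompressed = [[max(0, v - 5) for v in row] for row in original]
# A different picture: each row mirrored left to right.
different = [list(reversed(row)) for row in original]

print(looks_duplicate(original, recompressed))  # True: hashes are identical
print(looks_duplicate(original, different))     # False: every bit differs
```

Because the hash depends on relative brightness rather than exact bytes, the darker copy matches the original even though a byte-level comparison of the two files would not.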
For organisations looking to act before any top-down directive arrives, the practical path is straightforward: run a perceptual hash audit on existing archives before the next storage contract renewal, identify the duplication rate, and use that figure to negotiate storage tiers or simply delete. For Berlin's housing platforms, doing that work before the autumn rental market peak — traditionally the city's busiest letting season — would reduce both costs and the risk of outdated images surfacing on live listings. The data already exists. The question is whether anyone will look at it.
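A first-pass audit does not need specialist software. The sketch below uses a deliberately simpler technique than perceptual hashing: it hashes file contents with SHA-256, so it only finds byte-for-byte copies and will miss crops and recompressed versions. The resulting rate is therefore a lower bound on the true duplication rate. The function names and the list of file extensions are assumptions made for this example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}  # assumed set


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's bytes, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplication_report(root):
    """Return (total_bytes, redundant_bytes, rate) for images under root."""
    sizes_by_digest = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            sizes_by_digest[file_digest(path)].append(path.stat().st_size)
    total = sum(sum(sizes) for sizes in sizes_by_digest.values())
    # Keep one copy per digest; every further copy counts as redundant.
    redundant = sum(sum(sizes[1:]) for sizes in sizes_by_digest.values())
    rate = redundant / total if total else 0.0
    return total, redundant, rate
```

Calling `duplication_report` on an archive's root folder returns the total image bytes, the bytes held in surplus copies, and the share between them, which is the figure to bring to a storage contract negotiation.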
About this article
Published by The Daily Berlin