Berlin's public administration is sitting on a digital mess. Across municipal databases—from the Senatsverwaltung für Stadtentwicklung to the BVG's internal asset management systems—duplicate image files have quietly multiplied into the tens of millions, according to figures from procurement documents reviewed by The Daily Berlin. The bill for storing redundant data the city no longer needs runs to an estimated six figures annually in cloud and on-premise infrastructure costs.
The timing matters. The CDU-SPD Senate coalition has committed to a sweeping digitisation agenda under the Berlin Digital Strategy framework, which earmarks roughly €170 million for e-government infrastructure through 2027. When a significant share of the storage that money pays for is eaten up by duplicate thumbnails, redundant photographs of Prenzlauer Berg construction sites, or triplicate scans of Mitte planning documents, the efficiency case for the entire programme weakens. Digital reformers inside the city administration have been pressing for a systematic deduplication effort for at least 18 months.
The Numbers Tell a Familiar Story
Deduplication audits carried out in comparable European city governments have found that between 25 and 40 percent of stored image assets are exact or near-exact copies. Apply even the conservative end of that range to Berlin's documented public-sector data estate, which the Senatsverwaltung für Inneres und Digitales has described in budget filings as exceeding 4 petabytes across all departments, and the scale of the redundancy problem becomes concrete. At current commercial cloud rates of roughly €20 per terabyte per month, a 25 percent duplication rate across 4 petabytes (about 1,000 terabytes of redundant files) would cost around €240,000 per year. Not all of that estate is image data, so the true figure depends on how much of it sits in image-heavy departments, but the estimate is consistent with storage waste running above €200,000 per year, before factoring in the staff hours spent manually searching through cluttered asset libraries.
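The arithmetic behind that estimate can be reproduced in a few lines. The sketch below is a back-of-envelope check, not the method used in the procurement documents: it assumes decimal units (1 petabyte = 1,000 terabytes) and applies the duplication rate to the full 4 petabytes, which overstates the image share of the estate. The function name and its defaults are illustrative.

```python
# Back-of-envelope storage waste estimate using the figures quoted above.
# Assumptions (not from the source): decimal units, and the duplication
# rate applied to the whole estate rather than to image data alone.

TB_PER_PB = 1_000


def annual_waste_eur(estate_pb, duplication_rate, eur_per_tb_month=20.0):
    """Yearly cost in euros of storing the duplicated share of the estate."""
    duplicated_tb = estate_pb * TB_PER_PB * duplication_rate
    return duplicated_tb * eur_per_tb_month * 12


# Walk through the 25 to 40 percent range reported by comparable audits.
for rate in (0.25, 0.30, 0.40):
    print(f"{rate:.0%} duplicates: EUR {annual_waste_eur(4, rate):,.0f} per year")
```

Under these assumptions the range runs from about €240,000 per year at 25 percent to €384,000 at 40 percent.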
The BVG, Berlin's public transport operator, manages tens of thousands of images annually across infrastructure inspection records, marketing campaigns, and real-time operations documentation. The operator's IT procurement team confirmed in a 2025 annual report that it had begun piloting automated asset management tools, though the deduplication component remained in early-stage evaluation as of last December. Meanwhile, the city's open data portal, hosted at daten.berlin.de, has expanded its image holdings significantly since 2022, with dataset counts rising year on year—but without a mandatory deduplication protocol baked into the upload workflow.
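What a deduplication check in an upload workflow might look like can be sketched briefly. The example below is purely illustrative and does not describe how daten.berlin.de or any BVG system works: it assumes a hypothetical in-memory registry, here called UploadRegistry, that fingerprints each file's bytes with SHA-256 and flags byte-for-byte copies. It catches exact duplicates only, not resized or re-encoded versions of the same picture.

```python
import hashlib


class UploadRegistry:
    """Hypothetical in-memory registry of content hashes seen so far."""

    def __init__(self):
        self._seen = {}  # SHA-256 digest -> name of first file with that content

    def register(self, name, data):
        """Return the name of an earlier identical upload, or None if new."""
        digest = hashlib.sha256(data).hexdigest()
        earlier = self._seen.get(digest)
        if earlier is None:
            self._seen[digest] = name
        return earlier


registry = UploadRegistry()
# First upload is new, so nothing is reported.
print(registry.register("bridge_inspection.jpg", b"example image bytes"))
# Same bytes under a different file name are traced back to the first upload.
print(registry.register("bridge_inspection_copy.jpg", b"example image bytes"))
```

A production system would persist the digests in a database rather than in memory, but the principle of comparing content instead of file names is the same.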
What the City Is Actually Doing About It
The Technologiestiftung Berlin, a publicly funded foundation based on Grunewaldstraße in Schöneberg, has been running workshops on data quality standards for municipal departments since early 2025. Its focus has largely been on structured datasets (addresses, permit records, geodata), but image asset management has begun appearing on the agenda of its DigitalService working groups. Separately, the city-owned IT service provider ITDZ Berlin, headquartered on Berliner Straße in Charlottenburg-Wilmersdorf, holds the contracts for centralised storage infrastructure across most Senate departments and would be the operational body responsible for rolling out any citywide deduplication tooling.
Automated deduplication software—tools that use perceptual hashing algorithms to identify visually identical or near-identical images regardless of file name or metadata—is not new technology. Off-the-shelf enterprise solutions have been available since the early 2010s, and open-source alternatives exist that municipalities in Amsterdam and Vienna have integrated into their content management pipelines. The barrier in Berlin has been less technical than administrative: procurement cycles are slow, interdepartmental data-sharing agreements are complex, and image asset management has historically sat below the priority threshold of senior IT leadership.
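The idea behind perceptual hashing can be illustrated with one of its simplest variants, the average hash. The sketch below is an assumption-laden toy, not the algorithm used by any of the products or cities mentioned: images are represented as small grids of grayscale values rather than decoded files, and the function names and the two-bit threshold are illustrative choices. Each pixel becomes one bit depending on whether it is brighter than the image's mean, and two images count as near-duplicates when only a few bits differ.

```python
# Minimal average-hash sketch on grayscale pixel grids.
# Real tools decode image files and downscale them first; here images are
# plain lists of rows with 0-255 values, which is a simplification.


def average_hash(pixels):
    """Return a bit string: '1' where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count positions where two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_near_duplicate(pixels_a, pixels_b, max_distance=2):
    """Treat two images as duplicates if their hashes differ in few bits."""
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
# Same picture, slightly brightened, as after a re-export with other settings.
brightened = [[min(255, value + 12) for value in row] for row in original]
# A different picture: the light and dark areas are swapped.
inverted = [[255 - value for value in row] for row in original]

print(is_near_duplicate(original, brightened))  # True
print(is_near_duplicate(original, inverted))    # False
```

Because the hash depends on the picture's content rather than its bytes, the brightened copy matches the original even though a byte-level comparison of the two files would not.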
That is starting to shift. ITDZ Berlin's 2026 service roadmap, published in January, references a planned audit of image and media storage consumption across client departments, with findings expected by the fourth quarter of this year. If the audit confirms what procurement documents suggest—widespread, costly redundancy—the Senate will face a straightforward efficiency argument for mandatory deduplication standards before the next budget cycle opens in early 2027. Departments that want a share of the remaining Digital Strategy funds may find that demonstrating clean data housekeeping becomes a prerequisite for the money.