Roughly one in five digital image files stored across Berlin's municipal content management systems is a duplicate — an identical or near-identical copy of another file already sitting in the same archive. That estimate, drawn from an internal review circulated within the Senate Department for Urban Development and Housing earlier this year, has prompted a quiet but intensifying conversation about how the city manages its visual data. The cost is not trivial.
The issue matters right now because Berlin is midway through a digitisation push tied to the federal Onlinezugangsgesetz programme, which requires government services to be fully accessible online by the end of 2026. As agencies rush to upload documents, maps, architectural plans and publicity photographs, duplicate image files are inflating storage costs, slowing search performance and, in several cases, causing outdated images or images whose licences have lapsed to resurface in public-facing material. A photograph cleared for use in 2019 but not renewed in 2022 can reappear without warning if a content editor pulls from an unaudited legacy folder.
Where the Problem Shows Up in Berlin's Infrastructure
Two organisations have become focal points in internal discussions. The Berliner Verkehrsbetriebe — the BVG, responsible for the U-Bahn, trams and buses — maintains a media library used by communications teams across its Holzmarktstraße headquarters and several depot offices, including the large facility in Lichtenberg. Employees who handle campaign assets for services like the U5 extension have flagged that the same photograph of Alexanderplatz station has been uploaded under at least eleven separate file names since the image bank migrated to a new platform in 2023. The resulting clutter means staff routinely waste time verifying whether a file is the original, a compressed copy or an altered version before approving it for print or digital use.
Separately, the Berlin Senate's press office, based in the Rotes Rathaus on Rathausstraße, has been working since March 2026 with the IT service provider ITDZ Berlin to audit roughly 340,000 image files held across departmental servers. Early findings, shared informally at a cross-departmental working session in Mitte in April, suggested that around 68,000 of those files, or about 20 percent, were flagged by automated deduplication software as exact or perceptual duplicates. Exact duplicates are byte-for-byte identical copies. Perceptual duplicates are images that look identical to the human eye but have different file sizes, metadata or compression histories.
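The distinction between the two kinds of duplicate can be made concrete with a short Python sketch. It is an illustration only, not the method used by the Senate's audit software, which the findings do not name. It assumes each image has already been decoded and shrunk to a tiny grayscale grid (a real pipeline would use an image library for that step), and the pixel grids, function names and distance threshold are all hypothetical. A cryptographic hash detects exact copies, while a simple "average hash" stays stable under small brightness or compression changes.

```python
import hashlib


def exact_digest(data: bytes) -> str:
    """SHA-256 of the raw bytes; equal digests mean byte-identical files."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels: list) -> int:
    """Average hash of a small grayscale grid (values 0-255).

    Each bit records whether a pixel is brighter than the grid's mean.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_perceptual_duplicate(a: list, b: list, max_distance: int = 2) -> bool:
    """Treat two grids as perceptual duplicates if their hashes nearly match.

    The threshold of 2 bits is an arbitrary choice for this sketch.
    """
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical 4x4 thumbnails: an original, a slightly brightened copy
# (standing in for a recompressed file) and an inverted, different image.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]
recompressed = [[p + 3 for p in row] for row in original]
inverted = [[255 - p for p in row] for row in original]


def to_bytes(grid: list) -> bytes:
    """Flatten a grid into bytes so it can be hashed like a file."""
    return bytes(p for row in grid for p in row)


# The byte-level digests differ, so an exact-match check would miss this pair.
print(exact_digest(to_bytes(original)) == exact_digest(to_bytes(recompressed)))
# The perceptual check still pairs them, and rejects the inverted image.
print(is_perceptual_duplicate(original, recompressed))
print(is_perceptual_duplicate(original, inverted))
```

Exact hashing is cheap and produces no false positives in practice, but it misses every re-saved or resized copy. Perceptual hashing catches those at the price of a tunable threshold that someone has to choose and defend.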
What Deduplication Actually Costs — and Saves
Storage is the obvious headline figure. Enterprise cloud storage for municipal bodies in Germany typically runs between €0.02 and €0.05 per gigabyte per month depending on redundancy tiers. A library of 340,000 images, with average file sizes around 4 MB, occupies roughly 1.36 terabytes before deduplication. If 20 percent of those files are redundant, the city is paying to store approximately 272 GB of data it does not need. Over 12 months, at the rates above, that works out to roughly €65 to €163: a manageable but unnecessary recurring cost, and one that compounds as libraries grow.
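The storage arithmetic can be reproduced in a few lines of Python. The inputs are the figures quoted above; the variable names are illustrative, and the sketch assumes decimal units (1 GB = 1,000 MB) and a flat per-gigabyte rate with no tiering or egress charges.

```python
IMAGE_COUNT = 340_000
AVG_SIZE_MB = 4
DUPLICATE_SHARE = 0.20
RATE_LOW, RATE_HIGH = 0.02, 0.05  # EUR per GB per month, as quoted

# Decimal units: 1 GB = 1,000 MB, 1 TB = 1,000 GB.
total_gb = IMAGE_COUNT * AVG_SIZE_MB / 1000
redundant_gb = total_gb * DUPLICATE_SHARE


def annual_cost(gigabytes: float, rate_per_gb_month: float, months: int = 12) -> float:
    """Recurring storage cost for a given volume at a flat monthly rate."""
    return gigabytes * rate_per_gb_month * months


low = annual_cost(redundant_gb, RATE_LOW)
high = annual_cost(redundant_gb, RATE_HIGH)

print(f"{total_gb / 1000:.2f} TB total, {redundant_gb:.0f} GB redundant")
print(f"EUR {low:.2f} to EUR {high:.2f} per year for the redundant share")
```

Changing `IMAGE_COUNT` or `DUPLICATE_SHARE` shows how quickly the figure scales as an archive grows or its duplication rate drifts upward.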
The harder cost is human labour. Deduplication consultants working with German public-sector clients, a market served by firms including Bonn-based SER Group and d.velop of Gescher, typically quote projects at between €15,000 and €60,000 depending on archive size and integration complexity. The Berlin Senate's ITDZ engagement falls within that range, though the precise contract value has not been disclosed publicly. What is clear is that the return on investment depends entirely on whether the cleaned archive is maintained through consistent metadata tagging going forward. Without that discipline, duplication rates return to baseline within 18 to 24 months.
For Berlin's institutions, the practical path forward involves two things: automated hash-matching tools that flag duplicates at the point of upload, and standardised naming conventions enforced across departments. The BVG's communications unit has been piloting a mandatory filename taxonomy for new uploads since May 2026. The Senate press office is expected to complete its full audit by September, ahead of the year-end Onlinezugangsgesetz deadline. Whether the cleaned systems stay clean depends on training, and on whether the unglamorous work of digital housekeeping gets the same political attention as the platforms it underpins.
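What hash-matching at the point of upload might look like is sketched below in Python. It is a minimal illustration under stated assumptions, not a description of any tool the BVG or the Senate has deployed: the `UploadRegistry` class, its in-memory index and the example filenames are all hypothetical, and a production system would keep the index in a database shared across departments.

```python
import hashlib
from typing import Optional


class UploadRegistry:
    """Hypothetical in-memory index from content digest to first stored filename."""

    def __init__(self) -> None:
        self._by_digest: dict = {}

    def check_upload(self, filename: str, data: bytes) -> Optional[str]:
        """Return the name of an existing byte-identical file, if any.

        If the content is new, register it under `filename` and return None.
        """
        digest = hashlib.sha256(data).hexdigest()
        existing = self._by_digest.get(digest)
        if existing is not None:
            return existing
        self._by_digest[digest] = filename
        return None


registry = UploadRegistry()
photo = b"stand-in bytes for an image file"

# First upload is accepted and indexed.
print(registry.check_upload("alexanderplatz_station.jpg", photo))
# Same bytes under a new name are flagged with the original's filename.
print(registry.check_upload("alex_station_final_v2.jpg", photo))
```

A check of this kind catches only byte-identical copies, which is why renamed files are the easy case. Compressed or lightly edited versions would pass through unless the upload step also compared perceptual hashes.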