Berlin's public digital infrastructure has a clutter problem. Across city-managed databases, from the Senatsverwaltung für Stadtentwicklung's planning portals to the archive systems used by institutions along Unter den Linden, duplicate images have quietly accumulated into the tens of thousands — redundant files that inflate storage costs, slow public-facing platforms, and complicate efforts to build a coherent visual record of the capital.
The pressure to act is arriving now for a specific reason: the city's Digital Berlin 2030 modernisation programme, which entered its second funding phase in January 2026, has set a mid-year compliance checkpoint for agencies drawing on federal digitisation grants. Organisations that cannot demonstrate clean, deduplicated data repositories by the end of the third quarter risk losing access to the next tranche of funding. For smaller cultural bodies operating on tight margins, that is not an abstract threat.
Where the Problem Is Most Acute
The Stadtmuseum Berlin, which manages collections spread across venues including the Märkisches Museum on Köllnischer Park and the Ephraim-Palais in Mitte, has publicly acknowledged its digitisation backlog in recent annual reports. Duplicate image entries — often the result of multiple scanning runs without adequate deduplication protocols — represent one of the more labour-intensive problems facing archive staff. Similarly, the Berlin State Library on Potsdamer Straße, which holds one of the largest publicly accessible image collections in the German-speaking world, has been working since 2024 to consolidate its digital asset management system under a unified platform.
BVG, the city's public transport operator, faces a parallel issue in its passenger-information infrastructure. Internal platform audits linked to the ongoing €400 million BVG digitalisation programme have flagged duplicate image assets in the real-time display system as a source of unnecessary server load, a form of technical debt that compounds as the network expands its digital signage across U-Bahn and tram lines.
The cost dimension is concrete. Cloud storage rates for public-sector contracts in Germany currently sit in the range of €0.02 to €0.05 per gigabyte per month, depending on provider and redundancy tier. For a mid-sized municipal archive holding several hundred terabytes of image data (a realistic figure for an institution like the Stadtmuseum), duplicate files inflate annual storage bills by a conservative 15 to 30 percent, according to estimates published by the German Digital Library in its 2025 infrastructure review.
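To put rough numbers on that range, the short calculation below multiplies the quoted rates and inflation figures through for a single archive. It is a back-of-envelope sketch rather than a figure from the review: the 300-terabyte deduplicated baseline is an assumed stand-in for "several hundred terabytes", decimal units (1 TB = 1,000 GB) are assumed, and the function name is purely illustrative.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes; some providers bill in binary units


def annual_duplicate_overhead(baseline_tb, rate_eur_per_gb_month, inflation):
    """Extra yearly cost in EUR when duplicates inflate a clean baseline bill.

    baseline_tb: size of the archive without duplicates (assumed figure)
    rate_eur_per_gb_month: storage price quoted in the article
    inflation: fraction by which duplicates inflate the bill (0.15 to 0.30)
    """
    baseline_monthly = baseline_tb * GB_PER_TB * rate_eur_per_gb_month
    return baseline_monthly * inflation * 12


# Cheapest tier with the lowest inflation vs. priciest tier with the highest
low = annual_duplicate_overhead(300, 0.02, 0.15)
high = annual_duplicate_overhead(300, 0.05, 0.30)
print(f"EUR {low:,.0f} to EUR {high:,.0f} per year")
# prints: EUR 10,800 to EUR 54,000 per year
```

On these assumptions, duplicates would cost one such archive somewhere between roughly €10,800 and €54,000 a year. The result scales linearly with the baseline, so a larger collection moves both ends of the range proportionally.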
What Comes Next
Three decisions will define how Berlin's institutions handle this over the coming months. First, procurement: several agencies are weighing whether to build deduplication capability into existing content management systems or purchase dedicated tools from vendors already operating under framework agreements with the city's IT procurement body, the ITDZ Berlin. A decision on a pilot contract was expected before the summer recess.
Second, staffing. Automated deduplication tools catch the obvious cases — identical file hashes, pixel-for-pixel matches — but near-duplicate images, such as bracketed photography from heritage documentation shoots in Kreuzberg or Prenzlauer Berg, require human review. Several institutions are lobbying the Senatskanzlei for temporary project funding to hire specialist archivists, a request that will compete with other line-item pressures in the 2027 budget planning cycle beginning in September.
Third, and most consequentially, data governance. Berlin currently lacks a single citywide policy mandating how institutions handle duplicate image data once identified — whether redundant files must be deleted, flagged, or migrated to cold storage. The Senatsverwaltung für Inneres und Digitales is expected to circulate a draft guideline for public consultation before October 2026. How that document is written will determine whether this remains a case-by-case scramble or becomes standard practice across every agency touching the city's digital archive.
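The automated first pass mentioned above, matching identical file hashes, is simple enough to sketch in a few lines. The example below is illustrative only and does not describe any tool the institutions named here actually use; the function names are hypothetical. It groups byte-identical files by SHA-256 digest and only reports them, since what happens to redundant files afterwards is precisely the open policy question. Near-duplicates such as bracketed exposures would not be caught this way.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(root):
    """Group byte-identical files under `root`.

    Returns a list of groups; each group is a list of two or more paths
    whose contents hash to the same digest. Nothing is deleted or moved.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates("scans/")` on a hypothetical scan directory would return the groups of identical files for an archivist to review. Hashing file contents rather than comparing names matters in this setting, because repeated scanning runs tend to produce the same image under different filenames.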
Institutions that move early, establishing deduplication workflows and aligning with whatever standard emerges, will be better placed when the Digital Berlin 2030 programme's third funding phase opens in early 2027. Those that wait may find themselves managing the same problem under far more pressure, and with less money to fix it.