Berlin's public-sector digital archives are clogged with duplicate images, and the institutions responsible for managing them are under growing pressure to act. Across city departments, cultural foundations, and transport authorities, the same photographs, scanned documents, and graphic assets are stored in multiple locations simultaneously — driving up storage costs, slowing search systems, and creating confusion over which version of an image is the authoritative one.
The issue has gained urgency in 2026 as Berlin's Senate Department for Digital and Administrative Modernisation pushes forward with its Digitalstrategie Berlin framework, a programme designed to consolidate the city's fragmented IT infrastructure by the end of this legislative term. Duplicate asset management — unglamorous but consequential — has emerged as one of the thorniest obstacles to that goal.
What the Institutions Are Saying
The Stadtmuseum Berlin, which holds one of the largest photographic collections in the German capital, has acknowledged that its digital catalogue contains significant redundancy built up over more than a decade of digitisation projects. The museum, headquartered near the Nikolaiviertel, has been piloting a deduplication workflow since early 2026 using perceptual hashing software — a technique that identifies near-identical images even when file names or metadata differ. Staff involved in the pilot have described the results as substantial, though the institution has not published final figures.
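How perceptual hashing catches such near-duplicates can be illustrated with a short sketch. The museum has not said which software or algorithm it uses; the Python below implements average hashing, one of the simplest variants, and assumes each image has already been reduced to a small grayscale grid (a real pipeline would first decode and downscale the file). The function names, the test pattern, and the distance threshold of 5 are illustrative assumptions.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixel values, 0-255 (assumed already downscaled)


def average_hash(pixels: Grid) -> int:
    """Build a fingerprint with one bit per pixel: 1 if brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as near-duplicates if their fingerprints differ in few bits."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical 8x8 test pattern and a slightly brightened copy of it.
original = [[(row * 8 + col) * 4 for col in range(8)] for row in range(8)]
brightened = [[value + 3 for value in row] for row in original]
print(is_near_duplicate(original, brightened))
```

Because the fingerprint depends on each pixel's brightness relative to the image's own mean, a uniform brightness shift leaves it unchanged, and file names and metadata play no part in the comparison.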
The Berliner Verkehrsbetriebe, better known as BVG, faces a parallel version of the problem. The transit authority, which manages around 1,500 kilometres of bus, tram, and U-Bahn routes across the city, maintains internal image libraries for maintenance documentation, public communications, and infrastructure planning. BVG's communications directorate has said it is reviewing its asset management system as part of a broader IT modernisation tied to the authority's 2025–2030 investment plan, which earmarks funds for digital infrastructure upgrades alongside rolling stock and station improvements.
Experts in digital asset management point out that the problem is not unique to Berlin's public sector — but the city's particular structure makes it worse than average. Berlin operates through a two-tier system of Senate-level departments and 12 semi-autonomous borough administrations, each of which has historically procured its own software and storage solutions. That fragmentation means the same image — say, an aerial photograph of Tempelhofer Feld — can sit simultaneously on servers in Tempelhof-Schöneberg, at the Senate Chancellery on the Spree, and in the archive of whichever project commissioned the original shoot.
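Copies like these, where the same file has simply been stored in several places, are the easy case: byte-identical files can be found by comparing cryptographic checksums. A minimal Python sketch, in which the storage paths and the in-memory dictionary are hypothetical stand-ins for real file servers:

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def find_exact_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Group storage locations whose contents are byte-for-byte identical."""
    locations_by_digest = defaultdict(list)
    for location, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        locations_by_digest[digest].append(location)
    # Only digests seen at more than one location indicate duplicates.
    return [sorted(group) for group in locations_by_digest.values() if len(group) > 1]


# Hypothetical locations; the byte strings stand in for real image files.
stored = {
    "borough-server/tempelhofer-feld.tif": b"aerial-photo-bytes",
    "senate-server/thf_aerial_final.tif": b"aerial-photo-bytes",
    "project-archive/other-shot.tif": b"different-photo-bytes",
}
print(find_exact_duplicates(stored))
```

A checksum match only catches exact copies. A rescan or a re-compressed export of the same photograph produces a different checksum, which is where similarity-based techniques such as perceptual hashing come in.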
The Cost Argument
Storage is not cheap. Commercial cloud storage for institutional users in Germany typically runs between €0.02 and €0.05 per gigabyte per month depending on contract terms and provider — and Berlin's combined public-sector digital estate runs into petabyte territory when you include all borough-level systems. Industry specialists consulted for reports on public-sector IT in Germany have estimated that redundant data commonly accounts for 20 to 40 percent of an organisation's total storage footprint, though figures vary widely by institution and sector.
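To make those ranges concrete, the arithmetic can be worked through for a single petabyte. The 1 PB figure is an assumption chosen purely for illustration, since the size of Berlin's estate is described only as "petabyte territory"; the price and redundancy ranges are the ones quoted above, and the function name is hypothetical.

```python
GB_PER_PB = 1_000_000  # decimal convention: 1 PB = 1,000,000 GB


def monthly_redundancy_cost_eur(
    total_gb: float, price_per_gb_month: float, redundant_share: float
) -> float:
    """Monthly storage spend attributable to redundant data, in euros."""
    return total_gb * price_per_gb_month * redundant_share


estate_gb = 1 * GB_PER_PB  # ASSUMPTION: a 1 PB estate, for illustration only

# Cheapest price with lowest redundancy, then priciest with highest redundancy.
low = monthly_redundancy_cost_eur(estate_gb, 0.02, 0.20)
high = monthly_redundancy_cost_eur(estate_gb, 0.05, 0.40)
print(f"EUR {low:,.0f} to EUR {high:,.0f} per month")
```

Under those assumptions, redundant data would cost roughly €4,000 to €20,000 per month for each petabyte stored. The result scales linearly with the size of the estate and is not an estimate of Berlin's actual spending.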
The Technologiestiftung Berlin, a publicly funded foundation that advises the city on digital policy and is based on Grunewaldstrasse in Schöneberg, has flagged deduplication as part of its broader work on open data and public-sector data quality. The foundation has previously published guidance noting that poor metadata standards are frequently the root cause: when two departments scan the same document independently because neither knows the other has already done it, the duplicate is created before it ever touches a server.
For cultural institutions such as the Akademie der Künste on Pariser Platz, the stakes go beyond cost. Duplicate images with inconsistent colour profiles or compression levels can cause archivists to work from degraded versions of historically significant photographs without realising it. The question of which copy is canonical matters.
Digital archivists are advising institutions to treat deduplication as a policy problem before it becomes a technology problem. That means agreeing on metadata standards, establishing a single authoritative repository for shared assets, and writing procurement rules that require interoperability by default. The Senate's Digitalstrategie Berlin is supposed to provide that framework, but borough-level buy-in remains uneven and the deadline pressure is real. The next major review of the city's digital governance structure is scheduled for the fourth quarter of 2026.