Berlin Tackles Millions in Duplicate Photos Across City Archives
Years of siloed digital storage across city departments left Berlin's official image libraries bloated with redundant files, and a reckoning is now underway.

Berlin's network of public-sector communications offices is sitting on a problem it can no longer ignore. Across more than a dozen Senate departments, district administrations, and city-owned enterprises — from BVG to Berliner Stadtwerke — digital asset libraries have accumulated hundreds of thousands of duplicate and near-duplicate images over the past decade, clogging storage servers, complicating press work, and driving up licensing costs when originals cannot be identified quickly enough.
The issue is not new, but a review commissioned earlier this year by the Senatsverwaltung für Stadtentwicklung, Bauen und Wohnen brought its scale into sharper focus. Procurement records reviewed by The Daily Berlin show that several district communications teams have been paying recurring annual fees to image management software vendors despite having no centralised deduplication protocol in place. The result is a fragmented landscape in which the same photograph of, say, the East Side Gallery or Tempelhofer Feld is stored under a dozen different filenames across half a dozen separate servers.
The roots of the problem go back to roughly 2014 and 2015, when Berlin's individual Bezirke and Senate departments each began migrating from physical media archives to cloud-adjacent digital asset management systems. The migration was never coordinated centrally. Friedrichshain-Kreuzberg's communications office adopted different software from Mitte's; the BVG press team operated entirely separately from the Verkehrsverbund Berlin-Brandenburg. When photographers submitted images, they went into whichever local system the contracting department used, and cross-referencing was left to individual staff members.
By 2019, the problem was compounding. Berlin hosted increasingly high-profile events — from the re-opening of the Humboldt Forum to major protests along Unter den Linden — generating large photographic batches that were simultaneously filed by multiple departments covering the same scenes. Nobody was tasked with reconciling the libraries afterward. Storage costs rose, but because they were buried inside departmental IT budgets rather than reported as a single line item, the aggregate waste stayed invisible to senior officials until auditors began pushing for a consolidated digital infrastructure review in late 2024.
That review, which drew on work by the Rechnungshof Berlin, found that redundant file storage across city entities was contributing to unnecessary expenditure. The Rechnungshof has previously flagged inefficiencies in city IT procurement in annual reports, though the image-archive issue specifically emerged as a distinct sub-finding only in the most recent audit cycle.
The practical consequence is what administrators are now calling a "duplicate image replacement" initiative — essentially, a city-wide effort to audit existing archives, identify authoritative master copies, and retire redundant files from active systems. The Senate Chancellery at Rotes Rathaus is understood to be overseeing the coordination, with technical work contracted to a Berlin-based IT consultancy under a framework agreement that runs through the end of 2027.
Bezirk Mitte's communications office and the Kulturprojekte Berlin GmbH, which manages image rights for many publicly funded cultural events, are among the first institutions scheduled to complete their deduplication audits, with a target date of September 2026. The process involves automated perceptual hashing — software that identifies visually near-identical images even when file sizes or names differ — followed by manual review for anything flagged as uncertain.
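For readers curious how perceptual hashing separates near-duplicates from distinct images, the following is a minimal sketch of one common variant, a difference hash, written in plain Python. It is an illustration only: the city's contractor has not disclosed which algorithm or thresholds it uses, and the function names, the 8x8 hash size, and the `dup_max` and `review_max` cut-offs are all assumptions made for this example. Images are represented as simple grids of grayscale values so that no imaging library is needed.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values (assumed input format)


def shrink(img: Gray, width: int, height: int) -> Gray:
    """Downscale by nearest-neighbour sampling (crude, but dependency-free)."""
    src_h, src_w = len(img), len(img[0])
    return [
        [img[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def dhash(img: Gray, size: int = 8) -> int:
    """Difference hash: one bit per pair of horizontally adjacent pixels."""
    small = shrink(img, size + 1, size)  # size+1 columns give size comparisons per row
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def classify(a: Gray, b: Gray, dup_max: int = 4, review_max: int = 12) -> str:
    """Sort a pair into three buckets; both thresholds are illustrative guesses."""
    distance = hamming(dhash(a), dhash(b))
    if distance <= dup_max:
        return "duplicate"
    if distance <= review_max:
        return "manual review"
    return "distinct"
```

Because the hash records only whether each pixel is brighter than its neighbour, a resized or uniformly darkened copy of a photograph tends to produce the same bits as the original, while a different scene produces a hash many bits away. The middle bucket corresponds to the cases the article describes as flagged for manual review.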
For working journalists who rely on the city's press portals for official images, the practical benefits should eventually be significant. Searching for licensed photographs of Alexanderplatz or the Tempelhofer Feld currently returns dozens of results pointing to inconsistently tagged files, some of which carry expired usage rights. A consolidated, deduplicated library would resolve those ambiguities at the point of search rather than requiring reporters to verify licensing terms individually.
The timeline is ambitious. City IT projects of this scale in Berlin have historically run over schedule — the BVG's own passenger information system modernisation stretched well past its original 2022 completion date. Departments involved in the current initiative have been asked to submit their archive inventories to the coordinating team by 31 August 2026. Whether that deadline holds will be an early signal of how seriously the administration is treating a problem it has spent a decade avoiding.
Published by The Daily Berlin