The Daily Berlin

How Berlin's Digital Archives Ended Up Full of the Same Image Twice: The Story Behind the Duplicate Problem

Years of rapid digitisation, fragmented city IT systems, and competing databases have left Berlin's public image libraries bloated, inconsistent, and now overdue for a serious fix.

By Berlin News Desk · Published 4 July 2026, 8:45 pm

Photo: Vinay Reddy Sama on Pexels

Berlin's municipal digital infrastructure has a clutter problem. Across the city's network of public institutions, from the Stadtbibliothek branches in Mitte to the Senate Department for Urban Development's planning portals, thousands of duplicate images sit in overlapping databases, consuming server space, complicating public records searches, and costing the city money it does not need to spend. The issue, long flagged by archivists and IT administrators, has finally reached the desks of policymakers inside the red-brick walls of the Rotes Rathaus.

The timing matters. Berlin is midway through a broader digitalisation push under the city's Digital Strategy 2025–2030 framework, and the SPD-led Senate has staked political credibility on delivering leaner, more accessible public services. A tangle of duplicate image files is an unglamorous but concrete obstacle to that goal. When the same photograph of, say, a Neukölln housing block appears in the city's press archive, the BVG transport authority's media library, and the Senate's urban planning portal under three different file names and two different copyright attributions, the downstream problems multiply fast.

How the Duplication Built Up

The roots of the problem trace back to roughly 2010, when individual Berlin districts — all twelve of them — began independent digitisation programmes without a unified metadata standard. Bezirksamt Friedrichshain-Kreuzberg adopted one tagging system; Charlottenburg-Wilmersdorf used another. The Landesarchiv Berlin, headquartered on Eichborndamm in Reinickendorf, maintained its own catalogue structure, as did the Berlin Senate Chancellery's communications office. When images migrated between systems — during server upgrades, staff transitions, or emergency imports following the 2017 consolidation of several district IT contracts — duplicates accumulated silently.

The Berlin Open Data portal, launched in its current form in 2018, accelerated the problem rather than solving it. Institutions uploading public-domain photographs to daten.berlin.de frequently pulled from existing internal drives without first running deduplication checks. By 2023, internal audits — the details of which have not been made public — reportedly identified tens of thousands of redundant image entries across the city's interconnected systems, though no official count has been released.

The BVG, which runs one of Europe's busiest urban transit networks and recorded roughly 1.1 billion passenger journeys a year before the pandemic, maintains its own media asset library for press and communications work. Because the transit authority, although wholly owned by the state of Berlin, runs its own IT procurement, its image database developed separately from Senate systems. Cross-uploads between the BVG's library and the city's central communications archive created predictable duplication wherever both organisations covered the same infrastructure projects: the U5 extension from Alexanderplatz to Hauptbahnhof, the S21 north-south rail link, the refurbishment of Ostbahnhof.

What a Fix Actually Looks Like

The Senate Department for Digital Transformation, operating out of offices near Alexanderplatz, has been piloting an AI-assisted deduplication tool since January 2026 in cooperation with the Zuse Institute Berlin, the applied mathematics and computing research centre based in Dahlem. The pilot targets approximately 200,000 image files held across three test databases. Early internal assessments, shared at a February 2026 working group session but not yet published, suggested the tool could flag potential duplicates with high accuracy, though human review remains mandatory before any file is deleted under the city's archival protection rules.
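How the pilot tool works internally has not been published, so the following is only an illustrative sketch of the simplest form of the idea: flag files whose contents are byte-for-byte identical, and leave every deletion decision to a human reviewer. The function names and folder layout are hypothetical, and plain SHA-256 hashing stands in for whatever AI-assisted matching the pilot actually uses.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's contents in chunks so large images are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def flag_duplicates(roots):
    """Return groups of files with identical contents across the given folders.

    Hypothetical helper: it only reports candidates and never deletes anything,
    mirroring the rule that a human must review each file before removal.
    """
    by_hash = defaultdict(list)
    for root in roots:
        for path in sorted(Path(root).rglob("*")):
            if path.is_file():
                by_hash[sha256_of_file(path)].append(path)
    # Keep only hashes that are shared by more than one file.
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Called as `flag_duplicates(["press_archive", "bvg_media"])` (both folder names are made up), it would group a photograph stored under two different file names, because only the contents are compared. It would miss copies that were resized, cropped, or re-encoded; catching those needs similarity-based matching, which is presumably where an AI-assisted tool earns its keep.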

The practical stakes extend beyond tidy servers. When journalists, researchers, or the public request images through official freedom-of-information channels, duplicated files with inconsistent rights metadata can create legal grey areas — particularly around photographs taken by freelancers who retained partial copyright. Berlin's Informationsfreiheitsgesetz, the state freedom-of-information law, does not exempt the city from copyright liability on incorrectly attributed material.

The Senate has indicated it plans to publish a consolidated image-metadata standard for all public bodies by the end of 2026. Institutions will then have twelve months to bring their archives into compliance. For the Landesarchiv, the Stadtbibliothek system, and the BVG communications office, that deadline is now the clearest signal yet that the years of parallel, uncoordinated digitisation are officially being counted as a cost — and one the city has decided it can no longer ignore.


About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
