Exploring a cultural “filter bubble” in artwork databases of two large museums of fine art

paper, specified "long paper"
  1. Sara Minster, Bar-Ilan University
  2. Inna Kizhner, Siberian Federal University
  3. Maayan Zhitomirsky-Geffet, Bar-Ilan University

Work text

1. Introduction

Digitization of cultural information by GLAM (Galleries, Libraries, Archives and Museums) institutions opens new opportunities for disseminating cultural data to heterogeneous audiences. However, selective information dissemination related to biases in physical collections, to policies, or to difficulties of digitization and aggregation (Mak 2014; Bode 2020; Zhitomirsky-Geffet and Hajibayova 2020; Kizhner et al. 2021; Ortolja-Baird and Nyhan 2021) may lead to the creation of a global cultural “filter bubble” in which much of the world’s cultural heritage remains concealed from public view, and the opportunity to correct historical injustices and bias in cultural knowledge representation is missed. A filter bubble is a situation in which users are exposed only to homogeneous information that conforms with their views and prejudices (Pariser 2011). A cultural “filter bubble” created by culturally and geographically homogeneous artwork collections published online thus confines users to information from the dominant culture, narrowing their intellectual and cultural horizons (Zhitomirsky-Geffet 2019).
In this study, we aim to investigate and quantitatively measure the cultural “filter bubble” by comparing the data publishing and dissemination practices of two famous national museums of fine arts, the Metropolitan Museum of Art, New York, and the Rijksmuseum, Amsterdam. Both museums present artworks to ‘contemporary national and international audiences’ and appear in a study of the most influential museums published in 2017 (Van Riel and Heijndijk 2017). Detecting bias or a “filter bubble” in large cultural heritage databases matters for the many digital humanities research projects that are based on the analysis of such databases and of large amounts of cultural heritage data.

2. Research methodology

The study examined three dissemination channels used by the two museums: 1) data extracted from searchable online collections, 2) data extracted from the museum datasets available for general use via the open APIs of both museums, as CSV files for the Metropolitan Museum and an XML file for the Rijksmuseum, and 3) items with a collection property (P195) retrieved from Wikidata using the SPARQL query language. As a result, we obtained six databases (three for each of the two museums under study, as shown in Table 1). To measure the geographical and cultural diversity of the databases as conveyed by the five criteria of the ethical evaluation framework (Zhitomirsky-Geffet and Hajibayova 2020) adopted in this study, we ran queries on the six databases that computed the distribution of artworks according to the following variables: artwork’s continent, country of origin, artist’s nationality, and culture.
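The Wikidata retrieval and the distribution queries described above can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' actual queries: the museum identifiers (Q160236, Q190804), the use of country of origin (P495) and continent (P30), and the function names build_collection_query and percent_distribution are all assumptions made for the example. Only the collection property (P195) comes from the text.

```python
from collections import Counter

# Assumed Wikidata identifiers; verify them before running a real query.
MET_QID = "Q160236"          # assumed: Metropolitan Museum of Art
RIJKSMUSEUM_QID = "Q190804"  # assumed: Rijksmuseum


def build_collection_query(museum_qid: str, limit: int = 1000) -> str:
    """Build a SPARQL query for items whose collection (P195) is the given museum.

    P495 (country of origin) and P30 (continent) are illustrative choices;
    the study may have used different properties.
    """
    return f"""
SELECT ?item ?countryLabel ?continentLabel WHERE {{
  ?item wdt:P195 wd:{museum_qid} .
  OPTIONAL {{
    ?item wdt:P495 ?country .
    OPTIONAL {{ ?country wdt:P30 ?continent . }}
  }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()


def percent_distribution(records, field):
    """Return the percentage of records per value of `field`.

    Records with a missing or empty value are counted as "unknown".
    """
    counts = Counter(record.get(field) or "unknown" for record in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: 100.0 * n / total for value, n in counts.items()}


if __name__ == "__main__":
    print(build_collection_query(RIJKSMUSEUM_QID, limit=10))
    # Toy records for illustration only; these are not data from the study.
    toy = [{"continent": "Europe"}, {"continent": "Europe"}, {"continent": "Asia"}, {}]
    print(percent_distribution(toy, "continent"))
```

The same percent_distribution helper could be applied to any of the four variables (continent, country of origin, artist’s nationality, culture), provided each database is first loaded into a list of records with matching field names.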

3. Results

Our results demonstrate that all six databases from both museums focus on Western culture (Western and Central Europe and North America), although the Metropolitan Museum of Art shows higher cultural diversity scores than the Rijksmuseum, as its API dataset also covers ancient cultures such as Egypt or Mesopotamia, in line with an art historical canon (Figure 1). Surprisingly, the Metropolitan Museum’s Wikidata collection focuses on the United States, which accounts for 90% of all artworks, while almost all the items in the Rijksmuseum Wikidata collection were created in Europe.
We found that Asian cultures are weakly represented in all six datasets (16-18% in the Metropolitan’s API dataset and online search system, less than 5% in all of the Rijksmuseum’s databases, and almost none in both museums’ Wikidata collections) compared to Western and Central European cultures (over 50% in the Rijksmuseum’s databases and the Metropolitan’s online database) and North American cultures (over 20% in all of the Metropolitan’s databases). Japanese culture, with the largest number of records from Asia, accounts for 4.7% of the records published online by the Metropolitan and 2.3% of the records accessed via the Rijksmuseum API, with other Asian countries represented to a much lesser extent. In addition, we found that artworks of the native peoples of territories (mostly in Asia) that the Netherlands conquered in early modern history, such as the Dutch East Indies, Java, Batavia or Indonesia, account for between 0.6% and 2.8% of the three Rijksmuseum databases. Similarly, the artworks of Native Americans constitute less than 1% of the three Metropolitan databases. Although our findings may reflect the composition of the physical collections of these museums, they do not comply with the ethical evaluation criteria, nor do they reflect the museums’ mission statements, e.g. “Metropolitan collects, studies, conserves, and presents significant works of art across all times and cultures”, “Rijksmuseum offers a representative overview of Dutch art and history … as well as major aspects of European and Asian art”.

Figure 1 – Percentage of items for each continent in all the databases.
Interestingly, the dominance of Western culture in Wikidata is also reflected in the geographic distribution of contributing institutions. Out of the 5,040 museums and galleries from around the globe that had published their artwork data in Wikidata as of June 2021, 42.25% are located in Western and Central Europe and 17.92% in North America, leaving less than 40% to the rest of the world. Only 19.07% of Wikidata museums are located in Asia and 11.93% in Eastern Europe.
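The arithmetic behind these shares can be checked with a short sketch. The percentages and the total of 5,040 institutions are taken from the text. The helper names remaining_share and approx_count are hypothetical, the derived counts are only approximations obtained by rounding, and the last calculation assumes that the four listed regions do not overlap.

```python
TOTAL_INSTITUTIONS = 5040  # museums and galleries in Wikidata, as of June 2021

# Percentages of contributing institutions per region, as reported in the text.
SHARES = {
    "Western and Central Europe": 42.25,
    "North America": 17.92,
    "Asia": 19.07,
    "Eastern Europe": 11.93,
}


def remaining_share(shares, regions):
    """Percentage left over after subtracting the shares of the given regions."""
    return 100.0 - sum(shares[region] for region in regions)


def approx_count(share_percent, total=TOTAL_INSTITUTIONS):
    """Approximate number of institutions behind a percentage (rounded)."""
    return round(total * share_percent / 100.0)


# Share left to the rest of the world after the two Western regions.
rest_of_world = remaining_share(SHARES, ["Western and Central Europe", "North America"])

# Share not covered by any of the four regions named in the text
# (assumes the four regions are disjoint).
other_regions = remaining_share(SHARES, list(SHARES))

if __name__ == "__main__":
    print(f"Rest of the world: {rest_of_world:.2f}%")
    print(f"Regions not listed: {other_regions:.2f}%")
    for region, share in SHARES.items():
        print(f"{region}: about {approx_count(share)} institutions")
```

Under these assumptions, the two Western regions together hold just over 60% of contributing institutions, which is consistent with the statement that less than 40% is left to the rest of the world.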

[Table 1. Columns: Database name | Online system | API Dataset]

Table 1 – Number of museum items in each database.

4. Conclusion

We provide evidence for the existence of a cultural “filter bubble” (namely, a bias towards Western cultures) in the online databases of the two influential museums, whether accessed via website search, via APIs or queried from Wikidata. All databases appear to have significant constraints and biases in presenting the diversity of cultures, especially in the data submitted to Wikidata, a channel that is important for disseminating cultural content among users who rarely visit institutional websites and museums (Navarrete and Villaespesa 2020). Further investigation is needed into the reasons for these results, which could include the underrepresentation of various cultures in physical museum collections, curatorial decisions on selective digitization, and publishing policies influenced by art history canons or previous institutional policies.

Bode, K. (2020). Why you can’t model away bias. Modern Language Quarterly, 81(1).

Kizhner, I., Terras, M., Rumyantsev, M., Khokhlova, V., Demeshkova, E., Rudov, I. and Afanasieva, J. (2021). Digital cultural colonialism: measuring bias in aggregated digitized content held in Google Arts and Culture. Digital Scholarship in the Humanities, 36(3): 607–640.

Mak, B. (2014). Archaeology of a digitization. Journal of the Association for Information Science and Technology, 65(8): 1515–1526.

Navarrete, T. and Villaespesa, E. (2020). Digital Heritage Consumption: The case of the Metropolitan Museum of Art. Magazén, 1(2).

Ortolja-Baird, A. and Nyhan, J. (2021). Encoding the haunting of an object catalogue: on the potential of digital technologies to perpetuate or subvert the silence and bias of the early-modern archive. Digital Scholarship in the Humanities, fqab065.

Pariser, E. (2011). The filter bubble: What the Internet is hiding from you. Penguin UK.

Van Riel, C. and Heijndijk, P. (2017). Why people love art museums: a reputation study about the 18 most famous museums among visitors in 10 countries. Rotterdam School of Management, Erasmus University.

Zhitomirsky-Geffet, M. and Hajibayova, L. (2020). A new framework for ethical creation and evaluation of multi-perspective knowledge organization systems. Journal of Documentation, 76(6): 1459–1471.

Zhitomirsky-Geffet, M. (2019). Towards a Diversified Knowledge Organization System – An Open Network of Inter-Linked Subsystems with Multiple Validity Scopes. Journal of Documentation, 75(5): 1124–1138.


Conference Info

In review

ADHO - 2022
"Responding to Asian Diversity"

Tokyo, Japan

July 25, 2022 - July 29, 2022

361 works by 945 authors indexed

Held in Tokyo and remote (hybrid) on account of COVID-19

Conference website: https://dh2022.adho.org/

Contributors: Scott B. Weingart, James Cummings

Series: ADHO (16)

Organizers: ADHO