Getting Started in DS

Authors
Affiliations

Nora McGregor

Eleonora Gandolfi

Published

May 13, 2025

Modified

May 13, 2025

Abstract

This guide aims to define Digital Scholarship & Data Science in a research library context and provide an overview of the key competency frameworks, reports, networks and communities of practice for library professionals to explore and learn more.

Keywords

Digital Scholarship, Data Science, Digital Skills Frameworks

Introduction

In this DS Topic Guide, we aim to help you connect your work in libraries to the world of digital scholarship and data science, and inspire new possibilities for evolving your practices. We’ll provide tips on how to use Digital Scholarship & Data Science Topic Guides for Library Professionals as a personal resource for building new skills. We’ll explore broad definitions of digital scholarship and data science in the context of research libraries, and share key resources to help you understand how emerging technologies and methods are transforming the traditional work of collecting, curating, creating, and sharing in cultural heritage institutions—and why developing skills in this area can be so valuable!

What is “digital scholarship”?

Though there are many definitions of what constitutes “digital scholarship” out there, we tend to favour the broadest of interpretations, that is, roughly, any type of innovative research or library activity that combines the methodologies from traditional humanities & social science disciplines with computational tools and digital methods provided by computing disciplines, such as data science.

Though closely aligned to, and generously informed by, the academic discipline Digital Humanities, “digital scholarship” allows us to consider more broadly the full range of innovative scholarly activities our users seek to undertake with our digital collections and data, across a diverse range of disciplines.

“Digital scholarship” allows us to define a space for us heritage professionals, where research is undertaken, in our own right, during the course of our daily work, utilising computational methods in the curation, creation, collecting and sharing of our digital collections and data, but is not confined to formal academic pursuits or a particular discipline.

Is this all just “data science”?

Not quite. Data science, also an interdisciplinary academic field, is more specifically focussed on the use of algorithms, machine learning, and statistical modeling to make predictions and uncover deeper insights from noisy, structured, and unstructured data. In libraries, data science is used to improve services, enhance user experiences, and optimize library operations through data-driven insights. Here are a few ways it’s applied:

  • User Behavior Analysis: By analyzing data on book checkouts, website visits, and user preferences, libraries can understand which genres or materials are most popular, and tailor their collections accordingly. This helps in making data-informed decisions about acquisitions and promotions.
  • Predicting Trends: Data science can help libraries forecast trends — for example, which types of books might become popular in the future or when certain resources are likely to be in demand. This allows libraries to better plan their inventory and schedules.
  • Improving Resource Allocation: Libraries can use data science to optimise staffing and the allocation of resources. By examining patterns in library visits, they can predict busy times and adjust staff schedules, ensuring efficient service delivery.
  • Personalized Recommendations: Just like Netflix recommends movies, libraries can use data science to suggest books or resources to users based on their reading history and preferences, making library services more personalized.
  • Enhancing User Engagement: By analyzing usage patterns and feedback data, libraries can identify gaps in services or areas for improvement. They can use this information to engage users better and develop targeted programs, like workshops or events, based on what users are interested in.
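As a rough illustration of the first and third points above, here is a minimal Python sketch that counts checkouts by genre and by hour of the day. The sample data and the function names (`popular_genres`, `busiest_hour`) are invented for this example; a real library system would export its own fields, and real analysis would need far more data and care around user privacy.

```python
from collections import Counter

# Hypothetical checkout log as (genre, hour of day) pairs.
# This is invented sample data, not an export from any real system.
CHECKOUTS = [
    ("history", 10), ("fiction", 11), ("fiction", 14),
    ("science", 14), ("fiction", 15), ("history", 14),
]


def popular_genres(checkouts, top_n=2):
    """Return the top_n most borrowed genres with their checkout counts."""
    return Counter(genre for genre, _ in checkouts).most_common(top_n)


def busiest_hour(checkouts):
    """Return the hour of the day with the most checkouts."""
    return Counter(hour for _, hour in checkouts).most_common(1)[0][0]


print(popular_genres(CHECKOUTS))
print(busiest_hour(CHECKOUTS))
```

Simple counts like these are often a sensible first step before trying anything predictive, since they show whether the data is complete and consistent enough to build on.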

Digital scholarship and data science intersect in ways that are especially valuable for libraries, offering rich possibilities for advancing library services and operations. Digital Scholarship & Data Science Topic Guides for Library Professionals explores these synergies to support informed, innovative library work.

Relevance to the Library Sector (Case Studies/Use Cases)

So what might “digital scholarship and data science in libraries” actually look like?

  • a subject librarian using a digital tool (like OpenRefine) to clean up a set of catalogue records in order to understand gaps in the metadata and collection scope
  • a collaborative project to automatically transcribe handwritten texts from old manuscripts
  • a library director reading up on the latest in AI and seeking expert perspectives in order to write a strategy document
  • a metadata specialist packaging up digital collections into datasets that can be used by researchers
  • a digitisation project looking to improve the searchability of printed books in under-resourced languages
  • a marketing analyst using data analytics and data science techniques to understand visitor trends
  • an assistant curator attending meetings of an international knowledge exchange network (such as the International GLAM Labs Community)
  • a reference librarian pointing a researcher to datasets that might help them with their research enquiry
  • an imaging technician creating a 3D model of a collection item
  • a licensing manager keeping up to date on the latest uses of Text and Data Mining (TDM) and AI in research so that digital collections meet the needs of library users and staff who want to work with them at scale
  • a project manager leading a major interdisciplinary research project using the latest technologies to ask research questions of digital heritage collections
  • a curator creating an interactive online exhibit with annotations
  • a research software engineer using a machine learning model to identify the genre of digitised texts
  • an assistant librarian attending a summer school on working with cultural heritage data
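To make the first example in the list above a little more concrete, here is a minimal Python sketch of the kind of metadata gap check a subject librarian might run. It does not use OpenRefine; it is plain Python, and the records, field names and the `missing_field_counts` function are all hypothetical, chosen only for illustration.

```python
# Hypothetical catalogue records; the field names are illustrative only
# and do not follow any particular cataloguing standard.
RECORDS = [
    {"title": "Atlas of Europe", "author": "A. Cartographer", "date": "1750", "language": ""},
    {"title": "Letters", "author": "", "date": "1812", "language": "fr"},
    {"title": "Untitled manuscript", "author": "", "date": "", "language": "la"},
]


def missing_field_counts(records, fields):
    """Count, for each field, how many records have no usable value.

    A value counts as missing if the key is absent, None, empty,
    or made up only of whitespace.
    """
    return {
        field: sum(1 for record in records if not (record.get(field) or "").strip())
        for field in fields
    }


gaps = missing_field_counts(RECORDS, ["title", "author", "date", "language"])
for field, count in gaps.items():
    print(f"{field}: {count} of {len(RECORDS)} records missing")
```

A summary like this can help decide where clean-up effort is best spent before a collection is packaged up as a dataset for researchers.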

Why is it important for library staff to learn Digital Scholarship and Data Science skills?

In the recommended reading section of this DS Topic Guide we have linked to a number of key competency and skills reports and frameworks that define the need for such skills in library work. But in the most high-level terms we know that:

  • Libraries need to keep pace with this digital turn in order to understand changing service requirements and to support colleagues, and each other, who are keen to make the most of it.
  • Digital scholarship work is collaborative and requires input across disciplines and domain expertise; our curatorial experts have an essential role to play in that.
  • We’ve so much to gain from understanding digital methods and having closer collaborations with digital scholars—there’s a synergy in solving shared issues such as correcting OCR, enriching collections metadata, conquering back-cataloguing and so on.
  • Digital scholars are already using technology in innovative ways, and expectations have changed: they’re seeking access at scale to our collections for computational analysis, and they’re using Generative AI to ask their research questions. We need to understand these technologies to understand how the nature of archival enquiry is changing.
  • Cultural heritage digital collections are only going to grow, and we need the digital skills to work confidently with them at scale.

Hands-on activity and other self-guided tutorial(s)

Each DS Topic Guide on this site includes a range of specific hands-on activities and other self-guided tutorials colleagues across Europe personally recommend. When you’re ready to go further and have a better idea of the specific skills you need for a particular task, we can recommend having a good search through these excellent platforms which host or link to a great many in-depth training materials:

AI4Culture https://ai4culture.eu/

AI4Culture is a capacity-building platform for the application of AI in the cultural heritage sector, co-funded by the European Union under the Digital Europe Programme. Its aim is to equip professionals, researchers, and enthusiasts within the sector with the resources they need to integrate AI into their daily workflows, find creative ways to use them and solve their current problems. The platform hosts a pool of readily deployed AI software tools, along with training and testing datasets that have been curated for use within the sector.

CLARIN Learning Resources https://www.clarin.eu/content/learning-and-training-resources

CLARIN stands for “Common Language Resources and Technology Infrastructure”; it is a European Research Infrastructure Consortium (ERIC). The CLARIN Learning Hub gives access to open educational resources on various topics of relevance to digital scholarship and data science, including full online training modules to learn these new skills.

DARIAH-Campus https://campus.dariah.eu/

DARIAH stands for “Digital Research Infrastructure for the Arts and Humanities”; like CLARIN, it was established as a European Research Infrastructure Consortium (ERIC). As a pan-European infrastructure it supports digital research as well as the teaching of digital research methods for arts and humanities scholars working with computational methods. Though not specific to the library professional context, tutorials here are useful for applying techniques to digital collections.

The GLAM Workbench https://glam-workbench.net/

The GLAM Workbench is the brainchild of Tim Sherratt, a historian, and is a collection of Jupyter notebooks to help you explore and use data from GLAM institutions (galleries, libraries, archives, and museums). It includes tools, tutorials, examples, hacks, and even some pre-harvested datasets. It is aimed at researchers in the humanities but has useful tutorials for anyone interested in working with GLAM data.

Ineo https://www.ineo.tools/

Ineo is a project developed and maintained by CLARIAH (“Common Lab Research Infrastructure for the Arts and Humanities”, a collaboration of CLARIN and DARIAH) that lets you search, browse, find and select digital resources for research in humanities and social sciences. It offers access to thousands of tools, datasets, workflows, standards and learning material. It is a work in progress so do keep that in mind when browsing.

Library Carpentry https://librarycarpentry.org/

Library Carpentry is an international volunteer community, under the Carpentries, focussed on building software and data skills within library and information-related communities. The lessons here are meant to be taught as workshops led by a Carpentries certified instructor (for a fee), but you may find it useful to read through the content, which is open and available to all.

The Programming Historian https://programminghistorian.org/en/

The Programming Historian has been publishing peer-reviewed tutorials on digital tools and techniques for humanists since 2008 and though they’re generally aimed at academic researchers, staff at British Library have found them highly useful over the years in their own work!

Social Sciences & Humanities Open Marketplace https://marketplace.sshopencloud.eu/search?order=score&categories=training-material

Built as part of the Social Sciences and Humanities Open Cloud project (SSHOC), the Social Sciences and Humanities Open Marketplace is a discovery portal which pools and contextualises resources for Social Sciences and Humanities research communities: tools, services, training materials, datasets, publications and workflows. The Marketplace highlights and showcases solutions and research practices for every step of the SSH research data life cycle.

Finding Communities of Practice

As you embark on learning more about digital scholarship and data science in a library context, you might want to explore and join existing communities of practice.

LIBER Working Groups

Working groups are open to staff at participating LIBER Member institutions:

International Communities/Networks

  • AI4LAM An international, participatory community focused on advancing the use of artificial intelligence in, for and by libraries, archives and museums (Free)
  • Code4Lib An international, diverse and inclusive community of developers and technologists for libraries, museums, and archives who are dedicated to sharing ideas and building collaboration (Free)
  • IIIF Community IIIF Groups meet regularly to discuss various contexts of IIIF usage for a particular idea or initiative from technical and non-technical perspectives. (Free)
  • IMPACT Centre of Competence IMPACT is a not for profit organisation with the mission to make the digitisation of text “better, faster, cheaper” and to further advance the state-of-the-art in the field of document imaging, language technology and the processing of historical text (Paid membership model)
  • International GLAM Labs Community An international group dedicated to sharing knowledge around setting up, maintaining and sustaining Galleries, Libraries, Archives and Museums’ cultural heritage innovation Labs (Free)
  • Museums Computer Group The Museums Computer Group (MCG) is a non-profit association of individuals, volunteers, who share a common interest in encouraging, improving and influencing best practice in the use of technology and digital platforms within the museum and heritage sector. (Free to join their discussion list)
  • READ Co-op/Transkribus Community A co-operative organisation, born out of two major EU projects, with the goal of revolutionising access to archival documents by providing a comprehensive range of tools and services that empower researchers, institutions, and individuals with cutting-edge technology such as Handwritten Text Recognition (Paid membership model, but also some free membership options for students and scholars)

National Networks (European)

Ireland/UK - RLUK Digital Scholarship Network

Have a suggestion for this guide? We’d love to hear it! Suggest edits by opening a new Issue or adding to the discussion on existing Issues on the project GitHub. If you’re new to GitHub, don’t worry, we have a DS Topic Guide for that: GitHub: How to navigate and contribute to Git-based projects! Or just drop us a line at digitalresearch@bl.uk!