VLDB 2021 CROWD SCIENCE WORKSHOP
Experts in the field met to present their work and discuss its future!

August 20. Copenhagen. Hybrid mode

Crowdsourcing has become a standard tool for obtaining large amounts of carefully annotated data. Its use now extends beyond the classical purpose of collecting data to train AI algorithms: crowdsourcing is often used to conduct research studies and to gain valuable insights from the public.
The focus of this workshop is on the best practices of efficient and trustworthy crowdsourcing.
Workshop outcomes
VLDB 2021 Crowd Science Challenge on Aggregating Crowdsourced Audio Transcriptions. In: Proceedings of the 2nd Crowd Science Workshop: Trust, Ethics, and Excellence in Crowdsourced Data Management at Scale. Copenhagen, Denmark, 2021, pp. 1–7.
Proceedings of the 2nd Crowd Science Workshop: Trust, Ethics, and Excellence in Crowdsourced Data Management at Scale, co-located with the 47th International Conference on Very Large Data Bases
Invited speakers
Stanford University
Data Science Nigeria
Delft University of Technology
Workshop Schedule
Intro & Shared Task
10:00 - 11:10
Session 1
11:30 - 13:15
Session 2
14:00 - 15:10
Session 3
15:30 - 17:45
Panel discussion 15:30 - 16:45
moderator: Olga Megorskaya
Eight participants from various universities and companies meet to share their views on the future of crowdsourcing:
Jie Yang
Assistant Professor at Delft University of Technology
Marco Brambilla
Head of Data Science Lab at DEIB Department of Politecnico di Milano
Mohamed Amgad
Visiting Predoctoral Fellow, Pathology, Northwestern University
Jorge Ramirez
Ph.D. candidate, Dept. of Information Engineering and Computer Science, University of Trento
Zack Lipton
BP Junior Chair Assistant Professor of Operations Research and Machine Learning, Carnegie Mellon University
Novi Listyaningrum
Student at Institut Kesenian Jakarta, Toloka Performer
Konstantin Kashkarov
Freelancer, Toloka Performer
Grace Abuhamad
Applied Research Scientist, Trustworthy AI, ServiceNow
Focus areas
We identify three key areas to talk about:
Large-Scale Data Excellence
Data is the crux of crowdsourcing. On the one hand, requesters want to annotate their data without compromising its privacy. On the other hand, workers produce large amounts of data that platforms can use to improve their services. Data-management practices are at the heart of our workshop, and we welcome submissions in this area.
Trust and Ethics
Crowdsourcing is a multi-agent system in which hundreds of requesters interact with thousands of workers through the interface of the platform. To ensure that the system progresses in a fair and efficient manner, it is critical that all parties trust each other and adhere to certain ethical standards. We hope to discuss these standards and critically analyze the current state of affairs in our workshop.
Crowd-AI Interplay
Crowdsourcing is an important source of data for AI algorithms. But do we understand how the crowd impacts the development of AI? Human decision-making is known to be susceptible to various problems, including noise, bias, subjectivity, and miscalibration. These issues can adversely affect the properties of downstream AI methods that rely on crowdsourced data. Understanding and preventing such undesirable artifacts is a key goal for the whole crowdsourcing community, and we will focus on this problem in our workshop.
Organizers
Daria Baidakova
Toloka
Fabio Casati
ServiceNow
Alexey Drutsa
Toloka
Nikita Pavlichenko
Toloka
Ivan Stelmakh
Carnegie Mellon University
Dmitry Ustalov
Toloka
Key dates
June 1. Submission portal opens
July 5. Submission deadline
July 19. Decisions are sent to authors
August 20. Meet us in Copenhagen or online
TL;DR
We welcome all submissions in the broad area of crowdsourcing: research papers that present new results, vision papers that discuss the future of the area, industry papers that describe an interesting use case, and any other papers in the area.