Crowd Science Seminars bring together researchers and practitioners to share the latest developments in crowdsourcing and spark ideas for new experiments. We asked our speakers to share handy insights from their research that can help fellow enthusiasts. Here's what we've got!
A unique group of academic researchers from different fields, industry experts, and crowd performers from Toloka recently came together at the VLDB 2021 workshop to discuss the factors that affect data labeling. No time to watch the recordings? Check out our full transcript of the discussion.
This year, Toloka led a Crowd Science Workshop at VLDB 2021 in Copenhagen, Denmark. As part of the workshop, the Toloka Research team organized a contest to determine the best methods for aggregating crowdsourced texts. The Crowd Science Challenge offered $6,000 in cash prizes: $3,000 for first place, $2,000 for the runner-up, and $1,000 for third place. Most importantly, this exciting contest helped us identify the strongest approaches to aggregating crowdsourced texts.
ANNOTATING BREAST CANCER TISSUE BY MEDICAL STUDENTS
by Mohamed Amgad, Northwestern University
High-resolution mapping of cells and tissue structures provides a foundation for developing interpretable machine-learning models for computational pathology. Deep learning algorithms can produce accurate mappings given large numbers of labeled instances for training and validation. Generating an adequate volume of quality labels has emerged as a critical barrier in computational pathology, given the time and effort required from pathologists. Mohamed Amgad describes an approach for engaging crowds of medical students and pathologists that was used to produce a dataset of over 220,000 annotations of cell nuclei in breast cancers.
Feet-on-street crowdsourcing, also known as spatial crowdsourcing, involves performing tasks at real-world locations. It therefore requires non-standard task assignment designs and new approaches to quality control. Evgenii Konovalov, Head of the Spatial Service at Yandex, shares guidelines for building a successful spatial crowdsourcing project that account for the specifics of offline tasks.
We introduce a checklist to aid authors in being thorough and systematic when describing the design and operationalization of their crowdsourcing experiments. The checklist also aims to help readers navigate and understand the underlying details behind crowdsourcing studies. By providing a checklist and depicting where the research community stands in terms of reporting practices, we expect our work to stimulate additional efforts to move the transparency agenda forward, facilitating a better assessment of the validity and reproducibility of experiments in crowdsourcing research.
The NLP team at the Skolkovo Institute of Science and Technology trained a neural network to warn dialogue agents against discussing "sensitive" topics in an unsafe manner. Nikolay Babakov walked us through all the research steps at our seminar. Check out the video inside!
Let's stay in touch!
Sign up for our mailing list to receive biweekly updates about seminars, new challenges, and projects.