Bang Wong is the creative director of the Broad Institute and an adjunct assistant professor in the Department of Art as Applied to Medicine at The Johns Hopkins University School of Medicine.
Nils Gehlenborg is a research associate at Harvard Medical School.
Martin Krzywinski is a staff scientist at Canada’s Michael Smith Genome Sciences Centre.
Marc Streit is an assistant professor of computer science at Johannes Kepler University Linz.
Cydney Nielsen is a research associate at the BC Cancer Research Centre.
Rikke Schmidt Kjærgaard is an assistant professor in the Interdisciplinary Nanoscience Center at Aarhus University.
Noam Shoresh is a senior computational biologist at the Broad Institute.
Erica Savig is a PhD candidate in Cancer Biology at Stanford University.
Alberto Cairo is a professor of professional practice at the School of Communication of the University of Miami.
Alexander Lex is a postdoctoral fellow in computer science at Harvard University.
Gregor McInerny is a senior research fellow in the Department of Computer Science, University of Oxford.
Barbara J. Hunnicutt is a research assistant at Oregon Health and Science University.
I don’t have good luck in the match points. —Rafael Nadal, Spanish tennis player
In many experimental designs, we need to keep in mind the possibility of confounding variables, which can bias the estimate of the treatment effect.
This bias arises when the control and experimental groups are not matched, that is, when they differ in the distribution of the confounders.
Sometimes this can be dealt with by randomization, which balances out the confounders on average. When randomization is not possible, propensity score matching is an excellent strategy for matching control and experimental groups.
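The idea can be sketched in a few lines of Python. Everything below is hypothetical: simulated data with one confounder, a plain gradient-ascent logistic fit standing in for whatever propensity model one would use in practice, and simple nearest-neighbor matching with replacement on the estimated scores.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: one confounder x drives both treatment assignment
# and outcome, so a naive comparison of group means is biased.
n = 500
x = rng.normal(size=n)
p_treat = 1 / (1 + np.exp(-1.5 * x))        # treatment more likely at high x
t = rng.binomial(1, p_treat)
y = 2.0 * x + 1.0 * t + rng.normal(size=n)  # true treatment effect = 1.0

# Step 1: estimate propensity scores P(t = 1 | x) by logistic regression
# (plain gradient ascent here; a library routine would be used in practice).
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(2000):
    p = 1 / (1 + np.exp(-X @ beta))
    beta += 0.1 * X.T @ (t - p) / n
scores = 1 / (1 + np.exp(-X @ beta))

# Step 2: match each treated unit to the control with the nearest score.
treated, controls = np.where(t == 1)[0], np.where(t == 0)[0]
matches = controls[np.abs(scores[treated][:, None] - scores[controls]).argmin(axis=1)]

naive = y[t == 1].mean() - y[t == 0].mean()  # inflated by confounding
matched = (y[treated] - y[matches]).mean()   # closer to the true effect of 1.0
print(f"naive estimate:   {naive:.2f}")
print(f"matched estimate: {matched:.2f}")
```

The naive difference in means absorbs the confounder's contribution, while the matched estimate compares treated units only to controls with similar treatment propensity and lands much nearer the true effect.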
Kurz, C.F., Krzywinski, M. & Altman, N. (2024) Points of significance: Propensity score matching. Nat. Methods 21:1770–1772.
We'd like to say a ‘cosmic hello’: mathematics, culture, palaeontology, art and science, and ... human genomes.
All animals are equal, but some animals are more equal than others. —George Orwell
This month, we illustrate the importance of establishing a baseline performance level.
Baselines are typically generated independently for each dataset using very simple models. Their role is to set the minimum acceptable level of performance and to help compare the relative performance improvements of other models.
Unfortunately, baselines are often overlooked and, in the presence of a class imbalance, must be established with care.
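A small hypothetical example shows why imbalance demands care. With 95% negatives, a baseline that always predicts the majority class scores an impressive-looking accuracy while learning nothing; the labels and proportions below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical imbalanced labels: about 5% positives, 95% negatives.
y_true = (rng.random(1000) < 0.05).astype(int)

# Majority-class baseline: always predict the most frequent class (0).
y_base = np.zeros_like(y_true)

accuracy = (y_base == y_true).mean()

# Balanced accuracy averages per-class recall and exposes the weakness.
recall_pos = y_base[y_true == 1].mean()         # 0.0, never finds a positive
recall_neg = (y_base[y_true == 0] == 0).mean()  # 1.0
balanced = (recall_pos + recall_neg) / 2

print(f"accuracy:          {accuracy:.2f}")  # ~0.95, deceptively strong
print(f"balanced accuracy: {balanced:.2f}")  # 0.50, chance level
```

Any classifier claiming, say, 96% accuracy on such data must be judged against the 95% baseline, not against zero, which is the point of establishing the baseline first.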
Megahed, F.M., Chen, Y.-J., Jones-Farmer, A., Rigdon, S.E., Krzywinski, M. & Altman, N. (2024) Points of significance: Comparing classifier performance with baselines. Nat. Methods 21:546–548.
Celebrate π Day (March 14th) and dig into the digit garden. Let's grow something.