data visualization + art
The COVID Charts are case studies of data visualization and science communication of the coronavirus outbreak. I fix the inaccurate, the sloppy and the illegible.

BD Genomics stereoscopic art exhibit — AGBT 2017

Art is science in love.
— E.F. Weisslitz

BD Genomics 3D art exhibit - AGBT 2017 / Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca
Our art exhibit at AGBT 2017 asked new school questions in old school ways.

1 · The art of storytelling in science

Instead of 'explain, not merely show,' seek to 'narrate, not merely explain.' Krzywinski M & Cairo A (2013) Points of View: Storytelling. Nat. Methods 10:687.

Science cannot move forward without storytelling. While we learn about the world and its patterns through science, it is through stories that we can organize and sort through the observations and conclusions that drive the generation of scientific hypotheses.

With Alberto Cairo, I've written about the importance of storytelling as a tool to explain and narrate in Storytelling (2013) Nat. Methods 10:687. There we suggest that instead of "explain, not merely show," you should seek to "narrate, not merely explain."

Our account received support (Should scientists tell stories. (2013) Nat. Methods 10:1037), but not from everyone (Against storytelling of scientific results. (2013) Nat. Methods 10:1045).

A good science story must present facts and conclusions within a hierarchy—a bag of unsorted observations isn't likely to engage your readers. But while a story must always inform, it should also delight (as much as possible), and inspire. It should make the complexity of the problem accessible—or, at least, approachable—without simplifications that preclude insight into how concepts connect (they always do).

2 · The story of making science stories

Just like science, explaining science is a process—one that can be more vexing than the science itself!

In science one tries to tell people, in such a way as to be understood by everyone, something that no one ever knew before. But in poetry, it’s the exact opposite.
—Paul Dirac, Mathematical Circles Adieu by H. Eves [quoted]

I have previously written about the process of taking a scientific statement (Creating Scientific American Graphic Science graphics) and turning it into a data visualization or, more broadly, visual story.

December 2015. Composition of bacteria in household dust.
June 2015. Relationship between genes and traits.
September 2014. Similarity of human, Denisovan, chimp, bonobo, and gorilla genomes.

The process of creating one of these visual stories is itself a story: about how the genome is not a blueprint, about the discovery of Hilbertonians (creatures that live on the Hilbert curve), about how algorithms for protein folding can be used to generate art based on the digits of π, or about how we can make human genome art by humans with genomes. I've also written about my design process in creating the cover for Genome Research and the cover of PNAS. As always, not everything works out all the time; read about the EMBO Journal covers that never made it.

Cover image accompanying our article on mouse vasculature development. Biology turns astrophysical. PNAS 1 May 2012; 109 (18)
Cover image accompanying Spark: A navigational paradigm for genomic data exploration. Genome Research 22 (11).
Pi Day 2014 poster | 132 paths with E=-23 of 64 digits of Pi, sorted by aspect ratio.

Here, I'd like to walk you through the process and sketches of creating a story based on the idea of differences in data and how the story can be used to understand the function of cells and disease.

3 · The difference is in the differences

The visual story is a creative collaboration with Becton Dickinson and The Linus Group, and its creation began with the concept of differences. The art was on display at the AGBT 2017 conference and accompanies BD's launch of the Resolve platform and "Difference of One in Genomics".

Starting with the idea of the "difference of one", our goal was to create artistic representations of single-cell transcriptome data sets generated using the BD Resolve platform. We chose data sets that captured a variety of differences relevant to genomics research.

The data art pieces were installed in a gallery style, with data visualization and artistic expression in equal parts.

The art itself is an old school take on virtual reality. Unlike modern VR, which isolates the participants from one another, we chose a low-tech route that not only brings the audience closer to the data but also to each other.

4 · Data in the art

The data were generated using the BD Resolve single-cell transcriptomics platform. For each of the three art pieces, we identified a data set that captured a variety of differences.

  1. disease onset—how does gene expression in tumor cells differ from normal cells?
  2. disease progression—as a tumor grows and spreads, how does expression change?
  3. background variation—how does gene expression change between normal cells that perform a different function?

The real surprise and insight are in the differences that ultimately advance our thinking (Data visualization: ambiguity as a fellow traveller. (2013) Nat. Methods 10:613-615).

Figuring out which differences are of this kind requires that instead of "What's new?" we ask "What's different?"

news + thoughts

Propensity score matching

Mon 16-09-2024

I don’t have good luck in the match points. —Rafael Nadal, Spanish tennis player

In many experimental designs, we need to keep in mind the possibility of confounding variables, which may give rise to bias in the estimate of the treatment effect.

Martin Krzywinski @MKrzywinski mkweb.bcgsc.ca
Nature Methods Points of Significance column: Propensity score matching. (read)

If the control and experimental groups aren't matched (or, roughly, similar enough), this bias can arise.

Sometimes this can be dealt with by randomizing, which on average can balance this effect out. When randomization is not possible, propensity score matching is an excellent strategy to match control and experimental groups.

Kurz, C.F., Krzywinski, M. & Altman, N. (2024) Points of significance: Propensity score matching. Nat. Methods 21:1770–1772.
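
As a rough illustration of the matching step described above, here is a minimal sketch of greedy 1:1 nearest-neighbor matching on propensity scores, without replacement. It assumes the scores have already been estimated (for example, by a logistic regression of treatment on the confounders). The function name `match_nearest`, the `caliper` parameter and the example scores are hypothetical and are not taken from the column.

```python
def match_nearest(treated, control, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching on propensity score.

    treated, control: dicts mapping a unit id to its (already estimated)
    propensity score. Each control is used at most once. A treated unit
    is left unmatched if its closest control is farther than `caliper`.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls that have not been used yet
    pairs = []
    # Visit treated units in order of score so the result is deterministic.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical propensity scores, for illustration only.
treated = {"t1": 0.62, "t2": 0.35, "t3": 0.90}
control = {"c1": 0.30, "c2": 0.58, "c3": 0.10, "c4": 0.45}

pairs = match_nearest(treated, control, caliper=0.1)
print(pairs)  # t3 has no control within the caliper and stays unmatched
```

Greedy matching is only one of several strategies. The caliper trades sample size against match quality: a tighter caliper discards more treated units but leaves the matched groups more similar.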

NASA to send our human genome discs to the Moon

Sat 23-03-2024

We'd like to say a ‘cosmic hello’: mathematics, culture, palaeontology, art and science, and ... human genomes.

SANCTUARY PROJECT | A cosmic hello of art, science, and genomes. (details)
SANCTUARY PROJECT | Benoit Faiveley, founder of the Sanctuary project gives the Sanctuary disc a visual check at CEA LeQ Grenoble (image: Vincent Thomas). (details)
SANCTUARY PROJECT | Sanctuary team examines the Life disc at INRIA Paris Saclay (image: Benedict Redgrove) (details)

Comparing classifier performance with baselines

Fri 22-03-2024

All animals are equal, but some animals are more equal than others. —George Orwell

This month, we will illustrate the importance of establishing a baseline performance level.

Baselines are typically generated independently for each dataset using very simple models. Their role is to set the minimum level of acceptable performance and help with comparing relative improvements in performance of other models.

Nature Methods Points of Significance column: Comparing classifier performance with baselines. (read)

Unfortunately, baselines are often overlooked and, in the presence of a class imbalance, must be established with care.

Megahed, F.M., Chen, Y-J., Jones-Farmer, A., Rigdon, S.E., Krzywinski, M. & Altman, N. (2024) Points of significance: Comparing classifier performance with baselines. Nat. Methods 21:546–548.
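
To make the point about class imbalance concrete, here is a minimal sketch comparing a hypothetical classifier against the simplest possible baseline, one that always predicts the most common class. The data, the "model" predictions and the function names are invented for illustration and do not come from the column.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of a baseline that always predicts the most common class."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; a constant predictor scores 0.5 on two classes."""
    recalls = []
    for cls in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced data set: 95 negatives followed by 5 positives.
y_true = [0] * 95 + [1] * 5

# Baseline: always predict the majority (negative) class.
always_negative = [0] * 100

# Hypothetical model: 4 false positives, and 3 of the 5 positives found.
model = [0] * 91 + [1] * 4 + [1] * 3 + [0] * 2

print(majority_baseline_accuracy(y_true))          # baseline accuracy
print(accuracy(y_true, model))                     # model accuracy
print(balanced_accuracy(y_true, always_negative))  # baseline, balanced
print(balanced_accuracy(y_true, model))            # model, balanced
```

In this made-up example the model's plain accuracy (0.94) falls just below the do-nothing baseline (0.95), even though it finds most of the positives. A metric that accounts for imbalance, such as balanced accuracy, separates the two clearly.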

Happy 2024 π Day—
sunflowers ho!

Sat 09-03-2024

Celebrate π Day (March 14th) and dig into the digit garden. Let's grow something.

2024 π DAY | A garden of 1,000 digits of π. (details)

How Analyzing Cosmic Nothing Might Explain Everything

Thu 18-01-2024

Huge empty areas of the universe called voids could help solve the greatest mysteries in the cosmos.

My graphic accompanying How Analyzing Cosmic Nothing Might Explain Everything in the January 2024 issue of Scientific American depicts the entire Universe in a two-page spread — full of nothing.

How Analyzing Cosmic Nothing Might Explain Everything. Text by Michael Lemonick (editor), art direction by Jen Christiansen (Senior Graphics Editor), source: SDSS

The graphic uses the latest data from SDSS Data Release 12 and is an update to my Superclusters and Voids poster.

Michael Lemonick (editor) explains on the graphic:

“Regions of relatively empty space called cosmic voids are everywhere in the universe, and scientists believe studying their size, shape and spread across the cosmos could help them understand dark matter, dark energy and other big mysteries.

To use voids in this way, astronomers must map these regions in detail—a project that is just beginning.

Shown here are voids discovered by the Sloan Digital Sky Survey (SDSS), along with a selection of 16 previously named voids. Scientists expect voids to be evenly distributed throughout space—the lack of voids in some regions on the globe simply reflects SDSS’s sky coverage.”

voids

Sofia Contarini, Alice Pisani, Nico Hamaus, Federico Marulli, Lauro Moscardini & Marco Baldi (2023) Cosmological Constraints from the BOSS DR12 Void Size Function. Astrophysical Journal 953:46.

Nico Hamaus, Alice Pisani, Jin-Ah Choi, Guilhem Lavaux, Benjamin D. Wandelt & Jochen Weller (2020) Journal of Cosmology and Astroparticle Physics 2020:023.

Sloan Digital Sky Survey Data Release 12

constellation figures

Alan MacRobert (Sky & Telescope), Paulina Rowicka/Martin Krzywinski (revisions & Microscopium)

stars

Hoffleit & Warren Jr. (1991) The Bright Star Catalog, 5th Revised Edition (Preliminary Version).

cosmology

H0 = 67.4 km/(Mpc·s), Ωm = 0.315, Ωv = 0.685. Planck collaboration Planck 2018 results. VI. Cosmological parameters (2018).

Error in predictor variables

Tue 02-01-2024

It is the mark of an educated mind to rest satisfied with the degree of precision that the nature of the subject admits and not to seek exactness where only an approximation is possible. —Aristotle

In regression, the predictors are (typically) assumed to have known values that are measured without error.

Practically, however, predictors are often measured with error. This has a profound (but predictable) effect on the estimates of relationships among variables – the so-called “error in variables” problem.

Nature Methods Points of Significance column: Error in predictor variables. (read)

Error in measuring the predictors is often ignored. In this column, we discuss when ignoring this error is harmless and when it can lead to large bias that can lead us to miss important effects.

Altman, N. & Krzywinski, M. (2024) Points of significance: Error in predictor variables. Nat. Methods 21:4–6.
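
The effect of predictor error on a regression slope can be seen in a small simulation. This is a sketch under the classical additive error model, an assumption made here for illustration: the observed predictor is the true predictor plus independent noise. In that setting the fitted slope is expected to shrink by the factor var(x) / (var(x) + var(error)). All names and parameter values below are hypothetical.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, true_slope=1.0, sd_x=1.0, sd_noise=0.5,
             sd_error=1.0, seed=1):
    """Fit the slope once with the true predictor and once with a noisy copy."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0, sd_x) for _ in range(n)]
    # The response depends on the true predictor, not the measured one.
    y = [true_slope * x + rng.gauss(0, sd_noise) for x in x_true]
    # What we actually observe: the predictor plus measurement error.
    x_obs = [x + rng.gauss(0, sd_error) for x in x_true]
    return ols_slope(x_true, y), ols_slope(x_obs, y)


slope_clean, slope_noisy = simulate()
print(f"slope with error-free predictor: {slope_clean:.2f}")
print(f"slope with noisy predictor:      {slope_noisy:.2f}")
# With sd_x = sd_error = 1, the expected shrinkage factor is 1 / (1 + 1) = 0.5.
```

With equal variance in the predictor and in its measurement error, the estimated slope is roughly half the true slope, which is the kind of bias that can hide a real effect.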

Background reading

Altman, N. & Krzywinski, M. (2015) Points of significance: Simple linear regression. Nat. Methods 12:999–1000.

Lever, J., Krzywinski, M. & Altman, N. (2016) Points of significance: Logistic regression. Nat. Methods 13:541–542.

Das, K., Krzywinski, M. & Altman, N. (2019) Points of significance: Quantile regression. Nat. Methods 16:451–452.

Martin Krzywinski | contact | Canada's Michael Smith Genome Sciences Centre | BC Cancer Research Center | BC Cancer | PHSA