Wednesday, January 28, 2015

Using tools to analyze my uBiome results

To study my uBiome samples more carefully, I’ve written a few tools that you may find helpful too. You’ll need to know a little about the statistical programming language R — just enough to load my scripts and enter a few commands at the prompts. The nice part is that you can easily read the results in Excel, letting you manipulate, sort, or graph the results to your heart’s content. Here’s an example comparing my uBiome samples taken in May and October.

These are the species that were found in both samples, and the normalized counts showing the relative increase for the second one:

Bacteroides plebeius
Bifidobacterium animalis
bacterium NLAE-zl-P430
[Ruminococcus] obeum
Lactobacillus rogosae
Blautia faecis
Clostridium baratii
Corynebacterium freneyi
Coprococcus catus
bacterium NLAE-zl-H54
Bacteroides salyersiae

Here are the species that were still in the October sample, but at a reduced count:
Roseburia sp. 11SE38
Barnesiella intestinihominis
Parasutterella excrementihominis
Coprococcus sp. DJF_CR49
Bacteroides massiliensis
Bacteroides uniformis
Clostridium clostridioforme
Odoribacter laneus
Faecalibacterium prausnitzii

I took several tablespoons daily of potato starch during the week before the October test, so those species were probably affected most.

Next, here’s a look at the gut flora that went extinct between May and October:
original count_norm    Missing
Bifidobacterium tsurumiense
Subdoligranulum variabile
Dialister sp. oral clone BS095
Desulfovibrio sp. oral clone BB161
Adlercreutzia equolifaciens
Ruminococcus sp. ID1
Clostridiales bacterium 60-7e
Tannerella sp. 6_1_58FAA_CT1
[Clostridium] spiroforme
Clostridium disporicum
Lactobacillus paracasei

and some new ones that were not there originally but showed up in October:

Count    New species
butyrate-producing bacterium A1-86
Clostridium chartatabidum
Bifidobacterium adolescentis
Ruminococcus bromii
Eubacterium siraeum
bacterium NLAE-zl-H436
Parabacteroides distasonis

The above examples were all done at the species taxonomic rank, but the tools let you look at other ranks, such as genus or phylum, just as easily.
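If you'd rather see the idea than the tools, the four lists above come down to a few set operations on two tables of normalized counts. Here's a minimal sketch in Python. It is not the R code from GitHub, and the function name and the numbers are made up purely for illustration; only the species names come from the lists above.

```python
def compare_samples(before, after):
    """Classify taxa by how their normalized count changed between two samples.

    `before` and `after` map tax_name -> count_norm, all at the same
    taxonomic rank (for example, species).
    """
    shared = before.keys() & after.keys()
    return {
        "increased": sorted(t for t in shared if after[t] > before[t]),
        "decreased": sorted(t for t in shared if after[t] < before[t]),
        "missing": sorted(before.keys() - after.keys()),  # gone in the later sample
        "new": sorted(after.keys() - before.keys()),      # only in the later sample
    }


# Hypothetical count_norm values, for illustration only (not my real data).
may = {
    "Bacteroides plebeius": 120,
    "Bacteroides uniformis": 900,
    "Lactobacillus paracasei": 15,
}
october = {
    "Bacteroides plebeius": 340,
    "Bacteroides uniformis": 410,
    "Ruminococcus bromii": 77,
}

result = compare_samples(may, october)
for label, names in result.items():
    print(label, names)
```

Because both inputs are keyed by tax_name, the same function should work at any rank, as long as you don't mix ranks in one table.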

The R source code, along with the data I used for this analysis, is available here on GitHub.

(Huge thanks to Dr. Grace Liu and her Animal Pharm web site for help understanding what this stuff means).

Sunday, January 18, 2015

How to analyze a uBiome sample in Excel

Although I’ve done three samples at this point, I’m by no means a uBiome expert, but hopefully the following guide will be useful to others who are trying to understand their own results better.
After your uBiome sample has been submitted and processed, you will be given access to a private web page with some basic web tools to help you understand your results, but the tools are still under development and you may find them hard to interpret. You also can’t do A/B comparisons of your sample to others, or to your own samples over time, so at some point you’ll need the raw data, which fortunately is easy to get. Log in to your uBiome account, then browse your way to a screen where you’ll see this on the left side:
UBiome Explore your Microbiome
When you select “Raw Taxonomy”, you’ll go to a page with a bunch of apparently garbled noise like this:
uBiome Results
That’s the raw data. Now go to my uBiome site on GitHub for more instructions and tools to turn that mess into a CSV file, which you can open in Excel.  If that’s too much trouble, an alternative is to simply select-all the information on that page, and then copy/paste it into
Once you open that CSV file in Excel, you are home free, thanks to its many easy built-in analysis tools. For example, here are the first few rows of my June 2014 sample:
Screenshot uBiome Excel CSV
Next you need to understand a few column labels:
  • count: the actual number of organisms found in the sample.
  • count_norm: a “normalized” version of the count, which you can think of as a percentage. It appears to be a number that uBiome assigned based on the data they’ve accumulated from other people over the years. I don’t know for sure, but it doesn’t seem to be an industry-standard benchmark. Anyway, you can think of each unit as 1/10,000th of a percent, so in the sample above my level of Firmicutes would be 62.2877%.
  • tax_name: this is the classification of the organism, based on the level of its taxonomy. Generally, we think of an organism in terms of species (e.g. Homo sapiens), but each organism belongs to other, bigger clusters. For example, humans are members of the class Mammalia, along with tigers and horses. If this spreadsheet were counting organisms at the level of class Mammalia, the count_norm would almost certainly be bigger than the count_norm for humans alone, unless humans were the only type of mammal found in the sample. Make sense?
  • tax_rank: tells which level of taxonomy is represented by this row. You’ll need to know a tiny bit of biology to understand this, which I’ll explain more below, but for now just note that some of the counts will appear duplicated unless you take this into account.
  • taxon and parent: these help identify the ranking in a more precise way by pointing out which tax_ranks are subsets of which. For example, Bacteroidia above has a parent = 976, meaning that it is a subset of the taxon 976, Bacteroidetes. When you follow the various taxons and parents up the chain, you’ll see they all end in the superkingdom Bacteria, which has a taxon of 2. I have no idea why they assign the parent of Bacteria the number 131567, but don’t worry: it doesn’t matter.
  • tax_color doesn’t matter for this analysis, but it appears to be how the uBiome software colorizes their pretty graphs to make them more readable.
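Two of those columns are easier to see in code than in words. The sketch below (Python, standard library only) converts a count_norm value into a percentage and follows the parent links up the chain. It assumes count_norm behaves like parts per million, which is just my reading of the numbers. The little table is hypothetical: the ids 976, 2, and 131567 come from the description above, while the id I gave Bacteroidia and the direct link from 976 to 2 are assumptions for illustration.

```python
def count_norm_to_percent(count_norm):
    """Convert count_norm (treated here as parts per million) to a percentage."""
    return count_norm / 10_000


def lineage(taxon, rows):
    """Follow parent links upward from `taxon` and return the tax_names found.

    `rows` maps taxon id -> (tax_name, parent id). The walk stops as soon as a
    parent is not in the table, which is what happens at the parent of Bacteria.
    """
    names = []
    seen = set()  # guards against an accidental loop in the data
    while taxon in rows and taxon not in seen:
        seen.add(taxon)
        name, parent = rows[taxon]
        names.append(name)
        taxon = parent
    return names


# Hypothetical table: taxon id -> (tax_name, parent id).
TAXA = {
    200643: ("Bacteroidia", 976),  # id assumed for illustration
    976: ("Bacteroidetes", 2),     # direct link to Bacteria assumed
    2: ("Bacteria", 131567),       # 131567 is not in the table, so the walk ends here
}

print(count_norm_to_percent(622877))
print(lineage(200643, TAXA))
```

The first function is only a division by 10,000, which is why a count_norm of 622877 reads as 62.2877%.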
Now for the biology. The science of taxonomy (worked out first by Carl Linnaeus in the 1700s) divides all life into seven major categories (ranks): Kingdom, Phylum, Class, Order, Family, Genus, Species (which I was taught in sixth grade to remember by the mnemonic “King Philip Came Over for Girl Scouts”). When you analyze a uBiome sample, you cannot mix and match these ranks. If you talk about species, for example, you must compare the counts to other species, never to a different taxonomic rank.
With that out of the way, the next thing is to tell Excel to apply a filter to the whole sheet. In Mac Excel, I just select the filter icon like this:
Screenshot Excel filter
The first row is transformed into a nifty filtering device. Note how the right side of each cell has a little upside down triangle. Select that and a new pop-up menu will appear that lets you sort and filter the column however you like.  Here’s what it looks like when I look just at the tax_rank = phylum, and sort the count_norm in descending order (I also hid the avg column to simplify the image):
Screenshot Excel Phylum
The count_norm column now corresponds exactly to the percentage breakdown in the fancy charts on the uBiome site!
Screenshot uBiome chart
Here’s the Excel donut chart I made of the above sample:
Screenshot Excel Chart
See? Same info!
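If you'd rather script the filter-and-sort step than click through Excel, here's a rough equivalent in Python using only the standard csv module. The inline CSV stands in for the exported file; apart from the Firmicutes figure mentioned earlier, the rows and numbers are hypothetical, and the function name is my own.

```python
import csv
import io

# A tiny stand-in for the exported CSV. Rows and numbers are hypothetical,
# except that 622877 matches the Firmicutes percentage quoted earlier.
SAMPLE_CSV = """tax_name,tax_rank,count_norm
Bacteria,superkingdom,1000000
Firmicutes,phylum,622877
Bacteroidetes,phylum,300000
Bacteroidia,class,295000
Actinobacteria,phylum,60000
"""


def rows_at_rank(csv_text, rank):
    """Return the rows whose tax_rank equals `rank`, largest count_norm first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [row for row in reader if row["tax_rank"] == rank]
    return sorted(rows, key=lambda row: int(row["count_norm"]), reverse=True)


phyla = rows_at_rank(SAMPLE_CSV, "phylum")
for row in phyla:
    percent = int(row["count_norm"]) / 10_000
    print(f'{row["tax_name"]}: {percent:.2f}%')
```

Filtering on one rank at a time is the important part: it keeps the class and superkingdom rows from being counted alongside the phyla.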
But now that it’s in Excel, you can do much easier analysis than what you get currently on the uBiome site. For example, try selecting tax_rank = species and then sort tax_name alphabetically.  Now I can look alphabetically at all the different Bifido species found in my sample:
Screenshot Excel bifido
There’s much more to say about uBiome, which I’ll cover in a future post.  (Huge shout-out to Dr. Grace Liu and her Animal Pharm web site for the tons of information I’m just beginning to pore through. Highly recommended.)

Tuesday, January 13, 2015

[book] How We Learn

Who doesn’t want to be better at learning? Learn faster, more efficiently, with better recall…that’s this book, by Benedict Carey, a journalist who has long been tracking the science of learning. The whole book is worth reading, but here are a few of my takeaways:

  • Use the spacing effect, aka “distributed learning”. There is a whole science of how much and how often you should repeat something in order to seal it in memory. Not too often, just enough space between practice, not too seldom. Nowadays you can get software to help with this.  Good idea.
  • Testing is another form of learning, so do it on yourself all the time. Spend about 1/3 of your time studying and 2/3 testing yourself (flashcards, re-writing your thoughts, etc.)
  • Distraction can actually be good, but only if you’ve focused long enough on the task to feel stumped. When that happens, take a break and do something else for a while.
  • Systematically alter your practice: change the place where you study, change the background noises or music. If you study the same material in two different places, you’ll learn the material better than if you studied it in the same, familiar location.
  • Sleep is free learning: never go to bed without a problem to review in your mind. Think of it as “learning with your eyes closed”.

There is much more to absorb from this book, and I’m sure I’ll continue coming back to the concepts regularly.


Monday, January 12, 2015

My uBiome third sample

Eventually it will be possible, perhaps routine, to get regular real-time updates on what’s happening among the bacteria crawling all over our insides, but for now the best a normal person like me can do is continue sending samples to the uBiome testing service. Here’s an overdue look at my latest report:

Sprague uBiome Results

 (I discussed my previous uBiome results a few months ago)

Unfortunately, I don’t see much of interest. That last sample was taken a few months after the first two. I can’t point to anything unusual that might have happened between the tests. Nothing notable about my diet (I’m a fairly normal omnivore). No antibiotics, no bouts of sickness, no unusual travel or other changes.

Three samples are not enough to draw any conclusions, especially since they were taken in informal, non-scientific circumstances (i.e. my house, at whatever random time I feel like). You can read these however you like, but it’s interesting that the first and third samples are most similar on the two phyla of microbes that seem most common in modern America. The level of Firmicutes, in particular, is consistent with somebody like me who is of normal weight and health. There is speculation that Firmicutes (pronounced fir MIK kyoo teez, by the way) play a role in fat absorption; too much or too little may play a role in obesity.

At 6+%, the actinobacteria in the latest sample is much higher than normal. Since those strains are more common in other parts of the body (e.g. nose, genitals), I’m wondering if there’s a little contamination going on.

I’ve got another uBiome kit waiting to be used, so stay tuned for more updates.


Saturday, January 10, 2015

Starbucks Reserve Roastery and Tasting Room

It’s as impressive as you’d expect: a full-blown Probat roaster right out in the open and a long line of Clovers at the main bar.

As a tourist attraction it doesn’t disappoint: even in a city known for gourmet coffee shops, you won’t find a better example of showmanship and art. They obviously spent a great deal of time and money planning all the details. But it also means long lines (you’ll wait at least 20 minutes to get a coffee) and impossible parking. (Although I noticed they offer valet parking if you want to go that way).
I tried the Ethiopian Konga, which, as I understand it, is another name for Yirgacheffe coffee, which became my favorite after trying it at Terroir in Boston a few years ago. Yes, I asked for it Clover-brewed rather than a hand-pour.
Starbucks Reserve

Saturday, October 25, 2014

[book] The Rise of Superman

The Rise of Superman: Decoding the Science of Ultimate Human Performance by Steven Kotler is mostly about extreme sports and the people behind really crazy activities like kayaking off a 56.7 meter waterfall or snowboarding off impossibly high cliffs. If that part interests you, then you’ll hear insider accounts of the various legends you already know about. I’m not an extreme sport-watcher, but I came away with new respect for the people who do that stuff: they’re the modern day equivalents of great explorers past, like Magellan or Pizarro.

But the interesting part to me was the discussion of “flow”, the mental state achieved by these people and by anyone working at peak performance.  Flow, also known as “being in the zone” or in religious contexts something like satori or enlightenment, is a place where every ounce of your being is fully alive, where you are “acting on all cylinders” and being the best you can be. Kotler dissects this state with scientists like Keith Sawyer and many others, including neuroscientists who study the phenomenon and divide it into these stages:

  • Struggle: trying to amp up and get a handle on the problem, focusing with all your might. Very nerve-wracking here.
  • Release: the ‘aha’ moment
  • Zone: now you’re in pure perfection
  • Recovery: consolidate memories 

And these neurochemicals:

  • dopamine (pleasure producer like cocaine)
  • norepinephrine (like speed)
  • endorphins (opiates more powerful than morphine)
  • anandamide (“bliss”, inhibits ability to feel fear)
  • serotonin (helps cope with distress)
Flow is about focus and concentration, and it happens in groups too. Here are some of the key characteristics:
  • serious concentration
  • shared, clear goals
  • good communication (immediate feedback)
  • equal participation
  • element of risk
  • familiarity: the group has a common knowledge base
  • blending egos
  • sense of control
  • close listening
  • always say yes
There’s obviously much more to say about Flow, but I found many of the lessons were buried in anecdotes about extreme heroes, which, if that’s your thing, will be more interesting to you than they were to me. Still, I definitely want to learn more, especially about some of the Quantified Self devices mentioned, like BrainSport from SenseLabs (formerly Neurotopia) and of course the Flow Genome Project.

Interestingly, Kotler is also co-author, with Peter Diamandis of Singularity University and the X Prize, of Abundance: The Future Is Better Than You Think, a book I’ll have to add to my reading list.

Friday, October 17, 2014

Comparing uBiome data through time

Like I said previously, the data (and tools) at the uBiome site are fantastic, and it’s taking me a long time to understand what meaningful conclusions I can make. But first, one basic question I had is how stable the results are. If the microbiota changes all the time, then maybe you can’t really conclude much unless you track long-term trends.

In an excellent study published over the summer, Lawrence David at Duke University followed two subjects, measuring their microbiota and zillions of other variables every day for a year. His team concluded that although illness or travel can dramatically change microbiota composition in a single day, and different foods cause levels to fluctuate by up to 15% per day, generally things stay pretty stable.

So what happens if I send two samples, taken a few weeks apart, to uBiome? That’s what I tried, keeping close track of exactly what I ate and did in the meantime.

Here are the results for the first sample (the one I posted before):


Here’s the newer sample, taken three weeks later:


Hmmm. As you can see, these results seem quite different – far more than the 15% daily fluctuation from the David study.
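To put a number on “quite different”, one simple option is the relative change in a taxon's share between the two samples, compared against a threshold. The Python sketch below assumes the 15% figure is a relative change, which is my interpretation and not something I've confirmed. The shares are hypothetical, not my actual results, and a three-week gap is not directly comparable to a daily fluctuation, so treat this as a rough yardstick only.

```python
def relative_change(before, after):
    """Percent change from `before` to `after`, relative to `before`."""
    if before == 0:
        raise ValueError("relative change is undefined when the starting value is 0")
    return (after - before) / before * 100


def exceeds_threshold(before, after, threshold_pct=15.0):
    """True if the size of the relative change is larger than `threshold_pct`."""
    return abs(relative_change(before, after)) > threshold_pct


# Hypothetical phylum shares (in percent) for two samples, illustration only.
first_sample_share = 40.0
second_sample_share = 50.0

change = relative_change(first_sample_share, second_sample_share)
print(f"{change:+.1f}% relative change")
print("beyond 15%:", exceeds_threshold(first_sample_share, second_sample_share))
```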

No special travel or other unusual activity during those three weeks. Nothing unusual in my diet. I’m still analyzing what I ate, but for example here’s the amount of fiber per day, starting the week before the first test.


Nothing super unusual there. Note that I am a completely healthy male, normal/stable weight, no history of anything. Last antibiotic use was a long time ago.

Incidentally, Tina Saey (@thsaey) sent the same sample to two labs (uBiome plus American Gut Project), and got dramatically different results. uBiome wrote a detailed response – my takeaway was that the differences can be mostly explained by how each lab handles the samples but that you can correct for that (mostly).

Still more analysis ahead of me. Meanwhile, I’ve sent them a third sample to compare further.