Richard Sprague: January 2015

Wednesday, January 28, 2015

Using tools to analyze my uBiome results

To study my uBiome samples more carefully, I’ve written a few tools that you may find helpful too. You’ll need to know a little about the statistical programming language R — just enough to load my scripts and enter a few commands at the prompts. The nice part is that you can easily read the results in Excel, letting you manipulate, sort, or graph the results to your heart’s content. Here’s an example comparing my uBiome samples taken in May and October.

These are the species that were found in both samples, along with the increase in normalized count (count_norm) from the first sample to the second:

Species	Increase
Bacteroides plebeius	86248
Bifidobacterium animalis	37483
bacterium NLAE-zl-P430	17859
[Ruminococcus] obeum	9238
Lactobacillus rogosae	8726
Blautia faecis	6577
Clostridium baratii	4365
Corynebacterium freneyi	3940
Coprococcus catus	3420
bacterium NLAE-zl-H54	3077
Bacteroides salyersiae	2880

Here are the species that were still in the October sample, but at reduced count:

Roseburia sp. 11SE38	-5250
Barnesiella intestinihominis	-6780
Parasutterella excrementihominis	-6986
Coprococcus sp. DJF_CR49	-8958
Bacteroides massiliensis	-10835
Bacteroides uniformis	-11248
Clostridium clostridioforme	-13732
Odoribacter laneus	-28286
Faecalibacterium prausnitzii	-93781

I took several tablespoons daily of potato starch during the week before the October test, so those species were probably affected most.

Next, here’s a look at the gut flora that went extinct between May and October:

original count_norm	Missing
8295	Bifidobacterium tsurumiense
4650	Subdoligranulum variabile
2074	Dialister sp. oral clone BS095
780	Desulfovibrio sp. oral clone BB161
475	Adlercreutzia equolifaciens
459	Ruminococcus sp. ID1
328	Clostridiales bacterium 60-7e
305	Tannerella sp. 6_1_58FAA_CT1
194	[Clostridium] spiroforme
182	Clostridium disporicum
174	Lactobacillus paracasei

and some new ones that were not there originally but showed up in October:

Count	New species
92665	butyrate-producing bacterium A1-86
47129	Clostridium chartatabidum
19001	Bifidobacterium adolescentis
16245	Ruminococcus bromii
8495	Eubacterium siraeum
5929	bacterium NLAE-zl-H436
5803	Parabacteroides distasonis

The above examples were all done at the species taxonomic rank, but the tools let you look at other ranks, such as genus or phylum, just as easily.
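If you would rather not run R, the four tables above can be approximated with a few lines of Python. This is only a sketch of the idea, not the scripts in the GitHub repository. It assumes each sample has already been reduced to a mapping from tax_name to count_norm at a single rank, and the name `compare_samples` is made up for this example. The counts for the "missing" and "new" species come from the tables above; the May and October values for the two shared species are invented so that their differences match the table.

```python
def compare_samples(before, after):
    """Compare two {tax_name: count_norm} mappings taken at the same rank."""

    def by_count(pairs):
        # Largest value first, matching the order of the tables in the post.
        return sorted(pairs, key=lambda pair: pair[1], reverse=True)

    shared = before.keys() & after.keys()
    changes = [(name, after[name] - before[name]) for name in shared]
    return {
        "increased": by_count(p for p in changes if p[1] > 0),
        "decreased": by_count(p for p in changes if p[1] < 0),
        "missing": by_count((n, before[n]) for n in before.keys() - after.keys()),
        "new": by_count((n, after[n]) for n in after.keys() - before.keys()),
    }


# Shared-species values are invented; single-sample values are from the post.
may = {
    "Bacteroides plebeius": 1000,
    "Faecalibacterium prausnitzii": 100000,
    "Bifidobacterium tsurumiense": 8295,
    "Subdoligranulum variabile": 4650,
}
october = {
    "Bacteroides plebeius": 87248,
    "Faecalibacterium prausnitzii": 6219,
    "Ruminococcus bromii": 16245,
    "Eubacterium siraeum": 8495,
}

result = compare_samples(may, october)
for table, rows in result.items():
    print(table, rows)
```

A species whose count did not change at all lands in none of the four tables. To compare at genus or phylum instead, build the two mappings from rows of that rank.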

The R source code, along with the data I used for this analysis, is available here on GitHub.

(Huge thanks to Dr. Grace Liu and her Animal Pharm web site for help understanding what this stuff means.)

Sunday, January 18, 2015

How to analyze a uBiome sample in Excel

Although I’ve done three samples at this point, I’m by no means a uBiome expert, but hopefully the following guide will be useful to others who are trying to understand their own results better.

After your uBiome sample has been submitted and processed, you will be given access to a private web page with some basic web tools to help you understand your results, but the tools are still under development and you may find them hard to interpret. You also can’t do A/B comparisons of your sample to others, or to your own samples over time, so at some point you’ll need the raw data, which fortunately is easy to get. Log in at http://beta.ubiome.com, then browse your way to a screen where you’ll see this on the left side:

When you select “Raw Taxonomy”, you’ll go to a page with a bunch of apparently garbled noise like this:

That’s the raw data. Now go to my uBiome site on GitHub for more instructions and tools to turn that mess into a CSV file, which you can open in Excel. If that’s too much trouble, an alternative is to simply select all the information on that page, and then copy/paste it into https://json-csv.com
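For anyone who prefers to do the conversion locally, here is a minimal Python sketch using only the standard library. The exact layout of the raw taxonomy page is not shown in this post, so the sketch assumes the text parses as JSON into a list of flat records whose keys are the column labels described below. If the records are wrapped inside an outer object, pull the list out first. The function name `records_to_csv` and the count values in the sample are placeholders.

```python
import csv
import io
import json

# Column labels described in the post; anything else in a record is skipped.
COLUMNS = ["tax_name", "tax_rank", "count", "count_norm", "taxon", "parent"]


def records_to_csv(records, columns=COLUMNS):
    """Turn a list of dict records into CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Assumed shape of the pasted raw data; the count values are placeholders.
raw = """[
  {"tax_name": "Bacteria", "tax_rank": "superkingdom", "count": 100,
   "count_norm": 1000000, "taxon": 2, "parent": 131567, "tax_color": null},
  {"tax_name": "Bacteroidetes", "tax_rank": "phylum", "count": 25,
   "count_norm": 250000, "taxon": 976, "parent": 2, "tax_color": null}
]"""

csv_text = records_to_csv(json.loads(raw))
print(csv_text)
```

To save the result, write `csv_text` to a file ending in `.csv` and open that file in Excel.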

Once you open that CSV file in Excel, you are home free, thanks to the many easy built-in analysis tools there. For example, here are the first few rows of my June 2014 sample:

Next you need to understand a few column labels:

count: the actual number of organisms found in the sample.
count_norm: a “normalized” version of the count, which you can think of as a percentage. It appears to be a number that uBiome assigns based on the data they’ve accumulated from other people over the years. I don’t know exactly how it is computed, and it doesn’t seem to be an industry-standard benchmark. Anyway, you can think of each unit as 1/10,000th of a percent, so in the sample above my level of Firmicutes would be 62.2877%.
tax_name: this is the classification of the organism, based on the level of its taxonomy. Generally, we think of an organism in terms of species (e.g. Homo sapiens), but each organism belongs to other, bigger clusters. For example, humans are members of the class Mammalia, along with tigers and horses. If this spreadsheet were counting organisms at the level of class Mammalia, the count_norm would almost certainly be bigger than the count_norm for humans alone, unless humans were the only type of mammal found in the sample. Make sense?
tax_rank: tells which level of taxonomy is represented by this row. You’ll need to know a tiny bit of biology to understand this, which I’ll explain more below, but for now just note that some of the counts will appear duplicated unless you take this into account.
taxon and parent: these help identify the ranking in a more precise way by pointing out which tax_ranks are subsets of which. For example, Bacteroidia above has a parent = 976, meaning that it is a subset of the taxon 976, Bacteroidetes. When you follow the various taxons and parents up the chain, you’ll see they all end in the superkingdom Bacteria, which has a taxon of 2. I have no idea why they assign the parent of Bacteria the number 131567, but don’t worry: it doesn’t matter.
tax_color doesn’t matter for this analysis, but it appears to be how the uBiome software colorizes their pretty graphs to make them more readable.
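Two of the column descriptions above are easier to see in code. The Python sketch below converts a count_norm into a percentage (treating each unit as 1/10,000th of a percent, as described above) and follows taxon/parent links up the chain. The function names are made up for this example. The IDs for Bacteria (2), its parent (131567), and Bacteroidetes (976) come from the post; the taxon ID used for Bacteroidia and the parent given for Bacteroidetes are assumptions for illustration.

```python
def count_norm_to_percent(count_norm):
    """Each count_norm unit is read as 1/10,000th of a percent."""
    return count_norm / 10_000


def lineage(taxon, rows_by_taxon):
    """Follow parent links upward and return tax_names from the row to the top."""
    names = []
    seen = set()
    # Stop when the parent has no row of its own, or if the links ever loop.
    while taxon in rows_by_taxon and taxon not in seen:
        seen.add(taxon)
        row = rows_by_taxon[taxon]
        names.append(row["tax_name"])
        taxon = row["parent"]
    return names


rows_by_taxon = {
    2: {"tax_name": "Bacteria", "parent": 131567},
    976: {"tax_name": "Bacteroidetes", "parent": 2},     # parent assumed
    200643: {"tax_name": "Bacteroidia", "parent": 976},  # taxon ID assumed
}

print(count_norm_to_percent(622877))
print(lineage(200643, rows_by_taxon))
```

The chain stops at Bacteria because 131567 has no row of its own in the table, which is consistent with the note above that this number doesn’t matter for the analysis.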

Now for the biology. The science of taxonomy (worked out first by Carl Linnaeus in the 1700s) divides all life into seven major categories (ranks): Kingdom, Phylum, Class, Order, Family, Genus, Species, which I was taught in sixth grade to remember by the mnemonic “King Philip Came Over for Girl Scouts”. When you analyze a uBiome sample, you cannot mix and match these ranks. If you talk about species, for example, you must compare the counts to other species, never to a different taxonomic rank.

With that out of the way, the next thing is to tell Excel to apply a filter to the whole sheet. In Mac Excel, I just select the filter icon like this:

The first row is transformed into a nifty filtering device. Note how the right side of each cell has a little upside-down triangle. Select that and a new pop-up menu will appear that lets you sort and filter the column however you like. Here’s what it looks like when I look just at the tax_rank = phylum, and sort the count_norm in descending order (I also hid the avg column to simplify the image):

The count_norm column now corresponds exactly to the percentage breakdown in the fancy charts on the uBiome site!
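The same filter-and-sort step can be done outside Excel. The Python sketch below assumes the rows have the column labels described earlier, for example as read by `csv.DictReader`, which is why count_norm is handled as text. The function name `rows_at_rank` is made up. The Firmicutes value is the one implied by the 62.2877% figure above; the other two values are placeholders.

```python
def rows_at_rank(rows, rank):
    """Keep only rows of one tax_rank, sorted by count_norm, largest first."""
    selected = [row for row in rows if row["tax_rank"] == rank]
    return sorted(selected, key=lambda row: int(row["count_norm"]), reverse=True)


rows = [
    {"tax_name": "Bacteroidia", "tax_rank": "class", "count_norm": "250000"},     # placeholder
    {"tax_name": "Bacteroidetes", "tax_rank": "phylum", "count_norm": "250000"},  # placeholder
    {"tax_name": "Firmicutes", "tax_rank": "phylum", "count_norm": "622877"},     # 62.2877% in the post
]

for row in rows_at_rank(rows, "phylum"):
    percent = int(row["count_norm"]) / 10_000
    print(f'{row["tax_name"]}: {percent:.4f}%')
```

Filtering on a single rank first is what keeps the class-level Bacteroidia row from being counted alongside its own phylum.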

Here’s the Excel donut chart I made of the above sample:

See? Same info!

But now that it’s in Excel, you can do much easier analysis than what you get currently on the uBiome site. For example, try selecting tax_rank = species and then sort tax_name alphabetically. Now I can look alphabetically at all the different Bifido species found in my sample:

Easy!
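For comparison, here is a rough Python equivalent of that species lookup. It is a sketch under the same assumption as before (rows carrying tax_name and tax_rank labels), and the function name `names_with_prefix` is made up. The species names are ones that appear in my results elsewhere on this blog.

```python
def names_with_prefix(rows, rank, prefix):
    """Alphabetical list of tax_names at one rank that start with a prefix."""
    names = [
        row["tax_name"]
        for row in rows
        if row["tax_rank"] == rank and row["tax_name"].startswith(prefix)
    ]
    return sorted(names, key=str.lower)


rows = [
    {"tax_name": "Lactobacillus paracasei", "tax_rank": "species"},
    {"tax_name": "Bifidobacterium animalis", "tax_rank": "species"},
    {"tax_name": "Bifidobacterium", "tax_rank": "genus"},
    {"tax_name": "Bifidobacterium adolescentis", "tax_rank": "species"},
]

print(names_with_prefix(rows, "species", "Bifido"))
```

The genus-level Bifidobacterium row is skipped because the rank filter is applied before the name match.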

There’s much more to say about uBiome, which I’ll cover in a future post. (Huge shout-out to Dr. Grace Liu and her Animal Pharm web site, which I highly recommend, for the tons of information I’m just beginning to pore through.)

Tuesday, January 13, 2015

[book] How We Learn

Who doesn’t want to be better at learning? Learn faster, more efficiently, with better recall…that’s this book, by Benedict Carey, a journalist who has long been tracking the science of learning. The whole book is worth reading, but here are a few of my takeaways:

Use the spacing effect, aka “distributed learning”. There is a whole science of how much and how often you should repeat something in order to seal it in memory. Not too often, just enough space between practice, not too seldom. Nowadays you can get software to help with this. Good idea.
Testing is another form of learning, so do it on yourself all the time. Spend about 1/3 of your time studying and 2/3 testing yourself (flashcards, re-writing your thoughts, etc.).
Distraction can actually be good, but only if you’ve focused long enough on the task to feel stumped. When that happens, take a break and do something else for a while.
Systematically alter your practice: change the place where you study, change the background noises or music. If you study the same material in two different places, you’ll learn the material better than if you studied it in the same, familiar location.
Sleep is free learning: never go to bed without a problem to review in your mind. Think of it as “learning with your eyes closed”.

There is much more to absorb from this book, and I’m sure I’ll continue coming back to the concepts regularly.

Monday, January 12, 2015

My uBiome third sample

Eventually it will be possible, perhaps routine, to get regular real-time updates to what’s happening among the bacteria crawling all over our insides, but for now the best a normal person like me can do is continue sending samples to the uBiome testing service. Here’s an overdue look at my latest report:

Sprague uBiome Results

(I discussed my previous uBiome results a few months ago)

Unfortunately, I don’t see much of interest. That last sample was taken a few months after the first two. I can’t point to anything unusual that might have happened between the tests. Nothing notable about my diet (I’m a fairly normal omnivore). No antibiotics, no bouts of sickness, no unusual travel or other changes.

Three samples is not enough to draw any conclusions, especially since they are taken in informal, non-scientific circumstances (i.e. my house, at whatever random time I feel like). You can read these however you like, but it’s interesting that the first and third samples are most similar on the two phyla of microbes that seem most common in modern America. The level of Firmicutes, in particular, is consistent with somebody like me who is of normal weight and health. There is speculation that Firmicutes (pronounced fir MIK kyoo teez, by the way) play a role in fat absorption; too much or too little may play a role in obesity.

At 6+%, the level of Actinobacteria in the latest sample is much higher than normal. Since those bacteria are more common in other parts of the body (e.g. nose, genitals), I’m wondering if there’s a little contamination going on.

I’ve got another uBiome kit waiting to be used, so stay tuned for more updates.

Saturday, January 10, 2015

Starbucks Reserve Roastery and Tasting Room

It’s as impressive as you’d expect: full-blown Probat roaster right out in the open, a long line of Clovers at the main bar.

As a tourist attraction it doesn’t disappoint: even in a city known for gourmet coffee shops, you won’t find a better example of showmanship and art. They obviously spent a great deal of time and money planning all the details. But it also means long lines (you’ll wait at least 20 minutes to get a coffee) and impossible parking. (Although I noticed they offer valet parking if you want to go that way).

I tried the Ethiopian Konga, which as I understand it is another name for Yirgacheffe coffee, my favorite ever since I tried it at Terroir in Boston a few years ago. Yes, I asked for it Clover-brewed rather than a hand-pour.