Lady Gaga shoes

"A girl's just as hot as the shoes she choose."
- Lady Gaga

Science is a weird career, if you think about it. People either get into this business because they love data and learning or because they love exploring the world. I think of these two types as "indoor" and "outdoor" scientists. Lucky for me, I have a foot in both camps, as do many ecologists. In fact, the combination of the two is what I love the most about my job. I love thinking, analyzing, and posing questions, then traveling, adventuring, and exploring, then returning home to my lab with samples to start the process over again. 

Today is an indoor day for me. I've made it far enough with my model of succession in fouling communities that I can produce simulated data - and lots of it. The key to my analysis will be comparing the simulated data to my actual data, to see which version of the model fits my real data best. It's a mathematical challenge, so on the advice of a collaborator, I dug into some books. 

First up: The Statistical Analysis of Compositional Data. Yes, it is exactly as dense as it sounds. I had been warned it would be difficult to read, so I was prepared. After making it through about half the book, I started to feel like nothing the author was describing would work for my data. I couldn't quite put my finger on it, but none of it really seemed to fit. Turns out, the methods outlined in the book were all meant for data that meet a very specific set of conditions: they have parametric distributions, and they do not contain zeros.

Now, I don't know how much experience you have analyzing ecological data, friends, but if you remember one thing, remember this: there is nothing parametric about ecological data. They don't fit normal distributions, they rarely have equal variances, and there are always a ton of zeros. 

To use an analogy, parametric data are like a pair of tennis shoes. They're symmetrical. They fit nicely into a rectangular box. They slide onto your feet easily. They're useful for a lot of things.

Ecological data do not usually fit the "tennis shoe" mold. The types of datasets we end up with in ecology have numbers coming out in all directions (outliers), strange gaps in the middle (zeros), unexpected cross-connections (correlations), and a complete lack of symmetry. Sure, you might be able to force a foot into one of these shoes, but you could certainly never ship them in a rectangular box. We're talking Elton John shoes - Lady Gaga shoes - crazy stuff!

So I took a step back and thought about my data. I wasn't quite sure what to do with them, so I ended up reaching out to the same collaborator who had recommended the book. Where do I go from here? 

Thankfully, I have amazing collaborators, and it just took a brief phone conversation for him to come up with a solution to my problem. I have a new path to pursue now, and I'm looking forward to seeing if it works!
