Saturday 29 October 2016

FODMAPs 02 – Exploratory data analysis… Also, I think I have a beef and wedding intolerance

Previous post in this series: FODMAPs 01 – Data collection.

I have been collecting data for five weeks in an attempt to identify which foods cause my food intolerance symptoms. Using the Memento Database app, I log each food/ingredient I eat with a datetime stamp. Here’s a snapshot of the exported CSV. The Fibre column indicates whether I took some psyllium husk, as recommended by my dietitian. Enzymes indicates when I took a magic out-of-body enzyme pill, which was rare.



My post-meal intolerance symptoms were also recorded with datetime stamps. I used four descriptions: "Bloated" was when I was feeling, well, bloated. "Tightening" was when my guts felt uncomfortably tight during digestion. "Fatigue" was when I suddenly felt tired. "Abdominal pain" indicated sharp stabby pains in my gut. Since I’m not concerned about these distinctions, I coded each symptom with “1”. I wish to identify the foods that cause ANY symptoms of intolerance. A good day is when I have no symptoms.

I wrangled the data: datetimes were coerced to dates, and the Foods and Symptoms datasets were joined by date. Here’s a look at the merged data in RStudio. It’s terribly simple – Date, Food, Symptoms (flagged with “1” when present on a given date).
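A minimal sketch of that wrangling step, written in Python with only the standard library rather than the R used for the actual analysis. The rows, the timestamp format and the names `food_log`, `symptom_log` and `to_date` are all assumptions made for illustration; the real export may look different.

```python
from datetime import datetime

# Hypothetical rows standing in for the two exports: (datetime stamp, value).
food_log = [
    ("2016-10-01 08:15", "oats"),
    ("2016-10-01 19:30", "beef mince"),
    ("2016-10-02 12:00", "rice"),
]
symptom_log = [
    ("2016-10-02 07:45", "Bloated"),
    ("2016-10-02 21:10", "Fatigue"),
]

def to_date(stamp):
    """Coerce a 'YYYY-MM-DD HH:MM' stamp (assumed format) to a date."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M").date()

# Any symptom on a date flags that date; the description itself is ignored.
symptom_dates = {to_date(stamp) for stamp, _ in symptom_log}

# Join by date: one row per food entry, with Symptoms set to 1 or left empty.
merged = [
    {
        "Date": to_date(stamp),
        "Food": food,
        "Symptoms": 1 if to_date(stamp) in symptom_dates else None,
    }
    for stamp, food in food_log
]

for row in merged:
    print(row["Date"], row["Food"], row["Symptoms"])
```

Collapsing the four symptom descriptions into a single flag per date mirrors the decision above to treat any symptom as making a bad day.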



I’m no dietitian/nutritionist. I assume that when one tries to identify problem foods in one’s diet, one looks at when the symptoms occur, then looks back to see what foods were consumed. With that general approach, I chose some strict parameters to identify the bad foods that led to intolerance symptoms.

Any day with a symptom is considered a bad day. Even one symptom. Thus, a good day was a symptom-free day. To my delighted surprise, I had a string of good days. Setting my diet to low-FODMAP did make me feel generally better. I was less fatigued, I could concentrate more at work, and I had more nights of decent sleep. Sure, I became a social bore when I limited what food I could eat when dining out. Telling friends I could just go out for tea was met with disappointment. It was easier to stay at home and eat cold cuts by my lonesome. This was all in the name of science, and data, and in the next blog post, some data science (logistic regression).

Consider the good days. The code would look at the previous day and note the foods that were consumed. These foods were all considered “good”. Let’s think about this moving forward in time – I would eat all this good, mostly gluten-free food. The following day I would be symptom-free. Therefore, any food eaten the day before a symptom-free day is in my good books.
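The look-back can be sketched roughly as follows. This is an illustrative Python version, not the R code behind this post, and the dates, foods and the helper `foods_eaten_day_before` are made up for the example. It assumes a day only counts as good if something was logged on it.

```python
from datetime import date, timedelta

# Hypothetical merged data: foods logged per date, and dates with any symptom.
foods_by_date = {
    date(2016, 10, 1): {"oats", "rice"},
    date(2016, 10, 2): {"rice", "beef mince"},
    date(2016, 10, 3): {"oats"},
}
symptom_dates = {date(2016, 10, 3)}

def foods_eaten_day_before(days, foods_by_date):
    """Collect every food logged on the day before each date in `days`."""
    foods = set()
    for day in days:
        previous_day = day - timedelta(days=1)
        foods |= foods_by_date.get(previous_day, set())
    return foods

# Good days are logged dates without a symptom flag.
good_days = set(foods_by_date) - symptom_dates
good_foods = foods_eaten_day_before(good_days, foods_by_date)

print(sorted(good_foods))
```

Passing the symptom dates to the same helper instead of `good_days` would give the matching list of foods eaten the day before a bad day.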

Consider the bad days. Similar premise – any food consumed the day before a bad day is a bad food. But not all of them. I eat a mix of good and bad foods, followed by a bad day. I can’t cast the good foods caught in this net as bad by association. Therefore, any food on my good food list was subtracted from the bad food list. The difference became a “really bad” food list. Drum roll… Here are the really bad foods.
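The subtraction itself is a plain set difference. A small Python sketch with invented food lists (the names `good_foods`, `bad_foods` and `really_bad_foods` are placeholders, and the actual analysis was done in R):

```python
# Hypothetical lists: foods eaten the day before good days and before bad days.
good_foods = {"oats", "rice", "potato"}
bad_foods = {"rice", "potato", "beef mince"}

# Foods that only ever show up before bad days.
really_bad_foods = sorted(bad_foods - good_foods)

print(really_bad_foods)
```

Under this rule a food is cleared if it was eaten before even one symptom-free day, so only foods that never preceded a good day remain on the list.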




OK, a couple of things stood out. I think I’m allergic to weddings. “wedding beef”, “wedding cake”, “wedding canapes”, “wedding salad”. Guys, I went to a wedding during the diet, OK? I couldn’t not eat the food, it was really really good. Other foods consumed at the wedding included the potato, prawns, pumpkin, oysters. Resolution one: Avoid weddings [1].

There was another grouping I discerned from the really bad foods list. “beef mince”, “beef patties”, “olivo wagyu steak”, “wedding beef”. OMG, I think I have a beef intolerance. No! Stupid, stupid ethnic digestive tract, why?

I Googled – beef intolerance is indeed a thing. As is intolerance towards asparagus, basil and cauliflower. I’m not jumping to conclusions. I have an appointment with my dietitian in several weeks, and I’ll show her the data. She may very well think this approach was a bit much, but, I truly believe that the little data we collect has meaning. Now, it’s easier to collect data, primarily because most data collection occurs in an automated fashion. From Fitbit to Netflix and Google, there’s a spectrum of our personalised data being gathered. Sometimes this data is accessible, such as from Fitbit. Taking those next steps from reported data to insightful and actionable data may take some coding [2].


References and notes
1. I was concerned that my bad days were simply the wedding day. Not the case. I had 16 days when I consumed foods from the really bad foods list.
2. The code fodmaps_wrangling_exploration.R is in the GitHub repo: https://github.com/muhsinkarim/fodmaps