This post is based on an analysis of a subset of the China Study II data, using HealthCorrelator for Excel (HCE), which is publicly available for download and use on a free trial basis. You can access the original data on the HCE web site, under “Sample datasets”.
HCE was designed to be used with small and individual personal datasets, but it can also be used with larger datasets for multiple individuals.
This analysis focuses on two main variables from the China Study II data: mortality in the 35-69 age range, and mortality in the 70-79 range. The table below shows the coefficients of association calculated by HCE for those two variables. The original variable labels are shown.
One advantage of looking at mortality in these ranges is that they are more likely to reflect the impact of degenerative diseases. Infectious diseases likely killed a lot of children in China at the time the data was being collected. Heart disease, on the other hand, is likely to have killed more people in the 35-69 and 70-79 ranges.
It is also good to have data for both ranges, because factors that likely increased longevity were those that were associated with decreased mortality in both ranges. For example, a factor that was strongly associated with mortality in the 35-69 range, but not the 70-79 range, might simply be very deadly in the former range.
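The "both ranges" criterion above can be written as a simple filter. The snippet below is only a sketch: the factor names, the coefficient values, and the 0.1 cutoff are hypothetical placeholders, not numbers produced by HCE or taken from the China Study II data.

```python
# Hypothetical coefficients of association with mortality:
# (35-69 range, 70-79 range). Placeholder values, not real results.
associations = {
    "factor_a": (-0.30, -0.25),  # negative in both ranges
    "factor_b": (-0.40, 0.05),   # negative only in the younger range
    "factor_c": (0.35, 0.20),    # positive in both ranges
    "factor_d": (-0.02, -0.03),  # negative, but too weak to count
}


def consistently_protective(assoc, min_strength=0.1):
    """Return factors negatively associated with mortality in BOTH ranges.

    min_strength is an assumed cutoff for calling an association
    non-negligible; it is not a value from the original analysis.
    """
    return sorted(
        name
        for name, (younger, older) in assoc.items()
        if younger <= -min_strength and older <= -min_strength
    )


print(consistently_protective(associations))
```

A factor like the hypothetical `factor_b` is dropped because its negative association appears in only one of the two ranges.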
The mortalities in both ranges are strongly correlated with each other, which is to be expected. Next, at the very top for both ranges, is sex. Being female is by far the variable with the strongest, and negative, association with mortality.
While I would expect females to live longer, the strengths of the associations make me think that there is something else going on here, possibly different dietary or behavioral patterns among females. Maybe cigarette smoking or alcohol abuse was a lot less prevalent among them.
Markedly different lifestyle patterns between males and females may be a major confounding variable in the China Study sample.
Some of the variables are redundant, meaning that they are highly correlated and seem to measure the same thing. This is clear when one looks at the other coefficients of association generated by HCE.
For example, plant food consumption is strongly and negatively correlated with animal food consumption, so strongly that you could use either one of these two variables to measure the other, after inverting the scale. The same is true for consumption of rice and white flour.
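One way to spot this kind of redundancy outside of HCE is to compute all pairwise correlations and flag the pairs whose absolute value is close to 1. The sketch below assumes a plain Pearson correlation, which may differ from the coefficient of association that HCE calculates. The column names, the values, and the 0.9 threshold are made up for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical county-level values; not taken from the China Study II data.
columns = {
    "plant_food": [90, 80, 70, 60, 50],
    "animal_food": [10, 21, 29, 41, 50],
    "green_veg": [5, 9, 4, 8, 6],
}


def redundant_pairs(cols, threshold=0.9):
    """List variable pairs whose |correlation| meets the assumed threshold."""
    pairs = []
    for a, b in combinations(sorted(cols), 2):
        r = pearson(cols[a], cols[b])
        if abs(r) >= threshold:
            pairs.append((a, b, round(r, 3)))
    return pairs


for a, b, r in redundant_pairs(columns):
    print(f"{a} and {b} look redundant (r = {r})")
```

A strongly negative value flags a pair just as a strongly positive one does, which matches the idea of using one variable to measure the other after inverting the scale.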
Plant food consumption is not strongly correlated with plant protein consumption; many plant foods have little protein in them. The ones that have high protein content are typically industrialized and seed-based. The type of food most strongly associated with plant protein consumption is white flour, by far. The correlation is .645.
The figure below is based on the table above. I opened a separate instance of Excel, and copied the coefficients generated by HCE into it. Then I built two bar charts with them. The variable labels were replaced with more suggestive names, and some redundant variables were removed. Only the top 7 variables are shown, ordered from left to right on the bar charts in order of strength of association. The ones above the horizontal axis possibly increase mortality in each age range, whereas the ones below it possibly decrease it.
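The selection and ordering step for the charts can also be done in a few lines of code rather than by hand in Excel. This is a rough sketch with placeholder variable names and coefficients (none of them come from the HCE output). It ranks variables by the absolute value of the coefficient, keeps the top 7, and splits them by sign.

```python
# Hypothetical coefficients of association with mortality in one age range.
# Placeholder names and values, not the HCE results.
coefficients = {
    "var_1": 0.42, "var_2": -0.55, "var_3": 0.10,
    "var_4": -0.31, "var_5": 0.27, "var_6": -0.05,
    "var_7": 0.38, "var_8": -0.22, "var_9": 0.15,
}


def top_by_strength(coefs, k=7):
    """Return the k (name, coefficient) pairs with the largest |coefficient|."""
    ranked = sorted(coefs.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


top = top_by_strength(coefficients)

# Bars above the horizontal axis (positive) vs. below it (negative).
above_axis = [name for name, c in top if c > 0]
below_axis = [name for name, c in top if c < 0]

for name, c in top:
    print(f"{name:>6}: {c:+.2f}")
```

Sorting on the absolute value is what puts a strong negative association ahead of a weaker positive one in the left-to-right order.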
When you look at these results as a whole, a few things come to mind.
White flour consumption doesn’t seem to be making people live longer; nor does plant food consumption in general. For white flour, it is quite the opposite. Plant food consumption reflects white flour consumption to a certain extent, especially in counties where rice consumption is low. These conclusions are consistent with previous analyses using more complex statistics.
Total food is positively associated with mortality in the 35-69 range, but not the 70-79 range. This may reflect the fact that folks who reach the age of 70 tend to naturally eat in moderation, so you don’t see wide variations in food consumption among those folks.
Eating in moderation does not mean practicing severe calorie restriction. This post suggests that calorie restriction doesn't seem to be associated with increased longevity in this sample. Eating well, but not too much, is.
The bar for rice (consumption) on the left chart is likely a mirror reflection of the white flour consumption, so it may appear to be good in the 35-69 range simply because it reflects reduced white flour consumption in that range.
Green vegetables seem to be good when you consider the 35-69 range, but not the 70-79 range.
Neither rice nor green vegetables seem to be bad either. For overall longevity they may well be neutral, with the benefits likely coming from their replacement of white flour in the diet.
Dietary fat seems protective overall, particularly together with animal foods in the 70-79 range. This may simply reflect a delayed protective effect of animal fat and protein consumption.
The protective effect of dietary fat becomes clear when we look at the relationship between carbohydrate calories and fat calories. Their correlation is -.957, which essentially means that carbohydrate intake seriously displaces fat intake.
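A toy example may help show why a correlation this close to -1 reads as displacement. The sketch assumes that carbohydrate, fat, and protein shares of calories add up to 100% and that the protein share varies little. Under those assumptions, every extra point of carbohydrate has to come almost entirely out of fat. The percentages are invented for illustration and are not from the China Study II data.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical shares of total calories (%), one entry per county.
fat_pct = [10, 15, 20, 25, 30]
protein_pct = [10, 11, 10, 11, 10]  # assumed to vary only slightly

# Whatever is not fat or protein is carbohydrate.
carb_pct = [100 - f - p for f, p in zip(fat_pct, protein_pct)]

r = pearson(carb_pct, fat_pct)
print(f"carbohydrate vs. fat share: r = {r:.3f}")
```

With these made-up numbers the correlation comes out very close to -1, the same pattern as the -.957 reported above.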
Carbohydrates themselves may not be the problem, even if coming from high glycemic foods (except wheat flour, apparently). This post shows that they are relatively benign if coming from high glycemic rice, even at high intakes of 206 to 412 g/day. The problem seems to be caused by carbohydrates displacing nutrient-dense animal foods.
Interestingly, rice does not displace animal foods or fat in the diet. It is positively correlated with them. Wheat flour, on the other hand, displaces those foods. Wheat flour is negatively and somewhat strongly correlated with consumption of animal foods, as well as with animal fat and protein.
There are certainly several delayed effects here, which may be distorting the results somewhat. Degenerative diseases don’t develop fast and kill folks right away. They often require many years of eating and doing the wrong things to be fatal.