This is going to be a fairly technical article, discussing a recent(ish) study by Marques and colleagues. I’ve previously written quite a bit about the relationship between muscle growth and strength gains, and I’m not going to recap all of that previous content in this article. If this is your first time giving much thought to the subject, you may enjoy reading some of those prior articles:
For our purposes here, the most important thing to note is that, until recently, research investigating the relationship between hypertrophy and strength gains painted a pretty consistent picture. In studies on untrained subjects, we see a pretty weak relationship between hypertrophy and strength gains. However, as training status increases, this relationship gets progressively stronger.

The graph below roughly depicts our previous understanding about the explanatory power of hypertrophy for predicting changes in strength.

And, for what it’s worth, I think this type of relationship makes intuitive sense. If hypertrophy is causally related to strength gains (and I believe it is), we should expect the relationship between hypertrophy and strength gains to get stronger as other factors influencing strength gains become more and more normalized/equated between subjects. Namely, early strength gains are primarily attributable to improvements in technique, motor learning, etc. In studies on untrained lifters, the average gains observed are a 5% increase in muscle size and a 22% increase in strength. Even if hypertrophy has a causal impact on strength gains in untrained lifters, it can only explain, at most, a bit less than a quarter of the strength gains observed in that population (in most circumstances). However, as training status increases, there are simply fewer gains to be had from further marginal improvements in technique and motor learning. As a result, hypertrophy explains more and more of the variance in strength gains as training status (and mastery of the lifts used to assess strength) increases.
However, a recent study appeared to flip this understanding on its head. The study by Marques and colleagues is titled Muscle Growth Is Very Strongly Correlated with Strength Gains after Lower Body Resistance Training: New Insight from Within-Participant Associations. This study proposes that prior research on this topic may have underestimated the strength of the relationship between hypertrophy and strength gains by using suboptimal statistical methods. From the study’s introduction:
“Importantly, the statistical analysis could also affect our understanding of the relationship between strength gains and putative underpinning variables. Previous research has heavily relied on the use of between-participant analyses, such as simple regression or Pearson’s correlation, to investigate the relationships between changes in muscle size or activation with changes in strength. These methods assess interindividual associations but may lack sensitivity in capturing within-participant changes over time. For example, the most common approach has been to consider the percentage change (i.e., pre-training to post-training) in strength and size/activation over a training period, using one data point per subject, which may obscure the association due to the aggregation of the two dependent time points. In other words, relatively fixed individual factors (e.g., moment arm, contractile specific tension) may be substantial, such that individual pre and post values are dependent and better considered together. Thus, when two or more measurement points are obtained from the same individual, the within-participant (or repeated-measures) correlation is the preferred statistical method for analyzing the common intraindividual association. Because repeated-measures correlation accounts for the non-independence of each paired data, it tends to yield much greater power than data that are averaged or derived from the changes between time points.”
That may sound abstract, but this is a fairly easy concept to grasp. Imagine two people who currently have biceps of the same size. However, person A’s biceps have favorable insertions (long internal moment arms at the elbow) and very high specific tension (the amount of linear contractile force a muscle can produce per unit of CSA), and person B’s biceps have unfavorable insertions and very low specific tension. As a result, if both of these individuals experience the same increase in biceps size, person A will likely experience a larger gain in strength than person B. So, if you plotted the relative increase in size and strength for these two individuals, you might see a 5% increase in biceps size for both, but a strength increase of 20% in person A versus 10% for person B. Repeat this for a group of 20-30 subjects, and it may appear that there’s a weak relationship between hypertrophy and strength gains for your cohort, even if hypertrophy is actually having a strong, direct effect on strength gains within each individual in the cohort.
Repeated measures correlation can be a valid statistical tool to account for some of these inter-individual differences. To illustrate, below you’ll find a figure from a paper cited by the authors illustrating how impactful the use of repeated measures correlations can be. In this theoretical example, the relationship between the two variables for each subject (denoted by different colors) has a different intercept, but the slope of the relationship is the same for all subjects. As a result, within-subject associations (on the left) reveal a very strong relationship, whereas between-subject associations (on the right) would fail to identify the relationship.
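To make the intercept-versus-slope idea concrete, here's a minimal Python sketch using made-up data. The subject count, the spread of the intercepts, and the noise level are all illustrative assumptions, not values from any study. Every simulated subject shares the same slope but has their own intercept; the code then compares a correlation computed on one averaged point per subject with a correlation computed after subtracting each subject's own mean.

```python
import numpy as np

rng = np.random.default_rng(42)
n_subjects, n_obs = 30, 5

# Hypothetical data: one shared slope, but a different intercept per subject.
slope = 1.0
intercepts = rng.normal(0.0, 20.0, size=(n_subjects, 1))
x_centers = rng.uniform(0.0, 10.0, size=(n_subjects, 1))
x = x_centers + rng.uniform(-1.0, 1.0, size=(n_subjects, n_obs))
y = intercepts + slope * x + rng.normal(0.0, 0.1, size=(n_subjects, n_obs))

# Between-subjects view: a single averaged point per subject.
r_between = np.corrcoef(x.mean(axis=1), y.mean(axis=1))[0, 1]

# Within-subjects view: remove each subject's own mean, then correlate.
x_within = (x - x.mean(axis=1, keepdims=True)).ravel()
y_within = (y - y.mean(axis=1, keepdims=True)).ravel()
r_within = np.corrcoef(x_within, y_within)[0, 1]

print(f"between-subjects r = {r_between:.2f}")
print(f"within-subjects r  = {r_within:.2f}")
```

Centering on each subject's mean is just one simple way to express a within-subject association. It is used here because it removes the intercept differences that dominate the between-subjects view.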

In general, I really, really like the idea of focusing on within-subject associations for the same reasons provided by the authors. Furthermore, prior research has shown that statistical methods that account for within-subject associations can explain a larger proportion of the variance in the strength/hypertrophy relationship than methods that only consider associations between subjects.
In the prior study by Vigotsky and colleagues, associations between subjects were in line with what we typically observe in untrained subjects: changes in muscle size explained <5% of the variance in strength gains. However, a hierarchical linear model (HLM) was able to explain 7.4-24.1% of the variance (equating to an r-value in the range of ~0.25-0.50). It was able to explain so much more of the variance because it allowed each subject’s strength/hypertrophy relationship to have its own intercept and slope, meaning it accounted for differences between subjects at baseline, and differences in how hypertrophy or atrophy might impact strength changes between individuals.

So, with that context, let’s briefly discuss this new study by Marques and colleagues.
In this study, 39 untrained men completed 15 weeks of resistance training focused on the quads. Strength (knee extension 1RM and maximum isometric torque) and size (quadriceps volume, assessed via MRI) were assessed before and after this training period. On average, isometric knee extension strength increased by 21.6 ± 9.2%, knee extension 1RM increased by 28.6 ± 12.9%, and quadriceps volume increased by 12.7 ± 7.1%.

Furthermore, the r-value for the repeated measures correlation between hypertrophy and strength was 0.92 for isometric knee extension strength, and 0.89 for knee extension 1RM. In contrast, the between-subjects associations were in the range of r = 0.35-0.60. So, using repeated measures correlations, hypertrophy appeared to explain ~80-85% of the variance in strength gains, whereas more traditional between-subject correlations suggested that hypertrophy only explained ~12-36% of the variance.

At first, this appears to be a very striking result. This finding doesn’t just conflict with previous research on the topic – it’s in an entirely different universe. If we take these results at face value, it would suggest that prior researchers failed to detect an almost perfect relationship between hypertrophy and strength gains in untrained subjects due to inadequate statistical methods, and were instead only able to detect a very weak correlation.
So, is that what actually happened here?
Not exactly.
The first hint that something is amiss arises when we contrast the results of this study with the results of the prior study by Vigotsky and colleagues. If the shortcomings of prior research just boiled down to the use of between-subjects analyses rather than within-subjects analyses, the Vigotsky study should have identified the same sky-high correlations observed by Marques and colleagues, since it also used a statistical approach designed to assess within-subject associations. In fact, the statistical models in the Vigotsky study should have been able to account for even more of the variance than the repeated measures correlation used by Marques and colleagues. Repeated measures correlation, as employed by Marques and colleagues, is tantamount to a hierarchical linear model that allows intercepts to vary between subjects while assuming that the slope of the size/strength relationship is the same for all subjects. By contrast, the HLM in the Vigotsky study allowed for slopes and intercepts to vary between subjects.
Unless a given increase in size yielded literally the exact same increase in strength in all subjects, allowing both intercepts and slopes to vary between subjects would necessarily allow you to explain more of the variance than only allowing intercepts to vary. In other words, if hypertrophy truly explains >80% of the variance in strength gains in untrained subjects, we should have seen associations of a similar (or even greater) strength in the Vigotsky study. And, to be sure, within-subject analyses did explain more of the variance than between-subject analyses in the Vigotsky study, but even the within-subject analyses (with varied intercepts and slopes) found that hypertrophy explained less than 25% of the variance in strength gains.
So, what might explain this difference?
Let’s start by asking a simple question: how strong of an association would we have seen in the Marques paper with repeated measures correlation if hypertrophy and strength gains weren’t actually related?
You’d assume that two things that aren’t related would have a correlation coefficient of r = 0. However, that’s actually not the case here.
To test it out, I randomly generated 5000 “subjects” matching the summary statistics in the Marques paper (with the same group-level means, standard deviations, change scores, and change score SDs for knee extension 1RM, knee extension isometric torque, and quadriceps volume). However, by design, changes in quadriceps volume were entirely unrelated to changes in strength in these “subjects.” This simulation exists in a universe where hypertrophy has zero impact on strength gains. Each “subject’s” 1RM, isometric knee extension strength, and quadriceps volume changed by a random amount consistent with the means and standard deviations of the reported change scores, but each change was totally independent of all other changes.
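A simulation along these lines can be sketched in a few lines of Python. The mean changes (237.5 cm³ and 14.6 kg) come from the figures quoted in this article. The baseline means, baseline SDs, and change-score SDs below are placeholder assumptions, roughly back-calculated from the reported percentage changes, and should be swapped for the paper's actual values. Only volume and 1RM are shown; isometric torque would follow the same pattern.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Mean changes are the values quoted in the article. Baselines and all SDs
# are PLACEHOLDER assumptions, not numbers taken from the paper.
vol_pre = rng.normal(1870.0, 250.0, n)    # quadriceps volume, cm^3 (assumed)
vol_change = rng.normal(237.5, 133.0, n)  # mean from article, SD assumed
rm_pre = rng.normal(51.0, 10.0, n)        # knee extension 1RM, kg (assumed)
rm_change = rng.normal(14.6, 6.6, n)      # mean from article, SD assumed

# Each change is drawn independently, so hypertrophy has no effect on strength.
vol_post = vol_pre + vol_change
rm_post = rm_pre + rm_change

# Sanity check: the change scores should be essentially uncorrelated.
r_changes = np.corrcoef(vol_change, rm_change)[0, 1]
print(f"Pearson r between change scores: {r_changes:.3f}")
```

Drawing from normal distributions is itself an assumption. It allows a few simulated "subjects" to lose size or strength, which may or may not match how the original simulation was set up.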

Next, I calculated the repeated measures correlation for these subjects. The correlation coefficient is calculated using this formula:
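In symbols, the coefficient can be written as shown below. This is reconstructed from the sums of squares described in this section, so the subscript notation is an assumption; the sign of the coefficient follows the direction of the common slope.

```latex
r_{rm} = \sqrt{\frac{SS_{\text{Measure}}}{SS_{\text{Measure}} + SS_{\text{Error}}}}
```

One consequence of this form is that the coefficient equals 0.5 only when the error sum of squares is three times the measure sum of squares, since 1 / (1 + 3) = 0.25 and the square root of 0.25 is 0.5.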

To calculate “SSMeasure,” you just need to subtract the pre-training and post-training strength measures from the average strength measure for each subject, and square the results. So, for instance, if someone had a 1RM of 30kg pre-training and 40kg post-training, their individual “SSMeasure” would be (30-35)^2+(40-35)^2 = 50. To calculate “SSMeasure” for the entire sample, you just repeat this process for all of the subjects, and sum the results.
For “SSError”, you’d first calculate how much each subject’s strength would be expected to change, given their change in quadriceps volume. For the entire sample, quadriceps volume increased by an average of 237.5cm³, and 1RM knee extension strength increased by an average of 14.6kg. So, you’d expect each subject’s knee extension 1RM to increase by 14.6/237.5 = 0.0615kg per cm³ increase in quadriceps volume. So, for instance, if someone’s quadriceps volume increased by 300cm³, you’d expect their 1RM knee extension to increase by 300 * 0.0615 = 18.45kg.
From there, you fit a regression line with a slope of 0.0615kg per cm³ that passes through the point corresponding to the average of the subject’s pre- and post-training quadriceps volume (x-coordinate) and the average of their pre- and post-training knee extension 1RM (y-coordinate), and solve for their expected pre- and post-training knee extension 1RMs at x-coordinates corresponding to their pre- and post-training quadriceps volumes. Then, subtract the expected knee extension 1RM values from the actual values, square both results, sum them together, and repeat the process for all other subjects.
Once you’ve calculated your sums of squares, you just plug those values into the formula above.
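The hand calculation described above can be sketched in Python as follows. This follows the steps as written in this article (the function name and the example subjects are hypothetical). Note that dedicated repeated measures correlation software, such as the rmcorr R package, estimates the common slope by least squares within an ANCOVA and derives its sums of squares from that fit, so its output may differ somewhat from this sketch.

```python
import numpy as np

def rm_corr_by_hand(x, y):
    """Follow the hand calculation described in the text.

    x, y: arrays of shape (n_subjects, 2) holding pre and post values.
    Returns (r, ss_measure, ss_error).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = x.mean(axis=1, keepdims=True)
    y_mean = y.mean(axis=1, keepdims=True)

    # SSMeasure: squared deviations of each y value from that subject's own mean.
    ss_measure = ((y - y_mean) ** 2).sum()

    # Common slope: average change in y divided by average change in x.
    slope = (y[:, 1] - y[:, 0]).mean() / (x[:, 1] - x[:, 0]).mean()

    # Expected y values on a line with that slope through each subject's mean point.
    y_expected = y_mean + slope * (x - x_mean)
    ss_error = ((y - y_expected) ** 2).sum()

    r = np.sqrt(ss_measure / (ss_measure + ss_error))
    return r, ss_measure, ss_error

# Single-subject check from the text: 30 kg pre-training, 40 kg post-training.
_, ss_one, _ = rm_corr_by_hand([[1800.0, 2000.0]], [[30.0, 40.0]])
print(ss_one)  # 50.0

# Hypothetical subjects with different baselines but the same 0.05 kg per cm^3 slope.
volume = [[1800.0, 2000.0], [1500.0, 1800.0], [2200.0, 2300.0]]
one_rm = [[110.0, 120.0], [80.0, 95.0], [150.0, 155.0]]
r_same_slope, _, ss_err = rm_corr_by_hand(volume, one_rm)
print(r_same_slope, ss_err)
```

When every subject sits exactly on the common slope, the error term vanishes and the coefficient comes out at essentially 1, regardless of how different the subjects' baselines are.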
So, let’s circle back to our question: If hypertrophy had no impact on strength in this study by Marques and colleagues, how strong would the repeated measures correlation still appear to be?
Even if hypertrophy and strength gains were entirely unrelated, and hypertrophy had no impact on strength gains, the repeated measures correlation performed in the Marques study would have returned r-values in the range of 0.81-0.83. In other words, repeated measures correlation would make it appear that hypertrophy explained ~65-70% of the variance in strength gains in this study, even if hypertrophy actually had 0 impact on strength gains, and no relationship to strength gains whatsoever.
With that context, the reported r-values of 0.89-0.92 look considerably less impressive. To be clear, these values are meaningfully higher than the r-values we’d expect to see in the null case outlined above, at least nominally. But, they roughly imply that hypertrophy only explains about 12-17% more variance (additively) than would be explained by random noise matching the subjects’ summary statistics. Though, to be charitable, that additive difference of 12-17% accounts for about 35-55% of the unexplained variance. In other words, if you’d expect to see an r² value of approximately 0.67 when the actual correlation is 0, we can just treat an r² value of 0.67 as our effective r² of 0, and scale from there. If the difference between “no correlation” and “perfect, causal relationship” is (a bit less than) 33 points instead of 100, an additional 12 points gets us about 37% of the way from our effective 0 to 100, and an additional 17 points gets us about 53% of the way from our effective 0 to 100.
Relationship between gains in quadriceps volume and gains in knee extension 1RM

| | r | r² |
|---|---|---|
| Between-subjects correlation | 0.35 | 0.12 |
| Repeated measures (within-subjects) correlation | 0.89 | 0.79 |
| Additional variance explained in repeated measures correlation after accounting for the null case | n/a | 0.37 |

Relationship between gains in quadriceps volume and gains in knee extension isometric torque

| | r | r² |
|---|---|---|
| Between-subjects correlation | 0.60 | 0.36 |
| Repeated measures (within-subjects) correlation | 0.92 | 0.85 |
| Additional variance explained in repeated measures correlation after accounting for the null case | n/a | 0.53 |

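The rescaling behind the bottom row of each table is simple arithmetic, sketched here in Python. The null-case value of r = 0.82 is the approximate figure from the simulation described above, and the function name is hypothetical.

```python
def variance_beyond_null(r_observed, r_null):
    """Return (additive, rescaled) variance explained beyond the null case."""
    r2_observed = r_observed ** 2
    r2_null = r_null ** 2
    # Extra variance explained, in absolute terms.
    additive = r2_observed - r2_null
    # The same difference as a share of the variance the null case leaves unexplained.
    rescaled = additive / (1.0 - r2_null)
    return additive, rescaled

r_null = 0.82  # approximate null-case r from the simulation
add_1rm, scaled_1rm = variance_beyond_null(0.89, r_null)
add_iso, scaled_iso = variance_beyond_null(0.92, r_null)

print(f"1RM:              +{add_1rm:.2f} additive, {scaled_1rm:.2f} rescaled")
print(f"Isometric torque: +{add_iso:.2f} additive, {scaled_iso:.2f} rescaled")
```

The rescaled values are sensitive to the null-case estimate, so a slightly different simulated r (say, 0.81 or 0.83) will shift them by a few points.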
My general take is that the values at the bottom of these tables likely provide the fairest and most accurate description of the study’s results (and they imply that hypertrophy may explain roughly 17-25 percentage points more of the variance in strength gains than between-subjects correlations on change scores would suggest – 37% instead of 12% for hypertrophy and 1RM strength, and 53% instead of 36% for hypertrophy and isometric torque). I do think standard between-subjects correlations underestimate the actual strength of the relationship between hypertrophy and strength gains, but it’s pretty clear that repeated measures correlation (when interpreted uncritically) overshoots the strength of the relationship to a hilarious degree. Just to hammer this point home, the correlation coefficient for the relationship between hypertrophy and 1RM strength in the Marques study (r = 0.89) doesn’t even differ significantly (statistical significance; i.e., p > 0.05) from the null case (r = 0.82), since the reported confidence interval extended from r = 0.81-0.94.
For emphasis: Repeated measures correlations produce such inflated r-values that we can’t be confident that r = 0.89 actually implies the existence of an association meaningfully stronger than what we’d expect to see purely by chance.
I have a few more notes about repeated measures correlations, but first, I’d just like to make it clear that I have no major criticisms of the study itself, or even the choice of statistical approach. I do think the way the results were presented and discussed is a bit off the mark, but not in any way that’s suggestive of dishonesty or an attempt to deceive. I don’t think it would have hurt to do some simulations before running the study, and I think some simulations would have made it clear to the authors that correlation coefficients of ~0.9 with repeated measures correlations require a more cautious interpretation than correlation coefficients of a similar strength from between-subjects analyses – but I also applaud them for trying something new since the typical approach (Pearson correlations on percent changes) has plenty of its own shortcomings.
Also, it’s worth noting that even their “old school” between-subjects Pearson correlation analysis found stronger correlations than we typically see in untrained lifters (r = 0.35 for the relationship between hypertrophy and changes in 1RM, and r = 0.60 for the relationship between hypertrophy and changes in isometric strength). As discussed in a recent article, we typically see weaker correlations between hypertrophy and strength gains in the research than most people would expect, and I strongly suspect that some of the factors explaining these relatively weak correlations are just boring statistical considerations. There will always be some degree of measurement error, but if there’s a longer duration between measurements, and (relatedly) if larger actual gains in strength and muscle size can occur, your measurements will necessarily reflect relatively more signal and relatively less noise.
This study by Marques ran for longer than most studies on untrained lifters (15 weeks), resulting in quite a bit more hypertrophy and slightly larger strength gains than we typically see. Furthermore, one of their strength measurements required minimal skill (maximal isometric knee extension torque), which helps reduce the impact of a major confounder – different rates and degrees of skill acquisition influencing strength adaptations. So, although the Pearson correlations in this study were quite a bit stronger than we typically see in studies on untrained subjects, they also don’t surprise me too much, and I applaud the researchers for the overall methodological quality of the study.
With that said, I just want to close with a final word of caution about repeated measures correlations, because I strongly suspect they’ll start cropping up more frequently in the exercise science literature. As mentioned above, my primary interest was in poking around at the “null case” – the types of apparent associations we should expect to see when no relationship actually exists. What I found was that repeated measures correlations are only minimally affected by mean change scores and variability between subjects at baseline, but they’re very sensitive to change score SDs (i.e., change score coefficients of variation).
Strength and hypertrophy outcomes tend to have coefficients of variation in the neighborhood of 1.0, meaning that the standard deviations of the changes we observe tend to be comparable to the mean changes. In other words, if the average strength increase in a study is 10kg, the standard deviation tends to be somewhere between 5 and 15kg. See Figure 8C and D here. When strength and hypertrophy outcomes are both positive and both have CVs between 0.5-1.5, repeated measures correlations in the range of 0.65-0.85 may actually be suggestive of little-to-no association. The smaller the CVs, the higher the repeated measures correlation will be, even when there’s no causal association between variables.
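To see how the null-case value moves with the change-score CVs, here's a rough Python sketch. It applies the hand calculation described earlier in this article to independent, normally distributed change scores with a mean of 1 unit, so the SD equals the CV. The sample size, the seed, and the normality assumption are illustrative choices. Dedicated rmcorr software, which fits the common slope by least squares, may return somewhat different null values.

```python
import numpy as np

def null_r_by_hand(cv_x, cv_y, n=100_000, seed=0):
    """Null-case r for independent change scores with the given CVs."""
    rng = np.random.default_rng(seed)
    # Independent change scores with a mean of 1 unit, so the SD equals the CV.
    dx = rng.normal(1.0, cv_x, n)
    dy = rng.normal(1.0, cv_y, n)
    # Common slope as described in the text: mean change in y over mean change in x.
    slope = dy.mean() / dx.mean()
    # With two time points, pre and post each sit half a change score from the
    # subject's mean, so each sum of squares is half the squared change.
    ss_measure = (dy ** 2).sum() / 2.0
    ss_error = ((dy - slope * dx) ** 2).sum() / 2.0
    return float(np.sqrt(ss_measure / (ss_measure + ss_error)))

results = {cv: null_r_by_hand(cv, cv) for cv in (0.5, 1.0, 1.5)}
for cv, r in results.items():
    print(f"CV = {cv:.1f}: null r = {r:.2f}")
```

Baseline values never enter this calculation, because only each subject's deviations from their own mean matter.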

Looking back at the formula above, to get an r-value below 0.5, SSError must exceed SSMeasure threefold, which is pretty unlikely to occur, since the fixed slope for each individual corresponds to the average slope for your entire cohort. To get anywhere near r = 0.5, you’d need to have several very large outliers where the actual change is minimal, but the predicted change is very large (or vice versa – very small predicted changes, but a handful of very large actual changes). But, most of the time, repeated measures correlation will spit out a reasonably large correlation coefficient any time two mean changes are moving in the same direction. To illustrate, here’s a plot I whipped up in literally 2 minutes. The “baseline” measures for both variables are just 0 ± 1. Dummy 1 increases by 1 ± 1 unit, and Dummy 2 increases by 3 ± 1 units completely at random – there’s no interaction between Dummy 1 and Dummy 2 whatsoever. And yet, r_rm = 0.83.

And, for the record, a standard Pearson correlation nails the (lack of) correlation:

Again, I’m not saying there’s anything wrong with repeated measures correlation. However, if repeated measures correlations start showing up in publications more frequently, I think that appropriately interpreting them is going to take quite a bit more work than most people are accustomed to when encountering a correlation coefficient. I’m fully braced for a load of apparent r = 0.7-0.9s that end up being complete mirages, and I’m writing this article to send to folks every time I need to explain that r = 0.7-0.9 doesn’t always mean what you think it means. Literally speaking, r = 0.8 with repeated measures correlations means the same thing as r = 0.8 with Pearson correlation (i.e., they just communicate the degree of variance explained by your statistical model), but they can imply very different things about how strongly changes in one variable are predicted by changes in another variable.
At the very least, if you’re using repeated measures correlations in a study you’re conducting, then please run some simulations to estimate the r and r² values you should expect with null results. Whip up a simulated dataset matching the descriptive statistics of your results (or the results you expect to see) where the outcomes you’re running correlations on are, by design, completely independent and unrelated to each other. See what repeated measures r-value functionally equates to r = 0 for your dataset, and keep that in mind when interpreting your results. I think that focusing on the additive or relative increase in r or r² values beyond what you’d expect to see in the null case is probably more representative of the strength of your findings than just presenting the r-value as-is, and interpreting it the same way you’d interpret a basic Pearson correlation.
And, the same applies if you’re a consumer of scientific information encountering repeated measures correlation. Don’t just take the reported results at face value, or interpret the r-values the same way you’d interpret any other correlation coefficient. Thankfully, there’s a shiny app (made and maintained by the authors of the paper about repeated measures correlation cited above) that makes the analysis much easier – you just need to be able to whip up some dummy data in Excel, and you’re off to the races. That shiny app was invaluable for me while I was working on this article, to make sure my understanding of repeated measures correlations was correct.
So, just to wrap things up:
- Hypertrophy does probably contribute more to strength gains in new lifters than prior studies suggested … but the relationship is nowhere near as strong as (what is typically implied by) r = 0.9.
- Be cautious when interpreting repeated measures correlations. That high r-value probably isn’t quite as impressive as it appears at first glance.

