Open Access Dissertation

Document Type


Degree Name

Doctor of Philosophy (PhD)

Degree Program


Year Degree Awarded


Month Degree Awarded


First Advisor

Patty S. Freedson

Subject Categories

Exercise Science


The aim of study one of this dissertation was to compare consumer activity trackers (ATs) with the research-grade ActiGraph™ GT3X-BT accelerometer (AG) in estimating energy expenditure (EE) and steps during orbital shaking at different frequencies. To address this aim, we used an electronic orbital shaking protocol (twenty-four 3-minute trials; 2-hour trials). For all comparisons, the AG served as the reference measure. In the 3-min protocol, we showed that, on average, the NL-1000 pedometer (NL) produced the lowest error (-9 steps/3-min) at 0.9 Hz (corresponding to moderate intensity). The magnitude of the error for the NL was 14 steps/3-min at a frequency of 3.0 Hz (corresponding to very vigorous intensity). For the 2-hr protocol, estimates from the other devices were equivocal. Some ATs overestimated steps (bias range: 1,331 steps/2-hrs for the Misfit Shine to 1,921 steps/2-hrs for the Misfit Flash [MFF]); for these devices, the bias in estimated EE ranged from 26.6 kcals/2-hrs for the MFF to 45.8 kcals/2-hrs for the Misfit Shine. Other ATs underestimated steps (bias range: -5,770 steps/2-hrs for the Garmin Vivofit [GV] to -570 steps/2-hrs for the NL); for these devices, the bias in estimated EE ranged from -436.8 kcals/2-hrs for the GV to -261.7 kcals/2-hrs for the Fitbit Flex (FBF). This study provides evidence of differences in prediction algorithms by device across a broad range of oscillation frequencies that corresponded to different PA intensity levels.
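As a minimal illustration of how a bias value of this kind can be computed, the sketch below takes the mean of device-minus-reference differences across trials, so that negative values indicate underestimation relative to the AG. The function name and the step counts are hypothetical and are not data from the dissertation; the abstract does not state the exact formula used, so the mean difference is an assumption.

```python
from typing import Sequence


def mean_bias(device: Sequence[float], reference: Sequence[float]) -> float:
    """Mean of (device - reference) over paired trials.

    Assumption: bias is the simple mean difference; negative values
    mean the device underestimates relative to the reference.
    """
    if len(device) != len(reference) or not device:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [d - r for d, r in zip(device, reference)]
    return sum(diffs) / len(diffs)


# Hypothetical step counts for three 3-minute trials (illustrative only).
reference_steps = [300, 310, 290]  # reference device
tracker_steps = [280, 300, 290]    # consumer tracker

bias = mean_bias(tracker_steps, reference_steps)
print(f"bias: {bias:.1f} steps/3-min")
```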

For study two, we sought to determine the accuracy and precision of ATs in estimating steps, EE, activity minutes and sedentary time compared to direct observation (DO)-derived measures (criterion measures) in free-living settings. We also validated commonly used research-grade devices (e.g., hip-worn AG [AGhip], wrist-worn AG [AGwrist]). Thirty-two healthy men and women (50% female, 37.5% minority; mean ± SD: age = 32.3 ± 13.3 years; BMI = 24.4 ± 3.3 kg∙m⁻²) were directly observed while completing three 2-hour visits on different days while wearing ten ATs, three research-grade devices and a biometric shirt. A validated DO system was used to derive criterion measures for activity and sedentary time (ST) outcomes. ATs were accurate, with varying precision, in estimating physical activity (PA) behaviors in free-living settings. Additionally, ATs and research-grade accelerometers performed similarly (e.g., more accurate in estimating steps and less accurate in estimating moderate-to-vigorous PA [MVPA] minutes). For all devices, step estimates were accurate and strongly correlated with criterion measures (r range: 0.91 for the Apple iWatch to 0.97 for the AGhip), but EE and MVPA estimates were less accurate and more variable (EE: r = 0.32 [GV] to r = 0.85 [AGhip]; MVPA: r = 0.2 [NL] to r = 0.75 [AGhip]). For ATs, estimates of sedentary time were the least accurate and were weakly correlated with criterion measures derived from DO (r = 0.06 for the Fitbit One [FBO] and FBF). The implication is that consumers and researchers who use ATs such as Fitbit (FB) can be confident in step estimates but less confident in estimates of sedentary time. This study advances our understanding of the performance characteristics of ATs in free-living natural settings using a validated DO method to derive PA and ST measures. This work significantly advances the field of activity monitor validation and should set the standard for future work.

The aims of study three were: 1) to examine the ability of ATs to detect change in PA and ST in free-living settings and 2) to examine the ability of research-grade accelerometers to detect change in PA and ST in free-living settings. To address these aims, we used an innovative approach to analyze data from study two. We defined change as a visit-to-visit difference that was greater than the within-subject standard deviation of the criterion measure (estimated by a linear mixed model). Confusion matrices were used to examine percent agreement between DO visit-to-visit change and device visit-to-visit change. Key findings focused on the widely used FBO and FBF and on research-grade devices. We showed that agreement with criterion-measured change was similar for the hip-worn FBO and FBF and for the AGhip and AGwrist in estimates of change in steps (89.1% FBO, 88.8% FBF, 88.3% AGwrist and 91.4% AGhip correct classification), EE (73.4% FBO, 70.6% FBF and 77.0% AGhip correct classification) and, except for the FBF, MVPA minutes (79.7% FBO, 65.2% FBF, 71.2% AGwrist and 77.0% AGhip correct classification). However, change in ST was more difficult to detect for the FB and AGhip (46.8% FBO, 42.3% FBF, 53.1% AGhip and 72.7% AGwrist correct classification). This novel study provides evidence that, as an alternative to research-grade accelerometers, researchers may employ FB to measure step accumulation pre- and post-intervention and have a satisfactory level of confidence in FB change detection.
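The change-detection comparison described for study three can be sketched roughly as follows. This is an illustration under stated assumptions, not the dissertation's analysis code: the threshold is passed in as a plain number standing in for the within-subject standard deviation (the linear mixed model is not fitted here), the same threshold is applied to both criterion and device differences, change is split into three classes (increase, decrease, no change), and all function names and values are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def classify_change(diff: float, threshold: float) -> str:
    """Label a visit-to-visit difference relative to a threshold."""
    if diff > threshold:
        return "increase"
    if diff < -threshold:
        return "decrease"
    return "no change"


def change_agreement(
    criterion_diffs: Sequence[float],
    device_diffs: Sequence[float],
    threshold: float,
) -> Tuple[Dict[Tuple[str, str], int], float]:
    """Return a confusion matrix keyed by (criterion, device) labels
    and the percent of pairs that received the same label."""
    if len(criterion_diffs) != len(device_diffs) or not criterion_diffs:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = [
        (classify_change(c, threshold), classify_change(d, threshold))
        for c, d in zip(criterion_diffs, device_diffs)
    ]
    matrix = Counter(pairs)
    correct = sum(n for (c, d), n in matrix.items() if c == d)
    return dict(matrix), 100.0 * correct / len(pairs)


# Hypothetical visit-to-visit step differences (illustrative only).
criterion = [1200, -300, 2500, -1800, 100]  # direct-observation criterion
device = [1500, -200, 900, -2100, 50]       # activity tracker
threshold = 1000                            # assumed within-subject SD

matrix, pct = change_agreement(criterion, device, threshold)
print(matrix)
print(f"correct classification: {pct:.1f}%")
```

In this made-up example, the third pair is the only disagreement: the criterion difference exceeds the threshold while the device difference does not.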

This work significantly advances the field of activity monitor validation research, informs intervention practices, and should set the standard for future work. This body of work provides the first comprehensive validation of ATs, from highly controlled orbital shaker testing to directly observed free-living settings. This translational research has broad applications for the use of ATs in surveillance and intervention research and by consumers.