The relevance of and need for youth physical fitness (PF) testing are currently emphasized in many settings, based on a multi-faceted structure of PF (i.e., cardiovascular endurance, strength, etc.; e.g., Fleishman, 1964). In practice, PF is often assessed using identical item sets across childhood. The aim of this study is to investigate the developmental validity of unchanged item sets over time as well as the development of the dimensional structure underlying PF assessments. In the context of the project 'healthy children in sound communities' (Naul et al., 2012), a total of 4417 children (N6years=790, N7years=1371, N8years=1331, N9years=925; 48.2 % female) completed a fitness test. The test consists of a 6-min run, push-ups, sit-ups, standing broad jump, 20-m sprint, jumping sideways, and balancing backwards, covering different dimensions of PF (i.e., cardiovascular endurance, strength, speed, coordination). Mixed-Rasch analyses (Rost, 1990), conducted separately for each age group, show that for six- to eight-year-old children only the one-class solution yields acceptable fit indices with ordered threshold parameters for all items (.12 ≤ Qi ≤ .16). For nine-year-old children, the one-class solution is acceptable (.09 ≤ Qi ≤ .14) only if balancing backwards is excluded because of unordered threshold parameters at the upper proficiency level. These results indicate a one-dimensional rather than a multi-faceted structure underlying fitness assessments in childhood. As a practical implication, using identical item sets, which is simple and easily understandable, is feasible, but identical item sets are not necessarily valid across developmental stages.