Acquisition in Babies and Machines: Comparing the Learning Trajectories of LMs in Terms of Syntactic Structures (ATTracTSS Test Set)
Sarah Rossi (Writing – Original Draft Preparation)
Guido Formichi (Writing – Original Draft Preparation)
Sofia Neri (Writing – Review & Editing)
Tommaso Sgrizzi (Writing – Review & Editing)
Asya Zanollo (Data Curation)
Cristiano Chesi (Conceptualization)
2025-01-01
Abstract
A cognitively plausible language model should (i) process language incrementally, (ii) be trained on naturalistic input, and (iii) mirror the developmental stages observed in child language acquisition. This study focuses on the third point: it examines how closely the developmental patterns of language models (LMs) adhere to the predictions of two empirically grounded theories of syntactic acquisition, the Growing Trees and the Neo-Emergentist approaches. Using a perplexity-based evaluation method, we test whether small and medium Italian-tuned LMs (two small GPT2 LMs, GePpeTto, and Minerva-7B) show sensitivity to syntactic phenomena corresponding to three acquisitional stages documented in child Italian. Our results suggest that smaller open models only partially reflect the stagewise progression observed in children.
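The abstract mentions a perplexity-based evaluation without giving details. As a minimal sketch, assuming the standard definition of perplexity (the exponential of the negative mean log-probability per token) and assuming a minimal-pair comparison between a grammatical and an ungrammatical sentence, the idea might look as follows. The function names `perplexity` and `prefers_grammatical` are hypothetical, and the log-probabilities in the usage lines are invented toy values, not results from the study; the paper's actual procedure may differ.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Return exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def prefers_grammatical(gram_logprobs: Sequence[float],
                        ungram_logprobs: Sequence[float]) -> bool:
    """Assumed criterion: the model is 'sensitive' to a contrast when the
    grammatical sentence receives the lower perplexity."""
    return perplexity(gram_logprobs) < perplexity(ungram_logprobs)


# Toy per-token log-probabilities (illustrative only, not real model output).
grammatical = [-1.2, -0.8, -1.5, -0.9]
ungrammatical = [-1.2, -2.6, -3.1, -0.9]
print(prefers_grammatical(grammatical, ungrammatical))
```

In practice the per-token log-probabilities would come from the LM under test; normalising by sentence length, as perplexity does, keeps sentences of different lengths comparable.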


