This dissertation contributes to the growing literature on green entrepreneurship by developing novel methodological tools to identify and analyze green startups, a key indicator of the transition toward sustainable growth. Despite the increasing relevance of green entrepreneurship for policymakers, society, and scholars, empirical research in this field faces substantial challenges, particularly the limited availability of data and the lack of a clear operational definition of what constitutes a green startup. To address these issues, this study leverages recent advances in Natural Language Processing (NLP) to propose new approaches for identifying and examining green startups in Italy and for analyzing how they establish legitimacy.

The first chapter introduces an innovative NLP-based methodology to identify green startups and compares it with two alternative approaches commonly used to capture the broader category of sustainable startups. To operationalize the concept of green startup, we develop a classification framework grounded in the United Nations Sustainable Development Goals (SDGs), which comprise 17 goals aimed at advancing social, economic, and environmental sustainability. Focusing specifically on environmental dimensions, we apply NLP techniques to extract environment-related content from startup websites and cluster it into 14 green thematic topics. We then implement three text-based approaches, namely Dictionary, Latent Dirichlet Allocation (LDA), and BERTopic, on a dataset of 10,939 websites of Italian innovative startups. The results indicate that the three methods identify partially overlapping but distinct sets of green startups, as each approach captures different dimensions of environmental engagement. We further relate the regional distribution of identified startups to progress on the SDGs and to expenditures under the National Recovery and Resilience Plan, showing that entrepreneurial activity and public policy converge on issues such as sustainable energy, sustainable practices, and air quality, while public spending plays a relatively stronger role in water quality and disaster resilience.

The second chapter investigates the emergence of the green startups identified in the first chapter by extending the Knowledge Spillover Theory of Entrepreneurship (KSTE). We examine how green demand, the local stock of knowledge, and its composition influence the creation of green startups across Italian provinces. Consistent with prior literature, we find that knowledge stocks positively affect green startup formation. Green demand also exerts a positive effect and strengthens the impact of local knowledge stocks. These findings suggest that green demand facilitates entrepreneurs’ ability to exploit knowledge spillovers by softening the Knowledge Filter, increasing short-term expected returns on innovation investments, and accelerating the conversion of knowledge into new ventures. Moreover, green knowledge does not exert a stronger effect than non-green knowledge, indicating that green startups innovate by recombining diverse knowledge bases.

The third chapter examines how green and non-green entrepreneurs build legitimacy and develop networks through narratives shared on social media. Using a subsample of 1,703 founders for whom LinkedIn profile data and posts were collected, we introduce a novel methodology based on Large Language Models to identify cognitive, normative, and pragmatic legitimacy claims in entrepreneurial communication. We find that these different forms of legitimacy claims significantly influence online legitimacy, measured by audience engagement in terms of reactions to posts.

Overall, the dissertation advances both methodological and theoretical understanding of green entrepreneurship by integrating text-based methods, regional economic analysis, and the study of digital entrepreneurial narratives.
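As a rough illustration of the simplest of the three text-based approaches named in the abstract, the sketch below shows how a dictionary method might tag website text with green topics, and how the overlap between the sets of startups flagged by two methods could be measured. It is not the dissertation's implementation: the topic labels echo themes mentioned in the abstract, but the keyword lists, the function names `match_topics` and `jaccard`, and the matching rule are all assumptions made for this example.

```python
import re

# Hypothetical dictionary: the keywords are illustrative assumptions,
# not the lists used in the dissertation.
GREEN_DICTIONARY = {
    "sustainable energy": ["solar", "photovoltaic", "wind power", "renewable energy"],
    "air quality": ["air quality", "emissions", "air pollution"],
    "water quality": ["water quality", "wastewater", "water treatment"],
}


def match_topics(text, dictionary=GREEN_DICTIONARY):
    """Return the sorted topics with at least one keyword present in the text.

    Matching is case-insensitive and restricted to whole words or phrases.
    """
    lowered = text.lower()
    matched = set()
    for topic, keywords in dictionary.items():
        for keyword in keywords:
            if re.search(r"\b" + re.escape(keyword) + r"\b", lowered):
                matched.add(topic)
                break  # one hit is enough to assign the topic
    return sorted(matched)


def jaccard(a, b):
    """Jaccard overlap between two collections of startup identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

Under these assumptions, a startup would count as green for the dictionary method when `match_topics` returns at least one topic, and `jaccard` applied to the identifier sets produced by any two methods gives a single number for how far they overlap.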

This thesis contributes to the literature on green entrepreneurship by developing new methodological tools to identify and analyze green startups, a key indicator of the transition toward sustainable growth models. In response to the persistent challenges of defining and measuring the phenomenon, we propose an innovative approach based on Natural Language Processing (NLP) to identify green startups from the textual content of their websites. Starting from a classification framework grounded in the United Nations Sustainable Development Goals (SDGs), we identify 14 environmental thematic areas and compare three text-analysis methods, namely Dictionary, Latent Dirichlet Allocation (LDA), and BERTopic, applied to 10,939 Italian innovative startups. The results show that the different approaches identify partially overlapping but distinct sets of green startups, highlighting how each method captures different dimensions of environmental engagement. The analysis also reveals a convergence between entrepreneurial activity and public policy (including spending under the National Recovery and Resilience Plan) on themes such as sustainable energy, sustainable practices, and air quality, while public intervention is relatively more prominent in areas such as water quality and disaster resilience. The second chapter examines the territorial determinants of the emergence of green startups in Italian provinces, extending the Knowledge Spillover Theory of Entrepreneurship. The results show that local knowledge stocks foster the creation of green startups and that green demand not only has a positive direct effect but also amplifies the impact of available knowledge. Green demand softens the Knowledge Filter, raising the short-term expected returns on innovation and facilitating the conversion of knowledge spillovers into new entrepreneurial ventures. Moreover, green knowledge is no more relevant than non-green knowledge, suggesting that innovation in green startups stems from the recombination of heterogeneous knowledge. Finally, the third chapter analyzes how green and non-green entrepreneurs build legitimacy and relational networks through shared narratives, helping to explain the symbolic and communicative dynamics that support the establishment of sustainable entrepreneurship. Overall, the thesis offers a methodological and theoretical contribution to the study of green entrepreneurship, integrating advanced text-analysis tools with a multilevel economic and institutional analysis.

Imprenditoria verde: identificazione, emergere e legittimità / Le Masle, Baptiste Erouan Antoine. - (2026 May 13).

Imprenditoria verde: identificazione, emergere e legittimità

LE MASLE, BAPTISTE EROUAN ANTOINE
2026-05-13

13 May 2026
SVILUPPO SOSTENIBILE E CAMBIAMENTO CLIMATICO (Sustainable Development and Climate Change)
COLOMBELLI, ALESSANDRA
BIANCHINI, STEFANO
Files in this item:
File: final thesis Le Masle.pdf
Access: open access
Description: Final thesis
Type: Doctoral thesis
Size: 46.38 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12076/25521
Warning: the data displayed have not been validated by the university.
