
Reflecting trends in the academic landscape of sustainable energy using probabilistic topic modeling



Facing planetary boundaries, we need a sustainable energy system providing its life support function for society in the long-term within environmental limits. Since science plays an important role in decision-making, this study examines the thematic landscape of research on sustainable energy, which may contribute to a sustainability transformation. Understanding the structure of the research field allows for critical reflections and the identification of blind spots for advancing this field.


The study applies a text mining approach to 26,533 Scopus-indexed abstracts published from 1990 to 2016 based on a latent Dirichlet allocation topic model. Models with up to 1100 topics were created. Based on coherence scores and manual inspection, the model with 300 topics was selected. These statistical methods served for highlighting temporal topic trends, differing thematic fields, and emerging communities in the topic network. The study critically reflects on the quantitative results from a sustainability perspective.


The study identifies a focus on establishing and optimizing the energy infrastructure towards 100% renewable energies through key modern technology areas: materials science, (biological) process engineering, and (digital) monitoring and control systems. Energy storage, photonic materials, nanomaterials, or biofuels belong to the topics with the strongest trends. The study identifies decreasing trends for general aspects regarding sustainable development and related economic, environmental, and political issues.


The discourse is latently adopting a technology-oriented paradigm focusing on renewable energy generation and is moving away from the multi-faceted concept of sustainability. The field has the potential to contribute to climate change mitigation by optimizing renewable energy systems. However, given the complexity of these systems, horizontal integration of the various valuable vertical research strands is required. Furthermore, the holistic ecological perspective considering the global scale that has originally motivated research on sustainable energy might be re-strengthened, e.g., by an integrated energy and materials perspective. Beyond considering the physical dimensions of energy systems, existing links from the currently technology-oriented discourse to the social sciences might be strengthened. For establishing sustainable energy systems, future research will not only have to target the technical energy infrastructure but put a stronger focus on issues perceivable from a holistic second-order perspective.


For a vital society, the energy system has always been a key component. Facing climate change in the context of “planetary boundaries” [1, 2], it is crucial to establish a sustainable energy system that can provide its life support function for society in the long-term within environmental limits [3,4,5]. This triggers the twofold question: What are the main elements of an energy system, and what is an optimum design of our human-made energy system, in particular, with respect to its interactions with nature [6, 7]? As a response to this question, the normative multi-faceted concept of sustainability has developed into a popular guiding principle over the past decades [8,9,10,11,12,13,14]. It is expected to lead to a socio-technical system that does not exceed the “carrying capacity” of the natural system [9, 15].

These thoughts increasingly influence guidelines, initiatives, and policies, as shown by various examples. To name a few, at the global level, the Sustainable Development Goals of the United Nations promote sustainability in different sectors and, for the energy sector, propose to “ensure access to affordable, reliable, sustainable and modern energy” via Goal 7 [16]. Regarding the European continent, the European Energy Strategy aims at a “sustainable, competitive and secure energy system” [17]. At a national level, examples are China's 13th Five-Year Plan incorporating aspects of “green development” [18, 19] or the German Energy Transition supporting a systemic shift towards renewable energies [20, 21]. Of course, these examples are not final answers and face various challenges [22,23,24,25,26]. However, they are steps within a transformation towards sustainability, of which the field of sustainable energy is a crucial element.

Science has been a key initiator for the attention to sustainability and is to further play an active role [6, 27,28,29]. The outputs of the research branch concerned with sustainable energy are, therefore, precursors of a potentially sustainable global energy system of the future. For anticipating the nature of this kind of system, this study analyzes the structure of the hybrid research field of energy and sustainability. The following background briefly summarizes research trends based on a high-level qualitative review. This initial overview is extended throughout this study by a text-mining approach.

Screening the 100 most cited scientific review articles containing the term “sustainable energy” indexed in the Scopus database (Digital Object Identifiers see Additional file 1) shows that the dominant themes are electricity generation by renewables, bioenergy, storage technologies, and especially materials science. A general observation is that the reviews largely take a techno-economic perspective. The majority focuses on individual technologies, discusses the technical energy infrastructure while considering the costs of energy generation, or envisions various technological pathways.

Regarding renewable energy generation, photovoltaic energy might develop into the primary future energy source due to expected efficiency improvements, especially through intensive research on advancing materials for solar cells [30,31,32,33,34,35]. Wind energy, in particular, direct-drive turbine technology, might be the major secondary source [31,32,33,34,35]. In general, solar and wind-powered systems are deemed to have tolerable effects on ecosystems [32, 33, 35]. However, these effects will require continued attention [34, 36]. Since fossil fuels will remain competitive in the near future, some articles also refer to efficiency measures for fossil power plants [34, 35]. For decarbonization, carbon capture, storage, and use (CCSU), as well as nuclear energy, are also considered, while acknowledging the lack of conclusive risk assessments [32,33,34,35]. Several articles consider geothermal energy as a technology that, despite the untapped potential, has received comparably low attention [33, 34, 37]. However, the expansion of deep geothermal power generation will require diligent risk assessments [34].

The complexity of renewable energy systems stemming from intermittent and decentralized generation has motivated research in the field of storage technologies, modeling, and smart energy. For countering intermittency, electrochemical energy storage via batteries, fuel cells using various storable fuels, or supercapacitors is intensively investigated [35, 38,39,40]. Given the plethora of technological options, optimization modeling tools have become essential for the efficient, dynamic, and economically viable planning and operation of renewable energy systems [31, 35, 37]. In this context, some articles propose smart energy systems as a promising techno-economic approach for establishing efficient energy systems by integrating thermal, gas, and electricity infrastructure [41, 42].

The future of fuels in the transportation sector but also other sectors seems less clear than the future of electricity generation. Bioenergy and biotechnology for producing biofuels have received considerable attention [43,44,45], especially the production of bioethanol through fermentation of sugarcane or grains [31, 33, 46, 47]. Due to limited sustainable availability of these feedstocks [33, 34], research also investigates alternative feedstocks such as lignocellulosic materials or microalgae and conversion technologies such as biorefineries or microbial reactors [33, 46, 48,49,50,51,52,53,54,55,56]. Another technological pathway discussed is the hydrogen economy based on water splitting [33, 57,58,59,60,61,62]. The direct use of hydrogen in fuel cells is a desirable long-term option; however, establishing the required hydrogen infrastructure is challenging [34, 39, 57, 63]. Therefore, synthesizing hydrocarbon fuels and distributing them via existing infrastructures is discussed [33]. Another pathway would be the methanol economy based on the yet impracticable artificial photosynthesis [63,64,65,66]. Considering this diversity of research on fuels, the electrification of the transportation sector might be achieved via fuel cells in the long-term but will probably be dominated by the already mature battery technologies in the short-term [35, 39].

Beyond discussing energy systems at large scale, the major share of highly cited reviews covers research on improving or creating advanced materials for energy technologies. Research has particularly focused on materials for batteries, fuel cells, or supercapacitors as well as hydrogen production and storage, thermoelectric devices, or solar cells [39, 40, 58, 63, 67,68,69,70]. In these applications, new materials can improve the performance of anodes, cathodes, electrolytes, catalysts, photoactive layers, diffusion layers, or storage structures [55, 58, 63, 69,70,71,72,73]. Materials science further has the potential to find replacements for materials with limited abundance such as noble metals in catalysts or electrodes [61, 63, 69, 71,72,73,74,75,76], lithium in batteries [39], or rare-earth magnetic materials in wind turbines [35, 63]. However, this field of materials science is often experimental, and no straightforward pathway regarding the choice of materials has emerged [69]. As a special field, nanoscience has become crucial for energy technologies [63, 68]. Nanotechnology allows synthesizing organic, inorganic, or composite structural elements such as particles, fibers, grids, thin-films or three-dimensional structures such as nanotubes, porous membranes, metal-organic frameworks, or carbon gels [63, 68, 69, 75, 77,78,79]. Controlling the geometric, physical, chemical, or electrical properties of materials can significantly increase the efficiency or functionality of various technologies regarding, e.g., surface to volume ratio, kinetics, conductivity, storage capacities, or stability of materials [39, 68, 69, 75].

Beyond the above themes that represent the main focus of the examined review articles, several themes receive less attention. Only a few articles discuss details regarding the conditions and developments in individual end-use sectors. There is only one article dedicated to a specific sector, i.e., the building sector [80]. Furthermore, despite the focus on materials science, only a few articles explicitly address the scalability of technologies considering the availability of chemical elements [81]. Non-technological aspects such as potential rebound effects, consumer behavior and energy-saving lifestyles, or political measures are only marginally discussed [34, 35, 82]. However, regarding issues of decision-making, a few articles deal with the details of advanced multi-criteria decision-making methods, especially for energy system planning [83, 84].

For extending the insights gained from the above qualitative review, this study uses a quantitative exploratory text mining approach. Qualitative reviews provide valuable insights. However, they are limited in the number of articles that can be analyzed and are, thus, selective to a certain extent. Therefore, they might not cover the full thematic breadth in a representative way. Considering the potential that large-scale text-based mappings of academic fields offer [29, 85,86,87,88], this study applies a text mining approach for detecting unknown patterns from text [89,90,91,92]. This approach does not replace the qualitative review method but offers a different integrated perspective that allows large-scale quantitative analyses and is unbiased regarding the selection of articles.

For reflecting latent paradigms in the academic field of sustainable energy and triggering further questions from an overarching perspective, this study quantitatively maps the scientific discourse with regard to prevalent topics, trends, and research communities. For this purpose, it applies a probabilistic topic modeling approach [93,94,95]. Several studies used topic modeling for investigating, e.g., research on hydropower [96], transportation [97], or knowledge flows in energy research in general between researchers in Asia and the USA [98]. However, topic modeling has not yet been applied for a broad mapping of research on sustainable energy. This study uses this method for analyzing research trends quantitatively without specifying themes a priori. In addition, it highlights the interconnections between topics and detects research communities. Thereby, it extends other applied topic modeling studies. A few studies deal with correlations of topics but follow a methodological rather than a content-related interest [85]. So far, applied topic modeling studies usually do not highlight topic co-occurrence [87, 88, 96,97,98]. Moreover, while other text mining studies often focus on reporting quantitative results [86, 96,97,98], this study goes further. It critically reflects on the modeling results through a detailed qualitative discussion from a sustainability perspective.

Using this large-scale approach, this study provides a new overall picture of the scientific discourse on sustainable energy. Beyond reporting and discussing themes side by side as it is done, e.g., in several of the studies screened above, it ranks thematic trends and provides a not yet available quantitative indication where the discourse is heading. It improves the understanding of the structure of the research field and assesses the integration of sustainability elements therein. This study can confirm several of the trends identified in the above literature review, show some in a different light, identify new trends, and point out blind spots to support advancing research on sustainable energy. In general, this study shows that the discourse is latently adopting a technology-oriented paradigm and is moving away from the multi-faceted concept of sustainability. Based on this empirical evidence, it highlights several important perspectives to be considered in future research and enables adjusting research priorities from a holistic sustainability perspective.

The rest of this paper is organized as follows. The section “Materials and methods” provides details on the text data used. It also explains and reflects on the text mining methodology. The section “Results” presents the output of topic modeling regarding the temporal trends, thematic fields, and the network of topics. The section “Discussion” examines the findings concerning selected energy system stages and principles of sustainability. It further relates the results to the above background and selected literature. The final section provides summarizing conclusions.

Materials and methods

This section summarizes the applied text mining methodology and provides reflections on methodological limitations and prospects. Additional file 1 provides comprehensive technical details on the methodology and underlying statistical models. The code for this study was written in R [99]. The code and, to the extent permitted by copyright, the data used are available in an open-source repository [100].

Data—Scopus abstracts

The database of this study consists of 26,533 Scopus-indexed abstracts of original journal articles published in the period from 1990 to 2016 whose abstracts, titles, or keywords contain the terms sustainab* and energy. The exact search phrase used in November 2017 for collecting data was “TITLE-ABS-KEY(sustainab* AND energy) AND NOT INDEX(medline) AND (LIMIT-TO (DOCTYPE, “ar”) OR LIMIT-TO(DOCTYPE, “ip”)) AND (LIMIT-TO(SRCTYPE, “j”))”. The following pre-selection criteria served to exclude non-representative or low-quality entries: (i) minimum length of 200 characters per abstract and (ii) no duplicate entries regarding title, abstract content, or EID number. By using Scopus as a bibliometric database with less strict criteria for entering the index system in contrast to, e.g., Web Of Science, this study potentially covers a broad range of emerging lines of thought [101, 102]. A disadvantage of Scopus is the current limitation of bulk downloads making manual iterative downloading necessary.

Overview of the topic modeling methodology

The core of the methodological sequence programmed for this study is the basic latent Dirichlet allocation (LDA) [103] model. Considering the family of probabilistic topic models [93,94,95], LDA is one of the basic models that has successfully been applied for, e.g., identifying key topics in scientific discourses [87, 88, 96,97,98]. Given a corpus, i.e., a collection of documents, LDA assumes that each document is a mixture of topics and that each topic is a mixture of words [93]. These mixtures are probability distributions. Each topic has a certain probability of appearing in a specific document. Also, each term of the whole corpus has a certain probability of belonging to a specific topic. The posterior distributions inferred by LDA are stored in probability matrices. For instance, the document topic matrix has rows representing documents and columns representing topics, while each entry shows the prevalence of a topic in a given document. LDA uses a generative algorithm for inferring the posterior distributions that represent a fitted model for a corpus. Distributions of this kind can be analyzed and coupled with other meta-data of documents, e.g., the year of publication. For this study, the methodological sequence covers (i) pre-processing the raw abstracts for increasing the quality of the input data, (ii) topic modeling using LDA, and (iii) analysis of the output regarding topic trends over time, differing thematic fields in the corpus, and topic communities emerging from the network of topics.
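The mixture assumption described above can be illustrated with a toy example. The following is a minimal Python sketch (the study's own code is in R); the vocabulary, the two topics, and all probabilities are invented for illustration and are not taken from the fitted model.

```python
# Toy illustration of the LDA mixture assumption (all numbers are invented).
# theta: document-topic matrix (rows = documents, columns = topics)
# phi:   topic-term matrix (rows = topics, columns = terms)
vocab = ["solar", "cell", "policy", "market"]
theta = [
    [0.9, 0.1],  # document 0 is mostly topic 0
    [0.2, 0.8],  # document 1 is mostly topic 1
]
phi = [
    [0.5, 0.4, 0.05, 0.05],  # topic 0: a photovoltaics-like topic
    [0.05, 0.05, 0.5, 0.4],  # topic 1: a policy-like topic
]


def word_distribution(doc_topics, topic_terms):
    """Return p(term | document) = sum_k p(topic k | document) * p(term | topic k)."""
    n_terms = len(topic_terms[0])
    return [
        sum(doc_topics[k] * topic_terms[k][t] for k in range(len(doc_topics)))
        for t in range(n_terms)
    ]


for d, doc_topics in enumerate(theta):
    dist = word_distribution(doc_topics, phi)
    print(d, {term: round(p, 3) for term, p in zip(vocab, dist)})
```

Each row of `theta` and `phi` sums to one, so the resulting term distribution per document also sums to one. Fitting LDA means inferring such matrices from observed term counts, which this sketch does not attempt.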


An advanced pre-processing procedure served for increasing the data quality of the raw texts. First, several natural language processing (NLP) steps served for standardizing the symbolic representation and harmonization of individual terms. Second, using the TreeTagger software [104,105,106] and the koRpus package [107], a part-of-speech (POS) model served for identifying the grammatical structure of sentences in order to discard irrelevant word classes, here, everything but nouns, verbs, adjectives, and adverbs, and to lemmatize terms. An example of lemmatization, a method for the unification of terms, is provided in Additional file 1. Third, a collocation model detected n-grams, i.e., multi-word terms occurring to a statistically relevant degree in the corpus, to mitigate the assumption regarding the irrelevance of grammar by LDA (explanation see the following section). Collocation detection was performed for noun compounds [108] using pointwise mutual information (PMI) [109] and log-frequency biased mutual dependency (LFMD) [110] as statistical metrics for detecting meaningful collocations. In the final pre-processing step, the vocabulary of the corpus was pruned by setting thresholds for document occurrence of terms and term length and by removing stopwords, e.g., “and,” “the,” or “methodology.” By using this kind of advanced pre-processing procedure beyond common methods such as stemming or setting minimum term length thresholds, this study advances pre-processing procedures of previous studies with similar scope [87, 88, 96, 97].
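As a rough illustration of the collocation step, the sketch below scores adjacent token pairs with plain pointwise mutual information. It is a simplified Python example under stated assumptions: it ignores the restriction to noun compounds and the LFMD metric used in the study, and the token sequence is invented.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """PMI(x, y) = log2(p(x, y) / (p(x) * p(y))) for all adjacent token pairs."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = len(tokens) - 1
    scores = {}
    for (x, y), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Invented toy token sequence (already lower-cased and lemmatized).
tokens = (
    "fuel cell stack uses fuel cell membrane and fuel cell catalyst "
    "while energy policy shapes energy demand"
).split()

scores = bigram_pmi(tokens)
for pair in [("fuel", "cell"), ("energy", "policy")]:
    print(pair, round(scores[pair], 2))
```

In this toy sequence, the pair that occurs only once ("energy policy") receives a higher PMI than the repeated pair ("fuel cell"). This tendency of plain PMI to favor rare pairs is one reason to combine it with a frequency-aware metric.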

Topic modeling

Latent Dirichlet allocation

As input to LDA, the pre-processed corpus was vectorized to a document-term-matrix (DTM) representing the counts of individual terms per document. The pre-processed corpus contained 2,018,059 terms in total and 28,768 unique terms. Vectorizing text means that, before topic modeling, a bag-of-words model is adopted assuming that grammar is negligible. The collocation model used here (see the previous section) mitigates this assumption.
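The vectorization step can be sketched in a few lines of Python. The documents and the collocation token `energy_policy` are hypothetical; the point is only that word order is discarded while multi-word terms detected beforehand survive as single tokens.

```python
from collections import Counter


def document_term_matrix(docs):
    """Build a bag-of-words count matrix: rows = documents, columns = sorted vocabulary."""
    vocab = sorted({term for doc in docs for term in doc})
    index = {term: j for j, term in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in docs]
    for i, doc in enumerate(docs):
        for term, count in Counter(doc).items():
            matrix[i][index[term]] = count
    return vocab, matrix


# Hypothetical pre-processed documents; "energy_policy" stands for a detected collocation.
docs = [
    ["solar", "cell", "efficiency", "solar"],
    ["energy_policy", "market", "solar"],
]

vocab, dtm = document_term_matrix(docs)
print(vocab)
for row in dtm:
    print(row)
```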

For finding a suitable LDA model, several models were created by varying one of the key parameters, i.e., n, the number of topics assumed for the corpus. A common challenge of any modeling approach is that there is not only one single model that potentially fits the data. Varying n served to create a set of models, from which a model with a potentially good fit could be selected. LDA models were generated for n from 5 up to 1100 in steps of 5. Apart from the parameter n, there are two additional key parameters of LDA models. The hyper-parameter α, which determines the granularity of the topic distribution over documents, was set to α = 50/n, and β, which determines the specificity of topics, was set to β = 0.1. These hyper-parameter settings have been proposed by Griffiths and Steyvers in a study with similarly broad scope and size of the corpus [88], which is one of the central references in the field of topic modeling. The chosen values create a model assuming that documents consist of a few key topics and topics consist of a few key terms. This assumption behind these values fits the characteristics of scientific abstracts covering a broad field, here, sustainable energy. For comparison, setting smaller values can be reasonable for studying specific narrower fields. For instance, a study of 17,163 scientific articles on transportation research applied values of n = 50, α = 5/n, and β = 0.01 for modeling sparser distributions of topics and words [97]. Having adjusted the above settings, the comparably fast WarpLDA algorithm [111, 112] served for creating the LDA models. This algorithm allowed creating a large set of models within an acceptable time.
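The parameter settings described above translate into a simple grid. The Python sketch below only enumerates the settings (n from 5 to 1100 in steps of 5, α = 50/n, β = 0.1); the function name and dictionary keys are hypothetical, and no model is fitted here.

```python
def lda_settings(n_min=5, n_max=1100, step=5, beta=0.1):
    """Enumerate the (n, alpha, beta) settings, with alpha = 50 / n."""
    return [
        {"n_topics": n, "alpha": 50 / n, "beta": beta}
        for n in range(n_min, n_max + 1, step)
    ]


settings = lda_settings()
print(len(settings))  # number of candidate models in the grid
selected = next(s for s in settings if s["n_topics"] == 300)
print(selected)
```

The grid contains 220 candidate models. For n = 300, α is 50/300, i.e., roughly 0.167.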

Model selection via coherence

Six different topic coherence metrics programmed in R for this study [100, 109, 111, 113,114,115,116,117,118,119] and a Wikipedia-based reference corpus served for selecting the most suitable LDA model from the set of models created. The maximum model likelihood or subjective choices are often the basis for model selection [88, 96, 97]. Maximum likelihood models may produce topics with comparably low interpretability [112]. Therefore, this study uses coherence metrics as rational selection criteria. Considering general consistency metrics for evidential support [116, 120], Röder et al. proposed a framework for calculating coherence scores of topic models [115]. The basic idea is to pick the top topic terms and to check to what extent their co-occurrence in a reference corpus is statistically related. This study uses the top 10 words per topic. For calculating the coherence of the complete LDA models, the mean was used for aggregating topic scores.

For the intrinsic coherence metrics, logratio [114, 115] and probabilistic difference (DIF) [116, 117], the corpus of the investigation itself served as the reference corpus. For the extrinsic metrics, 1,737 thematically related Wikipedia articles served as the reference corpus. These reference articles were selected via snowball sampling, beginning with manually selected thematically relevant portal pages. A web scraping algorithm using the WikipediR package [121] served for downloading these articles. The extrinsic metrics used are pointwise mutual information (PMI) [109, 118], normalized PMI (NPMI) [119], cosine similarity of NPMI vectors (NPMI COSIM) [111], and cosine similarity of NPMI vectors to the sum of the NPMI vectors [115].
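To make the idea of a coherence score concrete, the following Python sketch computes a mean NPMI over all pairs of a topic's top terms. It is a simplified illustration under stated assumptions: co-occurrence is counted per whole reference document (the study's metrics use other windows), pairs that never co-occur are scored as -1, and the tiny reference corpus is invented.

```python
import math
from itertools import combinations


def npmi_coherence(top_terms, reference_docs):
    """Mean NPMI over all pairs of top terms, using document-level co-occurrence."""
    doc_sets = [set(doc) for doc in reference_docs]
    n_docs = len(doc_sets)

    def prob(*terms):
        return sum(all(t in d for t in terms) for d in doc_sets) / n_docs

    scores = []
    for a, b in combinations(top_terms, 2):
        p_ab = prob(a, b)
        if p_ab == 0.0:
            scores.append(-1.0)  # assumption: pairs that never co-occur get the minimum
        elif p_ab == 1.0:
            scores.append(0.0)  # assumption: NPMI is undefined here, treated as neutral
        else:
            pmi = math.log(p_ab / (prob(a) * prob(b)))
            scores.append(pmi / -math.log(p_ab))
    return sum(scores) / len(scores)


# Invented reference corpus of four short "documents".
reference = [
    ["solar", "cell", "efficiency"],
    ["solar", "cell", "silicon"],
    ["policy", "market", "subsidy"],
    ["policy", "market", "solar"],
]

coherent = npmi_coherence(["solar", "cell", "silicon"], reference)
mixed = npmi_coherence(["cell", "market", "subsidy"], reference)
print(round(coherent, 3), round(mixed, 3))
```

A topic whose top terms tend to appear together in the reference corpus scores higher than a topic mixing unrelated terms. Averaging such topic scores over all topics gives one score per model.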

Methods for analyzing the topic model

Topic trends

This study examines the temporal trends of individual topics by coupling the date of publication of individual documents with the posterior probability distribution matrix of topics over documents. For revealing “hot topics” and “cold topics” with increasing or decreasing trends [88], linear models were fit for each topic over the selected time period from 1990 to 2016. This approach has been used in other studies [88, 96, 97, 122]. In addition, this study facilitates the visual inspection of trends by generating smoothed trend lines using locally weighted polynomial regression (LOESS) applied to the mean values of topic probabilities per year [123, 124]. The corrected Akaike Information Criterion (AICc) served for automatically selecting the smoothing span [125, 126].

For highlighting key trends, this study focuses on the topics with the strongest positive or negative trends as well as on topics with the highest abundance. Here, a strong positive trend means that the slope of the linear trend line was equal to or above the 95% quantile regarding the slopes of all topics. The topics with the strongest negative trend were identified using the 5% quantile as a threshold. Further, the abundance of a topic is defined here as the cumulative sum of topic probability over all years and documents. Topics with the highest abundance were identified by limiting the topic selection to the topics with a positive linear trend and setting the 95% quantile regarding abundance as a threshold.
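The trend classification described above can be sketched as follows. This Python example fits a least-squares line to yearly mean topic probabilities and flags topics at or beyond the 95% and 5% quantiles of all slopes. The three topics and their values are invented, and the quantile method (`inclusive`) is an assumption rather than the study's documented choice.

```python
from statistics import quantiles


def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


years = [1990, 1991, 1992, 1993, 1994]
# Invented yearly mean topic probabilities for three hypothetical topics.
yearly_means = {
    "storage": [0.01, 0.02, 0.03, 0.04, 0.05],
    "buildings": [0.03, 0.03, 0.03, 0.03, 0.03],
    "sustainable_development": [0.05, 0.04, 0.03, 0.02, 0.01],
}

slopes = {topic: ols_slope(years, probs) for topic, probs in yearly_means.items()}
cuts = quantiles(list(slopes.values()), n=20, method="inclusive")
q05, q95 = cuts[0], cuts[-1]  # 5% and 95% quantiles of the slopes

hot = [topic for topic, s in slopes.items() if s >= q95]
cold = [topic for topic, s in slopes.items() if s <= q05]
print("hot:", hot)
print("cold:", cold)
```

With 300 topics, the same thresholds would select roughly 15 hot and 15 cold topics.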

Inter-topic distance and thematic fields

The next analytical step analyzes how individual topics are embedded in the context of all other topics and highlights differing thematic fields. For this purpose, the inter-topic distance, hence, the content-related (dis-)similarity between each pair of topics, was calculated. On this basis, the topics can be clustered into the most different thematic fields of the corpus. For evaluating the distances between the probability distributions given in the columns of the posterior distribution matrix of terms over topics, i.e., the topic pairs, the Jensen-Shannon divergence was used as a symmetric dissimilarity metric [127, 128]. Classical multidimensional scaling [129,130,131], also called principal coordinates analysis [132], then served to project the distance between topics into a two-dimensional space. This step uses an adaptation of the code of the LDAvis package [133, 134]. In addition, hierarchical clustering using Ward’s method [135, 136] was applied to the resulting coordinates for detecting the most different thematic fields or discourses.
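For reference, the Jensen-Shannon divergence between two topic-term distributions can be computed as in the Python sketch below. The three toy distributions are invented, and base-2 logarithms are an assumption made here so that the value lies between 0 and 1.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Symmetric divergence: mean KL divergence of p and q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Invented term distributions of three toy topics over a four-term vocabulary.
topic_a = [0.5, 0.4, 0.05, 0.05]
topic_b = [0.4, 0.5, 0.05, 0.05]
topic_c = [0.05, 0.05, 0.5, 0.4]

print(round(jensen_shannon(topic_a, topic_b), 4))  # similar topics, small value
print(round(jensen_shannon(topic_a, topic_c), 4))  # dissimilar topics, larger value
```

Unlike the Kullback-Leibler divergence, this measure is symmetric and stays finite when a term has zero probability in one of the two topics, which makes it suitable as input for a distance-based projection.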

Topic network and topic communities

A network analysis served for analyzing connections between topics within the individual documents and for identifying major research communities. The co-occurrence of topics, which indicates how often each topic co-occurs with each other topic, can be calculated as the cross product of the posterior document topic distribution matrix. The resulting adjacency matrix constitutes the network of topics. Since LDA assumes a distribution of all topics over all documents, the network is very dense. Documents usually include few topics with high probability but also many with low probability. Hence, following the assumptions of LDA, all topics are somehow connected to all other topics.

For a focused analysis of key linkages only, the study applies several restrictions. To highlight stable or emerging trends, the network is restricted to topics with a positive trend. Furthermore, two types of thresholds serve for pointing out the strongest connections. First, only the top topics per document were included. Second, only co-occurrence values equal to or larger than a pre-defined co-occurrence threshold were considered, whereas values below the threshold were set to zero. Setting thresholds in network analyses is a sensitive issue. Therefore, several networks were generated and compared by setting different thresholds. The comparison included a variation of the top topics per document threshold between 2, 5, and 10 and of the co-occurrence threshold between the 0%, 25%, 50%, and 75% quantile, respectively.
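The construction of the thresholded topic network can be sketched as follows. This Python example keeps the top-k topics per document, computes the co-occurrence matrix as the cross product of the document-topic matrix, and zeroes values below a threshold. The matrix values, k = 2, and the fixed threshold of 0.2 are invented; the study derives its thresholds from quantiles.

```python
def keep_top_topics(row, k):
    """Keep the k largest topic probabilities of one document, set the rest to zero."""
    top = set(sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k])
    return [value if j in top else 0.0 for j, value in enumerate(row)]


def cooccurrence(theta):
    """Cross product theta^T * theta: entry (i, j) sums theta[d][i] * theta[d][j] over documents."""
    n_topics = len(theta[0])
    return [
        [sum(row[i] * row[j] for row in theta) for j in range(n_topics)]
        for i in range(n_topics)
    ]


def apply_threshold(matrix, threshold):
    """Set all co-occurrence values below the threshold to zero."""
    return [[v if v >= threshold else 0.0 for v in row] for row in matrix]


# Invented document-topic matrix: 3 documents, 3 topics.
theta = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.2, 0.7],
]

pruned = [keep_top_topics(row, k=2) for row in theta]
adjacency = apply_threshold(cooccurrence(pruned), threshold=0.2)
for row in adjacency:
    print([round(v, 2) for v in row])
```

In this toy case, only the link between the first two topics survives as an off-diagonal edge, while the weaker link between the second and third topic is removed.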

The analysis of the resulting network was carried out by calculating the betweenness centrality metric for each topic and applying network community clustering. Topics with high betweenness may be interpreted as bridges [137] or influential nodes regarding network communication [138]. Finally, the Louvain hierarchical clustering algorithm [139, 140] served for detecting communities of multiple topics, which represent research communities. The algorithm uses the idea of maximizing the modularity of a partitioned network [141]. Modularity measures the degree of cross-linking within communities in relation to the degree of cross-linking between communities [141,142,143]. Communities of topics represent sub-networks that stand out of their environment due to their high within-community cross-link density [139].
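The modularity measure that the Louvain algorithm maximizes can be computed directly for a given partition. The Python sketch below does only that, on an invented toy graph of two triangles joined by one edge; it does not implement the Louvain search itself.

```python
def modularity(adjacency, communities):
    """Q = (1 / 2m) * sum_ij [A_ij - k_i * k_j / (2m)] over node pairs in the same community."""
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    two_m = sum(degree)  # twice the total edge weight of an undirected graph
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                q += adjacency[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


# Toy undirected graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adjacency[a][b] = 1
    adjacency[b][a] = 1

two_triangles = [0, 0, 0, 1, 1, 1]
one_block = [0, 0, 0, 0, 0, 0]
print(round(modularity(adjacency, two_triangles), 4))
print(round(modularity(adjacency, one_block), 4))
```

Splitting the graph into its two triangles yields a clearly positive modularity, whereas putting all nodes into a single community yields zero. A community detection algorithm searches for the partition with the highest such value.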

Methodological limitations and prospects

A basic limitation arises from the database that covers a specific type of research output. Beyond journal articles, the scientific discourse comprises other relevant communication channels not covered here such as conferences, open web archives, or websites. Hence, this study provides a comprehensive review of relevant research but is limited to the part of the discourse accessible via standardized databases. Furthermore, the study is based on a single search phrase. A more complex but laborious approach might use a variation of search phrases and comparatively analyze the resulting sets of documents for establishing an even more differentiated picture.

Key methodological limitations stem from using a particular type of topic model with a specific set of model parameters. As in several other studies with similar scope [87, 88, 96, 97], the basic LDA model produced reasonable insights. Variations of the basic model might provide extended insights, e.g., the dynamic topic model that allows studying the internal development of topics over time [144]. Furthermore, varying the hyper-parameters or even using asymmetric hyper-parameters [145] leads to a broader set of topic models, from which a potentially improved model might emerge. For selecting the most appropriate model, this study applies an advanced approach based on coherence metrics. Research regarding suitable combinations of metrics, reference corpora, and parameter settings for different contexts is ongoing [115] and might lead to improved or more broadly proven options for identifying good models.

Probably the most notable limitation is the human factor in deciding whether the chosen LDA model based on machine learning creates a reasonable lens for interpreting the data. Computers perform a significant dimensional reduction of the complex meanings contained in texts via LDA. Human interpretations of resulting topics have to be made carefully and might even be misleading [146]. Topic models are a “lens” for understanding a corpus [147]. For finding a sufficiently clear lens that allows for useful interpretations, a suitable combination of the data set, model parameters, and content-related background knowledge is required [147, 148]. This study is not exempt from criticism regarding the combination applied. This study uses a lens created by iterative model adaptation based on the interaction between machine learning and human interpretations. Admittedly, not all of the resulting individual topics were perfectly interpretable. However, for the majority of topics, the general meaning seemed to be clear intuitively. Since this study aims at identifying large-scale patterns, the lens adopted here was deemed an acceptable basis for further high-level interpretations.


Number of articles over time

For an overview of publication dynamics, the cumulative number of abstracts over time was modeled as an exponential curve shown in Fig. 1a. Publication activities before 1990 were sparse. To avoid fitting noise around the baseline, the models created for this study, including the topic models, only consider the documents published since 1990. For modeling publication dynamics, different models were compared, of which the exponential model $\mathrm{cum\_n} = 8.6037 \times 10^{-158} \cdot \exp(0.18445 \cdot \mathrm{Year})$ had the best fit with a high $R^2$ of 0.9996 and a mean relative error of 10%. This accuracy seems acceptable for a simple descriptive purpose. In addition to the cumulative numbers, Fig. 1b shows the number of publications per year. Furthermore, Fig. 1c shows the steep growth compared to the publication dynamics retrieved by using only the search term energy.
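The reported exponential model can be evaluated directly. The Python sketch below uses the two coefficients as printed above; because they are rounded, the evaluated values are only approximate. The derived doubling time is an added illustration, not a figure reported by the study.

```python
import math

# Coefficients of the fitted exponential model as reported in the text.
A = 8.6037e-158
B = 0.18445


def cumulative_abstracts(year):
    """Modeled cumulative number of abstracts up to the given year."""
    return A * math.exp(B * year)


doubling_time = math.log(2) / B  # years needed for the cumulative count to double

for year in (2000, 2010, 2016):
    print(year, round(cumulative_abstracts(year)))
print("doubling time in years:", round(doubling_time, 2))
```

For 2016, the model evaluates to a value of the same order as the 26,533 abstracts in the corpus. The growth rate of 0.18445 per year corresponds to a doubling of the cumulative count roughly every 3.8 years.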

Fig. 1

a Cumulative number of Scopus-indexed journal articles per year, which contain the search terms sustainab* and energy; records before 1990 are just shown as supplementary information and have not been included in any of the modeling procedures of this study. b Number of articles per year. c Comparison of the cumulative number from a on a semi-logarithmic scale to the cumulative number of articles retrieved from using the single search term energy; the graphical presentations in a and b are inspired by and allow a direct comparison to [96]

LDA model coherence

After having evaluated the coherence scores and screened the top topic terms of the various LDA models, the model with 300 topics appeared to be the most appropriate. The coherence metrics DIF, NPMI, and NPMI COSIM were the most informative in the sense of showing comparably clear maxima. Figure 2 shows the scores of the fitted LDA models. For direct comparison, scores are normalized. For all metrics, maxima are recognizable for models with up to 55 topics. Manually examining the top topic terms of the corresponding models revealed easily interpretable topics. However, these general topics with a coarse resolution did not provide sufficient insights for investigating, e.g., individual disciplines or technologies. Aiming at a higher level of detail here, the models with the clearest maxima beyond 55 topics were examined manually. The DIF metric, which measures coherence at the document level, suggested the model with 200 topics, whereas the NPMI metric, which measures coherence at the sentence level, suggested the model with 300 topics. Manual inspection of these models led to the decision to study the model with 300 topics since it provides a higher resolution regarding individual fields. As an example, the model with 200 topics contains an integrated topic on solar energy addressing photovoltaic systems and solar thermal energy together. The model with 300 topics generated two separate topics for these two fields.
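The selection logic described above (normalize each metric, then look for clear maxima beyond 55 topics) might be sketched as follows. This is an illustrative sketch only: min-max scaling is an assumed normalization, since the text does not specify one, and the function names are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1] (assumed normalization; not stated in the text)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def local_maxima(topic_numbers, scores, min_k=0):
    """Return the topic numbers above min_k whose score exceeds both neighbours."""
    peaks = []
    for i in range(1, len(scores) - 1):
        is_peak = scores[i] > scores[i - 1] and scores[i] > scores[i + 1]
        if topic_numbers[i] > min_k and is_peak:
            peaks.append(topic_numbers[i])
    return peaks
```

Candidate models found this way would still need the manual inspection of top topic terms that the study describes; the peak search only narrows down which models to look at.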

Fig. 2

Normalized coherence metrics for a varying number of topics based on a Wikipedia reference corpus; vertical dashed lines mark peaks indicating potentially optimal numbers of topics proposed by different metrics

Topic trends

There are 137 hot topics, of which the 15 topics with the strongest positive trends address specific technical fields revolving around the following superordinate themes: electrical energy storage, fuel cells, photocatalytic hydrogen production, nanotechnology, chemical catalysis, digital network communication, motion energy harvesting, sustainable concrete, biofuels, optimization, and modeling. Generally, the topics tend towards materials science. Table 1 lists information on the 15 topics with the strongest positive trend. Figure 3a visualizes the topic trends of the top 5 hot topics. Details on p values and the linear slopes of the trend lines are available in Additional file 1. The p values for the slopes are all below p = 1e−12 and indicate the significance of the trends.
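The distinction between hot and cold topics rests on the sign of a linear trend fitted to the per-document topic probabilities over publication years. A minimal Python sketch of that idea follows; the function names and the input layout (one list of topic probabilities per document) are assumptions for illustration, and the significance test behind the reported p values is omitted.

```python
def ols_slope(years, probs):
    """Ordinary least-squares slope of topic probability against year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(probs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, probs))
    return sxy / sxx


def classify_topics(doc_years, doc_topic_probs):
    """Fit one slope per topic on all documents (not on yearly means).

    Returns (slopes, hot, cold): hot topics have a positive slope and are
    ordered from strongest to weakest increase; cold topics have a negative
    slope and are ordered from strongest to weakest decrease.
    """
    n_topics = len(doc_topic_probs[0])
    slopes = [
        ols_slope(doc_years, [row[t] for row in doc_topic_probs])
        for t in range(n_topics)
    ]
    hot = sorted((t for t in range(n_topics) if slopes[t] > 0),
                 key=lambda t: -slopes[t])
    cold = sorted((t for t in range(n_topics) if slopes[t] < 0),
                  key=lambda t: slopes[t])
    return slopes, hot, cold
```

Because every document contributes one data point, years with many publications weigh more heavily in the fitted slope than sparsely populated early years.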

Table 1 Hot topics with the strongest positive trend

Fig. 3

Topic trend plot; the 5 topics with the strongest positive trend (a) or strongest negative trend (b); the slope rank of 1 indicates that the slope of the linear trend line has the highest overall value and a rank of 300 indicates the lowest value; dots represent the mean topic probability per year; the solid lines show AICc-optimized LOESS-smoothed trend lines; the dashed lines show the linear trend lines based on fitting linear models on the topic probabilities per year (not the mean values); since publication rates increase over the years (see Fig. 1b), the recent years have a stronger influence on the linear trend lines than earlier years

The topics with the highest abundance include topics on life cycle assessment, heating systems, and urbanization. Table 2 lists information on the highly abundant topics. Four topics with high abundance also belong to the group of topics with the strongest positive trends (see Table 1). Those address energy storage, catalysis, building materials, and general aspects concerning optimization. The three additional topics without such overlap address: sustainable urbanization at city and neighborhood scale at rank 2, life cycle analysis with a focus on environmental impacts at rank 3, and heating systems including cooling options at rank 6.

Table 2 Highly abundant topics

There are 163 cold topics, of which the 15 topics with the strongest negative trends address a mixture of aspects regarding sustainable development and related economic, environmental, and political issues from a practical and theoretical perspective. This includes issues of international cooperation and trade, legislation and regulation, electricity markets, economic growth and quality of life, rural areas, agricultural systems, and forestry. Also, nuclear energy is among the cold topics. Table 3 lists information on the 15 coldest topics with the strongest negative trend. The p values for the slopes are all below p = 1e−6 and indicate the significance of the trends. The topic trends of the top 5 cold topics are visualized in Fig. 3b.

Table 3 Cold topics with the strongest negative trend

Inter-topic distance and thematic fields

The topics with a positive trend may be clustered into four major thematic fields highlighting the general topic (dis-)similarity. The inter-topic distance presented in Fig. 4 is independent of the prevalence of topics. It merely shows the content-related distance (see also section “Materials and methods”). The clustering dendrogram, which is provided in Additional file 1, suggested partitioning the topics into the following clusters: (1) low-carbon transitions and decision-making, (2) monitoring and optimization, (3) materials science and process engineering, and (4) (renewable) power systems. While highlighting thematic fields, Fig. 4 also provides a more general overview of topics with a positive trend that, unlike the results of the previous section, is not limited to the top trends.

Fig. 4

Inter-topic distance or principal coordinates plot of the hot topics, i.e., topics with a positive trend, showing the top 5 terms per topic; for better readability, only the top 60% of these topics are shown; main thematic gradients or differences may be identified by inspecting the topics along the principal coordinates from left to right and from top to bottom; terms are colored from light to dark grey according to their membership to clusters resulting from clustering the coordinates; the 4 cluster areas indicating main differing thematic fields are highlighted in grey; to avoid overlapping text, several topic terms are moved from their original positions, which are shown as dots, to which they are linked via straight lines

Topic network and topic communities

Connections between topics and resulting topic communities were studied by means of the network generated from the top 10 topics per document with the co-occurrence threshold set to the 50% quantile. The reasoning for selecting this network by manual inspection from the variation of networks was to study a network that potentially includes broad thematic connections that, at least, have a medium connection strength. Figure 5 shows the network with a focus on topic communities and betweenness centrality scores. The figure only shows a selection of the strongest connections between individual topics for providing a general impression of the interconnectedness. A detailed analysis of individual connections is not provided here and might be the subject of future studies. The emerging topic communities discussed below were similar in meaning across the generated network variations. Hence, the information provided on topic communities can be considered robust within the range of the tested parameters. However, the betweenness centrality scores changed with varying co-occurrence thresholds. The findings based on the analysis of these scores are only valid for the specific parameter settings used here.
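The network construction described above (top 10 topics per document, co-occurrence threshold at the 50% quantile) can be illustrated with a short Python sketch. It is a simplified reading of the procedure, not the study's implementation: the function names are hypothetical, and it is assumed that the quantile is taken over the counts of all topic pairs that co-occur at least once and that pairs at or above the threshold are kept.

```python
from collections import Counter
from itertools import combinations
from statistics import median


def top_topics(probs, n_top=10):
    """Indices of the n_top most probable topics of one document."""
    ranked = sorted(range(len(probs)), key=lambda t: probs[t], reverse=True)
    return ranked[:n_top]


def cooccurrence_edges(doc_topic_probs, n_top=10):
    """Count topic pairs sharing a document's top list; keep pairs >= median.

    Returns a dict mapping (topic_a, topic_b) with topic_a < topic_b to the
    number of documents in which both topics are among the top n_top.
    """
    counts = Counter()
    for probs in doc_topic_probs:
        for pair in combinations(sorted(top_topics(probs, n_top)), 2):
            counts[pair] += 1
    # Assumption: the 50% quantile is the median of the observed pair counts.
    cutoff = median(counts.values())
    return {pair: c for pair, c in counts.items() if c >= cutoff}
```

Varying `n_top` and the quantile would reproduce the kind of network variations the study compared before settling on one configuration.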

Fig. 5

Topic network highlighting four topic communities A to D that emerge from Louvain network clustering; network generated by limiting to top 10 topics per document and setting minimum threshold for co-occurrence to the 50% quantile (see section “Methods for topic model analysis” for details); plotting parameters: (i) line width of connections proportional to co-occurrence, (ii) plot limited to the top 1.5% strongest connections, (iii) topics with strongest positive trends (see Table 1) in bold letters, (iv) graph shows two link types, connections are plotted in dark grey when both topics of a pair have a strong positive trend; connections are plotted in light grey if only one of the topics has a strong positive trend, (v) vertex size proportional to betweenness centrality, (vi) top 5% vertices regarding betweenness centrality (see subsection “Topic network and topic communities” under the “Results” section) marked in grey

Inspecting the labels of the network community vertices (Fig. 5) reveals four topic communities A to D addressing biofuels, materials science, renewable power systems, and sustainability transitions. Community A links topics about feedstocks and biological process engineering for biofuels to reduce greenhouse gas emissions. In this context, a vital task of process engineering is optimizing operational performance for achieving the maximum yield. Target products are biomass or liquid and gaseous biofuels, including hydrogen. Community B focuses on materials science for different fields of application. This community involves the highest number (8 out of 15) of topics with strong trends (Table 1) in comparison to other communities. Aspects covering chemical synthesis and catalysis in general, thermal treatment, or peculiarities at the nanoscale are linked to more specific fields of application such as photochemistry and photonics, electrochemistry, building, or composite materials. Community C links topics concerned with monitoring and optimization of renewable power systems considering technologies such as wind, photovoltaic, or geothermal energy, but also electric vehicles. For this dynamic task, the topics of this community address digital monitoring solutions for analyzing, simulating, and forecasting production and demand in order to optimize operational performance, e.g., in smart grids. Community D revolves around assessment and decision-making for sustainability transitions. This community does not include any of the topics with the strongest positive trends (Table 1). The community involves setting up transition initiatives, strategies, or frameworks. These activities are informed by life cycle assessments regarding environmental impacts from energy consumption, especially in terms of carbon footprints, as well as water consumption. The community particularly considers consumer decisions in an urban context as well as business development, e.g., regarding supply chains or business barriers and opportunities.

Analyzing betweenness centrality scores provides indications concerning key bridging topics in the network. Topics dealing with simulation and forecasting, especially in the context of controlling electric grids, have a strong interconnecting role. This also applies to carbon footprints and climate change mitigation. In this context, business opportunities and barriers are other cohesive themes. Additional file 1 provides further details on betweenness centrality scores and the topics with high scores.
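Betweenness centrality counts how often a vertex lies on shortest paths between other vertices, which is why high scores point to bridging topics. The sketch below implements Brandes' algorithm for an unweighted, undirected graph in plain Python. Treating the topic network as unweighted is a simplifying assumption made here for illustration; the function name and input format are hypothetical and not taken from the study.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    adjacency maps each vertex to the set of its neighbours. Returns the
    unnormalized betweenness score per vertex.
    """
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from the source.
        visit_order = []
        predecessors = {v: [] for v in adjacency}
        n_paths = {v: 0 for v in adjacency}
        distance = {v: -1 for v in adjacency}
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate path dependencies from the farthest vertices inwards.
        dependency = {v: 0.0 for v in adjacency}
        while visit_order:
            w = visit_order.pop()
            for v in predecessors[w]:
                share = n_paths[v] / n_paths[w]
                dependency[v] += share * (1.0 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: c / 2.0 for v, c in centrality.items()}
```

In a star-shaped graph, for example, the hub lies on every shortest path between two leaves and therefore receives the only nonzero score. As the study notes for its own network, such scores depend on which edges survive the co-occurrence threshold.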


The selected LDA model with 300 topics provides indications on the strongest topic trends, inter-topic distances or general thematic areas, and topic communities in the research field dedicated to sustainable energy. Regarding the high publication rates (Fig. 1), this research field can be considered as dynamically growing. This is encouraging considering the severe global sustainability problems caused by the current energy system [5]. For solving these problems, it will be crucial that energy research further contributes to a transformation towards sustainability and advances its research structure.

In this context, the following sections discuss selected results of this study critically. The patterns recognizable in the results are starting points for various interpretations that emerge when relating these patterns to or underlining them with selected literature, including but not limited to some of the articles reviewed in the “Background” section. As indicated already, this study can confirm several of the trends summarized in the “Background” section, show some in a different light, identify new trends, and point out blind spots. For example, for electricity generation, the discourse clearly focuses on photovoltaic and wind energy, whereas highly cited reviews dedicated to sustainable energy still discuss conventional options [32,33,34,35]. Further, this study can confirm the dominance of batteries for electrifying the transportation sector [35, 40, 63, 68]. Interestingly, regarding fuel cells, the discourse tends to associate them with biofuels rather than hydrogen, which indicates that a comparably low realization potential is ascribed to the vision of a hydrogen economy [33, 57,58,59,60,61,62]. This study further detects the silent rise of research on mechanical energy harvesting as a potentially relevant new trend. Regarding blind spots, this study validates the concerns that have been anticipated already, e.g., regarding the availability of material resources for energy transitions [81, 149,150,151], and points out a lack of attention to the role and structure of different end-use sectors. Beyond dealing with these thematically specific trends, this study can, with its holistic approach, show that the discourse is moving away from the multi-faceted concept of sustainability to a more narrow technological perspective.

It should be noted again that the topic model serves for identifying average trends regarding the increase or decrease of attention to specific topics. Thus, if certain themes seem to be missing in the academic landscape, it can only be concluded that the majority of articles do not refer to these themes, but not that none refers to them.

Considering the hybrid nature of the field involving energy research and sustainability science, the discussion addresses these two fields separately. The first part of the discussion is organized along key energy system stages: conversion, storage and distribution, markets, and end-use sectors [152,153,154]. The second part provides reflections from an overarching sustainability perspective. Acknowledging the various conceptualizations of sustainability [13, 14, 155,156,157,158,159], this study focuses on selected elements: justice between generations, societal sub-systems, levels or scales, and the operational principles of strong sustainability.

Energy system

If the trends identified via topic modeling became reality, the future energy system would be highly electrified using photovoltaic and wind energy but would also intensively make use of bioenergy (compare: cluster 4 in Fig. 4; community C in Fig. 5; hot topics in Table 1). This observation is largely in line with the trends in the literature highlighted in the “Background” section [30,31,32,33,34,35]. The research field seems to be distancing itself from technologies with high damage potential in the case of failure. For instance, expanding deep geothermal energy might be connected to high risks [33, 34, 37]. This might explain why geothermal energy only receives an accompanying role here (only present in community C in Fig. 5 but not in Fig. 4 using a threshold regarding the strength of trends). Furthermore, topics on nuclear energy or fossil power plants, e.g., in combination with CCSU, which are part of several reviews addressed in the “Background” section [32,33,34,35], did not emerge from this study as prominent topics (compare: cold topics 225, 10 in Table 3). Instead, the results of this study suggest that the focus lies on improving technologies with potentially lower direct risks. The discourse seems to perceive basic renewable energy conversion as a mastered task and is now moving along the learning curve into a phase focusing on optimization (compare: topic 90 in Table 1, community C in Fig. 5). This involves more detailed technological development and physico-chemical advancement of materials and processes. This study indicates that, in the near future, improvements might be expected, in particular, for biotechnology and photovoltaics (compare: community A and community B in Fig. 5).

While advancing conversion technologies with low direct risks, long-term or latent risks connected to the upstream or downstream energy system stages might require greater attention and accompanying strategies. Due to the urgency to decarbonize the energy system and the renewable energy potential, the recent focus on advancing energy conversion is justifiable. However, the material basis for renewable power generation units seems to be taken for granted and potential social or environmental conflicts arising from their construction and operation have not been prominent topics. This study shows that current research already deals with advancing material properties in the production phase and, to a certain extent, considers life cycle assessment in connection with biomass from microalgae and building materials (compare: community A and community C in Fig. 5). However, research does not seem to apply a comprehensive life cycle perspective for discovering latent risks. Furthermore, the topic model does not identify direct attention to the supply and recycling of materials, especially metals, for energy systems. Several studies, including a few of the reviews screened for the “Background” section, underline the relevance of these issues [81, 149,150,151]. However, the results of this study indicate that these topics generally receive low attention. Future research on energy systems might be broadened and based on a more integrated energy and materials perspective.

In parallel to research on energy conversion, intensive research on energy storage and non-fossil fuels contributes to establishing renewable energy systems. The research on battery or supercapacitor technologies advances crucial elements of future electric grids, which need to be capable of integrating and balancing fluctuating renewable energy generation at large scale. The reviews screened for the “Background” section [35, 38,39,40] as well as the topic model highlight the efforts made in these fields (compare: hot topics 1, 109 in Table 1). Even more challenging might be decarbonizing the transportation sector using small storage units and alternative fuels. Concerning the assumption that batteries might precede fuel cells in the near future [35, 39], the results of this study point in the same direction (compare: rank of hot topics 57 and 13 in Table 1). However, they also indicate significant progress in fuel cell technology. The hot topics identified here suggest that the fuel converted in these cells will probably not be hydrogen from electrolysis but biofuels (compare: hot topics in Table 1, community A in Fig. 5). This assumption results from the observation that topics addressing a large-scale hydrogen infrastructure, which is a major bottleneck for establishing a hydrogen economy, do not emerge as prominent topics. Instead, the topic model reveals a strong focus on bioenergy. Although the availability of sustainable biomass is limited [160], the progress of biotechnology for different feedstocks identified here could boost the biofuels pathway. In line with other studies [161], this indicates that it is unclear whether the transportation sector will follow the bioenergy pathway, the battery pathway, or some mix thereof. This emphasizes that scenarios and decision-making on the future of transportation are very sensitive and need to take into account the uncertainty regarding the learning curves of various technologies.

In the context of complex renewable energy systems, digital technologies and artificial intelligence for optimizing performance are already incorporated as future-oriented solutions and will require continued attention. The main application of digital technologies identified here is the optimization of renewable energy systems (compare: hot topic 255 and community C in Fig. 5). In this field, digitalization is expected to have a net positive effect regarding climate change mitigation [162] and promises further improvements for planning, operating, and managing energy systems including the various end-use sectors [162,163,164,165,166,167,168,169,170]. However, potential negative effects need to be taken into account such as rebound effects, e.g., in the transportation sector [167, 171], or socio-economic concerns regarding the replacement of human labor by machines [167]. With the increasing digitalization of the energy system, accompanying strategies that support the beneficial effects and decrease the adverse effects will be crucial [162, 167, 172].

While being concerned with technological progress, the academic field seems to perceive the task of establishing energy markets as an operational or political one. A possible interpretation of the prominence of optimization methods (compare: hot topics 90, 255 in Table 1, community C in Fig. 5) and the emergence of business environments as a bridging topic (compare: topic with high betweenness centrality at the bottom of community D in Fig. 5) is that the research field takes for granted that energy markets are functioning reliably. Efforts for optimizing business operations would probably be less evident if the reliability of markets were questioned fundamentally. Also, the observation that topics on electricity markets and regulation receive declining attention strengthens this assumption (compare: cold topics 49, 221 in Table 3). A possible interpretation is that, in the context of increasing liberalization [173], science tends to see the responsibility for establishing markets as lying with politics [35]. However, establishing decentralized renewable energy markets that fit the different regional requirements is challenging [173, 174]. Therefore, strengthening cooperative research between science and politics might be reasonable [174]. Such research efforts might examine, in particular, the social, economic, and political struggles of implementation for facilitating the establishment of sustainable markets. Further, in the context of increasing decentralization, the declining attention to international trade, e.g., in global energy markets, appears logical (compare: cold topics 104, 185 in Table 3). However, considering the uneven global distribution of material resources and possible political or economic tensions, the international resource markets might become a decisive factor for developing the future energy infrastructure and, as this and other studies highlight, require increased attention [175, 176].

Regarding the end-use stage, research focuses on energy end-use by residents in urban areas (cluster 1, 4 in Fig. 4; community B and community C in Fig. 5; abundant topics in Table 2). Green buildings and electric vehicles are addressed as two key end-use sectors associated with the technical urban environment of individual consumers. Another very prominent topic is concerned with harvesting mechanical or motion energy (hot topic 111 in Table 1). The latter is identified here as a growing research field that, however, does not yet appear prominently in review articles on sustainable energy such as the ones referred to in the “Background” section [30,31,32,33,34,35]. The above topics indicate that research on sustainable energy pays increasing attention to urban living environments and, thus, is connecting to the level of individual consumers. A part of the research is also concerned with raising consumer awareness via product labeling (compare: bottom of cluster 1 in Fig. 4 or bottom of community D in Fig. 5). However, in sum, technological research seems to prevail. For instance, as discussed above, efforts for increasing sustainability in the transportation sector are primarily seen in improving fuel technologies. This and other studies show that systemic or behavioral energy-saving potentials that do not primarily stem from advancing individual technologies have received comparably low attention, e.g., traffic planning, improving public transport systems, or increasing vehicle occupancy [34, 97, 161] (compare: no corresponding links in community D in Fig. 5). This seems surprising considering the high relevance of urbanization identified (compare: abundant topic 100 in Table 2). A possible interpretation is that research has not yet sufficiently understood the complex modern urban systems and transportation systems for establishing a consolidated research structure. More research will be needed for unraveling this complexity and understanding the interactions between these fields.

The majority of energy-intensive industries receive comparably low attention, except the cement industry and, to a certain extent, the manufacturing industry (compare: hot topic 125 in Table 1, community D in Fig. 5). Several energy-intensive industrial branches that have significant carbon dioxide emission reduction potentials, especially the steel sector [177, 178], are not among the prominent topics. Only non-metallic building materials, especially concrete, receive high attention. Although a similar potential exists in the steel industry [178], a reason for the cement industry standing out might be its potentials at the material level. The cement industry offers comparably high potential for reducing carbon dioxide emissions, e.g., by using replacement materials [178,179,180]. Due to the tendency of the discourse towards materials science, potentials connected thereto might receive higher attention than process technology options. This might be an indication that, to a certain degree, research distances itself from the traditional industry. Only one of the topics in Fig. 4 (compare: top of cluster 2) directly refers to another industrial branch, i.e., the manufacturing industry, but is rather connected to the micro-level of energy-efficient machines than to a meso- or macro-perspective as applied, e.g., in the field of industrial ecology [181, 182] or circular economy [183, 184]. Also, only a few reviews presented in the “Background” section deal with industry as part of the energy system [33, 63]. Further, the ones that do so only apply a technological perspective regarding decarbonization opportunities. These observations emphasize that future research will have to better understand the different industrial sectors and their interactions with the energy system for leveraging decarbonization potentials and for supporting the industrial transition towards sustainability. In this context, research should not only focus on the production phase in value chains. This proposal is in line with other studies calling for a more integrated perspective on sustainable product-service systems that consider the interplay of consumers, i.e., the users of energy-consuming products, with the phases of product design, manufacturing, and recycling [185].


This study indicates that research on sustainable energy is navigating towards a technology-oriented perspective (compare: hot topics in Table 1) and is moving away from the normative concepts connected to sustainability and sustainable development that have initially motivated this research field (compare: cold topics in Table 3). This becomes apparent when examining the various cold topics related to sustainable development. An alternative interpretation would be that the normative concepts have been integrated to an extent that makes the need for explicitly referencing conceptual ideas obsolete. However, based on the following discussion, this study tends to conclude that conceptual ideas of sustainability actually have decreasing influence.

Research on sustainable energy is clearly concerned with inter-generational justice, whereas the attention to intra-generational justice seems less apparent. The strong focus on renewable energies represents efforts to ensure the availability of energy while limiting the negative effects of climate change. These efforts cannot guarantee a secure livelihood for future generations but have the potential to contribute to it. However, as shown in the previous section, the attention to depletable material resources is low. The physical availability of raw materials might not be the major bottleneck in the near future for establishing a low-carbon economy, whereas environmental, social, and economic issues of resource extraction seem to be more relevant [150]. Therefore, the latter might deserve greater attention. Turning to intra-generational justice, there are no prominent topics addressing, e.g., energy poverty or land-use conflicts. This indicates that issues of intra-generational justice might be underrepresented. To leave no one behind, research on, e.g., the relationship between developed and developing countries might be strengthened.

Regarding the systemic perspective adopted, research seems to follow a socio-technical system perspective, which might benefit from increased attention to the environmental system (compare: communities A, B, C vs. D in Fig. 5; cold topics in Table 3). This study identifies a focus on infrastructural and technological systems (compare: hot topics in Table 1). However, topic community D also refers to several societal sub-systems, especially the economic, the social, and the government system. The consideration of operational aspects of transition processes involving these systems indicates an action-oriented research agenda, which matches the scholarly tradition of transition management for socio-technical systems (STS) [186, 187]. This match is more obvious than it would be for, e.g., the research tradition on social-ecological systems (SES) [1, 157]. While in research on SES and STS parts of the societal subsystems considered are similar, research on SES ascribes a stronger role to the ecological system. In the discourse investigated here, the environmental system is considered by several topics addressing assessment of environmental footprints (compare, e.g., topic 276 in Table 2). However, apart from referring to such aggregated indicators, explicit considerations on specific elements of the natural system appear to be missing among the major topics. Independent of which research tradition will prevail in future research, it is vital to strengthen the focus on interactions of modern energy systems with nature so as not to overlook potentially harmful effects.

Although a social science perspective is clearly present in research on sustainable energy (compare: cluster 1 in Fig. 4 and community D in Fig. 5), it could experience a higher degree of integration with the strongly growing technological perspectives. A key indication for this can be derived by studying the four most distant thematic fields (Fig. 4) and the communities in the topic network (Fig. 5) in parallel. Technology-related topics from the thematic fields mix up across the topic communities. Hence, an achievement of research on sustainable energy is that research has become interdisciplinary in terms of technology. However, the majority of topics of the theme about low carbon transitions and decision-making mainly end up in one single topic community. This community incorporates aspects related to the social sciences. Since it shows weak interconnections with technological themes, there seems to be potential for improving the interdisciplinary collaboration between technological research and the social sciences. Other studies have already found that there is a lack of social science perspectives in the general field of energy research [188,189,190]. This study shows that in research on sustainable energy, the integration of the social sciences is higher but could still be improved. A starting point might be the bridging topics related to data science (compare: bottom right of community C in Fig. 5). Data science is common to technical but also social science domains. Connecting different domains might happen based on connected data infrastructures accompanied by personal interdisciplinary exchange for making sense of, e.g., prediction models.

Regarding the levels or scales considered, the global perspective seems to fade and might be reinforced (compare: abundant topics in Table 2; cold topics in Table 3). Considering the increasing urbanization trend [191, 192], the abundance of topics dealing with urban areas seems reasonable for addressing critical sustainability problems that affect a high share of the global population. Engaging with the local level to gain insights into bottom-up developments and to understand the respective local dynamics is important. At the same time, however, successful approaches to sustainability also require an integrated global perspective and a top-down perspective [6]. This kind of aggregated perspective is necessary to avoid isolated knowledge processes and instead integrate the valuable insights from the local level. The global perspective was key for supporting the normative sustainability agenda that is increasingly integrated into human society. In research on sustainable energy, it could be reinforced by connecting the various available local perspectives in order to support solutions to sustainability problems, e.g., via intergovernmental cooperation and regulation.

Regarding the triad of operational principles of strong sustainability, i.e., consistency, efficiency, and sufficiency, a clear focus lies on the first two (compare: community C in Fig. 5; abundant topics in Table 2). Consistency is clearly addressed through research on renewable energy technologies using non-depletable energy resources. However, as mentioned above, more attention should be paid to the availability, consumption, and recycling of depletable material resources. Turning to the next principle, efficiency receives high attention for optimizing power systems and saving energy in the end-use sector. This principle is connected to advanced technologies that save costs while maintaining the level of comfort. In contrast, the principle of sufficiency, which favors foregoing consumption instead of only optimizing it, is less clearly present, although the literature has highlighted it as a necessary companion of efficiency [41, 42, 172, 193]. In summary, this indicates that research on sustainable energy has not yet integrated, in a balanced way, all the principles necessary for heading towards a steady-state economy that is governed not by a paradigm of growth but by one of (sustainable) development [194].


This study provides a review of research on sustainable energy based on an advanced latent Dirichlet allocation topic modeling approach for detecting high-level patterns. The main overarching pattern identified is that the discourse is latently adopting a technology-oriented paradigm and is moving away from the multi-faceted concept of sustainability. The study highlights that the research field on sustainable energy is focusing on finding ways to establish and optimize renewable energy systems by drawing on materials science, (biological) process engineering, and digital monitoring and control systems. From a technological perspective, research on sustainable energy seems to keep up with technological progress and has the potential to contribute to climate change mitigation. However, given the complexity of renewable energy systems, no straightforward technological pathway could be identified. Therefore, this study recommends improving horizontal integration of the various valuable vertical research strands to prepare scientific-technological knowledge in a way that enables efficient and far-sighted decision-making. For establishing sustainable energy systems, advancing research on sustainable energy will require not only targeting the core technical energy infrastructure, for which many solutions have already been proposed, but also strengthening the focus on issues that can be perceived from a holistic second-order perspective. Therefore, this study further recommends re-strengthening a holistic ecological perspective on energy systems that considers the global scale, e.g., by considering the complete material and environmental life cycle of the energy infrastructure. Beyond considering the physical dimensions of energy systems, another key recommendation of this study is to strengthen the existing links of the research field to the social sciences. This will be crucial for a balanced discourse that complements the technology-oriented agenda that research has increasingly been adopting in recent decades. Extending the research scope in this way would support an explicit consideration of all societal subsystems required for a sustainability transformation.

Availability of data and materials

The datasets generated and analyzed during the current study and the code used in the study are available in the GitHub repository in the form of an R package. The data and code are further archived in the Zenodo online library. The doi number is .

Due to copyright restrictions, the raw abstract texts could not be included in the dataset. However, the dataset includes the document-term matrix used as the basis for the analysis. The dataset also includes the DOIs of the abstracts analyzed, so that the original text data may be accessed by users with the required access rights.
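
Because a document-term matrix retains which terms occur in which documents, co-occurrence statistics such as normalized pointwise mutual information (NPMI) can in principle be recomputed from it without the raw texts. The following is a minimal sketch under stated assumptions and not the implementation contained in the R package. The small dense matrix, the term names, and the function name `npmi` are hypothetical, a real document-term matrix would be sparse, and the handling of the two edge cases is a convention chosen for this sketch.

```python
import math


def npmi(dtm, i, j):
    """NPMI of the terms in columns i and j, based on document-level co-occurrence.

    dtm: list of rows (documents), each a list of term counts (columns = terms)
    """
    n_docs = len(dtm)
    n_i = sum(1 for row in dtm if row[i] > 0)
    n_j = sum(1 for row in dtm if row[j] > 0)
    n_ij = sum(1 for row in dtm if row[i] > 0 and row[j] > 0)
    if n_ij == 0:
        return -1.0  # convention used here: the terms never co-occur
    p_i, p_j, p_ij = n_i / n_docs, n_j / n_docs, n_ij / n_docs
    if p_ij == 1.0:
        return 1.0  # both terms occur in every document; avoids dividing by log(1) = 0
    pmi = math.log(p_ij / (p_i * p_j))
    return pmi / -math.log(p_ij)


# Hypothetical toy matrix: 4 documents, columns = ["solar", "cell", "policy", "energy"]
dtm = [
    [2, 1, 0, 1],
    [1, 3, 0, 1],
    [0, 0, 2, 1],
    [0, 0, 1, 0],
]

print(npmi(dtm, 0, 1))  # "solar" and "cell" always appear together
print(npmi(dtm, 0, 2))  # "solar" and "policy" never appear together
print(npmi(dtm, 0, 3))  # "solar" and "energy" partly overlap
```

NPMI ranges from -1 (no co-occurrence) to 1 (the terms only ever occur together). Coherence scores for a topic are often built by averaging such pairwise values over the topic's top terms, although the exact variant differs between implementations.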



(Normalized) pointwise mutual information ((N)PMI)
Carbon capture, storage, and use
Difference (metric)
Latent Dirichlet allocation (LDA)
Log-frequency biased mutual dependency
Natural language processing (NLP)
NPMI cosine similarity (to set)
Social-ecological system (SES)
Socio-technical system (STS)


  1. Rockström J, Steffen W, Noone K et al (2009) Planetary boundaries: exploring the safe operating space for humanity. Ecol Soc 14(2).

  2. Steffen W, Richardson K, Rockström J et al (2015) Sustainability. Planetary boundaries: guiding human development on a changing planet. Science 347(6223):1259855.

  3. IPCC (2015) Climate change 2014: synthesis report: Contribution of Working Groups I, II and III to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change [Core Writing Team, R.K. Pachauri and L.A. Meyer (eds.)]. IPCC, Geneva, Switzerland.

  4. Schlör H, Fischer W, Hake J-F (2015) The system boundaries of sustainability. J Clean Prod 88:52–60.

  5. Johansson TB, Patwardhan AP, Gomez-Echeverri L et al (eds) (2012) Global energy assessment: toward a sustainable future. Cambridge University Press, Cambridge, UK and New York, NY, USA and International Institute for Applied Systems Analysis, Laxenburg, Austria.

  6. Kates RW (2001) Sustainability science. Science 292(5517):641–642.

  7. Clark WC, Dickson NM (2003) Sustainability science: the emerging research program. Proc Natl Acad Sci U.S.A. 100(14):8059–8061.

  8. Heinrichs H, Martens P, Michelsen G et al (eds) (2016) Sustainability science: an introduction, 1st ed. 2016. Springer Netherlands, Dordrecht, s.l.

  9. WCED (1987) Our common future: Brundtland Report. World Commission on Environment and Development, Oslo.

  10. UN (2011) The millennium development goals report 2011. United Nations, New York.

  11. UN (2018) The Sustainable Development Goals Report 2018. United Nations, New York.

  12. UN (1992) Agenda 21. United Nations, New York.

  13. Michelsen G, Adomßent M, Martens P et al (2016) Sustainable development – background and context. In: Heinrichs H, Martens P, Michelsen G et al (eds) Sustainability Science. Springer Netherlands, Dordrecht, pp 5–29.

  14. Kates RW, Parris TM, Leiserowitz AA (2005) What is sustainable development? Goals, indicators, values, and practice. Environment 47(3):8–21.

  15. Costanza R, Daly HE (1987) Toward an ecological economics. Ecol Model 38(1-2):1–7.

  16. UN (2015) Transforming our world: the 2030 agenda for sustainable development: A/RES/70/1. United Nations, New York.

  17. EC (2011) Energy roadmap 2050: COM(2011) 885. European Commission, Brussels.

  18. CHN (2016) The 13th Five-Year Plan for Economic and Social Development of The People’s Republic of China. Accessed 29 Aug 2018.

  19. UNDP CHN (2016) 13th Five-Year Plan: what to expect from China. ISSUE BRIEF No. 15 - Domestic Policies. United Nations Development Program China, Beijing. Accessed 29 Aug 2018.

  20. BMWi, BMU (2010) Energiekonzept für eine umweltschonende, zuverlässige und bezahlbare Energieversorgung. Bundesministerium für Wirtschaft und Technologie, Bundesministerium für Umwelt, Naturschutz und Reaktorsicherheit, Berlin. Accessed 21 Oct 2018.

  21. GER (2011) Energiewende - die Gesetze. Die Bundesregierung Deutschland, Berlin, Germany. Accessed 29 Aug 2018.

  22. Gawel E, Lehmann P, Korte K et al (2014) The future of the energy transition in Germany. Energ Sustain Soc 4(1):73.

  23. Fischer W, Hake J-F, Kuckshinrichs W et al (2016) German energy policy and the way to sustainability: five controversial issues in the debate on the “Energiewende”. Energy 115:1580–1591.

  24. Peidong Z, Yanli Y, Jin S et al (2009) Opportunities and challenges for renewable energy policy in China. Renewable Sustainable Energy Rev 13(2):439–449.

  25. Lo K (2014) A critical review of China’s rapidly developing renewable energy and energy efficiency policies. Renewable Sustainable Energy Rev 29:508–516.

  26. Lehmann P, Creutzig F, Ehlers M-H et al (2012) Carbon lock-out: advancing renewable energy policy in Europe. Energies 5(2):323–354.

  27. Komiyama H, Takeuchi K (2006) Sustainability science: building a new discipline. Sustain Sci 1(1):1–6.

  28. Spangenberg JH (2011) Sustainability science: a review, an analysis and some empirical lessons. Envir Conserv 38(03):275–287.

  29. Kajikawa Y, Tacoa F, Yamaguchi K (2014) Sustainability science: the changing landscape of sustainability research. Sustain Sci 9(4):431–438.

  30. Breyer C, Gerlach A (2013) Global overview on grid-parity. Prog Photovolt Res Appl 21(1):121–136.

  31. Bazmi AA, Zahedi G (2011) Sustainable energy systems: role of optimization modeling techniques in power generation and supply – a review. Renewable Sustainable Energy Rev 15(8):3480–3500.

  32. Asif M, Muneer T (2007) Energy supply, its demand and security issues for developed and emerging economies. Renewable Sustainable Energy Rev 11(7):1388–1413.

  33. Muradov N, Veziroglu T (2008) “Green” path from fossil-based to hydrogen economy: an overview of carbon-neutral technologies. Int J Hydrogen Energy 33(23):6804–6839.

  34. Lior N (2010) Sustainable energy development: the present (2009) situation and possible paths to the future. Energy 35(10):3976–3994.

  35. Chu S, Majumdar A (2012) Opportunities and challenges for a sustainable energy future. Nature 488(7411):294–303.

  36. Kuvlesky WP, Brennan LA, Morrison ML et al (2007) Wind energy development and wildlife conservation: challenges and opportunities. J Wildlife Manag 71(8):2487–2498.

  37. Baños R, Manzano-Agugliaro F, Montoya FG et al (2011) Optimization methods applied to renewable and sustainable energy: a review. Renewable Sustainable Energy Rev 15(4):1753–1766.

  38. Evans A, Strezov V, Evans TJ (2012) Assessment of utility energy storage options for increased renewable energy penetration. Renewable Sustainable Energy Rev 16(6):4141–4147.

  39. Yoo HD, Markevich E, Salitra G et al (2014) On the challenge of developing advanced technologies for electrochemical energy storage and conversion. Mater Today 17(3):110–121.

  40. Larcher D, Tarascon J-M (2015) Towards greener and more sustainable batteries for electrical energy storage. Nat Chem 7(1):19–29.

  41. Lund H, Werner S, Wiltshire R et al (2014) 4th Generation District Heating (4GDH). Energy 68:1–11.

  42. Lund H, Andersen AN, Østergaard PA et al (2012) From electricity smart grids to smart energy systems – a market operation based approach and understanding. Energy 42(1):96–102.

  43. Sims R, Hastings A, Schlamadinger B et al (2006) Energy crops: current status and future prospects. Global Change Biol 12(11):2054–2076.

  44. Purnick PEM, Weiss R (2009) The second wave of synthetic biology: from modules to systems. Nat Rev Mol Cell Biol 10(6):410–422.

  45. Himmel ME, Ding S-Y, Johnson DK et al (2007) Biomass recalcitrance: engineering plants and enzymes for biofuels production. Science 315(5813):804–807.

  46. Agbor VB, Cicek N, Sparling R et al (2011) Biomass pretreatment: fundamentals toward application. Biotechnol Adv 29(6):675–685.

  47. Goldemberg J (2007) Ethanol for a sustainable energy future. Science 315(5813):808–810.

  48. Fernando S, Adhikari S, Chandrapal C et al (2006) Biorefineries: current status, challenges, and future direction. Energy Fuels 20(4):1727–1737.

  49. Logan BE, Call D, Cheng S et al (2008) Microbial electrolysis cells for high yield hydrogen gas production from organic matter. Environ Sci Technol 42(23):8630–8640.

  50. Ahmad AL, Yasin NM, Derek CJC et al (2011) Microalgae as a sustainable energy source for biodiesel production: a review. Renewable Sustainable Energy Rev 15(1):584–593.

  51. Patil V, Tran K-Q, Giselrød HR (2008) Towards sustainable production of biofuels from microalgae. Int J Mol Sci 9(7):1188–1195.

  52. Wigmosta MS, Coleman AM, Skaggs RJ et al (2011) National microalgae biofuel production potential and resource demand. Water Resour Res 47(3):294.

  53. Franks AE, Nevin KP (2010) Microbial fuel cells, a current review. Energies 3(5):899–919.

  54. Oliveira VB, Simões M, Melo LF et al (2013) Overview on the developments of microbial fuel cells. Biochem Eng J 73:53–64.

  55. Pant D, Singh A, van Bogaert G et al (2012) Bioelectrochemical systems (BES) for sustainable energy production and product recovery from organic wastes and industrial wastewaters. RSC Adv 2(4):1248–1263.

  56. Logan BE, Rabaey K (2012) Conversion of wastes into bioelectricity and chemicals by using microbial electrochemical technologies. Science 337(6095):686–690.

  57. Turner JA (2004) Sustainable hydrogen production. Science 305(5686):972–974.

  58. Dutta S (2014) A review on production, storage of hydrogen and its utilization as an energy resource. J Industr Eng Chem 20(4):1148–1156.

  59. Hosseini SE, Wahid MA (2016) Hydrogen production from renewable and sustainable energy resources: Promising green energy carrier for clean development. Renewable Sustainable Energy Rev 57:850–866.

  60. Midilli A, Ay M, Dincer I et al (2005) On hydrogen and hydrogen energy strategies. Renewable Sustainable Energy Rev 9(3):255–271.

  61. Katsounaros I, Cherevko S, Zeradjanin AR et al (2014) Oxygen electrochemistry as a cornerstone for sustainable energy conversion. Angew Chem Int Ed Engl 53(1):102–121.

  62. Leary R, Westwood A (2011) Carbonaceous nanomaterials for the enhancement of TiO2 photocatalysis. Carbon 49(3):741–772.

  63. Chu S, Cui Y, Liu N (2016) The path towards sustainable energy. Nat Mater 16(1):16–22.

  64. Izumi Y (2013) Recent advances in the photocatalytic conversion of carbon dioxide to fuels with water and/or hydrogen using solar energy and beyond. Coord Chem Rev 257(1):171–186.

  65. Jiang Z, Xiao T, Kuznetsov VL et al (2010) Turning carbon dioxide into fuel. Philos Trans A Math Phys Eng Sci 368(1923):3343–3364.

  66. Ganesh I (2014) Conversion of carbon dioxide into methanol – a potential liquid fuel: fundamental challenges and opportunities (a review). Renewable Sustainable Energy Rev 31:221–257.

  67. Snyder GJ, Toberer ES (2008) Complex thermoelectric materials. Nat Mater 7(2):105–114.

  68. Serrano E, Rus G, García-Martínez J (2009) Nanotechnology for sustainable energy. Renewable Sustainable Energy Rev 13(9):2373–2384.

  69. Candelaria SL, Shao Y, Zhou W et al (2012) Nanostructured carbon for energy storage and conversion. Nano Energy 1(2):195–220.

  70. Jena P (2011) Materials for hydrogen storage: past, present, and future. J Phys Chem Lett 2(3):206–211.

  71. Wang D-W, Su D (2014) Heterogeneous nanocarbon materials for oxygen reduction reaction. Energy Environ Sci 7(2):576.

  72. Xia W, Mahmood A, Liang Z et al (2016) Earth-abundant nanomaterials for oxygen reduction. Angew Chem Int Ed Engl 55(8):2650–2676.

  73. Zhang W, Lai W, Cao R (2017) Energy-related small molecule activation reactions: oxygen reduction and hydrogen and oxygen evolution reactions catalyzed by porphyrin- and corrole-based systems. Chem Rev 117(4):3717–3797.

  74. Tahir M, Pan L, Idrees F et al (2017) Electrocatalytic oxygen evolution reaction for energy conversion and storage: a comprehensive review. Nano Energy 37:136–157.

  75. Wang H, Zhu Q-L, Zou R et al (2017) Metal-organic frameworks for energy applications. Chem 2(1):52–80.

  76. Lu Q, Yu Y, Ma Q et al (2016) 2D Transition-metal-dichalcogenide-nanosheet-based composites for photocatalytic and electrocatalytic hydrogen evolution reactions. Adv Mater Weinheim 28(10):1917–1933.

  77. Liu Z, Xu J, Chen D et al (2015) Flexible electronics based on inorganic nanowires. Chem Soc Rev 44(1):161–192.

  78. Thavasi V, Singh G, Ramakrishna S (2008) Electrospun nanofibers in energy and environmental applications. Energy Environ Sci 1(2):205.

  79. Ambrosi A, Chua CK, Bonanni A et al (2014) Electrochemistry of graphene and related materials. Chem Rev 114(14):7150–7188.

  80. GhaffarianHoseini A, Dahlan ND, Berardi U et al (2013) Sustainable energy performances of green buildings: a review of current theories, implementations and challenges. Renewable Sustainable Energy Rev 25:1–17.

  81. Vesborg PCK, Jaramillo TF (2012) Addressing the terawatt challenge: scalability in the supply of chemical elements for renewable energy. RSC Adv 2(21):7933.

  82. Frederiks ER, Stenner K, Hobman EV (2015) Household energy use: applying behavioural economics to understand consumer decision-making and behaviour. Renewable Sustainable Energy Rev 41:1385–1394.

  83. Wang J-J, Jing Y-Y, Zhang C-F et al (2009) Review on multi-criteria decision analysis aid in sustainable energy decision-making. Renewable Sustainable Energy Rev 13(9):2263–2278.

  84. Pohekar SD, Ramachandran M (2004) Application of multi-criteria decision making to sustainable energy planning – a review. Renewable Sustainable Energy Rev 8(4):365–381.

  85. Blei DM, Lafferty JD (2007) A correlated topic model of Science. Ann Appl Stat 1(1):17–35.

  86. Kajikawa Y, Yoshikawa J, Takeda Y et al (2008) Tracking emerging technologies in energy research: toward a roadmap for sustainable energy. Technol Forecasting Soc Change 75(6):771–782.

  87. D’Amato D, Droste N, Allen B et al (2017) Green, circular, bio economy: a comparative analysis of sustainability avenues. J Clean Prod 168:716–734.

  88. Griffiths TL, Steyvers M (2004) Finding scientific topics. Proc Natl Acad Sci U.S.A. 101(Suppl 1):5228–5235.

  89. Bickel MW (2017) A new approach to semantic sustainability assessment: text mining via network analysis revealing transition patterns in German municipal climate action plans. Energ Sustain Soc 7(1):641.

  90. Blake C (2011) Text mining. Ann Rev Info Sci Tech 45(1):121–155.

  91. Fayyad U, Piatetsky-Shapiro G, Smyth P (1996) From data mining to knowledge discovery in databases. AI Mag 17(3):37–53.

  92. Usai A, Pironti M, Mital M et al (2018) Knowledge discovery out of text data: a systematic review via text mining. J Knowledge Manag 22(7):1471–1488.

  93. Blei DM (2012) Probabilistic topic models. Commun ACM 55(4):77–84.

  94. Steyvers M, Griffiths TL (2007) Probabilistic topic models. In: Landauer TK, McNamara DS, Dennis S et al (eds) Handbook of Latent Semantic Analysis. Taylor and Francis, Hoboken, pp 424–440.

  95. Hofmann T (1999) Probabilistic latent semantic analysis. In: Laskey KB (ed) Uncertainty in artificial intelligence: Proceedings of the fifteenth conference (1999), July 30 - August 1, 1999, Royal Institute of Technology (KTH), Stockholm, Sweden. Morgan Kaufmann, San Francisco, Calif., pp 289–296.

  96. Jiang H, Qiang M, Lin P (2016) A topic modeling based bibliometric exploration of hydropower research. Renewable Sustainable Energy Rev 57:226–237.

  97. Sun L, Yin Y (2017) Discovering themes and trends in transportation research using topic modeling. Transport Res Part C Emerg Technol 77:49–66.

  98. Hassan S-U, Haddawy P (2015) Analyzing knowledge flows of scientific literature through semantic links: a case study in the field of energy. Scientometrics 103(1):33–46.

  99. R Core Team (2018) R: a language and environment for statistical computing.

  100. Bickel MW (2019) textility - an R package for applied text mining with an example of topic modelling in the field of research on sustainable energy.

  101. Ball R, Tunger D (2007) Science indicators revisited – Science Citation Index versus SCOPUS: a bibliometric comparison of both citation databases. ISU 26(4):293–301.

  102. Archambault É, Campbell D, Gingras Y et al (2009) Comparing bibliometric statistics obtained from the Web of Science and Scopus. J Am Soc Inf Sci 60(7):1320–1326.

  103. Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3(Jan):993–1022.

  104. Schmid H (1994) Probabilistic part-of-speech tagging using decision trees. In: Proceedings of International Conference on New Methods in Language Processing, Manchester, UK.

  105. Schmid H (1995) Improvements in part-of-speech tagging with an application to German. In: Proceedings of the ACL SIGDAT-Workshop, Dublin, Ireland.

  106. Schmid H (1994) TreeTagger - a part-of-speech tagger for many languages. Ludwig-Maximilians-Universität Munich.

  107. Michalke M (2017) koRpus: an R package for text analysis.

  108. Hore C, Asahara M, Matsumoto Y (2005) Automatic extraction of fixed multiword expressions. In: Hutchison D, Kanade T, Kittler J et al (eds) Natural Language Processing – IJCNLP 2005, vol 3651. Springer Berlin Heidelberg, Berlin, Heidelberg, pp 565–575.

  109. Church KW, Hanks P (1990) Word association norms, mutual information, and lexicography. Comput Linguist 16(1):22–29.

  110. Thanopoulos A, Fakotakis N, Kokkinakis G (2002) Comparative evaluation of collocation extraction metrics. LREC(2):620–625.

  111. Aletras N, Stevenson M (2013) Evaluating topic coherence using distributional semantics. In: Erk K, Koller A (eds) Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Long Papers: W13-0100, pp 13–22.

  112. Chang J, Gerrish S, Wang C et al (2009) Reading tea leaves: how humans interpret topic models. Advances in neural information processing systems: 288–296.

  113. Selivanov D, Bickel M, Wang Q (2018) text2vec: modern text mining framework for R.

  114. Mimno D, Wallach HM, Talley E et al (2011) Optimizing semantic coherence in topic models. EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference.

  115. Röder M, Both A, Hinneburg A (2015) Exploring the space of topic coherence measures. In: Cheng X, Li H, Gabrilovich E et al (eds) Proceedings of the Eighth ACM International Conference on Web Search and Data Mining - WSDM ‘15. ACM Press, New York, New York, USA, pp 399–408.

  116. Eells E, Fitelson B (2002) Symmetries and asymmetries in evidential support. Philosophical Studies 107(2):129–142.

  117. Jones T (2018) textmineR: functions for text mining and topic modeling.

  118. Newman D, Karimi S, Cavedon L (2009) External evaluation of topic models. In: Kay J, Thomas P, Trotman A (eds) Proceedings of the Fourteenth Australasian Document Computing Symposium. School of Information Technologies, University of Sydney, Sydney.

  119. Bouma G (2009) Normalized (pointwise) mutual information in collocation extraction. Proceedings of GSCL: 31–40.

  120. Douven I, Meijs W (2007) Measuring coherence. Synthese 156(3):405–425.

  121. 121.

    Keyes O, Tilbert B (2017) WikipediR: A MediaWiki API Wrapper

  122. 122.

    Hall D, Jurafsky D, Manning CD (2008) Studying the history of ideas using topic models. In: Lapata M, Ng HT (eds) Proceedings of the Conference on Empirical Methods in Natural Language Processing - EMNLP ‘08. Association for Computational Linguistics, Morristown, NJ, USA, p 363

  123. 123.

    Cleveland WS, Devlin SJ (1988) Locally weighted regression: an approach to regression analysis by local fitting. J Am Statist Assoc 83(403):596–610.

    Article  MATH  Google Scholar 

  124. 124.

    Cleveland WS (1979) Robust locally weighted regression and smoothing scatterplots. J Am Statist Assoc 74(368):829.

    MathSciNet  Article  MATH  Google Scholar 

  125. 125.

    Hurvich CM, Simonoff JS, Tsai C-L (1998) Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 60(2):271–293.

    MathSciNet  Article  MATH  Google Scholar 

  126. 126.

    Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Petrov BN, Cslki F (eds) In Proc. 2nd Int. Symp. Information Theory. Akadémiai Kiadó, Budapest, pp 267–281

  127. 127.

    Rao CR (1982) Diversity and dissimilarity coefficients: a unified approach. Theor Popul Biol 21(1):24–43.

    MathSciNet  Article  MATH  Google Scholar 

  128. 128.

    Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22(1):79–86

    MathSciNet  Article  MATH  Google Scholar 

  129. 129.

    Cailliez F (1983) The analytical solution of the additive constant problem. Psychometrika 48(2):305–308.

    MathSciNet  Article  MATH  Google Scholar 

  130. 130.

    Mardia KV (1978) Some properties of classical multi-dimesional scaling. Communications in Statistics - Theory and Methods 7(13):1233–1241.

    MathSciNet  Article  MATH  Google Scholar 

  131. 131.

    Cox TF, Cox MAA (2001) Multidimensional scaling. Chapman & Hall/CRC, Boca Raton

  132. 132.

    Gower JC (1966) Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 53(3-4):325–338.

    MathSciNet  Article  MATH  Google Scholar 

  133. 133.

    Sievert C, Shirley K (2014) LDAvis: a method for visualizing and interpreting topics. In: Association for Computational Linguistics (ed) Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces, pp 63–70

  134. 134.

    Sievert C, Shirley K (2015) LDAvis: interactive visualization of topic models

  135. 135.

    Murtagh F, Legendre P (2014) Ward’s hierarchical agglomerative clustering method: which algorithms implement Ward’s criterion? J Classif 31(3):274–295.

    MathSciNet  Article  MATH  Google Scholar 

  136. 136.

    Ward JH (1963) Hierarchical grouping to optimize an objective function. J Am Statist Assoc 58(301):236–244.

    MathSciNet  Article  Google Scholar 

  137. 137.

    Butts CT (2016) sna: tools for social network analysis: R package

  138. 138.

    Freeman LC (1978) Centrality in social networks conceptual clarification. Social Networks 1(3):215–239.

    Article  Google Scholar 

  139. 139.

    Blondel VD, Guillaume J-L, Lambiotte R et al. (2008) Fast unfolding of communities in large networks. J Stat Mech 2008(10): P10008. doi:

    Article  MATH  Google Scholar 

  140. 140.

    Gabor Csardi, Tamas Nepusz (2006) The igraph software package for complex network research. InterJournal Complex Systems: 1695

  141. 141.

    Newman MEJ (2004) Analysis of weighted networks. Phys Rev E Stat Nonlin Soft Matter Phys 70(5 Pt 2):56131.

    Article  Google Scholar 

  142. 142.

    Newman MEJ (2006) Modularity and community structure in networks. Proc Natl Acad Sci U.S.A. 103(23):8577–8582.

    Article  Google Scholar 

  143. 143.

    Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci U.S.A. 99(12):7821–7826.

    MathSciNet  Article  MATH  Google Scholar 

  144. 144.

    Blei DM, Lafferty JD (2006) Dynamic topic models. In: Cohen W, Moore A (eds) Proceedings of the 23rd international conference on Machine learning - ICML ‘06. ACM Press, New York, New York, USA, pp 113–120

  145. 145.

    Wallach HM, Mimno DM, McCallum A (2009) Rethinking LDA: why priors matter. In: Bengio Y, Schuurmans D, Lafferty JD et al. (eds) Advances in Neural Information Processing Systems 22. Curran Associates, Inc, pp 1973–1981

  146.

    Schmidt BM (2012) Words alone: dismantling topic models in the humanities. J Digit Human 2(1):49–65

  147.

    DiMaggio P, Nag M, Blei D (2013) Exploiting affinities between topic modeling and the sociological perspective on culture: application to newspaper coverage of U.S. government arts funding. Poetics 41(6):570–606.

  148.

    Tang J, Meng Z, Nguyen X et al (2014) Understanding the limiting factors of topic modeling via posterior contraction analysis. Proceedings of the 31st International Conference on Machine Learning. PMLR 32(1):190–198

  149.

    Valero A, Valero A, Calvo G et al (2018) Material bottlenecks in the future development of green technologies. Renewable Sustainable Energy Rev 93:178–200.

  150.

    de Koning A, Kleijn R, Huppes G et al (2018) Metal supply constraints for a low-carbon economy? Resour Conserv Recycling 129:202–208.

  151.

    Jacobson MZ, Delucchi MA (2009) A path to sustainable energy by 2030. Sci Am 301(5):58–65.

  152.

    Nakicenovic N, Grübler A, Ishitani H et al. (1996) Energy primer. In: Watson RT, Zinyowera MC, Moss RH (eds) Climate change, 1995: Impacts, adaptations, and mitigation of climate change: scientific-technical analyses : contribution of working group II to the second assessment report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge (England), New York, NY, USA, pp 75–92

  153.

    Rogner H (1994) Fuel cells, energy system evolution and electric utilities. Int J Hydrogen Energy 19(10):853–861.

  154.

    Grubler A, Johansson TB, Mundaca L et al. (2012) Chapter 1 - Energy primer. In: Johansson TB, Patwardhan AP, Gomez-Echeverri L et al. (eds) Global Energy Assessment: Toward a Sustainable Future, Cambridge University Press, Cambridge, UK and New York, NY, USA and International Institute for Applied Systems Analysis, Laxenburg, Austria, pp 99–150

  155.

    Tremmel J (2003) Nachhaltigkeit als politische und analytische Kategorie: Der deutsche Diskurs um nachhaltige Entwicklung im Spiegel der Interessen der Akteure. Teilw zugl.: Frankfurt/Main, Univ., Diplomarbeit, 2003. Hochschulschriften zur Nachhaltigkeit, vol 4. Ökom-Verl., München

  156.

    Heinrichs H, Wiek A, Martens P et al. (2016) Sustainability Science. In: Heinrichs H, Martens P, Michelsen G et al. (eds) Sustainability Science. Springer Netherlands, Dordrecht, pp 1–4

  157.

    Ostrom E (2007) A diagnostic approach for going beyond panaceas. Proc Natl Acad Sci U.S.A. 104(39):15181–15187.

  158.

    Mebratu D (1998) Sustainability and sustainable development. Environ Impact Assess Rev 18(6):493–520.

  159.

    Schultz J, Brand F, Kopfmüller J et al (2008) Building a ‘theory of sustainable development’: two salient conceptions within the German discourse. IJESD 7(4):465.

  160.

    Scarlat N, Dallemand J-F, Monforti-Ferrario F et al (2015) The role of biomass and bioenergy in a future bioeconomy: policies and facts. Environ Dev 15:3–34.

  161.

    Dominković DF, Bačeković I, Pedersen AS et al (2018) The future of transportation in sustainable energy systems: opportunities and barriers in a clean energy transition. Renewable Sustainable Energy Rev 82:1823–1838.

  162.

    Hilty LM, Arnfalk P, Erdmann L et al (2006) The relevance of information and communication technologies for environmental sustainability – a prospective simulation study. Environ Model Software 21(11):1618–1629.

  163.

    Aiello M, Pagani GA (2016) How energy distribution will change: an ICT perspective. In: Beaulieu A, Wilde Jd, Scherpen JMA (eds) Smart grids from a global perspective: Bridging old and new energy systems. Springer, Cham, pp 11–25

  164.

    Diamantoulakis PD, Kapinas VM, Karagiannidis GK (2015) Big data analytics for dynamic energy management in smart grids. Big Data Res 2(3):94–101.

  165.

    Siano P (2014) Demand response and smart grids—a survey. Renewable and Sustainable Energy Reviews 30:461–478.

  166.

    BIO Intelligence Service (2008) Impacts of ICT on energy efficiency: report to European Commission DG INFSO

  167.

    Beier G, Niehoff S, Ziems T et al (2017) Sustainability aspects of a digitalized industry – a comparative study from China and Germany. Int J Precis Eng Manuf Green Tech 4(2):227–234.

  168.

    Connolly D, Lund H, Mathiesen BV et al (2010) A review of computer tools for analysing the integration of renewable energy into various energy systems. Appl Energy 87(4):1059–1082.

  169.

    Sinha S, Chandel SS (2015) Review of recent trends in optimization techniques for solar photovoltaic–wind based hybrid energy systems. Renewable Sustainable Energy Rev 50:755–769.

  170.

    Mosavi A, Salimi M, Faizollahzadeh Ardabili S et al (2019) State of the art of machine learning models in energy systems, a systematic review. Energies 12(7):1301.

  171.

    Gossart C (2015) Rebound effects and ICT: a review of the literature. In: Hilty LM, Aebischer B (eds) ICT Innovations for Sustainability, vol 310. Springer International Publishing, Cham, pp 435–448

  172.

    Hilty LM (2008) Information technology and sustainability: essays on the relationship between ICT and sustainable development. Books on Demand, Norderstedt

  173.

    IEA (2016) Re-powering markets - market design and regulation during the transition to low-carbon power systems, International Energy Agency, Paris. Accessed 29 Jun 2018

  174.

    Bryant ST, Straker K, Wrigley C (2019) The discourses of power – governmental approaches to business models in the renewable energy transition. Energy Policy 130:41–59.

  175.

    Massari S, Ruberti M (2013) Rare earth elements as critical raw materials: focus on international markets and future strategies. Resources Policy 38(1):36–43.

  176.

    Zepf V, Reller A, Rennie C et al. (2014) Materials critical to the energy industry: an introduction, 2nd edn. London

  177.

    Bataille C, Åhman M, Neuhoff K et al (2018) A review of technology and policy deep decarbonization pathway options for making energy-intensive industry production consistent with the Paris Agreement. J Clean Prod 187:960–973.

  178.

    Worrell E, Bernstein L, Roy J et al (2009) Industrial energy efficiency and climate change mitigation. Energy Efficiency 2(2):109–123.

  179.

    Flower DJM, Sanjayan JG (2007) Green house gas emissions due to concrete manufacture. Int J Life Cycle Assess 12(5):282–288.

  180.

    Higgins D (2012) Briefing: specifying concrete for sustainability. Proceedings of the Institution of Civil Engineers - Engineering Sustainability 165(2):125–127.

  181.

    Ehrenfeld J, Gertler N (1997) Industrial ecology in practice: the evolution of interdependence at Kalundborg. J Indust Ecol 1(1):67–79.

  182.

    Lowe EA, Evans LK (1995) Industrial ecology and industrial ecosystems. J Clean Prod 3(1-2):47–53.

  183.

    Murray A, Skene K, Haynes K (2017) The circular economy: an interdisciplinary exploration of the concept and application in a global context. J Bus Ethics 140(3):369–380.

  184.

    Ghisellini P, Cialani C, Ulgiati S (2016) A review on circular economy: the expected transition to a balanced interplay of environmental and economic systems. J Clean Prod 114:11–32.

  185.

    Ceschin F, Gaziulusoy I (2016) Evolution of design for sustainability: from product design to design for system innovations and transitions. Design Studies 47:118–163.

  186.

    Geels FW, Sovacool BK, Schwanen T et al (2017) The socio-technical dynamics of low-carbon transitions. Joule 1(3):463–479.

  187.

    Grin J, Rotmans J, Schot J et al. (2010) Transitions to sustainable development: new directions in the study of long term transformative change. Routledge studies in sustainability transitions, vol 1. Routledge, New York, NY

  188.

    Sovacool BK (2014) What are we doing here?: analyzing fifteen years of energy scholarship and proposing a social science research agenda. Energy Res Social Sci 1:1–29.

  189.

    Sovacool BK, Ryan SE, Stern PC et al (2015) Integrating social science in energy research. Energy Res Soc Sci 6:95–99.

  190.

    Büscher C, Sumpf P (2015) “Trust” and “confidence” as socio-technical problems in the transformation of energy systems. Energ Sustain Soc 5(1):20.

  191.

    UN DESA (2014) World urbanization prospects: The 2014 revision. United Nations - Department of Economic and Social Affairs, New York

  192.

    UN DESA (2018) World Urbanization Prospects: The 2018 Revision. United Nations - Department of Economic and Social Affairs, New York

  193.

    Ott K, Döring R (2004) Theorie und Praxis starker Nachhaltigkeit. Ökologie und Wirtschaftsforschung, vol 54. Metropolis-Verlag, Marburg

  194.

    Daly HE (1990) Sustainable development: from concept and theory to operational principles. Popul Dev Rev 16:25–43.



The author thanks the anonymous reviewers for their comments, which helped improve the quality of this article. Further thanks go to Dmitriy Selivanov, the main author of the text2vec R package, for his advice regarding R programming and text mining. The author also thanks the Stack Overflow online community for providing a question-and-answer forum that supported writing the code for this study. Finally, the author would like to thank Franziska Schaube for her general comments on the manuscript.


This work was funded by the Leuphana University under its PhD research fellowship program [grant number 73000882]. Apart from providing funding, the funding body was not involved in this study.

Author information




MWB is the sole author of this article. The author read and approved the final manuscript.

Corresponding author

Correspondence to Manuel W. Bickel.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The author declares that he has no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1.

Appendix A. DOIs of reviewed articles; hardware requirements; details on models and methods; extended results.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Bickel, M.W. Reflecting trends in the academic landscape of sustainable energy using probabilistic topic modeling. Energ Sustain Soc 9, 49 (2019).


  • Sustainable energy
  • Topic modeling
  • Text mining
  • Research trends
  • Research networks