This research studies the impact of online news on social and economic consumer perceptions through semantic network analysis. Using over 1.8 million online articles from Italian media covering four years, we calculate the semantic importance of specific economic-related keywords to see if words appearing in the articles could anticipate consumers’ judgments about the economic situation and the Consumer Confidence Index. We use an innovative approach to analyze big textual data, combining methods and tools of text mining and social network analysis. Results show a strong predictive power for the judgments about the current household and national situations. Our indicator offers a complementary approach to estimating consumer confidence, lessening the limitations of traditional survey-based methods.
Monthly reports of the nation’s level of consumer confidence can offer a first understanding of the consumers’ sentiment and predict their spending. Consumer confidence has been traditionally associated with objective political and economic conditions and external factors like news media coverage. Indeed, previous research has shown that survey data, such as the Consumer Confidence Index (CCI), can successfully support the forecasting of economic variables released with a substantial delay, e.g., 1,2,3. Survey data are often used as initial conditions for macroeconomic models or for modifying a baseline distribution to match certain moment conditions of interest given by the survey, e.g., 4,5,6. Cascaldi-Garcia and colleagues 7 also showed that opinion surveys, such as the CCI, are particularly important for nowcasting economic variables released with a substantial delay, and they put forward the idea of predicting economic opinion surveys.
However, despite the general attention given to consumer confidence surveys, their reliability in providing information about the future path of household spending is still not entirely explored 8 . It has been demonstrated how the predictive power of consumer confidence surveys is influenced by factors including economic conditions, current and past political situations, trust in the government, and the influence of the news industry 9,10,11 . The media, both mainstream and digital, can influence how consumers feel about the economy. Barsky and Sims 12 found that the relationship between consumer confidence and consequent activity is almost entirely reflective of the news component. In a series of studies on the role of media in influencing the stock market and the financial performance of companies, Tetlock 13,14 used a Bag-of-Words approach to quantify the language used in financial news stories and found that, contrary to popular belief, media pessimism weakly predicts increases in market volatility. In another study, Tetlock, Saar‐Tsechansky and Macskassy 15 found that negative language used in financial press articles can predict low earnings for firms, suggesting that the words used in news stories are not superfluous information, but rather, they capture essential aspects of a company’s fundamentals that are otherwise difficult to quantify. Li’s study 16 on the usage of the terms “risk” and “uncertain” in a company’s annual reports highlights the importance of paying attention to the language used in financial reports. By analyzing the words chosen by companies, investors can gain insight into the level of risk associated with the company’s operations. Market prediction and consumer behavior mechanisms that rely on online text mining are only now beginning to be thoroughly investigated, thanks to the significant advancements in computational processing power and network speed in recent years 17 . 
In our study, we adopt a Big Data methodology to forecast consumer confidence, looking at the role of online news and its influence on consumer confidence. We investigate how online news—as reported in digital newspapers and other online sources—influences consumer confidence using a methodology that relies on an indicator that calculates the importance of economic-related keywords (ERKs) appearing on digital news media. This is a departure from the time-consuming manual content analysis of economic news used in the past 18 . Our focus is on the Italian Consumer Confidence Climate index, which provides an indication of the optimism and pessimism of consumers who evaluate the Italian general economic situation and report their expectations for the future. We chose an indicator of semantic importance, called Semantic Brand Score (SBS), which calculates the relative importance of one or more keywords in the news 19 . We selected this indicator because of its ability to forecast various outcomes, from financial market trends 20 to election results 21 and tourism demand 22 . Based on methodologies drawn from social network analysis and text mining, the semantic importance of keywords is calculated in terms of their prevalence, i.e., frequency of word occurrences; connectivity, i.e., degree of centrality of a word in the discourse; and diversity, i.e., richness and distinctiveness of textual associations. The approach we use in our study is different from past research that focused on the evaluation of news sentiment e.g., 23,24 . We have implemented a new integrated semantic index as a measure of semantic significance. This metric has been proven to be more informative than sentiment analysis, which can be subject to variable error rates and reliability issues 25 , and represents a valuable tool for analyzing and understanding relationships among words in a corpus 20,22 .
This study contributes to the discussion on online media’s role in shaping consumer confidence. By providing an alternative method based on semantic network analysis, we investigate the antecedents of consumer confidence in terms of current and future economic expectations. Our approach is not intended to replace the information obtained from traditional tools but rather to supplement them. For instance, we may use consumer surveys in conjunction with our methods to gain a more comprehensive understanding of the market.
Section "The connection between news and consumer confidence" delves into the impact of news on consumers’ perceptions of the economy. Section "Research design" outlines the methodology and research design employed in our study. Section "Results" showcases the primary findings, subsequently analyzed in Section "Discussion and conclusions".
Effective news coverage plays a crucial role in shaping the current and future expectations of individuals. Both digital and mainstream media provide information that can significantly impact people’s economic evaluations of present and future conditions and influence economic decisions. The information disseminated through news channels can significantly impact the way people perceive the economy, leading to changes in their spending habits, investment decisions, and overall economic behavior 26,27 . The news may influence consumer confidence, especially when people are exposed to ambiguous messages 11 , or when media coverage does not fully reflect economic conditions, or when it is biased by partisanship 9,28 . For example, Damstra and Boukes 29 investigated the impact of the real economy on economic news in Dutch newspapers and confirmed that the description of economic reality offered by the media is skewed to the negative, which in turn affects people’s economic expectations about the future, but not their current evaluations. Other studies show the role of rumors in shaping consumer response and spending 30 , while others demonstrate how the tone of economic news may influence consumer confidence, with a slight difference between prospective versus retrospective economic evaluations 18,31 . For example, Boukes et al. 18 found that consumers’ retrospective evaluations were not influenced by the tone of the news. Other studies explored the effect of the negativity bias on consumer confidence and demonstrated how consumers react only to bad news 10 . The negativity bias, well documented in social psychology, political science, and economics 32,33 , is at the basis of this asymmetry in response to bad versus good news: negative information often has a more profound effect on the formation of impressions than positive information. As a result, negative information can have a lasting impact on our perceptions and judgments. 
Other scholars have challenged the negativity bias and the asymmetric response of consumers. In a study examining the relationship between media reporting of economic news and consumer confidence in the United States, Casey and Owen 31 found evidence of positive and negative consumer confidence asymmetries.
Empirical studies have demonstrated how alternative methods based on textual analysis are more reliable and could complement and reduce the limitations of survey-based methods to describe current economic conditions and better predict a household’s future economic activity. For example, a recent study conducted on the accuracy of Swiss opinion surveys revealed that the level of survey bias varies significantly depending on the policy areas being measured. The study found that the strongest biases were observed in areas related to immigration, the environment, and specific types of regulation.
This information is crucial for policymakers and researchers who rely on public opinion surveys to inform their decisions. By understanding the potential biases in survey results, they can make more informed decisions and develop more effective policies 34 . Song and Shin 35 have recently conducted a study on sentiment analysis of South Korean news articles using a lexicon approach. Their findings have demonstrated the potential of news as a valuable source for developing alternative economic indicators that can supplement traditional Consumer Confidence indices. News data is not only cheaper to acquire: its advantages, compared to monthly national surveys, include the ability to observe consumer trends at a more granular level, with more data points, and the ability to capture the social and economic impact of specific issues through a broader perspective 36 . Additional empirical evidence confirms the complicated relationship between consumers and news reported by the media. Through an investigation of the association between consumer spending for durable goods and consumer confidence, Ahmed and Cassou 37 found that news has a relevant impact on confidence during economic expansions, though it is generally not important during economic recessions.
Contributing to this stream of research, we use a novel indicator of semantic importance to evaluate the possible impact of news on consumers’ confidence.
Consumer confidence climate is a monthly economic indicator that measures the degree of optimism perceived by consumers regarding the overall state of the economy and their financial situation, evaluated through their saving and spending habits. Its value is high when consumers spend more and save less and low when consumers save more and spend less. Its trend typically increases when the economy expands and decreases when the economy contracts, reflecting the outlook of consumers with respect to their ability to find and retain good jobs according to their perception of the current state of the economy and their financial situation.
In Italy, the Consumer Confidence Climate survey is composed of a set of questions designed to assess consumers’ perceived optimism or pessimism around the Italian economic situation and their expectations for the future. Survey participants provide their opinion about future unemployment, current and future households’ financial situation, current and future possibility of savings, current opportunities for durable goods purchases, and current family budget. The answers to nine questions are aggregated, and the result is reported in a seasonally adjusted index 38 . The Consumer Confidence Climate can be broken down into four sub-indices released by the Italian National Institute of Statistics (ISTAT). These indices are: the Economic Climate, the Personal Climate, the Current Climate, and the Future Climate. The Economic Climate Index considers consumers’ current assessment and future expectations regarding the general economic situation in Italy, as well as their outlook on future unemployment. The index of Personal Climate takes into account various factors that impact a household’s financial well-being. These include the current financial situation, savings, significant purchases of durable goods, and the family budget. The Current Climate index analyzes various factors that impact the Italian economy, including the current financial situation of households, their savings, expenditures on durable goods, and family budget. Finally, the Future Climate includes questions related to the foreseen future of the Italian general economic situation, the households’ financial situation, unemployment expectations, and savings. We downloaded the target series data from the ISTAT website (https://www.istat.it).
From the Consumer Confidence Climate survey, we extracted economic keywords that were recurring in the survey’s questions. We then extended this list by adding other relevant keywords that matched the economic literature and the independent assessment of three economics experts. The inclusion of external experts to validate the selection of keywords is aligned with the methodology used in similar studies 39 . These keywords, translated from Italian, include home, rent, income, pensions, savings, credit, loans, interest rates, prices, market, job, competition, economy, public sector, politics, institutions, basic necessities, global, family, trust, discomfort/distress, consumer, education degree, purchase, car, PC, and holidays. These keywords provide insight into the concerns and priorities of Italian society. From the basic necessities of home and rent to the complexities of the economy and politics, these words refer to some of the challenges and opportunities individuals and institutions face. We also considered their synonyms and, drawing from past research 20,40 , we considered additional sets of keywords related to the economy or the Covid emergency, including singletons—i.e., individual words—such as Covid and lockdown.
Table 1 shows the full list of ERKs, with the RelFreq column indicating the ratio of the number of times they appear in the text to the total number of news articles.
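As an illustration of how a RelFreq value of this kind could be obtained, the following minimal Python sketch counts keyword occurrences and divides by the number of articles. The toy articles, the keyword sets, the simple regular-expression tokenizer, and the function name `relative_frequency` are all assumptions made for the example, and multi-word keywords (e.g., interest rates) are not handled here.

```python
import re
from collections import Counter


def relative_frequency(articles, keyword_sets):
    """Occurrences of each keyword set divided by the number of articles."""
    counts = Counter()
    for text in articles:
        # Lowercase and split into word tokens (a simplifying assumption).
        token_counts = Counter(re.findall(r"\w+", text.lower()))
        for label, words in keyword_sets.items():
            counts[label] += sum(token_counts[w] for w in words)
    n_articles = len(articles)
    return {label: counts[label] / n_articles for label in keyword_sets}


# Toy corpus and keyword sets (hypothetical, for illustration only).
articles = [
    "Rent prices rise as savings fall",
    "New loans and lower interest for family savings",
    "Lockdown extended; prices stable",
]
keyword_sets = {
    "savings": {"savings"},
    "prices": {"prices", "price"},
    "covid": {"covid", "lockdown"},
}
rel_freq = relative_frequency(articles, keyword_sets)
```

Because the numerator counts occurrences rather than articles, a value above 1 is possible when a keyword appears several times per article on average.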
Diversity is a dimension of the SBS index that considers the relationship of economic keywords with the other words in the text. This is related to the construct of brand image 50 and to the idea that a keyword is more important when its textual associations are more numerous and less common 19,51 . We operationalized diversity through the following formula, based on the metric of distinctiveness centrality 52 :
$$Diversity\,(i) = \sum_{\substack{j=1 \\ j\ne i}}^{n} \log_{10}\frac{n-1}{g_j}\, I(w_{ij}>0)$$
In general, we consider a graph G, made of n nodes (words) and E edges (word links), associated with a set of connection weights W. In the formula, \(g_j\) is the degree of node j, which is one of the neighbors of node i (the one for which diversity is calculated). \(I(w_{ij}>0)\) is an indicator function that is equal to 1 when the edge connecting nodes i and j exists, i.e., when \(w_{ij} > 0\), and is equal to 0 when this edge is missing.
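The following Python sketch illustrates one way to compute a diversity score of this kind on a small word network. It assumes that diversity sums \(\log_{10}\frac{n-1}{g_j}\) over the neighbors j of node i, which is the usual unweighted form of distinctiveness centrality. The adjacency-dictionary representation, the toy graph, and the function name `diversity` are hypothetical choices for the example, not the authors' implementation.

```python
import math


def diversity(graph, node):
    """Distinctiveness-style diversity of `node` in an undirected weighted graph.

    `graph` maps each word to a dict {neighbor: co-occurrence weight}.
    """
    n = len(graph)
    total = 0.0
    for j, w_ij in graph[node].items():
        # Indicator I(w_ij > 0): only existing links to other nodes contribute.
        if j == node or w_ij <= 0:
            continue
        # Degree of neighbor j: number of words it is linked to.
        g_j = sum(1 for w in graph[j].values() if w > 0)
        total += math.log10((n - 1) / g_j)
    return total


# Toy co-occurrence network (hypothetical).
graph = {
    "savings": {"family": 3, "bank": 1},
    "family": {"savings": 3, "bank": 2, "home": 1},
    "bank": {"savings": 1, "family": 2},
    "home": {"family": 1},
}
scores = {word: diversity(graph, word) for word in graph}
```

In this toy network, a link to a word that is connected to every other word adds nothing to the score, whereas links to sparsely connected words add more, which reflects the idea that less common associations are more distinctive.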
The last dimension of the SBS, connectivity, is measured as the weighted betweenness centrality of the ERKs 53,54 and represents their ‘brokerage power’, i.e. how much each keyword can serve as a bridge to connect other terms and topics in the discourse 19 . The connectivity formula is based on the analysis of the shortest paths connecting each pair of nodes 49 :
$$Connectivity\,(i) = \sum_{j<k} \frac{d_{jk}(i)}{d_{jk}}$$
where \(d_{jk}\) is the number of shortest paths linking nodes j and k, and \(d_{jk}(i)\) is the number of those shortest paths that pass through node i.
The final SBS indicator was calculated by summing the standardized scores of its components, considering all the words in the corpus for each timeframe. Aligned with past studies, e.g. 19,21 , we used an equal weighting scheme and carried out standardization by subtracting the mean and dividing by the standard deviation, as in the following formula:
$$SBS\,(i) = \frac{PR_i - \overline{PR}}{std(PR)} + \frac{DI_i - \overline{DI}}{std(DI)} + \frac{CO_i - \overline{CO}}{std(CO)}$$
where PR is prevalence, DI is diversity, and CO is connectivity. We also tested different approaches, such as subtracting the median and dividing by the interquartile range, which did not yield better results.
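The equal-weight aggregation described above can be sketched in a few lines of Python. The component values below are invented for illustration, the function names `standardize` and `sbs` are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-scores of a {word: value} mapping (sample standard deviation assumed)."""
    vals = list(values.values())
    m, s = mean(vals), stdev(vals)
    return {word: (v - m) / s for word, v in values.items()}


def sbs(prevalence, diversity, connectivity):
    """Sum of the standardized prevalence, diversity, and connectivity scores."""
    z_pr = standardize(prevalence)
    z_di = standardize(diversity)
    z_co = standardize(connectivity)
    return {w: z_pr[w] + z_di[w] + z_co[w] for w in prevalence}


# Hypothetical component values for three words in one timeframe.
prevalence = {"savings": 120, "prices": 300, "market": 210}
diversity = {"savings": 1.2, "prices": 2.5, "market": 1.9}
connectivity = {"savings": 0.01, "prices": 0.08, "market": 0.03}

sbs_scores = sbs(prevalence, diversity, connectivity)
```

Since each component is centered on its mean across all words, the resulting scores are relative: a positive value indicates above-average importance within the timeframe, and the scores sum to zero across words.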
Lastly, we calculated the language sentiment of all articles as a control variable and a possible additional predictor of the Consumer Confidence Index and its dimensions. Sentiment was computed using the SBS BI web app 45 , which uses a lexicon similar to VADER 55 for the Italian language. Sentiment scores range from −1 to +1, with −1 indicating very negative article content and +1 the opposite.
The Consumer Confidence series have a monthly frequency, whereas our predictor variables are weekly data series. In order to use the leading information coming from ERKs, we transformed the monthly time series into weekly data points using a temporal disaggregation approach 56 . The primary objective of temporal disaggregation is to obtain high-frequency estimates under the restriction of the low-frequency data, which exhibit long-term movements of the series. Given that the Consumer Confidence surveys are conducted within the initial 15 days of each month, we conducted a temporal disaggregation to ensure that the initial values of the weekly series were in line with the monthly series. To obtain weekly values, we applied a cubic spline interpolation 57,58,59 . Figure 2 illustrates the disaggregated series we obtained.
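To make the interpolation step concrete, the sketch below fits a cubic spline through monthly values and evaluates it on a weekly grid. It is a simplified illustration rather than the full temporal disaggregation procedure: natural boundary conditions (zero second derivative at both ends), the day offsets assigned to the monthly observations, and the index values themselves are all assumptions made for the example.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Tridiagonal system for the interior second derivatives m[1..n-1];
    # the natural boundary conditions fix m[0] = m[n] = 0.
    diag = [0.0] * (n + 1)
    rhs = [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2.0 * (h[i - 1] + h[i])
        rhs[i] = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
    # Forward elimination (Thomas algorithm).
    for i in range(2, n):
        factor = h[i - 1] / diag[i - 1]
        diag[i] -= factor * h[i - 1]
        rhs[i] -= factor * rhs[i - 1]
    # Back substitution.
    m = [0.0] * (n + 1)
    for i in range(n - 1, 0, -1):
        m[i] = (rhs[i] - h[i] * m[i + 1]) / diag[i]

    def evaluate(t):
        # Locate the interval [xs[i], xs[i + 1]] that contains t.
        i = min(max(bisect_right(xs, t) - 1, 0), n - 1)
        a, b = xs[i + 1] - t, t - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return evaluate


# Hypothetical monthly index values, dated by day offset from the first survey.
survey_days = [0, 31, 59, 90, 120]
monthly_index = [100.0, 101.5, 99.8, 102.3, 103.0]

spline = natural_cubic_spline(survey_days, monthly_index)
weekly_days = list(range(0, survey_days[-1] + 1, 7))
weekly_index = [spline(day) for day in weekly_days]
```

The spline passes exactly through the monthly observations, so the weekly series agrees with the survey values on the dates where both are defined and only the points in between are interpolated.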
To measure whether the SBS indicators offered relevant information to anticipate our economic variables, we performed Granger Causality tests. In general, a time series is said to Granger‐cause another time series if the former has incremental predictive power on the latter. Therefore, Granger causality provides an indication of whether one event or variable occurs prior to another. We also looked at the cross-correlation of the target series with our predictors (i.e., ERKs series) to see if they were in phase (positive signs of cross-correlation) or out of phase (negative sign) 60,61 .
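The two diagnostics can be illustrated with a small, dependency-free Python sketch. It computes a lagged cross-correlation and the F-statistic of a Granger-type test by comparing a regression of the target on its own lags with one that also includes lags of the predictor. The simulated series, the single lag, and the helper names `ols_rss`, `granger_f`, and `cross_correlation` are assumptions for the example; the article does not state which software was used, and a p-value would additionally require the F distribution with (lags, dof) degrees of freedom.

```python
import random


def ols_rss(X, y):
    """Residual sum of squares of an OLS fit solved via the normal equations."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[a] * row[b] for row in X) for b in range(k)]
         + [sum(row[a] * yi for row, yi in zip(X, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [v - f * u for v, u in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(X, y))


def granger_f(x, y, lags=1):
    """F-statistic for H0: lags of x add no predictive power for y."""
    restricted, unrestricted, target = [], [], []
    for t in range(lags, len(y)):
        own = [y[t - l] for l in range(1, lags + 1)]
        other = [x[t - l] for l in range(1, lags + 1)]
        restricted.append([1.0] + own)
        unrestricted.append([1.0] + own + other)
        target.append(y[t])
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(unrestricted, target)
    dof = len(target) - (2 * lags + 1)
    return ((rss_r - rss_u) / lags) / (rss_u / dof)


def cross_correlation(x, y, lag):
    """Pearson correlation between x[t - lag] and y[t], for lag >= 0."""
    a, b = x[:len(x) - lag], y[lag:]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


# Simulated weekly series in which x leads y by one period (illustrative only).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.2 * rng.gauss(0, 1) for t in range(1, 200)]

f_x_to_y = granger_f(x, y, lags=1)
f_y_to_x = granger_f(y, x, lags=1)
corr_lag1 = cross_correlation(x, y, lag=1)
```

In this simulated setting, the F-statistic for x leading y is far larger than the one for the reverse direction, and the positive lag-one cross-correlation indicates that the two series move in phase.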
Figure 3 outlines the methodology employed in our research design. We started by identifying the economic-related keywords (singletons or word sets). We then calculated the SBS indicators to measure the keywords’ importance and applied Granger causality tests to assess whether these indicators anticipate the consumer confidence indicators.
In this section, we discuss the signs of cross-correlation and the results of the Granger causality tests used to identify the indicators that could anticipate the consumer confidence components (see Table 2). In line with past research, e.g. 62,63 , we dynamically selected the number of lags using the Bayesian Information Criterion. The models indicate that 61% of the semantic importance series of ERKs Granger-cause the Personal component of the Consumer Climate index, while only 34% Granger-cause the Future component and 27% the Current component. It is not surprising that average consumers have a more accurate understanding of their personal circumstances when answering questions about their own financial situation, but are less informed about broader economic trends and cycles, which can be complex and difficult to understand without specialized training or experience. Interestingly, this representation of the current situation comes from online news, which may report what is currently happening more than it depicts future scenarios, and this may directly impact consumers’ opinions and economic decisions.
Through a granular analysis of the dimensions of consumer confidence, we found that the extent to which the news impacts consumers’ economic perception changes if we consider people’s current versus prospective judgments. Our forecasting results demonstrate that the SBS indicator predicts most consumer perception categories better than the language sentiment expressed in the articles. ERKs seem to have a greater impact on the Personal climate, i.e., consumers’ perception of their current ability to save, purchase durable assets, and feel economically stable. In addition, we find a disconnect between the ERKs’ impact on the current and future assessments of the economy, which is aligned with other studies 68,69 . While the Consumer Confidence Index has often been considered a suitable predictor of economic growth and a good indicator of consumers’ optimism about the current economy, short-term estimations may show deviations from long-term trends, likely caused by nonsystematic shocks.
Lastly, keywords associated with national or European political decisions seem to lead to more uncertainty and pessimism. This is consistent with other empirical evidence demonstrating how the conduct of politics—in our case, both at a national and European level—plays a role in determining how consumers feel about the economy’s future in both the long and short run 9 . The higher prominence and predictive power of political keywords, both as it refers to economic and non-economic concerns, have been considered in past research among the key determinants of consumers’ perception of the future of the economy 26,70 .
These results are aligned with previous studies showing how exposure to uncertain information makes people feel uncertain and more pessimistic about their future 11,14,71 . People’s reaction was more positive when keywords were associated with clear financial concepts (e.g., gold or monetary policy). When keywords were related to political discussions or concepts like rent, the role of Europe, or retirement, people’s reaction was more negative. Interestingly, the keyword “gold” had an impact on determining consumer confidence in six of the nine questions: Evaluation of the Economic situation in Italy; Evaluation of the household economic situation; Evaluation of the household budget; Current Opportunities for Savings; Current Opportunities of Purchasing Durable Goods and Expectations on the economic situation of Italy. During economic downturns, such as the 2018 and 2019 recession in Italy, financial institutions often increase their holdings of gold as reserve assets. This may be due to the perception that gold is a safe and stable investment during times of economic uncertainty. As a result, consumers may view this move positively, as it signals financial stability and security within the institution. As demonstrated by a study commissioned by the IMF 72 , macroeconomic announcements have a significant impact on both the price of gold and consumer confidence.
Even if exploratory in nature, our study suggests that news has important implications on consumer confidence during economic recessions, not only during an economic expansion, as suggested by recent research 37 . Overall, our models confirm the important role played by the media in shaping current judgments and future expectations 11 , and the impact that national and European politics have on shaping these assessments 9 .
This article investigates the antecedents of consumer confidence by analyzing the importance of economic-related keywords as reported on online news. After mining online Italian news over a period of four years, we found that most of the selected keywords impact how consumers perceive their personal economic situation.
Overall, this study offers valuable insights into the potential of semantic network analysis in economic research and underscores the need for a multidimensional approach to economic analysis. It contributes to consumer confidence and news literature by illustrating the benefits of adopting a big data approach to describe current economic conditions and better predict a household’s future economic activity. The methodology in this article uses a new indicator of semantic importance applied to economic-related keywords, which promises to offer a complementary approach to estimating consumer confidence, lessening the limitations of traditional survey-based methods. Text mining of online news shows clear potential for market prediction, and further research in this area could extend these results. For example, future studies could consider exploring other characteristics of news and textual variables connected to psychological aspects of natural language use 73 or consider measures such as language concreteness 74 .
Finally, our research highlights the importance of media communication in shaping public opinion and influencing consumer behavior. As such, it is crucial for businesses and policymakers to be aware of the potential impact of media on consumer confidence and take appropriate measures to mitigate any negative effects.
The data that support the findings of this study are available from the author Barbara Guardabascio upon reasonable request. This refers to the numerical data resulting from the analysis of the news articles and the trained BERT models. However, the authors are not allowed to share the raw news data provided by Telpress International B.V. These data are the property of the company, and the authors have deleted them after the analysis.
The authors wish to thank Vincenzo D’Innella Capano, CEO of Telpress International B.V., and Lamberto Celommi for making the news data available. The computing resources and the related technical support used for this work were provided by CRESCO/ENEAGRID High Performance Computing infrastructure and its staff. CRESCO/ENEAGRID High Performance Computing infrastructure is funded by ENEA, the Italian National Agency for New Technologies, Energy and Sustainable Economic Development and by Italian and European research programs.