In today’s information-rich environment, organizations grapple with vast amounts of unstructured data,
including social media posts, news articles, and customer reviews. This data holds a wealth of actionable
insights, particularly in understanding public sentiment—a crucial aspect for businesses, policymakers,
and investors alike. Yet, traditional sentiment analysis methods have struggled to keep pace with the
nuanced complexity of human expression and the sheer scale of available data. While large language
models (LLMs) promise scalability and speed, their limitations in contextual understanding hinder their
ability to deliver consistently reliable insights. This is where the innovative SENDEX (Sentiment Index)
framework steps in, blending human judgment with AI efficiency to overcome these limitations.
Large language models, including state-of-the-art systems, excel at generating coherent text and
identifying surface-level patterns. However, their performance often falters when deeper contextual
understanding is required. This shortfall becomes particularly evident in two critical phenomena:
intrusion – when unrelated elements are erroneously grouped together – and diffusion – when a single
coherent idea is scattered across multiple categories or responses. These issues are symptomatic of a
larger limitation: LLMs often rely on statistical correlations within their training data, which may not
reflect nuanced, real-world associations.
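To make the two failure modes concrete, here is a minimal sketch that is not part of SENDEX itself. It assumes each item carries a human-assigned idea label and a model-assigned group, and it scores intrusion and diffusion as simple shares. The function name, the scoring formulas, and the toy data are all illustrative assumptions.

```python
from collections import Counter, defaultdict


def intrusion_and_diffusion(assignments):
    """Score intrusion per group and diffusion per idea.

    assignments: list of (idea, group) pairs, one per item, where `idea` is
    the human-judged label and `group` is the model-assigned category.
    These definitions are illustrative assumptions, not the SENDEX metrics.
    """
    by_group = defaultdict(Counter)  # group -> count of each idea inside it
    by_idea = defaultdict(Counter)   # idea -> count of each group it lands in
    for idea, group in assignments:
        by_group[group][idea] += 1
        by_idea[idea][group] += 1

    # Intrusion: share of a group's items that are not its majority idea.
    intrusion = {
        g: 1 - max(c.values()) / sum(c.values()) for g, c in by_group.items()
    }
    # Diffusion: share of an idea's items that fall outside its dominant group.
    diffusion = {
        i: 1 - max(c.values()) / sum(c.values()) for i, c in by_idea.items()
    }
    return intrusion, diffusion


# Hypothetical toy data, invented for illustration only.
items = [
    ("solar subsidies", "A"),
    ("solar subsidies", "A"),
    ("interest rates", "A"),  # unrelated item grouped with solar: intrusion
    ("wind permits", "B"),
    ("wind permits", "C"),    # one idea split across two groups: diffusion
]
intrusion, diffusion = intrusion_and_diffusion(items)
```

In this toy case, group A is the only group with a nonzero intrusion score, and "wind permits" is the only idea with a nonzero diffusion score. A human reviewer could use scores like these to decide which groups to inspect first.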
The SENDEX framework addresses these challenges by integrating human contextual expertise at
strategic points in the analysis pipeline. SENDEX combines the speed and scalability of LLMs with the
nuanced judgment of human analysts, creating a best-of-both-worlds solution.
The main features of SENDEX include:
- a novel LLM-based probabilistic topic modeling framework
- a multi-agent sentiment analysis ensemble that incorporates human feedback for coherence
- a Tableau dashboard that plots SENDEX features as time series, with the flexibility to choose among
  features, concepts, temporal windows, etc.
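As a rough illustration of the kind of time series such a dashboard might plot, the sketch below averages per-article sentiment scores for a chosen set of concepts over a chosen temporal window. The record layout, the score scale of [-1, 1], the function name, and every data value are assumptions made for this example; they do not describe the actual SENDEX pipeline or its results.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def sentiment_series(records, concepts, window="month"):
    """Average sentiment per time window for the selected concepts.

    records: iterable of (date, concept, score) tuples; score is assumed
             to lie in [-1, 1].
    concepts: set of concept names to include.
    window: "month" or "year" (hypothetical options for this sketch).
    """
    buckets = defaultdict(list)
    for day, concept, score in records:
        if concept not in concepts:
            continue  # skip concepts the user did not select
        key = (day.year, day.month) if window == "month" else (day.year,)
        buckets[key].append(score)
    # Sort by window so the result reads as a chronological series.
    return {key: mean(scores) for key, scores in sorted(buckets.items())}


# Invented toy records; the scores are not SENDEX output.
records = [
    (date(2020, 1, 5), "Renewable Energy", 0.1),
    (date(2020, 2, 16), "Renewable Energy", 0.6),
    (date(2020, 2, 20), "Automotive Industry", 0.4),
    (date(2020, 2, 22), "Finance", -0.2),  # excluded by the concept filter
]
series = sentiment_series(records, {"Renewable Energy", "Automotive Industry"})
```

Switching `window` to `"year"` or changing the concept set regroups the same records, which mirrors the kind of selection the dashboard is described as offering.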
While SENDEX is designed as a domain-agnostic framework capable of analyzing sentiment across any
field, we demonstrate its capabilities through a focused application in environmental news coverage.
For our analysis, we considered an extensive dataset of 1 million articles that not only provides a robust
foundation for validating SENDEX's capabilities, but also enables us to evaluate its performance across
various market cycles and evolving environmental narratives.
The Inflation Reduction Act (IRA), signed in August 2022, brought a positive shock to the renewable energy sector through tax credits and clean energy incentives for solar, wind, and EVs. The SENDEX dashboard shows how regulator and investor sentiments shot up due to this change. The disaggregate concept list selected for this plot was:
(Renewable Energy) + (Automotive Industry) EVs + (Environmental Management) Solar Roadways
In August 2015, the Shanghai Composite Index fell by 8.5%, and fear drove US stock futures down by 7% before markets opened. When the US stock market opened, high-frequency traders (HFTs) and algorithmic trading systems contributed to a massive sell-off, which triggered more than 1,200 stock circuit breakers and Limit Up-Limit Down (LULD) halts on the same day. The sheer number of halts increased the pressure to improve regulation and find a better system for stock circuit breakers. For this phase, we tracked two key indices, the S&P 500 and the VIX (%), along with regulator stakeholder sentiments. The SENDEX dashboard shows the disaggregated regulator sentiments dipping negative in the same period. We do not see a drastic reduction, however, possibly because this effect was mixed with other positive events involving regulators and the same concepts. The sentiments also respond to other fluctuations in the S&P 500 and VIX (%), as in the dips seen in February 2016 and July 2016. The disaggregate concept list selected for this plot was:
(US Economy), (Real Estate) Housing, (Finance), (Energy Industry)
To incentivize renewable electricity generation, the Production Tax Credit (PTC) provides a federal tax credit per kilowatt-hour (kWh) for electricity generated from qualified renewable energy sources for facilities placed in service after December 31, 2021. Originally, the PTC was set to expire at the end of 2012, but the details of its expiration had a catch: the credit was made available for any project that began construction by the end of 2013, so developers rushed to qualify rather than wait until the credit had dried up. We tracked the iShares Global Clean Energy ETF (ICLN) alongside investor and regulator sentiments on the SENDEX dashboard. For high-volume concepts selected under (Renewable Energy), the investor and regulator sentiments dip negative as the credits are about to expire near the end of 2012, then recover, possibly due to other events. The dashboard also shows ICLN reaching an all-time low at the end of 2012.
As LLMs continue to evolve, the need for human-AI frameworks like SENDEX will only grow. While technological advancements may reduce some limitations, the intrinsic contextual gaps in AI’s reasoning underline the irreplaceable value of human judgment. The future lies not in choosing between human or AI approaches but in designing systems that synergize their strengths—delivering clarity, precision, and impact in an increasingly complex world. SENDEX is not just a tool; it’s a testament to what’s possible when we combine the best of human and artificial intelligence. For businesses, policymakers, and researchers navigating the intricate dynamics of public sentiment, this hybrid framework offers a roadmap to deeper understanding and more informed decision-making.
Methodology
See the SENDEX methodology for details of the examples cited above.