Outperformed by AI: Time to Replace Your Analyst? Find Out Which GenAI Model Does It Best

Discover how Large Language Models (LLMs) compare to human analysts in generating SWOT analyses.

LLM-based AI models are rapidly replacing human labor, especially in low-value, repetitive tasks such as customer service chatbots, appointment scheduling, and business lead generation. But what about high-value research roles? How do these models compare to human expertise, and could they eventually replace Ph.D.-level experts like me?

A recent study explored this question through a controlled experiment involving six LLMs and three real-world companies. Each model produced a SWOT analysis under two conditions—a basic prompt and an advanced prompt. The outputs were then evaluated across six key criteria, offering a nuanced assessment of the quality and reliability of AI-generated SWOT analyses.

Paper reviewed:

Schopf, Michael, "Outperformed by AI: Time to Replace Your Analyst? Find Out Which GenAI Model Does It Best" (April 15, 2025). Available at SSRN: https://ssrn.com/abstract=5222427 or http://dx.doi.org/10.2139/ssrn.5222427

Summary

This research paper investigates the capabilities of LLMs in generating SWOT analyses, revealing that advanced prompting and reasoning-optimized models can outperform human analysts in terms of specificity and depth.

Introduction

The rapid advancement of Generative AI, particularly Large Language Models (LLMs), has significant implications for financial analysis and investment research. This paper explores the capability of LLMs to generate high-quality SWOT analyses, a critical component of investment research, and compares their performance with that of human analysts. By examining the outputs of six leading LLMs on three diverse companies, the study assesses the impact of prompt engineering, the performance of different LLMs, and the potential for LLMs to augment or replace human analysts in certain tasks.

Background and Context

The use of AI in finance is not new, but the emergence of LLMs represents a significant leap forward in terms of their ability to analyze and generate complex financial content. SWOT analysis, a framework used to identify the strengths, weaknesses, opportunities, and threats of a company, is a key task in investment research. Traditionally, crafting a nuanced SWOT analysis requires extensive knowledge and careful reasoning. LLMs promise to automate much of this groundwork, rapidly generating SWOT analyses from vast amounts of data. This study investigates the extent to which LLMs can fulfill this promise and how they can be effectively leveraged by investment professionals.

The study involved a comparative analysis of six state-of-the-art LLMs: o1 Pro, DeepSeek R1, Google Gemini Advanced 2.5, Grok 3, ChatGPT 4.5, and ChatGPT 4o. These models were tasked with generating SWOT analyses for three public companies across different industries and geographies: Deutsche Telekom AG, Daiichi Sankyo Co., Ltd., and Kirby Corporation. The outputs were evaluated based on six criteria: formal structure, overall plausibility, specificity, depth of analysis, cross-checking with public information, and meta-evaluation. The results were then compared with human-generated SWOT analyses to assess the relative performance of LLMs.

The findings of this study have significant implications for investment research and the use of AI in finance. They highlight the potential of LLMs to augment human analysts, particularly in tasks that require the analysis of large datasets and the generation of comprehensive reports. However, they also underscore the importance of human judgment and oversight in interpreting the outputs of LLMs and ensuring their relevance and accuracy.

One of the key takeaways from the study is the critical role of prompt engineering in determining the quality of LLM-generated SWOT analyses. Detailed prompts that guide the LLMs to provide specific, relevant, and well-reasoned analyses significantly enhance the quality of the outputs. This finding emphasizes the need for investment professionals to develop skills in crafting effective prompts to maximize the utility of LLMs in their research tasks.

The study also highlights the performance differences among the LLMs tested. Models optimized for reasoning, such as o1 Pro and Gemini Advanced 2.5, outperformed standard LLMs in generating SWOT analyses, demonstrating a greater ability to handle complex instructions and provide insightful analyses. This distinction is crucial for investment firms choosing AI tools to support their research processes.

Furthermore, the study touches on the geopolitical implications of the LLM landscape, noting the current dominance of US and Chinese models and the relative absence of European LLMs. This raises strategic considerations for European financial institutions, including issues of dependency on foreign technology and data privacy.

As the LLM landscape continues to evolve, with new models and capabilities emerging regularly, the need for ongoing evaluation and adaptation is clear. The evaluation framework developed in this study provides a useful tool for assessing the quality of LLM-generated analyses and can be applied to future models as they become available.

In sum, LLMs have the potential to significantly enhance the efficiency and effectiveness of investment research by automating certain tasks and providing comprehensive analyses. However, their use should be complemented with human oversight and judgment to ensure the accuracy and relevance of the generated insights. As the technology continues to advance, it is likely to play an increasingly important role in the investment research process.

Main Results

The study compared six leading LLMs (o1 Pro, DeepSeek R1, Google Gemini Advanced 2.5, Grok 3, ChatGPT 4.5, and ChatGPT 4o) on their ability to generate SWOT analyses for three diverse global companies: Deutsche Telekom AG, Daiichi Sankyo Co., Ltd., and Kirby Corporation. The results show that advanced prompting significantly improves the quality of SWOT analyses across all models.

Key Findings

  1. Advanced Prompts Yield Significantly Better SWOTs: Detailed prompts consistently enhanced SWOT analysis quality across all models. For example, Gemini Advanced 2.5 achieved the best overall result with an advanced prompt, with an average rank of 1.5, versus 3.8 with a basic prompt.
  2. LLMs Often Match or Exceed Human Analysts in Output Quality: LLM-generated SWOT analyses were rated on par with, and often superior to, those written by human analysts, especially in terms of specificity and depth.
  3. Reasoning-Optimized Models Outperform Standard Models: Models designed for deep reasoning and multi-step analysis, such as o1 Pro and Gemini Advanced 2.5, produced more comprehensive and insightful SWOTs than standard models like GPT-4o.
  4. Evaluation Framework and Scoring Approach: The study developed a robust evaluation framework with six criteria (Formal Structure, Plausibility, Specificity, Depth, Cross-Checking, and Meta-Evaluation) to assess SWOT quality, providing a useful tool for ongoing model assessment.
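The six-criteria framework above lends itself to a simple scoring harness. The sketch below is a minimal illustration, not the paper's actual implementation: the criterion names come from the study, but the 1-5 scale, equal weighting, and sample scores are assumptions for demonstration.

```python
from dataclasses import dataclass, fields

@dataclass
class SwotScores:
    """Scores for one model's SWOT output on the paper's six criteria.
    The 1-5 scale and equal weighting are illustrative assumptions."""
    formal_structure: float
    plausibility: float
    specificity: float
    depth: float
    cross_checking: float
    meta_evaluation: float

    def overall(self) -> float:
        # Unweighted mean across all six criteria
        vals = [getattr(self, f.name) for f in fields(self)]
        return sum(vals) / len(vals)

def rank_models(results: dict[str, SwotScores]) -> list[tuple[str, float]]:
    """Rank models by overall score, best first."""
    return sorted(((m, s.overall()) for m, s in results.items()),
                  key=lambda pair: pair[1], reverse=True)

# Sample scores are made up for illustration, not taken from the study.
results = {
    "Gemini Advanced 2.5": SwotScores(5, 5, 4, 5, 4, 5),
    "GPT-4o": SwotScores(4, 4, 3, 3, 3, 3),
}
ranking = rank_models(results)
```

A harness like this makes the study's "continuous evaluation" recommendation concrete: re-score each new model release against the same rubric and compare rankings over time.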

Detailed Analysis

The study's findings have significant implications for investment professionals. Advanced prompting is crucial for generating high-quality SWOT analyses. For instance, Gemini Advanced 2.5 with an advanced prompt outperformed other models, demonstrating the importance of detailed instructions.

Advanced Prompts and LLM Performance

The use of advanced prompts boosted average specificity and depth scores by roughly 30–40% compared with basic-prompt outputs. This highlights the need for investment professionals to develop prompt engineering skills to harness the full potential of LLMs.
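To make the basic-versus-advanced distinction concrete, the sketch below contrasts two prompt styles. The exact wording is illustrative and not taken from the paper; it simply demonstrates the kinds of elements (analyst role, required structure, specificity, sourcing) that detailed prompts typically add.

```python
COMPANY = "Deutsche Telekom AG"

# Basic prompt: a bare request, leaving structure and depth to the model.
basic_prompt = f"Create a SWOT analysis for {COMPANY}."

# Advanced prompt: adds a role, output structure, and specificity
# requirements. The wording here is a hypothetical example, not the
# study's actual prompt.
advanced_prompt = "\n".join([
    f"You are a senior equity analyst. Produce a SWOT analysis for {COMPANY}.",
    "Requirements:",
    "- Give 4-6 bullet points per quadrant (Strengths, Weaknesses,",
    "  Opportunities, Threats).",
    "- Each point must be company-specific, not a generic industry statement.",
    "- Support each point with a concrete fact, figure, or recent development.",
    "- Note where points in different quadrants interact, e.g. a strength",
    "  that mitigates a threat.",
])
```

The design point is that the advanced prompt constrains the output space: the model is told what "good" looks like instead of being left to guess.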

LLMs vs. Human Analysts

While LLMs can rival human output in many quantitative and informational aspects, they lack the qualitative, strategic insight that comes from human experience and interaction. For example, LLMs occasionally missed or underemphasized subtler points that human analysts caught, especially those requiring a sense of corporate culture or management’s thinking.

Strategic Implications

  1. Hybrid Approach: The optimal approach is a hybrid one, using LLMs to do the heavy lifting on information gathering and generating a solid first draft of analysis, then having human analysts interpret, verify, and add strategic context.
  2. Prompt Engineering as a Key Skill: Knowing how to ask questions of LLMs is now a valuable competency. Investment professionals should cultivate some competence in crafting effective prompts to unlock tremendous value from LLMs.
  3. Human Oversight Remains Paramount: Human judgment remains the final checkpoint, especially in finance, where decisions have real monetary consequences. Analysts should approach AI outputs with a mix of openness and skepticism.
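The hybrid workflow described above (LLM drafts, human verifies) can be sketched as a simple pipeline. This is a structural illustration under stated assumptions: the function names and stub implementations are hypothetical, standing in for a real LLM API call and a real analyst review step.

```python
from typing import Callable

def hybrid_swot_workflow(
    company: str,
    generate_draft: Callable[[str], str],    # e.g. a wrapper around an LLM API
    review_draft: Callable[[str, str], str], # the human analyst's verify/edit step
) -> str:
    """The LLM does the heavy lifting on the first draft; a human
    analyst interprets, verifies, and adds context before anything ships."""
    draft = generate_draft(company)
    final = review_draft(company, draft)
    return final

# Stub stand-ins so the sketch runs without a real LLM or analyst:
draft_fn = lambda c: f"[draft SWOT for {c}]"
review_fn = lambda c, d: d + " [verified by analyst]"
report = hybrid_swot_workflow("Kirby Corporation", draft_fn, review_fn)
```

The key design choice is that the review step is mandatory in the control flow: no AI-generated draft reaches the final report without passing through the human checkpoint.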

Methodology Insights

The study's methodology involved a controlled experiment with six LLMs and three real-world companies. Each model generated a SWOT analysis under two conditions: basic prompt and advanced prompt. The outputs were evaluated against six key criteria, providing a nuanced assessment of SWOT quality.

Importance of the Methodology

The study's approach allowed for the isolation of the effect of prompt engineering on SWOT analysis quality. By comparing the outputs from basic vs. advanced prompts, the study demonstrated the significant impact of prompt design on LLM performance.

Analysis and Interpretation

The findings suggest that LLMs can be powerful tools to extend human analysts, but not replacements. The optimal use of LLMs is in a hybrid workflow, capitalizing on their speed, breadth, and consistency, while leveraging human depth of understanding, intuition, and judgment.

Patterns and Trends

  1. Reasoning-Optimized Models Excel: Models like o1 Pro and Gemini Advanced 2.5 outperformed standard models in SWOT generation, highlighting the importance of deep reasoning capabilities for complex analytical tasks.
  2. Geopolitical Considerations: The study notes the current imbalance in the LLM landscape, with the top models coming from the US and China and a notable absence of European LLM leaders. This raises strategic considerations for European financial institutions regarding dependency on foreign technology and data privacy.

Real-World Implementation Considerations

  1. Choosing the Right LLM: Investment firms should be aware of the distinction between reasoning-optimized models and standard LLMs. For critical tasks like SWOT analysis, reasoning-optimized models are likely to deliver more value.
  2. Continuous Evaluation: Given the rapid evolution of LLMs, investment teams should regularly evaluate new models and updates to ensure they are using the best tools available.

By understanding these findings and implications, business leaders and investment professionals can effectively integrate LLMs into their research workflows, enhancing efficiency and insight quality while maintaining the critical role of human judgment.

Practical Implications

The study's findings have significant practical implications for businesses and investment professionals. The ability of Large Language Models (LLMs) to generate high-quality SWOT analyses has far-reaching consequences for how companies approach investment research and decision-making.

Real-World Applications

  1. Augmenting Human Analysts: LLMs can handle a substantial portion of the initial legwork in company research, freeing human analysts to focus on higher-level interpretation and judgment.
  2. Streamlining Research Processes: By leveraging LLMs, investment teams can accelerate their research workflows, enabling them to respond more quickly to market developments.
  3. Enhancing Insight Quality: The use of LLMs can surface new perspectives and insights that human analysts may have overlooked, potentially leading to more informed investment decisions.

Strategic Implications

  1. Competitive Advantage: Firms that effectively integrate LLMs into their research workflows may gain a competitive edge by making more timely and informed investment decisions.
  2. Talent Development: As LLMs become more prevalent, investment professionals will need to develop skills in prompt engineering and AI oversight to maximize the benefits of these tools.
  3. Risk Management: The use of LLMs also introduces new risks, such as over-reliance on AI-generated insights. Firms will need to implement robust oversight mechanisms to mitigate these risks.

Who Should Care

  1. Investment Professionals: Portfolio managers, analysts, and researchers will need to understand how to effectively utilize LLMs in their workflows.
  2. Business Leaders: Executives responsible for investment decisions will need to be aware of the potential benefits and limitations of LLMs.
  3. AI and Technology Leaders: Those responsible for implementing and overseeing AI solutions within organizations will need to stay abreast of the latest developments in LLMs.

Actionable Recommendations

To effectively integrate LLMs into investment research workflows, businesses and managers can take the following actions:

Specific Actions

  1. Develop Prompt Engineering Skills: Invest in training and development programs to enhance the prompt engineering capabilities of investment professionals.
  2. Implement a Hybrid Approach: Use LLMs to generate initial drafts of analyses, such as SWOT reports, and have human analysts review and refine the outputs.
  3. Continuously Evaluate New LLMs: Regularly assess new models and updates to ensure the use of the most effective tools available.

Implementation Considerations

  1. Model Selection: Choose LLMs that are optimized for reasoning and complex analysis, as these are likely to deliver more valuable insights for tasks like SWOT analysis.
  2. Oversight and Validation: Implement robust oversight mechanisms to validate AI-generated insights and prevent over-reliance on LLMs.
  3. Integration with Existing Workflows: Carefully integrate LLMs into existing research workflows to maximize efficiency gains and minimize disruption.

Conclusion

Main Takeaways

  1. LLMs Can Match or Exceed Human Analysts: In certain analytical tasks, such as SWOT analysis, LLMs can produce outputs that are comparable to or even surpass those of human analysts.
  2. Importance of Prompt Engineering: The quality of LLM outputs is heavily dependent on the quality of the prompts used to guide them.
  3. Hybrid Approach is Optimal: The most effective use of LLMs is in conjunction with human analysts, who can provide strategic context and oversight.

Final Thoughts

The integration of LLMs into investment research workflows has the potential to significantly enhance efficiency and insight quality. However, it is crucial to approach this integration thoughtfully, recognizing both the benefits and limitations of these powerful tools. By doing so, businesses and investment professionals can harness the full potential of LLMs to drive better investment decisions and maintain a competitive edge in a rapidly evolving market landscape.