26/9/2024
Understanding the digital ecosystem involves navigating large volumes of data to identify patterns, narratives and key players. Traditional methodologies, based on manual analysis of information, have begun to fall behind the size of databases and their constant changes. Given these difficulties, artificial intelligence (AI) can be an innovative tool to improve the efficiency and depth of digital investigations.
At Linterna Verde we have recently combined qualitative and quantitative analysis methodologies with an AI-assisted deductive coding model. This approach has been used to identify simple categories such as themes, actors or places, and in our experience the model has also reached a success rate of up to 95% in complex categorizations. Although the categories were predefined in a codebook, the ability of AI to process large volumes of data allowed us to detect patterns and narratives more quickly and accurately in later stages of the research, which made it easier to map actors and networks in the media ecosystem in detail.
One of the main advantages of AI in data analysis is its ability to optimize time and human resources. Activities that previously required the collaboration of several researchers over weeks can now be completed in a few hours under the supervision of a single person. This efficiency not only significantly reduces the time spent, but also frees researchers to concentrate on more substantial tasks, increasing both productivity and quality of work.
In addition, AI allows a smooth transition between small-scale and large-scale analysis. When working with large databases, it is possible to write prompts (instructions for the model) based on a small sample and then apply them to the rest of the data. This gives us the opportunity to refine the analytical categories and match them more precisely to the objectives of the study. In this way, AI improves the efficiency and depth of analysis, allowing researchers to obtain more robust and comprehensive results.
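The sample-first workflow described above can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used at Linterna Verde: the function names, the 0.9 agreement threshold, the example texts and the keyword-based `stub_classify` (which stands in for a real model call) are all assumptions made for the example.

```python
from typing import Callable, List, Tuple


def percent_agreement(human: List[str], model: List[str]) -> float:
    """Share of items where the model label matches the human label."""
    if len(human) != len(model) or not human:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(h == m for h, m in zip(human, model))
    return matches / len(human)


def pilot_then_scale(
    classify: Callable[[str], str],
    pilot: List[Tuple[str, str]],
    corpus: List[str],
    threshold: float = 0.9,  # assumed cut-off, to be set by the research team
) -> List[str]:
    """Check the classifier on a hand-coded pilot sample, then code the corpus."""
    human = [label for _, label in pilot]
    model = [classify(text) for text, _ in pilot]
    score = percent_agreement(human, model)
    if score < threshold:
        # Agreement is too low: the prompt or the categories need revision.
        raise RuntimeError(
            f"agreement {score:.0%} is below {threshold:.0%}; revise the prompt"
        )
    return [classify(text) for text in corpus]


def stub_classify(text: str) -> str:
    """Stand-in for a real model call (hypothetical keyword rule)."""
    return "elections" if "vote" in text.lower() else "other"


# Hypothetical hand-coded pilot sample and uncoded corpus.
pilot_sample = [
    ("Where do I vote on Sunday?", "elections"),
    ("New bakery opens downtown", "other"),
]
full_corpus = ["Vote counting starts tonight", "Traffic update for Bogotá"]

labels = pilot_then_scale(stub_classify, pilot_sample, full_corpus)
```

In practice `stub_classify` would be replaced by a function that sends the prompt to the chosen model, and the pilot sample would be large enough for the agreement score to be meaningful.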
The implementation of these tools also presents challenges. The uses of prompt engineering and category design for social research are still poorly documented. This lack of references forces teams to start from a basic level with each new project, which slows the advancement of knowledge and hinders the wider adoption of these technologies. It is essential that researchers share their experiences and knowledge so that the use of AI is extended and optimized in the field.
On the other hand, although AI models can generate inferences on their own, the results are more accurate when researchers provide predefined categories based on a literature review. The accuracy of the results depends not only on the quality of the data entered, but also on the clarity and thoroughness of the prompts used. This underscores the importance of careful methodological design to ensure that the AI contributes effectively to the analysis and the achievement of the research objectives.
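As a rough illustration of what predefined categories and a clear prompt can look like in code, the sketch below builds a prompt from a codebook and rejects any answer that falls outside it. The codebook contents, the prompt wording and the function names are hypothetical, chosen only for this example; the model call itself is left out.

```python
from typing import Dict

# Hypothetical codebook: category name -> short definition.
CODEBOOK: Dict[str, str] = {
    "health": "Hospitals, vaccines or public health policy.",
    "elections": "Candidates, voting or electoral authorities.",
    "other": "Anything that fits none of the categories above.",
}


def build_prompt(codebook: Dict[str, str], text: str) -> str:
    """Write one explicit instruction that lists every allowed category."""
    lines = ["Assign exactly one category to the text below.", "Categories:"]
    for name, definition in codebook.items():
        lines.append(f"- {name}: {definition}")
    lines.append("Answer with the category name only.")
    lines.append(f"Text: {text}")
    return "\n".join(lines)


def parse_label(response: str, codebook: Dict[str, str]) -> str:
    """Normalize the model's answer and refuse labels outside the codebook."""
    label = response.strip().lower()
    if label not in codebook:
        raise ValueError(f"label {label!r} is not in the codebook")
    return label


prompt = build_prompt(CODEBOOK, "The registrar extended voting hours.")
# " Elections\n" imitates a raw model answer with stray whitespace and casing.
label = parse_label(" Elections\n", CODEBOOK)
```

Rejecting out-of-codebook answers keeps the coding deductive: an unexpected label becomes a visible error for the researcher to review, not a new category that slips silently into the data.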
Despite initial fears about the impact of AI on the social sciences, evidence shows that this technology does not replace researchers, but rather amplifies their capabilities. Coding models facilitate the analysis of large volumes of data and automate the most repetitive tasks, allowing researchers to focus on the more complex and creative aspects of the process. In this way, this tool complements human work, improving both the efficiency and depth of qualitative analysis.
For the use of AI in social research to be effective and accessible, it is crucial that the lessons learned from its implementation be shared. Because these tools require specialized technical knowledge, their adoption in the field is still relatively limited. Disseminating these advances will help close the knowledge gap and democratize access to these technologies, which is especially important in contexts with great social research needs such as Colombia.