Data Trends for 2024
Posted: Sun Dec 22, 2024 5:02 am
Data has been playing an increasingly important role in the technology sector. The ongoing digital transformation of society as a whole, as well as the emergence of new technologies that allow for greater value extraction from available information, mean that the availability of data is as important in technological evolution cycles as the availability of computing power. The main trend for 2024 is certainly the intensification of this centrality of data for companies, individuals and governments.
The advancement of artificial intelligence throughout 2023, in particular, has significantly transformed the way organizations use data to generate value, and has also led to important discussions about access, privacy, biases and ownership of information.
In this article, taking advantage of the turn of the year, we will detail the main trends in the data area for 2024 and show how data will continue to shape the innovations driving digital transformation. Let's go!
Data Productization
One of the main consequences of the popularization of Large Language Models (LLMs), such as ChatGPT, and the emergence of so-called "multimodal" models — models capable of processing and generating data in different formats (text, images, video, voice, etc.) — was to make evident just how important data is to artificial intelligence. One of the biggest differences between these models and more classic AI techniques is precisely the volume of information they process, both during training and at inference time.
Most of the popular models on the market today were trained on large public datasets, without much regard for copyright or intellectual property issues. While this was a successful strategy for initial product launches, it is quickly proving unsustainable. Just look at the lawsuits already being filed against some of the most prominent AI companies.
Regardless of the outcomes of these lawsuits, what has become clear is the importance of obtaining quality data, both for training current models and for the new versions that will be released in the future. And, to ensure legal security for potential customers of these AI systems, it is essential that this data be obtained in a more "well-behaved" manner, supported by commercial agreements between the companies that build the models and the holders of the information.
Thus, one trend we see for 2024 is the creation of more and more data “products” to be sold to those building new LLMs, or to those customizing these models for a specific application. Large data owners (large websites, newspapers, photo archives, and so on) have a unique opportunity to monetize their information, and they certainly won’t let it pass them by. We will see these large collections being offered in different commercial models to model developers, which also opens up the opportunity to sell to other use cases.
Computational Reduction
Until now, one of the defining characteristics of processing large volumes of data has been the need for large amounts of computing power to perform that processing. Throughout 2023, companies that develop chips specialized in training artificial intelligence saw their valuations rise on the back of high demand for these chips. Not only that, but one of the market mantras has been that massive computing power is essential for any company that wants to enter the AI space.
In 2024, we should start to see this trend reverse. With the popularization of these models, and the publication of dozens of open-source versions of them, more and more researchers and developers are working on optimizing algorithms and data pipelines, driving down the computational cost of developing new models.
If the pattern of previous technologies holds, we are likely to see the computational cost of training an equivalent model fall by roughly half every 18 to 24 months, which means we should already see significant reductions over the course of 2024.
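The rule of thumb above compounds like a half-life: a roughly 50% reduction every 18 to 24 months. As a minimal sketch of that arithmetic (the function name and the normalized starting cost of 1.0 are illustrative assumptions, not figures from the article):

```python
def projected_cost(initial_cost: float, months: int, halving_period: float) -> float:
    """Cost after `months`, assuming cost halves every `halving_period` months."""
    return initial_cost * 0.5 ** (months / halving_period)

# Starting from a normalized training cost of 1.0:
for period in (18, 24):
    after_one_year = projected_cost(1.0, 12, period)
    print(f"halving every {period} months -> cost after 1 year: {after_one_year:.2f}x")
```

Even under the slower 24-month assumption, costs fall by almost 30% within a single year, which is consistent with the article's expectation of visible reductions during 2024.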
This reduction in computational costs opens the door for more and more companies to use artificial intelligence in their day-to-day operations and expand their value generation processes based on data.
Data Privacy and Security
Although this is not a new trend in the data area, the explosion in the capacity to process and discover patterns within the most varied types of data — brought to light by the almost magical capabilities of new artificial intelligence models — lends new urgency to issues of privacy and the protection of individuals' data around the world.
We are already seeing the first movements in this direction with the evolution of the AI Act in the European Union, whose clauses and regulatory elements are likely to be reflected in other laws around the world, much as happened with the LGPD.
In addition to increased regulatory activity from governments, there is growing public awareness of the importance of data protection (even more so than of privacy itself). This creates a double pressure on companies: on one side, society pressuring them to take better care of people's information; on the other, governments monitoring the use of data and pressuring them to prevent abuses.
In this scenario, implementing robust measures to ensure that user information is treated with the highest degree of security and respect becomes a basic necessity for companies, not a luxury.
Data Processing Automation
One of the most dominant trends in the technology sector as a whole, which has emerged as a consequence of the evolution of artificial intelligence tools, is the automation of different types of work. From programming to writing formulas in Excel, the specialized knowledge once required to work in technical roles is increasingly being supplied by so-called "co-pilots": AI assistants capable of automatically generating source code, mathematical formulas, or any other technical content required to build a product.
In the data area, things will be no different. Data processing, which is still largely done manually, will become increasingly automated. This has two significant side effects. The first is that more and more people will be able to work with data, because the barrier of specialized knowledge of tools and technologies is being removed. The second, which follows from the first, is a reduction in the market value of professionals in this area, which in turn reduces the cost for companies of working with data.