Tech Policy Press - Dismantling AI Data Monopolies Before it’s Too Late

October 16, 2024 Courtney Radsch

CJL director Courtney Radsch explains how the continued consolidation of AI within Big Tech companies is edging dangerously close to doing irreversible damage to the developmental building blocks of generative AI.

Less than two years after ChatGPT turned AI into a topic of dinner table conversation and propelled it to the top of the policy agenda, it is clear that AI is further concentrating power in a handful of Big Tech firms and their “partners” who have access to the building blocks needed to develop and deploy the most advanced generative AI systems: data, compute, and talent. The unprecedented amounts of data needed to train foundation models and make them capable of everything from writing poetry to diagnosing cancer are entrenching the power of existing tech gatekeepers while creating a potentially insurmountable advantage for incumbents.

Data is a defining characteristic of generative AI and critical to any consideration of market power and dominance. Because generative AI systems are sophisticated prediction engines, data is what sets them apart, whereas compute and talent are factors that “apply more generally in industrial organization and antitrust” and are thus relevant to many fields. General purpose AI models, often called foundation models, are created and trained on vast amounts of data to learn and predict patterns, grammar, and context so that they can generate coherent and contextually relevant content, while smaller models and the processes that make consumer-facing chatbots and information retrieval work also rely on access to real-time, high-quality data.

Big Tech firms have access to vast troves of data to develop, train, fine-tune, and ground their AI models, as well as the deep pockets to fend off lawsuits while indemnifying their customers against potential intellectual property claims in an effort to encourage wider adoption and integration of their generative AI products.

The structural advantages these players enjoy, in addition to anti-competitive practices such as self-preferencing and tying, have put up huge barriers to entry for newcomers to the AI market or those who want to play fair. Late entrants are also running into obstacles acquiring usable data, which gives Big Tech an enormous advantage, since the first movers have already scraped vast amounts of data and have access to proprietary datasets.
