Legalpioneer Moves Over to GitHub, Looking to Offer AI Training Data

Legal tech and professional services platform Legalpioneer, a free database containing information on legal tech and other companies, has been transferred to GitHub.

Raymond Blijd, the founder of Legalpioneer and Legalcomplex, told Legaltech News that the transition came in response to the legal industry’s growing interest in AI technologies, and as an effort to increase transparency within the legal tech industry.

As part of the shift, the Legalpioneer dataset, comprising 13,000 companies pertinent to the legal, tax and professional services sectors, is moving to the open-source platform. The dataset, now available on GitHub under the Open Database License (ODbL), is meant to offer resources for students, researchers, entrepreneurs, and legal professionals who want to stay abreast of the legal services industry.

“Legalpioneer is our nonprofit service, and I was [originally] planning to update the website with the new profiles from Legalcomplex,” he said. “But then I thought, let’s make life easier and dump everything [onto GitHub],” so that all the data points are available, “including the number of investors, capital, funding … and the number of employees [in various companies].”

The companies will be organized by category, like “practice management or contracts,” Blijd said.

In the coming weeks, the Legalpioneer dataset on GitHub will receive a new column showing the number of funding rounds for each company. Users can use this to calculate growth in given legal tech areas, he noted. Additionally, a set of 28,000 investors in legal and adjacent sectors will be included in the GitHub dataset.

Legalpioneer’s move to GitHub may signal a growing trend of putting legal data on open-source platforms, as legal tech providers increasingly rely on AI tools that need to be trained on legal data.

In July, Louis Brulé Naudet, a French data analyst and law student, started a community of legal professionals on the AI platform Hugging Face. The community, dubbed HF for Legal with the mission subhead “Breaking the opacity of language models for legal professionals,” had, at the time of reporting, racked up 51 members.

Blijd approached Legalpioneer’s transition from a similar perspective—increased accessibility to legal data.

“We see AI as a key driver in revealing new patterns and opportunities within the legal landscape,” he said. “We encourage users to experiment with our dataset, and we’re eager to learn from the discoveries you uncover.”
