A Brooklyn-based startup is taking aim at one of the most notorious pain points in artificial intelligence and data analytics: the painstaking process of data preparation.
Structify emerged from stealth mode today, announcing its public launch alongside $4.1 million in seed funding led by Bain Capital Ventures, with participation from 8VC, Integral Ventures and strategic angel investors.
The company's platform uses a proprietary visual language model called DoRa to automate the gathering, cleaning and structuring of data, a process that typically consumes up to 80% of data scientists' time, according to industry surveys.
“The volume of information available today has absolutely exploded,” said Ronak Gandhi, co-founder of Structify, in an exclusive interview with VentureBeat. “We have reached a major inflection point in data availability, which is both a blessing and a curse. Although we have unprecedented access to information, it remains largely inaccessible because it is so difficult to convert into the right format for making meaningful business decisions.”
Structify's approach reflects a growing industry-wide focus on solving what data experts call “the data preparation bottleneck.” Gartner research indicates that inadequate data preparation remains one of the main obstacles to successful AI implementation, with four out of five companies lacking the data foundations needed to fully capitalize on generative AI.
How AI-powered data transformation unlocks hidden business intelligence at scale
At its core, Structify lets users create custom datasets by specifying the data schema, selecting sources and deploying AI agents to extract the data. The platform can handle everything from SEC filings and LinkedIn profiles to news articles and specialized industry documents.
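To make the idea of "specifying a schema and selecting sources" concrete, here is a minimal Python sketch of what such a request might look like. It is an illustration only: `Column`, `DatasetSpec` and `validate` are hypothetical names invented for this example and are not Structify's actual API, which the company has not described here.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """One column of the desired dataset (hypothetical type)."""
    name: str
    dtype: str  # e.g. "string", "integer", "date"
    description: str = ""


@dataclass
class DatasetSpec:
    """Hypothetical request object: a schema plus the sources to read."""
    name: str
    columns: list[Column]
    sources: list[str] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the spec is usable."""
        problems = []
        if not self.columns:
            problems.append("schema has no columns")
        names = [c.name for c in self.columns]
        if len(names) != len(set(names)):
            problems.append("duplicate column names")
        if not self.sources:
            problems.append("no sources selected")
        return problems


# Example: a dataset of job postings drawn from two kinds of source.
spec = DatasetSpec(
    name="job_postings",
    columns=[
        Column("company", "string", "Employer name"),
        Column("title", "string", "Job title as posted"),
        Column("posted_on", "date", "Date the posting went live"),
    ],
    sources=["SEC filings", "news articles"],
)
```

Checking the spec before any agent runs is a deliberate choice in this sketch: a missing column or an empty source list is far cheaper to catch up front than after an extraction job has started.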
What sets Structify apart, according to Gandhi, is its in-house DoRa model, which navigates the web as a human would.
“It's super high quality. It navigates and interacts with things just like a person would,” said Gandhi. “So we are talking about human quality. That is the first and foremost principle behind DoRa. It reads the internet the way a human would.”
This approach allows Structify to support a free tier, which Gandhi believes will help democratize access to structured data.
“The way you think about data now is, it's this really precious object,” said Gandhi. “This really precious thing that you spend so much time finagling and getting and wrestling with, and when you have it, you say to yourself: ‘Oh, if someone were to delete it, I would cry.’”
Structify's vision is to “commoditize data,” making it something that can be easily recreated if it is lost.
From finance to construction: how companies deploy custom datasets to solve industry-specific challenges
The company has already seen adoption across several sectors. Finance teams use it to extract information from pitch decks, construction companies turn complex geotechnical documents into readable tables, and sales teams assemble real-time organizational charts for their accounts.
Slater Stich, partner at Bain Capital Ventures, highlighted this versatility in the funding announcement: “Every company I've worked with has a handful of data sources that are both extremely important and a huge pain to work with, whether that's figures buried in PDFs, scattered across hundreds of web pages, hidden behind an enterprise SOAP API, etc.”
The diversity of Structify's early customers reflects the universal nature of data preparation challenges. According to TechTarget research, data preparation typically involves a series of labor-intensive steps: collection, discovery, profiling, cleansing, structuring, transformation and validation, all before any real analysis can begin.
Why human expertise remains crucial for AI accuracy: inside Structify's “quadruple verification” system
A key differentiator for Structify is its “quadruple verification” process, which combines AI with human oversight. This approach addresses a critical concern in AI development: ensuring accuracy.
“Whenever a user sees something suspicious, or we identify some data as potentially suspicious, we can send it to an expert in that specific use case,” said Gandhi. “That expert can act in the same way as [DoRa]: navigate to the right piece of information, extract it, save it, and then verify whether it is right.”
This process not only corrects the data but also creates training examples that improve the model's performance over time, particularly in specialized fields such as construction or pharmaceutical research.
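The review loop Gandhi describes can be sketched in a few lines of Python. This is a hedged illustration of the general human-in-the-loop pattern, not Structify's implementation: the `Extraction` record, the two flag fields, and the `needs_expert` and `apply_review` helpers are all assumptions made for the example, and the URL is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    """One extracted value plus where it came from (hypothetical record)."""
    column: str
    value: str
    source_url: str
    flagged_by_user: bool = False
    flagged_by_system: bool = False


def needs_expert(item: Extraction) -> bool:
    """Route to a human expert if either a user or the system flagged it."""
    return item.flagged_by_user or item.flagged_by_system


def apply_review(item: Extraction, expert_value: str):
    """Return the corrected record and a training example built from the review."""
    corrected = Extraction(item.column, expert_value, item.source_url)
    example = {
        "column": item.column,
        "source_url": item.source_url,
        "model_value": item.value,    # what the model originally extracted
        "expert_value": expert_value,  # what the expert confirmed or fixed
    }
    return corrected, example


# A user flags a suspicious value; the expert supplies the verified one.
item = Extraction("soil_type", "clay", "https://example.com/report.pdf",
                  flagged_by_user=True)
if needs_expert(item):
    corrected, example = apply_review(item, "silty clay")
```

Each review produces two outputs: the corrected record for the customer's dataset and a before/after pair that could later be used as training data, which mirrors the dual benefit described above.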
“These things are so messy,” Gandhi noted. “I never thought in my life that I would have a strong understanding of geology. But here we are, and that is, I think, a huge strength: being able to learn from these experts and put it directly into DoRa.”
As data extraction tools become more powerful, privacy concerns inevitably arise. Structify has implemented safeguards to address them.
“We don't do any authentication, anything that requires a login, anything that requires you to go behind some sense of information. Our agent doesn't do that because that's a privacy issue,” said Gandhi.
The company also prioritizes transparency by providing direct sourcing information. “If you want to know more about a particular piece of information, you go directly to that content and see it, as opposed to legacy providers where it's this black box.”
Structify enters a competitive landscape that includes both established players and other startups addressing various aspects of the data preparation challenge. Companies like Alteryx, Informatica, Microsoft and Tableau all offer data preparation capabilities, while several specialists have been acquired in recent years.
What differentiates Structify, according to CEO Alex Reichenbach, is its combination of speed and accuracy. In a recent LinkedIn post, Reichenbach said the company had sped up its agent “10x while cutting cost ~16x” through model optimization and infrastructure improvements.
The company's launch comes amid growing interest in AI-powered data automation. According to a TechTarget report, automating data preparation “is frequently cited as one of the main areas of investment for data and analytics teams,” with augmented data preparation capabilities becoming increasingly important.
How frustrating data preparation experiences inspired two friends to revolutionize the industry
For Gandhi, Structify solves problems he faced firsthand in previous roles.
“The big thing about the founding story of Structify is that it's both a personal and a professional thing,” Gandhi recalled. “I was telling [Alex] about when I was working as a data analyst and doing ops and consulting, preparing these really niche, bespoke datasets for clients: lists of all the fitness influencers and their following metrics, lists of companies and what jobs they were posting, museums on the East Coast … I was spending a lot of time manually curating, scraping, doing data entry stuff.”
The inability to move quickly from idea to dataset was particularly frustrating. “The thing that got me was that you couldn't iterate and go from idea to dataset quickly,” Gandhi said.
His co-founder, Alex Reichenbach, encountered similar challenges while working at an investment bank, where data quality problems hampered efforts to build models on top of structured datasets.
How Structify plans to use its $4.1 million in seed funding to transform enterprise data preparation
With the new funding, Structify plans to expand its technical team and establish itself as “the go-to data tool across industries.” The company currently offers free and paid tiers, with enterprise options for those who need advanced features such as on-premises deployment or highly specialized data extraction.
As more companies invest in AI initiatives, the importance of high-quality structured data will only increase. A recent MIT Technology Review Insights report found that four out of five companies are not ready to capitalize on generative AI because of poor data foundations.
For Gandhi and the Structify team, solving this fundamental challenge could unlock significant value across industries.
“The fact that you can even imagine a world where creating datasets is iterative is kind of mind-blowing for a lot of our users,” said Gandhi. “At the end of the day, the pitch is about being able to have that control and customization.”